Markov model
A Markov model is a stochastic model that describes a system transitioning between a finite or countable set of states, where the probability distribution for the next state depends solely on the current state and not on the prior history of the system, a property known as the Markov property or memorylessness.[1] This framework, foundational to probability theory, enables the modeling of random processes in discrete or continuous time, often represented through transition probabilities organized in matrices for discrete cases.[1]

The concept originated with Russian mathematician Andrey Andreyevich Markov (1856–1922), who developed it as an extension of the law of large numbers to dependent random variables, first publishing on the topic in 1906 while analyzing sequences of dependent events, such as letter transitions in Alexander Pushkin's novel Eugene Onegin.[2] Markov's work built on earlier probability research by Pafnuty Chebyshev and addressed limitations in classical independence assumptions, proving results like the central limit theorem for such chains under general conditions.[2] Subsequent advancements, including continuous-time processes by Andrey Kolmogorov in 1931, expanded the theory to broader stochastic processes.[3]

Key components of a Markov model include states (possible conditions of the system), transition probabilities (likelihoods of moving between states), and initial state distributions.[4] In discrete-time models, known as Markov chains, these are often visualized as state diagrams or matrices, with properties like irreducibility (reachability between all states) and aperiodicity determining long-term behavior, such as convergence to a stationary distribution.[1] Extensions like hidden Markov models (HMMs) account for unobserved states, inferring them from observable outputs via algorithms such as the Viterbi or forward-backward methods, which have proven computationally efficient for sequential data analysis.[5]

Markov models find applications across diverse fields due to their ability to capture sequential dependencies. In finance, they model asset price movements and predict market states for risk assessment.[6] Public health uses them to simulate disease progression and evaluate intervention strategies, such as in epidemiology for tracking infection spread.[6] In computer science and machine learning, HMMs power speech recognition, natural language processing, and bioinformatics tasks like gene sequence prediction.[5] Engineering applications include reliability analysis of fault-tolerant systems, where states represent operational modes and transitions model failure rates.[4] Other notable uses span queueing theory, genetics,[5] and operations research, such as optimizing bike-sharing or delivery networks.[6]
Fundamentals
Definition and Markov Property
A Markov model is a stochastic model that describes a sequence of possible events, in which the probability of each event depends only on the state attained in the previous event.[7] This framework captures systems evolving over time where future behavior is determined solely by the current configuration, making it a fundamental tool in probability theory for modeling dependencies in sequences.[8]

The core distinguishing feature of a Markov model is the Markov property, also known as the memoryless property, which formalizes this dependence structure. Mathematically, for a stochastic process \{X_n\}_{n \geq 0}, the property states that P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \dots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i) for all n \geq 0 and states i, j, i_{n-1}, \dots, i_0.[9] This implies that the process has no memory beyond the immediate past, allowing predictions of future states to rely exclusively on the present one.[10]

In the discrete-state, discrete-time case, transition probabilities are compactly represented by a transition matrix P, where each entry p_{ij} = P(X_{n+1} = j \mid X_n = i) gives the probability of moving from state i to state j.[11] A key concept is the stationary distribution \pi, a probability vector satisfying the balance equation \pi P = \pi, \quad \sum_{i} \pi_i = 1, which describes a long-run equilibrium where the state distribution remains unchanged under transitions.[12]

For illustration, consider a basic weather model with states "sunny" and "rainy": if today is sunny, the probability of sunny tomorrow is 0.9 and rainy 0.1; if today is rainy, the probability of sunny tomorrow is 0.5 and rainy 0.5. Here, tomorrow's weather probability depends only on today's state, exemplifying the Markov property without regard to prior days.[13]

Markov models are classified based on the nature of states and time: discrete-state discrete-time models use countable states and integer time steps; discrete-state continuous-time models maintain countable states but allow transitions at arbitrary real times; continuous-state variants extend to uncountable state spaces, such as positions in diffusion processes.[14] These variants share the Markov property but differ in their analytical tools and applications.[15]
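The weather example can be worked through numerically. The sketch below (Python with NumPy; the library choice and variable names are illustrative assumptions, not part of the source) solves the balance equation \pi P = \pi for the two-state chain and checks that repeatedly applying P to any initial distribution converges to the same limit.

```python
import numpy as np

# Two-state weather chain from the example above: state 0 = sunny, 1 = rainy.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# Solve pi P = pi with sum(pi) = 1, i.e. pi (P - I) = 0 plus normalization,
# written as a small least-squares system.
A = np.vstack([(P - np.eye(2)).T, np.ones((1, 2))])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)          # approximately [0.8333, 0.1667]

# The same limit appears by pushing any initial distribution through P repeatedly,
# illustrating convergence to the stationary distribution.
mu = np.array([0.0, 1.0])   # start from "rainy"
for _ in range(50):
    mu = mu @ P
print(mu)          # also approximately [0.8333, 0.1667]
```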
Historical Development
The origins of Markov models can be traced to the work of Russian mathematician Andrey Andreyevich Markov (1856–1922), who sought to extend classical probability theory beyond the assumption of statistical independence in sequences of events. In his 1906 paper, titled Rasprostranenie zakona bol’shikh chisel na velichiny, zavisiaschchie drug ot druga (Extension of the law of large numbers to quantities depending on each other), Markov introduced the concept of dependent random variables by analyzing the sequence of approximately 20,000 letters in Alexander Pushkin's verse novel Eugene Onegin. He categorized letters as vowels or consonants and calculated transition probabilities, such as the probability of a vowel following a consonant (approximately 0.663), to demonstrate that successive elements in a sequence could exhibit limited dependence while still satisfying the law of large numbers. This work explicitly rejected the rigid independence assumption dominant in the probability school of Pafnuty Chebyshev, Markov's mentor, and laid the groundwork for modeling sequences where the future state depends only on the current one.[16]

Markov further developed these ideas in his 1913 memoir, Primer statisticheskogo issledovaniia nad tekstom "Evgeniia Onegina" po povodu odnoy zadachi iz teorii veroiatnostei (An example of a statistical investigation of the text of Eugene Onegin in connection with a problem in probability theory), which extended the chain model to general sequences of dependent phenomena beyond binary states. This publication, part of a symposium Markov organized to commemorate the 200th anniversary of Bernoulli's law of large numbers, formalized higher-order dependencies and applied the approach to broader statistical regularities. In the 1930s, Soviet mathematician Andrey Kolmogorov advanced the framework significantly, establishing the general theory of Markov processes in works such as his 1931 paper Analiticheskie metody v teorii veroiatnostei (Analytical methods in probability theory) and a 1936 paper on stochastic processes with independent increments, which provided rigorous foundations for both discrete and continuous cases. Early applications of these models emerged in genetics during the same decade, where J.B.S. Haldane employed stochastic chain-like processes to model gene frequency fluctuations and mutation-selection equilibria in populations.[16][17]

Post-World War II developments accelerated the field's maturation, particularly with the introduction of continuous-time Markov chains by William Feller in his influential 1950 textbook An Introduction to Probability Theory and Its Applications, Volume I. Feller's treatment integrated chains into broader probability applications, emphasizing birth-death processes and limit theorems essential for queueing and reliability analysis. By the 1950s, the term "Markov chains" had become standard—first coined by Sergei Bernstein in 1926 but widely adopted during this period—fueling growth in operations research, computing, and simulation methods as computational tools emerged.[16]
Markov Chains
Discrete-Time Markov Chains
A discrete-time Markov chain (DTMC) is a stochastic process \{X_n : n = 0, 1, 2, \dots \} with a discrete state space S, which may be finite or countably infinite, where the Markov property holds: the probability distribution of X_{n+1} depends only on X_n and not on prior states.[18] Formally, for states i, j \in S and any history X_0 = i_0, \dots, X_n = i_n, the transition probabilities satisfy P(X_{n+1} = j \mid X_0 = i_0, \dots, X_n = i_n) = P(X_{n+1} = j \mid X_n = i_n) = p_{i_n j}, where p_{ij} \geq 0 and \sum_{j \in S} p_{ij} = 1 for all i \in S.[19] These transition probabilities form the rows of the transition matrix P = (p_{ij}), a right-stochastic matrix.[20]

Key concepts in DTMCs include the classification of states based on return probabilities and periodicity. A state i \in S is absorbing if p_{ii} = 1, meaning the chain remains there indefinitely once entered.[18] States are transient if the probability of eventual return to i starting from i is less than 1, and recurrent if this return probability is 1; recurrent states are further divided into null recurrent (infinite expected return time) and positive recurrent (finite expected return time).[21] In an irreducible chain (where every state is reachable from every other), all states share the same classification: either all transient or all recurrent.[18] The period of a state i is the greatest common divisor of its possible return times, and i is aperiodic if this gcd is 1; a chain is ergodic if it is irreducible, positive recurrent, and aperiodic.[22]

The evolution of DTMCs is analyzed using the Chapman-Kolmogorov equations, which describe multi-step transitions. For m, n \geq 1, the (n+m)-step transition probabilities satisfy p_{ij}^{(n+m)} = \sum_{k \in S} p_{ik}^{(n)} p_{kj}^{(m)}, where p_{ij}^{(n)} is the (i,j)-entry of P^n.[23] In matrix form, this yields P^{n+m} = P^n P^m, enabling computation of long-term behavior via matrix powers.[23]

A stationary distribution \pi = (\pi_i)_{i \in S} for a DTMC satisfies \pi P = \pi with \sum_{i \in S} \pi_i = 1 and \pi_i \geq 0, representing a probability vector invariant under transitions.[24] For finite-state irreducible chains, this equation is equivalent to solving the linear system \pi (P - I) = 0 subject to normalization, where I is the identity matrix; uniqueness holds for ergodic chains.[18] In positive recurrent irreducible chains, even if infinite, a unique stationary distribution exists with \pi_i > 0 for all i.[24]

The gambler's ruin problem illustrates these concepts: consider two players with initial fortunes i and N-i (total N), where the first player wins each $1 bet with probability p and loses it with probability 1-p; the state is the first player's fortune, with absorbing states at 0 (ruin) and N (win).[25] The interior states \{1, \dots, N-1\} communicate with one another but are transient, and the absorption probabilities can be computed by first-step analysis, yielding ruin probability \frac{ \left( \frac{1-p}{p} \right)^i - \left( \frac{1-p}{p} \right)^N }{ 1 - \left( \frac{1-p}{p} \right)^N } when p \neq 1/2.[25]

PageRank, used in Google's original search engine, models web page importance via a DTMC with pages as states, where transitions follow hyperlink structure (random surfer model), adjusted for damping to ensure aperiodicity and irreducibility.[26] The stationary distribution gives page ranks, computed by solving \pi P = \pi iteratively, prioritizing pages with high in-link quality.[26]

For irreducible aperiodic finite-state DTMCs, the distribution converges to the unique stationary \pi: starting from any initial distribution \mu_0, \|\mu_n - \pi\|_{TV} \to 0 as n \to \infty, where \|\cdot\|_{TV} is total variation distance. The mixing time \tau(\epsilon) is the minimal n such that \|\mu_n - \pi\|_{TV} \leq \epsilon for all initial \mu_0, bounded by \tau(\epsilon) \leq \frac{\log(1/(\epsilon \min_i \pi_i))}{1 - \lambda_2} for 0 < \epsilon < 1, with \lambda_2 the second-largest eigenvalue of P; this quantifies convergence speed, crucial for applications like sampling.[27] In contrast, continuous-time Markov chains involve exponential holding times rather than fixed steps.[28]
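As a numerical check of the gambler's ruin analysis above, the following sketch (Python with NumPy; the parameter values are illustrative assumptions) builds the absorbing transition matrix, solves the first-step equations for the ruin probabilities, and compares the result with the closed-form expression.

```python
import numpy as np

# Gambler's ruin with total capital N and win probability p per $1 bet.
N, p, i0 = 10, 0.45, 4          # illustrative values
q = 1.0 - p

# (N+1)x(N+1) transition matrix with absorbing states 0 and N.
P = np.zeros((N + 1, N + 1))
P[0, 0] = P[N, N] = 1.0
for i in range(1, N):
    P[i, i + 1] = p             # win $1
    P[i, i - 1] = q             # lose $1

# Ruin probabilities h_i = P(absorb at 0 | start at i) satisfy
# h_i = p*h_{i+1} + q*h_{i-1} with boundary values h_0 = 1, h_N = 0.
A = np.eye(N + 1) - P
A[0], A[N] = 0.0, 0.0           # overwrite boundary rows
A[0, 0] = A[N, N] = 1.0
b = np.zeros(N + 1)
b[0] = 1.0
h = np.linalg.solve(A, b)

# Closed form from the text, valid for p != 1/2.
r = q / p
closed = (r**i0 - r**N) / (1 - r**N)
print(h[i0], closed)            # the two values agree
```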
Continuous-Time Markov Chains
A continuous-time Markov chain is a stochastic process \{X(t): t \geq 0\} with a countable state space S that satisfies the Markov property: for all s, t \geq 0 and i, j \in S, the conditional probability P(X(s+t) = j \mid X(s) = i, \{X(u): 0 \leq u < s\}) = P(X(s+t) = j \mid X(s) = i) = p_{ij}(t), where p_{ij}(t) denotes the transition probability function.[29] The process remains in each state i for an exponentially distributed holding time with rate \lambda_i = -q_{ii} > 0, after which it jumps to another state j \neq i with probability q_{ij}/\lambda_i.[30]

The dynamics of a continuous-time Markov chain are governed by the infinitesimal generator matrix Q = (q_{ij}), where q_{ij} \geq 0 for i \neq j represents the transition rate from state i to j, and the diagonal entries satisfy q_{ii} = -\sum_{j \neq i} q_{ij} so that each row sums to zero.[29] This matrix Q is derived as the derivative of the transition probability matrix at time zero, Q = P'(0).[30] The transition probabilities evolve according to the Kolmogorov backward equations, \frac{d}{dt} p_{ij}(t) = \sum_{k \in S} q_{ik} p_{kj}(t), \quad t > 0, with initial condition p_{ij}(0) = \delta_{ij}, or equivalently the forward equations, \frac{d}{dt} p_{ij}(t) = \sum_{k \in S} p_{ik}(t) q_{kj}, \quad t > 0.[29] These systems of ordinary differential equations describe the forward flow of probability distributions and the backward propagation from initial states.[30]

For an irreducible continuous-time Markov chain, a unique stationary distribution \pi = (\pi_i)_{i \in S} exists if the chain is positive recurrent, satisfying the balance equations \pi Q = 0 with \sum_{i \in S} \pi_i = 1, and \lim_{t \to \infty} p_{ij}(t) = \pi_j for all i, j \in S.[30] In contrast to discrete-time Markov chains, which evolve via matrix powers, continuous-time chains involve exponential holding times and rate-based jumps, leading to solutions often expressed via matrix exponentials P(t) = e^{tQ}.[29]

Birth-death processes form an important class of continuous-time Markov chains on the non-negative integers, where transitions occur only to neighboring states: from state n to n+1 at birth rate \lambda_n and to n-1 at death rate \mu_n (with \mu_0 = 0). A canonical example is the M/M/1 queue in queueing theory, modeling customer arrivals as a Poisson process with rate \lambda (births) and exponential service times with rate \mu (deaths), where the state represents the queue length; the stationary distribution is geometric with parameter \rho = \lambda / \mu < 1, yielding mean queue length \rho / (1 - \rho).[31] Another application arises in radioactive decay chains, such as the thorium series, where states correspond to unstable isotopes transitioning to daughter nuclides at decay rates \lambda_i, forming a linear chain with no births (\lambda_n = 0) and deaths \mu_n > 0, for which the Kolmogorov equations yield exponential decay probabilities.[32]

Key properties include the potential for explosion in chains with infinite state spaces, where the explosion time \zeta = \inf\{t > 0: N(t) = \infty\} (with N(t) the number of jumps by time t) is finite with positive probability, as in pure birth processes whose rates grow fast enough that \sum_n 1/\lambda_n < \infty (superlinear rates such as \lambda_n = \beta n^2); the linear-rate Yule process (\lambda_n = n\beta) does not explode, and regular chains require P(\zeta = \infty) = 1.[29] For simulation and transient analysis, uniformization embeds the continuous-time chain into a discrete-time Markov chain by adding fictitious self-transitions at a uniform rate \alpha \geq \max_i (-q_{ii}), yielding transition matrix M = I + Q/\alpha and Poisson-distributed jump times, enabling efficient Monte Carlo methods.[33]
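The uniformization construction can be illustrated with a small numerical sketch. The code below (Python with NumPy and SciPy; the truncation level and rates are illustrative assumptions) builds the generator of a truncated M/M/1 queue and computes the same transient distribution both from the matrix exponential P(t) = e^{tQ} and from the uniformized chain M = I + Q/\alpha.

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

# Birth-death generator for an M/M/1 queue truncated at K states
# (the truncation level K is an illustrative choice, not part of the model).
lam, mu, K = 1.0, 2.0, 20
Q = np.zeros((K + 1, K + 1))
for n in range(K + 1):
    if n < K:
        Q[n, n + 1] = lam            # birth: arrival
    if n > 0:
        Q[n, n - 1] = mu             # death: service completion
    Q[n, n] = -Q[n].sum()            # generator rows sum to zero

t = 3.0
p0 = np.zeros(K + 1)
p0[0] = 1.0                          # start from an empty queue

# (1) transient distribution via the matrix exponential P(t) = e^{tQ}
p_expm = p0 @ expm(t * Q)

# (2) the same distribution via uniformization: M = I + Q/alpha with
#     alpha >= max_i(-q_ii), weighting powers of M by Poisson probabilities
alpha = (-np.diag(Q)).max()
M = np.eye(K + 1) + Q / alpha
p_unif = np.zeros(K + 1)
term = p0.copy()
for n in range(200):                 # truncate the Poisson sum
    p_unif += poisson.pmf(n, alpha * t) * term
    term = term @ M

print(np.abs(p_expm - p_unif).max())  # agreement up to truncation error
```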
Sequential and Inferred Models
Hidden Markov Models
A hidden Markov model (HMM) is a probabilistic model that extends the Markov chain framework by assuming an underlying stochastic process with unobserved (hidden) states that evolve according to the Markov property, while the observable outputs, or emissions, are generated from these states via a conditional probability distribution. Unlike a standard Markov chain where states are directly observed, in an HMM the sequence of hidden states Q = q_1, q_2, \dots, q_T forms a Markov chain, and the observation sequence O = o_1, o_2, \dots, o_T is produced stochastically from each state, capturing dependencies in sequential data where direct state information is unavailable. This model was first formalized by Baum and Petrie in their work on statistical inference for probabilistic functions of Markov processes.[34][35] The HMM is defined by three key components: the state transition probability matrix A = \{a_{ij}\}, where a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i) specifies the probability of transitioning from state S_i to S_j; the emission probability distribution B = \{b_j(k)\}, where b_j(k) = P(o_t = v_k \mid q_t = S_j) gives the probability of observing symbol v_k in state S_j (often represented as an emission matrix for discrete observations); and the initial state probability distribution \pi = \{\pi_i\}, where \pi_i = P(q_1 = S_i). The complete model is denoted by the parameter set \lambda = (A, B, \pi). These components enable the HMM to represent systems with latent dynamics, such as sequences where the true state sequence is inferred indirectly from observations.[35] To work with HMMs, three fundamental computational problems are addressed. The evaluation problem computes the likelihood of an observation sequence given the model, P(O \mid \lambda), using the forward-backward algorithm, which efficiently calculates this probability in O(N^2 T) time via dynamic programming, where N is the number of states and T is the observation length. The decoding problem finds the most likely hidden state sequence Q^* = \arg\max_Q P(Q \mid O, \lambda), solved by the Viterbi algorithm, a dynamic programming approach that maximizes the joint probability P(Q, O \mid \lambda) by maintaining the highest probability path ending in each state at each time step: \begin{align*} \delta_1(i) &= \pi_i b_i(o_1), \\ \delta_{t+1}(j) &= \max_i [\delta_t(i) a_{ij}] b_j(o_{t+1}), \\ \psi_{t+1}(j) &= \arg\max_i [\delta_t(i) a_{ij}], \end{align*} with backtracking to recover the state sequence, achieving the same time complexity as the forward algorithm. These methods, detailed in foundational treatments of HMMs, allow for inference over hidden states.[35] Parameter estimation in HMMs is performed via the Baum-Welch algorithm, an expectation-maximization (EM) procedure that iteratively reestimates \lambda to maximize P(O \mid \lambda) when states are unobserved. Starting from an initial guess, the E-step computes expected state occupancies and transitions using forward-backward variables, while the M-step updates A, B, and \pi as normalized frequencies, such as \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, where \xi_t(i,j) = P(q_t = S_i, q_{t+1} = S_j \mid O, \lambda) and \gamma_t(i) = P(q_t = S_i \mid O, \lambda); this guarantees non-decreasing likelihood and converges to a local maximum. The algorithm was developed by Baum, Petrie, Soules, and Weiss as a maximization technique for Markov chain probabilities.[35] HMMs have been widely applied in sequence modeling tasks. 
In speech recognition, hidden states represent phonetic units like phonemes, with emissions modeling acoustic features such as spectral vectors, enabling systems to decode spoken words from audio observations; early implementations achieved recognition accuracies exceeding 95% on isolated digits using left-to-right HMM topologies. In natural language processing, HMMs support part-of-speech (POS) tagging, where states correspond to grammatical categories (e.g., noun, verb) and emissions to words, allowing inference of tag sequences from sentences; robust taggers based on this approach report tagging accuracies around 96% on standard corpora like the Penn Treebank.[35][36]

Model performance is evaluated using metrics tailored to the application. Perplexity measures predictive uncertainty for sequence models, computed as 2^{-\frac{1}{T} \log_2 P(O \mid \lambda)}, with lower values indicating better fit for tasks like language modeling. In communication or error-prone sequences, bit error rate quantifies decoding accuracy as the fraction of incorrectly inferred bits or symbols, often below 1% in optimized HMM-based systems. These metrics provide quantitative assessment of inference quality without ground-truth states.[35]
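The Viterbi recursion given above can be sketched directly. The code below (Python with NumPy; the two-state model parameters and observation sequence are toy values chosen for illustration, not taken from the sources) decodes the most likely hidden state sequence for a short observation sequence.

```python
import numpy as np

# Minimal Viterbi decoder for a discrete-emission HMM, lambda = (A, B, pi).
A  = np.array([[0.7, 0.3],       # a_ij = P(q_{t+1}=j | q_t=i)
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],  # b_j(k) = P(o_t=k | q_t=j)
               [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])        # initial state distribution
obs = [0, 1, 2, 2, 1]            # observed symbol indices (toy data)

N, T = A.shape[0], len(obs)
delta = np.zeros((T, N))         # highest path probability ending in each state
psi   = np.zeros((T, N), dtype=int)

delta[0] = pi * B[:, obs[0]]
for t in range(1, T):
    for j in range(N):
        scores = delta[t - 1] * A[:, j]
        psi[t, j]   = np.argmax(scores)               # best predecessor
        delta[t, j] = scores[psi[t, j]] * B[j, obs[t]]

# Backtrack to recover the most likely hidden state sequence.
path = [int(np.argmax(delta[T - 1]))]
for t in range(T - 1, 0, -1):
    path.append(int(psi[t, path[-1]]))
path.reverse()
print(path)
```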
Markov Decision Processes
A Markov decision process (MDP) is a mathematical framework for modeling sequential decision-making under uncertainty, where an agent interacts with an environment by selecting actions that influence probabilistic state transitions and associated rewards. Formally, an MDP is defined as a tuple (S, A, P, R, \gamma), where S is the state space, A is the action space, P(s' \mid s, a) denotes the transition probabilities to next states s' given current state s and action a, R(s, a) represents the reward function, and \gamma \in [0, 1) is the discount factor that weights future rewards. This structure builds on the Markov property, ensuring that future states depend only on the current state and action, not prior history. The theory originated in the 1950s with foundational work by Lloyd Shapley on stochastic games and Richard Bellman on dynamic programming approaches to controlled Markov chains.[37][38][39]

In an MDP, a policy \pi specifies the agent's behavior, mapping states to actions; it can be deterministic (\pi(s) = a) or stochastic (\pi(a \mid s)). The goal is to find an optimal policy \pi^* that maximizes the expected discounted cumulative reward, often evaluated through value functions. The state-value function V^\pi(s) estimates the expected return starting from state s under policy \pi, while the action-value function Q^\pi(s, a) does so after taking action a in s. These functions satisfy the Bellman equations, which decompose the value into immediate reward plus discounted expected future value:
V^\pi(s) = \sum_a \pi(a \mid s) \left[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) V^\pi(s') \right]
For the optimal value function V^*(s), the Bellman optimality equation is:
V^*(s) = \max_a \left[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) V^*(s') \right]
This equation implies that the optimal value operator is a contraction mapping in the space of bounded functions, guaranteeing a unique fixed point solvable via iterative methods. A similar form holds for Q^*(s, a).[39][40]

Solution methods for MDPs include value iteration and policy iteration. Value iteration, introduced by Bellman, initializes a value function and iteratively applies the Bellman optimality operator until convergence to V^*, from which \pi^* is derived by greedy selection. Policy iteration, developed by Ronald Howard, alternates between policy evaluation (solving the Bellman equation for a fixed \pi to compute V^\pi) and policy improvement (updating \pi to greedily select actions maximizing the Bellman update). For finite MDPs, policy iteration converges in finitely many steps, while value iteration converges geometrically and yields an optimal greedy policy after finitely many iterations. For infinite-horizon discounted MDPs, these methods yield \gamma-discounted optimal policies; finite-horizon variants truncate at a horizon T with time-dependent values, while average-reward criteria (undiscounted, long-run average) use modified Bellman equations focusing on differential rewards.[38][41][40]

MDPs find application in problems like inventory control, where states represent stock levels, actions are ordering quantities, transitions model demand uncertainty, and rewards capture holding and shortage costs—solvable via policy iteration to minimize long-term costs. In robot navigation, states are positions, actions are movements, transitions account for noisy sensors or dynamics, and rewards guide goal-reaching; value iteration optimizes paths in simulated environments.
These examples illustrate MDPs' utility in balancing exploration of uncertain outcomes with exploitation of rewarding actions.[40]
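Value iteration can be sketched in a few lines. The example below (Python with NumPy; the three-state, two-action transition and reward numbers are invented for illustration) repeatedly applies the Bellman optimality backup until the value function converges and then extracts a greedy policy.

```python
import numpy as np

# Value iteration on a tiny finite MDP; only the Bellman backup follows the text,
# the specific numbers are illustrative.
gamma = 0.9
# P[a, s, s2] = P(s2 | s, a); each row over s2 sums to 1.
P = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]],   # action 0
    [[0.0, 0.9, 0.1], [0.0, 0.2, 0.8], [0.0, 0.0, 1.0]],   # action 1
])
R = np.array([[0.0, 0.0],      # R[s, a]
              [0.0, 1.0],
              [5.0, 0.5]])

V = np.zeros(3)
for _ in range(500):
    # Q(s,a) = R(s,a) + gamma * sum_{s2} P(s2|s,a) V(s2)
    Q = R + gamma * np.einsum('asj,j->sa', P, V)
    V_new = Q.max(axis=1)                      # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-8:       # contraction => convergence
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)                      # greedy policy w.r.t. V*
print(V, policy)
```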
Decision and Observation Models
Partially Observable Markov Decision Processes
A partially observable Markov decision process (POMDP) extends the Markov decision process framework to environments where the agent's state observations are imperfect or noisy. Formally, a POMDP is defined by the tuple (S, A, T, R, \Omega, O, \gamma), where S is the state space, A is the action space, T: S \times A \times S \to [0,1] is the transition probability function, R: S \times A \to \mathbb{R} is the reward function, \Omega is the observation space, O: S \times A \times \Omega \to [0,1] is the observation probability function (typically O(s', a, o) for the next state s' \in S), and \gamma \in [0,1) is the discount factor.[42] This structure models sequential decision-making under uncertainty in state knowledge, where the agent receives an observation o \in \Omega after each action rather than the true state.[42]

Since the true state s \in S is hidden, the agent maintains a belief state b(s), which represents the posterior probability distribution P(s \mid a_t, o_{1:t}, a_{1:t-1}) over states given the action-observation history. This belief serves as a sufficient statistic for decision-making, transforming the POMDP into an equivalent fully observable belief MDP where beliefs are the states.[43] Belief updates can be computed using techniques analogous to those in hidden Markov models, such as the forward algorithm, to incorporate new observations and actions.[44]

Planning in POMDPs involves solving the belief MDP via value iteration, where the value function V(b) is optimized over the continuous belief space. For finite-horizon problems, the optimal value function is piecewise linear and convex in the belief state, allowing representation as a finite set of linear segments (hyperplanes) that can be backed up exactly during iteration. This property, established in early dynamic programming approaches, enables efficient computation despite the high dimensionality, though exact solutions scale poorly with problem size.[43]

Key solution approaches include point-based value iteration (PBVI), which approximates the value function by performing backups only at a discrete set of strategically selected belief points, yielding anytime policies with bounded error.[45] Another method is AND-OR tree search, which builds an observation tree from the current belief and evaluates action-observation branches to select policies online, often combined with Monte Carlo sampling for scalability in large spaces.[46] Recent advances as of 2025 include integration with deep learning for long-horizon planning, such as BetaZero, which uses learned approximations for belief-state planning in complex environments.[47]

POMDPs are applied in robot localization, where a mobile robot uses sensor observations (e.g., laser scans) to maintain a belief over possible positions and plan paths under odometry uncertainty.[48] In medical treatment, they model sequential therapy decisions, such as adjusting dosages for chronic conditions like hypertension, balancing efficacy against side effects given noisy patient responses.[49] The problem of finding an optimal policy for a finite-horizon POMDP is PSPACE-complete, reflecting its inherent computational intractability even for moderate-sized models.[50] Approximations like QMDP address this by assuming immediate full observability after actions, computing a hybrid value function from underlying MDP Q-values to derive fast, near-optimal policies.
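The belief update that turns a POMDP into a belief MDP can be sketched compactly. The code below (Python with NumPy; the tiny transition and observation tables are illustrative placeholders) predicts the next-state distribution under an action and then reweights it by the observation likelihood, renormalizing to obtain the new belief.

```python
import numpy as np

# Belief update sketch: b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) b(s).
def belief_update(b, a, o, T, O):
    """b: belief over states; T[a, s, s2] transition probs; O[a, s2, obs] observation probs."""
    b_pred = b @ T[a]                 # predict: push the belief through the transition model
    b_new = b_pred * O[a, :, o]       # correct: weight by the observation likelihood
    return b_new / b_new.sum()        # renormalize to a probability vector

# Tiny 2-state, 1-action, 2-observation example (illustrative numbers only).
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])          # T[a, s, s2]
O = np.array([[[0.75, 0.25],
               [0.30, 0.70]]])        # O[a, s2, obs]
b = np.array([0.5, 0.5])
print(belief_update(b, a=0, o=1, T=T, O=O))
```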
Hierarchical Markov Models
Hierarchical Markov models extend traditional Markov models by incorporating multi-level structures that capture nested dependencies in sequential data or decision processes. In these models, higher-level abstract states govern transitions and emissions at lower levels, allowing for the representation of complex hierarchies where, for example, a super-state in a reinforcement learning (RL) setting initiates and oversees a subordinate Markov chain until a termination condition is met.[51] This hierarchical organization enables modeling of phenomena with varying timescales or compositional structures, such as skills composed of primitive actions in RL or layered linguistic features in sequences.[52]

Key types include hierarchical hidden Markov models (HHMMs) and hierarchical Markov decision processes (HMDPs). In HHMMs, the hierarchy consists of super-states, each of which emits sequences from a lower-level hidden Markov model (HMM) rather than individual observations; transitions occur between super-states only upon completion of the subordinate chain, forming a tree-like structure that recursively models multi-scale sequences.[53] HMDPs, often used in RL, build on this by integrating decision-making, where higher-level policies select options—temporally extended actions—that execute lower-level Markov policies, potentially over partially observable environments like POMDPs at the base level.[51]

Inference in hierarchical models, particularly HHMMs, typically involves extensions of standard HMM algorithms adapted for the tree structure. Parameter learning employs variational methods, such as expectation-maximization (EM) variants that propagate messages bottom-up and top-down across levels, or sampling techniques like Markov chain Monte Carlo (MCMC) and Gibbs sampling to handle the latent hierarchy and estimate transition probabilities, emission distributions, and durations.[53][54] These approaches achieve efficient inference, often in linear time relative to sequence length, by exploiting the hierarchical decomposition.[55]

A primary benefit of hierarchical Markov models is temporal abstraction, which allows reuse of learned sub-sequences across contexts and reduces the state space explosion inherent in flat models by factoring complex behaviors into modular components.[56] This abstraction mitigates the curse of dimensionality in long-horizon tasks, enabling faster learning and planning by operating at multiple timescales simultaneously.[57] Recent developments as of 2025 include Hierarchical Semi-Markov Models that incorporate duration-aware dynamics for better modeling of variable-length sequences in applications like activity recognition.[58]

In reinforcement learning, the options framework exemplifies hierarchical modeling for skill acquisition, where an agent learns temporally abstract actions (options) that terminate based on learned policies, facilitating hierarchical control in environments like navigation or game-playing.[51] For speech processing, HHMMs have been applied to model prosody at multiple levels, such as phonetic, syllabic, and intonational tiers, capturing dependencies in spontaneous Mandarin speech by hierarchically predicting pitch contours and rhythm from linguistic contexts.[59][60] Extensions to non-parametric settings include infinite hierarchical hidden Markov models (IHHMMs), which use priors like the hierarchical Dirichlet process to allow an unbounded number of states and levels, automatically inferring the hierarchy's depth and structure from data without
fixed dimensionality assumptions.[61] This enables flexible modeling of diverse sequence lengths and complexities in applications like genomics or natural language.[62]
Graphical and Spatial Models
Markov Random Fields
A Markov random field (MRF), also known as an undirected graphical model, consists of an undirected graph G = (V, E), where V is the set of vertices and E the set of edges, together with a set of random variables \{X_v : v \in V\} defined on a common probability space. The key characterizing feature is the local Markov property, which states that each variable X_v is conditionally independent of all other variables given its neighbors:
P(X_v \mid X_{V \setminus \{v\}}) = P(X_v \mid X_{\mathcal{N}(v)}),
where \mathcal{N}(v) = \{ u \in V : (u, v) \in E \} denotes the neighbors of v. This property encodes multivariate dependencies through the graph structure, making MRFs suitable for modeling spatial or relational data without assuming a temporal order.[63] The Hammersley-Clifford theorem establishes the equivalence between MRFs and Gibbs random fields for strictly positive probability distributions. Specifically, a distribution satisfies the local Markov property with respect to G if and only if it admits a Gibbs factorization:
P(X) = \frac{1}{Z} \prod_{c \in \mathcal{C}} \phi_c(X_c),
where \mathcal{C} is the set of cliques in G, each \phi_c is a non-negative potential function over the clique c, and Z is the normalizing partition function. This factorization allows parameterization via clique potentials, facilitating the representation of complex interactions. The theorem, originally from unpublished lecture notes, has been foundational in linking Markov properties to exponential family forms in multivariate analysis.[64]

Common parameterizations of MRFs include the Ising model for binary variables, where potentials encourage alignment between neighboring states, such as \phi_{uv}(x_u, x_v) = \exp(\beta x_u x_v) for x_u, x_v \in \{-1, +1\} and interaction strength \beta > 0. This model, originating from statistical physics to describe ferromagnetism, serves as the prototypical MRF and has been widely adopted for discrete spatial modeling. The Potts model generalizes the Ising framework to K-state variables, with potentials \phi_{uv}(x_u, x_v) = \exp(\beta \mathbb{I}(x_u = x_v)), where \mathbb{I} is the indicator function, allowing for multi-label dependencies while preserving the pairwise structure.[65]

Inference in MRFs, such as computing marginals or the maximum a posteriori configuration, is generally NP-hard due to the partition function Z. Approximate methods include Gibbs sampling, a Markov chain Monte Carlo technique that iteratively samples from full conditional distributions to generate configurations from the joint, as introduced for Bayesian image restoration. Belief propagation, exact on tree-structured graphs, approximates inference on loopy graphs via iterative message passing between nodes, converging to fixed points that yield marginal approximations; the loopy variant, while not guaranteed to converge, performs well empirically in practice.[66][67]

MRFs find application in image denoising, where spatial potentials model pixel correlations to smooth noise while preserving edges, as demonstrated in early Bayesian restoration frameworks using Ising-like models on lattice graphs. In social network analysis, MRFs, often termed Markov random graphs, capture dependencies in tie formations, such as homophily, by defining potentials over edge cliques to infer network structures from observed data. Unlike Bayesian networks, which use directed acyclic graphs to encode asymmetric conditional dependencies via factorization of the joint into conditional probabilities, MRFs employ undirected graphs to represent symmetric, non-causal associations without implying ordering.[66][68]
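Gibbs sampling for the Ising parameterization described above can be sketched as follows (Python with NumPy; the lattice size, interaction strength \beta, and periodic boundary conditions are illustrative assumptions). Each step resamples one spin from its full conditional distribution, which by the local Markov property depends only on the four neighboring spins.

```python
import numpy as np

# Gibbs sampler for an Ising-model MRF on an n x n lattice with spins in {-1, +1}
# and pairwise potentials exp(beta * x_u * x_v) on neighboring sites.
rng = np.random.default_rng(0)
n, beta, sweeps = 32, 0.6, 200       # illustrative choices
X = rng.choice([-1, 1], size=(n, n))

for _ in range(sweeps):
    for i in range(n):
        for j in range(n):
            # Sum of the four neighboring spins (periodic/toroidal boundary).
            s = (X[(i - 1) % n, j] + X[(i + 1) % n, j]
                 + X[i, (j - 1) % n] + X[i, (j + 1) % n])
            # Full conditional from the clique potentials:
            # P(X_ij = +1 | neighbors) = 1 / (1 + exp(-2 * beta * s)).
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * s))
            X[i, j] = 1 if rng.random() < p_plus else -1

print(X.mean())   # average magnetization of the sampled configuration
```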