
Hawkes process

The Hawkes process is a self-exciting temporal point process used in probability and statistics, designed to model sequences of events where the occurrence of one event temporarily increases the probability of subsequent events, capturing phenomena like clustering or contagion. Introduced by Alan G. Hawkes in 1971, it is defined by a conditional intensity function \lambda(t) that combines a constant background rate \mu with the influence of past events \{T_i < t\} through an excitation kernel \phi, typically expressed as \lambda(t) = \mu + \sum_{T_i < t} \phi(t - T_i), where \phi is often an exponential decay function such as \alpha e^{-\beta (t - T_i)} to represent decaying influence. This formulation allows the process to exhibit both immigration (background events) and branching (self-induced events), with the branching ratio \int_0^\infty \phi(u) \, du < 1 ensuring stationarity and preventing infinite cascades.

Originally developed to analyze point processes with spectral methods, the Hawkes process gained widespread adoption in the 1980s through applications in seismology, particularly via the Epidemic-Type Aftershock Sequence (ETAS) model for earthquake aftershocks, where past quakes trigger future ones. Over the decades, it has been extended to multivariate settings to handle multiple interacting event types, enabling the modeling of mutual excitations across dimensions like social networks or financial markets. Inference for Hawkes processes typically relies on maximum likelihood estimation for parametric forms or nonparametric kernel methods, with recent advances incorporating deep learning for flexible neural variants and reinforcement learning for control and optimization in dynamic systems.

Key applications span diverse fields, including finance, where it models high-frequency trading and limit order book dynamics by capturing transaction clustering; neuroscience, for spike train analysis in neural activity; social media, to describe retweet cascades and information diffusion; and epidemiology, such as modeling malaria outbreaks or disease spread with self-reinforcing patterns. These uses highlight the process's versatility in capturing temporal dependencies and causality in event data, though challenges remain in handling high-dimensional data and ensuring computational efficiency for real-time inference.

Background

Point processes

A point process is a stochastic model that describes the occurrence of random events in continuous time, typically represented by a sequence of event times \{T_i\}_{i=1}^\infty where 0 < T_1 < T_2 < \cdots and T_i \to \infty as i \to \infty. These models are fundamental in statistics, probability, and the theory of stochastic processes for analyzing phenomena like arrivals in queues, earthquakes, or neuronal spikes, where events happen irregularly over time.

Point processes can be distinguished by their structure and assumptions about event occurrences. Simple point processes, exemplified by the Poisson process, feature events that occur independently with no simultaneous occurrences at the same time. In contrast, more complex variants include renewal processes, where inter-event times are independent and identically distributed but follow an arbitrary positive distribution, allowing for greater flexibility in modeling dependencies on prior intervals without full history dependence.

Key properties of point processes include the counting measure N(t), defined as the number of events occurring in the interval [0, t], which is a non-decreasing step function jumping by one at each event time. Inter-event times, denoted as X_i = T_i - T_{i-1} (with T_0 = 0), capture the waiting periods between consecutive events and are central to characterizing the process's temporal structure. The compensator function \Lambda(t), a non-decreasing predictable process, provides the expected cumulative number of events up to time t, ensuring that N(t) - \Lambda(t) behaves as a martingale under suitable conditions.

A canonical example is the homogeneous Poisson process, which has a constant rate parameter \lambda > 0, meaning the expected number of events in any interval of length t is \lambda t, and events occur independently with exponentially distributed inter-event times of mean 1/\lambda. This process serves as a baseline for understanding more advanced models, such as self-exciting point processes that build upon these foundations.
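As a small illustration of the baseline model, the following sketch (Python with NumPy; illustrative names, not from any particular library) simulates a homogeneous Poisson process by accumulating exponential inter-event times and checks the expected count \lambda T.

```python
import numpy as np

def simulate_poisson_process(lam, T, seed=None):
    """Simulate a homogeneous Poisson process with rate lam on [0, T]
    by accumulating exponential inter-event times with mean 1/lam."""
    rng = np.random.default_rng(seed)
    times = []
    t = rng.exponential(1.0 / lam)
    while t < T:
        times.append(t)
        t += rng.exponential(1.0 / lam)
    return np.array(times)

events = simulate_poisson_process(lam=2.0, T=1000.0, seed=0)
print(len(events), "events; expected about", 2.0 * 1000.0)   # N(T) has mean lambda * T
```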

Self-exciting mechanisms

Self-excitation in point processes describes a mechanism where the occurrence of an event at a given time temporarily elevates the conditional intensity for subsequent events within the same process, promoting temporal clustering and bursty behavior. This dynamic contrasts with homogeneous processes, where event rates remain constant and independent, by incorporating feedback from past events that amplifies future occurrences. Such mechanisms are essential for modeling systems exhibiting contagion or cascade effects, where isolated events are rare and sequences dominate.

The conceptual foundations of self-excitation draw from diverse fields, with early inspirations in modeling the propagation of contagious diseases, where each infection raises the risk of new cases through direct transmission. In this setting, self-exciting dynamics capture how initial outbreaks can cascade into epidemics, reflecting the inherent "contagiousness" of events. Similarly, in neuroscience, self-excitation underlies models of neural firing patterns, where a single spike can depolarize connected neurons further, leading to bursts of spikes that represent synchronized activity in neuronal populations. These ideas predate formal frameworks but highlight the intuitive role of event-triggered reinforcement in biological systems.

A prominent empirical precursor to self-exciting models appears in seismology through Omori's law, formulated in 1894, which quantifies the decaying rate of aftershocks following a mainshock as inversely proportional to time elapsed, indicating how one seismic event triggers a sequence of dependent followers. This law illustrates self-excitation intuitively: a major earthquake destabilizes fault zones, increasing the likelihood of nearby tremors that, in turn, may induce more, forming clustered sequences rather than random occurrences. Modern self-exciting point processes, such as the Hawkes process, build on this by providing a probabilistic structure to simulate such triggering cascades.

Self-exciting mechanisms differ from mutually exciting processes, which involve cross-influence between distinct event types or subprocesses, such as events of one type affecting the rate of another. In self-excitation, the feedback is endogenous to a single process, emphasizing internal reinforcement, whereas mutual excitation captures interdependencies across processes, as outlined in Hawkes's foundational work treating the two.

Mathematical formulation

Intensity function

The intensity function of the Hawkes process, denoted \lambda(t), represents the instantaneous expected rate of event occurrences at time t, conditional on the history H_t of events observed strictly before t. It takes the general form \lambda(t) = \mu + \sum_{t_i < t} \alpha \phi(t - t_i), where \mu > 0 denotes the constant background rate, \alpha > 0 is the magnitude of excitation induced by each past event, and \phi is a positive kernel function supported on (0, \infty). This expression captures the self-exciting nature of the process, initially proposed by Hawkes for exponential kernels and extended to general decaying kernels via a cluster process representation. The component \mu models the exogenous arrival of "immigrant" events that occur independently of prior activity, while the summation accounts for the endogenous excitation, with each historical event at time t_i contributing an additional intensity \alpha \phi(t - t_i) that reflects branching from past occurrences. The kernel \phi(u) is typically a monotonically decreasing function ensuring finite temporal influence, such as the normalized exponential kernel \phi(u) = \beta e^{-\beta u} for u > 0 and \beta > 0, in which \beta controls the rate of decay. This decay property guarantees that the cumulative effect of past events remains bounded, supporting the process's interpretability as a conditional intensity.
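To make the formula concrete, here is a minimal sketch in Python with NumPy (illustrative function names, not from any particular library), assuming the normalized exponential kernel \phi(u) = \beta e^{-\beta u} described above, so that \alpha plays the role of the branching ratio.

```python
import numpy as np

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity lambda(t) = mu + sum_{t_i < t} alpha * beta * exp(-beta * (t - t_i)).

    Uses the normalized exponential kernel phi(u) = beta * exp(-beta * u),
    so alpha is the expected number of offspring per event (branching ratio)."""
    past = np.asarray(event_times, dtype=float)
    past = past[past < t]                      # only events strictly before t contribute
    return mu + alpha * beta * np.exp(-beta * (t - past)).sum()

# Example: background rate 0.5, branching ratio 0.8, decay rate 1.2
history = [0.3, 1.1, 1.4, 2.0]
print(hawkes_intensity(2.5, history, mu=0.5, alpha=0.8, beta=1.2))
```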

Univariate case

The univariate Hawkes process models a single type of event where occurrences can trigger future events of the same type, simplifying analysis by focusing on self-excitation without inter-type interactions. This specialization builds on the general intensity function by restricting it to one dimension, making it suitable for introductory studies of clustering behavior in temporal data. The conditional intensity function for the univariate case is \lambda(t) = \mu + \sum_{t_i < t} \alpha \, e^{-\beta (t - t_i)}, where the sum is over all previous event times t_i in the history of the process, assuming an exponential kernel for the excitation. Here, \mu > 0 represents the constant background rate that drives exogenous events, \alpha > 0 denotes the magnitude of the instantaneous jump in intensity immediately after an event, and \beta > 0 governs the decay rate, determining how quickly the influence of past events diminishes. These parameters allow the model to capture both baseline activity and the contagious nature of events, with the exponential form enabling closed-form expressions for many properties.

A key condition for the stationarity of the univariate Hawkes process is that the branching ratio n = \frac{\alpha}{\beta} < 1, which quantifies the expected number of offspring events triggered by each occurrence; values of n approaching 1 lead to increased clustering and higher variability in event rates, while n \geq 1 causes the process to diverge. This condition arises from the integral of the kernel function, \int_0^\infty \alpha e^{-\beta s} \, ds = \frac{\alpha}{\beta}, and ensures the total expected intensity remains finite over time.

Simulation of the univariate Hawkes process can be efficiently performed using Ogata's modified thinning algorithm, which generates event times by proposing candidates from a homogeneous Poisson process and accepting them probabilistically based on the conditional intensity to account for the self-exciting dynamics. This approach is particularly effective for the exponential kernel, as it avoids exact integration of the intensity while bounding the maximum possible rate during simulation intervals. The algorithm proceeds in the following steps (a code sketch follows the list):
  1. Initialize the current time t = 0, an empty event history, and initial intensity \lambda(t) = \mu.
  2. While t < T (the simulation horizon):
    • Compute an upper bound \bar{\lambda}(t) for the intensity over the next potential interval, typically \bar{\lambda}(t) = \mu + \alpha \sum_{t_i < t} e^{-\beta (t - t_i)} + \alpha, where the final \alpha accounts for the jump contributed by an event occurring at t itself.
    • Generate a candidate waiting time \Delta \sim \exp(\bar{\lambda}(t)) and set candidate time t' = t + \Delta.
    • If t' > T, stop.
    • Evaluate the exact intensity \lambda(t') at t'.
    • Generate a uniform random variable u \sim U(0, 1).
    • If u \leq \frac{\lambda(t')}{\bar{\lambda}(t)}, accept t' as an event time and add it to the history (the intensity then jumps by \alpha); otherwise, reject the candidate without adding an event.
    • In either case, advance t = t'.
This method ensures exact simulation conditional on the parameters and is computationally feasible for moderate time horizons.
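The following is a minimal sketch of the thinning procedure above for the exponential kernel, written in Python with NumPy (the function and variable names are illustrative, not from any particular package).

```python
import numpy as np

def simulate_hawkes_thinning(mu, alpha, beta, T, seed=None):
    """Simulate a univariate Hawkes process with intensity
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))
    on [0, T] using Ogata's modified thinning algorithm."""
    rng = np.random.default_rng(seed)

    def intensity(t, events):
        # Exact conditional intensity at time t given past events (<= t).
        past = np.array([s for s in events if s <= t])
        return mu + alpha * np.exp(-beta * (t - past)).sum()

    events = []
    t = 0.0
    while True:
        # With an exponential kernel the intensity only decays between events,
        # so its value just after t bounds it until the next event.
        lam_bar = intensity(t, events)
        t += rng.exponential(1.0 / lam_bar)   # candidate from rate-lam_bar Poisson process
        if t >= T:
            break
        if rng.uniform() <= intensity(t, events) / lam_bar:
            events.append(t)                  # accept candidate (thinning step)
    return np.array(events)

# Example: subcritical process with branching ratio alpha / beta = 0.8
print(len(simulate_hawkes_thinning(mu=0.5, alpha=0.8, beta=1.0, T=100.0)))
```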

Multivariate case

The multivariate Hawkes process generalizes the univariate case to model interactions among multiple types of events, allowing for both self-excitation within each type and cross-excitation between types. This framework is particularly useful for capturing network-like dependencies in event streams, where occurrences of one event type can influence the intensity of others. Consider a process with d event types, where events of type k occur at times \{t_i^k\}_{i=1}^\infty. The conditional intensity for type j at time t is given by \lambda_j(t) = \mu_j + \sum_{k=1}^d \sum_{t_i^k < t} \alpha_{jk} e^{-\beta_{jk} (t - t_i^k)}, where \mu_j > 0 is the background rate for type j, \alpha_{jk} \geq 0 controls the magnitude of excitation from type k to type j, and \beta_{jk} > 0 governs the decay rate of that excitation, assuming exponential kernels for analytical tractability. This formulation ensures the intensity remains non-negative and incorporates historical events across all types up to time t.

A key stability condition for the process requires the spectral radius \rho(A) < 1, where A is the d \times d branching matrix with entries A_{jk} = \alpha_{jk}/\beta_{jk}, representing the expected number of offspring events of type j triggered by a single event of type k. The matrix A quantifies the overall excitation strength in the network; the condition \rho(A) < 1 guarantees finite expected intensities and stationarity under mild assumptions. In this setup, the diagonal elements A_{jj} capture self-excitation within type j, analogous to the univariate case, while off-diagonal elements A_{jk} (for j \neq k) model mutual or directed excitation between types, enabling the representation of asymmetric influences. For instance, in a bivariate (d=2) process modeling competing risks such as buy and sell orders in financial markets, events of one type (e.g., a buy order) can increase the intensity of the other (sell orders) through cross-excitation, reflecting dynamics like order flow imbalances.
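A minimal sketch of this multivariate intensity in Python with NumPy (illustrative code, not from any particular library), assuming exponential kernels and a history stored as (time, type) pairs:

```python
import numpy as np

def multivariate_intensity(t, history, mu, alpha, beta):
    """Compute lambda_j(t) for all types j of a multivariate Hawkes process.

    history : list of (event_time, event_type) pairs
    mu      : array of shape (d,)   -- background rates mu_j
    alpha   : array of shape (d, d) -- excitation magnitudes alpha[j, k]
    beta    : array of shape (d, d) -- decay rates beta[j, k]
    """
    lam = np.array(mu, dtype=float)
    for t_i, k in history:
        if t_i < t:
            # An event of type k adds alpha[j, k] * exp(-beta[j, k] * (t - t_i)) to every lambda_j.
            lam += alpha[:, k] * np.exp(-beta[:, k] * (t - t_i))
    return lam

# Bivariate example: buy (type 0) and sell (type 1) orders with cross-excitation
mu = np.array([0.2, 0.2])
alpha = np.array([[0.5, 0.3],
                  [0.3, 0.5]])
beta = np.ones((2, 2)) * 1.5
history = [(0.4, 0), (0.9, 1), (1.1, 0)]
print(multivariate_intensity(1.5, history, mu, alpha, beta))
```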

Properties

Stationarity

A Hawkes process is stationary if the distribution of event times is time-invariant, meaning the statistical properties of the process remain unchanged under time shifts. This ensures that the intensity function and inter-event times exhibit consistent long-run behavior, independent of the starting time. Stationarity is a fundamental property that allows for meaningful analysis of the process's dynamics and facilitates applications in modeling recurrent events.

In the univariate case, stationarity holds when the branching ratio n = \int_0^\infty \alpha(u) \, du < 1, where \alpha(u) is the excitation kernel; for the common exponential kernel \alpha(u) = \alpha e^{-\beta u}, this simplifies to n = \alpha / \beta < 1. Under this condition, the process admits a unique stationary distribution, and the long-run mean intensity is given by \lambda^* = \mu / (1 - n), where \mu is the background rate. This finite mean rate implies that the expected number of events per unit time remains bounded, preventing unbounded growth in event frequency. If n \geq 1, the process becomes non-stationary, with the intensity diverging as time progresses, leading to an explosion in event occurrences.

For the multivariate Hawkes process, stationarity requires that the spectral radius \rho(A) < 1, where A is the integrated branching matrix with entries A_{ij} = \int_0^\infty \alpha_{ij}(u) \, du capturing the expected number of offspring of type i triggered by an event of type j. In this regime, the expected intensity vector is \mathbb{E}[\lambda] = (I - A)^{-1} \mu, where I is the identity matrix and \mu is the vector of background rates, ensuring a finite overall event rate across components. Violation of \rho(A) < 1 results in non-stationarity, where at least one component's intensity grows without bound, disrupting the process's equilibrium. This condition generalizes the univariate branching ratio and underscores the interplay between mutual excitations in multi-type events.
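The conditions above are easy to check numerically. A small sketch in Python with NumPy (illustrative, not from a specific package) that tests \rho(A) < 1 and, if stable, returns the stationary mean intensity vector (I - A)^{-1}\mu:

```python
import numpy as np

def stationary_mean_intensity(mu, alpha, beta):
    """Check stability of a multivariate exponential Hawkes process and
    return the stationary mean intensity (I - A)^{-1} mu if rho(A) < 1."""
    A = np.asarray(alpha) / np.asarray(beta)          # branching matrix A_jk = alpha_jk / beta_jk
    rho = np.max(np.abs(np.linalg.eigvals(A)))        # spectral radius
    if rho >= 1:
        raise ValueError(f"Process is non-stationary: spectral radius {rho:.3f} >= 1")
    return np.linalg.solve(np.eye(A.shape[0]) - A, np.asarray(mu, dtype=float))

# Bivariate example: each type excites itself and the other, rho(A) = 0.8 < 1
mu = np.array([0.2, 0.3])
alpha = np.array([[0.5, 0.3],
                  [0.3, 0.5]])
beta = np.ones((2, 2))
print(stationary_mean_intensity(mu, alpha, beta))    # expected event rates per unit time
```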

Clustering and branching structure

The Hawkes process exhibits a natural clustering structure that arises from its self-exciting nature, which can be interpreted through a branching framework. In this representation, each event in the process is classified either as an "immigrant," generated by the background intensity μ, or as an "offspring" triggered by a previous event. Immigrants arrive according to an independent homogeneous Poisson process with constant rate μ, serving as the roots of individual clusters. Each event, whether immigrant or offspring, independently produces a random number of direct offspring, whose arrival times follow an inhomogeneous Poisson process governed by the kernel function φ(·). The number of direct offspring follows a Poisson distribution with mean n = ∫_0^∞ φ(u) du, often termed the branching ratio or fertility rate.

The expected total size S of a cluster (number of events including the immigrant) is 1/(1 - n), provided n < 1 to prevent supercritical branching and ensure finite clusters on average. Because the offspring counts are Poisson-distributed, the cluster size S follows a Borel–Tanner distribution. This structure underscores the process's tendency to produce bursts of activity following immigrants, followed by decay.

As a cluster process, the overall Hawkes process is the superposition of these immigrant-initiated branches, where clusters are mutually independent and identically distributed. This highlights the inherent temporal dependencies, with excitations propagating through lineages but not across unrelated clusters. A typical visualization of a cluster depicts an immigrant at the base, branching into direct offspring at decaying intervals, each of which may spawn further offspring in a tree-like hierarchy, illustrating the self-reinforcing dynamics until excitations fade. This branching lens not only elucidates the clustering property but also connects the Hawkes process to classical Galton-Watson branching theory, emphasizing its role in modeling contagious phenomena.
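The cluster representation also yields a direct simulation recipe: draw immigrants from a Poisson process, then recursively draw Poisson-many offspring for each event. A minimal sketch in Python with NumPy, under the exponential kernel φ(u) = n β e^{-β u} (illustrative, not a library implementation):

```python
import numpy as np

def simulate_hawkes_clusters(mu, n, beta, T, seed=None):
    """Simulate a Hawkes process on [0, T] via its branching (cluster) representation.

    Immigrants: homogeneous Poisson process with rate mu.
    Offspring:  each event spawns Poisson(n) children at lags drawn from
                the density beta * exp(-beta * u), i.e. kernel phi(u) = n*beta*exp(-beta*u).
    Requires n < 1 so that clusters are almost surely finite."""
    rng = np.random.default_rng(seed)
    immigrants = rng.uniform(0.0, T, rng.poisson(mu * T))   # cluster roots on [0, T]
    events = []
    stack = list(immigrants)          # events whose offspring have not yet been generated
    while stack:
        parent = stack.pop()
        events.append(parent)
        # Each event independently produces Poisson(n) direct offspring.
        for _ in range(rng.poisson(n)):
            child = parent + rng.exponential(1.0 / beta)    # lag ~ Exp(beta)
            if child <= T:            # discard offspring beyond the horizon
                stack.append(child)
    return np.sort(np.array(events))

times = simulate_hawkes_clusters(mu=0.5, n=0.8, beta=1.0, T=100.0, seed=0)
print(len(times), "events; expected rate:", 0.5 / (1 - 0.8))
```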

Estimation and inference

Likelihood-based methods

Likelihood-based methods for estimating the parameters of a Hawkes process rely on maximizing the log-likelihood function derived from the point process intensity. For the univariate case, the log-likelihood \ell(\theta) for parameters \theta (typically the background intensity \mu, excitation magnitude \alpha, and decay rate \beta) based on observed event times 0 < t_1 < \cdots < t_n \leq T is given by \ell(\theta) = \sum_{i=1}^n \log \lambda(t_i ; \theta) - \int_0^T \lambda(t ; \theta) \, dt, where \lambda(t ; \theta) is the intensity function. When the kernel is exponential, \phi(u) = \alpha e^{-\beta u}, the integral term admits an exact closed form, facilitating efficient computation: it equals \mu T + \frac{\alpha}{\beta} \sum_{i=1}^n (1 - e^{-\beta (T - t_i)}), assuming the process starts at time 0. This form allows direct maximization via numerical optimization, though the objective is generally non-concave in \beta, leading to potential multiple local maxima. A code sketch of the univariate log-likelihood follows below.

To address the latent branching structure in Hawkes processes—where the parent-offspring relationships are unobserved—an expectation-maximization (EM) algorithm can be adapted by treating the branching assignments as latent variables. In the E-step, posterior probabilities of each event being triggered by a previous event (or being a background event) are computed using the current parameter estimates, often via the normalized triggering kernel contributions to the total intensity. The M-step then maximizes the expected complete-data log-likelihood, which decomposes into Poisson-like terms for background and offspring events, updating parameters separately for excitation and decay. This approach stabilizes estimation for clustered data by iteratively imputing the incomplete branching tree.

In the multivariate setting with d event types, the log-likelihood extends naturally to account for cross-excitations, incorporating coupled intensities \lambda_j(t ; \theta) for each component j = 1, \dots, d: \ell(\theta) = \sum_{j=1}^d \sum_{i : t_i^j \leq T} \log \lambda_j(t_i^j ; \theta) - \sum_{j=1}^d \int_0^T \lambda_j(t ; \theta) \, dt. Here, each \lambda_j depends on the history of all types via the excitation kernels, forming a joint likelihood over the multivariate counting processes. For exponential kernels, the integrals again yield closed forms analogous to the univariate case, involving exponentials or recursive sums over past events.

Optimization of this likelihood remains challenging due to its non-convexity, particularly in the decay parameters, which can result in ill-conditioned Hessians and sensitivity to initialization. Numerical methods such as gradient-based or quasi-Newton algorithms are commonly employed, often with multiple starting points to mitigate local optima. In high dimensions, approximations to the likelihood may be used to accelerate estimation while preserving asymptotic consistency.
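The univariate formula above can be evaluated efficiently with the closed-form compensator and an O(n) recursion for the intensity at each event (A_i = e^{-\beta(t_i - t_{i-1})}(1 + A_{i-1}), so \lambda(t_i) = \mu + \alpha A_i). A minimal Python/SciPy sketch (illustrative names, not a library API) that also fits parameters by numerical optimization:

```python
import numpy as np
from scipy.optimize import minimize

def hawkes_loglik(params, times, T):
    """Log-likelihood of a univariate Hawkes process with kernel
    phi(u) = alpha * exp(-beta * u), observed on [0, T]."""
    mu, alpha, beta = params
    if mu <= 0 or alpha <= 0 or beta <= 0:
        return -np.inf                       # invalid parameters
    t = np.asarray(times, dtype=float)
    if len(t) == 0:
        return -mu * T
    A = 0.0
    log_term = np.log(mu)                    # first event sees only the background rate
    for i in range(1, len(t)):
        A = np.exp(-beta * (t[i] - t[i - 1])) * (1.0 + A)   # O(n) recursion
        log_term += np.log(mu + alpha * A)
    compensator = mu * T + (alpha / beta) * np.sum(1.0 - np.exp(-beta * (T - t)))
    return log_term - compensator

def fit_hawkes(times, T, start=(0.5, 0.5, 1.0)):
    """Maximum likelihood estimation by numerical optimization
    (multiple starting points are advisable because of local maxima)."""
    res = minimize(lambda p: -hawkes_loglik(p, times, T), x0=np.array(start),
                   method="Nelder-Mead")
    return res.x  # estimated (mu, alpha, beta)
```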

Moment-based methods

Moment-based methods for estimating parameters of Hawkes processes rely on equating theoretical moments or covariance structures of the intensity or counting process to empirical estimates derived from observed event times. These techniques are especially applicable to stationary processes and offer scalable alternatives to likelihood maximization, particularly when likelihood computations are challenging. Under the assumption of stationarity, which ensures finite moments and a well-defined covariance structure, parameters such as the background rate \mu, branching ratio n, and decay rate \beta can be inferred by solving systems of moment equations or optimization problems.

A prominent approach is autocovariance-based estimation, where the empirical autocovariance of the intensity or of binned event counts is matched to its theoretical form. For the univariate Hawkes process with kernel \alpha(u) = n \beta e^{-\beta u}, the stationary mean intensity is \mathbb{E}[\lambda(t)] = \mu / (1 - n), and the autocovariance \gamma(\tau) = \mathrm{Cov}(\lambda(t + \tau), \lambda(t)) decays exponentially in |\tau| at rate \beta (1 - n), with variance proportional to \mu n^2 \beta / (1 - n)^2. Empirical estimates of \gamma(\tau) are obtained by binning the timeline into intervals, computing sample covariances of count increments, and then solving the resulting nonlinear equations for \mu, n, and \beta via least-squares fitting or the generalized method of moments (GMM). This method leverages the Markovian structure of the exponential kernel to yield closed-form moment expressions, facilitating straightforward parameter recovery in financial or seismic data applications.

The minimum contrast method extends this framework by minimizing a discrepancy measure between observed and model-predicted autocorrelograms, often using an integrated squared error or a spectral contrast like Whittle's approximation. For instance, the contrast function may take the form \sum_{\tau} [\hat{\gamma}(\tau) - \gamma_\theta(\tau)]^2, where \hat{\gamma}(\tau) is the empirical autocovariance and \gamma_\theta(\tau) is the theoretical one parameterized by \theta = (\mu, n, \beta). In multivariate settings, cross-covariances across dimensions are similarly matched to estimate the kernel matrix. These estimators are consistent and asymptotically normal under mild conditions, with implementations available in statistical software for efficient computation.

Moment-based methods are computationally efficient, requiring only O(n \log n) operations for large datasets via fast Fourier transforms for spectral variants, and avoid the recursive integral evaluations inherent in likelihood computations. They are thus well suited to high-frequency data where exact likelihoods become prohibitive. However, they presuppose stationarity for valid moment derivations and tend to exhibit higher variance and bias in small samples, where empirical covariances are noisy; non-exponential kernels further complicate closed-form expressions, often necessitating approximations or numerical solutions.
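The binning-and-matching step can be sketched as follows in Python with NumPy/SciPy (illustrative code; the exponential form of the fitted covariance is an assumption consistent with the exponential kernel discussed above). It computes empirical autocovariances of binned counts and fits an exponential decay by least squares; full recovery of (\mu, n, \beta) additionally uses the covariance amplitude via the model's closed-form moments.

```python
import numpy as np
from scipy.optimize import curve_fit

def binned_count_autocov(event_times, T, bin_width, max_lag):
    """Empirical autocovariance of event counts in bins of width bin_width,
    for lags 1..max_lag (measured in bins)."""
    bins = np.arange(0.0, T + bin_width, bin_width)
    counts, _ = np.histogram(event_times, bins=bins)
    counts = counts - counts.mean()
    m = len(counts)
    return np.array([np.dot(counts[:m - k], counts[k:]) / (m - k)
                     for k in range(1, max_lag + 1)])

def fit_exponential_decay(event_times, T, bin_width=1.0, max_lag=50):
    """Match the empirical count autocovariance to c * exp(-kappa * lag).
    For an exponential-kernel Hawkes process the fitted decay rate satisfies
    kappa ~ beta * (1 - n), and the mean count per bin estimates
    bin_width * mu / (1 - n)."""
    lags = np.arange(1, max_lag + 1) * bin_width
    gamma_hat = binned_count_autocov(event_times, T, bin_width, max_lag)
    model = lambda tau, c, kappa: c * np.exp(-kappa * tau)
    (c, kappa), _ = curve_fit(model, lags, gamma_hat, p0=(gamma_hat[0], 0.1),
                              maxfev=10000)
    mean_rate = len(event_times) / T
    return {"mean_rate": mean_rate, "cov_amplitude": c, "decay_rate": kappa}
```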

Applications

Finance and economics

In finance, Hawkes processes are widely applied to model limit order book dynamics, particularly by treating buy and sell orders as distinct event types in a multivariate Hawkes process to capture the clustering of order flows. This approach accounts for self-excitation within each order type and cross-excitation between them, reflecting how incoming orders trigger subsequent activity in the limit order book. For instance, empirical studies on tick-level data from futures markets demonstrate that such models reproduce observed patterns of order arrival bursts, with excitation kernels often exhibiting power-law decay to match long-memory effects in trade sequences.

A prominent example is the volatility modeling framework of Bacry et al. (2015), where a multivariate Hawkes process is used to describe the arrival of mid-price changes and orders, explaining the power-law decay in the autocorrelation of squared returns. In this setup, the process reproduces Epps effects and volatility clustering through the branching structure, with the excitation kernel's heavy tail (exponent around 0.5) leading to long-memory behavior in volatility measures. This model has been fitted to high-frequency data from major exchanges, showing superior performance over traditional GARCH models in capturing intraday patterns.

Empirical estimations from tick data across various assets reveal branching ratios n \approx 0.8, indicating a near-critical regime in which order flow is highly endogenous but remains stable. These parameters are typically obtained via maximum likelihood on order timestamps, highlighting excitation strengths that align with observed market reflexivity during normal and stressed periods, like flash crashes.

Economically, Hawkes processes inform market impact functions by quantifying how past order imbalances persistently affect future prices, with impact decaying as a power law due to decaying excitation. This has implications for liquidity provision, as models show resilience through rapid replenishment of liquidity post-shocks, aiding optimal execution strategies for large trades.

Seismology and natural events

The Hawkes process has been instrumental in modeling earthquake sequences, particularly in capturing the self-exciting nature of aftershocks following a mainshock. In seismology, the Hawkes process provides a framework to model the Omori-Utsu law, which describes the temporal decay of aftershock rates as a power law, typically \lambda(t) \approx K / (t + c)^p where p \approx 1, K is a productivity constant, and c is a short-term cutoff to avoid divergence. This power-law kernel fits the observed decay of aftershock activity, enabling simulations and forecasts of seismic clustering.

A prominent application is the Epidemic-Type Aftershock Sequence (ETAS) model, which extends the Hawkes process to a marked framework that accounts for spatial locations, magnitudes, and branching structures in earthquake catalogs. In the ETAS model, each event triggers offspring at rates modulated by the parent's magnitude via a Gutenberg-Richter-type productivity law, and the temporal kernel follows the power-law Omori-Utsu form while incorporating spatial decay. This allows for likelihood-based inference on seismicity rates and detection of anomalies in observed sequences. The foundational work on these applications came from Yosihiko Ogata, who in 1988 developed statistical models for earthquake occurrences using point process residuals and introduced extensions of the Hawkes framework to fit real seismicity data, including the ETAS formulation for analyzing clustered events in catalogs such as those from Japan.

Beyond earthquakes, analogous cascade processes in natural phenomena have been modeled with Hawkes processes; for instance, neuronal spike trains exhibit self-excitation where one neuron's firing increases the probability of subsequent spikes in connected neurons, quantified through multivariate Hawkes inference on spike timings. Similarly, epidemic outbreaks display branching patterns, with Hawkes models linking infected cases to subsequent infections, as seen in connections between self-exciting processes and SIR compartmental models for diseases such as COVID-19.

Extensions and variants

Marked processes

Marked Hawkes processes extend the standard univariate Hawkes process by associating an additional random variable, known as a mark M_i \in \mathcal{M}, with each event time T_i. This mark represents auxiliary information about the event, such as magnitude or type, allowing the model to capture dependencies between event attributes and the overall process dynamics. The marks are typically assumed to follow a conditional distribution f^*(m \mid t), which may depend on the history \mathcal{H}_t up to time t, enabling the process to model heterogeneous impacts of past events.

The intensity function for a marked Hawkes process is formulated as a marked conditional intensity \lambda^*(t, m), which specifies the rate of events at time t with mark m. This is commonly expressed as \lambda^*(t, m) = \lambda_g^*(t) f^*(m \mid t), where \lambda_g^*(t) is the ground intensity for event occurrences, analogous to the univariate case but now influenced by past marks, and f^*(m \mid t) is the conditional density of the mark given the history. The ground intensity incorporates mark-dependent triggering kernels, such as \phi(u, m) for the excitation from a past event at lag u with mark m, yielding \lambda_g^*(t) = \mu + \sum_{T_i < t} \phi(t - T_i, M_i), where \mu is the background rate. The total intensity for any event at time t is then \lambda(t) = \int_{\mathcal{M}} \lambda^*(t, m) \, dm. This structure allows past marks to modulate future event rates, providing a richer representation of self-excitation.

A prominent example arises in seismology through the Epidemic-Type Aftershock Sequence (ETAS) model, where marks correspond to earthquake magnitudes following the Gutenberg-Richter law, with density \beta e^{-\beta(m - m_0)} for m \geq m_0. Here, the triggering kernel depends on the mark via an exponential productivity term A e^{\alpha (M_i - m_0)}, making larger-magnitude events more likely to trigger aftershocks, while the temporal decay follows the Omori-Utsu law. The marked intensity becomes \lambda^*(t, m) = \beta e^{-\beta(m - m_0)} \left[ \lambda + \sum_{T_i < t} A e^{\alpha (M_i - m_0)} \frac{p-1}{c} \left(1 + \frac{t - T_i}{c}\right)^{-p} \right], capturing magnitude-dependent clustering in seismic sequences. This extension, originally adapted for earthquake modeling, highlights how marks enhance the Hawkes framework for natural event data.

Estimation for marked Hawkes processes involves adjusting the likelihood to account for both event times and marks, forming a joint log-likelihood \ell = \sum_i \log \lambda^*(T_i, M_i) - \int_0^T \lambda(s) \, ds. Direct maximization can be unstable due to the integral term and mark dependencies, so alternatives like expectation-maximization algorithms are often employed to infer parameters such as kernel forms and mark distributions. These methods ensure stability and computational feasibility, particularly in high-dimensional mark spaces.
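A direct transcription of the ETAS marked intensity above into Python with NumPy (illustrative parameter names, not tied to any seismology package; beta_gr denotes the Gutenberg-Richter exponent to avoid clashing with other uses of \beta):

```python
import numpy as np

def etas_marked_intensity(t, m, event_times, magnitudes,
                          lam, A, alpha, c, p, beta_gr, m0):
    """Marked ETAS intensity lambda*(t, m) = f(m) * [lam + sum_i A*exp(alpha*(M_i - m0))
    * (p-1)/c * (1 + (t - T_i)/c)^(-p)], with Gutenberg-Richter mark density
    f(m) = beta_gr * exp(-beta_gr * (m - m0)) for m >= m0."""
    T_i = np.asarray(event_times, dtype=float)
    M_i = np.asarray(magnitudes, dtype=float)
    mask = T_i < t
    T_i, M_i = T_i[mask], M_i[mask]
    productivity = A * np.exp(alpha * (M_i - m0))          # magnitude-dependent offspring rate
    omori = (p - 1) / c * (1.0 + (t - T_i) / c) ** (-p)    # normalized Omori-Utsu decay
    ground = lam + np.sum(productivity * omori)            # ground intensity lambda_g*(t)
    mark_density = beta_gr * np.exp(-beta_gr * (m - m0))   # Gutenberg-Richter law, m >= m0
    return mark_density * ground

# Example: intensity of a magnitude-5.0 event, some time after two past shocks
print(etas_marked_intensity(t=10.0, m=5.0,
                            event_times=[3.0, 8.0], magnitudes=[6.1, 4.8],
                            lam=0.02, A=0.5, alpha=1.0, c=0.01, p=1.2,
                            beta_gr=2.3, m0=3.0))
```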

Non-parametric forms

Non-parametric forms of the Hawkes process relax the assumption of a fixed parametric shape for the excitation kernel φ(u), enabling more flexible modeling of excitation patterns by estimating the kernel directly from data. These approaches typically employ methods such as kernel density estimation, spline-based approximations, or Bayesian techniques to capture arbitrary decay behaviors without predefined functional forms. For instance, kernel density estimation applies smoothing techniques, often with Gaussian kernels, to approximate the triggering function from observed event times, providing a data-driven estimate of the excitation contributions. Spline methods, including cubic splines or logsplines, represent the log-kernel as a smooth piecewise polynomial with knots, fitted via maximum likelihood to handle complex shapes while maintaining smoothness. Bayesian non-parametric variants, such as those using Gaussian process priors, quantify uncertainty in the kernel estimate, often leveraging cluster representations of the process for tractable inference.

The intensity function in non-parametric Hawkes processes is approximated as \lambda(t) \approx \mu + \int_{-\infty}^{t} \alpha \phi(t - s) \, dN(s), where the kernel φ is estimated non-parametrically, and regularization techniques—such as penalties in reproducing kernel Hilbert spaces (RKHS) or spline smoothing parameters—are applied to stabilize the estimate and prevent erratic fits. This form allows the background rate μ and scaling α to be jointly inferred, with the kernel shape derived from second-order statistics in symmetric cases or via expectation-maximization in general settings. A minimal EM-style sketch with a histogram kernel appears below.

A modern extension within non-parametric forms involves neural Hawkes processes, which parameterize the conditional intensity function using neural networks, such as recurrent neural networks (RNNs) or long short-term memory (LSTM) units, to capture complex, history-dependent excitations without assuming specific kernel shapes. Introduced in 2017, these models allow the intensity to self-modulate based on the sequence of past events, enabling flexible modeling of multivariate and marked processes in high dimensions. Neural variants excel in applications requiring adaptability to non-stationary patterns, such as information diffusion or neural spike trains, though they require large datasets for training and may sacrifice interpretability compared to traditional methods. Recent advances as of 2025 include drift-aware formulations to handle evolving event distributions.

A key advantage of non-parametric forms is their ability to capture complex decay structures, such as power-law tails (e.g., φ(u) ∝ u^{-β} with β ≈ 1.5), which are prevalent in high-frequency financial data and model long-memory clustering in price jumps. These methods outperform parametric alternatives in scenarios with irregular excitation patterns, as demonstrated in simulations and real-world applications such as event diffusion on social media. However, non-parametric estimation faces challenges including overfitting due to high flexibility, which necessitates techniques like cross-validation for bandwidth or knot selection in kernels and splines. Additionally, the computational demands are significant, often scaling with the number of events and requiring efficient optimization algorithms to handle large datasets.
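To illustrate one common non-parametric strategy, here is a minimal EM-style sketch (Python with NumPy; illustrative names, edge effects near the horizon ignored) that estimates a piecewise-constant ("histogram") excitation kernel by treating parent assignments as latent variables, in the spirit of the expectation-maximization approach mentioned above.

```python
import numpy as np

def em_histogram_hawkes(times, T, support=10.0, n_bins=20, n_iter=100):
    """EM-style estimation of a univariate Hawkes process with a
    piecewise-constant (histogram) excitation kernel on [0, support).

    Each event is treated as either a background event or the offspring of
    exactly one earlier event (latent branching structure). This is a sketch,
    not a production estimator: edge effects near T are ignored."""
    t = np.sort(np.asarray(times, dtype=float))
    n = len(t)
    width = support / n_bins
    mu = n / T / 2.0                     # crude initial guesses
    phi = np.full(n_bins, 0.5 / support)
    for _ in range(n_iter):
        bg_weight = np.zeros(n)          # posterior prob. that event j is background
        bin_weight = np.zeros(n_bins)    # expected number of parent-offspring lags per bin
        for j in range(n):
            lags = t[j] - t[:j]
            in_range = (lags > 0) & (lags < support)
            bins = (lags[in_range] / width).astype(int)
            contrib = phi[bins]                          # kernel value for each candidate parent
            total = mu + contrib.sum()                   # lambda(t_j) under current parameters
            bg_weight[j] = mu / total                    # E-step: branching probabilities
            np.add.at(bin_weight, bins, contrib / total)
        mu = bg_weight.sum() / T                         # M-step: background rate
        phi = bin_weight / (n * width)                   # M-step: kernel height per bin
    return mu, phi                                       # sum(phi) * width estimates the branching ratio
```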

References

  1. [1]
    [PDF] Spectra of some self-exciting and mutually exciting point processes
    In this paper the theoretical properties of a class of processes with particular reference to the point spectrum or corresponding covariance density functions ...
  2. [2]
  3. [3]
    Hawkes Processes Modeling, Inference, and Control: An Overview
    Hawkes processes are a type of point process that models self-excitement among time events. They have been used in a myriad of applications, ranging from ...
  4. [4]
    [PDF] Hawkes Processes on Social and Mass Media: - Uppsala University
    Hawkes processes are a class of point processes characterised by their self-exciting nature, meaning that when an event occurs, the probability of a new event ...
  5. [5]
    An Introduction to the Theory of Point Processes - SpringerLink
    Point processes and random measures find wide applicability in telecommunications, earthquakes, image analysis, spatial point patterns, and stereology.
  6. [6]
    [PDF] Chapter 2 - POISSON PROCESSES - MIT OpenCourseWare
    The Poisson process, as we defined it, is characterized by a constant arrival rate λ. ... non-homogeneous Poisson process with rate λ[1 − G(τ − t)] at ...
  7. [7]
    [PDF] Chapter 4 - RENEWAL PROCESSES - MIT OpenCourseWare
    Recall that a renewal process is an arrival process in which the interarrival intervals are positive,1 independent and identically distributed (IID) random ...
  8. [8]
    Renewal Process - an overview | ScienceDirect Topics
    A renewal process is a point process in which the interevent intervals are independent and drawn from the same probability density.
  9. [9]
    Spectra of some self-exciting and mutually exciting point processes
    This paper discusses theoretical properties of point processes, focusing on the point spectrum and covariance density functions, and a self-exciting process.
  10. [10]
    Using a latent Hawkes process for epidemiological modelling - PMC
    Mar 1, 2023 · The Hawkes process is a well known self-exciting process in which the intensity function depends on all previous events assuming infinite ...
  11. [11]
    Global infectious disease early warning models: An updated review ...
    The Hawkes process is a point process with self-exciting properties, used for modeling randomly occurring events. It assumes that the occurrence of an event ...
  12. [12]
    [PDF] Robust Estimation of Self-Exciting Point Process Models with ...
    Self-exciting point processes have been utilized in neuroscience in order to assess the functional connectivity of neuronal ensembles.
  13. [13]
    Magnitude‐dependent Omori law: Theory and empirical study - Ouillon
    Apr 22, 2005 · Omori's law quantifying the decay of seismic activity after a “main shock” occurring at the origin of time amounts to determining the time ...
  14. [14]
    Self-exciting point process in modeling earthquake occurrences
    The earthquakes can be regarded as point patterns that have a temporal clustering feature so we use self-exciting point process for modeling the conditional ...
  15. [15]
    [PDF] An Introduction to the Theory of Point Processes: Volume I
    Feb 5, 2018 · An Introduction to the Theory of Point Processes, Volume I: Elementary Theory and Methods, Second Edition. D. J. Daley and D. Vere-Jones.
  16. [16]
    A Cluster Process Representation of a Self-Exciting Process - jstor
  17. [17]
    Spectra of Some Self-Exciting and Mutually Exciting Point Processes
    Thus it is clear that given a set of data from a point process we cannot expect to dis- criminate between a self-exciting and a doubly stochastic process on ...
  18. [18]
    [PDF] arXiv:1302.1405v2 [q-fin.ST] 5 Jun 2013
    Jun 5, 2013 · the process is stationary with an average intensity Λ = µ/(1 − n) ≥ µ. When n = 0, the Hawkes process is a homogeneous Poisson process ...
  19. [19]
    Coarse-Grained Hawkes Processes - MDPI
    The multivariate Hawkes process is asymptotically stationary if the spectral radius of the branching matrix A := ∫_0^∞ Φ(t) dt is strictly less than ...
  20. [20]
    A cluster process representation of a self-exciting process
    Jul 14, 2016 · Daley, D. J. and Vere-Jones, D. (1971) A summary of the theory of point processes. Paper presented to the stochastic point process ...
  21. [21]
    (PDF) A Cluster Process Representation of a Self-Exciting Process
    Aug 10, 2025 · The Hawkes process, or self-exciting process, was introduced by (Hawkes, 1971; Hawkes and Oakes, 1974) , see also (Errais et al., 2010;Hawkes, ...
  22. [22]
    [1902.03714] Hawkes processes for credit indices time series analysis
    Feb 11, 2019 · ... likelihood non-convex optimization. The method was successfully tested on simulated data, then used on new publicly available real trading ...
  23. [23]
  24. [24]
    Maximum Likelihood Estimation for Hawkes Processes with self ...
    Mar 9, 2021 · In this paper, we present a maximum likelihood method for estimating the parameters of a univariate Hawkes process with self-excitation or inhibition.
  25. [25]
    [1509.02007] Hawkes and INAR($\infty$) processes - arXiv
    Sep 7, 2015 · We establish existence, uniqueness, finiteness of moments, and give formulas for the autocovariance function as well as for the joint moment- ...
  26. [26]
  27. [27]
    [1502.04592] Hawkes processes in finance - arXiv
    Hawkes processes are multivariate point processes used in finance for estimating volatility, market stability, systemic risk, and optimal execution strategies.
  28. [28]
    Branching ratio approximation for the self-exciting Hawkes process
    Mar 20, 2014 · We introduce a model-independent approximation for the branching ratio of Hawkes self-exciting point processes.
  29. [29]
    Statistical Models for Earthquake Occurrences and Residual ...
    This article discusses several classes of stochastic models for the origin times and magnitudes of earthquakes.
  30. [30]
    Closed-form modeling of neuronal spike train statistics using ...
    Hawkes processes [1] are self-exciting point processes that have been applied to the modeling of random spike trains in neuroscience in, e.g., Refs. [2–5] .
  31. [31]
    SIR-Hawkes: Linking Epidemic Models and Hawkes Processes to ...
    In this work, we present a previously unexplored connection between Hawkes point processes and SIR epidemic models.
  32. [32]
    [PDF] Space-Time Point-Process Models for Earthquake Occurrences
    2.1 Self-exciting processes. If we ... (1995) which is based on Utsu-Seki empirical law of the aftershock area in space and the modified Omori law in time.
  33. [33]
  34. [34]
    [PDF] Efficient Non-parametric Bayesian Hawkes Processes - IJCAI
    An alternative representation of the Hawkes process is a cluster of Poisson processes [Hawkes and Oakes, 1974], which categorizes points into immigrants and ...
  35. [35]
  36. [36]
  37. [37]
    Non-parametric kernel estimation for symmetric Hawkes processes ...
    May 22, 2012 · This paper defines a non-parametric method to estimate kernel shapes in symmetric Hawkes processes, using second-order properties, and applies ...