The convolution theorem is a fundamental principle in Fourier analysis: the Fourier transform of the convolution of two functions equals the pointwise product of their individual Fourier transforms, and conversely, the inverse Fourier transform of the product of two Fourier transforms equals the convolution of the original functions.[1] This relationship, often expressed as \mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}, where (f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau, enables efficient computation of convolutions in the frequency domain, particularly using fast Fourier transform algorithms.[2]

The theorem holds for functions in various domains, including time and spatial signals, and convolution is commutative, meaning f * g = g * f.[2] It extends to multidimensional cases and discrete signals, where the discrete Fourier transform (DFT) replaces the continuous version, facilitating numerical implementations.[3] Originally derived in the context of integral transforms, the theorem's proof relies on the linearity of the Fourier transform and its properties, such as the transform of a shifted function.[1]

In applications, the convolution theorem is pivotal in signal processing for tasks like filtering, where convolving a signal with an impulse response in the time domain corresponds to multiplying their spectra in the frequency domain, simplifying deconvolution and system analysis.[2] It also underpins image processing techniques, such as blurring or template matching, by converting computationally intensive spatial convolutions into efficient frequency-domain multiplications.[3] More broadly, in machine learning the theorem supports convolutional neural networks (CNNs) by enabling fast computation of large-kernel convolutions via the fast Fourier transform (FFT), reducing complexity from O(N^2 M^2) to O(N^2 \log N) for kernel size M and input size N.[3] These uses highlight the theorem's role in transforming integral operations into algebraic ones, advancing fields from physics to computer vision.[1]
Core statements
Continuous aperiodic case
In the continuous aperiodic case, the convolution of two functions f, g \in L^1(\mathbb{R}) (the space of absolutely integrable functions on the real line) is defined as

(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau.

This operation measures the overlap between f and a reflected, shifted version of g, and it extends to L^2(\mathbb{R}) (square-integrable functions) via density arguments and Plancherel's theorem.[4][5]

The convolution theorem states that the Fourier transform of the convolution equals the pointwise product of the individual Fourier transforms:

\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{f\}(\omega) \cdot \mathcal{F}\{g\}(\omega),

where the Fourier transform is defined as

\mathcal{F}\{f\}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-2\pi i \omega t} \, dt.

This holds for functions in L^1(\mathbb{R}) by direct computation, using Fubini's theorem to interchange the integrals, and for L^2(\mathbb{R}) by continuity of the Fourier transform as a unitary operator on that space.[4][1]

The inverse form of the theorem asserts the duality: the inverse Fourier transform of the product of the transforms yields the convolution,

f * g = \mathcal{F}^{-1} \left\{ \mathcal{F}\{f\} \cdot \mathcal{F}\{g\} \right\}.

Here the inverse transform is

\mathcal{F}^{-1}\{\hat{f}\}(t) = \int_{-\infty}^{\infty} \hat{f}(\omega) e^{2\pi i \omega t} \, d\omega,

ensuring the operation is reversible under suitable integrability conditions, such as when \mathcal{F}\{f\} and \mathcal{F}\{g\} are in L^1(\mathbb{R}). This duality simplifies computations in signal processing and partial differential equations by converting integration in the time domain to multiplication in the frequency domain.[4][1]

An illustrative example is the convolution of two Gaussian functions f(t) = g(t) = e^{-\pi t^2}, each of which is its own Fourier transform: \mathcal{F}\{f\}(\omega) = e^{-\pi \omega^2}, and similarly for g. By the theorem, \mathcal{F}\{f * g\}(\omega) = e^{-2\pi \omega^2}, so the inverse transform yields (f * g)(t) = \frac{1}{\sqrt{2}} e^{-\pi t^2 / 2}, another Gaussian whose variance is the sum of the originals'. More generally, for f(t) = e^{-a t^2} and g(t) = e^{-b t^2} with a, b > 0, the result is (f * g)(t) = \sqrt{\frac{\pi}{a+b}} e^{-\frac{a b t^2}{a+b}}, demonstrating how the product of Gaussian transforms produces a convolved Gaussian.[2][4]

The convolution theorem for the Fourier transform was first linked to the transform by Émile Borel in 1899 through work on divergent series and was fully established by Percy Daniell in 1920 via generalizations of Stieltjes–Volterra products.[6]
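The general Gaussian identity above can be checked numerically; a quick sketch using numpy (the values of a and b are illustrative), approximating the convolution integral by a Riemann sum on a grid:

```python
import numpy as np

# Numerically verify (f*g)(t) = sqrt(pi/(a+b)) * exp(-a*b*t^2/(a+b))
# for f(t) = exp(-a t^2), g(t) = exp(-b t^2); a, b are illustrative values.
a, b = 1.5, 0.7
t = np.linspace(-10, 10, 4001)
dt = t[1] - t[0]
f = np.exp(-a * t**2)
g = np.exp(-b * t**2)

# Discrete approximation of the convolution integral on the same grid.
conv = np.convolve(f, g, mode="same") * dt

closed_form = np.sqrt(np.pi / (a + b)) * np.exp(-a * b * t**2 / (a + b))
print(np.max(np.abs(conv - closed_form)))  # tiny discretization error
```

The symmetric grid makes `mode="same"` line up exactly with the time axis, so the discrete sum matches the closed form to near machine precision.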
Discrete aperiodic case
In the discrete aperiodic case, the convolution of two sequences f and g, defined over the integers \mathbb{Z}, is given by

(f * g)[n] = \sum_{k=-\infty}^{\infty} f[k] \, g[n - k]

for all n \in \mathbb{Z}, where the sequences belong to the space \ell^1(\mathbb{Z}) (absolutely summable) or \ell^2(\mathbb{Z}) (square-summable).[7] This operation models the output of a linear time-invariant system in discrete-time signal processing when one sequence acts as the input and the other as the impulse response.[8]

The discrete-time Fourier transform (DTFT) provides the frequency-domain representation for such aperiodic sequences, defined as

\mathcal{F}\{f\}(\omega) = F(\omega) = \sum_{n=-\infty}^{\infty} f[n] \, e^{-j \omega n},

where \omega \in [-\pi, \pi] is the normalized angular frequency; similarly for g, yielding G(\omega).[9]

The convolution theorem states that the DTFT of the convolved sequence is the pointwise product of the individual DTFTs:

\mathcal{F}\{f * g\}(\omega) = F(\omega) \cdot G(\omega).[7][8]

This property simplifies computations by transforming time-domain convolution into frequency-domain multiplication, which is particularly efficient for long sequences in digital signal processing applications.

The inverse relation follows from the inverse DTFT, which reconstructs the time-domain sequence via f * g = \mathcal{F}^{-1}\{F \cdot G\}, where the inverse transform is

(f * g)[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(\omega) G(\omega) e^{j \omega n} \, d\omega.[9]

This bidirectional equivalence holds under the specified sequence spaces, enabling seamless switching between domains.

For convergence, the DTFT of a sequence in \ell^1(\mathbb{Z}) is guaranteed to exist and be continuous for all \omega, as absolute summability ensures the infinite sum converges uniformly.[10] For \ell^2(\mathbb{Z}) sequences, the DTFT exists in the mean-square sense but may exhibit discontinuities, though the convolution theorem still applies when both inputs are in these spaces.[10]

A representative example: consider the convolution of two identical rectangular sequences, each of length M+1, defined as x[n] = 1 for 0 \leq n \leq M and 0 otherwise. The time-domain convolution yields a triangular sequence of length 2M+1, peaking at the value M+1 and tapering linearly on both sides. In the frequency domain, the DTFT of each rectangular sequence is the Dirichlet kernel X(\omega) = \frac{\sin(\omega (M+1)/2)}{\sin(\omega /2)} e^{-j \omega M /2}, so the DTFT of the convolution is X(\omega)^2, whose magnitude |X(\omega)|^2 is the squared Dirichlet kernel, demonstrating how multiplication produces the frequency response of the triangular output.[11][7]
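The rectangle-to-triangle example can be reproduced in a few lines of numpy; the DTFT is sampled on a uniform frequency grid via a zero-padded FFT (the values of M and the grid size N are illustrative):

```python
import numpy as np

# Convolving two length-(M+1) rectangular sequences gives a triangular
# sequence of length 2M+1 that peaks at M+1.
M = 4
rect = np.ones(M + 1)
tri = np.convolve(rect, rect)            # peak value M+1 at index n = M
print(tri)                                # [1. 2. 3. 4. 5. 4. 3. 2. 1.]

# Sample the DTFT on a uniform grid with a zero-padded FFT and check
# that the transform of the triangle is the squared transform of the rect.
N = 64
R = np.fft.fft(rect, N)
T = np.fft.fft(tri, N)
print(np.allclose(T, R * R))              # True
```

Zero-padding to N = 64 points works because N exceeds the length 2M+1 of the linear convolution, so no wrap-around occurs.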
Periodic variants
Continuous periodic functions
For P-periodic continuous functions f and g, the periodic convolution is defined by

(f \ast g)(t) = \frac{1}{P} \int_{0}^{P} f(\tau) g(t - \tau) \, d\tau.

This operation produces another P-periodic function, capturing the overlap of f and g over one period while remaining consistent with Fourier analysis on the circle. The normalization factor 1/P matches the integral to the measure of the fundamental domain [0, P], preventing scaling issues in the frequency domain.[12]

The Fourier series of a P-periodic function h is h(t) = \sum_{n=-\infty}^{\infty} c_n(h) e^{2\pi i n t / P}, where the complex coefficients are

c_n(h) = \frac{1}{P} \int_{0}^{P} h(t) e^{-2\pi i n t / P} \, dt.

The convolution theorem adapts to this setting by stating that the Fourier coefficients of the periodic convolution satisfy

c_n(f \ast g) = c_n(f) \, c_n(g)

for every integer n. This pointwise multiplication in the frequency domain simplifies computations for periodic signals, such as filtering or system responses in engineering contexts. The theorem holds under mild conditions, such as square-integrability over the period, ensuring the integrals converge.[12][13]

This periodic formulation relates to the aperiodic Fourier transform through the Dirac comb, or Shah function, \Sh_P(t) = \sum_{k=-\infty}^{\infty} \delta(t - kP). A P-periodic function can be viewed as the convolution of an aperiodic function supported on [0, P] with \Sh_P, yielding the infinite periodic extension. In the ordinary-frequency convention used here, the Fourier transform of \Sh_P is (1/P) \Sh_{1/P}(\omega), another comb, so the transform of the periodic function becomes the aperiodic transform sampled by this comb. Applying the convolution theorem in the transform domain then recovers the coefficient multiplication after sampling at the discrete frequencies n/P.[14]

A concrete example illustrates the theorem: the periodic convolution of two P-periodic square waves, each alternating between +1/2 and -1/2 with period P (the standard odd square wave), produces a P-periodic triangular wave ranging from -1/4 to 1/4. The square wave has Fourier coefficients c_n = -i/(n\pi) for odd n and 0 otherwise, decaying as 1/|n|; their product yields c_n(f \ast g) = -1/(n^2 \pi^2) for odd n, decaying as 1/n^2, matching the known series for the triangular wave and highlighting the smoothing effect of convolution.[15][16]

The 1/P factor in the convolution definition is essential for the theorem's clean form, as it mirrors the normalization in the coefficient integral. Omitting it would introduce an extraneous factor of P, giving c_n(f \ast g) = P \, c_n(f) \, c_n(g), complicating reconstructions and applications like signal processing. This choice ensures the periodic case parallels the aperiodic transform limit as P \to \infty.[12]
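The coefficient identity c_n(f \ast g) = c_n(f) c_n(g) can be checked numerically by sampling one period: the DFT of the samples divided by the sample count approximates the Fourier coefficients, and the normalized circular convolution approximates the periodic convolution (the period P = 1 and sample count N are illustrative):

```python
import numpy as np

# Check c_n(f ∗ g) = c_n(f) c_n(g) for the periodic convolution, using
# N samples of one period (P = 1) of an odd square wave of amplitude 1/2.
N = 1024
t = np.arange(N) / N
sq = np.where(t < 0.5, 0.5, -0.5)

# Periodic convolution (1/P) ∫ f(τ) g(t-τ) dτ  ≈  circular convolution / N.
conv = np.fft.ifft(np.fft.fft(sq) * np.fft.fft(sq)).real / N
print(conv.max())                         # peak of the triangular wave

c_sq = np.fft.fft(sq) / N                 # approximate coefficients c_n
c_conv = np.fft.fft(conv) / N
print(np.allclose(c_conv, c_sq * c_sq))   # True
```

The resulting `conv` is the triangular wave, and its coefficients are exactly the squared square-wave coefficients on the sampled grid.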
Discrete circular convolution
In the discrete circular convolution, two finite-length sequences f and g, each of length N, are convolved using periodic extension with wrap-around, defined as

(f \circledast g)[n] = \sum_{k=0}^{N-1} f[k] \, g[(n - k) \bmod N], \quad 0 \leq n < N.

This operation treats the sequences as periodic with period N, incorporating wrap-around (aliasing) effects from the modulo arithmetic that distinguish it from linear convolution.[17]

The convolution theorem for the discrete Fourier transform (DFT) states that the DFT of the circular convolution equals the pointwise product of the individual DFTs:

\text{DFT}\{f \circledast g\}[k] = \text{DFT}\{f\}[k] \cdot \text{DFT}\{g\}[k], \quad 0 \leq k < N,

where the DFT is given by

X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-2\pi i k n / N}.

This equivalence holds because the DFT basis vectors are eigenvectors of the circulant matrix representing circular convolution.[17][18]

Conversely, the circular convolution can be recovered via the inverse DFT (IDFT):

f \circledast g = \text{IDFT} \bigl\{ \text{DFT}\{f\} \cdot \text{DFT}\{g\} \bigr\},

with the IDFT defined as

x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] \, e^{2\pi i k n / N}.

This relation enables efficient computation using fast Fourier transform (FFT) algorithms, reducing complexity from O(N^2) for direct summation to O(N \log N).[17][18]

A representative example involves the circular convolution of two 4-point sequences h = \{-2, -1, 3, 1\} and x = \{-1, 0, 2, 1\}. Their DFTs are H = \{1, -5 + 2i, 1, -5 - 2i\} and X = \{2, -3 + i, 0, -3 - i\}, respectively. The pointwise product is Y = H \cdot X = \{2, 13 - 11i, 0, 13 + 11i\}, and the IDFT yields y = \{7, 6, -6, -5\}, matching direct circular summation. This demonstrates the utility of FFT-based filtering for periodic or block-processed signals.[19]

To compute linear convolution without aliasing using circular convolution, both sequences are zero-padded to length at least N_1 + N_2 - 1, where N_1 and N_2 are the original lengths, ensuring the wrap-around does not overlap valid terms.[17][20]
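The 4-point example above can be verified directly with numpy's FFT routines, comparing against brute-force circular summation:

```python
import numpy as np

# Verify the 4-point circular-convolution example:
# IDFT(DFT(h) · DFT(x)) should give y = {7, 6, -6, -5}.
h = np.array([-2.0, -1.0, 3.0, 1.0])
x = np.array([-1.0, 0.0, 2.0, 1.0])

H = np.fft.fft(h)
X = np.fft.fft(x)
y = np.fft.ifft(H * X).real
print(np.round(y))                        # [ 7.  6. -6. -5.]

# Direct circular summation for comparison.
N = len(h)
y_direct = np.array([sum(h[k] * x[(n - k) % N] for k in range(N))
                     for n in range(N)])
print(np.allclose(y, y_direct))           # True
```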
Proofs
Continuous case derivation
The derivation for the continuous aperiodic case assumes that f, g \in L^1(\mathbb{R}), the space of Lebesgue integrable functions on the real line, which guarantees the absolute convergence of the relevant integrals.[21] Under this assumption the convolution f * g is well-defined and also belongs to L^1(\mathbb{R}), with \|f * g\|_1 \leq \|f\|_1 \|g\|_1.[21]

Consider the Fourier transform of the convolution,

\mathcal{F}\{f * g\}(\omega) = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(\tau) g(t - \tau) \, d\tau \right) e^{-2\pi i \omega t} \, dt.[21]

The integrand satisfies |f(\tau) g(t - \tau) e^{-2\pi i \omega t}| = |f(\tau) g(t - \tau)|, and its double integral over \mathbb{R}^2 equals \|f\|_1 \|g\|_1 < \infty, so Fubini's theorem permits interchanging the order of integration:[21]

\mathcal{F}\{f * g\}(\omega) = \int_{\mathbb{R}} f(\tau) \left( \int_{\mathbb{R}} g(t - \tau) e^{-2\pi i \omega t} \, dt \right) d\tau.[21]

For the inner integral, perform the change of variables s = t - \tau (so dt = ds and t = s + \tau), yielding

\int_{\mathbb{R}} g(s) e^{-2\pi i \omega (s + \tau)} \, ds = e^{-2\pi i \omega \tau} \int_{\mathbb{R}} g(s) e^{-2\pi i \omega s} \, ds = \mathcal{F}\{g\}(\omega) \, e^{-2\pi i \omega \tau}.[21]

Substituting back gives

\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{g\}(\omega) \int_{\mathbb{R}} f(\tau) e^{-2\pi i \omega \tau} \, d\tau = \mathcal{F}\{f\}(\omega) \, \mathcal{F}\{g\}(\omega),[21]

establishing the theorem.

The result extends to functions in L^2(\mathbb{R}) via the Plancherel theorem, which shows that the Fourier transform is a unitary operator (an isometry) on L^2(\mathbb{R}).[22] Specifically, the Schwartz functions (smooth with rapid decay) are dense in L^2(\mathbb{R}), the theorem holds on this dense subspace by the L^1 case (as Schwartz functions lie in L^1 \cap L^2), and continuity of the Fourier transform on L^2 yields the extension.[22]

An alternative derivation uses the Fourier inversion formula, which recovers g(t - \tau) as the inverse transform of \mathcal{F}\{g\}.[23] Substituting into the convolution integral and interchanging orders of integration (justified by Tonelli's theorem for nonnegative integrands or Fubini's theorem for integrable ones), with the change of variables \theta = t - \tau, leads to the inverse transform of the product \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}, confirming the theorem.[23]

For functions in L^2(\mathbb{R}) but not in L^1(\mathbb{R}), the Fourier transforms are defined only in the L^2 sense (not pointwise), and convolutions may require approximation by mollifiers; the theorem then holds with equality in the L^2 norm, with convergence handled via the Plancherel isometry.[22]
Discrete case derivation
The discrete convolution theorem for aperiodic signals states that the discrete-time Fourier transform (DTFT) of the convolution of two sequences equals the pointwise product of their individual DTFTs, under suitable convergence conditions.[7] Consider two discrete-time signals f and g with DTFTs \mathcal{F}\{f\}(\omega) = F(\omega) and \mathcal{F}\{g\}(\omega) = G(\omega), where the convolution is defined as

(f * g)[n] = \sum_{k=-\infty}^{\infty} f[k] \, g[n - k].

The DTFT of the convolution is

\mathcal{F}\{f * g\}(\omega) = \sum_{n=-\infty}^{\infty} (f * g)[n] \, e^{-i \omega n} = \sum_{n=-\infty}^{\infty} \left( \sum_{k=-\infty}^{\infty} f[k] \, g[n - k] \right) e^{-i \omega n}.

Assuming absolute summability, i.e., \sum_n |f[n]| < \infty and \sum_n |g[n]| < \infty, the double summation may be reordered by Fubini's theorem for sums.[7] This yields

\mathcal{F}\{f * g\}(\omega) = \sum_{k=-\infty}^{\infty} f[k] \sum_{n=-\infty}^{\infty} g[n - k] \, e^{-i \omega n}.

For the inner sum, substitute m = n - k, so n = m + k and

\sum_{n=-\infty}^{\infty} g[n - k] \, e^{-i \omega n} = e^{-i \omega k} \sum_{m=-\infty}^{\infty} g[m] \, e^{-i \omega m} = e^{-i \omega k} G(\omega).

Thus

\mathcal{F}\{f * g\}(\omega) = G(\omega) \sum_{k=-\infty}^{\infty} f[k] \, e^{-i \omega k} = F(\omega) G(\omega).

For periodic variants, the discrete Fourier transform (DFT) applies to finite-length sequences of length N, where convolution becomes circular:

(f \circledast g)[n] = \sum_{k=0}^{N-1} f[k] \, g[(n - k) \bmod N].

The DFT of this circular convolution equals the product of the individual DFTs, leveraging the orthogonality of the complex exponentials.[24] Specifically, the DFT basis functions satisfy

\sum_{n=0}^{N-1} e^{2\pi i (k - l) n / N} = N \, \delta_{k \equiv l \pmod N},

which follows from the sum of a finite geometric series: for k \not\equiv l \pmod N, the sum is \frac{1 - e^{2\pi i (k-l)}}{1 - e^{2\pi i (k-l)/N}} = 0, and for k \equiv l \pmod N it is N.

To derive the theorem, compute the DFT of f \circledast g:

\mathcal{F}\{f \circledast g\}[m] = \sum_{n=0}^{N-1} \left( \sum_{k=0}^{N-1} f[k] \, g[(n - k) \bmod N] \right) e^{-2\pi i m n / N}.

Reordering gives

\sum_{k=0}^{N-1} f[k] \sum_{n=0}^{N-1} g[(n - k) \bmod N] \, e^{-2\pi i m n / N} = \sum_{k=0}^{N-1} f[k] \, e^{-2\pi i m k / N} \sum_{l=0}^{N-1} g[l] \, e^{-2\pi i m l / N},

using the shift property and periodicity, resulting in \mathcal{F}\{f\}[m] \cdot \mathcal{F}\{g\}[m].[24]

The theorem extends to square-summable sequences (\ell^2) via density arguments and an analog of Parseval's theorem, which equates the energy in the time and frequency domains: \sum_{n=-\infty}^{\infty} |f[n]|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(\omega)|^2 \, d\omega.[25] Finite-support sequences (in \ell^1 \cap \ell^2) satisfy the theorem directly, and \ell^2 sequences are approximated densely by these, giving the result by continuity of the DTFT on \ell^2.[25]
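The orthogonality relation that the DFT derivation rests on is easy to check numerically for a small N (the choice N = 8 is illustrative):

```python
import numpy as np

# Check the DFT orthogonality relation used in the derivation:
# sum_{n=0}^{N-1} exp(2*pi*i*(k-l)*n/N) = N if k ≡ l (mod N), else 0.
N = 8
n = np.arange(N)
k = np.arange(N).reshape(-1, 1)
l = np.arange(N).reshape(1, -1)

# G[k, l] holds the sum for every pair (k, l), computed by broadcasting.
G = np.exp(2j * np.pi * (k - l)[:, :, None] * n / N).sum(axis=2)
print(np.allclose(G, N * np.eye(N)))      # True: N on the diagonal, 0 elsewhere
```

Since 0 ≤ k, l < N here, k ≡ l (mod N) reduces to k = l, so the matrix of sums is N times the identity.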
Generalizations
Inverse Fourier transform
The inverse Fourier transform of the pointwise product of two Fourier transforms equals the convolution of the original functions, providing the dual form of the convolution theorem. Specifically, if F = \mathcal{F}\{f\} and G = \mathcal{F}\{g\}, then

\mathcal{F}^{-1}\{ F \cdot G \} = f * g,

where the convolution is defined as (f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau, assuming f and g are sufficiently integrable, for example in the Schwartz class or L^1(\mathbb{R}).[1][4]

This inverse form follows directly from the forward convolution theorem and the Fourier inversion theorem, which ensures that applying the forward and inverse transforms recovers the original function under appropriate conditions. Taking the inverse Fourier transform of both sides of the forward theorem \mathcal{F}\{f * g\} = F \cdot G, and invoking the linearity and invertibility of the transform, establishes the equivalence, highlighting the symmetry between the time and frequency domains.[1][26]

In the discrete setting, the analogous result holds for finite sequences using the discrete Fourier transform (DFT) and its inverse (IDFT).
For sequences x and y of length N, the circular convolution satisfies

\text{IDFT}\{ \text{DFT}\{x\} \cdot \text{DFT}\{y\} \} = x \circledast y,

where (x \circledast y)[n] = \sum_{m=0}^{N-1} x[m] \, y[(n - m) \bmod N], with zero-padding applied when a linear convolution is desired, so that wrap-around effects do not corrupt the result.[27][28]

A practical illustration arises in signal smoothing: multiplying the Fourier transform of a signal by a low-pass filter's transfer function (e.g., a rectangular window in frequency) attenuates high frequencies, and the inverse transform yields a time-domain signal smoothed by convolution with the filter's impulse response, in this case a sinc function.[4] This duality is particularly advantageous in filter design, as pointwise multiplication in the frequency domain is computationally simpler than direct time-domain convolution, enabling efficient implementation via fast Fourier transform algorithms.[27]
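The zero-padding recipe for turning circular into linear convolution can be sketched with numpy (the short sequences are illustrative): padding both inputs to length N_1 + N_2 - 1 before the DFT makes the IDFT of the product reproduce the linear convolution exactly.

```python
import numpy as np

# Linear convolution via zero-padded circular convolution: pad both
# sequences to length N1 + N2 - 1 so wrap-around cannot overlap valid terms.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0])
L = len(x) + len(y) - 1                   # 4

lin = np.fft.ifft(np.fft.fft(x, L) * np.fft.fft(y, L)).real
print(np.round(lin))                      # [ 4. 13. 22. 15.]
print(np.allclose(lin, np.convolve(x, y)))  # True
```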
Tempered distributions
Tempered distributions are continuous linear functionals on the Schwartz space \mathcal{S}(\mathbb{R}), the space of smooth functions that decay rapidly at infinity along with all their derivatives.[29] These distributions, denoted \mathcal{S}'(\mathbb{R}), extend the notion of functions to include generalized objects like the Dirac delta, and the Fourier transform extends to them via \langle \hat{f}, \phi \rangle = \langle f, \hat{\phi} \rangle for \phi \in \mathcal{S}(\mathbb{R}).[30]

The convolution theorem generalizes to tempered distributions as follows: if f \in \mathcal{S}'(\mathbb{R}) is a tempered distribution and g \in C_c^\infty(\mathbb{R}) is a smooth function with compact support, then the convolution f * g is a smooth function defined by (f * g)(x) = \langle f, \tau_x \tilde{g} \rangle, where \tilde{g}(y) = g(-y) and \tau_x \tilde{g}(y) = g(x - y), and the Fourier transform satisfies \mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\} in the distributional sense.[29] This product is well-defined because \mathcal{F}\{g\} is a Schwartz function, which multiplies smoothly with any tempered distribution.[30]

Convolution between two tempered distributions is defined provided at least one has compact support, ensuring the result is again a tempered distribution; for instance, if f \in \mathcal{E}'(\mathbb{R}) (the compactly supported distributions) and g \in \mathcal{S}'(\mathbb{R}), then f * g \in \mathcal{S}'(\mathbb{R}).[31] A key property is that the Dirac delta \delta, defined by \langle \delta, \phi \rangle = \phi(0), acts as the identity under convolution: \delta * f = f for any tempered distribution f.[32]

As an illustrative example, since the Fourier transform of \delta is the constant function 1 (with appropriate normalization), the theorem yields \mathcal{F}\{\delta * f\} = \mathcal{F}\{\delta\} \cdot \mathcal{F}\{f\} = 1 \cdot \mathcal{F}\{f\} = \mathcal{F}\{f\}, consistent with \delta * f = f.[32]

A proof sketch leverages the density of Schwartz functions in suitable topologies and the pairing definition: for a test function \phi \in \mathcal{S}(\mathbb{R}), \langle f * g, \phi \rangle = \langle f, \tilde{g} * \phi \rangle, where \tilde{g} * \phi is a Schwartz function; applying the Fourier transform to both sides and using the known theorem for Schwartz functions gives \langle \mathcal{F}\{f * g\}, \hat{\phi} \rangle = \langle \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}, \hat{\phi} \rangle, establishing equality in the distributional sense.[29]
Multidimensional extensions
The convolution theorem extends naturally to functions defined on \mathbb{R}^d or \mathbb{Z}^d, generalizing the one-dimensional case d = 1. In the continuous setting, the multidimensional convolution of two integrable functions f and g is defined as

(f * g)(\mathbf{x}) = \int_{\mathbb{R}^d} f(\mathbf{y}) g(\mathbf{x} - \mathbf{y}) \, d\mathbf{y},

where \mathbf{x}, \mathbf{y} \in \mathbb{R}^d.[33]

The multidimensional Fourier transform of a function f: \mathbb{R}^d \to \mathbb{C} is given by

\mathcal{F}\{f\}(\boldsymbol{\omega}) = \int_{\mathbb{R}^d} f(\mathbf{x}) e^{-2\pi i \boldsymbol{\omega} \cdot \mathbf{x}} \, d\mathbf{x},

with \boldsymbol{\omega} \in \mathbb{R}^d and \cdot denoting the dot product; the inverse transform is analogous with the positive exponent.[33] Under suitable integrability conditions (e.g., f, g \in L^1(\mathbb{R}^d)), the convolution theorem states that the Fourier transform of the convolution equals the pointwise product of the individual transforms:

\mathcal{F}\{f * g\}(\boldsymbol{\omega}) = \mathcal{F}\{f\}(\boldsymbol{\omega}) \cdot \mathcal{F}\{g\}(\boldsymbol{\omega}).

This property holds pointwise in the frequency domain, facilitating efficient computation for higher-dimensional signals such as those in imaging or physics simulations.[34]

In the discrete case, particularly for multidimensional arrays like digital images, the theorem applies via the multidimensional discrete Fourier transform (DFT).
For two M \times N arrays f(x, y) and h(x, y) in two dimensions (d = 2), the 2D circular convolution is computed efficiently in the frequency domain: the 2D DFT of the convolution equals the product of the 2D DFTs, F(u, v) \cdot H(u, v), and an inverse 2D DFT recovers the spatial result.[35] To avoid wraparound artifacts from the implied periodicity, zero-padding is typically applied to extend the arrays to at least (2M-1) \times (2N-1).[35] This is widely used in image processing for filtering operations, where the fast Fourier transform (FFT) implementation reduces complexity from O(M^2 N^2) to O(M N \log(M N)).[35]

A representative application is 2D Gaussian convolution for image blurring, with the radially symmetric kernel G(\mathbf{x}) = \frac{1}{2\pi \sigma^2} e^{-\frac{\|\mathbf{x}\|^2}{2\sigma^2}}. The 2D Fourier transform of this kernel is also Gaussian, \mathcal{F}\{G\}(\boldsymbol{\omega}) = e^{-2\pi^2 \sigma^2 \|\boldsymbol{\omega}\|^2}, which is radial (depending only on \|\boldsymbol{\omega}\|) and acts as a low-pass filter by attenuating high frequencies.[36] Convolving an image I with G thus multiplies \mathcal{F}\{I\} by \mathcal{F}\{G\} in the frequency domain, yielding a blurred image upon inversion; larger \sigma widens the Gaussian in space and narrows it in frequency, removing finer detail.[36]

For separable functions, f(\mathbf{x}) = \prod_{k=1}^d f_k(x_k) in Cartesian coordinates, the multidimensional Fourier transform factors as \mathcal{F}\{f\}(\boldsymbol{\omega}) = \prod_{k=1}^d \mathcal{F}\{f_k\}(\omega_k), and the convolution theorem applies coordinate by coordinate.[33] This separability reduces higher-dimensional transforms to iterated one-dimensional ones, simplifying computations in applications like multidimensional filtering.[33]
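A minimal sketch of FFT-based Gaussian blurring, assuming circular (periodic) boundary handling and a random array standing in for a real image; the 64-pixel size and σ = 2 are illustrative:

```python
import numpy as np

# Blur an "image" by multiplying its 2D FFT with the FFT of a periodized
# Gaussian kernel (circular convolution; real images would be padded).
rng = np.random.default_rng(0)
img = rng.random((64, 64))

sigma = 2.0
yy, xx = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
# Center the kernel at (0, 0) with wrap-around so no phase shift appears.
dy = np.minimum(yy, 64 - yy)
dx = np.minimum(xx, 64 - xx)
kernel = np.exp(-(dx**2 + dy**2) / (2 * sigma**2))
kernel /= kernel.sum()                    # unit DC gain preserves the mean

blurred = np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel)).real
print(blurred.shape, np.isclose(blurred.mean(), img.mean()))
```

Because the kernel sums to one, the DC component is untouched (the mean is preserved), while higher spatial frequencies are attenuated, reducing the image's variance.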
Other integral transforms
The convolution theorem extends to other integral transforms, such as the Laplace transform and the Z-transform, which share the property that the transform of a convolution equals the product of the individual transforms, with adaptations for their domains.[37]

For the Laplace transform, the theorem applies to causal functions, where the convolution is the one-sided integral (f * g)(t) = \int_0^t f(\tau) g(t - \tau) \, d\tau for t \geq 0, and the transform satisfies \mathcal{L}\{f * g\}(s) = \mathcal{L}\{f\}(s) \cdot \mathcal{L}\{g\}(s).[38] This formulation is particularly useful in analyzing linear time-invariant systems, where the output response is the convolution of an input with the system's impulse response, transforming to the product of their Laplace transforms (the transfer function).[39]

The Z-transform, the discrete analog of the Laplace transform, exhibits the same property for sequences: the Z-transform of the one-sided convolution (f * g)[n] = \sum_{k=0}^n f[k] \, g[n - k] is Z\{f * g\}(z) = Z\{f\}(z) \cdot Z\{g\}(z).[40][41]

Unlike the bilateral Fourier transform, both the Laplace and Z-transforms are typically unilateral (one-sided), integrating or summing from zero onward, which suits them to stability analysis of causal systems and to initial-value problems but lacks the full inverse duality of the Fourier case.[42] These theorems were developed in the context of control theory after the Fourier transform's establishment, with the unilateral Laplace transform gaining prominence through applications in electrical engineering and system dynamics during the early 20th century.[43][44]
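For finite sequences, the Z-transform is a polynomial in z^{-1}, so its convolution theorem is just polynomial multiplication; a quick numpy illustration (the coefficient sequences are illustrative):

```python
import numpy as np

# The Z-transform of a finite sequence is a polynomial in z^{-1}, so the
# convolution theorem mirrors polynomial multiplication: convolving the
# coefficient sequences multiplies the polynomials.
f = [1.0, 2.0, 1.0]        # F(z) = 1 + 2 z^{-1} + z^{-2}
g = [1.0, -1.0]            # G(z) = 1 - z^{-1}

fg = np.convolve(f, g)
print(fg)                   # [ 1.  1. -1. -1.]  — coefficients of F(z) G(z)

# np.polymul multiplies coefficient arrays the same way.
print(np.allclose(fg, np.polymul(f, g)))  # True
```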
Applications
Signal processing
In signal processing, the convolution theorem plays a central role in the analysis and implementation of linear time-invariant (LTI) systems, where the output signal is obtained by convolving the input signal with the system's impulse response. This operation, expressed as y[n] = \sum_{k=-\infty}^{\infty} x[k] \, h[n - k] for discrete-time signals, corresponds in the frequency domain to multiplication of the discrete-time Fourier transforms: Y(\omega) = X(\omega) H(\omega). This duality enables filter design by specifying the desired frequency response H(\omega) and deriving the impulse response h[n] via the inverse transform, facilitating efficient implementation of filters such as equalizers and noise reducers.[45]

A key practical advantage comes from accelerating convolution with the fast Fourier transform (FFT), which reduces the complexity from O(N^2) for direct time-domain convolution of length-N signals to O(N \log N): transform to the frequency domain, multiply the spectra, and apply the inverse FFT. This FFT-based approach, built on the convolution theorem, is essential for processing long signals in real-time applications such as echo cancellation and spectral analysis.[46][47]

For instance, a low-pass filter can be realized by multiplying the input's spectrum by a rectangular window that attenuates high frequencies; the equivalent time-domain operation is convolution with the sinc sequence h[n] = \frac{\sin(\omega_c n)}{\pi n} (where \omega_c is the cutoff frequency), smoothing the signal while preserving low-frequency components. This method is widely used in audio equalization to remove unwanted high-frequency noise.[48][49]

To handle long signals and mitigate artifacts from the circular convolution inherent in finite DFTs, techniques like overlap-add and overlap-save are employed.
In the overlap-add method, the input is segmented into contiguous blocks, each convolved with the filter via FFT, and the results are summed with their tails overlapping, yielding linear convolution without edge effects. Overlap-save instead processes overlapping input blocks and discards the corrupted wrap-around portion of each block's output, enabling efficient streaming processing of extended data.[46][50]

In modern audio processing, the convolution theorem underpins real-time systems enhanced by GPU-accelerated FFT libraries, enabling low-latency convolution reverbs and spatial audio rendering since the early 2000s, with implementations such as NVIDIA's cuSignal reported to achieve up to 10x speedups over CPU methods for multichannel processing.[51][52]
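A minimal overlap-add sketch in numpy, assuming an illustrative block size B; each input block is zero-padded, filtered in the frequency domain, and its tail summed into the overlapping region of the output:

```python
import numpy as np

# Minimal overlap-add sketch: filter a long signal block by block with
# FFTs, summing the overlapping tails; the block size B is illustrative.
def overlap_add(x, h, B=128):
    L = B + len(h) - 1                    # FFT length per block (no aliasing)
    H = np.fft.fft(h, L)
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), B):
        block = x[start:start + B]
        yb = np.fft.ifft(np.fft.fft(block, L) * H).real
        # Add the block's contribution, clipped to the output's end.
        y[start:start + L] += yb[:len(x) - start + len(h) - 1]
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
h = rng.standard_normal(31)
print(np.allclose(overlap_add(x, h), np.convolve(x, h)))  # True
```

Each block's FFT length B + len(h) - 1 guarantees the per-block circular convolution equals the linear one, so only the summation of overlapping tails remains.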
Probability and statistics
In probability theory, the convolution of two probability density functions gives the density of the sum of two independent continuous random variables. If X and Y are independent random variables with densities f_X and f_Y, the density f_Z of Z = X + Y is the convolution

f_Z(z) = \int_{-\infty}^{\infty} f_X(x) f_Y(z - x) \, dx.[53]

This arises naturally from the joint density under independence, f_{X,Y}(x,y) = f_X(x) f_Y(y), by marginalizing over the sum.[54]

The convolution theorem simplifies the analysis via characteristic functions, which are Fourier transforms of the densities. For independent X and Y, the characteristic function of Z = X + Y is the product

\phi_Z(t) = \mathbb{E}[e^{i t Z}] = \phi_X(t) \phi_Y(t),

where \phi_X(t) = \mathbb{E}[e^{i t X}].[55] This multiplicative property holds because the expectation factors under independence, applying the theorem without computing the convolution integral explicitly.[56] The density f_Z can then be recovered as the inverse Fourier transform of \phi_Z.[54]

A representative example is the Irwin–Hall distribution, which describes the sum of n independent uniform random variables on [0, 1]. Its density is the n-fold convolution of the uniform density f_U(u) = 1 for u \in [0,1]. The characteristic function is the product \phi(t) = \left[ \frac{e^{i t} - 1}{i t} \right]^n, equivalent to a product of sinc factors since \frac{e^{i t} - 1}{i t} = e^{i t / 2} \, \frac{\sin(t/2)}{t/2}.[57] This product form shows how the theorem captures the smoothing effect of repeated convolution, producing a piecewise polynomial density.[53]

In statistics, the convolution theorem enables deconvolution to correct for measurement error. Consider observations Y_j = X_j + \epsilon_j, where the errors \{\epsilon_j\} are i.i.d., independent of the signal \{X_j\}, with known error density f_\epsilon.
The observed density satisfies f_Y = f_\epsilon * f_X, so the characteristic functions satisfy \phi_Y(t) = \phi_\epsilon(t) \phi_X(t). Solving for the signal gives \phi_X(t) = \phi_Y(t) / \phi_\epsilon(t), with f_X recovered via the inverse Fourier transform.[58] This approach is widely used for density estimation under measurement error, though it requires regularization where \phi_\epsilon(t) vanishes or becomes small.[59]

The theorem extends to multivariate settings via multivariate characteristic functions. For independent random vectors \mathbf{X} and \mathbf{Y} in \mathbb{R}^d, the characteristic function of \mathbf{Z} = \mathbf{X} + \mathbf{Y} is \phi_\mathbf{Z}(\mathbf{t}) = \phi_\mathbf{X}(\mathbf{t}) \phi_\mathbf{Y}(\mathbf{t}), where \mathbf{t} \in \mathbb{R}^d and \phi_\mathbf{X}(\mathbf{t}) = \mathbb{E}[e^{i \mathbf{t}^\top \mathbf{X}}].[54] This supports extensions to multinomial distributions, where the sum of independent multinomials follows a multinomial via convolution of probability mass functions, with the product of multivariate characteristic functions confirming the result.[60]
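The simplest Irwin–Hall case, n = 2, can be verified numerically: convolving the uniform density with itself on a grid reproduces the triangular density f_Z(z) = z on [0, 1] and 2 - z on [1, 2] (the grid step is illustrative):

```python
import numpy as np

# Density of the sum of two independent Uniform(0,1) variables via
# numerical convolution of the densities; the exact answer is the
# triangular density f_Z(z) = z on [0,1] and 2 - z on [1,2].
dz = 0.001
z = np.arange(0, 1, dz)
f_U = np.ones_like(z)                     # uniform density on [0, 1)

f_Z = np.convolve(f_U, f_U) * dz          # supported on [0, 2)
grid = np.arange(len(f_Z)) * dz
exact = np.where(grid <= 1, grid, 2 - grid)
print(np.max(np.abs(f_Z - exact)))        # on the order of the grid step
```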