Multi-index notation

Multi-index notation is a compact mathematical convention employed in multivariable analysis to denote partial derivatives, powers of variables, and related multi-dimensional operations using a single symbol for a tuple of exponents. A multi-index \alpha is defined as a vector \alpha = (\alpha_1, \alpha_2, \dots, \alpha_n) where each \alpha_i is a non-negative integer, and its order is given by |\alpha| = \alpha_1 + \alpha_2 + \dots + \alpha_n.^[1]^[2] For a smooth function u: \mathbb{R}^n \to \mathbb{R}, the partial derivative operator D^\alpha u (or \partial^\alpha u) represents \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}, simplifying the expression of higher-order mixed derivatives.^[1]^[2] Additionally, for powers, x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}, and the multi-index factorial is \alpha! = \alpha_1! \alpha_2! \cdots \alpha_n!, which facilitates combinatorial aspects like multinomial expansions.^[3]^[2] This notation streamlines the generalization of single-variable concepts to multiple dimensions, particularly in Taylor expansions and partial differential equations (PDEs). 
In the multivariable Taylor theorem, the expansion of a function f(x + y) around x includes terms \frac{y^\alpha}{\alpha!} D^\alpha f(x) summed over multi-indices \alpha with |\alpha| < k, plus a remainder term that can be expressed in integral or Lagrange form using the derivatives of order k.^[3]^[2] The set D^k u = \{D^\alpha u : |\alpha| = k\} collects all partial derivatives of total order k, enabling concise statements of approximation theorems.^[1] In PDEs, multi-index notation is essential for defining the order and type of equations, such as classifying the transport equation as first-order (involving D^\alpha with |\alpha| = 1) or the heat equation as second-order (its highest-order terms have |\alpha| = 2).^[1] It also supports operations like addition of multi-indices, where \alpha + \beta = (\alpha_1 + \beta_1, \dots, \alpha_n + \beta_n); because mixed partials of a sufficiently smooth function commute, D^{\alpha + \beta} u = D^\alpha (D^\beta u) = D^\beta (D^\alpha u).^[2] Beyond classical analysis, it appears in function spaces such as the Sobolev spaces W^{k,p}(\Omega), whose elements are required to have weak derivatives \partial^\alpha u \in L^p(\Omega) for all \alpha with |\alpha| \leq k.^[2] These applications highlight its role in making complex multi-variable expressions more manageable and intuitive.^[1]

Definition and Notation

Formal Definition

In mathematics, particularly in multivariable calculus and analysis, a multi-index \alpha is defined as an n-tuple of non-negative integers, belonging to the set \mathbb{N}_0^n, where \mathbb{N}_0 denotes the non-negative integers \{0, 1, 2, \dots\} and n is the dimension of the underlying space.^[4]^[5] This structure allows \alpha to systematically represent orders or degrees in multi-dimensional contexts.^[6] Explicitly, \alpha = (\alpha_1, \alpha_2, \dots, \alpha_n) with each component \alpha_i \in \mathbb{N}_0 for i = 1, \dots, n.^[5]^[6] In finite-dimensional settings, all components are specified, but for infinite-dimensional spaces—such as in the theory of distributions or function spaces—multi-indices are typically required to have finite support, meaning only finitely many \alpha_i are non-zero, while the rest are zero.^[7] This restriction ensures well-defined operations and convergence in infinite products or sums.^[7] Multi-indices serve as a compact notational tool for indexing multi-dimensional quantities, such as the orders of partial derivatives or the exponents in monomials within multivariate polynomials.^[4]^[6] For instance, they facilitate concise expressions for higher-order derivatives in analysis, as explored in subsequent applications.^[5]

Symbolic Conventions

In multi-index notation, multi-indices are commonly denoted by Greek letters such as \alpha or \beta, which are occasionally rendered in boldface (e.g., \boldsymbol{\alpha}) in printed texts to emphasize their vectorial nature.^[8] For a vector \mathbf{x} = (x_1, \dots, x_n) \in \mathbb{R}^n and a multi-index \alpha = (\alpha_1, \dots, \alpha_n), the associated power is defined as

\mathbf{x}^\alpha = \prod_{i=1}^n x_i^{\alpha_i}.

^[9] This convention extends the standard exponentiation to multivariable contexts, facilitating compact expressions for monomials in polynomials. Summations involving multi-indices are expressed using sigma notation, such as \sum_\alpha to indicate summation over all relevant multi-indices or \sum_{|\alpha|=k} to restrict to those with total order |\alpha| = \alpha_1 + \dots + \alpha_n equal to k.^[9] These forms are standard in expansions like Taylor series, where the sum aggregates terms weighted by multi-index factorials. The zero multi-index is denoted by \mathbf{0} = (0, \dots, 0), satisfying \mathbf{x}^\mathbf{0} = 1 for any \mathbf{x} (with the convention 0^0 = 1), which is consistent with the empty product convention.^[9]
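The power convention can be sketched in a few lines of Python. This is only an illustration: the function name `monomial` and the choice of tuples to represent \mathbf{x} and \alpha are assumptions made here, not part of the notation itself.

```python
from math import prod

def monomial(x, alpha):
    """Return x^alpha = prod_i x_i**alpha_i (hypothetical helper)."""
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    if any(a < 0 for a in alpha):
        raise ValueError("multi-index components must be non-negative")
    return prod(xi ** ai for xi, ai in zip(x, alpha))

# x = (2, 3, 5), alpha = (1, 2, 0): 2^1 * 3^2 * 5^0
value = monomial((2.0, 3.0, 5.0), (1, 2, 0))

# The zero multi-index gives 1 even when a component of x is 0,
# because Python evaluates 0.0 ** 0 as 1.0.
unit = monomial((0.0, 7.0), (0, 0))
```

Python's own `0 ** 0 == 1` behaviour happens to match the convention needed for \mathbf{x}^\mathbf{0} = 1; in a language without that behaviour the zero-exponent case would need explicit handling.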

Algebraic Operations

Arithmetic on Multi-Indices

Multi-indices, denoted as \alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n where \mathbb{N}_0 is the set of non-negative integers, support vector-like arithmetic operations defined componentwise, so that results remain tuples of non-negative integers. The addition of two multi-indices \alpha and \beta is given by \alpha + \beta = (\alpha_1 + \beta_1, \dots, \alpha_n + \beta_n), which results in another multi-index since the sum of non-negative integers remains non-negative. This operation is commutative, meaning \alpha + \beta = \beta + \alpha, and associative, (\alpha + \beta) + \gamma = \alpha + (\beta + \gamma) for any multi-indices \alpha, \beta, \gamma; the zero multi-index is its identity element.^[2]^[10] Scalar multiplication by a non-negative integer k \in \mathbb{N}_0 is defined as k\alpha = (k\alpha_1, \dots, k\alpha_n), yielding a multi-index where each component is scaled accordingly. This operation distributes over addition of multi-indices, k(\alpha + \beta) = k\alpha + k\beta, and over addition of scalars, (k + m)\alpha = k\alpha + m\alpha for k, m \in \mathbb{N}_0. These properties mirror those of vector spaces, but \mathbb{N}_0^n is only a commutative monoid under componentwise addition, since additive inverses are missing.^[2]^[10] Subtraction \alpha - \beta is defined only when \beta \leq \alpha componentwise, i.e., \beta_i \leq \alpha_i for all i = 1, \dots, n, and takes the form \alpha - \beta = (\alpha_1 - \beta_1, \dots, \alpha_n - \beta_n), ensuring the result is again a multi-index in \mathbb{N}_0^n. This partial operation supports decompositions in contexts like multi-index sums but does not extend to a full group structure.^[2]^[10]
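As a minimal sketch of these componentwise operations, multi-indices can be modelled as Python tuples. The helper names `add`, `scale` and `subtract` are hypothetical, and returning `None` for an undefined difference is a design choice made for this illustration only.

```python
def add(alpha, beta):
    """Componentwise sum alpha + beta."""
    return tuple(a + b for a, b in zip(alpha, beta))

def scale(k, alpha):
    """Scalar multiple k * alpha for a non-negative integer k."""
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    return tuple(k * a for a in alpha)

def subtract(alpha, beta):
    """Return alpha - beta, or None when beta <= alpha fails in some component."""
    if any(b > a for a, b in zip(alpha, beta)):
        return None  # the difference would leave N_0^n
    return tuple(a - b for a, b in zip(alpha, beta))

alpha, beta = (1, 0, 2), (0, 3, 1)
total = add(alpha, beta)            # componentwise sum
back = subtract(total, beta)        # recovers alpha, since beta <= total
undefined = subtract((1, 0), (0, 1))  # None: (0, 1) is not <= (1, 0)
```

Signalling the undefined case explicitly, rather than returning a tuple with negative entries, keeps every returned value inside \mathbb{N}_0^n.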

Combinatorial Coefficients

In multi-index notation, the factorial of a multi-index \alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n is defined as the product of the factorials of its components: \alpha! = \prod_{i=1}^n \alpha_i!.^[5] This quantity is the denominator of the multinomial coefficient \binom{|\alpha|}{\alpha_1, \dots, \alpha_n} = \frac{|\alpha|!}{\alpha!}, which counts the number of ways to divide |\alpha| distinct objects into n labeled groups of sizes \alpha_1, \dots, \alpha_n.^[11] A second key coefficient is the multi-index binomial coefficient \binom{\alpha}{\beta}, defined for multi-indices \beta \leq \alpha (componentwise) as \binom{\alpha}{\beta} = \frac{\alpha!}{\beta! \, (\alpha - \beta)!}, and zero otherwise.^[12] This expression equals the product \prod_{i=1}^n \binom{\alpha_i}{\beta_i}, so it counts the number of ways to choose, independently for each i, a subset of \beta_i elements from a set of \alpha_i distinct elements.^[12] These coefficients appear prominently in multivariable expansions, such as the multi-index form of the binomial theorem for (x + y)^\alpha, where x = (x_1, \dots, x_n) and y = (y_1, \dots, y_n) are vectors: (x + y)^\alpha = \sum_{\beta \leq \alpha} \binom{\alpha}{\beta} x^\beta y^{\alpha - \beta}.^[5] The left-hand side is the product \prod_{i=1}^n (x_i + y_i)^{\alpha_i}; applying the one-variable binomial theorem to each factor and multiplying out gives the sum above, with each \binom{\alpha}{\beta} weighting the terms according to the ways to allocate exponents between x and y per variable.^[5] In every term the exponents of x_i and y_i add up to \alpha_i, so the expansion preserves the order in each component, which is essential for applications in series and approximations.^[5]
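The following sketch computes \alpha! and \binom{\alpha}{\beta} and checks the expansion of (x + y)^\alpha numerically for one small case. The function names (`mi_factorial`, `mi_binomial`, `power`, `binomial_expansion`) and the sample values are assumptions chosen for illustration.

```python
from itertools import product
from math import comb, factorial, prod

def mi_factorial(alpha):
    """alpha! = product of the component factorials."""
    return prod(factorial(a) for a in alpha)

def mi_binomial(alpha, beta):
    """Product of per-component binomials; math.comb returns 0 when
    beta_i > alpha_i, which matches the 'zero otherwise' convention."""
    return prod(comb(a, b) for a, b in zip(alpha, beta))

def power(x, alpha):
    """x^alpha = prod_i x_i**alpha_i."""
    return prod(xi ** ai for xi, ai in zip(x, alpha))

def binomial_expansion(x, y, alpha):
    """Sum over all beta <= alpha of binom(alpha, beta) x^beta y^(alpha - beta)."""
    total = 0
    for beta in product(*(range(a + 1) for a in alpha)):
        rest = tuple(a - b for a, b in zip(alpha, beta))
        total += mi_binomial(alpha, beta) * power(x, beta) * power(y, rest)
    return total

alpha = (2, 1)
x, y = (1, 2), (3, 4)
lhs = power(tuple(a + b for a, b in zip(x, y)), alpha)  # (x + y)^alpha
rhs = binomial_expansion(x, y, alpha)                   # expanded sum
```

With integer inputs both sides are exact integers, so `lhs == rhs` can be compared directly without a floating-point tolerance.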

Structural Properties

Length and Magnitude

In multi-index notation, the length or order of a multi-index \alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n is defined as |\alpha| = \sum_{i=1}^n \alpha_i, which corresponds to the total degree of the multi-index.^[13] This quantity plays a central role in multivariable analysis, particularly for classifying homogeneous polynomials, where a monomial x^\alpha is homogeneous of degree k if |\alpha| = k.^[13] More generally, measures of magnitude for multi-indices extend to l_p-norms, defined as \|\alpha\|_p = \left( \sum_{i=1}^n \alpha_i^p \right)^{1/p} for 1 \leq p < \infty, which provide alternative measures of size and reduce to the total degree |\alpha| in the case p = 1.^[14] For p = \infty, the supremum norm is \|\alpha\|_\infty = \max_i \alpha_i, capturing the maximum component of the multi-index.^[14] These norms are used to define multi-index sets in polynomial approximation, such as \{ \alpha \in \mathbb{N}_0^n : \|\alpha\|_p \leq m \}, which filter terms in sums by controlling the overall magnitude and help mitigate the curse of dimensionality in high dimensions.^[14] Such magnitude measures also facilitate the organization of summation indices in multivariate series expansions, where terms are grouped by total degree |\alpha| to isolate contributions from specific orders.^[13]
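To make the effect of the norm on the size of a multi-index set concrete, the sketch below compares the total-degree set \{ |\alpha| \leq m \} with the full box \{ \|\alpha\|_\infty \leq m \}. All function names are hypothetical, and the brute-force filtering is chosen for clarity rather than efficiency.

```python
from itertools import product
from math import comb

def order(alpha):
    """Total degree |alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)

def p_norm(alpha, p):
    """l_p norm of alpha for 1 <= p < infinity (returns a float)."""
    return sum(a ** p for a in alpha) ** (1.0 / p)

def max_norm(alpha):
    """Supremum norm: the largest component."""
    return max(alpha)

def box_set(n, m):
    """All alpha in N_0^n with max-norm at most m."""
    return list(product(range(m + 1), repeat=n))

def total_degree_set(n, m):
    """All alpha in N_0^n with |alpha| <= m, filtered from the box."""
    return [alpha for alpha in box_set(n, m) if order(alpha) <= m]

n, m = 3, 2
simplex = total_degree_set(n, m)
box = box_set(n, m)
# The total-degree set has comb(n + m, m) elements,
# while the box has (m + 1) ** n elements.
sizes = (len(simplex), len(box))
```

The gap between the two counts widens quickly as n grows, which is the reason total-degree (or other p < \infty) index sets are preferred in high-dimensional polynomial approximation.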

Partial Ordering

In multi-index notation, the set \mathbb{N}_0^n of n-tuples of nonnegative integers is equipped with a partial order defined componentwise: for multi-indices \alpha = (\alpha_1, \dots, \alpha_n) and \beta = (\beta_1, \dots, \beta_n), one has \alpha \leq \beta if and only if \alpha_i \leq \beta_i for every i = 1, \dots, n. This order makes \mathbb{N}_0^n into a partially ordered set (poset), specifically the product of n copies of the chain poset \mathbb{N}_0. The associated strict partial order is given by \alpha < \beta if \alpha \leq \beta and \alpha \neq \beta. In this poset, a chain is a totally ordered subset, meaning any two elements are comparable under \leq, while an antichain is a subset in which no two distinct elements are comparable. For a finite subset, such as the box \{\beta : \beta \leq \alpha\}, Dilworth's theorem states that the size of the largest antichain equals the minimum number of chains needed to cover the subset, which has applications in combinatorial optimization over multi-indices. The restriction to finite subsets matters: for n \geq 2 the sets \{\alpha : |\alpha| = k\} are antichains of unbounded size, so \mathbb{N}_0^n itself has no largest antichain. To obtain a total order compatible with the partial order, one may use the lexicographic order \leq_{\mathrm{lex}}, defined by \alpha <_{\mathrm{lex}} \beta if, in the leftmost component j where \alpha_j \neq \beta_j, it holds that \alpha_j < \beta_j. This order refines the componentwise partial order and is commonly employed as a monomial ordering in polynomial rings. In applications such as the Leibniz product rule for higher-order derivatives, the partial order facilitates summation over all \beta \leq \alpha.
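A short sketch of the two orders, again with tuples standing in for multi-indices. The names `leq`, `comparable`, `lex_less` and `lower_set` are hypothetical; Python's built-in tuple comparison is itself lexicographic, which the last lines use as a cross-check.

```python
from itertools import product

def leq(alpha, beta):
    """Componentwise partial order: alpha <= beta."""
    return all(a <= b for a, b in zip(alpha, beta))

def comparable(alpha, beta):
    """True when alpha and beta are related in one direction or the other."""
    return leq(alpha, beta) or leq(beta, alpha)

def lex_less(alpha, beta):
    """Strict lexicographic order: decide at the leftmost differing component."""
    for a, b in zip(alpha, beta):
        if a != b:
            return a < b
    return False  # equal tuples are not strictly less

def lower_set(alpha):
    """All beta with beta <= alpha, i.e. the index set of a sum over beta <= alpha."""
    return list(product(*(range(a + 1) for a in alpha)))

incomparable_pair = comparable((2, 0), (0, 1))   # False: neither dominates
lex_decides = lex_less((0, 1), (2, 0))           # True: first components differ, 0 < 2
betas = lower_set((2, 1))                        # (2 + 1) * (1 + 1) multi-indices

# Built-in tuple comparison agrees with lex_less on same-length tuples.
agrees = ((0, 1) < (2, 0)) == lex_decides
```

The pair (2, 0), (0, 1) shows why the componentwise order is only partial, while the lexicographic order still ranks the two.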

Applications in Analysis

Higher-Order Derivatives

In multi-index notation, higher-order partial derivatives of a function f: \mathbb{R}^n \to \mathbb{R} are expressed compactly using a multi-index \alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n, where each \alpha_i is a non-negative integer denoting the order of differentiation with respect to the variable x_i. The partial derivative operator is defined as

\partial^\alpha f = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}} f,

which applies the derivatives successively in any order, provided f is sufficiently smooth.^[13]^[15] The total order of this derivative is given by the magnitude |\alpha| = \alpha_1 + \dots + \alpha_n, which represents the overall degree of differentiation. For instance, derivatives of order k are those where |\alpha| = k.^[13] Under appropriate smoothness conditions, such as f \in C^{|\alpha|}(\mathbb{R}^n), the mixed partial derivatives commute, meaning the order of differentiation does not affect the result. This is a multivariable extension of Clairaut's theorem, which in its basic form states that \partial_{x_i} \partial_{x_j} f = \partial_{x_j} \partial_{x_i} f whenever f is twice continuously differentiable.^[13] Applied repeatedly, it gives \partial^\alpha \partial^\beta f = \partial^\beta \partial^\alpha f = \partial^{\alpha + \beta} f for multi-indices \alpha and \beta whenever f \in C^{|\alpha| + |\beta|}(\mathbb{R}^n), ensuring that the notation \partial^\alpha f is well-defined independently of the sequence of partials applied.^[15] Special cases of this notation include the gradient and the Hessian matrix. The gradient corresponds to first-order derivatives, where |\alpha| = 1 and exactly one \alpha_i = 1 with the rest zero, yielding \nabla f = D^1 f = (\partial f / \partial x_1, \dots, \partial f / \partial x_n).^[13] The Hessian arises for second-order derivatives with |\alpha| = 2, forming an n \times n symmetric matrix whose (i, j) entry is the mixed partial \partial^2 f / \partial x_i \partial x_j, corresponding to \alpha = e_i + e_j, again assuming f \in C^2(\mathbb{R}^n).^[15]
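For polynomials the operator \partial^\alpha can be applied exactly, which makes the commutation property easy to check. The sketch below assumes a polynomial is stored as a dictionary mapping exponent tuples to coefficients; that representation and the name `differentiate` are illustrative choices, not a standard API.

```python
from math import perm

def differentiate(poly, alpha):
    """Apply the operator d^alpha to a polynomial {exponent tuple: coefficient}."""
    result = {}
    for expo, coeff in poly.items():
        if any(a > e for e, a in zip(expo, alpha)):
            continue  # differentiating more often than the exponent kills the term
        factor = 1
        for e, a in zip(expo, alpha):
            factor *= perm(e, a)  # falling factorial e (e - 1) ... (e - a + 1)
        new_expo = tuple(e - a for e, a in zip(expo, alpha))
        result[new_expo] = result.get(new_expo, 0) + coeff * factor
    return result

# f(x, y) = 3 x^2 y^3 + 5 x y
f = {(2, 3): 3, (1, 1): 5}

# d^(1,2) f: once in x, twice in y
direct = differentiate(f, (1, 2))

# The same derivative taken in two stages, in both possible orders
x_then_y = differentiate(differentiate(f, (1, 0)), (0, 2))
y_then_x = differentiate(differentiate(f, (0, 2)), (1, 0))
```

Here all three results coincide, as expected for a smooth function; the term 5xy disappears because it is differentiated twice in y.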

Taylor Expansion

The multivariable Taylor theorem utilizes multi-index notation to express the polynomial approximation of a sufficiently smooth function f: \mathbb{R}^n \to \mathbb{R} around a point a \in \mathbb{R}^n. For f of class C^m in a convex open neighborhood of a, the expansion up to order m takes the form

f(x) = \sum_{|\alpha| \leq m} \frac{\partial^\alpha f(a)}{\alpha!} (x - a)^\alpha + R_m(x),

where the sum collects all partial derivatives of order at most m, each divided by the multi-index factorial \alpha! and multiplied by the corresponding monomial (x - a)^\alpha.^[16] The remainder R_m(x) quantifies the approximation error, and in the Peano form, it satisfies R_m(x) = o(|x - a|^m) as x \to a. This little-o condition holds provided f is C^m near a, ensuring the polynomial part captures the local behavior up to order m.^[16] Each component of the Taylor polynomial is a homogeneous polynomial: the terms of exact degree k form \sum_{|\alpha|=k} c_\alpha (x - a)^\alpha, where the coefficients c_\alpha = \partial^\alpha f(a) / \alpha! are determined by the higher-order partial derivatives at a. These homogeneous components generalize the individual terms c_k (x - a)^k of the single-variable expansion; for k = 2, the component is the quadratic form \frac{1}{2} (x - a)^T H (x - a), where H is the Hessian of f at a.^[16] When f is real analytic at a, meaning it equals its power series in some neighborhood of a, the infinite Taylor series \sum_{|\alpha| = 0}^\infty \frac{\partial^\alpha f(a)}{\alpha!} (x - a)^\alpha converges to f(x) for all x in that neighborhood. The region of convergence need not reach every point where f is analytic: the single-variable function 1/(1 + x^2) is real analytic on all of \mathbb{R}, yet its Taylor series at 0 converges only for |x| < 1. Analyticity corresponds to growth bounds on the derivatives, such as |\partial^\alpha f(x)| \leq M K^{|\alpha|} |\alpha|! for all x near a, with constants M, K > 0.^[16]
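The formula can be tried out on a function whose derivatives are known in closed form. The sketch below uses f(x) = \exp(c \cdot x), for which \partial^\alpha f(a) = c^\alpha f(a); the choice of test function, the names `indices_up_to` and `taylor_exp`, and the sample points are all assumptions made for illustration.

```python
from itertools import product
from math import exp, factorial, prod

def indices_up_to(n, m):
    """All multi-indices alpha in N_0^n with |alpha| <= m."""
    return [alpha for alpha in product(range(m + 1), repeat=n) if sum(alpha) <= m]

def taylor_exp(c, a, x, m):
    """Order-m Taylor polynomial of f(x) = exp(c . x) about a, evaluated at x."""
    f_at_a = exp(sum(ci * ai for ci, ai in zip(c, a)))
    h = [xi - ai for xi, ai in zip(x, a)]
    total = 0.0
    for alpha in indices_up_to(len(c), m):
        # For this f, the derivative d^alpha f(a) equals c^alpha * f(a).
        derivative = prod(ci ** k for ci, k in zip(c, alpha)) * f_at_a
        alpha_factorial = prod(factorial(k) for k in alpha)
        monomial = prod(hi ** k for hi, k in zip(h, alpha))
        total += derivative / alpha_factorial * monomial
    return total

c, a, x = (1.0, 2.0), (0.0, 0.0), (0.1, 0.05)
exact = exp(sum(ci * xi for ci, xi in zip(c, x)))

# Absolute errors of the order-2 and order-4 Taylor polynomials
error_2 = abs(taylor_exp(c, a, x, 2) - exact)
error_4 = abs(taylor_exp(c, a, x, 4) - exact)
```

Raising the order from 2 to 4 shrinks the error by several orders of magnitude at this point, consistent with the remainder estimate R_m(x) = o(|x - a|^m).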

Leibniz Product Rule

The Leibniz product rule in multi-index notation generalizes the classical product rule from single-variable calculus to higher-order partial derivatives of products of multivariable functions. For smooth functions u, v: \mathbb{R}^n \to \mathbb{R} and a multi-index \alpha \in \mathbb{N}_0^n, the rule states that

\partial^\alpha (u v) = \sum_{\beta \leq \alpha} \binom{\alpha}{\beta} (\partial^\beta u) (\partial^{\alpha - \beta} v),

where the sum is over all multi-indices \beta such that \beta_i \leq \alpha_i for each i = 1, \dots, n, and \binom{\alpha}{\beta} = \frac{\alpha!}{\beta! (\alpha - \beta)!} is the multi-index binomial coefficient.^[17] This formula accounts for the combinatorial ways in which the derivatives can be distributed between u and v across each variable. A proof of this rule can be obtained by induction on the order |\alpha| of the multi-index. The base case |\alpha| = 1 follows directly from the standard product rule for first-order partial derivatives: for \alpha = e_j (the standard basis vector with 1 in the j-th position and 0 elsewhere), \partial_{x_j} (u v) = (\partial_{x_j} u) v + u (\partial_{x_j} v), which matches the sum with the two terms where \beta = 0 or \beta = e_j. Assuming the formula holds for all multi-indices of order at most k, consider |\alpha| = k+1 and write \alpha = \gamma + e_j for some multi-index \gamma with |\gamma| = k and some j \in \{1, \dots, n\}. By the induction hypothesis, \partial^\gamma (u v) = \sum_{\beta \leq \gamma} \binom{\gamma}{\beta} (\partial^\beta u) (\partial^{\gamma - \beta} v). Applying \partial_{x_j} and the first-order product rule to each term yields \partial^\alpha (u v) = \sum_{\beta \leq \gamma} \binom{\gamma}{\beta} \left[ (\partial^{\beta + e_j} u) (\partial^{\gamma - \beta} v) + (\partial^\beta u) (\partial^{\gamma - \beta + e_j} v) \right], where commutativity of mixed partial derivatives is used to write \partial_{x_j} \partial^\beta = \partial^{\beta + e_j}. Shifting the index \beta + e_j \to \beta in the first group of terms and combining coefficients with Pascal's identity \binom{\gamma}{\beta - e_j} + \binom{\gamma}{\beta} = \binom{\gamma + e_j}{\beta} gives the desired sum over \beta \leq \alpha.^[12]
When n=1, the multi-index \alpha reduces to a non-negative integer, \beta \leq \alpha means 0 \leq \beta \leq \alpha, and \binom{\alpha}{\beta} is the standard binomial coefficient, so the formula recovers the classical Leibniz rule for the \alpha-th derivative of a product: (u v)^{(\alpha)} = \sum_{\beta=0}^\alpha \binom{\alpha}{\beta} u^{(\beta)} v^{(\alpha - \beta)}.^[17] This rule finds application in differentiating powers of functions, such as u^m for positive integer m, by viewing the power as an m-fold product and iteratively applying the formula, which leads to an expansion with multinomial-type coefficients. Derivatives of compositions are not covered by this rule; higher-order chain rules require additional tools like Faà di Bruno's formula.^[17]
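The rule can be verified exactly on polynomials with integer coefficients. The sketch below stores a polynomial as a dictionary from exponent tuples to coefficients; this representation, the helper names, and the sample polynomials u and v are assumptions made for illustration.

```python
from itertools import product
from math import comb, perm

def differentiate(poly, alpha):
    """Apply d^alpha to a polynomial {exponent tuple: coefficient}."""
    result = {}
    for expo, coeff in poly.items():
        if any(a > e for e, a in zip(expo, alpha)):
            continue  # term vanishes under this derivative
        factor = 1
        for e, a in zip(expo, alpha):
            factor *= perm(e, a)  # falling factorial e (e - 1) ... (e - a + 1)
        new_expo = tuple(e - a for e, a in zip(expo, alpha))
        result[new_expo] = result.get(new_expo, 0) + coeff * factor
    return result

def multiply(p, q):
    """Product of two polynomials in the same representation."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            expo = tuple(a + b for a, b in zip(ep, eq))
            out[expo] = out.get(expo, 0) + cp * cq
    return out

def leibniz(u, v, alpha):
    """Right-hand side of the Leibniz rule: sum over beta <= alpha."""
    out = {}
    for beta in product(*(range(a + 1) for a in alpha)):
        rest = tuple(a - b for a, b in zip(alpha, beta))
        weight = 1
        for a, b in zip(alpha, beta):
            weight *= comb(a, b)  # multi-index binomial, component by component
        term = multiply(differentiate(u, beta), differentiate(v, rest))
        for expo, coeff in term.items():
            out[expo] = out.get(expo, 0) + weight * coeff
    return out

def strip_zeros(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {e: c for e, c in poly.items() if c != 0}

# u = x^2 y + 3 y,  v = x y^2 + 2 x,  alpha = (2, 1)
u = {(2, 1): 1, (0, 1): 3}
v = {(1, 2): 1, (1, 0): 2}
alpha = (2, 1)

lhs = strip_zeros(differentiate(multiply(u, v), alpha))  # d^alpha (u v)
rhs = strip_zeros(leibniz(u, v, alpha))                  # Leibniz sum
```

Because all arithmetic is on integers, `lhs == rhs` is an exact check rather than an approximate one; for n = 1 the same code reduces to the classical single-variable Leibniz rule.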