
Gradient

In mathematics and physics, the gradient of a scalar-valued function f of several variables is a vector field that points in the direction of the function's steepest increase at each point and whose magnitude equals the rate of that increase. For a function f(x, y, z) in three dimensions, the gradient is formally defined as the vector \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right), where the components are the partial derivatives of f with respect to each variable. This operator, symbolized by the nabla \nabla, was introduced by William Rowan Hamilton in 1853 as part of his work on quaternions and vector analysis. The gradient plays a central role in multivariable calculus, where it enables the computation of directional derivatives: the directional derivative of f in the direction of a unit vector \mathbf{u} is given by the dot product \nabla f \cdot \mathbf{u}. Geometrically, level surfaces (or level curves in 2D) of f are perpendicular to the gradient vector at every point, making \nabla f normal to these surfaces and useful for finding tangent planes.

In physics, the gradient describes conservative force fields, such as the gravitational or electric field, where the force on a particle is \mathbf{F} = - \nabla V for a potential V. Beyond pure analysis, the gradient is foundational in optimization algorithms like gradient descent, which iteratively adjust parameters to minimize a loss function by moving opposite to the gradient direction. It also appears in fluid dynamics, where pressure gradients drive flow, and in computer graphics, for shading based on surface normals derived from gradients. These applications underscore the gradient's versatility across disciplines, from theoretical analysis to practical computations in engineering and machine learning.
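As a concrete check of these formulas, the short sketch below (an illustrative example, not part of the original text; the function f(x, y, z) = x^2 + yz and the direction \mathbf{u} are arbitrary choices) estimates the gradient by central finite differences and forms the directional derivative \nabla f \cdot \mathbf{u}:

```python
import numpy as np

# Illustrative sketch (assumed example): numerically estimate the gradient
# of f(x, y, z) = x**2 + y*z via central differences, then use it to
# compute a directional derivative  D_u f = grad(f) . u.

def f(p):
    x, y, z = p
    return x**2 + y * z

def numerical_gradient(f, p, h=1e-6):
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = h
        grad[i] = (f(p + step) - f(p - step)) / (2 * h)  # central difference
    return grad

p = np.array([1.0, 2.0, 3.0])
g = numerical_gradient(f, p)                 # analytic value: (2x, z, y) = (2, 3, 2)
u = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # unit direction
print(g, g @ u)                              # directional derivative grad(f) . u
```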

Basic Concepts

Motivation and Intuition

The concept of the gradient emerged in the 19th century as part of the development of vector calculus, building on the foundations of partial derivatives established earlier. Partial derivatives, which capture how a multivariable function varies with respect to one variable while treating the others as constant, were systematically developed by Leonhard Euler around 1734, with notation refinements by Carl Gustav Jacob Jacobi in the 1840s. The gradient itself took shape through William Rowan Hamilton's introduction of quaternions in 1843 and the nabla operator in 1853, which laid groundwork for vector operations, and was fully articulated in modern form by J. Willard Gibbs and Oliver Heaviside in the 1880s as they separated scalar and vector components in calculus.

Intuitively, the gradient generalizes the idea of a slope to functions of multiple variables, providing a directional measure of change in a scalar field across multidimensional space. At any point, it indicates the path of most rapid increase in the function's value, much like following the steepest uphill route on hilly terrain, with its magnitude reflecting the sharpness of that rise. This vectorial perspective allows for a unified understanding of variation in all directions, bridging single-variable derivatives to complex spatial behaviors without relying on isolated one-dimensional slices.

Physically, the gradient underlies many natural processes by quantifying how scalar quantities like temperature or potential evolve in space, driving flows and forces accordingly. In heat conduction, for example, the temperature gradient determines the direction of heat flow, where heat moves perpendicular to isotherms from hotter to cooler regions, as the heat flux is proportional to this gradient per Fourier's law, established in 1822. Likewise, in gravitational contexts, the negative gradient of the potential points toward decreasing potential, aligning with the direction of the attractive force and exemplifying how such vectors model conservative systems in classical mechanics.

As a cornerstone of vector calculus, the gradient establishes essential intuition for analyzing scalar fields (functions assigning values to points in space) before formal mathematical treatments. It underscores why tracking multidimensional changes matters for modeling real-world scenarios involving multiple influences, such as environmental variations or force fields, setting the stage for deeper explorations in optimization and field theory.

Notation

The gradient of a scalar function f, denoted \nabla f or \mathbf{\nabla} f, represents the vector field consisting of its partial derivatives, where \nabla is the nabla symbol or del operator. In vector form, it is often written using boldface notation, such as \mathbf{\nabla} f, to emphasize its status as a vector. The nabla operator \nabla itself is a vector differential operator, commonly expressed in Cartesian coordinates as \nabla = \hat{\mathbf{i}} \frac{\partial}{\partial x} + \hat{\mathbf{j}} \frac{\partial}{\partial y} + \hat{\mathbf{k}} \frac{\partial}{\partial z}, acting on f to yield the gradient vector. Variations in notation include index form, where the i-th component of the gradient is \frac{\partial f}{\partial x_i} for coordinates x_i, useful in higher-dimensional or tensorial settings. In computational and optimization contexts, the gradient may appear as a column matrix or column vector, such as \nabla f = \begin{pmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{pmatrix}, facilitating numerical implementations.

Conventions distinguish the gradient from related operators: applied to a scalar field, \nabla f produces a vector, whereas the divergence \nabla \cdot \mathbf{v} (for a vector field \mathbf{v}) yields a scalar, and the curl \nabla \times \mathbf{v} yields a vector, ensuring no ambiguity in multivariable calculus. In mathematics, \nabla f is the predominant notation, while physics texts often prefer \operatorname{grad} f for clarity in electromagnetic or fluid dynamics applications. This notation will appear consistently in subsequent equations, such as the simple two-dimensional example \nabla (x^2 + y^2) = (2x, 2y), illustrating a vector pointing in the direction of steepest ascent without specifying coordinate systems here.

Definition

In Cartesian Coordinates

In Cartesian coordinates, the gradient of a scalar-valued function f: \mathbb{R}^n \to \mathbb{R} defined on an open set in Euclidean space is a vector whose components are the partial derivatives of f with respect to each coordinate variable. Specifically, at a point \mathbf{x} = (x_1, x_2, \dots, x_n), the gradient is given by \nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}), \frac{\partial f}{\partial x_2}(\mathbf{x}), \dots, \frac{\partial f}{\partial x_n}(\mathbf{x}) \right), assuming the partial derivatives exist at \mathbf{x}. In two dimensions, for f(x, y), the gradient takes the form \nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right), while in three dimensions, for f(x, y, z), it is \nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = \frac{\partial f}{\partial x} \mathbf{i} + \frac{\partial f}{\partial y} \mathbf{j} + \frac{\partial f}{\partial z} \mathbf{k}. These expressions hold wherever the partial derivatives exist on an open domain in \mathbb{R}^2 or \mathbb{R}^3.

To compute the gradient, evaluate each partial derivative separately by treating the other variables as constants and differentiating with respect to the respective coordinate; the resulting components are assembled into a vector at the point of interest. For example, consider f(x, y) = x^2 + y^2; the partial derivative with respect to x is 2x, and with respect to y is 2y, yielding \nabla f(x, y) = (2x, 2y). This gradient is normal to the level curves of f, which are circles centered at the origin.
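The same computation can be done symbolically; a minimal sketch using the SymPy library (assuming it is available) reproduces the example above:

```python
import sympy as sp

# Symbolic check (assumed example): gradient of f(x, y) = x**2 + y**2,
# computed component by component.
x, y = sp.symbols('x y')
f = x**2 + y**2
grad_f = [sp.diff(f, v) for v in (x, y)]
print(grad_f)  # [2*x, 2*y], matching  grad f = (2x, 2y)
```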

In Curvilinear Coordinates

In orthogonal curvilinear coordinates, the gradient of a scalar field f accounts for the local geometry through scale factors, which adjust the partial derivatives to reflect the varying metric of the coordinate basis. These systems are particularly useful for problems with cylindrical or spherical symmetry, where the coordinate curves align with the physical geometry. The general expression for the gradient in an orthogonal curvilinear system with coordinates (u_1, u_2, u_3) and corresponding scale factors h_1, h_2, h_3 is \nabla f = \sum_{i=1}^3 \frac{1}{h_i} \frac{\partial f}{\partial u_i} \hat{e}_i, where \hat{e}_i are the unit basis vectors along each coordinate direction. The scale factors h_i are defined as h_i = |\partial \mathbf{r}/\partial u_i|, quantifying the infinitesimal arc length per unit change in u_i. Cartesian coordinates represent a special case where all h_i = 1.

In cylindrical coordinates (\rho, \phi, z), the scale factors are h_\rho = 1, h_\phi = \rho, and h_z = 1, so the gradient takes the form \nabla f = \frac{\partial f}{\partial \rho} \hat{e}_\rho + \frac{1}{\rho} \frac{\partial f}{\partial \phi} \hat{e}_\phi + \frac{\partial f}{\partial z} \hat{e}_z. This expression arises from the metric in cylindrical systems, where the azimuthal direction stretches with radius \rho. For spherical coordinates (r, \theta, \phi), the scale factors are h_r = 1, h_\theta = r, and h_\phi = r \sin \theta, giving \nabla f = \frac{\partial f}{\partial r} \hat{e}_r + \frac{1}{r} \frac{\partial f}{\partial \theta} \hat{e}_\theta + \frac{1}{r \sin \theta} \frac{\partial f}{\partial \phi} \hat{e}_\phi. The dependence on \sin \theta in the \phi-component reflects the contraction of azimuthal circles toward the poles.

A representative example is the gradient of a radial potential, such as the electric potential V = 1/(4\pi \epsilon_0 r) of a unit point charge, which depends only on the radial coordinate r. In spherical coordinates, \partial V / \partial r = -1/(4\pi \epsilon_0 r^2) and the angular derivatives vanish, yielding \nabla V = -\frac{1}{4\pi \epsilon_0 r^2} \hat{e}_r. This purely radial form aligns with the symmetry of the field. These formulations are essential in fields like electrostatics, where the electric field is the negative gradient of the potential in symmetric geometries, and in fluid dynamics, for computing gradients in axisymmetric or spherical flows.
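One way to sanity-check the spherical formula is to compare it against the Cartesian gradient of the same radial function. The sketch below (an assumed example using SymPy, with f = 1/r standing in for the potential up to the constant 1/(4\pi\epsilon_0)) verifies that the two expressions agree:

```python
import sympy as sp

# Sketch (assumed example): verify the spherical-coordinate gradient formula
# on f = 1/r by comparing with the Cartesian gradient of 1/sqrt(x^2+y^2+z^2).
x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

f_cart = 1 / r
grad_cart = sp.Matrix([sp.diff(f_cart, v) for v in (x, y, z)])

# Spherical formula: only the radial term survives, grad f = (df/dr) e_r,
# with e_r = (x, y, z)/r in Cartesian components.
rs = sp.symbols('r_s', positive=True)
df_dr = sp.diff(1 / rs, rs).subs(rs, r)     # -1/r**2
grad_sph = df_dr * sp.Matrix([x, y, z]) / r

print(sp.simplify(grad_cart - grad_sph))    # zero vector -> formulas agree
```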

In General Coordinate Systems

In a general coordinate system on a smooth manifold M equipped with a Riemannian metric, the gradient of a smooth function f: M \to \mathbb{R} is defined abstractly as the unique vector field \nabla f on M such that for every smooth vector field X on M, the inner product satisfies \langle \nabla f, X \rangle = df(X), where df denotes the differential of f and \langle \cdot, \cdot \rangle is the metric. This definition assumes M is a smooth manifold and the metric provides a smoothly varying positive definite inner product on each tangent space T_p M, enabling the identification of tangent and cotangent spaces via the musical isomorphism. The differential df is a smooth 1-form, and \nabla f arises as its image under the sharp operator (^\sharp) induced by the metric, which maps covectors to vectors by raising indices.

In local coordinates (x^1, \dots, x^n) on M, where the metric tensor has components g_{ij} (with inverse g^{ij}), the gradient takes the explicit form \nabla f = g^{ij} \frac{\partial f}{\partial x^j} \frac{\partial}{\partial x^i}, with summation over repeated indices i, j = 1, \dots, n. This coordinate expression leverages the metric to contract the covector \frac{\partial f}{\partial x^j} \, dx^j (the local representation of df) against g^{ij} to yield vector components. The assumption here is that f is smooth, ensuring the partial derivatives exist and the expression defines a smooth vector field.

From the perspective of differential forms, the gradient \nabla f corresponds to the 1-form df via the metric's musical isomorphism in the Riemannian setting, which provides a way to associate vector fields to 1-forms without relying on a specific coordinate chart. This view emphasizes the coordinate-free nature of the construction, where the metric bridges the duality between tangent vectors and covectors. As a representative example, in flat \mathbb{R}^n with the standard metric g_{ij} = \delta_{ij} (the Kronecker delta), the inverse metric is g^{ij} = \delta^{ij}, so the general formula reduces to the familiar Cartesian gradient \nabla f = \sum_{i=1}^n \frac{\partial f}{\partial x^i} \frac{\partial}{\partial x^i}.
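Numerically, the coordinate formula is just the inverse metric applied to the vector of partial derivatives. A minimal sketch, assuming polar coordinates on the plane with metric g = diag(1, r^2) and made-up values for the partials of f, illustrates the index raising:

```python
import numpy as np

# Minimal numeric sketch (assumed example): in a 2D chart the gradient
# components are  (grad f)^i = g^{ij} df_j,  i.e. the inverse metric
# applied to the partial derivatives of f.

# Polar-coordinate metric on the plane: g = diag(1, r**2) in (r, theta).
r = 2.0
g = np.diag([1.0, r**2])
g_inv = np.linalg.inv(g)          # g^{ij} = diag(1, 1/r**2)

# Partial derivatives of some f(r, theta) at this point: (df/dr, df/dtheta).
df = np.array([3.0, 5.0])

grad = g_inv @ df                 # components of grad f in the coordinate basis
print(grad)                       # [3.0, 1.25]; theta-component scaled by 1/r**2
```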

Relationships to Derivatives

Connection to Total Derivative

For a scalar-valued function f: \mathbb{R}^n \to \mathbb{R}, the total derivative Df(\mathbf{x}) at a point \mathbf{x} is the linear map from \mathbb{R}^n to \mathbb{R} that approximates the change in f for small displacements \mathbf{h}, given by Df(\mathbf{x})(\mathbf{h}) = \nabla f(\mathbf{x}) \cdot \mathbf{h}. This representation shows that the gradient \nabla f(\mathbf{x}) fully encodes the total derivative as a vector, providing the best linear approximation to the function's variation in any direction. The total differential of f expands this as df = \sum_{i=1}^n \frac{\partial f}{\partial x_i} \, dx_i = \nabla f \cdot d\mathbf{x}, where the dx_i are infinitesimal changes in the coordinates, directly linking the partial derivatives in the gradient to the overall rate of change. This form arises from the definition of differentiability, where f is differentiable at \mathbf{x} if \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) - \nabla f(\mathbf{x}) \cdot \mathbf{h}}{\|\mathbf{h}\|} = 0, with the linear term \nabla f(\mathbf{x}) \cdot \mathbf{h} constituting the total derivative; when the partial derivatives exist and are continuous near \mathbf{x}, this limit condition holds, confirming the gradient's role in the approximation.

A key application is the directional derivative, which measures the instantaneous rate of change of f along a unit vector \mathbf{u}, defined as \nabla f(\mathbf{x}) \cdot \mathbf{u}. This is a special case of the total derivative where \mathbf{h} = t \mathbf{u} for small t, reducing to the scalar projection of the gradient onto the direction \mathbf{u}. The connection extends to the multivariable chain rule: for a differentiable path \mathbf{g}(t): \mathbb{R} \to \mathbb{R}^n, the derivative of the composition f(\mathbf{g}(t)) is \frac{d}{dt} f(\mathbf{g}(t)) = \nabla f(\mathbf{g}(t)) \cdot \mathbf{g}'(t). This follows from applying the total derivative along the curve, where \mathbf{g}'(t) acts as the tangential displacement; a sketch of the proof applies the linear approximation along the path to match the definition of the one-variable derivative.
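The chain-rule identity can be verified numerically; the sketch below (a hypothetical example with f(x, y) = x^2 y and the path \mathbf{g}(t) = (\cos t, \sin t)) compares a finite-difference derivative of the composition with the dot product \nabla f \cdot \mathbf{g}'(t):

```python
import numpy as np

# Quick numeric check (assumed example) of the chain rule
#   d/dt f(g(t)) = grad f(g(t)) . g'(t)
# for f(x, y) = x**2 * y and the path g(t) = (cos t, sin t).

def f(x, y):
    return x**2 * y

def grad_f(x, y):
    return np.array([2 * x * y, x**2])      # analytic gradient

def g(t):
    return np.array([np.cos(t), np.sin(t)])

def g_prime(t):
    return np.array([-np.sin(t), np.cos(t)])

t, h = 0.7, 1e-6
lhs = (f(*g(t + h)) - f(*g(t - h))) / (2 * h)   # numerical d/dt f(g(t))
rhs = grad_f(*g(t)) @ g_prime(t)                 # chain-rule value
print(lhs, rhs)                                  # agree to ~1e-9
```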

Linear Approximations

The gradient of a differentiable scalar-valued function f: \mathbb{R}^n \to \mathbb{R} at a point \mathbf{x} provides the best linear approximation of f near \mathbf{x}. Specifically, f(\mathbf{x} + \mathbf{h}) \approx f(\mathbf{x}) + \nabla f(\mathbf{x}) \cdot \mathbf{h}, with the error satisfying o(\|\mathbf{h}\|) as \mathbf{h} \to \mathbf{0}. This formula arises from the first-order Taylor expansion in multiple variables, where the gradient captures the linear change in f along any direction \mathbf{h}. This approximation is particularly useful for estimating values when exact computation is difficult.

Geometrically, the linear approximation defines the tangent plane to the graph of f at the point (\mathbf{x}, f(\mathbf{x})) in \mathbb{R}^{n+1}. The equation is z = f(\mathbf{x}) + \nabla f(\mathbf{x}) \cdot (\mathbf{u} - \mathbf{x}), providing the closest affine approximation to the graph locally at that point. This extends the one-dimensional tangent line concept to higher dimensions, where the gradient serves as the normal to the level sets but here determines the plane's slope in all directions.

For illustration, consider f(x,y) = \sin x + \cos y near (0,0). Here, f(0,0) = 1 and \nabla f(0,0) = (1, 0), so the linear approximation is L(x,y) = 1 + x. For small increments (h,k), f(h,k) = \sin h + \cos k \approx h + (1 - k^2/2), confirming that the linear term 1 + h captures the dominant first-order behavior while neglecting higher-order contributions like -k^2/2. A higher-order refinement incorporates the Hessian matrix Hf(\mathbf{x}) for the quadratic term, yielding the second-order approximation f(\mathbf{x} + \mathbf{h}) \approx f(\mathbf{x}) + \nabla f(\mathbf{x}) \cdot \mathbf{h} + \frac{1}{2} \mathbf{h}^T Hf(\mathbf{x}) \mathbf{h}.

In optimization, the condition \nabla f(\mathbf{x}) = \mathbf{0} identifies critical points, which correspond to local minima when the function increases in every direction away from \mathbf{x} (for instance, when the Hessian there is positive definite). This underpins methods like gradient descent, where the gradient's direction and magnitude guide iterative improvements toward minima. The total derivative Df(\mathbf{x}) formalizes this as the linear map whose standard matrix is the 1 \times n row vector with entries the partial derivatives of f.
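The worked example above translates directly into code. This sketch (assuming the same function f(x, y) = \sin x + \cos y, with the gradient and Hessian at the origin computed by hand) compares the exact value with the first- and second-order approximations at a small displacement:

```python
import numpy as np

# Hypothetical illustration: compare first- and second-order Taylor
# approximations of f(x, y) = sin(x) + cos(y) around (0, 0).

def f(x, y):
    return np.sin(x) + np.cos(y)

# At (0,0): f = 1, grad f = (cos 0, -sin 0) = (1, 0),
# Hessian = [[-sin 0, 0], [0, -cos 0]] = [[0, 0], [0, -1]].
grad = np.array([1.0, 0.0])
H = np.array([[0.0, 0.0],
              [0.0, -1.0]])

h = np.array([0.1, 0.2])                       # small displacement (h, k)
linear = 1.0 + grad @ h                        # L(h, k) = 1 + h
quadratic = linear + 0.5 * h @ H @ h           # adds the -k**2/2 term

print(f(*h), linear, quadratic)
# exact ~ 1.0799, linear = 1.1, quadratic = 1.08: the quadratic term
# -k**2/2 captures most of the remaining error.
```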

Fréchet Derivative

The Fréchet derivative generalizes the concept of the total derivative to functions between normed vector spaces, particularly Banach spaces, providing a linear approximation that is uniform in all directions. For a function f: X \to Y, where X and Y are Banach spaces and U \subseteq X is an open set containing x, the Fréchet derivative of f at x, denoted Df(x) or T, is a bounded linear operator T: X \to Y such that f(x + h) = f(x) + T(h) + o(\|h\|) as h \to 0, where the little-o notation indicates that \|o(\|h\|)\| / \|h\| \to 0 as \|h\| \to 0. This condition ensures that the linear term T(h) captures the first-order behavior of f uniformly over the space, making it a stronger notion of differentiability than directional variants.

In the specific case of finite-dimensional Euclidean spaces, such as f: \mathbb{R}^n \to \mathbb{R}, the Fréchet derivative aligns directly with the classical gradient. Here, the bounded linear operator T is represented by the inner product T(h) = \nabla f(x) \cdot h, where \nabla f(x) is the gradient vector of f at x. The defining condition then becomes \frac{|f(x + h) - f(x) - \nabla f(x) \cdot h|}{\|h\|} \to 0 as \|h\| \to 0, illustrating how the gradient represents the Fréchet derivative in this setting by providing the best linear approximation to f near x.

An illustrative example arises in function spaces, common in the calculus of variations, where functionals map infinite-dimensional spaces like C[0,1] (continuous functions on [0,1] with the sup norm) to \mathbb{R}. Consider the integral functional \phi(f) = \int_0^1 f(x)^2 \, dx for f \in C[0,1]. The Fréchet derivative at f is the bounded linear functional A(h) = 2 \int_0^1 f(x) h(x) \, dx, satisfying \phi(f + h) = \phi(f) + A(h) + o(\|h\|_\infty) as \|h\|_\infty \to 0. This derivative, often identified through the integral pairing with multiplication by 2f(x), highlights how Fréchet differentiability facilitates optimization in such spaces by linearizing variations around a function.

The Fréchet derivative is distinguished from the weaker Gâteaux derivative, which only requires the existence of directional derivatives along each direction h (i.e., the limit along rays t h as t \to 0) that together form a linear map, but without uniformity over all directions. Fréchet differentiability always implies Gâteaux differentiability (and the two derivatives coincide), and a continuous Gâteaux derivative implies Fréchet differentiability, but Gâteaux differentiability alone does not guarantee the stronger uniform approximation essential for applications in Banach spaces.
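The expansion \phi(f + h) = \phi(f) + A(h) + o(\|h\|_\infty) can be illustrated on a grid. The sketch below (an assumed example with arbitrary choices of the base function f and the perturbation direction, using trapezoidal quadrature) shows the remainder vanishing faster than the size of the perturbation:

```python
import numpy as np

# Sketch (assumed example): check the Fréchet derivative of the functional
#   phi(f) = integral_0^1 f(x)**2 dx,  A(h) = 2 * integral_0^1 f(x) h(x) dx,
# by evaluating both on a grid and shrinking the perturbation eps*h.

xs = np.linspace(0.0, 1.0, 2001)

def integral(v):
    return np.trapz(v, xs)                    # simple quadrature on the grid

f = np.sin(np.pi * xs)                        # base function f
h = np.cos(3 * np.pi * xs)                    # fixed perturbation direction

for eps in (1e-1, 1e-2, 1e-3):
    lhs = integral((f + eps * h)**2) - integral(f**2)   # phi(f + eps*h) - phi(f)
    rhs = 2 * integral(f * (eps * h))                    # A(eps*h)
    print(eps, (lhs - rhs) / eps)             # remainder/eps shrinks like eps: o(||h||)
```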

Properties and Applications

Level Sets

In multivariable calculus, the level set of a scalar function f: \mathbb{R}^n \to \mathbb{R} at a constant value c is defined as the set L_c = \{ \mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) = c \}. Wherever the gradient \nabla f(\mathbf{x}_0) \neq \mathbf{0} at a point \mathbf{x}_0 \in L_c, this gradient vector is perpendicular to the tangent space of the level set at \mathbf{x}_0. To see this, consider a smooth curve \mathbf{r}(t) on the level set L_c passing through \mathbf{x}_0 at t=0, so f(\mathbf{r}(t)) = c for all t near 0. Differentiating with respect to t yields \frac{d}{dt} f(\mathbf{r}(t)) = \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = 0, implying that \nabla f(\mathbf{x}_0) is orthogonal to the tangent vector \mathbf{r}'(0). Since this holds for any tangent direction, \nabla f(\mathbf{x}_0) is normal to the entire tangent space of L_c at \mathbf{x}_0.

This perpendicularity has key implications for analysis and geometry. The integral curves of the gradient field, known as gradient flow lines, are everywhere orthogonal to the level sets, providing a natural way to traverse from one level set to another along the direction of maximum change. In implicit differentiation, the gradient enables computation of tangent spaces or normals to surfaces defined implicitly by f(\mathbf{x}) = c, such as in computer graphics or optimization, without explicit parameterization.

A simple example in two dimensions is f(x,y) = x^2 + y^2, whose level sets L_c = \{ (x,y) \mid x^2 + y^2 = c \} for c > 0 are circles centered at the origin. The gradient \nabla f = (2x, 2y) points radially outward, perpendicular to the tangent (circumferential) direction at every point on the circle. In physics, equipotential surfaces, the level sets of the potential V, have the electric field \mathbf{E} = -\nabla V normal to them, explaining why field lines are orthogonal to equipotentials in electrostatics. At points where \nabla f(\mathbf{x}_0) = \mathbf{0}, known as critical points, the level set L_c may develop singularities, such as cusps or isolated points, and need not form a smooth manifold; the perpendicularity property fails there, complicating local analysis.
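The circle example admits a one-line numerical check; in the sketch below (an assumed example, with the point on the level set chosen arbitrarily), the dot product of the gradient with the circle's tangent direction vanishes:

```python
import numpy as np

# Small numeric illustration (assumed example): for f(x, y) = x**2 + y**2,
# the gradient at a point on the level circle x**2 + y**2 = c is orthogonal
# to the circle's tangent direction there.

theta = 1.2
c = 4.0
p = np.sqrt(c) * np.array([np.cos(theta), np.sin(theta)])  # point on the level set

grad = 2 * p                                   # grad f = (2x, 2y)
tangent = np.array([-p[1], p[0]])              # tangent to the circle at p

print(grad @ tangent)                          # 0 (up to rounding): orthogonal
```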

Conservative Vector Fields and Gradient Theorem

A vector field \mathbf{V} defined on a domain in \mathbb{R}^n is called conservative if there exists a scalar potential function f such that \mathbf{V} = \nabla f. In \mathbb{R}^3, for a simply connected domain, a continuously differentiable vector field \mathbf{V} is conservative if and only if its curl is zero, i.e., \nabla \times \mathbf{V} = \mathbf{0}. This irrotational condition ensures that line integrals of \mathbf{V} are path-independent, meaning the integral from point \mathbf{a} to \mathbf{b} yields the same value regardless of the path taken.

The gradient theorem, also known as the fundamental theorem for line integrals, states that if \mathbf{V} = \nabla f for a scalar function f with continuous partial derivatives on a domain, then for any piecewise smooth curve C parameterized by \mathbf{r}(t) from t = a to t = b, the line integral is given by \int_C \mathbf{V} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)). This result generalizes the one-dimensional fundamental theorem of calculus to higher dimensions. The proof relies on the chain rule and the fundamental theorem of calculus. Consider the composition g(t) = f(\mathbf{r}(t)); then g'(t) = \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \mathbf{V}(\mathbf{r}(t)) \cdot \mathbf{r}'(t). Integrating both sides from a to b yields \int_a^b g'(t) \, dt = \int_a^b \mathbf{V}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt = g(b) - g(a) = f(\mathbf{r}(b)) - f(\mathbf{r}(a)), which is exactly the line integral along C.

For the potential f to exist, the domain must be simply connected (open, connected, and such that every closed curve can be continuously shrunk to a point), ensuring that \nabla \times \mathbf{V} = \mathbf{0} implies conservativeness. In non-simply connected domains, additional conditions may be needed, but the curl-zero test suffices in simply connected regions.

This theorem has key applications in physics, where conservative fields like gravitational or electrostatic forces allow work done by the field to be computed as a potential difference, independent of path. For instance, the gravitational field \mathbf{F} = - \frac{GM m}{r^2} \hat{r} derives from the scalar function f = \frac{GM m}{r} via \mathbf{F} = \nabla f, so the work done by the field from \mathbf{a} to \mathbf{b} is f(\mathbf{b}) - f(\mathbf{a}); in the usual physics convention this f is the negative of the potential energy. Similarly, in electrostatics, the electric field \mathbf{E} = - \nabla V yields work per unit charge as a voltage difference.
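The theorem is easy to confirm numerically: integrate \mathbf{V} = \nabla f along any discretized path and compare with the difference of endpoint values. The sketch below (a hypothetical example with f(x, y) = x^2 y and an arbitrary wiggly path from (0,0) to (1,1)) does exactly this:

```python
import numpy as np

# Numeric sanity check (assumed example) of the gradient theorem: the line
# integral of V = grad f along any path equals f(end) - f(start).
# Here f(x, y) = x**2 * y, so V(x, y) = (2xy, x**2).

def f(x, y):
    return x**2 * y

def V(x, y):
    return np.array([2 * x * y, x**2])

# A wiggly path r(t) from (0, 0) to (1, 1), discretized for quadrature.
t = np.linspace(0.0, 1.0, 20001)
rx = t
ry = t**2 + 0.3 * np.sin(2 * np.pi * t)
drx, dry = np.gradient(rx, t), np.gradient(ry, t)

Vx, Vy = V(rx, ry)
line_integral = np.trapz(Vx * drx + Vy * dry, t)

print(line_integral, f(1, 1) - f(0, 0))   # both ~ 1.0: path-independent
```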

Direction of Steepest Ascent

The gradient \nabla f(\mathbf{x}) at a point \mathbf{x} in the domain of a differentiable scalar function f points in the direction of steepest ascent of f, meaning it maximizes the directional derivative among all unit vectors. The magnitude |\nabla f(\mathbf{x})| equals the supremum of the directional derivatives \nabla f(\mathbf{x}) \cdot \mathbf{u} over all unit vectors \mathbf{u} with |\mathbf{u}| = 1, and the maximizing direction is given by the unit vector \nabla f(\mathbf{x}) / |\nabla f(\mathbf{x})|. This property arises because the directional derivative \nabla f(\mathbf{x}) \cdot \mathbf{u} represents the rate of change of f along the direction \mathbf{u}, and the maximum occurs when \mathbf{u} aligns with \nabla f(\mathbf{x}).

To see this formally, apply the Cauchy-Schwarz inequality to the inner product: |\nabla f(\mathbf{x}) \cdot \mathbf{u}| \leq |\nabla f(\mathbf{x})| \cdot |\mathbf{u}| = |\nabla f(\mathbf{x})|, since |\mathbf{u}| = 1. Equality holds exactly when \mathbf{u} is parallel to \nabla f(\mathbf{x}), confirming that the gradient direction achieves the supremum and that |\nabla f(\mathbf{x})| is the maximum rate of increase. The direction of steepest descent, which maximizes the rate of decrease, is then -\nabla f(\mathbf{x}) / |\nabla f(\mathbf{x})|. This duality between ascent and descent directions is fundamental in analyzing the local behavior of functions.

In optimization, the steepest ascent property underpins gradient ascent algorithms, where iterates are updated as \mathbf{x}_{k+1} = \mathbf{x}_k + t_k \nabla f(\mathbf{x}_k) for a step size t_k > 0 to maximize concave or nonconvex objectives, such as likelihood functions in statistical models. Similarly, the flow lines of the gradient field (curves \mathbf{r}(t) satisfying \frac{d\mathbf{r}}{dt} = \nabla f(\mathbf{r}(t))) trace paths of steepest ascent, representing trajectories that follow the field's direction at each point. These paths align with the normals to the level sets of f, pointing toward regions of higher function values.

As an illustrative example, consider the function f(x, y) = -x^2 - y^2 in \mathbb{R}^2, which models a downward-opening paraboloid with a global maximum at the origin. At a point (x_0, y_0) away from the origin, \nabla f(x_0, y_0) = (-2x_0, -2y_0), so the unit direction of steepest ascent is (-x_0, -y_0)/\sqrt{x_0^2 + y_0^2}, directing movement inward toward the peak; following this repeatedly simulates hill-climbing to the maximum.
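A few lines of code suffice to simulate this hill-climbing. The sketch below (an assumed example with a fixed step size; in practice t_k is often chosen by a line search) runs gradient ascent on f(x, y) = -x^2 - y^2 from an arbitrary starting point:

```python
import numpy as np

# Minimal gradient-ascent sketch (assumed example) on f(x, y) = -x**2 - y**2,
# whose unique maximum is at the origin. Iterates follow +grad f.

def grad_f(p):
    return -2.0 * p                      # gradient of -x**2 - y**2

p = np.array([3.0, -4.0])                # starting point
step = 0.1                               # fixed step size t_k

for k in range(50):
    p = p + step * grad_f(p)             # x_{k+1} = x_k + t_k * grad f(x_k)

print(p)                                 # ~ (0, 0): converged to the maximum
```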

Generalizations

Jacobian Matrix

The Jacobian matrix generalizes the gradient to functions mapping from \mathbb{R}^n to \mathbb{R}^m, where m > 1. For a differentiable function \mathbf{F}: \mathbb{R}^n \to \mathbb{R}^m with components F_1, \dots, F_m, the Jacobian matrix J_\mathbf{F} at a point \mathbf{x} \in \mathbb{R}^n is the m \times n matrix whose i-th row is the gradient vector \nabla F_i(\mathbf{x}), given by J_\mathbf{F}(\mathbf{x}) = \begin{pmatrix} \frac{\partial F_1}{\partial x_1} & \cdots & \frac{\partial F_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial F_m}{\partial x_1} & \cdots & \frac{\partial F_m}{\partial x_n} \end{pmatrix}. This matrix represents the best linear approximation to \mathbf{F} near \mathbf{x}, capturing how changes in the input variables affect each output component. When m = 1, so \mathbf{F} = f: \mathbb{R}^n \to \mathbb{R} is scalar-valued, the Jacobian matrix reduces to a 1 \times n row vector that is the transpose of the standard column vector \nabla f. In this case, J_f(\mathbf{x}) = (\nabla f(\mathbf{x}))^T, linking the two concepts directly as the Jacobian extends the directional information of the gradient to multiple outputs.

Key properties of the Jacobian include the chain rule for compositions: if \mathbf{F}: \mathbb{R}^m \to \mathbb{R}^p and \mathbf{G}: \mathbb{R}^n \to \mathbb{R}^m are differentiable, then J_{\mathbf{F} \circ \mathbf{G}}(\mathbf{x}) = J_\mathbf{F}(\mathbf{G}(\mathbf{x})) \cdot J_\mathbf{G}(\mathbf{x}). When the Jacobian is square (m = n), its determinant \det J_\mathbf{F}(\mathbf{x}) measures the local scaling of volumes under the transformation \mathbf{F}, with |\det J_\mathbf{F}(\mathbf{x})| giving the factor by which infinitesimal volumes in the input space are multiplied in the output space. If \det J_\mathbf{F}(\mathbf{x}) \neq 0, then \mathbf{F} is locally invertible near \mathbf{x}, establishing it as a local diffeomorphism by the inverse function theorem.

A representative example is the transformation from polar to Cartesian coordinates in \mathbb{R}^2, defined by x = r \cos \theta, y = r \sin \theta. The Jacobian matrix is J = \begin{pmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{pmatrix}, with determinant \det J = r. This positive value for r > 0 indicates that the transformation stretches areas by a factor of r, explaining the extra factor of r in polar integrals.

Applications of the Jacobian include change of variables in multiple integrals, where for a transformation \mathbf{T}: \mathbb{R}^n \to \mathbb{R}^n, the integral satisfies \int_{\mathbf{T}(D)} f(\mathbf{y}) \, d\mathbf{y} = \int_D f(\mathbf{T}(\mathbf{u})) |\det J_\mathbf{T}(\mathbf{u})| \, d\mathbf{u}. The absolute value of the determinant ensures the integral accounts for orientation-preserving or reversing effects while preserving the total measure. Additionally, the invertibility condition via a nonzero determinant is essential for confirming local diffeomorphisms in analysis and geometry.
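The polar-coordinate example can be reproduced symbolically; a minimal SymPy sketch (assuming the library is available) computes the Jacobian matrix and its determinant:

```python
import sympy as sp

# Symbolic sketch (assumed example): Jacobian of the polar-to-Cartesian map
# x = r*cos(theta), y = r*sin(theta), and its determinant.
r, theta = sp.symbols('r theta', positive=True)
x = r * sp.cos(theta)
y = r * sp.sin(theta)

J = sp.Matrix([x, y]).jacobian([r, theta])
print(J)                      # Matrix([[cos(theta), -r*sin(theta)], [sin(theta), r*cos(theta)]])
print(sp.simplify(J.det()))   # r, the area-scaling factor in polar integrals
```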

Gradient of Vector Fields

The gradient of a vector field \mathbf{V}: \mathbb{R}^3 \to \mathbb{R}^3 is a second-order tensor, represented as a 3 \times 3 matrix whose entries are the partial derivatives of the components of \mathbf{V}. Specifically, the components are given by (\nabla \mathbf{V})_{ij} = \frac{\partial V_i}{\partial x_j}, where the i-th row corresponds to the gradient of the scalar component V_i. In explicit matrix form, \nabla \mathbf{V} = \begin{pmatrix} \frac{\partial V_1}{\partial x_1} & \frac{\partial V_1}{\partial x_2} & \frac{\partial V_1}{\partial x_3} \\ \frac{\partial V_2}{\partial x_1} & \frac{\partial V_2}{\partial x_2} & \frac{\partial V_2}{\partial x_3} \\ \frac{\partial V_3}{\partial x_1} & \frac{\partial V_3}{\partial x_2} & \frac{\partial V_3}{\partial x_3} \end{pmatrix}. This matrix is a special case of the Jacobian matrix for vector-valued functions from \mathbb{R}^3 to \mathbb{R}^3.

The gradient tensor can be decomposed into its symmetric and antisymmetric parts, which capture the deformation and rotation of the field, respectively. The trace of \nabla \mathbf{V} equals the divergence \nabla \cdot \mathbf{V} = \sum_{i=1}^3 \frac{\partial V_i}{\partial x_i}, measuring the net flux out of a volume element. The antisymmetric part relates to the curl \nabla \times \mathbf{V}, where the curl vector is twice the axial vector associated with this antisymmetric tensor.

In fluid dynamics, the gradient of the velocity field \mathbf{u} plays a central role in describing local flow kinematics. A divergence of zero, \nabla \cdot \mathbf{u} = 0, characterizes incompressible flows, where fluid elements neither expand nor contract, simplifying the Navier-Stokes equations. The curl \nabla \times \mathbf{u} defines the vorticity \boldsymbol{\omega}, which quantifies the local rotation or spinning of fluid parcels around an axis. For example, consider a simple shear flow with velocity field \mathbf{u} = (y, 0, 0). The gradient tensor is \nabla \mathbf{u} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, yielding \nabla \cdot \mathbf{u} = 0 (incompressible) and \boldsymbol{\omega} = \nabla \times \mathbf{u} = (0, 0, -1), indicating uniform vorticity in the negative z-direction due to shearing.
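For the shear-flow example, the gradient tensor is constant, so the divergence and vorticity can be read off directly; the sketch below (an illustrative example, with the curl assembled by hand from the tensor's off-diagonal entries) confirms the values quoted above:

```python
import numpy as np

# Sketch (assumed example): build the gradient tensor of the shear flow
# u = (y, 0, 0) analytically, then read off divergence and vorticity.

# (grad u)_{ij} = d u_i / d x_j; only d u_1 / d x_2 = 1 is nonzero.
grad_u = np.array([[0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])

div_u = np.trace(grad_u)                       # divergence = trace = 0

# Curl components from the antisymmetric entries:
# omega = (du3/dx2 - du2/dx3, du1/dx3 - du3/dx1, du2/dx1 - du1/dx2).
omega = np.array([grad_u[2, 1] - grad_u[1, 2],
                  grad_u[0, 2] - grad_u[2, 0],
                  grad_u[1, 0] - grad_u[0, 1]])

print(div_u, omega)                            # 0.0, [0, 0, -1]
```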

On Riemannian Manifolds

In a Riemannian manifold (M, g), the gradient of a smooth scalar function f: M \to \mathbb{R} is the unique vector field \nabla f satisfying g(\nabla f, X) = df(X) for every smooth vector field X on M, where df is the differential of f. Equivalently, \nabla f is obtained by applying the musical isomorphism induced by the metric g, which raises the index of the covector df, yielding \nabla f = g^{-1}(df). This definition ensures that \nabla f points in the direction of steepest ascent of f with respect to the geometry defined by g.

In local coordinates (x^i) on M, the components of the gradient are given by \nabla f = g^{ij} \frac{\partial f}{\partial x^j} \frac{\partial}{\partial x^i}, where g^{ij} are the entries of the inverse metric tensor and summation over repeated indices is implied. This expression arises directly from contracting the covector components \frac{\partial f}{\partial x^j} with g^{ij}, without involvement of connection terms, since the covariant derivative of a scalar function is simply its differential. The squared norm of the gradient is then |\nabla f|^2 = g(\nabla f, \nabla f) = g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial f}{\partial x^j}, which quantifies the maximum rate of change of f at each point. The integral curves of \nabla f, known as gradient flow lines, satisfy the ordinary differential equation \frac{d\gamma}{dt} = \nabla f(\gamma(t)); f increases monotonically along them, and they cross each level set of f orthogonally.

A classic example occurs on the unit sphere S^2 \subset \mathbb{R}^3 endowed with the Riemannian metric induced from the Euclidean inner product. For the height function f(p) = z, where p = (x, y, z) \in S^2 and z is the third coordinate, the gradient at p is the orthogonal projection of the ambient gradient (0, 0, 1) onto the tangent plane T_p S^2, given explicitly by \nabla f(p) = (0, 0, 1) - z p = (-xz, -yz, 1 - z^2). This vanishes at the poles (0, 0, \pm 1), the critical points of f, and points along the meridians elsewhere, directing the gradient flow toward the north pole.

When the metric is flat, such as in Cartesian coordinates where the metric is \delta_{ij} and the Christoffel symbols \Gamma^k_{ij} = 0, the expression simplifies to the classical gradient \nabla f = \sum_i \frac{\partial f}{\partial x^i} \frac{\partial}{\partial x^i}, recovering the familiar directional derivative structure. This flat limit highlights how the Riemannian gradient generalizes the Euclidean case to account for intrinsic geometry via the metric.
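The sphere example reduces to a Euclidean projection, which is easy to verify numerically; the sketch below (with an arbitrarily chosen point p on S^2) computes \nabla f(p) = (0, 0, 1) - z p and checks that it is tangent to the sphere:

```python
import numpy as np

# Numeric sketch (assumed example): Riemannian gradient of the height
# function f(p) = z on the unit sphere, as the tangential projection of the
# ambient gradient (0, 0, 1). Check that the result is tangent to the sphere.

p = np.array([0.6, 0.0, 0.8])                # point on S^2 (|p| = 1)
ambient = np.array([0.0, 0.0, 1.0])          # Euclidean gradient of f(x, y, z) = z

grad_sphere = ambient - (ambient @ p) * p    # = (-xz, -yz, 1 - z**2)
print(grad_sphere)                           # [-0.48, 0.0, 0.36]
print(grad_sphere @ p)                       # 0: tangent to the sphere at p
```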
