Sum of squares
In mathematics, the sum of squares refers to the aggregation of the squared values of a set of numbers, serving as a foundational operation with applications across algebra, statistics, number theory, geometry, and optimization.[1] This simple yet powerful concept underpins measures of dispersion, algebraic identities, and the representation of integers as sums of integer squares.[2] For a finite set of real numbers x_1, x_2, \dots, x_n, the sum of squares is defined as \sum_{i=1}^n x_i^2.[3]

In statistics, the sum of squares is a key measure of variability within a dataset, calculated as the total of squared differences between each observation and the sample mean.[2] It partitions data variation into components such as the total sum of squares (SST or TSS), which captures overall deviation from the mean via \sum (y_i - \bar{y})^2; the regression sum of squares (SSR), quantifying variation explained by a model as \sum (\hat{y}_i - \bar{y})^2; and the error sum of squares (SSE), representing unexplained residuals as \sum (y_i - \hat{y}_i)^2, so that SST = SSR + SSE.[4] These decompositions are essential in regression analysis and analysis of variance (ANOVA), enabling assessments of model fit through metrics such as R-squared (SSR/SST) and hypothesis testing via F-statistics derived from mean squares (sums of squares divided by their degrees of freedom).[2]

Algebraically, sums of squares appear in identities that facilitate expansions and simplifications, such as the two-variable identity x^2 + y^2 = (x + y)^2 - 2xy, which follows directly from the binomial square formula.[3] For sequences, closed-form formulas exist, including the sum of squares of the first n natural numbers, \sum_{k=1}^n k^2 = \frac{n(n+1)(2n+1)}{6}, which can be proven by mathematical induction or by telescoping series.[3] Similar expressions hold for even and odd numbers, for example \sum_{k=1}^n (2k)^2 = \frac{2n(n+1)(2n+1)}{3}.[3]

In number theory, the sum of squares function r_k(n) counts the ways a positive integer n can be expressed as the sum of k integer squares, allowing zeros and distinguishing order.[1] Landmark results include Fermat's theorem on sums of two squares (1636), which states that a prime p is expressible as p = a^2 + b^2 if and only if p = 2 or p \equiv 1 \pmod{4}; Euler extended this to all positive integers, which are sums of two squares exactly when every prime congruent to 3 \pmod{4} appears with an even exponent in their factorization.[1] Lagrange's four-square theorem (1770) asserts that every natural number is a sum of at most four squares, while Legendre's three-square theorem (1798) characterizes those expressible as three squares, excluding integers of the form 4^a(8b + 7).[5] These theorems, rooted in Diophantus's early work, connect arithmetic progressions and modular forms to deeper analytic number theory.[1]
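As a brief illustrative check of two of the formulas above, the following Python sketch verifies the closed-form expression \sum_{k=1}^n k^2 = \frac{n(n+1)(2n+1)}{6} for small n and searches for two-square representations of small integers; the helper names are ad hoc and the snippet is not drawn from the cited sources.

```python
from math import isqrt

def sum_of_squares_formula(n: int) -> int:
    """Closed form n(n+1)(2n+1)/6 for 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6

# Brute-force check of the closed form for the first few hundred values of n.
assert all(sum(k * k for k in range(1, n + 1)) == sum_of_squares_formula(n)
           for n in range(1, 300))

def two_square_representations(p: int) -> list[tuple[int, int]]:
    """All pairs (a, b) with 0 <= a <= b and a^2 + b^2 = p."""
    reps = []
    for a in range(isqrt(p) + 1):
        b = isqrt(p - a * a)
        if b * b == p - a * a and a <= b:
            reps.append((a, b))
    return reps

# Consistent with Fermat's criterion: 13 ≡ 1 (mod 4) has a representation,
# while 7 ≡ 3 (mod 4) has none.
print(two_square_representations(13))  # [(2, 3)]
print(two_square_representations(7))   # []
```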
Fundamental Concepts
Definition and Notation
In mathematics, the sum of squares refers to the aggregate of the squares of a finite collection of real numbers a_1, \dots, a_n, formally defined as \sum_{i=1}^n a_i^2. This expression quantifies the total squared magnitude of the numbers and forms the foundation for various norms and measures in algebra and analysis. For instance, in the simple case of two scalars a and b, the sum of squares is a^2 + b^2, which represents the squared distance from the origin in the plane spanned by these values.[6] In vector spaces, the sum of squares commonly appears in the notation for the squared Euclidean norm of a vector \mathbf{x} = (x_1, \dots, x_n) \in \mathbb{R}^n, denoted \|\mathbf{x}\|^2 = \sum_{i=1}^n x_i^2. This notation emphasizes the connection to the length of the vector, since the norm itself is the square root of this sum.

Historically, early explorations of such expressions were linked to quadratic forms; Leonhard Euler, in his 18th-century investigations of Diophantine problems, employed notations for quadratic polynomials such as a x^2 \pm b x \pm c = 0, and quadratic expressions of this kind underlie the study of sums of squares in multivariate settings.[7] The concept extends naturally to complex numbers, where the squared modulus of z \in \mathbb{C} is defined as |z|^2 = z \overline{z}, with \overline{z} denoting the complex conjugate. For a complex vector, this generalizes componentwise to \sum_i |z_i|^2.[8]
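The following short Python sketch (using NumPy; the variable names are chosen here for illustration) shows the three notations described above: the plain sum of squares of scalars, the squared Euclidean norm of a real vector, and the sum of squared moduli of complex entries.

```python
import numpy as np

# Two scalars a and b: their sum of squares a^2 + b^2.
a, b = 3.0, 4.0
print(a**2 + b**2)                  # 25.0, the squared distance from the origin

# Squared Euclidean norm of a real vector: ||x||^2 = sum_i x_i^2.
x = np.array([1.0, 2.0, 2.0])
print(np.dot(x, x))                 # 9.0
print(np.linalg.norm(x) ** 2)       # 9.0, the norm is the square root of the sum

# Complex vector: componentwise squared moduli, sum_i |z_i|^2 = sum_i z_i * conj(z_i).
z = np.array([1 + 1j, 2 - 1j])
print(np.sum(z * np.conj(z)).real)  # 7.0  (|1+i|^2 = 2, |2-i|^2 = 5)
```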
Basic Properties
The sum of squares of real numbers is fundamentally non-negative: \sum_{i=1}^n a_i^2 \geq 0 for any real a_i, with equality holding if and only if a_1 = a_2 = \dots = a_n = 0.[9] This property follows directly from the non-negativity of squares of real numbers and the additivity of the sum.[10] Sums of squares are homogeneous of degree 2, meaning that scaling the inputs by a constant c \in \mathbb{R} scales the sum by c^2: \sum_{i=1}^n (c a_i)^2 = c^2 \sum_{i=1}^n a_i^2.[10] This homogeneity arises from the quadratic nature of the expression and is preserved under linear transformations of the variables.[10]

In the context of vector spaces, the sum of squares \sum_{i=1}^n x_i^2 can be expressed as the quadratic form \mathbf{x}^T A \mathbf{x}, where \mathbf{x} = (x_1, \dots, x_n)^T is the column vector and A is the n \times n identity matrix, which is positive definite.[10] More generally, any positive semidefinite quadratic form \mathbf{x}^T A \mathbf{x}, with A symmetric and all eigenvalues nonnegative, can be written as a weighted sum of squares after diagonalization.[9] This connection underscores the role of sums of squares in defining norms and inner products in Euclidean spaces.

A key inequality involving sums of squares is the Cauchy-Schwarz inequality, which states that for real sequences a_i and b_i, \left( \sum_{i=1}^n a_i b_i \right)^2 \leq \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right), with equality if and only if the sequences are proportional.[11] A proof sketch via quadratic forms considers the expression \sum_{i=1}^n (a_i + \lambda b_i)^2 \geq 0 for all real \lambda, which expands to a quadratic in \lambda: \left( \sum b_i^2 \right) \lambda^2 + 2 \left( \sum a_i b_i \right) \lambda + \sum a_i^2 \geq 0.[11] For this quadratic to be nonnegative for all \lambda, its discriminant must be nonpositive, yielding \left( \sum a_i b_i \right)^2 \leq \left( \sum a_i^2 \right) \left( \sum b_i^2 \right).[11]
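A small numerical sketch of these properties (illustrative only, using randomly generated data) checks the quadratic-form representation with the identity matrix, the degree-2 homogeneity, and the Cauchy-Schwarz inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)
a = rng.normal(size=5)
b = rng.normal(size=5)

# Sum of squares as the quadratic form x^T A x with A the identity matrix.
A = np.eye(5)
assert np.isclose(x @ A @ x, np.sum(x**2))

# Homogeneity of degree 2: scaling the inputs by c scales the sum by c^2.
c = 3.7
assert np.isclose(np.sum((c * x)**2), c**2 * np.sum(x**2))

# Cauchy-Schwarz: (sum a_i b_i)^2 <= (sum a_i^2)(sum b_i^2).
assert np.dot(a, b)**2 <= np.sum(a**2) * np.sum(b**2)
print("all properties verified on this sample")
```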
Applications in Statistics
Sum of Squared Errors
The sum of squared errors (SSE), also known as the residual sum of squares, is a measure of the discrepancy between observed values y_i and predicted values \hat{y}_i in a dataset, defined as \text{SSE} = \sum_{i=1}^n (y_i - \hat{y}_i)^2, where n is the number of observations.[12] This metric quantifies the total squared deviation of the residuals, providing a way to assess how well a model fits the data, with smaller values indicating a better fit.[12] The SSE plays a central role in the method of least squares, where the objective is to minimize this sum in order to determine the model parameters that best approximate the observed data.[13] In simple linear regression, minimizing the SSE leads to the ordinary least squares estimator for the slope coefficient, \hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, where \bar{x} and \bar{y} are the sample means of the predictor x_i and response y_i, respectively.[14] This approach was developed in the early 19th century by Adrien-Marie Legendre and Carl Friedrich Gauss, primarily for fitting models to astronomical observations; Legendre published the first clear exposition in 1805, while Gauss claimed prior invention around 1795.[15]

For example, consider fitting a line to three data points: (1, 2), (2, 4), and (3, 5). An initial guess of the line \hat{y} = 2x yields predicted values 2, 4, and 6, giving residuals 0, 0, and -1, so SSE = 0^2 + 0^2 + (-1)^2 = 1. Optimizing via least squares gives the fitted line \hat{y} = 1.5x + \frac{2}{3}, with predicted values \frac{13}{6}, \frac{11}{3}, and \frac{31}{6}, residuals -\frac{1}{6}, \frac{1}{3}, and -\frac{1}{6}, and SSE = \left(-\frac{1}{6}\right)^2 + \left(\frac{1}{3}\right)^2 + \left(-\frac{1}{6}\right)^2 = \frac{1}{6}, demonstrating the reduction achieved by minimization.
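The worked example above can be reproduced with a few lines of Python (the function and variable names are chosen here for illustration, not taken from a particular library):

```python
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]

def sse(slope: float, intercept: float) -> float:
    """Sum of squared errors for the line y_hat = slope * x + intercept."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

print(sse(2.0, 0.0))        # initial guess y_hat = 2x  ->  SSE = 1.0

# Ordinary least squares estimates of the slope and intercept.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
beta1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
beta0 = y_bar - beta1 * x_bar
print(beta1, beta0)         # 1.5 and 2/3
print(sse(beta1, beta0))    # 1/6 ≈ 0.1667, the minimum over all lines
```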
Variance and Analysis of Variance
In statistics, the sum of squared deviations measures the total variability in a dataset relative to its mean, providing a foundational quantity for assessing dispersion. For a sample of n observations x_1, x_2, \dots, x_n with sample mean \bar{x}, the total sum of squares (SST) is defined as \text{SST} = \sum_{i=1}^n (x_i - \bar{x})^2. This quantity captures the overall spread of the data around the central tendency. The sample variance, which normalizes this sum to estimate population variability, is then computed as s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2, where the denominator n-1 applies Bessel's correction to yield an unbiased estimator of the population variance.[16]

This decomposition of variability becomes central in analysis of variance (ANOVA), a method developed by Ronald Fisher to partition the total sum of squares into components attributable to different sources, enabling inference about group differences. In one-way ANOVA, which compares means across k groups with N total observations, SST decomposes into the between-group sum of squares (SSB), capturing variability due to group means, and the within-group sum of squares (SSW), reflecting variability within groups: \text{SST} = \text{SSB} + \text{SSW}, where \text{SSB} = \sum_{j=1}^k n_j (\bar{y}_j - \bar{y})^2 and \text{SSW} = \sum_{j=1}^k \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2, with \bar{y}_j the mean of group j and \bar{y} the grand mean. This partitioning, rooted in least squares principles, allows testing whether observed differences in group means exceed what would be expected by chance.[17]

The ANOVA table summarizes this decomposition, including mean squares (MS) obtained by dividing sums of squares by their degrees of freedom (df). For one-way ANOVA, the degrees of freedom are df_between = k - 1 and df_within = N - k, with total df = N - 1. The between-group mean square is MSB = SSB / (k - 1), and the within-group mean square is MSW = SSW / (N - k). The F-statistic, which tests the null hypothesis of equal group means, is F = MSB / MSW, following an F-distribution with (k - 1, N - k) degrees of freedom under the null. A large F-value indicates that between-group variability significantly exceeds within-group variability, leading to rejection of the null hypothesis.[18]

To illustrate, consider a one-way ANOVA comparing crop yields across three fertilizer treatments (k = 3), with group sizes n_1 = n_2 = n_3 = 5 (N = 15) and yields (in kg): Group 1: 4.2, 4.5, 4.8, 5.0, 4.7; Group 2: 5.1, 5.3, 5.6, 5.4, 5.2; Group 3: 6.0, 6.2, 5.9, 6.1, 6.3. The grand mean is \bar{y} \approx 5.35. Computations yield SST ≈ 5.96 (df = 14), SSB ≈ 5.34 (df = 2), and SSW ≈ 0.62 (df = 12), with MSB ≈ 2.67, MSW ≈ 0.0517, and F ≈ 51.7 (p < 0.001), indicating significant differences among the treatments; a computational sketch reproducing these figures follows the table. The resulting ANOVA table is:
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | 5.34 | 2 | 2.67 | 51.7 |
| Within | 0.62 | 12 | 0.0517 | |
| Total | 5.96 | 14 | | |
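The figures in this example and table can be reproduced with the short Python sketch below (the data are those given above; variable names are chosen here for illustration). As a cross-check, the same F-statistic is returned by scipy.stats.f_oneway(*groups).

```python
groups = [
    [4.2, 4.5, 4.8, 5.0, 4.7],   # fertilizer treatment 1
    [5.1, 5.3, 5.6, 5.4, 5.2],   # fertilizer treatment 2
    [6.0, 6.2, 5.9, 6.1, 6.3],   # fertilizer treatment 3
]
k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / N
group_means = [sum(g) / len(g) for g in groups]

# Between-group, within-group, and total sums of squares.
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
ssw = sum((yi - m) ** 2 for g, m in zip(groups, group_means) for yi in g)
sst = sum((yi - grand_mean) ** 2 for g in groups for yi in g)

# Mean squares and the F-statistic.
msb, msw = ssb / (k - 1), ssw / (N - k)
f_stat = msb / msw
print(round(ssb, 2), round(ssw, 2), round(sst, 2))     # 5.34 0.62 5.96
print(round(msb, 2), round(msw, 4), round(f_stat, 1))  # 2.67 0.0517 51.7
```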