Multiplication algorithm
A multiplication algorithm is a systematic procedure for computing the product of two mathematical objects, such as integers or polynomials, through a finite sequence of arithmetic operations, with variants optimized for efficiency in terms of time and space complexity depending on the operand size and computational model. These algorithms form a cornerstone of computational mathematics and computer science, enabling everything from basic arithmetic in processors to high-precision calculations in cryptography and scientific simulations.[1]
The most straightforward multiplication algorithm, often taught in elementary education, employs the long multiplication method, which generates partial products by multiplying the multiplicand by each digit of the multiplier and then summing them with appropriate shifts, achieving a time complexity of O(n²) for n-digit numbers.[2] This approach is simple to implement in software and hardware but becomes inefficient for large n, prompting the development of faster alternatives. In hardware designs, such as those in digital processors, shift-and-add techniques extend this by iteratively shifting the multiplicand and adding it to an accumulator based on multiplier bits, typically requiring O(n) cycles for n-bit operands.[1] Optimizations like Booth's recoding (introduced in 1951) reduce the number of additions by encoding the multiplier to handle sequences of identical bits efficiently, using values like -1, 0, or +1, which cuts partial additions by nearly half on average.[1]
For larger numbers, divide-and-conquer strategies offer asymptotic improvements over the quadratic baseline. The Karatsuba algorithm, devised by Anatolii Karatsuba in 1962, recursively splits operands into halves and computes the product using only three multiplications of half-sized numbers instead of four, plus some additions and shifts, yielding a complexity of O(n^{log_2 3}) ≈ O(n^{1.585}).[3] This breakthrough inspired generalizations like the Toom-Cook algorithm, which evaluates polynomials at multiple points for even better performance in practice for moderate sizes. For extremely large integers, fast Fourier transform (FFT)-based methods dominate, with the Schönhage-Strassen algorithm (1971) achieving O(n log n log log n) time by representing numbers as polynomials, performing convolution via FFT over rings with roots of unity, and managing carry propagation.[3][4] Later refinements, such as Fürer's algorithm (2007) and Harvey and van der Hoeven's 2019 algorithm, have pushed toward, and in 2019 reached, the theoretical O(n log n) bound, though practical arbitrary-precision libraries like GMP continue to rely on FFT-based methods such as Schönhage-Strassen.[3]
In hardware contexts, advanced multipliers employ parallel structures to reduce latency. Array multipliers use a grid of full adders to generate and sum partial products in a regular, pipelined fashion, suitable for VLSI implementation but with O(n) delay.[1] Tree multipliers, such as Wallace or Dadda trees, leverage carry-save adders (CSAs) to compress partial products in logarithmic depth, achieving O(log n) latency at the cost of increased area, and are common in modern CPUs and GPUs for floating-point units.[1] High-radix variants, processing multiple bits per cycle (e.g., radix-4 or -8), further minimize iterations using multiplexers and precomputed multiples, balancing speed and complexity in embedded systems.[1] Overall, the evolution of multiplication algorithms reflects ongoing trade-offs between theoretical efficiency, practical implementation, and application-specific constraints.
Manual Multiplication Methods
Long Multiplication
Long multiplication, also known as the grade-school or standard multiplication algorithm, is a manual method for computing the product of two multi-digit numbers by generating partial products for each digit of the multiplier and summing them with appropriate shifts for place value. This approach leverages the distributive property of multiplication over addition, treating the multiplier's digits as contributions scaled by powers of the base (typically 10 in decimal arithmetic). It is particularly effective for pencil-and-paper calculations involving numbers up to several digits long.[5][6]
The step-by-step process begins by aligning the two numbers vertically, with the multiplicand (the number being multiplied) placed above the multiplier (the number doing the multiplying), both right-justified. Starting with the units digit of the multiplier, multiply it by each digit of the multiplicand from right to left, recording the units digit of each single-digit product and carrying its tens digit into the next product to the left. Shift the next partial product one position to the left (appending a zero) for the tens digit of the multiplier, and repeat the multiplication and shifting for each subsequent digit, increasing the shift by one position per place value. Finally, add all the partial products column by column from right to left, again managing carries explicitly to obtain the total product. This method ensures accurate place value alignment and systematic error checking through intermediate steps.[7][8]
A detailed example illustrates the process for multiplying 123 by 456. First, multiply 123 by 6 (units digit of 456), yielding 738 (123 × 6 = 738). Next, multiply 123 by 50 (tens digit, shifted left by one), giving 6150 (123 × 5 = 615, then append 0). Then, multiply 123 by 400 (hundreds digit, shifted left by two), resulting in 49200 (123 × 4 = 492, then append two 0s). Add the partial products: 738 + 6150 = 6888; 6888 + 49200 = 56088. Thus, 123 × 456 = 56088. Carries are handled within each partial product multiplication, such as when 3 × 6 = 18 (write 8, carry 1 to the next column).[9][10]
123
× 456
-----
738 (123 × 6)
6150 (123 × 50, shifted)
49200 (123 × 400, shifted)
-----
56088
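The written procedure maps directly to code. Below is a minimal Python sketch of digit-by-digit long multiplication on decimal digit lists; the function name long_multiply is illustrative, not from any particular library.

```python
def long_multiply(x: int, y: int) -> int:
    """Schoolbook long multiplication on decimal digits, O(n^2) digit operations."""
    # Store digits least-significant first so index i corresponds to place value 10^i.
    xd = [int(d) for d in str(x)][::-1]
    yd = [int(d) for d in str(y)][::-1]
    result = [0] * (len(xd) + len(yd))
    for j, yj in enumerate(yd):          # one partial product per multiplier digit
        carry = 0
        for i, xi in enumerate(xd):
            total = result[i + j] + xi * yj + carry
            result[i + j] = total % 10   # keep the units digit in this column
            carry = total // 10          # carry the rest into the next column
        result[j + len(xd)] += carry
    return int("".join(map(str, result[::-1])))

assert long_multiply(123, 456) == 56088
```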
Variations in notation emphasize vertical alignment to maintain place values, with the multiplicand and multiplier stacked and a multiplication symbol (×) placed between them. Some presentations use diagonal lines or arrows to visually connect the digits being multiplied, highlighting the "cross-multiplications" between corresponding positions, though this is optional for clarity. Carries are often noted explicitly above the digits or in a separate row during partial product formation and the final addition to avoid errors in manual computation.[11][12]
For hand calculations with n-digit numbers, long multiplication requires approximately n² single-digit multiplications and additions, resulting in O(n²) time complexity, which is efficient enough for typical educational and everyday use but scales poorly for very large numbers. Its structured, repeatable steps make it highly suitable for teaching in schools, as it builds understanding of place value and partial products without requiring advanced tools.[13][14]
Grid Method
The grid method, also known as the box method or area model, is a visual technique for multiplying multi-digit numbers by decomposing them into place values and organizing partial products in a tabular format.[8][15] To construct the grid, the digits of each factor are separated by place value—for instance, a two-digit number like 23 is broken into 20 and 3—creating rows for one factor's components and columns for the other's, forming cells at each intersection.[16] Each cell is then filled with the product of the corresponding row and column values, representing partial products shifted appropriately for place value.[15]
The summation process involves adding the partial products from the grid, carrying any overflow into the next place value (tens, hundreds, and so on) as needed.[8] This step-by-step aggregation ensures the total product is calculated without losing track of tens, hundreds, or higher places.[16]
For example, to multiply 23 by 14, decompose 23 into 20 and 3 (rows) and 14 into 10 and 4 (columns), forming a 2×2 grid:

| × | 10 | 4 |
|---|---|---|
| 20 | 200 | 80 |
| 3 | 30 | 12 |
Summing the cells—200 + 80 + 30 + 12—yields 322.[15][8]
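The decomposition can be sketched in a few lines of Python; the function name grid_multiply is illustrative.

```python
def grid_multiply(x: int, y: int) -> int:
    """Grid/box method: decompose each factor by place value and sum the cell products."""
    def place_values(n: int) -> list:
        # e.g. 23 -> [20, 3]; 14 -> [10, 4]
        return [int(d) * 10**i for i, d in enumerate(reversed(str(n)))][::-1]
    # Each (a, b) pair corresponds to one cell of the grid.
    return sum(a * b for a in place_values(x) for b in place_values(y))

assert grid_multiply(23, 14) == 322  # 200 + 80 + 30 + 12
```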
This method shares conceptual similarities with long multiplication by relying on partial products but organizes them spatially for simultaneous visualization rather than sequential computation.[16] It is particularly advantageous for beginners, as the grid visually separates place values and partial products, reducing errors in alignment and fostering a deeper understanding of how decomposition contributes to the overall product.[15][8]
Lattice Multiplication
Lattice multiplication, also known as the gelosia method, is a visual technique for multiplying multi-digit numbers that employs a grid with diagonal lines in each cell to separate the units and tens components of partial products, allowing for organized addition along parallel diagonals.[17] This method emphasizes the separation of multiplication and addition steps, providing a structured way to handle place values and carries, which aids in error detection and conceptual understanding of the process.[17]
The lattice method traces its origins to early Hindu mathematics, where it appeared in texts such as the 16th-century commentary by Ganesa on Bhaskara II's Lilavati[18], and was subsequently adopted in Arab mathematics as the "method of the sieve" or "method of the net" following the 12th century.[17] It reached Europe in the late medieval period, with the earliest printed account in the Italian Treviso Arithmetic of 1478; the name gelosia comes from the lattice-patterned window screens of the same name, associated with jealousy because they concealed a house's occupants from onlookers.[17] The technique is also featured in Vedic mathematics, a system drawing from ancient Indian traditions to simplify computations.[19]
To construct the grid, align the multiplicand's digits across the top of a rectangular array and the multiplier's digits along the left side, creating a cell for each digit pair, with each cell containing a diagonal line from upper right to lower left.[17] Starting with the highest digit of the multiplier at the top of the left column and proceeding downward, multiply the digits for each cell, placing the units digit of the product below the diagonal and the tens digit above; this splitting inherently accounts for place values based on the row and column positions.[17] The lattice serves as a precursor to the grid method by incorporating these diagonal splits for carries, unlike simpler grids that lack such built-in place value separation.[17]
Summing occurs along the diagonals parallel to the cell dividers, starting from the bottom-right corner and moving upward-left, where each diagonal may span multiple cells depending on the numbers' lengths; add the digits along each diagonal, carrying over any sum of 10 or more to the next diagonal.[17] The product is then read from the diagonal sums: the bottom-right diagonal supplies the units digit and each diagonal further up and to the left supplies the next higher place, with any leading zero in the topmost diagonal simply dropped.[17]
For instance, to compute 89 × 92, draw a 2×2 grid with 8 and 9 across the top, and 9 (tens digit) at the top of the left column and 2 (units digit) at the bottom. In the top-left cell, 8 × 9 = 72, so write 2 below the diagonal and 7 above. In the top-right cell, 9 × 9 = 81, so write 1 below and 8 above. In the bottom-left cell, 8 × 2 = 16, so write 6 below and 1 above. In the bottom-right cell, 9 × 2 = 18, so write 8 below and 1 above.[17]
The diagonal sums are: units place (bottom-right below): 8. Tens place (bottom-right above + bottom-left below + top-right below): 1 + 6 + 1 = 8. Hundreds place (bottom-left above + top-left below + top-right above): 1 + 2 + 8 = 11 (write 1, carry 1). Thousands place (top-left above + carry): 7 + 1 = 8. Thus, the product is 8188.[17]
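A Python sketch of the lattice procedure, splitting each cell product into tens and units halves and then summing the diagonals with carries; the function name lattice_multiply is illustrative.

```python
def lattice_multiply(x: int, y: int) -> int:
    """Lattice (gelosia) multiplication: fill a grid of split partial products,
    then sum along the diagonals with carries."""
    xd = [int(d) for d in str(x)]   # multiplicand digits across the top
    yd = [int(d) for d in str(y)]   # multiplier digits down the side
    # diag[k] collects all half-cells on the k-th diagonal (k = 0 is the units diagonal).
    diag = [0] * (len(xd) + len(yd))
    for r, dy in enumerate(yd):
        for c, dx in enumerate(xd):
            product = dx * dy
            # diagonal index: columns to the right plus rows below this cell
            k = (len(xd) - 1 - c) + (len(yd) - 1 - r)
            diag[k] += product % 10        # units half, below the cell's diagonal
            diag[k + 1] += product // 10   # tens half, above the cell's diagonal
    # sweep the diagonals from the units upward, carrying as in the manual method
    result, carry = [], 0
    for total in diag:
        total += carry
        result.append(total % 10)
        carry = total // 10
    while carry:
        result.append(carry % 10)
        carry //= 10
    return int("".join(map(str, reversed(result))))

assert lattice_multiply(89, 92) == 8188
```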
Russian Peasant Multiplication
The Russian peasant multiplication, also known as the Egyptian or Ethiopian multiplication method, is an ancient technique for performing multiplication through repeated doubling and halving, without relying on multiplication tables or decimal place values.[20] This approach was documented in the Rhind Mathematical Papyrus from ancient Egypt around 1650 BCE and later observed among Russian peasants in the 19th century, where it gained its modern name due to its simplicity for those without formal arithmetic education.[21] It proved particularly efficient in cultures lacking widespread use of Hindu-Arabic numerals, allowing calculations with basic operations like addition and division by 2.[22]
The algorithm proceeds as follows: Select one number as the multiplier (to be halved) and the other as the multiplicand (to be doubled). Create two columns: in the first, repeatedly halve the multiplier, discarding any remainder (i.e., use integer division) until reaching zero; in the second, simultaneously double the multiplicand for each step. For each row where the halved value is odd, mark or note the corresponding doubled value. Continue until the halved value is zero, then sum the marked doubled values to obtain the product.[20] This process requires only the ability to double, halve integers, and add, making it accessible for manual computation.[22]
Mathematically, the method exploits the binary representation of the multiplier: each odd step corresponds to a '1' bit in the binary expansion, effectively summing the multiplicand shifted by powers of 2 (i.e., multiplied by 2^k for each set bit position k).[22] This equivalence to binary decomposition ensures the algorithm's correctness, as multiplication by an integer n is identical to summing the multiplicand multiplied by each power of 2 where n's binary digit is 1.[21]
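The halving-and-doubling procedure is straightforward to express in code; a minimal Python sketch follows, with the function name peasant_multiply chosen for illustration.

```python
def peasant_multiply(a: int, b: int) -> int:
    """Russian peasant multiplication: halve a, double b, and add b whenever a is odd."""
    product = 0
    while a > 0:
        if a % 2 == 1:        # odd halved value: this row's doubled value contributes
            product += b
        a //= 2               # halve, discarding any remainder
        b *= 2                # double
    return product

assert peasant_multiply(18, 13) == 234
assert peasant_multiply(7, 25) == 175
```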
To illustrate, consider the multiplication of 13 by 18, treating 18 as the multiplier to halve:
| Halved (18) | Doubled (13) | Mark if Odd |
|---|---|---|
| 18 (even) | 13 | |
| 9 (odd) | 26 | ✓ |
| 4 (even) | 52 | |
| 2 (even) | 104 | |
| 1 (odd) | 208 | ✓ |
| 0 | 416 | |
Sum the marked values: 26 + 208 = 234, which is 13 × 18.[20]
Another example is 25 × 7, halving 7:
| Halved (7) | Doubled (25) | Mark if Odd |
|---|---|---|
| 7 (odd) | 25 | ✓ |
| 3 (odd) | 50 | ✓ |
| 1 (odd) | 100 | ✓ |
| 0 | 200 | |
Sum: 25 + 50 + 100 = 175, confirming 25 × 7 = 175.[22]
This technique bears a resemblance to the binary multiplication process employed in early digital computers, where shifts and additions mimic the doubling and selective summing.[21]
Quarter Square Multiplication
The quarter square multiplication method computes the product ab using the algebraic identity
ab = \frac{(a + b)^2 - (a - b)^2}{4},
which can be expressed in terms of quarter squares q(n) = n^2 / 4, yielding ab = q(a + b) - q(a - b). This approach transforms multiplication into additions, subtractions, and lookups in a precomputed table of quarter squares, avoiding direct digit-by-digit operations. It relies on the fact that (a + b)^2 - (a - b)^2 = 4ab, allowing the product to emerge from the difference of two scaled squares divided by 4.[23]
To apply the procedure, first compute the sum s = a + b and the difference d = a - b (assuming a \geq b > 0). Look up q(s) and q(d) from a quarter square table, subtract to get q(s) - q(d), and the result is ab. Tables typically list the integer part \lfloor n^2/4 \rfloor for integers n up to a certain limit; because a + b and a - b always have the same parity, the discarded fractional parts cancel, so the difference of two table entries gives the exact product. For multi-digit numbers, the method adapts by applying the identity to pairs of digits or using partial products, though this increases table size and lookup complexity for larger operands.[23][24]
For example, to multiply 23 by 17, set a = 23 and b = 17, so s = 40 and d = 6. Assuming a table where q(40) = 400 and q(6) = 9, the product is 400 - 9 = 391. This method was particularly useful with printed tables, as it required only basic arithmetic beyond the lookups.[23]
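A brief Python sketch of the table-lookup procedure, using a small precomputed table of \lfloor n^2/4 \rfloor values; the table size and names are illustrative.

```python
# Precompute a small quarter-square table q(n) = floor(n^2 / 4), as a printed table would list.
QUARTER_SQUARES = [n * n // 4 for n in range(201)]

def quarter_square_multiply(a: int, b: int) -> int:
    """Multiply via ab = q(a + b) - q(a - b); the floor is harmless because
    a + b and a - b share the same parity, so the discarded fractions cancel."""
    if a < b:
        a, b = b, a
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[a - b]

assert quarter_square_multiply(23, 17) == 391
```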
The origins of quarter square multiplication trace back to ancient mathematics, with the underlying identity appearing in Euclid's Elements (circa 300 BCE), Book II, Proposition 5. Early traces appear in Middle Eastern clay tablets from around 2000 BCE, possibly used for area calculations via half-integer squares. By the 19th century, dedicated quarter square tables proliferated as alternatives to logarithmic tables for multiplication; notable examples include John Leslie's table up to 2,000 in his 1820 The Philosophy of Arithmetic and Samuel Linn Laundy's extensive table up to 100,000 in 1856. These tables often involved collaborative labor, with mathematicians overseeing computations by teams of assistants. The method declined in the 20th century with the advent of electronic calculators and computers, which rendered table lookups obsolete.[23][24]
Computer Multiplication Algorithms
Binary Multiplication in Computers
Binary multiplication in computers adapts the long multiplication algorithm to binary representation, where operands consist solely of bits valued at 0 or 1, eliminating the need for multi-digit partial products beyond simple AND operations. For each bit position in the multiplier, if the bit is 1, the multiplicand is logically ANDed with the multiplier bit (yielding the multiplicand itself) and shifted left by the bit's position—equivalent to multiplication by powers of 2—before being added to an accumulator; if the bit is 0, no addition occurs for that position. This shift-and-add process is implemented in hardware using barrel shifters for efficient left shifts and adder circuits for accumulation, forming the basis of multiplication in central processing units (CPUs).[25]
To illustrate, consider multiplying the 4-bit binary number 1011₂ (decimal 11) by 1101₂ (decimal 13). The multiplier's bits, from least to most significant, are 1, 0, 1, 1. The partial products are generated as follows:
- For bit 0 (value 1): 1011₂
- For bit 1 (value 0, shifted left by 1): 0000₂
- For bit 2 (value 1, shifted left by 2): 101100₂
- For bit 3 (value 1, shifted left by 3): 1011000₂
Summing these yields 1011₂ + 101100₂ + 1011000₂ = 10001111₂ (decimal 143), confirming 11 × 13 = 143.
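In software, the same shift-and-add process can be sketched in a few lines of Python; this is an illustration, not how any particular CPU implements it.

```python
def shift_and_add_multiply(multiplicand: int, multiplier: int) -> int:
    """Binary shift-and-add: for each set bit of the multiplier, add the
    multiplicand shifted left by that bit's position to an accumulator."""
    accumulator = 0
    position = 0
    while multiplier:
        if multiplier & 1:                       # bit is 1: this partial product is nonzero
            accumulator += multiplicand << position
        multiplier >>= 1                         # examine the next multiplier bit
        position += 1
    return accumulator

assert shift_and_add_multiply(0b1011, 0b1101) == 143   # 11 x 13
```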
In hardware implementations, parallel processing accelerates this by generating all partial products simultaneously via an array of AND gates, followed by reduction using adder arrays. Array multipliers organize full adders and half adders in a rectangular grid to sum partial products row by row, propagating carries horizontally and vertically for the final product. For faster reduction of multiple partial products, Wallace trees employ a parallel structure of carry-save adders to compress terms logarithmically, reducing the number of adder levels from linear to O(log n) depth. This technique, proposed by C. S. Wallace, minimizes propagation delay in combinational circuits. Array and tree multipliers became standard in later processors with dedicated multiply hardware; early microprocessors such as the Intel 8086 instead executed the MUL instruction as a microcoded shift-and-add sequence within the arithmetic logic unit (ALU).[26][27][28]
The time complexity of binary multiplication via this method is O(n²) bit operations for n-bit operands, as it involves n partial products each requiring O(n) work for generation and summation, rendering it efficient for small fixed-width integers in ALUs but prompting advanced algorithms for larger numbers.[29]
Karatsuba Multiplication
The Karatsuba multiplication algorithm is a divide-and-conquer method for multiplying large integers that reduces the number of required multiplications from the naive four to three, offering asymptotic improvements over the standard quadratic-time approach. Developed by Soviet mathematician Anatoly Karatsuba in 1960 while a student under Andrey Kolmogorov, the algorithm was initially met with skepticism by Kolmogorov, who had recently conjectured that no sub-quadratic multiplication method existed; however, Karatsuba's result disproved this, and the work was published in 1962 in collaboration with Yuri Ofman.[30][31] This breakthrough marked the first asymptotically faster integer multiplication algorithm, influencing subsequent developments in computational complexity.[31]
The algorithm operates by splitting each input number into two parts of roughly equal size, assuming the numbers are represented in base B with n digits, where n is even for simplicity. Let x = x_1 B^{m} + x_0 and y = y_1 B^{m} + y_0, where m = n/2, x_1, x_0 < B^m, and similarly for y. Instead of computing the four products x_1 y_1, x_1 y_0, x_0 y_1, and x_0 y_0 as in long multiplication, the method computes only three:
p_1 = x_1 y_1, \quad p_2 = x_0 y_0, \quad p_3 = (x_1 + x_0)(y_1 + y_0).
The product is then reconstructed as
x y = p_1 B^{2m} + \left[ p_3 - p_1 - p_2 \right] B^m + p_2,
where the middle term p_3 - p_1 - p_2 = x_1 y_0 + x_0 y_1. This avoids one multiplication at the cost of a few additions and subtractions, which are cheaper operations. The process can be applied recursively to the smaller subproducts for larger numbers.[31]
The time complexity of the Karatsuba algorithm is O(n^{\log_2 3}), where \log_2 3 \approx 1.585, significantly better than the O(n^2) complexity of schoolbook multiplication for sufficiently large n. This bound arises from the recurrence T(n) = 3 T(n/2) + O(n), solved using the master theorem, yielding the sub-quadratic growth; the constant factors are higher than the naive method, so it becomes advantageous only for numbers with hundreds of digits or more.[31]
A concrete example illustrates the algorithm with the multiplication 1234 × 5678, splitting at m = 2 decimal digits (so B^m = 100): 1234 = 12 \cdot 100 + 34 and 5678 = 56 \cdot 100 + 78. Compute p_1 = 12 \times 56 = 672, p_2 = 34 \times 78 = 2652, and p_3 = (12 + 34)(56 + 78) = 46 \times 134 = 6164. The middle coefficient is 6164 - 672 - 2652 = 2840, so the product is 672 \cdot 100^2 + 2840 \cdot 100 + 2652 = 6,720,000 + 284,000 + 2,652 = 7,006,652, matching the direct computation. Recursive application extends the method to arbitrarily large numbers by further subdividing the subproducts.[31]
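A compact recursive Python sketch of the method, splitting in base 10; the single-digit base case and the helper arrangement are choices made here for clarity rather than efficiency.

```python
def karatsuba(x: int, y: int) -> int:
    """Karatsuba multiplication of non-negative integers: three recursive
    half-size products instead of four."""
    if x < 10 or y < 10:                 # base case: a single-digit operand
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    B = 10 ** m                          # split point B^m
    x1, x0 = divmod(x, B)
    y1, y0 = divmod(y, B)
    p1 = karatsuba(x1, y1)
    p2 = karatsuba(x0, y0)
    p3 = karatsuba(x1 + x0, y1 + y0)
    # xy = p1*B^2m + (p3 - p1 - p2)*B^m + p2
    return p1 * B * B + (p3 - p1 - p2) * B + p2

assert karatsuba(1234, 5678) == 7006652
```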
Toom–Cook Algorithm
The Toom–Cook algorithm is a family of multiplication algorithms that generalize the Karatsuba method by splitting each operand into k parts rather than two, treating them as polynomials of degree k-1, evaluating at 2k-1 points, performing multiplications at those points, and interpolating back to obtain the product coefficients. This reduces the number of multiplications from k^2 to 2k-1, offering better asymptotic performance for larger k, though with increased additive work in evaluation and interpolation. Andrei Toom introduced the approach in 1963, and Stephen Cook refined it in his 1966 PhD thesis, leading to variants like Toom-3 (k=3), which requires 5 multiplications instead of 9 for three-part splits.[32][3]
For Toom-3, numbers are split into three segments in base B, say x = x_2 B^{2m} + x_1 B^m + x_0 and similarly for y. The polynomials are evaluated at five points (for example 0, 1, -1, 2, and ∞), multiplied pointwise, and interpolated using Lagrange or Newton methods to recover the five coefficients of the degree-4 product polynomial, followed by carry propagation. The time complexity for Toom-k is O(n^{\log_k (2k-1)}); for Toom-3, \log_3 5 \approx 1.465, improving on Karatsuba's 1.585 for moderately large numbers, making it practical in libraries like GMP for operands around 100-1000 digits before switching to FFT-based methods.[3][33]
An example for Toom-3 in base 10: to multiply 123 × 456 (split as 1|2|3 and 4|5|6, so m = 1), form A(t) = t^2 + 2t + 3 and B(t) = 4t^2 + 5t + 6 and evaluate both at the five points 0, 1, -1, 2, and ∞, giving the pointwise products 18, 90, 10, 352, and 4. Interpolation recovers the product polynomial C(t) = 4t^4 + 13t^3 + 28t^2 + 27t + 18, and evaluating at t = 10 with carry propagation yields 40000 + 13000 + 2800 + 270 + 18 = 56088. In practice, much larger limbs are used and the pointwise products are themselves computed recursively.[33]
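As a structural illustration only, the Python sketch below splits each operand into k limbs, evaluates at 2k-1 small integer points, multiplies pointwise, and interpolates exactly with rational arithmetic. The function names, the choice of evaluation points, and the use of a generic exact linear solve are simplifications made here; real libraries use hand-optimized interpolation sequences, and the pointwise products would themselves be computed recursively.

```python
from fractions import Fraction

def split_limbs(n: int, base: int, parts: int) -> list:
    """Split a non-negative integer into `parts` base-`base` limbs, least significant first."""
    limbs = []
    for _ in range(parts):
        limbs.append(n % base)
        n //= base
    return limbs

def solve_exact(A: list, b: list) -> list:
    """Gauss-Jordan elimination over the rationals (the Vandermonde system is nonsingular)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = M[col][col]
        M[col] = [v / inv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def toom_k(x: int, y: int, k: int = 3, base: int = 10**4) -> int:
    """Toom-k sketch: treat x and y as degree-(k-1) polynomials in `base`,
    evaluate at 2k-1 points, multiply pointwise, interpolate, and recombine."""
    xs, ys = split_limbs(x, base, k), split_limbs(y, base, k)
    points = list(range(-(k - 1), k))                # 2k-1 distinct small integers
    pointwise = []
    for t in points:
        xv = sum(c * t**i for i, c in enumerate(xs))
        yv = sum(c * t**i for i, c in enumerate(ys))
        pointwise.append(xv * yv)                    # recursive call in a real implementation
    m = 2 * k - 1
    V = [[Fraction(t)**j for j in range(m)] for t in points]
    coeffs = solve_exact(V, [Fraction(p) for p in pointwise])
    # Recombine coefficients with powers of the base; carries resolve in the integer sum.
    return sum(int(c) * base**i for i, c in enumerate(coeffs))

assert toom_k(123, 456, k=3, base=10) == 56088
assert toom_k(123456789012, 987654321098) == 123456789012 * 987654321098
```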
Schönhage–Strassen Algorithm
The Schönhage–Strassen algorithm is a fast multiplication method for very large integers, leveraging Fourier transforms to achieve sub-quadratic time complexity. Developed for numbers with thousands or more digits, it represents a significant advancement over earlier divide-and-conquer approaches by incorporating cyclic convolution techniques. The algorithm is particularly effective for asymptotic performance in computational number theory and cryptographic applications requiring high-precision arithmetic.[34]
Invented by German mathematicians Arnold Schönhage and Volker Strassen, the algorithm was first described in their 1971 paper, marking a breakthrough in integer multiplication by integrating fast Fourier transform (FFT) principles into exact arithmetic. It has been implemented in major libraries such as the GNU Multiple Precision Arithmetic Library (GMP), where it serves as the default method for multiplying integers above a platform-dependent size threshold.[34]
At its core, the algorithm treats the input integers as polynomials and computes their product via cyclic convolution using a number-theoretic transform (NTT), a discrete Fourier transform variant over finite fields to ensure exact integer results without floating-point errors. To handle large numbers, the inputs are split into smaller blocks, represented as coefficients in a polynomial modulo a carefully chosen prime or a ring like \mathbb{Z}/(2^{2^k} + 1)\mathbb{Z}, where the modulus supports primitive roots of unity for efficient transforms. The convolution is performed recursively, with the transform size doubling at each level to manage carries. Final reconstruction uses the Chinese Remainder Theorem (CRT) to combine results from multiple modular computations, resolving any overflow and yielding the exact product. This NTT-based approach avoids the precision issues of standard FFT while maintaining fast convolution.[34][35]
The time complexity of the Schönhage–Strassen algorithm is O(n \log n \log \log n) bit operations for multiplying two n-bit integers, which is asymptotically superior to the O(n^{1.585}) of the Karatsuba algorithm and the higher-degree polynomials of Toom–Cook methods for sufficiently large n. This near-linear scaling arises from the recursive application of FFT-like transforms, where the \log \log n factor accounts for the depth of recursion in handling the transform lengths. For practical sizes, such as n > 2^{20}, it outperforms simpler methods, though constant factors make it less efficient for smaller inputs.[34]
A high-level example illustrates the process for multiplying two 2^k-bit numbers a and b. First, partition a and b into m = 2^{k/2} blocks of 2^{k/2} bits each, treating them as polynomials A(x) and B(x) of degree less than m. Compute the cyclic convolution C(x) = A(x) B(x) \mod (x^m - 1) using NTT: transform A and B to point-value representations via an NTT of length 2m (to avoid wrap-around issues), pointwise multiply, and inverse NTT to recover C. Handle carries by adding the high part of C shifted by m to the low part. Repeat this recursively for the NTT steps until the base case of small multiplications, then apply CRT across multiple such computations if using composite moduli for larger ranges. The result is the full product without intermediate rounding errors.[34]
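The transform/pointwise-multiply/inverse-transform structure can be illustrated with a simplified Python sketch that uses a single NTT-friendly prime modulus rather than the Fermat-number rings of the actual algorithm; the prime 998244353, the limb base, and the function names are choices made here for illustration, not part of the algorithm as published.

```python
P = 998244353          # NTT-friendly prime: P - 1 = 119 * 2^23
G = 3                  # primitive root modulo P

def ntt(a: list, invert: bool = False) -> list:
    """Iterative radix-2 number-theoretic transform; len(a) must be a power of two."""
    n = len(a)
    j = 0
    for i in range(1, n):                      # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(G, (P - 1) // length, P)   # primitive length-th root of unity
        if invert:
            w_len = pow(w_len, P - 2, P)       # its inverse for the inverse transform
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + length // 2):
                u = a[k]
                v = a[k + length // 2] * w % P
                a[k] = (u + v) % P
                a[k + length // 2] = (u - v) % P
                w = w * w_len % P
        length <<= 1
    if invert:
        n_inv = pow(n, P - 2, P)
        for i in range(n):
            a[i] = a[i] * n_inv % P
    return a

def fft_multiply(x: int, y: int, base: int = 10**4) -> int:
    """Multiply non-negative integers by NTT-based convolution of their base-10^4 limbs.
    Every convolution coefficient must stay below P; for larger operands one would use
    a smaller limb base or several primes combined via the Chinese remainder theorem."""
    def limbs(n):
        d = []
        while n:
            d.append(n % base)
            n //= base
        return d or [0]
    a, b = limbs(x), limbs(y)
    size = 1
    while size < len(a) + len(b):
        size <<= 1
    fa = ntt(a + [0] * (size - len(a)))
    fb = ntt(b + [0] * (size - len(b)))
    fc = [u * v % P for u, v in zip(fa, fb)]
    c = ntt(fc, invert=True)
    result, carry = 0, 0                       # carry propagation back to an integer
    for i, coeff in enumerate(c):
        total = coeff + carry
        result += (total % base) * base**i
        carry = total // base
    return result + carry * base**size

assert fft_multiply(123456789, 987654321) == 123456789 * 987654321
```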
Recent Improvements and Complexity Bounds
Subsequent to the Schönhage–Strassen algorithm, further refinements have pushed the theoretical limits of integer multiplication closer to the conjectured optimal O(n \log n) bound. In 2007, Martin Fürer developed an algorithm achieving O(n \log n \cdot 2^{O(\log^* n)}), where \log^* n is the iterated logarithm, slightly improving the extra factor over Schönhage–Strassen by using a more sophisticated choice of rings and transform lengths in the recursive structure.[3]
A major breakthrough came in 2019 with the algorithm by David Harvey and Joris van der Hoeven, which attains O(n \log n) time, matching the conjectured optimal bound for multiplication in standard computational models. This result relies on advanced techniques in analytic number theory and careful control of recursion depths to eliminate the \log \log n factor, though the large hidden constants make it impractical for current hardware. As of 2025, it remains primarily theoretical, with no widespread implementation in libraries like GMP, which continue to rely on Schönhage–Strassen for very large integers due to better practical constants. These advancements underscore the ongoing pursuit of optimal computational bounds, influencing research in algebraic complexity and high-performance computing.[3]
Specialized Multiplication Techniques
Complex Number Multiplication
The multiplication of two complex numbers z_1 = a + bi and z_2 = c + di, where a, b, c, d \in \mathbb{R} and i = \sqrt{-1}, follows the algebraic formula
(a + bi)(c + di) = (ac - bd) + (ad + bc)i.
This straightforward approach computes the real part ac - bd and imaginary part ad + bc using four real multiplications (ac, bd, ad, bc), plus additions and a subtraction.[36]
To optimize, Carl Friedrich Gauss devised a method in the early 19th century that reduces the multiplications to three while increasing additions and subtractions. Define
p = ac, \quad q = bd, \quad r = (a + b)(c + d).
The real part is then p - q, and the imaginary part is r - p - q. This saves 25% on multiplications compared to the direct method, making it advantageous for manual or low-level computational settings, and it draws inspiration from quarter square identities.[36]
For high-precision complex arithmetic, the Gauss technique extends recursively in a divide-and-conquer manner akin to the Karatsuba algorithm. Here, the real and imaginary coefficients a, b, c, d are treated as large integers and split into halves (e.g., a = a_1 2^k + a_0), reducing the overall multiplication to three subproblems of half the size each, yielding a time complexity of O(n^{1.585}) for n-bit coefficients instead of O(n^2). This recursive splitting applies the three-multiplication formula at each level, enabling efficient computation in arbitrary-precision libraries.[36]
As an illustration, multiply 3 + 4i by 1 - 2i. The direct formula gives real part 3 \cdot 1 - 4 \cdot (-2) = 11 and imaginary part 3 \cdot (-2) + 4 \cdot 1 = -2, yielding 11 - 2i. Using Gauss: p = 3 \cdot 1 = 3, q = 4 \cdot (-2) = -8, r = (3 + 4)(1 - 2) = -7; real part 3 - (-8) = 11, imaginary part -7 - 3 - (-8) = -2, confirming the result. Complex multiplication underpins core arithmetic in signal processing, such as phase rotations and convolutions in the Fast Fourier Transform.[36][37]
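A minimal Python sketch of the three-multiplication formula; the function name gauss_complex_multiply is illustrative.

```python
def gauss_complex_multiply(a, b, c, d):
    """Compute (a + bi)(c + di) with three real multiplications instead of four."""
    p = a * c
    q = b * d
    r = (a + b) * (c + d)
    return (p - q, r - p - q)     # (real part, imaginary part)

assert gauss_complex_multiply(3, 4, 1, -2) == (11, -2)
```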
Polynomial Multiplication
Polynomial multiplication involves computing the product of two polynomials, typically represented by their coefficient sequences, resulting in a new polynomial whose coefficients are determined by convolving the input sequences. This operation is fundamental in computer algebra and symbolic computation, with algorithms ranging from straightforward direct methods to asymptotically efficient techniques based on transforms. The choice of algorithm depends on factors such as polynomial degree, coefficient ring, and whether the polynomials are dense (most coefficients nonzero) or sparse (few nonzero coefficients).[38]
The schoolbook method, also known as the classical or direct multiplication algorithm, computes the product by treating the polynomials as sequences and performing a double summation over their coefficients. For two polynomials A(x) = \sum_{i=0}^{d_1} a_i x^i and B(x) = \sum_{j=0}^{d_2} b_j x^j, the product C(x) = A(x) B(x) = \sum_{k=0}^{d_1 + d_2} c_k x^k has coefficients c_k = \sum_{i=0}^{k} a_i b_{k-i}, implemented via nested loops that iterate over all pairs of indices. This approach has time complexity O(d_1 d_2), making it quadratic in the degrees for equal-sized inputs. For example, multiplying (x^2 + 2x + 1)(x + 3) yields coefficients computed as c_0 = 1 \cdot 3 = 3, c_1 = (2 \cdot 3 + 1 \cdot 1) = 7, c_2 = (1 \cdot 3 + 2 \cdot 1) = 5, c_3 = 1 \cdot 1 = 1, resulting in x^3 + 5x^2 + 7x + 3. While simple and suitable for small or sparse polynomials, its inefficiency for large dense polynomials motivates faster alternatives.[39]
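A minimal Python sketch of this double summation on coefficient lists stored lowest degree first; the function name poly_multiply is illustrative.

```python
def poly_multiply(a: list, b: list) -> list:
    """Schoolbook polynomial multiplication: c[k] = sum over i of a[i] * b[k - i]."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# (x^2 + 2x + 1)(x + 3), coefficients lowest degree first
assert poly_multiply([1, 2, 1], [3, 1]) == [3, 7, 5, 1]
```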
The Karatsuba algorithm, originally developed for integer multiplication, extends naturally to polynomials over any ring by recursively splitting the inputs into high and low parts. For polynomials of degree less than n, it divides each into parts of degree roughly n/2, computes three products of half-size polynomials instead of four, and combines them with shifts and additions to obtain the full product, achieving a time complexity of O(n^{\log_2 3}) \approx O(n^{1.585}). This divide-and-conquer strategy reduces the number of recursive multiplications at the cost of additional linear-time operations, providing a sub-quadratic improvement over the schoolbook method for moderately large degrees. The approach is particularly effective when implemented with careful base-case handling for small polynomials.[40]
For very large polynomials, fast Fourier transform (FFT)-based methods achieve near-linear time by exploiting the convolution theorem: the coefficient product corresponds to pointwise multiplication in the frequency domain. The algorithm evaluates both polynomials at the 2m-th roots of unity (where m exceeds the output degree), multiplies the resulting point values, and applies an inverse FFT to recover the coefficients. Over fields supporting suitable roots of unity, such as the complex numbers or finite fields via number theoretic transforms, this yields O(d \log d) complexity for degree-d polynomials, assuming a power-of-two padded length. An adaptation of the Schönhage–Strassen algorithm refines this for polynomial rings by using truncated FFTs over rings of integers modulo a Fermat number, maintaining O(d \log d \log \log d) while avoiding coefficient growth issues in exact arithmetic. These methods excel for dense polynomials in computer algebra systems.[41][42]
In practice, systems like Mathematica employ FFT-based multiplication for dense polynomials to handle high-degree computations efficiently, while sparse cases use optimized schoolbook variants or heap-based merging to avoid unnecessary operations on zero coefficients. This overlap with integer multiplication algorithms arises when viewing large integers as polynomials in a base-b representation, allowing shared implementations across domains.[43]