
Rosenbrock function

The Rosenbrock function, also known as the banana function or Rosenbrock's valley, is a non-convex mathematical function widely employed as a benchmark problem to evaluate the performance of optimization algorithms, particularly those handling ill-conditioned and nonlinear landscapes. Introduced by British engineer Howard H. Rosenbrock in his 1960 paper on an automatic method for finding extrema of multivariable functions, it was originally designed to test numerical optimization techniques for industrial applications such as process control. The function's defining characteristic is a long, narrow parabolic valley that curves toward the global minimum, creating challenges for algorithms due to its steep sides, flat bottom, and the need for efficient traversal along the valley floor. The function is unimodal in its standard two-dimensional form but can exhibit multiple local minima in higher dimensions. Its enduring popularity stems from its simplicity in formulation yet profound difficulty in optimization, as the valley's curvature amplifies numerical instability and slow convergence in traditional methods. It has been cited in thousands of studies since 1960, influencing developments in machine learning, hyperparameter tuning, and engineering design problems. Variants, such as rotated or noisy versions, further extend its utility for assessing robustness against real-world perturbations.

Definition and Formulation

Two-Dimensional Form

The two-dimensional Rosenbrock function, a classic benchmark for optimization algorithms, is defined mathematically as f(x, y) = (a - x)^2 + b(y - x^2)^2, where the typical parameters are a = 1 and b = 100. This formulation was introduced by Howard H. Rosenbrock in 1960 as a test problem for numerical optimization methods. The function attains its global minimum at the point (x, y) = (1, 1), where f(1, 1) = 0. Geometrically, it features a narrow, curved "banana-shaped" valley that follows the parabolic curve y = x^2, rendering the problem non-convex overall while exhibiting approximately quadratic behavior near the minimum. This structure poses challenges for algorithms, as the valley is relatively easy to locate but difficult to traverse to the exact minimum due to its steep walls and flat bottom. For visualization and testing purposes, the input domain is commonly restricted to x, y \in [-2.048, 2.048]. This two-dimensional case serves as the foundation for multidimensional extensions, which generalize the form to higher dimensions while preserving similar challenging characteristics.
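For illustration, a minimal Python sketch of this two-dimensional form is shown below; the function name and default arguments are illustrative choices rather than part of the original formulation.

```python
def rosenbrock_2d(x, y, a=1.0, b=100.0):
    """Two-dimensional Rosenbrock function f(x, y) = (a - x)^2 + b*(y - x^2)^2."""
    return (a - x) ** 2 + b * (y - x ** 2) ** 2

# The global minimum lies at (a, a^2); with a = 1 this is (1, 1), where f = 0.
print(rosenbrock_2d(1.0, 1.0))   # 0.0
print(rosenbrock_2d(-1.2, 1.0))  # 24.2 at a commonly used starting point
```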

Multidimensional Extension

The multidimensional Rosenbrock function generalizes the two-dimensional case to n \geq 2 dimensions, serving as a benchmark for testing optimization algorithms in higher-dimensional spaces. It is defined by the equation f(\mathbf{x}) = \sum_{i=1}^{n-1} \left[ 100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right], \quad \mathbf{x} = (x_1, \dots, x_n)^\top \in \mathbb{R}^n. The function attains its global minimum value of f(\mathbf{x}^*) = 0 at \mathbf{x}^* = (1, 1, \dots, 1)^\top. The domain is often restricted to x_i \in [-2.048, 2.048] for all i = 1, \dots, n. This formulation uses the parameters a = 1 and b = 100 of the more general form f(\mathbf{x}) = \sum_{i=1}^{n-1} \left[ b (x_{i+1} - x_i^2)^2 + (a - x_i)^2 \right]. These values emphasize the penalty on deviations from the ideal curve x_{i+1} = x_i^2 while also penalizing distance from x_i = a, and they are standard in the optimization literature. The chained structure scales effectively to high dimensions, with implementations demonstrating feasibility up to n = 1000 or more, allowing evaluation of algorithm performance in increasingly complex search spaces. The sequential quadratic terms create a narrow parabolic valley along the direction where x_i \approx 1 for all i, posing persistent challenges due to the elongated, curved nature of the landscape.
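A vectorized NumPy sketch of the chained sum, assuming the standard parameters a = 1 and b = 100, is given below; SciPy also provides a reference implementation as scipy.optimize.rosen.

```python
import numpy as np

def rosenbrock(x, a=1.0, b=100.0):
    """Chained n-dimensional Rosenbrock function for a vector x with n >= 2 entries."""
    x = np.asarray(x, dtype=float)
    return np.sum(b * (x[1:] - x[:-1] ** 2) ** 2 + (a - x[:-1]) ** 2)

# The global minimum f = 0 sits at x = (1, ..., 1) in every dimension.
for n in (2, 10, 1000):
    print(n, rosenbrock(np.ones(n)))  # prints 0.0 for each n
```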

Historical Context

Introduction by Rosenbrock

The Rosenbrock function was introduced by Howard H. Rosenbrock, a British control engineer, in his 1960 paper titled "An Automatic Method for Finding the Greatest or Least Value of a Function," published in The Computer Journal. Rosenbrock, born in 1920, earned a B.Sc. in electrical engineering from University College London in 1941. After war service in the RAF, he joined industry in 1947, working at GEC Research Laboratories (1947–1948), the Electrical Research Association (1949–1951), John Brown & Co. Ltd. (1951–1954), and Costain John Brown Ltd. (1954–1962), where he served as research manager from 1957. The function was introduced while Rosenbrock was research manager at Costain John Brown Ltd., as a test problem within his proposed direct search optimization method, which aimed to automatically locate extrema of functions without relying on derivatives. This approach was motivated by the limitations of existing gradient-based algorithms, which often struggled with ill-conditioned problems exhibiting narrow valleys, leading to slow convergence and inefficient searches. The function served as a synthetic yet representative test case to evaluate the robustness of optimization techniques in scenarios akin to real-world tasks, where precise extremum finding was critical for operational efficiency. Over subsequent decades, it evolved into a widely adopted standard for testing optimization algorithms beyond its original context.

Evolution as a Test Function

Following its introduction in 1960, the Rosenbrock function rapidly gained traction in the 1970s as a standard benchmark for assessing both global and local optimization algorithms, appearing in early computational implementations and comparative studies. For instance, it was featured in the ACM's Algorithm 450, a subroutine for its minimization published in 1973, which facilitated its integration into optimization software testing. By the mid-1970s, researchers routinely employed it to evaluate multivariable search strategies, as evidenced in analyses of pattern search and other direct methods. In the 1980s, the function was formalized in influential test suites, including the comprehensive collection by Moré, Garbow, and Hillstrom (1981), which provided standardized problems for rigorously testing unconstrained optimization codes. This inclusion elevated its status, ensuring consistent use across academic and applied optimization research to measure algorithm reliability and efficiency on challenging landscapes. The Rosenbrock function's persistence as a test problem derives from its exemplification of core optimization difficulties, particularly severe ill-conditioning along a narrow parabolic valley, non-convex characteristics in extended dimensions, and the consequent sluggish convergence of gradient-based approaches. These traits allow it to stress-test algorithms' ability to escape deceptive local structures while maintaining computational tractability, making it a staple in comparisons even decades later. Notable milestones mark its evolving role, such as its application in late 1980s and 1990s evaluations of quasi-Newton methods, where it underscored the advantages of Hessian approximations like BFGS in handling ill-conditioned valleys compared to steepest descent. Into the 1990s and 2000s, it became central to assessing evolutionary algorithms, notably in studies contrasting self-adaptation in evolution strategies and real-parameter genetic algorithms on non-separable, curved functions.

Mathematical Properties

Stationary Points

The stationary points of the Rosenbrock function are determined by setting its partial derivatives to zero. In the two-dimensional form, f(x, y) = (1 - x)^2 + 100(y - x^2)^2, the first-order conditions are \frac{\partial f}{\partial x} = -2(1 - x) - 400x(y - x^2) = 0 and \frac{\partial f}{\partial y} = 200(y - x^2) = 0. The second equation implies y = x^2, which substitutes into the first to yield -2(1 - x) = 0, so x = 1 and y = 1. This unique stationary point at (1, 1) is the global minimum, where f(1, 1) = 0. For the multidimensional extension, f(\mathbf{x}) = \sum_{i=1}^{n-1} \left[100(x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right], the stationary points satisfy the analogous system of n equations derived from the chained structure. The global minimum remains at \mathbf{x} = (1, 1, \dots, 1), with f = 0. However, increasing dimensionality introduces additional critical points beyond this minimum. Specifically, there is one local minimum near (x_1, x_2, \dots, x_n) \approx (-1, 1, \dots, 1), along with a growing number of saddle points; for n = 30, numerical analysis identifies 108 stationary points in total, comprising the two minima and 106 saddles, with the count of saddles rising exponentially as n increases up to 100. Evaluating the Hessian matrix classifies these points: it is positive definite at the two minima, verifying strict local minimality there, whereas at every saddle point it possesses at least one negative eigenvalue, confirming their non-minimizing nature.
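The two-dimensional stationary-point computation can be reproduced symbolically; the short SymPy sketch below is illustrative and simply re-derives the result stated above.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

# Solve grad f = 0: the only solution is (x, y) = (1, 1).
stationary = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(stationary)  # [{x: 1, y: 1}]

# The Hessian at (1, 1) is positive definite, confirming a strict minimum.
H = sp.hessian(f, (x, y)).subs({x: 1, y: 1})
print(H, H.is_positive_definite)  # Matrix([[802, -400], [-400, 200]]) True
```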

Gradient and Hessian Analysis

The gradient of the two-dimensional Rosenbrock function f(x, y) = 100(y - x^2)^2 + (1 - x)^2 is given by \nabla f(x, y) = \begin{pmatrix} 400x^3 - 400xy + 2x - 2 \\ 200(y - x^2) \end{pmatrix}. This vector points in the direction of steepest ascent and is derived by computing the partial derivatives: \frac{\partial f}{\partial x} = -2(1 - x) - 400x(y - x^2) and \frac{\partial f}{\partial y} = 200(y - x^2). For the multidimensional extension f(\mathbf{x}) = \sum_{i=1}^{n-1} \left[ 100(x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right], the gradient components are \frac{\partial f}{\partial x_1} = -2(1 - x_1) - 400 x_1 (x_2 - x_1^2), \frac{\partial f}{\partial x_j} = -2(1 - x_j) + 200 (x_j - x_{j-1}^2) - 400 x_j (x_{j+1} - x_j^2) for 1 < j < n, and \frac{\partial f}{\partial x_n} = 200 (x_n - x_{n-1}^2). These are obtained via the chain rule applied to the summation terms. The Hessian matrix captures the second-order derivatives, providing insight into local curvature. For the two-dimensional case, it is H = \begin{pmatrix} 1200x^2 - 400y + 2 & -400x \\ -400x & 200 \end{pmatrix}. At the global minimum (x, y) = (1, 1), the eigenvalues of this Hessian are approximately 1001.6 and 0.4, both positive, confirming positive definiteness and a local minimum. Away from the minimum, the eigenvalues of the Hessian vary significantly, with some regions exhibiting negative eigenvalues that indicate indefinite matrices and saddle-like local behavior, contributing to the function's non-convex nature. In higher dimensions, the Hessian is tridiagonal and becomes increasingly ill-conditioned at the minimum, with the condition number (ratio of largest to smallest eigenvalue) rising from about 2500 in 2D to higher values before asymptoting as the dimension n grows, explaining the slow convergence of optimization algorithms along the narrow valley. The level sets of the Rosenbrock function form a characteristic "banana-shaped" valley curving toward the minimum at (1, \dots, 1), with steep walls perpendicular to the valley floor reflecting high curvature in those directions and relatively flat curvature along the valley, which amplifies the challenges posed by the ill-conditioned Hessian.
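These expressions can be checked numerically; the NumPy sketch below evaluates the two-dimensional gradient and Hessian at the minimum and reports the eigenvalues and condition number quoted above (helper names are illustrative).

```python
import numpy as np

def grad_2d(x, y):
    """Gradient of f(x, y) = 100*(y - x^2)^2 + (1 - x)^2."""
    return np.array([-2 * (1 - x) - 400 * x * (y - x ** 2),
                     200 * (y - x ** 2)])

def hess_2d(x, y):
    """Hessian of the two-dimensional Rosenbrock function."""
    return np.array([[1200 * x ** 2 - 400 * y + 2, -400 * x],
                     [-400 * x, 200.0]])

print(grad_2d(1.0, 1.0))                     # [0. 0.] -- (1, 1) is stationary
eigvals = np.linalg.eigvalsh(hess_2d(1.0, 1.0))
print(eigvals)                               # approximately [0.4, 1001.6]
print(eigvals[-1] / eigvals[0])              # condition number, roughly 2.5e3
```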

Applications in Optimization

Challenges and Characteristics

The Rosenbrock function is non-convex, featuring a single global minimum at (1, 1) in the two-dimensional case, yet its local structure is highly misleading due to a pronounced "banana valley" that curves gently toward the minimum. This valley, whose floor follows an approximately parabolic path of small gradient magnitude, creates elongated level sets that guide optimizers into a deceptive trough, where progress toward the true minimum becomes arduous despite the absence of multiple local minima. Such characteristics make it an ideal test for assessing an algorithm's ability to navigate non-convex landscapes without being sidetracked by apparent flatness. A primary challenge stems from the ill-conditioned Hessian matrix, whose condition number at the global minimum is approximately 2500 in two dimensions and grows with dimension before approaching an asymptote, resulting in highly anisotropic curvature and elongated contours that hinder efficient convergence, particularly in methods reliant on coordinate-wise updates. This ill-conditioning causes slow progress along the valley's length, as small changes in one direction dominate the function's behavior, amplifying numerical instability and requiring algorithms to balance exploration across correlated variables. The function's sensitivity to initial conditions exacerbates this: starting points outside the valley lead to erratic trajectories, while those within it but distant from the minimum trap solvers in prolonged, shallow descent phases. In higher dimensions (n > 10), scalability issues intensify, as the valley narrows progressively and the condition number increases toward its asymptote, demanding greater computational effort to maintain directional accuracy amid growing variable interdependencies. Although the function approximates a quadratic bowl near the minimum, facilitating local convergence once approached, its global structure contrasts starkly with purely quadratic benchmarks, underscoring the need for robust handling of curved, ill-conditioned geometries in optimization testing. While the two-dimensional version is unimodal, higher-dimensional extensions can exhibit local minima, further challenging global optimization methods.
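The dependence of the Hessian's condition number on dimension can be explored directly; the sketch below assembles the tridiagonal Hessian of the chained form at the all-ones minimum (its entries, 802, 1002, and 200 on the diagonal and -400 off it, follow from differentiating the gradient components given earlier) and prints the condition number for several dimensions.

```python
import numpy as np

def hessian_at_minimum(n):
    """Tridiagonal Hessian of the chained Rosenbrock function at x = (1, ..., 1)."""
    H = np.zeros((n, n))
    diag = np.full(n, 1002.0)
    diag[0], diag[-1] = 802.0, 200.0
    np.fill_diagonal(H, diag)
    idx = np.arange(n - 1)
    H[idx, idx + 1] = -400.0
    H[idx + 1, idx] = -400.0
    return H

# The condition number grows with n from roughly 2.5e3 at n = 2 before leveling off.
for n in (2, 5, 10, 50, 100):
    print(n, np.linalg.cond(hessian_at_minimum(n)))
```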

Algorithm Testing Examples

The Rosenbrock function serves as a challenging benchmark for evaluating the performance of gradient-based optimization algorithms, particularly highlighting issues like slow convergence in narrow valleys. In the two-dimensional case, starting from the point (-1.2, 1), standard gradient descent exhibits pronounced zigzagging behavior due to the function's ill-conditioned curvature, often requiring hundreds of iterations to approach the global minimum at (1, 1) with an accuracy of \epsilon = 10^{-6}. This slow progress underscores the algorithm's sensitivity to the valley shape, where steps oscillate perpendicular to the direction of the minimum, leading to inefficient exploration. Quasi-Newton methods, such as BFGS, address these limitations by approximating the inverse Hessian to better handle ill-conditioning, achieving convergence in far fewer iterations on the two-dimensional Rosenbrock function. For instance, BFGS typically converges to \epsilon = 10^{-6} in around 25 iterations from similar starting points, compared to the hundreds needed by gradient descent. This improvement stems from the method's ability to incorporate curvature information without full Hessian computation, making it more robust for problems like the Rosenbrock valley. Performance remains under 50 iterations even with limited-memory variants, establishing BFGS as a standard reference for local optimization testing on this benchmark. Evolutionary algorithms like differential evolution (DE) excel in global search on the Rosenbrock function, particularly for high-dimensional variants that may exhibit local minima, by maintaining a population of candidate solutions that explores the curved valley effectively. In a seminal 1997 study, DE with population size 10, crossover rate 0.9, and scaling factor 0.9 successfully reached the minimum to \epsilon = 10^{-6} in all 20 runs on the two-dimensional case, averaging 654 function evaluations. These 1990s benchmarks demonstrated DE's effectiveness over other heuristics of the era, with consistent performance scaling to higher dimensions through parameter tuning. In modern contexts, such as testing machine learning optimizers, the Rosenbrock function evaluates adaptive methods like Adam on higher-dimensional instances (e.g., n = 10), where the valley structure amplifies challenges in gradient scaling. Adam outperforms basic gradient descent by adaptively adjusting per-parameter learning rates, achieving faster convergence and greater stability with fewer function evaluations to reach low error thresholds like \epsilon = 10^{-6}, though exact counts vary with hyperparameters (typically hundreds rather than thousands of iterations). This makes Adam well suited to ill-conditioned problems of the kind encountered in neural network training, as evidenced by benchmarks showing reduced oscillations compared to non-adaptive methods. The table below summarizes these representative results, and a runnable comparison sketch follows it.
Algorithm                Dimension   Starting point          Metric to \epsilon = 10^{-6}
Gradient descent         2           (-1.2, 1)               Hundreds of iterations
BFGS                     2           Similar to above        ~25 iterations
Differential evolution   2           Population-based        654 function evaluations (avg.)
Adam                     10          Random initialization   Fewer evaluations than gradient descent (hundreds)
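A brief SciPy sketch along these lines compares a quasi-Newton solve, a derivative-free direct search (used here in place of plain gradient descent, which SciPy does not provide as a standalone method), and a population-based global search on the two-dimensional function; iteration and evaluation counts depend on the SciPy version and solver settings, so the printed numbers are indicative rather than reproductions of the figures in the table.

```python
import numpy as np
from scipy.optimize import minimize, differential_evolution, rosen, rosen_der

x0 = np.array([-1.2, 1.0])  # classic starting point for the 2D Rosenbrock problem

# Quasi-Newton (BFGS) using the analytic gradient supplied by SciPy.
bfgs = minimize(rosen, x0, jac=rosen_der, method="BFGS", tol=1e-6)
print("BFGS:        x* =", bfgs.x, "iterations =", bfgs.nit, "f evals =", bfgs.nfev)

# Derivative-free direct search (Nelder-Mead) for comparison.
nm = minimize(rosen, x0, method="Nelder-Mead", tol=1e-6)
print("Nelder-Mead: x* =", nm.x, "iterations =", nm.nit)

# Population-based global search (differential evolution) on the usual box domain.
de = differential_evolution(rosen, bounds=[(-2.048, 2.048)] * 2, tol=1e-6, seed=0)
print("DE:          x* =", de.x, "f evals =", de.nfev)
```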
