Isotonic regression
Isotonic regression is a non-parametric statistical method that estimates a stepwise-constant, monotone (non-decreasing or non-increasing) function to fit a sequence of observations while minimizing the mean squared error, subject to the order constraint imposed by the predictors.[1] This technique ensures that the fitted values preserve the monotonic relationship between the independent and dependent variables, making it particularly useful when domain knowledge or data patterns suggest such ordering.
The origins of isotonic regression trace back to the mid-1950s, with foundational contributions including Brunk's work on maximum likelihood estimation of monotone parameters, which addressed the problem of finding optimal estimates under ordering restrictions.[2] Concurrently, Ayer et al. introduced a pooling-based approach for constructing empirical distribution functions from interval-censored data, laying the groundwork for the pool-adjacent-violators algorithm (PAVA), the standard computational method for solving isotonic regression problems. PAVA iteratively merges adjacent data points that violate the monotonicity constraint by averaging their values, producing a unique solution to the constrained least-squares optimization in linear time for ordered data.[3]
Isotonic regression has broad applications across statistics and related fields, including order-restricted inference for testing hypotheses under monotonicity assumptions, such as in bioassay experiments or ranking problems.[3] In machine learning, it serves as a calibration tool to adjust classifier output probabilities, transforming scores into reliable estimates while enforcing monotonicity, as seen in recalibrating risk prediction models for clinical decision-making.[1] Extensions include multivariate settings, distributional regression under order constraints, and integration with regularization techniques like LASSO for high-dimensional data.[4][5]
Fundamentals
Definition
Isotonic regression is a statistical technique for estimating a monotone function from a set of observations, where the goal is to find fitted values that preserve a specified order—typically non-decreasing or non-increasing—while minimizing a loss function applied to the residuals. In its classical form, it minimizes the sum of squared errors between the observations and the fitted values subject to these monotonicity constraints, without assuming any parametric form for the underlying relationship. This approach is particularly useful when domain knowledge or data patterns suggest that the response should not decrease (or increase) as the predictor increases.
In contrast to ordinary linear regression, which fits a straight line through the data and allows predictions to violate monotonicity, isotonic regression enforces order preservation directly in the solution space, often resulting in a piecewise constant function that adapts to the data's structure. This non-parametric nature makes it suitable for scenarios where the relationship is known to be monotone but its exact shape is unknown, avoiding the risk of overfitting parametric assumptions. The method can generalize to other convex loss functions beyond squared errors, such as absolute deviations, depending on the robustness requirements.
A basic example involves data points (x_i, y_i) for i = 1, \dots, n with ordered predictors x_1 < x_2 < \dots < x_n. Isotonic regression seeks \hat{y}_i such that \hat{y}_1 \leq \hat{y}_2 \leq \dots \leq \hat{y}_n and \sum_{i=1}^n (y_i - \hat{y}_i)^2 is minimized; the solution typically forms a non-decreasing step function consisting of constant blocks over ranges of the predictors. This preserves the data's order while smoothing inconsistencies.
Isotonic regression is formally defined as the problem of finding estimates \hat{y} = (\hat{y}_1, \dots, \hat{y}_n) for given data points (x_i, y_i)_{i=1}^n, where x_1 \leq x_2 \leq \dots \leq x_n, that minimize the sum of squared errors subject to a monotonicity constraint on the estimates. Specifically, the optimization problem is
\hat{y} = \arg\min_{\hat{y}_1 \leq \hat{y}_2 \leq \dots \leq \hat{y}_n} \sum_{i=1}^n (y_i - \hat{y}_i)^2.
In vector notation, letting \mathbf{y} = (y_1, \dots, y_n)^\top and \mathbf{z} = (z_1, \dots, z_n)^\top, the problem can be expressed as
\hat{\mathbf{y}} = \arg\min_{\mathbf{z} : z_1 \leq \dots \leq z_n} \|\mathbf{y} - \mathbf{z}\|_2^2,
where \|\cdot\|_2^2 denotes the squared Euclidean norm.
The least squares criterion, corresponding to the L_2 loss, serves as the default objective in standard isotonic regression due to its connection to maximum likelihood estimation under Gaussian errors. Extensions to other loss functions, such as the L_1 loss \sum_{i=1}^n |y_i - \hat{y}_i| under the same monotonicity constraints, arise in robust regression settings where median-based estimates are preferred over means.[6]
The optimization problem is convex because the objective function is a convex quadratic and the feasible set \{\mathbf{z} \in \mathbb{R}^n : z_1 \leq \dots \leq z_n\} forms a convex polyhedral cone. This convexity ensures the existence of a global minimum, which is unique for strictly convex losses like the squared error.
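Because the problem is a small convex quadratic program over a polyhedral cone, it can be stated almost verbatim with a generic convex-optimization modeling tool; the sketch below (using cvxpy purely for illustration, with a toy data vector) recovers the non-decreasing least-squares fit:

import cvxpy as cp
import numpy as np

y = np.array([1.0, 4.0, 3.0, 2.0])
z = cp.Variable(y.size)
# minimize ||y - z||_2^2  subject to  z_1 <= z_2 <= ... <= z_n
problem = cp.Problem(cp.Minimize(cp.sum_squares(y - z)),
                     [z[:-1] <= z[1:]])
problem.solve()
print(z.value)   # approximately [1, 3, 3, 3]

In practice, dedicated algorithms such as the pool-adjacent-violators algorithm described below solve the same problem far more efficiently than a general-purpose solver.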
Algorithms
Pool-Adjacent-Violators Algorithm
The Pool-Adjacent-Violators Algorithm (PAVA) is an iterative procedure designed to compute the solution to the univariate isotonic regression problem by enforcing non-decreasing order on the fitted values. Popularized by Barlow et al. in their seminal work on statistical inference under order restrictions, PAVA starts with the observations themselves as initial estimates and progressively merges groups of adjacent points that violate the monotonicity constraint until the sequence is fully non-decreasing. This approach leverages the convexity of the least-squares objective to guarantee convergence to the unique optimum.[7]
The algorithm proceeds as follows: Initialize each observation y_i as its own block with value \hat{y}_i^{(0)} = y_i and weight equal to 1 (or the provided weights w_i). In each iteration, scan the current block values for adjacent pairs where the value of block r exceeds that of block r+1 (i.e., \hat{y}_r > \hat{y}_{r+1}). Upon detecting a violation, merge the two blocks into one by computing the weighted average \hat{y}_{\text{new}} = \frac{\sum w_j y_j}{\sum w_j} over all observations in the merged block, where the weights reflect the size or specified weights of the original blocks. This pooling may propagate backward if the new block value now violates monotonicity with the preceding block, requiring further merges in the same iteration. Repeat the process across all blocks until no violations remain. The final fitted values are the constant values assigned to each position within the resulting blocks.
An outline of unweighted univariate PAVA, written as runnable Python, is given below; it uses a left-to-right scan with backward propagation after each merge:
def pava(y):
    """Unweighted univariate PAVA: non-decreasing least-squares fit."""
    # Each block is a list of the original y values it contains
    # (indices or running sums could be tracked instead).
    blocks = [[v] for v in y]
    while True:
        violation_found = False
        i = 0
        while i < len(blocks) - 1:
            current_avg = sum(blocks[i]) / len(blocks[i])
            next_avg = sum(blocks[i + 1]) / len(blocks[i + 1])
            if current_avg > next_avg:
                # Pool blocks i and i+1
                blocks[i] = blocks[i] + blocks[i + 1]
                del blocks[i + 1]
                violation_found = True
                # Propagate backward: the enlarged block may now violate
                # monotonicity with the preceding block
                while i > 0:
                    prev_avg = sum(blocks[i - 1]) / len(blocks[i - 1])
                    current_avg = sum(blocks[i]) / len(blocks[i])  # recompute after merge
                    if prev_avg <= current_avg:
                        break
                    # Pool blocks i-1 and i
                    blocks[i - 1] = blocks[i - 1] + blocks[i]
                    del blocks[i]
                    i -= 1
                # Do not advance i after a merge; recheck from the current position
            else:
                i += 1
        if not violation_found:
            break
    # Output: fitted values by averaging within the final blocks
    fitted = []
    for block in blocks:
        fitted.extend([sum(block) / len(block)] * len(block))
    return fitted
The backward-propagation step ensures that pooling handles chain reactions, in which one merge triggers further merges with preceding blocks.
In terms of computational complexity, the naive scanning approach in PAVA leads to a worst-case time of O(n^2), as there can be up to n-1 iterations (each reducing the number of blocks by at least one), with each iteration potentially scanning and updating the full sequence of length O(n). However, empirical performance is typically linear O(n) for randomly ordered or well-behaved data, making it highly practical for large n.[8]
To illustrate, consider a small dataset y = [1, 4, 3, 2]. Initialize the blocks as [1], [4], [3], [2] with block averages 1, 4, 3, 2. A violation exists between the second and third blocks (4 > 3), so pool them into [4, 3] with average 3.5 (weight 2), yielding blocks [1], [4, 3], [2] with averages 1, 3.5, 2. A violation now occurs between the second and third blocks (3.5 > 2); pool them into [4, 3, 2] with average (4 + 3 + 2)/3 = 3 (weight 3), giving blocks [1], [4, 3, 2] with averages 1 and 3. No further violations remain (1 ≤ 3), so the fitted values are [1, 3, 3, 3]. This example requires two pooling steps, highlighting how merges can expand blocks iteratively.
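The hand calculation above can be checked with the pava sketch given earlier (the function name comes from that sketch, not from any library):

pava([1, 4, 3, 2])   # returns [1.0, 3.0, 3.0, 3.0]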
Linear-Time Algorithms and Variants
Variants of the pool-adjacent-violators algorithm (PAVA) achieve guaranteed linear time complexity O(n), where n denotes the number of data points, by utilizing stack-based structures to track and merge violator blocks efficiently, thereby circumventing the O(n²) worst-case behavior of unoptimized iterative pooling. These implementations process the data in a single forward pass while maintaining a stack of candidate blocks, popping and averaging as needed to enforce monotonicity upon violation detection, so that each element is pushed and merged only an amortized constant number of times.
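One way to realize this strategy is the following minimal sketch (unweighted, non-decreasing case; the function name pava_stack is illustrative): each value is pushed as a singleton block, and blocks are merged from the top of the stack while their means violate the order.

def pava_stack(y):
    # Each stack entry stores (sum of values, number of values) for one block.
    stack = []
    for v in y:
        s, c = float(v), 1
        # Merge while the block on top of the stack has a larger mean
        # than the newly formed block (top_mean > new_mean).
        while stack and stack[-1][0] * c > s * stack[-1][1]:
            ps, pc = stack.pop()
            s, c = s + ps, c + pc
        stack.append((s, c))
    # Expand each block into its constant fitted value.
    fitted = []
    for s, c in stack:
        fitted.extend([s / c] * c)
    return fitted

Each observation is pushed exactly once and popped at most once, which is what yields the overall O(n) bound.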
An equivalent approach, the taut string algorithm, reformulates isotonic regression as finding the shortest path (taut string) within a tubular constraint around the cumulative sum of observations, yielding the same solution as PAVA and admitting an O(n) implementation through geometric constructions like the greatest convex minorant.[13]
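The construction can be made concrete as follows; this is a direct, unoptimized sketch that computes the fit as the slopes of the greatest convex minorant (the lower convex hull) of the cumulative sum diagram, and it reproduces the PAVA solution in the unweighted case.

def isotonic_via_gcm(y):
    n = len(y)
    # Cumulative sum diagram: points (i, S_i) with S_0 = 0.
    S = [0.0]
    for v in y:
        S.append(S[-1] + v)
    # Greatest convex minorant = lower convex hull of the diagram points.
    hull = []
    for i in range(n + 1):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Drop b if it lies on or above the chord from a to i.
            if (S[b] - S[a]) * (i - a) >= (S[i] - S[a]) * (b - a):
                hull.pop()
            else:
                break
        hull.append(i)
    # The fitted value on each hull segment is that segment's slope.
    fitted = [0.0] * n
    for a, b in zip(hull, hull[1:]):
        slope = (S[b] - S[a]) / (b - a)
        for j in range(a, b):
            fitted[j] = slope
    return fitted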
For weighted isotonic regression, which minimizes the objective \sum_{i=1}^n w_i (y_i - \hat{y}_i)^2 subject to the non-decreasing order constraint \hat{y}_1 \leq \hat{y}_2 \leq \cdots \leq \hat{y}_n with positive weights w_i, pooling operations compute weighted means of adjacent blocks to preserve the isotonic property while accounting for varying observation importance.[14]
Efficient linear-time PAVA implementations are integrated into established software ecosystems, such as the isotone package in R, which supports univariate and multivariate isotone optimization including weighted cases via its gpava function, and Python's scikit-learn library through the IsotonicRegression class, which handles weights in its fit method for seamless integration in machine learning pipelines.[15][16]
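A minimal usage sketch with scikit-learn's IsotonicRegression (the data and weights here are purely illustrative):

import numpy as np
from sklearn.isotonic import IsotonicRegression

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 4.0, 3.0, 2.0])
w = np.array([1.0, 1.0, 2.0, 1.0])                 # optional observation weights

ir = IsotonicRegression(increasing=True, out_of_bounds="clip")
y_fit = ir.fit_transform(x, y, sample_weight=w)    # weighted non-decreasing fit
y_new = ir.predict(np.array([2.5, 5.0]))           # predictions at new points, clipped outside the training range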
Post-2020 advancements emphasize refined O(n) PAVA variants with optimized block execution orders and minimal memory footprints, as exemplified by a 2022 implementation that outperforms prior methods on large-scale datasets by streamlining up-and-down block handling, facilitating applications with n exceeding 10^6 without performance degradation.[17]
Applications
Statistical Estimation
Isotonic regression plays a central role in non-parametric statistical inference for estimating monotone regression functions under order restrictions, where the goal is to fit a non-decreasing (or non-increasing) function to data while minimizing a loss criterion, typically the squared error. Introduced as a maximum likelihood estimator for monotone parameters, this approach ensures the fitted function respects the assumed monotonicity.[18]
Confidence intervals for the isotonic regression estimator \hat{y} can be constructed using asymptotic theory, which leverages the limiting distribution of the estimator under monotonicity constraints, or via bootstrap methods to account for the piecewise constant nature of the fit. Asymptotic approaches provide pointwise intervals with coverage rates approaching the nominal level, particularly for multiple isotonic models where the estimator converges at rate n^{1/3} locally away from knots. Bootstrap procedures, such as the nonparametric bootstrap applied to residuals, offer consistent finite-sample coverage for monotone regression functions, improving upon naive methods by resampling under the order constraint to capture the estimator's variability.[19]
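As an illustration of the residual-bootstrap idea, the following sketch builds a pointwise percentile band around an isotonic fit; it is a simplified scheme for illustration, not a reproduction of any particular published procedure, and the synthetic data and 95% level are arbitrary choices.

import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
y = np.sqrt(x) + rng.normal(scale=1.0, size=x.size)      # monotone signal plus noise

fit = IsotonicRegression().fit_transform(x, y)
resid = y - fit

# Residual bootstrap: resample residuals around the fit and refit
# under the monotonicity constraint.
B = 200
boot = np.empty((B, x.size))
for b in range(B):
    y_star = fit + rng.choice(resid, size=resid.size, replace=True)
    boot[b] = IsotonicRegression().fit_transform(x, y_star)

lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)  # pointwise 95% band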
Hypothesis testing in isotonic regression often employs likelihood ratio tests to assess monotonicity against unrestricted alternatives, where the test statistic compares the likelihood under the monotone constraint to the unconstrained maximum. These tests exhibit chi-bar-squared limiting distributions due to the inequality constraints, enabling assessment of whether imposing monotonicity significantly improves fit. For instance, in trend estimation for time series data, isotonic regression identifies long-term monotone patterns by projecting observations onto a non-decreasing sequence, effectively isolating trends from seasonal fluctuations while preserving order.[20] Similarly, in survival analysis with ordered covariates, such as dose-response studies, isotonic methods estimate hazard ratios under monotonicity, providing constrained partial likelihood estimates that enhance precision for cumulative incidence functions. Recent applications include estimating effective doses in biased coin up-and-down sequential designs for dose-finding.[21][22]
Machine Learning and Optimization
In machine learning, isotonic regression serves as a post-processing technique to calibrate the probability outputs of classifiers, ensuring that predicted probabilities align more closely with true empirical probabilities while preserving monotonicity. This approach is particularly useful for models like support vector machines or random forests, where raw outputs may exhibit poor calibration due to overconfidence or underconfidence. Unlike parametric methods such as Platt scaling, which assume a logistic form, isotonic regression non-parametrically fits a stepwise monotonic function to the scores, reducing calibration errors without introducing bias from model assumptions. For instance, in binary classification tasks, it minimizes the squared error subject to a non-decreasing constraint on the calibration curve, leading to improved reliability in decision-making scenarios like medical diagnostics.[23][1]
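In scikit-learn, isotonic calibration of a classifier is commonly performed with CalibratedClassifierCV; the sketch below (synthetic data, arbitrary model and settings) illustrates the typical wiring:

from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Cross-validated isotonic calibration of the random forest's probability outputs.
clf = CalibratedClassifierCV(RandomForestClassifier(random_state=0),
                             method="isotonic", cv=5)
clf.fit(X_train, y_train)
calibrated_probs = clf.predict_proba(X_test)[:, 1]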
In ranking and recommendation systems, isotonic regression enforces order-preserving constraints on predicted scores, ensuring that higher scores correspond to preferred items in a monotonic manner. This is applied by treating ranking scores as inputs and observed relevance labels as targets, fitting an isotonic model to adjust raw predictions for better alignment with user preferences or item rankings. Such calibration enhances top-N recommendation accuracy by mitigating distortions in score distributions, as demonstrated in query-level learning to rank frameworks where it optimizes pairwise preferences through iterative isotonic updates. In recommender systems, this integration helps maintain consistent ordering, improving metrics like normalized discounted cumulative gain (NDCG) without altering the underlying ranking model's structure.[24]
Isotonic regression also emerges as a subproblem within larger convex optimization frameworks, particularly in monotonic variants of support vector machines (SVMs), where it enforces shape constraints on decision boundaries. In these settings, the optimization decomposes into solving isotonic regressions to project solutions onto monotonic feasible sets, facilitating the incorporation of domain knowledge like increasing credit risk with financial ratios. This makes it suitable for applications requiring interpretable, constraint-satisfying models, such as credit scoring, by embedding isotonic projections into the dual SVM formulation to handle inequality constraints efficiently.[25]
Empirically, calibration studies on UCI benchmarks indicate that isotonic regression outperforms parametric alternatives such as Platt scaling when sufficient calibration data are available, but its piecewise constant fits are prone to overfitting on small datasets, so careful validation is needed to avoid stepwise artifacts. Extensions, such as ensemble near-isotonic regressions from 2016, further bolster performance on UCI tasks by averaging multiple fits, achieving lower Brier scores compared to standard isotonic approaches in binary classification scenarios.[26][27] Recent work as of 2024 has explored isotonic regression for variance estimation in model validation.[10]
Variants and Extensions
Centered Isotonic Regression
Centered isotonic regression (CIR) is a refinement of standard isotonic regression designed to produce strictly monotone estimates while preserving the total sum (or mean) of the observations. Formally, given observations y_1, \dots, y_n at ordered points x_1 < \dots < x_n, CIR seeks to minimize
\sum_{i=1}^n (y_i - \hat{y}_i)^2
subject to \hat{y}_1 \leq \hat{y}_2 \leq \dots \leq \hat{y}_n and \sum_{i=1}^n \hat{y}_i = \sum_{i=1}^n y_i, with the additional property of strict inequality in the interior to avoid flat segments.[29] For weighted cases, the centering constraint generalizes to \sum w_i \hat{y}_i = \sum w_i y_i, where w_i are positive weights. This formulation ensures the solution lies on the intersection of the isotonic cone and the affine hyperplane defined by the sum constraint, which standard isotonic regression already satisfies but without the strictness enforcement.[29]
The motivation for CIR stems from limitations in standard isotonic regression, where piecewise-constant fits can introduce bias and inefficiency when the true underlying function is strictly increasing, as is common in balanced experimental designs. By enforcing strict monotonicity through centered pooling, CIR reduces estimation bias and variance, particularly in small-sample settings, without requiring additional parametric assumptions. This makes it suitable for scenarios where preserving the sample mean is crucial for unbiased estimation, such as in controlled experiments.[29]
Algorithmically, CIR adapts the pool-adjacent-violators algorithm (PAVA) by incorporating a projection step onto the centering hyperplane during pooling. When adjacent points violate the monotonicity constraint, they are collapsed into a single "shrinkage" point at the sample-size-weighted average of their x and y coordinates: \tilde{x} = \frac{\sum n_j x_j}{\sum n_j} and \tilde{y} = \frac{\sum n_j y_j}{\sum n_j}, where n_j are group sizes. Linear interpolation is then applied between these points to yield the final strictly increasing fit, ensuring the overall mean is preserved throughout. This modification avoids the flat stretches of standard PAVA while maintaining computational efficiency.[30]
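A rough sketch following this description (pool with weighted means, collapse each block to a shrinkage point, then interpolate) is given below; it is illustrative only, not a faithful reproduction of the published CIR algorithm, and the function name is hypothetical.

import numpy as np

def centered_isotonic_sketch(x, y, w=None):
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = np.ones_like(y) if w is None else np.asarray(w, float)
    # Stack of blocks stored as [total weight, weighted x-sum, weighted y-sum].
    blocks = []
    for wi, xi, yi in zip(w, x, y):
        blocks.append([wi, wi * xi, wi * yi])
        # Pool while the previous block's y-mean exceeds the current block's y-mean.
        while len(blocks) > 1 and blocks[-2][2] / blocks[-2][0] > blocks[-1][2] / blocks[-1][0]:
            wb, xb, yb = blocks.pop()
            blocks[-1][0] += wb
            blocks[-1][1] += xb
            blocks[-1][2] += yb
    # Shrinkage points: weight-averaged (x, y) coordinates of each pooled block.
    xs = np.array([b[1] / b[0] for b in blocks])
    ys = np.array([b[2] / b[0] for b in blocks])
    # Linear interpolation between shrinkage points replaces the flat
    # stretches of the standard PAVA fit with increasing segments.
    return np.interp(x, xs, ys)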
In applications, CIR enhances variance reduction in statistical estimation for experimental designs, notably in dose-response and phase I clinical trials, where standard methods may overestimate or underestimate due to non-strict fits. For instance, simulations show CIR reduces the mean squared error by approximately a factor of 2 compared to isotonic regression for logistic response curves with n=20 samples.[29] It was introduced in 2008 by Assaf Oron in the context of up-and-down experimental designs for percentile estimation.[31]
Multivariate Isotonic Regression
Multivariate isotonic regression extends the univariate formulation to settings where data points are partially ordered, typically on a lattice such as \mathbb{R}^d equipped with a componentwise partial order. Under this order, the estimator \hat{y} must satisfy \hat{y}_i \leq \hat{y}_j whenever x_i \preceq x_j componentwise, ensuring non-decreasing behavior in each dimension.[32] This generalization allows for modeling multivariate relationships under order restrictions, with the least squares solution existing and unique for any weights and observations when the partial order is reflexive, antisymmetric, and transitive.[33]
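On a regular grid with the componentwise order, the problem can still be written directly as a convex program; the sketch below (using the generic modeling tool cvxpy purely for illustration, with arbitrary synthetic data) imposes non-decreasing constraints along both grid axes:

import cvxpy as cp
import numpy as np

# Noisy observations on a 4 x 5 grid whose true mean increases in both coordinates.
rng = np.random.default_rng(1)
Y = np.add.outer(np.arange(4.0), np.arange(5.0)) + rng.normal(scale=0.5, size=(4, 5))

Z = cp.Variable(Y.shape)
constraints = [Z[1:, :] >= Z[:-1, :],   # non-decreasing down each column
               Z[:, 1:] >= Z[:, :-1]]   # non-decreasing along each row
cp.Problem(cp.Minimize(cp.sum_squares(Y - Z)), constraints).solve()
fit = Z.value                           # bivariate isotonic least-squares fit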
Constrained variants of isotonic regression incorporate additional shape constraints beyond monotonicity, such as convexity, to capture more nuanced structures like increasing convex functions. For example, the estimator may be required to satisfy both \hat{y}(x \vee z) + \hat{y}(x \wedge z) \geq \hat{y}(x) + \hat{y}(z) for convexity and the partial order for monotonicity, forming a closed convex cone in the parameter space.[34] These combined constraints arise in statistical estimation where domain knowledge imposes multiple properties, enabling adaptive rates that depend on the complexity of the true function, such as the number of rectangular blocks in the isotonic case or the distance from affine subspaces in the convex case.[35]
Algorithms for multivariate isotonic regression often reformulate the problem as a convex optimization task solvable via network flow methods, where the partial order is represented as a directed graph and the least squares objective minimized subject to flow conservation and capacity constraints.[36] Coordinate descent approaches provide an efficient alternative, particularly for high-dimensional settings; these iteratively update estimates for subsets of coordinates while fixing others and enforcing the order through projections like pairwise averaging, converging due to the strict convexity of the objective.[37] Block coordinate descent variants further adapt to joint estimation across multiple monotonic functions, initializing with univariate solutions and penalizing violations via likelihood-based weights.[37]
In high dimensions, exact computation faces significant challenges: although the problem remains a convex program, the number of order constraints and the size of the underlying poset grow rapidly with the dimension, making exact algorithms expensive even though polynomial-time methods exist for specific structures like product orders in low dimensions.[38] Recent approximations in the 2020s address this by incorporating sparsity assumptions or greedy strategies, such as iterative pooling of adjacent violators extended to multiple dimensions, achieving near-optimal risk bounds while reducing computational demands.[39]
Theoretical Properties
Uniqueness and Optimality
In isotonic regression, the problem of minimizing the squared error subject to monotonicity constraints forms a convex optimization setup, with a strictly convex quadratic objective function and a closed convex feasible set defined by the order restrictions. This guarantees the existence of at least one optimal solution.
The strict convexity of the squared error objective further ensures that the minimizer is unique, distinguishing isotonic regression from cases with non-strictly convex losses where multiple solutions may exist. The optimum can be characterized through the subgradient of the objective at the solution, where the zero vector belongs to the sum of the subgradient of the loss and the normal cone to the feasible set, or equivalently via the Karush-Kuhn-Tucker (KKT) conditions that enforce primal and dual feasibility along with complementary slackness for the inequality constraints \hat{y}_i \leq \hat{y}_{i+1}.[40]
The optimal solution \hat{y} represents the orthogonal projection of the observed data vector y onto the closed convex cone of non-decreasing sequences in the Euclidean space, minimizing the Euclidean distance while preserving monotonicity.
A key structural property of this unique solution is its piecewise constant form, consisting of contiguous blocks where the values are equal and represent the average of the corresponding observations within each block, reflecting the resolution of order violations through pooling.
Computational Complexity
The univariate case of isotonic regression admits efficient solutions, with optimized implementations of the Pool-Adjacent-Violators Algorithm (PAVA) achieving O(n) time complexity in both average and worst cases for n observations.[41] This linear-time performance stems from the algorithm's ability to iteratively merge adjacent violators while maintaining monotonicity, making it highly practical for large datasets.[42]
In the multivariate setting, computational demands increase significantly with dimensionality. For general partial orders, isotonic regression can be solved in polynomial time, for example in O(n^2) time using algorithms that generalize the PAVA approach.[43] For fixed dimensions d, dynamic programming methods enable polynomial-time solutions on lattice partial orders, such as product spaces of chains, with time complexity scaling as O(n^{d+1}) for a balanced grid of n points per dimension.[44] However, in high dimensions, the effective complexity grows rapidly due to the size of the poset.
Space requirements mirror these trends: O(n) suffices for univariate problems, as only a single pass over the data is needed to store and update partitions. In multivariate lattice structures, space complexity rises to O(n^d) to accommodate the exponential growth in the number of comparable elements and intermediate states during dynamic programming.
As of 2025, challenges remain in the literature regarding efficient approximation algorithms for high-dimensional isotonic regression, particularly in developing stochastic methods with strong theoretical guarantees in regimes where exact solutions are computationally expensive.