References
- [1] Introduction to Statistical Learning Theory (Columbia CS). The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms.
- [2] CS229T/STAT231: Statistical Learning Theory (Winter 2016). Lecture notes drawing on Bousquet, Boucheron, and Lugosi (2008), Martin Wainwright's lecture notes, and Peter Bartlett's notes.
- [3] The Nature of Statistical Learning Theory (SpringerLink). The aim of this book is to discuss the fundamental ideas which lie behind the statistical theory of learning and generalization.
- [4] The Vapnik-Chervonenkis Dimension (UPenn CIS). The classic paper on the VC dimension, in which the main elements of the proof of Theorem 3.3 are first introduced, is by Vapnik and Chervonenkis [95].
- [5] Statistical Learning Theory (SLT): CS6464 (CSE IITM). Statistical learning theory deals with the problem of finding a predictive function based on data; the goal of learning is prediction.
- [6] 15.097 Lecture 14: Statistical Learning Theory (MIT OpenCourseWare). In statistical learning theory, generally no assumption is made about the target (such as its belonging to some class).
- [7] The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (February 2009). Trevor Hastie, Robert Tibshirani, Jerome Friedman.
- [8] An Overview of Statistical Learning Theory (MIT). Statistical learning theory was introduced in the late 1960s; until the 1990s it was a purely theoretical analysis of the problem of function estimation from data.
- [9] Statistical Learning Theory: Models, Concepts, and Results. Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms.
- [10] The Statistical Theories of Fisher and of Neyman and Pearson. R. A. Fisher developed his Theory of Significance Testing (FST) in the early 1920s, with Neyman and Pearson's theory following in the late 1920s.
- [11] The Fisher, Neyman-Pearson Theories of Testing Hypotheses. The Fisher and Neyman-Pearson approaches to testing statistical hypotheses are compared with respect to their attitudes to the interpretation of test results.
- [12] On Martingale Extensions of Vapnik-Chervonenkis Theory (MIT). Reviews recent advances on uniform martingale laws of large numbers and the associated sequential complexity measures.
- [13] A Theory of the Learnable. Learning is considered as the process of deducing a program for performing a task from information that does not provide an explicit description of it.
- [14] Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. Investigates the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities.
- [15] Stability and Generalization (Journal of Machine Learning Research), O. Bousquet and A. Elisseeff.
- [16] Support-Vector Networks, Corinna Cortes and Vladimir Vapnik (AT&T Labs-Research). The support-vector network is a new learning machine for two-group classification problems.
- [17] Reconciling Modern Machine Learning Practice and the Bias-Variance Trade-off (arXiv:1812.11118v2 [stat.ML], Sep 10, 2019). The double descent risk curve introduced in this paper reconciles the U-shaped curve predicted by the bias-variance trade-off with the behavior observed in modern machine learning models.
- [18] Understanding Machine Learning: From Theory to Algorithms, Shai Shalev-Shwartz and Shai Ben-David (Cambridge University Press, 2014).
- [19] Statistical Learning Theory: Models, Concepts, and Results. Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms.
- [20]
- [21] Statistical Learning Theory: A Tutorial (Princeton University). A tutorial overview of some aspects of statistical learning theory, which also goes by other names such as statistical pattern recognition.
- [22] Realizable Learning is All You Need. The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory, with variants ranging from classical settings such as PAC learning to more recent ones.
- [23]
- [24] Probability Inequalities for Sums of Bounded Random Variables, Wassily Hoeffding (American Statistical Association Journal, March 1963).
- [25] Vladimir N. Vapnik, The Nature of Statistical Learning Theory, Second Edition (Springer).
- [26] Statistical Learning Theory: Models, Concepts, and Results. Statistical learning theory (SLT) is a theoretical branch of machine learning and attempts to lay the mathematical foundations for the field.
- [27] Convexity, Classification, and Risk Bounds. The classification procedures discussed use surrogate loss functions that either are upper bounds on the 0-1 loss or can be transformed into such bounds.
- [28] On the Design of Loss Functions for Classification. This is akin to robust loss functions proposed in the statistics literature to reduce the impact of outliers.
- [29] Relative Deviation Learning Bounds and Generalization with Unbounded Loss Functions (arXiv). Gives detailed proofs of two-sided generalization bounds that hold in the general case of unbounded loss functions.
- [30] On Divergences, Surrogate Loss Functions, and Decentralized Detection. This correspondence provides the basis for choosing and evaluating various surrogate losses frequently used in statistical learning (e.g., the hinge loss).
- [31] Proper Losses, Moduli of Convexity, and Surrogate Regret Bounds. Proper losses (Buja et al., 2005) are loss functions that measure the discrepancy between a probabilistic prediction and an expected outcome.
- [32] On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities.
- [33] On the Density of Families of Sets (ScienceDirect). Answers the density question in the affirmative by determining the exact upper bound (Theorem 2).
- [34] Learnability and the Vapnik-Chervonenkis Dimension (Journal of the ACM). It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension.
- [35] Toward Efficient Agnostic Learning (UPenn CIS). One of the major limitations of the Probably Approximately Correct (PAC) learning model (Valiant, 1984) and related models is its strong assumptions.
- [36] On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization. Provides sharp bounds for Rademacher and Gaussian complexities of (constrained) linear classes, which yield a number of known results.
- [37] Vladimir N. Vapnik, The Nature of Statistical Learning Theory, 2nd ed. (Springer, Statistics for Engineering and Information Science).
- [38] Neural Networks and the Bias/Variance Dilemma, S. Geman, E. Bienenstock, and R. Doursat.
- [39] Regularization and Statistical Learning Theory for Data Analysis. Outlines the key concepts of regularization and statistical learning theory.
- [40] Regression Shrinkage and Selection via the Lasso (Oxford Academic). Proposes a new method for estimation in linear models: the 'lasso' minimizes the residual sum of squares subject to a bound on the sum of the absolute values of the coefficients.
- [41] On Early Stopping in Gradient Descent Learning (MIT). Studies a family of gradient descent algorithms to approximate the regression function from reproducing kernel Hilbert spaces.
- [42] Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Dropout addresses overfitting; the key idea is to randomly drop units (along with their connections) from the neural network during training.
- [43] Data Augmentation Instead of Explicit Regularization (arXiv:1806.03852). Data augmentation systematically provides large generalization gains and does not require hyperparameter re-tuning.
- [44] Vapnik paper on Support Vector Machines. To simplify computations one can introduce the following (slightly modified) concept of the generalized optimal hyperplane (Cortes and Vapnik, 1995).
- [45] Covering Number Bounds of Certain Regularized Linear Function Classes. The main goal is to construct regularization conditions for linear function classes so that the resulting covering numbers are independent of the input data.
- [46] Entropy Numbers of Linear Function Classes. Covering numbers are of considerable interest to learning theory because generalization bounds can be stated in terms of them.
- [47] Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Lecture notes providing an introduction to the general theory of empirical risk minimization.
- [48] Reconciling Modern Machine Learning Practice and the Bias-Variance Trade-off (arXiv). Reconciles the classical understanding and the modern practice within a unified performance curve, the "double descent" curve.
- [49] Reconciling Modern Machine-Learning Practice and the Classical Bias-Variance Trade-off. Discusses classical overfitting as predicted by the bias-variance trade-off; cites Vapnik, The Nature of Statistical Learning Theory (Springer, 1995).
- [50] Local Rademacher Complexities (UC Berkeley Statistics). Proposes new bounds on the error of learning algorithms in terms of a data-dependent notion of complexity.
- [51] Learning Kernels Using Local Rademacher Complexity. Cites Bartlett and Mendelson, "Rademacher and Gaussian Complexities: Risk Bounds and Structural Results," Journal of Machine Learning Research, vol. 3.
- [52] Semi-Supervised Learning.
- [53] Representation Learning for Clustering: A Statistical Framework. A formal PAC framework for analyzing the problem of learning representations for k-means clustering.
- [54] Cluster Stability for Finite Samples. Concludes that stability remains a meaningful cluster validation criterion over finite samples; clustering is one of the most common tools of unsupervised learning.
- [55] Stability Estimation for Unsupervised Clustering: A Review (PMC, NIH). Stability is a measurement that characterizes the strength and reproducibility of a cluster and an item's membership to a cluster.
- [56] Near Instance-Optimal PAC Reinforcement Learning (arXiv:2203.09251). Proposes nearly matching (up to a horizon-squared factor and logarithmic terms) upper and lower bounds on the sample complexity of PAC reinforcement learning.
- [57] PAC-Bayesian Analysis of Co-clustering and Beyond. Derives PAC-Bayesian generalization bounds for supervised and unsupervised learning models based on clustering, such as co-clustering.
- [58] Semi-Supervised Classification by Low Density Separation. The cluster assumption states that the decision boundary should not cross high density regions, but instead lie in low density regions.
- [59] On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts (MIT). One of the important intuitions of semi-supervised learning is the cluster assumption or, more specifically, the low density separation assumption.
- [60] Transductive Rademacher Complexity and its Applications (arXiv). The setting of semi-supervised learning differs from transduction: in semi-supervised learning the learner is given a randomly drawn training set.
- [61] An ℓp Theory of PCA and Spectral Clustering (Project Euclid). Principal Component Analysis (PCA) is a powerful tool in statistics and machine learning.
- [62] Online Learning: Theory, Algorithms, and Applications (TTIC). Regret bounds are the common thread in the analysis of online learning algorithms; a regret bound measures the performance of an online algorithm relative to a fixed competitor.
- [63] A Survey of Algorithms and Analysis for Adaptive Online Learning. FTL can provide sublinear regret in the case of strongly convex functions, but for general convex functions additional regularization is needed.
- [64] The Perceptron Algorithm (lecture notes). A classic algorithm for learning linear separators, and one of the oldest algorithms in machine learning.
- [65] Neural Tangent Kernel: Convergence and Generalization in Neural Networks (arXiv). The Neural Tangent Kernel (NTK) describes the evolution of an ANN during training and is central to its generalization features; in the infinite-width limit it converges to an explicit limiting kernel.
- [66] A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models. Reviews the literature on statistical theories of neural networks from three perspectives: approximation, training dynamics, and generative models.
- [67] Information-Theoretic Analysis of Generalization Capability of Learning Algorithms. Derives upper bounds on the generalization error of a learning algorithm in terms of the mutual information between its input and output.
- [68]
- [69] Sharpness-Aware Minimization for Efficiently Improving Generalization, Pierre Foret et al. (arXiv).
- [70]
- [71] Causal Inference Meets Deep Learning: A Comprehensive Survey. Provides a comprehensive and structured review of causal inference methods in deep learning.