
Subsampling

Subsampling is the process of selecting a subset of points or observations from a larger sample or population, often to reduce computational demands, approximate population characteristics, or enable efficient inference while maintaining representativeness. In statistics, a subsample is explicitly defined as a sample drawn from an existing sample: a portion of the original sample, which itself represents a part of the broader population. In statistical inference, subsampling denotes a specific resampling method developed by Dimitris N. Politis and Joseph P. Romano in 1994, which estimates the sampling distribution of a normalized statistic—known as a root statistic—by computing that statistic over multiple contiguous or non-overlapping subsamples of fixed size b drawn from the original dataset of size n, where b \to \infty and b/n \to 0. This approach constructs asymptotically valid confidence regions under minimal assumptions, requiring only that the root statistic converges to a limiting distribution, without needing uniformity of convergence or strong conditions on the underlying data-generating process. Unlike the bootstrap, which resamples with replacement from the empirical distribution and can fail for nonsmooth or discontinuous statistics (e.g., extreme order statistics), subsampling proves robust by directly leveraging the structure of the observed data. It has been extended to dependent data such as stationary time series and random fields. The foundational text on this method, Subsampling by Politis, Romano, and Michael Wolf (1999), provides a comprehensive framework for its implementation and theoretical underpinnings.

Beyond inferential statistics, subsampling plays a crucial role in machine learning, where it involves randomly selecting subsets of training data to accelerate model fitting, mitigate class imbalances through undersampling the majority class, or enhance privacy via mechanisms like privacy amplification by subsampling in differential privacy protocols. For instance, in boosting algorithms or stochastic gradient descent, subsampling reduces overfitting and computational costs by training iteratively on fractions of the dataset.

In signal processing and digital media, subsampling typically refers to downsampling, which decreases the sampling rate to lower temporal or spatial resolution, or to chroma subsampling, a compression technique that reduces the resolution of the color (chroma) components relative to the brightness (luma) information—expressed in ratios like 4:2:0—to minimize bandwidth and storage while exploiting the greater human visual sensitivity to luminance than to color detail. The latter is standard in image and video coding formats such as JPEG and MPEG, enabling efficient transmission without perceptible quality loss for most viewers.

General Principles

Definition and Terminology

Subsampling is a data reduction technique that involves selecting a subset of points from a larger dataset or signal, thereby decreasing its overall size while striving to retain key characteristics such as statistical properties or frequency content. This process is commonly employed across disciplines such as signal processing and statistics to manage computational demands, lower storage needs, or simplify analysis without substantial loss of information. By reducing the number of samples, the sampling rate, or the density of observations, subsampling facilitates efficient handling of high-dimensional or voluminous data.

In signal processing, subsampling is often synonymous with downsampling, which specifically denotes the reduction of a signal's sampling rate by an integer factor M, typically involving low-pass filtering to prevent aliasing followed by retention of every M-th sample. Related terminology includes "decimation," which refers to the same integer-factor rate reduction process, emphasizing the combined filtering and subsampling steps that convert a signal from a higher sampling rate f_s to a lower one f_s / M. "Undersampling," by contrast, describes intentional sampling at a rate below the Nyquist rate for bandlimited signals, which can exploit aliasing for applications like bandpass signal acquisition but risks distortion if not carefully managed. These terms trace their origins to foundational sampling theory developed in the 1940s, building on Harry Nyquist's 1928 work on telegraph transmission limits and Claude Shannon's 1949 formalization of reconstruction conditions for bandlimited signals. A key distinction exists between subsampling and supersampling: while subsampling decreases the number of data points to achieve reduction, supersampling increases sampling density—often by rendering or acquiring at higher resolution—before averaging to combat aliasing artifacts in the final output. This contrast highlights subsampling's focus on efficiency through sparsity rather than enhancement through redundancy. Understanding these concepts presupposes familiarity with the Nyquist-Shannon sampling theorem, which asserts that a continuous bandlimited signal with maximum frequency f_{\max} can be perfectly reconstructed from discrete samples taken at a rate greater than 2 f_{\max}, providing the theoretical basis for safe rate reductions in subsampling.

In statistics, subsampling involves selecting subsets from an existing sample to approximate population parameters or construct confidence intervals; this can use uniform random selection without replacement or non-random schemes such as contiguous blocks that preserve temporal or spatial correlations in dependent data, particularly when full resampling such as the bootstrap is infeasible. In engineering applications, such as audio or image bandwidth reduction, subsampling reduces resource requirements while maintaining perceptual quality.
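
The following minimal sketch (the sampling rate, tone frequency, and factor M are illustrative assumptions, not from the source) shows the most basic form of subsampling in the signal sense: retaining every M-th sample of a discrete-time sequence. Without prior low-pass filtering this is safe only when the signal is already bandlimited below the new Nyquist frequency.

import numpy as np

fs = 1000                       # assumed original sampling rate, Hz
t = np.arange(0, 1, 1 / fs)     # one second of samples
x = np.sin(2 * np.pi * 40 * t)  # 40 Hz tone, well below fs / (2 * M) for M = 4

M = 4
y = x[::M]                      # subsampled sequence y[m] = x[M * m]
fs_new = fs / M                 # effective sampling rate after subsampling

print(len(x), len(y), fs_new)   # 1000 250 250.0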

Mathematical Foundations

Subsampling of a discrete-time signal x[n] by an integer factor M produces a new sequence y[m] = x[Mm], where m \in \mathbb{Z}. This operation retains every M-th sample of the original signal, effectively reducing the sampling rate by a factor of M. The process can be viewed as compressing the time axis, yielding a more compact representation in the time domain but potentially introducing distortions in the frequency domain if not properly managed. To derive the effects of subsampling, the subsampled sequence is analyzed in the frequency domain. The derivation involves spectral replication due to the periodic selection of samples, leading to aliased components in which higher frequencies fold into the baseband. In the Fourier domain, the discrete-time Fourier transform (DTFT) of the subsampled signal is given by

Y(e^{j\omega}) = \frac{1}{M} \sum_{k=0}^{M-1} X\left(e^{j (\omega - 2\pi k)/M}\right),

where X(e^{j\omega}) is the DTFT of the original signal. This shows stretching of the frequency axis by a factor of M together with amplitude scaling by 1/M and aliasing: the M shifted and scaled copies of the original spectrum are summed, causing higher-frequency components to fold into the lower-frequency range. Equivalently, in terms of normalized frequency (cycles per sample), the input components at frequencies (f + k)/M for k = 0, 1, \dots, M-1 all map to the same output frequency f at the reduced rate, illustrating how distinct spectral components become indistinguishable after subsampling.

In statistical and data contexts, subsampling refers to selecting a subset of size b from an original sample of size N via random selection without replacement, where each possible subset is equally likely. The probability that any specific data point is included in the subsample is p = b / N. This approach approximates the original sample and is used for inference, such as estimating variability without strong parametric assumptions. The error in subsampled representations can be quantified using the mean squared error (MSE), particularly for estimators like the sample mean. For subsampling without replacement, the MSE of the subsample mean \bar{y} (which is unbiased for the full-sample mean) equals its variance, given by

\text{MSE}(\bar{y}) = \frac{N - b}{N} \cdot \frac{\sigma^2}{b},

where \sigma^2 is the variance of the original sample computed with divisor N - 1. This formula shows that the error decreases with larger subsample sizes b and is further reduced by the finite population correction factor (N - b)/N, bounding the accuracy of the subsampled estimate relative to the full sample.
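
A short Monte Carlo check of the variance formula above can make it concrete; this is an illustrative sketch (sample size, subsample size, and number of replications are assumptions), not code from the source.

import numpy as np

# Verify MSE(ybar) = ((N - b) / N) * sigma^2 / b, with sigma^2 the variance of
# the original sample computed with divisor N - 1.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)           # fixed "original sample" of size N
N, b = len(x), 50
sigma2 = x.var(ddof=1)

reps = 20000
means = np.array([x[rng.choice(N, size=b, replace=False)].mean()
                  for _ in range(reps)])
empirical_mse = np.mean((means - x.mean()) ** 2)

theoretical_mse = (N - b) / N * sigma2 / b
print(empirical_mse, theoretical_mse)   # the two values should agree closely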

Subsampling in Signal Processing

Downsampling Process

The downsampling process in digital signal processing, also known as decimation, reduces the sampling rate of a discrete-time signal by an integer factor M while preserving the signal's relevant information content. This is achieved through a two-step procedure designed to prevent aliasing and maintain signal integrity. The first step applies a low-pass filter to the input signal x[n] to bandlimit it, ensuring that frequency components above the new Nyquist frequency f_s / (2M) (where f_s is the original sampling frequency) are attenuated. The second step is decimation proper, in which every M-th sample of the filtered signal is retained and the intermediate samples are discarded. For example, with M = 2, the downsampled signal y is formed as y[0] = x[0], y[1] = x[2], y[2] = x[4], and so on, halving the sampling rate.

For efficient implementation, especially in hardware or real-time systems, polyphase decomposition reconfigures the filtering and downsampling operations to minimize computations. The filter's impulse response h[n] is partitioned into M polyphase components E_k(z) = \sum_{m} h(Mm + k) z^{-m} for k = 0, 1, \dots, M-1, allowing the downsampler to be moved before the filters using the noble identity. The result is that the filtering runs at the lower output rate rather than the full input rate, reducing multiplications by approximately a factor of M. For M = 4, the structure consists of four parallel branches: the input signal is delayed successively by 0, 1, 2, and 3 samples, each branch is downsampled by 4 (selecting every fourth sample) and filtered by the corresponding polyphase filter E_k(z), and the branch outputs are summed to produce the decimated signal.

When the desired rate change is a rational factor L/M (with L and M both greater than 1), the process combines upsampling by L with low-pass interpolation filtering, followed by downsampling by M with anti-aliasing filtering. The overall low-pass filter is designed to meet the stricter of the two requirements, with a normalized cutoff of \min(\pi/L, \pi/M) radians per sample at the intermediate rate. The foundational concepts of multirate digital signal processing, including these downsampling techniques, were developed in the 1970s by Ronald E. Crochiere and Lawrence R. Rabiner at Bell Laboratories.
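
A minimal sketch of the two-step procedure is shown below, assuming SciPy's firwin and lfilter are available; the function name decimate_by_M, the tap count, and the test signal are illustrative, not a specific library API.

import numpy as np
from scipy.signal import firwin, lfilter

def decimate_by_M(x, M, numtaps=63):
    """Two-step decimation sketch: low-pass filter, then keep every M-th sample.

    The linear-phase FIR cutoff sits at the new Nyquist frequency (1/M in units
    of the original Nyquist), attenuating components that would otherwise alias
    when the intermediate samples are discarded.
    """
    h = firwin(numtaps, 1.0 / M)   # anti-aliasing FIR, cutoff pi/M rad/sample
    xf = lfilter(h, 1.0, x)        # bandlimit the input at the full rate
    return xf[::M]                 # decimation: retain every M-th sample

# Example: decimate a two-tone signal by M = 2; the 3500 Hz tone, which would
# alias at the new 4000 Hz sampling rate, is suppressed by the filter instead.
fs = 8000
t = np.arange(0, 0.1, 1 / fs)
x = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 3500 * t)
y = decimate_by_M(x, M=2)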

Anti-Aliasing and Filtering

In subsampling by an integer factor M, aliasing occurs because the signal's spectrum is replicated at intervals of 2\pi/M radians per sample in the frequency domain, causing folding in which components above the new Nyquist frequency \pi/M overlap with the baseband spectrum. This mechanism distorts the subsampled signal by mapping high-frequency content into the lower-frequency range, potentially corrupting the desired information. For instance, with M = 2, a sinusoid at \omega = \pi/2 + \delta (just above the new Nyquist limit \pi/2) folds back after subsampling so that it becomes indistinguishable from a component originally at \pi/2 - \delta, masquerading as part of the baseband and introducing artifacts that cannot be removed after subsampling.

To prevent such aliasing, an anti-aliasing low-pass filter is essential before the subsampling operation, ideally with a cutoff frequency of \pi/M radians per sample to preserve the baseband while attenuating higher frequencies that would otherwise fold in. The frequency response of this ideal filter is rectangular, passing signals unchanged up to \pi/M and rejecting everything above, with its impulse response given by a sinc function:

h[n] = \frac{1}{M} \cdot \frac{\sin(\pi n / M)}{\pi n / M}, \quad n \neq 0,

with h[0] = 1/M by continuity, giving unity passband gain so that signal levels are preserved through decimation. This theoretical filter provides perfect preservation of the baseband but is non-causal and infinite in duration, making it unrealizable in practice without approximation.

In practical implementations, finite impulse response (FIR) filters are commonly employed for anti-aliasing because they can achieve exactly linear phase, which avoids phase distortion and preserves the temporal alignment of signal features, unlike infinite impulse response (IIR) filters, which offer computational efficiency through recursion but introduce nonlinear phase shifts. FIR designs often use windowing methods, such as the Kaiser window, where the shape parameter \beta (typically ranging from 4 to 10) trades off passband ripple against stopband attenuation to meet aliasing-suppression requirements; for example, \beta = 5.5 yields roughly 60 dB of stopband rejection, suitable for many audio tasks. IIR filters, while requiring fewer coefficients (e.g., second-order sections for elliptic designs achieving sharp roll-off), demand careful stability analysis and are less favored when phase linearity is critical, though they can reduce multiplications per output sample by up to an order of magnitude in resource-constrained systems. The choice balances computational cost—FIR filters may need hundreds of taps for steep transitions, increasing latency and power use—against performance, with polyphase structures further optimizing FIR decimators by partitioning the filter into M subfilters operating at the reduced rate. Early digital signal processing literature focused on basic sampling theory often underemphasized efficient polyphase techniques for subsampling, treating anti-aliasing filters as straightforward convolutions without rate-specific optimizations; these methods, popularized through the seminal multirate analyses of the 1970s and 1980s, are now standard for minimizing redundant computations in decimators by up to a factor of M.
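
As a hedged illustration of the Kaiser-window FIR design mentioned above, the sketch below uses SciPy's kaiserord and firwin; the 60 dB attenuation target, transition width, and cutoff placement are assumptions chosen for the example rather than values from the source.

from scipy.signal import kaiserord, firwin

M = 4
atten_db = 60.0          # desired stopband rejection, dB (illustrative)
trans_width = 0.05       # transition width, in units of the Nyquist frequency

numtaps, beta = kaiserord(atten_db, trans_width)   # Kaiser design rule
h = firwin(numtaps, 1.0 / M - trans_width / 2,     # cutoff just below pi/M
           window=("kaiser", beta))

print(numtaps, round(beta, 2))   # tap count grows as the transition band narrows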

Subsampling in Statistics

Methods and Techniques

In statistics, subsampling refers to a resampling method introduced by Dimitris N. Politis and Joseph P. Romano in 1994 for estimating the sampling distribution of a normalized statistic, known as a root statistic, under minimal assumptions. The core procedure computes the root statistic over multiple subsamples of fixed size b drawn from the original sample of size n, where b \to \infty and b/n \to 0 as n \to \infty. For independent and identically distributed (i.i.d.) data, subsamples can be non-overlapping blocks or random draws without replacement. For dependent data, such as time series, contiguous blocks of size b are typically used to preserve temporal structure. The empirical distribution of the root statistics from these subsamples approximates the limiting distribution of the full-sample root statistic, enabling the construction of asymptotically valid confidence regions without strong parametric assumptions or uniformity conditions. Unlike the bootstrap, which resamples with replacement from the empirical distribution and may fail for nonstandard or discontinuous statistics (e.g., extreme order statistics), subsampling is robust because it directly uses the observed data structure without artificial resampling. The choice of b is critical; theoretical guidelines suggest b on the order of n^{1/3} in many cases to balance bias and variance in the approximation.

Extensions include the jackknife, a special case of subsampling in which b = n - 1 and each subsample omits one observation, used primarily for variance estimation. The jackknife variance estimator for an estimator \hat{\theta} is

\hat{\text{Var}}(\hat{\theta}) = \frac{n-1}{n} \sum_{i=1}^{n} (\hat{\theta}_{(i)} - \bar{\hat{\theta}})^2,

where \hat{\theta}_{(i)} is the estimator computed from the subsample excluding the i-th observation and \bar{\hat{\theta}} is the average of the \hat{\theta}_{(i)}. Introduced by Maurice Quenouille for bias reduction and extended by John Tukey in 1958 to variance estimation, the jackknife provides a nonparametric uncertainty measure but can overestimate variance for non-smooth estimators.
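
The sketch below illustrates the subsampling idea for i.i.d. data under the assumption of a \sqrt{n} convergence rate for the root statistic; the function name, the number of random subsamples, and the use of random draws without replacement are illustrative choices, not a prescribed implementation.

import numpy as np

def subsampling_ci(x, b, stat=np.mean, alpha=0.05, n_sub=1000, rng=None):
    """Approximate (1 - alpha) confidence interval via subsampling.

    Roots sqrt(b) * (stat(subsample) - stat(full sample)) are computed over
    random size-b subsamples drawn without replacement; their empirical
    quantiles approximate those of sqrt(n) * (stat(full) - true value),
    assuming a sqrt(n)-consistent estimator.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    theta_n = stat(x)
    roots = np.empty(n_sub)
    for i in range(n_sub):
        idx = rng.choice(n, size=b, replace=False)   # subsample w/o replacement
        roots[i] = np.sqrt(b) * (stat(x[idx]) - theta_n)
    lo, hi = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    # Invert the root to obtain the interval for the true parameter.
    return theta_n - hi / np.sqrt(n), theta_n - lo / np.sqrt(n)

x = np.random.default_rng(0).exponential(size=5000)
print(subsampling_ci(x, b=int(len(x) ** (1 / 3)) * 10))   # interval around the mean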

Applications in Estimation

Subsampling is widely applied in constructing confidence intervals and testing hypotheses for complex estimators, particularly in nonparametric and semiparametric settings. In survey sampling, it facilitates efficient estimation in multi-stage designs; for example, the U.S. Census Bureau used subsampling in its 2000 Accuracy and Coverage Evaluation (A.C.E.) post-enumeration survey, which sampled approximately 300,000 housing units to measure census undercoverage and support dual-system estimation for demographic adjustments. For high-dimensional data where the number of variables p grows with the sample size n, subsampling integrates with methods like penalized empirical likelihood (PEL) to form valid confidence regions. In PEL, subsamples of size m (with m/n \to 0 and p/m \to 0) calibrate the empirical distribution of the adjusted log-likelihood ratio statistic,

\hat{G}_n(x) = |I_n|^{-1} \sum_{I \in I_n} \mathbf{1}(-\log R^*_m(\mu_0; I) \leq x),

enabling confidence intervals for means in sparse regimes. Subsampling has been extended to dependent data, including stationary time series and random fields, for applications such as volatility forecasting. The foundational text, Subsampling by Politis, Romano, and Michael Wolf (1999), details these extensions and their implementation. Early implementations faced computational challenges for massive datasets, but post-2010 advancements, including distributed and GPU-accelerated computation and techniques like subagging (subsample aggregation), have enabled scalable inference with convergence rates matching full-data methods. As of 2025, subsampling remains integral to big data analytics and machine learning-adjacent statistical procedures.
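
A minimal sketch of subagging (subsample aggregation) is given below; the estimator, subsample size, and number of subsamples are illustrative assumptions, and the function is not a specific library API.

import numpy as np

def subagging_estimate(x, stat, b, n_sub=200, rng=None):
    """Fit a statistic on many random size-b subsamples (without replacement)
    and average the results to stabilize the final estimate."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    estimates = [stat(x[rng.choice(n, size=b, replace=False)])
                 for _ in range(n_sub)]
    return np.mean(estimates)

x = np.random.default_rng(1).exponential(size=10_000)
print(subagging_estimate(x, np.median, b=500))   # aggregated median over subsamples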

Subsampling in Image Processing

Chroma Subsampling

Chroma subsampling is a technique in image and video processing that reduces the resolution of color (chroma) information relative to brightness (luma) to achieve compression efficiency, exploiting the human visual system's lower spatial sensitivity to chromatic detail compared with luminance variations. This principle allows the chroma channels, typically represented as Cb and Cr in the YCbCr color space, to be downsampled by a factor of 2 horizontally and/or vertically without significantly impairing perceived quality. The process begins by converting the input RGB image to YCbCr, where Y carries the luma and Cb and Cr encode the blue and red color differences, respectively; subsampling then averages or filters chroma values over blocks of luma samples.

Key standards define specific subsampling ratios relative to full 4:4:4 sampling. In 4:2:2, chroma is subsampled by 2 horizontally but not vertically, halving the color data per line while preserving vertical color resolution, as specified in ITU-R BT.601 for studio digital video interfaces. The 4:2:0 format, common in consumer applications, further subsamples chroma by 2 vertically as well, reducing the color information to one-quarter of the luma resolution by averaging over 2x2 blocks; this is the default in JPEG (ISO/IEC 10918-1) for still images and MPEG-1/2 (ISO/IEC 11172-2 and 13818-2) for video compression. A typical pipeline involves color-space conversion, low-pass filtering of the chroma channels to prevent aliasing, subsampling, and subsequent quantization and encoding, with the luma channel kept on the full sampling grid while the chroma grids are coarsened.

Despite its efficiency, chroma subsampling introduces artifacts, primarily color bleeding, in which sharp color edges appear smeared into adjacent areas due to the reduced chroma resolution and quantization effects. This is most noticeable in high-contrast color transitions, such as text on colored backgrounds, though the impact on overall quality is minimal for most content, with objective metrics showing only modest degradation compared to full 4:4:4 sampling. The technique originated with the development of analog color television in the late 1940s and 1950s, building on Alda V. Bedford's work at RCA and a 1949 patent on bandwidth-efficient color encoding, which separated luma and chroma to fit color signals within monochrome broadcast limits. Early implementations overlooked some perceptual nuances, such as chroma-luma interactions, leading to visible color artifacts. Refinements occurred in the 1990s with codecs like MPEG-1 and MPEG-2, which standardized 4:2:0 for DVD and broadcast, incorporating better filtering to align with human vision models while maintaining compression efficiency. As of 2025, 4:2:0 remains the standard format in UHD Blu-ray and many streaming services using HEVC compression, balancing quality and bandwidth efficiency.
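
The sketch below shows 4:2:0 chroma subsampling as a simple 2x2 block average on already-separated YCbCr planes; input shapes and the averaging filter are assumptions for illustration, since real codecs use specified low-pass filters and chroma sample-siting conventions.

import numpy as np

def subsample_420(y, cb, cr):
    """Keep luma at full resolution; average each chroma plane over 2x2 blocks."""
    def avg_2x2(c):
        h, w = c.shape
        c = c[: h - h % 2, : w - w % 2]                 # crop to even dimensions
        return c.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, avg_2x2(cb), avg_2x2(cr)

y = np.random.rand(480, 640)
cb = np.random.rand(480, 640)
cr = np.random.rand(480, 640)
y2, cb2, cr2 = subsample_420(y, cb, cr)
print(y2.shape, cb2.shape, cr2.shape)   # (480, 640) (240, 320) (240, 320)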

Spatial and Temporal Subsampling

Spatial subsampling reduces the resolution of an image by mapping multiple original pixels to each new pixel in a lower-resolution grid, decreasing spatial detail while preserving overall structure. Two primary methods are nearest-neighbor selection, which assigns to each new pixel the value of the nearest original pixel for simplicity and speed, and pixel averaging, which computes the mean of neighboring pixels to smooth the result. For a 2× reduction in both dimensions, pixel averaging typically takes the mean of each 2×2 block:

x'_{i,j} = \frac{x_{2i,2j} + x_{2i+1,2j} + x_{2i,2j+1} + x_{2i+1,2j+1}}{4}.

This approach, equivalent to box filtering (bilinear interpolation with uniform weights), retains intensity information more evenly than nearest-neighbor selection, though it may introduce slight smoothing.

Temporal subsampling lowers the frame rate in video sequences to cut data volume, often by frame dropping, such as retaining every second frame to halve the rate from 60 fps to 30 fps. This simple technique achieves compression but can produce jerkiness or stuttering in fast-motion scenes due to abrupt temporal changes. To mitigate this, motion compensation interpolates dropped frames using estimated motion vectors between retained frames, synthesizing intermediate content for smoother playback.

These subsampling strategies find widespread use in generating thumbnails for quick previews and enhancing video streaming efficiency, especially under the bandwidth limitations prevalent in web applications during the early 2000s, when average connection speeds were below 1 Mbps. By reducing file sizes, spatial subsampling enables faster loading of images, while temporal subsampling optimizes real-time video delivery without excessive buffering. A key challenge in subsampling is the introduction of blurring artifacts from excessive pre-filtering, which over-smooths edges and details, or aliasing from insufficient low-pass filtering that fails to suppress high frequencies adequately. These issues, related to anti-aliasing principles, are effectively addressed by the Lanczos resampling kernel, a windowed-sinc filter that balances sharpness and artifact reduction through its controlled frequency response, minimizing both blurring and aliasing in downscaled outputs.
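
Minimal sketches of the two operations described above are shown below, assuming images and videos are represented as NumPy arrays; the frame dimensions and rates are illustrative.

import numpy as np

def downscale_2x2(img):
    """Spatial subsampling by 2x2 pixel averaging (grayscale or multi-channel)."""
    h, w = img.shape[:2]
    img = img[: h - h % 2, : w - w % 2]
    return img.reshape(h // 2, 2, w // 2, 2, *img.shape[2:]).mean(axis=(1, 3))

def drop_frames(video, keep_every=2):
    """Temporal subsampling by frame dropping, e.g. 60 fps -> 30 fps."""
    return video[::keep_every]

frames = np.random.rand(60, 720, 1280)   # one second of 720p grayscale video
small = downscale_2x2(frames[0])         # shape (360, 640)
half_rate = drop_frames(frames)          # shape (30, 720, 1280)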

Subsampling in Machine Learning

Data Reduction Techniques

Subsampling serves as a key data reduction technique in machine learning for managing large datasets by selecting representative subsets for training, thereby addressing memory constraints and computational demands while preserving model performance. This approach draws on statistical sampling principles but adapts them to the iterative, optimization-driven nature of machine learning workflows, enabling efficient processing of datasets that exceed single-machine capabilities.

Random subsampling without replacement is a foundational method in which a fixed number k of samples are uniformly selected from the full dataset to form a training set, ensuring no duplicates and fitting within available memory limits. For instance, when training deep neural networks on massive image corpora, this technique reduces the effective set from millions to thousands of examples per iteration, accelerating training without introducing systematic bias into the resulting estimates. Such uniform selection is particularly effective for initial model fitting, as it preserves the original data distribution proportionally.

Importance sampling extends random methods by assigning selection probabilities based on the magnitude of loss gradients, with each sample x_i chosen with probability proportional to \|\nabla L(x_i)\|, prioritizing examples that contribute most to error reduction. This weighted approach reduces the variance of gradient estimates, leading to faster training on imbalanced or high-variance datasets, such as those in which rare events drive performance gains. By focusing computational effort on informative samples, importance sampling can reduce training time by up to an order of magnitude (e.g., a 10x speedup) in some tasks compared with uniform sampling, while maintaining or improving accuracy (see the sketch after this section's paragraphs).

Curriculum subsampling introduces an ordering over the training data, beginning with easier examples—often defined by lower initial loss values—and gradually incorporating harder ones to mimic human learning trajectories and stabilize early optimization. In practice, this involves sequencing subsamples by increasing difficulty metrics, such as model confidence scores, which has been shown to improve convergence in sequence-to-sequence models relative to random ordering. The method's efficacy stems from smoothing the optimization landscape, making it suitable for complex tasks where early exposure to simple patterns builds robust representations.

Subsampling also plays a role in privacy-preserving machine learning, such as privacy amplification by subsampling in differential privacy protocols. This technique randomly selects a subset of records for processing, amplifying privacy guarantees (e.g., reducing the effective privacy cost of each step in DP-SGD) while enabling training on sensitive datasets, as standardized in differential privacy frameworks through 2025.

Prior to the widespread adoption of distributed computing frameworks, subsampling techniques in machine learning were often inefficient for very large datasets because of their reliance on single-node processing, limiting practical use to datasets under a few gigabytes and resulting in prohibitive training times for emerging large-scale applications. Since the release of Apache Spark and its machine learning libraries, these methods have been integrated into distributed systems, enabling parallel subsampling across clusters for terabyte-scale data, with built-in functions for uniform and stratified sampling that reduce overhead by orders of magnitude. This shift has facilitated widespread adoption in production environments, such as recommendation systems handling billions of interactions.
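
The sketch below illustrates importance-based subsampling with inverse-probability weighting; the per-example score array is a stand-in for gradient norms, and the dataset size and batch size are assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(0)
n, k = 100_000, 1_024
grad_norm = rng.gamma(shape=2.0, scale=1.0, size=n)   # stand-in for ||grad L(x_i)||

p = grad_norm / grad_norm.sum()                        # selection probabilities
idx = rng.choice(n, size=k, replace=True, p=p)         # draw proportional to score
weights = 1.0 / (n * p[idx])                           # inverse-probability weights

# A weighted average of per-example losses or gradients over `idx`, using
# `weights`, is an unbiased estimate of the corresponding full-data average,
# while concentrating computation on the most informative examples.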

Integration with Algorithms

Subsampling plays a crucial role in ensemble methods, particularly through bagging, where bootstrap sampling with replacement typically draws approximately 63% unique instances from the training set to create diverse base learners, thereby reducing variance in predictions. Variants like subagging employ subsampling without replacement at rates of 10-50% of the dataset size, which improves computational efficiency while maintaining or improving predictive stability compared with traditional bagging. These approaches aggregate predictions from multiple subsampled models to yield robust ensemble outputs, as demonstrated in tree-based ensembles, where subagging achieves accuracy comparable to bagging with lower overhead.

In stochastic gradient descent (SGD), subsampling forms the basis of mini-batch training, where gradients are estimated from subsets of size 32-256 to balance computational speed and optimization stability. This mini-batch approach reduces the variance of the stochastic gradient estimate from \sigma^2 (the single-sample case) to approximately \sigma^2 / b, where b is the batch size, enabling faster convergence toward minima in large-scale problems such as deep network training. Larger batches further mitigate gradient noise but increase per-iteration costs, making the batch size a key tunable parameter in practical workflows.

Active learning integrates subsampling by selectively querying subsets of unlabeled data for oracle labeling, focusing on high-uncertainty or informative instances to minimize annotation effort. This pool-based strategy reduces labeling costs relative to random sampling, as shown in text classification tasks where uncertainty sampling achieves target accuracy with far fewer queries. For instance, in radiology image labeling, active methods have cut human effort by up to 90% while preserving model performance.

As of 2023, advances have embedded subsampling within transformer architectures to handle long sequences, addressing the quadratic cost of self-attention through techniques like random and layerwise token dropping during pretraining. These methods drop a fraction of input tokens (e.g., 10-20%) to sparsify computations without substantial accuracy loss, challenging early assumptions of uniform token importance and enabling efficient processing of sequences exceeding 512 tokens. Such methods have been pivotal in scaling models like BERT variants for tasks involving extended contexts, yielding up to 2x speedups in training time.
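
A minimal mini-batch SGD sketch for least squares is given below; the synthetic data, batch size, learning rate, and step count are illustrative assumptions, and the point is simply that each step estimates the gradient from a random subsample of size b.

import numpy as np

rng = np.random.default_rng(0)
n, d, b, lr = 10_000, 20, 128, 0.05
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
for step in range(2_000):
    idx = rng.choice(n, size=b, replace=False)      # subsample a mini-batch
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / b     # mini-batch gradient estimate
    w -= lr * grad                                  # gradient noise ~ sigma^2 / b

print(np.linalg.norm(w - w_true))   # should be small after training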

    Nov 30, 2021 · We conclude that the human effort required to label an image dataset can be reduced by approximately 90% in most cases by using the described ...