Array slicing
Array slicing is a core programming technique for extracting a contiguous subsequence of elements from an array, specified by indices such as start, end, and optional step values, often producing a new array or a lightweight view of the original without altering it.[1][2] This operation is essential for efficient data manipulation in numerical computing, web development, and general-purpose programming, enabling tasks like subarray selection, data subsetting, and iterative processing without unnecessary memory copies.[1][3] In languages like Python and its NumPy extension, basic slicing uses syntax such as array[start:stop:step], where omitted values default to the array's beginning, end, or step of 1, and negative indices count from the end; this typically yields a view—a reference to the original array's memory—rather than a full copy, promoting memory efficiency for large datasets.[1] For instance, in NumPy's N-dimensional arrays, slicing extends to multiple dimensions via tuples of slices, supporting tools like Ellipsis (...) to handle remaining axes, while advanced indexing with integer or Boolean arrays triggers copies instead.[1] Python's built-in array module similarly supports slicing and assignment but enforces type homogeneity, requiring assigned values to match the array's numeric type code to avoid errors.[3]
In JavaScript, the Array.prototype.slice() method provides slicing functionality with optional start and end parameters, returning a shallow copy of the selected portion into a new array object, where nested objects remain referenced rather than duplicated.[2] This shallow nature ensures the original array remains unchanged, but modifications to shared objects propagate across both, a key consideration for avoiding unintended side effects in dynamic code.[2] Across these implementations, array slicing balances performance and flexibility, forming a cornerstone of array operations in modern computing environments.[1][3][2]
Fundamentals
Definition and Purpose
Array slicing is a fundamental operation in array-based programming, building on the concept of arrays as ordered collections of homogeneous elements stored in contiguous memory locations, typically accessed via zero-based indexing. This structure enables constant-time access to individual elements through their indices, forming the basis for efficient data handling in computational tasks. At its core, array slicing refers to the technique of extracting a contiguous or patterned subset of elements from an original array to form a new array or view, specified by parameters such as a starting index, an ending index (exclusive), and an optional step size that determines the interval between selected elements.[1] This method allows programmers to isolate specific portions of data without needing to iterate manually over the entire structure, promoting modular and expressive code in array-oriented languages and libraries.[4] The primary purpose of array slicing lies in its ability to facilitate efficient data manipulation, particularly in scenarios involving large datasets where full array copies would be prohibitively expensive in terms of memory and time.[1] In view-based implementations, such as those in NumPy, slicing produces a lightweight reference to the original array's data rather than duplicating it, enabling in-place modifications and reducing overhead in numerical computations.[5] This efficiency is crucial for data analysis, algorithm optimization, and processing in scientific computing, where operations like subset extraction support tasks such as filtering, windowing, and statistical processing on massive arrays without unnecessary resource consumption.[1] Beyond performance gains, array slicing enhances code readability by allowing concise expressions for common subsetting needs, while supporting functional programming paradigms through immutable views that encourage non-destructive data handling.[6] It serves as a foundational tool for higher-level 
operations in domains like machine learning and simulations, where repeated access to array segments is routine, ultimately streamlining development and maintenance of complex programs.[6]
Basic Syntax
The basic syntax for one-dimensional array slicing in many programming contexts uses the notation array[start:end:step], where start is the inclusive starting index (defaulting to 0), end is the exclusive ending index (defaulting to the array's length), and step is the increment between selected indices (defaulting to 1 for forward traversal).[1][7] This convention enables efficient extraction of contiguous or strided subsequences without modifying the original array structure. The start index includes the element at that position, while the end index excludes the element at that position, ensuring the slice length is end - start when step is 1.[1]
Pseudocode examples illustrate common patterns. For an array arr of length 10 containing elements indexed from 0 to 9, the slice arr[0:5] selects elements at indices 0 through 4, yielding the first five elements.[1] Similarly, arr[2:7:2] starts at index 2, ends before index 7, and steps by 2, selecting elements at indices 2, 4, and 6. To capture the full array, arr[:] omits both start and end, equivalent to arr[0:len(arr):1].[7]
Edge cases handle invalid or boundary ranges gracefully. An empty range, such as arr[5:2] where start exceeds end, results in an empty array or sequence. Slices extending beyond bounds are truncated: for instance, arr[8:] on a length-10 array selects elements from index 8 to 9, while arr[12:] yields an empty result. Negative values for start or end count from the end of the array (e.g., -1 refers to the last element), supported in basic slicing conventions.[1][7]
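These patterns and edge cases map one-to-one onto Python's built-in list slicing, which makes them easy to verify:

```python
arr = list(range(10))        # [0, 1, ..., 9]

print(arr[0:5])      # [0, 1, 2, 3, 4]   the first five elements
print(arr[2:7:2])    # [2, 4, 6]         indices 2, 4, 6
print(arr[:])        # full shallow copy, same as arr[0:len(arr):1]

# Edge cases: empty and out-of-bounds ranges never raise errors.
print(arr[5:2])      # []                start > end -> empty slice
print(arr[8:])       # [8, 9]            truncated at the boundary
print(arr[12:])      # []                start beyond bounds -> empty
print(arr[-3:])      # [7, 8, 9]         negative indices count from the end
```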
Slicing operations typically produce a view of the original array—sharing underlying data without copying—or a shallow copy that duplicates top-level references but not nested content, avoiding a full deep copy for efficiency. In numerical array libraries, views predominate to minimize memory overhead, whereas sequence types in general-purpose languages often default to shallow copies.[1][8] This behavior ensures subsets can be manipulated without unintended deep duplication unless explicitly required.
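The view/shallow-copy distinction becomes visible when a slice is mutated: a NumPy slice writes through to the original array, while a Python list slice is a new list whose elements still reference any nested objects:

```python
import numpy as np

a = np.arange(5)
v = a[1:4]          # a view onto a's buffer
v[0] = 99
print(a)            # [ 0 99  2  3  4] -- the original changed

lst = [[1], [2], [3]]
s = lst[0:2]        # a new list (shallow copy of the slice)
s[0] = [99]         # rebinding an element leaves lst untouched
print(lst[0])       # [1]
s = lst[0:2]
s[0].append(7)      # mutating a shared nested object is visible in both
print(lst[0])       # [1, 7]
```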
Advanced Features
Multi-dimensional Slicing
Multi-dimensional array slicing extends the one-dimensional slicing mechanism to higher-dimensional structures, such as matrices (2D arrays) or tensors (3D or more), by allowing independent selection along each axis or dimension. In this approach, the slice notation separates specifications for each dimension using commas within square brackets, such as arr[start1:end1, start2:end2] for a 2D array, where the first slice applies to rows and the second to columns. This enables precise extraction of subarrays without flattening the data, preserving the multi-dimensional structure.[1]
For example, in pseudocode representative of systems like NumPy, row slicing can select specific rows while retaining all columns using arr[1:3, :], which extracts rows indexed 1 through 2 (exclusive of 3) entirely. Similarly, column slicing employs arr[:, 0:2] to retrieve the first two columns across all rows. These operations produce a view of the original array, maintaining efficiency by avoiding data duplication. For higher dimensions, the pattern generalizes; a 3D array might use arr[0:1, 1:3, :] to select the first "layer," rows 1-2, and all elements in the third dimension.[1]
Partial slicing combines specific ranges with full-axis selectors (denoted by :) to target subsets across dimensions, effectively "broadcasting" the partial selection over the unspecified axes. For instance, arr[1, :, 0:2] in a 3D array selects the second layer (index 1), all rows, and the first two columns, resulting in a 2D subarray. This flexibility supports operations on irregular subsets without requiring explicit loops, enhancing code readability and performance in numerical computing environments.[1]
Common use cases include image processing, where slicing crops regions of interest from pixel arrays; for example, extracting a sub-image via image[100:200, 150:250] isolates a 100x100 pixel patch for further analysis. In data analysis, multi-dimensional slicing facilitates selecting sub-tables from data frames, such as df.iloc[0:5, 1:4] in Pandas to retrieve the first five rows and columns 1-3, enabling focused statistical computations on tabular data. These applications leverage the axis-independent nature of slicing to handle real-world multi-dimensional datasets efficiently.[9][10]
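A minimal sketch of the image-cropping use case, substituting a synthetic NumPy array for real pixel data:

```python
import numpy as np

# A synthetic 300x400 grayscale "image" (the pixel values are arbitrary).
image = np.zeros((300, 400), dtype=np.uint8)
image[100:200, 150:250] = 255        # paint a bright rectangular region

patch = image[100:200, 150:250]      # crop a 100x100 region of interest
print(patch.shape)                   # (100, 100)
print(patch.min(), patch.max())      # 255 255: the patch is entirely bright

# Because the crop is a view, per-region processing can write in place:
patch //= 2                          # dim the region in the original image
print(image[150, 200])               # 127
```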
Strides and Negative Indexing
In array slicing, strides, also known as steps, provide a mechanism to select elements at regular intervals, enabling patterned extraction without explicit loops. This parameter, often denoted as the third element in a slice notation like start:end:step, allows skipping elements by specifying an integer value greater than 1 for subsampling or downsampling operations. For instance, a stride of 2 extracts every other element, as in arr[::2], which retrieves elements at indices 0, 2, 4, and so on from the array arr. Negative strides, such as -1 in arr[::-1], reverse the order of selection, effectively creating a view of the array in reverse without copying data.[11]
Negative indexing complements strides by allowing relative addressing from the end of the array, where -1 refers to the last element, -2 to the second-to-last, and so forth. This facilitates intuitive access to trailing portions, such as arr[-1] for the final element or arr[-3:] to select the last three elements up to the end. When combined with strides, negative indices enable versatile operations like reversed subsampling; for example, arr[-5::-2] starts from the fifth-to-last element and steps backward by two positions each time. This relative counting simplifies code for tasks involving tail-end processing, and it integrates seamlessly with positive strides or ranges.[1]
Pseudocode illustrates these features effectively. For downsampling, consider an array arr of length 9; arr[0::3] yields elements at indices 0, 3, and 6, reducing the data by a factor of three for applications like signal processing. Reversal with a subset, such as arr[5:1:-1] on an array indexed from 0 to 7, selects elements at indices 5, 4, 3, and 2 (stopping before 1), demonstrating bounded reverse traversal. In multi-dimensional arrays, strides apply per dimension; arr[::2, ::-1] takes every second row forward and reverses all columns, producing a strided view that alternates row sampling while flipping the column order. These operations maintain the array's dimensionality where applicable.
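The pseudocode above translates directly into Python/NumPy, where the element values equal their indices so each result can be read off:

```python
import numpy as np

arr = np.arange(9)            # length 9: [0 .. 8]
print(arr[0::3])              # [0 3 6]   downsample by a factor of three

arr8 = np.arange(8)           # indices 0..7
print(arr8[5:1:-1])           # [5 4 3 2] bounded reverse traversal
print(arr8[::-1])             # [7 6 5 4 3 2 1 0] full reversal
print(arr8[-5::-2])           # [3 1]     from the fifth-to-last, stepping back by 2

m = np.arange(12).reshape(4, 3)
print(m[::2, ::-1])           # every second row, columns reversed:
# [[2 1 0]
#  [8 7 6]]
```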
Strided slices typically produce views rather than copies, referencing the original array's memory without duplication, which enhances memory efficiency in numerical computations. This zero-copy approach relies on adjusting the stride values in the array's metadata to reinterpret the linear memory layout, allowing operations like transposition or reshaping at negligible cost. However, such views carry risks: modifications to the sliced view alter the original array, potentially leading to unintended side effects if not handled carefully, as both share the underlying data buffer.[11][12]
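NumPy exposes this sharing explicitly, which makes both the side-effect risk and the remedy easy to demonstrate:

```python
import numpy as np

a = np.arange(6)
v = a[::2]                       # strided view: [0 2 4]
print(np.shares_memory(a, v))    # True: same underlying buffer

v[:] = -1                        # writing through the view...
print(a)                         # [-1  1 -1  3 -1  5] ...mutates the original

# An explicit copy decouples the slice from the source array.
c = a[::2].copy()
c[:] = 9
print(np.shares_memory(a, c))    # False: a is no longer affected
```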
Performance-wise, strides enable efficient zero-copy operations in contiguous memory layouts, accelerating tasks like data subsampling by avoiding allocation overhead. Yet, in non-contiguous scenarios—such as after transpositions or irregular strides—access may incur cache misses, slowing traversal compared to sequential reads, with efficiency depending on the hardware's memory hierarchy and stride alignment. Optimizing stride values, such as ensuring they match element sizes, can mitigate these penalties in high-performance computing environments.[11]
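Contiguity can be inspected and, when sequential access matters, restored explicitly; the array shape here is arbitrary:

```python
import numpy as np

a = np.arange(10_000).reshape(100, 100)
print(a.flags['C_CONTIGUOUS'])        # True: row-major, sequential reads

t = a.T                               # transpose: zero-copy, strides swapped
print(t.flags['C_CONTIGUOUS'])        # False: traversal now jumps in memory
print(a.strides, t.strides)           # same buffer, reinterpreted metadata

# Materialize a contiguous copy once if repeated sequential scans follow.
tc = np.ascontiguousarray(t)
print(tc.flags['C_CONTIGUOUS'])       # True
```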
Historical Development
Early History
The conceptual origins of array slicing lie in the pre-1960s practices of mathematical subscripting for vectors and matrices within numerical analysis, where subscript notation enabled the indexing of multi-dimensional data structures to facilitate computations on subsets of elements. This approach was influenced by punch-card data processing systems, which organized tabular data into fixed formats resembling arrays, allowing selective extraction and manipulation to support batch-oriented scientific and engineering calculations on early computers like the IBM 701 and UNIVAC I.[13][14] In 1964, PL/I became the first programming language to introduce formal array slicing mechanisms, enabling flexible data subsets through cross-section specifications that addressed the need for efficient handling of strings and multi-dimensional arrays in systems programming. Developed by IBM in response to the limitations of specialized languages like FORTRAN for scientific tasks and COBOL for business applications, PL/I allowed programmers to specify array slices using asterisks to denote varying dimensions—for instance, A(I,*) to reference the entire ith row as a vector—thereby unifying array operations across domains and reducing the verbosity of explicit indexing in assembly-level code. This feature was motivated by the demands of multitasking environments and OS-level programming on IBM System/360 hardware, where dynamic data subsetting improved modularity and performance in resource-constrained batch processing.[15]
Fortran 66, standardized in 1966 by the American National Standards Institute, extended subscript expressions to include arithmetic operations, permitting indirect range selection for array elements via computed indices and supporting partial array input/output operations critical to scientific computing workflows. Driven by the growing complexity of numerical simulations in physics and engineering, where full array processing was inefficient on limited memory machines, these expressions allowed programmers to compute indices dynamically (e.g., via formulas like I + J*2), facilitating selective data access without redundant loops and aligning with the era's emphasis on optimized I/O for punch-card and tape-based systems.
Algol 68, finalized in 1968 as a successor to Algol 60, advanced array slicing through block-based mechanisms integrated with mode declarations for dynamic arrays, prioritizing type safety via strong static checking and orthogonality to enhance expressive power. Features like flexible array modes (e.g., flex [ ] real) and slicing syntax (e.g., a[2:3, *] for subarray extraction) were introduced to support runtime resizing and subset operations, motivated by the need to overcome fixed-size limitations in early assemblers and enable more abstract, error-resistant code for algorithmic development in academic and research settings. Key drivers included the shift toward dynamic memory allocation in multiprogramming systems, which reduced boilerplate looping for data traversal and promoted modular designs amid the transition from batch to interactive computing paradigms.[16]