
Shapefile

A shapefile is a geospatial vector data format developed by Esri in the early 1990s for storing the geometric location and associated attribute information of spatial features, such as points, lines, and polygons, in geographic information system (GIS) software. It was introduced alongside ArcView GIS version 2 to facilitate efficient data handling without topological relationships, enabling faster drawing, editing, and storage compared to more complex formats. The format is publicly documented as an open specification, first detailed in Esri's technical description, which has made it a de facto standard for GIS data exchange across Esri and non-Esri applications.

A shapefile comprises a collection of at least three mandatory files: the main file (.shp), which holds coordinates in a binary structure with a fixed 100-byte header followed by variable-length records; the shape index file (.shx), which provides offsets for rapid access to features; and the attribute file (.dbf), which stores tabular data linked to each geometric feature. Optional files, such as the projection file (.prj) for coordinate system definitions or the spatial index files (.sbn and .sbx), can enhance functionality but are not required for basic use. Shapefiles support the Point, PolyLine, Polygon, and MultiPoint types, their variants with Z (elevation) or M (measure) values, and MultiPatch, but each file is limited to a single type, and the format does not enforce topological integrity, such as shared edges between polygons.

Despite its widespread adoption, with over 160,000 instances in collections like those of the Library of Congress as of 2024, the format has notable limitations, including a 2 GB file size cap per component, limited attribute support, no handling of missing data beyond specific "no data" values, and no topological representation. These constraints have led to recommendations for migration to more modern formats like GeoPackage for long-term sustainability, though shapefiles remain prevalent due to their simplicity and broad compatibility.

Introduction

Definition and Purpose

A shapefile is an open-specification binary format developed by Esri for representing geospatial vector data, such as points, lines, and polygons. It serves as a vector data storage format for capturing the location, shape, and attributes of geographic features. Its primary purpose is to store geometric locations alongside associated attribute information, enabling mapping and analysis in geographic information systems (GIS).

A shapefile is a collection of multiple related files with specific extensions rather than a single file, which allows geometry and attributes to be handled separately. This structure supports simple feature geometries consistent with Open Geospatial Consortium (OGC) standards for nontopological vector data. Shapefiles maintain a one-to-one relationship between spatial shapes and their descriptive attributes, facilitating efficient data exchange without complex topological relationships.

Shapefiles are widely adopted in GIS applications for tasks such as environmental modeling, owing to their straightforward design and support across diverse software platforms. They handle vector data exclusively: discrete features modeled as points, lines, and polygons, in contrast to raster formats, which use a grid of cells to depict continuous surfaces.

History and Development

The shapefile format was developed by Esri in the early 1990s as a simple, nontopological vector data storage solution for geographic information systems (GIS). It was introduced with ArcView GIS version 2, Esri's desktop GIS software aimed at broadening access to GIS beyond specialized users. Designed initially for ArcView, the format combined geometry storage with attribute data in a dBASE-compatible structure, prioritizing ease of use, faster rendering, and reduced storage needs compared to earlier topological formats like those in ARC/INFO.

Esri released the public technical specification for the shapefile in July 1998 as a white paper, transitioning it from a proprietary internal format to a mostly open one that promoted interoperability across GIS tools. This openness facilitated its integration into Esri's next-generation platform, ArcGIS, launched with version 8.0 in December 1999, where the shapefile became a core supported format for data exchange. By the early 2000s, the format had achieved de facto standard status in open-source GIS ecosystems: the Geospatial Data Abstraction Library (GDAL), first released in 2000, offered shapefile read/write support early on, enabling shapefile handling in tools like QGIS, which launched in 2002 and relied on GDAL for vector operations. Publishing the specification encouraged widespread adoption while Esri retained control over the format, allowing it to influence data sharing without full open-source licensing.

With the release of ArcGIS 8.0 in 1999, enhancements included the introduction of .shp.xml metadata files, which provide structured descriptions of the spatial reference and dataset properties and address documentation gaps in the original design. Shapefile geometries also map onto the common types of the Open Geospatial Consortium (OGC) Simple Features specification, such as points, lines, and polygons, which supports basic spatial queries and operations.
Its simplicity also indirectly shaped later formats, such as GeoJSON (first specified in 2008), by establishing a baseline for encoding simple feature geometries and attributes in interoperable ways. As of 2025, the shapefile remains prevalent in GIS workflows despite its age, with ongoing support in modern GIS software and continued use by major data providers such as the U.S. Census Bureau for annual TIGER/Line releases. However, its limitations, such as the 2 GB cap and the lack of advanced features, have led to a gradual decline in favor of more flexible, standards-compliant alternatives like GeoPackage, though it endures as an interchange format in legacy systems and data archives.

Components

Required Files

A shapefile dataset requires three mandatory files to function as a complete vector data format: the main geometry file (.shp), the shape index file (.shx), and the attribute database file (.dbf). Together these files enable the storage and retrieval of geospatial features, including their geometries and associated descriptive data. Without all three, the dataset cannot be properly interpreted by GIS software and is treated as invalid or incomplete.

The .shp file is the core component. It stores the vector geometry for each feature in a series of records, using representations such as points, lines, or polygons to define the spatial locations and shapes of geographic entities. The .shx file acts as a positional index to the .shp file: it contains offsets that let software locate specific geometry records without scanning the entire .shp file, which supports efficient querying and rendering of spatial data. The .dbf file holds the attribute information for each feature in the dBASE III database format, where each record corresponds to a shape in the .shp file by sequential order. This linkage allows attributes such as names, populations, or classifications to be associated with their respective spatial elements.

All required files must share the same base filename (for example, "rivers.shp", "rivers.shx", and "rivers.dbf") and reside in the same directory. The .shp file employs a mixed byte order, with big-endian for file management fields (the file code and file length in the header, and the record number and content length in each record header) and little-endian for data fields (such as shape type and bounding box coordinates). The .shx file follows the same convention: its 100-byte header is laid out like the .shp header, and its index records are entirely big-endian.
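The three-file requirement can be checked before a dataset is handed to GIS software. The following is a minimal sketch, not part of any library: the helper name `missing_components` is hypothetical, and it assumes lowercase extensions on a case-sensitive file system.

```python
import tempfile
from pathlib import Path

# The three mandatory components named in this section.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")

def missing_components(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).is_file()]

# Demonstration in a scratch directory: rivers.shx is deliberately absent.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("rivers.shp", "rivers.dbf"):
        (Path(tmp) / name).write_bytes(b"")
    missing = missing_components(Path(tmp) / "rivers.shp")

print(missing)  # ['.shx']
```

An empty list means all three components share the base filename and directory. The sketch says nothing about whether their contents are consistent with one another.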

Optional Files

Shapefiles may include several optional files that enhance functionality, such as defining spatial references, handling character encodings, providing metadata, or improving query performance, without affecting the core structure of the required files.

The .prj file stores the coordinate reference system (CRS) information for the shapefile, typically in Well-Known Text (WKT) format, enabling accurate projection during mapping and analysis in GIS software. This file is recommended for all shapefiles to ensure consistent interpretation across different systems and to prevent misreading of spatial coordinates.

The .cpg file specifies the character encoding used in the associated .dbf attribute file, such as UTF-8 or an ANSI code page, to support international characters and non-Latin scripts in attribute data. It is particularly useful for datasets containing multilingual text, ensuring proper display and processing in diverse software environments.

Metadata can be stored in .shp.xml files in XML format. These document details such as the shapefile's origin, creation date, and descriptive attributes, which helps with validation, documentation, and integration with other tools.

The .sbn and .sbx files provide a spatial index that uses spatial binning to improve query performance on large datasets. The .qix file provides an alternative, quadtree-based spatial index that accelerates spatial queries by organizing geometries into hierarchical quadrants; it is commonly generated by open-source tools such as GDAL or MapServer.

These optional files share the same base filename as the core shapefile components (for example, example.prj for example.shp) to maintain the association. Their absence does not invalidate the dataset, though it may limit advanced features depending on the consuming software.
Including a .prj file is advisable for every shapefile, while .cpg, .shp.xml, .sbn, .sbx, and .qix are used according to data complexity, encoding needs, and query requirements in specific applications.
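As an illustration of how a reader might use the optional .cpg file, the sketch below picks a text encoding for the .dbf attributes. It is an assumption-laden example: the helper name `attribute_encoding` is hypothetical, the `latin-1` fallback is an arbitrary choice made for this sketch, and real .cpg files may carry labels that Python's codec registry does not recognize (those fall back to the default here).

```python
import codecs
import tempfile
from pathlib import Path

def attribute_encoding(shp_path, default="latin-1"):
    """Pick a Python codec name for the .dbf text, guided by a sibling .cpg file."""
    cpg_path = Path(shp_path).with_suffix(".cpg")
    if not cpg_path.is_file():
        return default  # no .cpg present: fall back to the assumed default
    label = cpg_path.read_text(encoding="ascii", errors="ignore").strip()
    try:
        return codecs.lookup(label).name  # normalise, e.g. "UTF-8" -> "utf-8"
    except LookupError:
        return default  # label unknown to Python's codec registry

# Demonstration in a scratch directory, before and after writing a .cpg file.
with tempfile.TemporaryDirectory() as tmp:
    shp = Path(tmp) / "places.shp"
    without_cpg = attribute_encoding(shp)
    shp.with_suffix(".cpg").write_text("UTF-8", encoding="ascii")
    with_cpg = attribute_encoding(shp)

print(without_cpg, with_cpg)  # latin-1 utf-8
```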

Formats

Geometry Format (.shp)

The .shp file contains the geometric data of the shapefile in a binary format that combines big-endian and little-endian byte orders for different elements. The file begins with a fixed 100-byte header that encodes the metadata needed to parse the rest of the structure. The header starts at byte 0 with a file code of 9994, stored as a 4-byte big-endian integer, followed by 20 bytes of unused space initialized to zero. Bytes 24 through 27 specify the total file length as a 4-byte big-endian integer, measured in 16-bit words (2 bytes each) and including the header itself. The version number, fixed at 1000 for the standard shapefile format, occupies bytes 28 through 31 as a 4-byte little-endian integer. Bytes 32 through 35 contain the shape type as a 4-byte little-endian integer, which defines the geometry type shared by all records in the file (for example, 1 for point shapes). Bytes 36 through 67 hold the file's bounding box as four 8-byte little-endian doubles (Xmin, Ymin, Xmax, Ymax) giving the overall spatial extent of all geometries; bytes 68 through 99 hold the Z and M extents (Zmin, Zmax, Mmin, Mmax) as four further doubles, which are set to zero if unused.

Following the header, the file consists of a sequence of variable-length records, each representing a single shape. Each record starts with an 8-byte header: bytes 0 through 3 hold the record number as a 4-byte big-endian integer (beginning at 1 and incrementing sequentially), and bytes 4 through 7 store the content length (excluding the record header) as a 4-byte big-endian integer in 16-bit words. The record's content immediately follows, beginning at offset 8 with a 4-byte little-endian integer that specifies the shape type, which must match the file header's shape type or be 0 (null). For null shapes (shape type 0), the record ends here with no additional data. Otherwise, the remaining variable-length binary data encodes the geometry.
Geometry encoding uses little-endian byte order for all coordinate and descriptive data, with coordinates represented as 64-bit IEEE double-precision floating-point values for high precision. A simple point geometry (shape type 1) consists solely of an X coordinate (8 bytes) followed by a Y coordinate (8 bytes). For more complex types like polylines (shape type 3) and polygons (shape type 5), the encoding is identical in structure: it begins with a per-record bounding box of four 8-byte little-endian doubles (Xmin, Ymin, Xmax, Ymax), followed by a 4-byte little-endian integer for the number of parts, a 4-byte little-endian integer for the total number of points, an array of 4-byte little-endian integers (one per part) serving as indices into the points array to delineate multi-part boundaries, and finally the array of points (each an X-Y pair of 8-byte doubles). This part-index mechanism enables support for multi-part features, such as disconnected polyline segments or polygons with interior rings (islands or holes). The file's overall bounding box is computed as the union of all individual record extents during creation. There is no explicit end-of-file marker; the total number of records and file termination are inferred from the header's length field.
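| Field | Bytes | Type | Endianness | Description |
|---|---|---|---|---|
| File Code | 0-3 | Integer | Big | Must be 9994 |
| Unused | 4-23 | - | - | 20 bytes of zeros |
| File Length | 24-27 | Integer | Big | Total length in 16-bit words |
| Version | 28-31 | Integer | Little | Must be 1000 |
| Shape Type | 32-35 | Integer | Little | Shape type for the file |
| Xmin | 36-43 | Double | Little | Minimum X coordinate |
| Ymin | 44-51 | Double | Little | Minimum Y coordinate |
| Xmax | 52-59 | Double | Little | Maximum X coordinate |
| Ymax | 60-67 | Double | Little | Maximum Y coordinate |
| Zmin, Zmax, Mmin, Mmax (optional) | 68-99 | Double | Little | If unused, set to 0.0 |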
This table outlines the .shp header structure for reference.
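The header layout can be decoded with Python's standard `struct` module. This is a minimal sketch under the byte offsets described in this section; the function name `parse_shp_header` and the synthetic sample header are illustrative assumptions, not part of any library.

```python
import struct

def parse_shp_header(header: bytes) -> dict:
    """Decode the fixed 100-byte header of a .shp (or .shx) file."""
    if len(header) < 100:
        raise ValueError("header must be at least 100 bytes")
    # File code and file length are big-endian; bytes 4-23 are unused.
    file_code, = struct.unpack(">i", header[0:4])
    file_length_words, = struct.unpack(">i", header[24:28])
    # Version, shape type, and the eight extent doubles are little-endian.
    version, shape_type = struct.unpack("<2i", header[28:36])
    xmin, ymin, xmax, ymax, zmin, zmax, mmin, mmax = struct.unpack(
        "<8d", header[36:100])
    if file_code != 9994:
        raise ValueError(f"unexpected file code {file_code}")
    return {
        "file_length_bytes": file_length_words * 2,  # 16-bit words -> bytes
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
        "z_range": (zmin, zmax),
        "m_range": (mmin, mmax),
    }

# Synthetic header for an empty Point shapefile (50 words = header only).
sample_header = (
    struct.pack(">i", 9994) + bytes(20) + struct.pack(">i", 50)
    + struct.pack("<2i", 1000, 1)
    + struct.pack("<8d", -10.0, -5.0, 10.0, 5.0, 0.0, 0.0, 0.0, 0.0)
)
info = parse_shp_header(sample_header)
print(info["shape_type"], info["file_length_bytes"], info["bbox"])
```

With a real file, the first 100 bytes read in binary mode would take the place of `sample_header`.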

Index Format (.shx)

The index file (.shx) serves as a positional companion to the main file (.shp), enabling efficient random access to individual records without a full sequential scan of the larger .shp file. It stores the offset and content length of each record in the .shp file, allowing software to jump directly to specific features during reading or rendering. This linear index is essential for performance with large datasets, as it supports quick lookups by record position, which corresponds to the order of attribute rows in the associated .dbf file.

The .shx file begins with a 100-byte header that mirrors the structure of the .shp header. Bytes 0–3 contain the file code 9994, bytes 4–23 are unused (set to zero), bytes 24–27 specify the total length of the .shx file in 16-bit words, bytes 28–31 give version 1000, and bytes 32–35 give the overall shape type (an integer from 0 to 31, such as 1 for points or 5 for polygons). Bytes 36–99 hold the bounding box fields as IEEE double-precision values, matching those in the .shp header; they describe the spatial extent of all features but play no role in indexing. The file length value covers the fixed 50 words of the header plus 4 words per index record.

Following the header, the .shx file contains one fixed-length 8-byte record for each record in the .shp file, so the two files have identical record counts. Each .shx record consists of two 4-byte big-endian integers. The first (bytes 0–3) is the offset, in 16-bit words from the beginning of the .shp file, to the start of the corresponding .shp record header; the first record's offset is 50, since it follows the 100-byte .shp header. The second (bytes 4–7) is the content length of that .shp record in 16-bit words, excluding the 8-byte .shp record header.
These offsets point precisely to the .shp record headers, each of which carries a record number and content length, keeping the two files synchronized. For the .shx file to function correctly, it must correspond exactly to the .shp file in record count, order, and content lengths; any addition, deletion, or modification of geometries in the .shp file requires rebuilding the .shx file to update the offsets and lengths. This positional alignment also links each .shx entry to the corresponding attribute row in the .dbf file by sequential order, supporting integrated access to spatial and tabular data. Unlike spatial indexing formats such as .sbn and .sbx, the .shx file provides no capability for querying by geographic location; it is limited to simple ordinal access.
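The word-based offsets convert to byte positions by doubling. The sketch below, which assumes the record layout described in this section and uses a hypothetical helper name `read_shx_entries`, turns the index records into byte offsets suitable for seeking in the .shp file.

```python
import struct

def read_shx_entries(shx_bytes: bytes):
    """Return (byte_offset, content_length_bytes) for each .shx index record."""
    body = shx_bytes[100:]  # skip the 100-byte header
    if len(body) % 8:
        raise ValueError("index body is not a whole number of 8-byte records")
    entries = []
    for pos in range(0, len(body), 8):
        # Both fields are big-endian and counted in 16-bit words.
        offset_words, length_words = struct.unpack(">2i", body[pos:pos + 8])
        entries.append((offset_words * 2, length_words * 2))
    return entries

# Synthetic index for two Point records. Each Point record has 10 words of
# content plus a 4-word record header, so the second record starts at
# word 50 + 4 + 10 = 64.
fake_shx = bytes(100) + struct.pack(">2i", 50, 10) + struct.pack(">2i", 64, 10)
entries = read_shx_entries(fake_shx)
print(entries)  # [(100, 20), (128, 20)]
```

To fetch the n-th shape, a reader would seek to `entries[n - 1][0]` in the .shp file and read 8 bytes of record header plus the stated content length.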

Attribute Format (.dbf)

The .dbf file in a shapefile stores tabular attribute data for each geometric feature in a format compatible with dBASE III database tables, ensuring a one-to-one correspondence between its records and the shapes in the accompanying .shp file. This structure associates descriptive attributes, such as names or population values, with spatial entities without embedding them in the geometry. The file consists of a fixed header, field descriptors, and records, all adhering to the dBASE III layout for compatibility with legacy database applications.

The file begins with a 32-byte header that provides essential metadata about the table. Byte 0 indicates the dBASE version, typically 0x03 for dBASE III without memo fields or 0x83 with memo support, though shapefiles generally avoid memo fields. Bytes 1 through 3 store the last update date (year minus 1900, month, and day, respectively). Bytes 4 to 7 contain the total number of records as a little-endian 32-bit integer, matching the number of shapes in the .shp file. Bytes 8 and 9 specify the header length (little-endian 16-bit), which covers the initial 32 bytes plus 32 bytes per field descriptor and a 1-byte terminator. Bytes 10 and 11 define the record length (little-endian 16-bit), the fixed size of each data row. The remaining bytes 12 to 31 are reserved or hold flags and are typically 0x00 in shapefiles; in later dBASE versions byte 14 marks an incomplete transaction, byte 15 marks encryption, and byte 29 holds a language driver ID (code page mark).

Following the header are the field descriptors, each 32 bytes long, which define the attribute columns; the list is terminated by a 0x0D byte. The first 11 bytes (0-10) hold the field name as an ASCII string, limited to 10 characters followed by a null terminator or padding. Byte 11 specifies the data type: 'C' for character strings, 'N' for numeric values, 'L' for logical (true/false), or 'D' for dates in YYYYMMDD format; floating-point numbers are also handled as 'N' type.
Bytes 12 to 15 provide the byte displacement of the field within each record (little-endian 32-bit; readers often ignore it and compute the displacement from the field lengths). Byte 16 sets the field length (1 to 255 bytes), and byte 17 gives the number of decimal places (0 to 15 for numerics). Bytes 18 to 31 are reserved and set to 0x00. Shapefiles support up to 255 fields, with field names limited to 10 characters to maintain dBASE III compatibility.

Data records follow immediately after the field descriptors, one per shape in positional order: the nth record in the .dbf file corresponds to the nth shape in the .shp file, so the two can be linked without additional keys. Each record is a fixed-length sequence matching the header's record length, starting with a 1-byte marker: 0x20 (space) for active records or 0x2A (asterisk) for deleted ones, which are skipped during processing but retained in the file. The fields then follow in order. Character fields are left-justified and space-padded; numeric fields are stored as ASCII text, right-justified with leading spaces; logical fields use a single byte containing 'T', 'F', or a space; date fields occupy 8 bytes in fixed YYYYMMDD format. The total record length is limited to 4,000 bytes in standard shapefile implementations to stay within dBASE constraints, and complex data types such as arrays or objects are not supported, which restricts attributes to simple scalar values. The file ends with a 0x1A (end-of-file) byte after the last record.

By default, text encoding follows the dBASE III convention of ASCII or OEM code pages. A shapefile may also include an optional .cpg companion file that names the encoding to use for international characters; when present, its value takes precedence.
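The header and field descriptors described above can be read with a few `struct` calls. The sketch below is illustrative only: `parse_dbf_schema` and `make_field` are hypothetical helpers, and the sample table (a 20-character NAME column and an 8-digit POP column) is synthetic.

```python
import struct

def parse_dbf_schema(data: bytes) -> dict:
    """Decode the 32-byte .dbf header and the field descriptors that follow."""
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack(
        "<4BIHH", data[:12])
    fields = []
    pos = 32
    while data[pos] != 0x0D:  # 0x0D terminates the descriptor list
        desc = data[pos:pos + 32]
        name = desc[:11].split(b"\x00", 1)[0].decode("ascii").strip()
        # (name, type code, field length, decimal places)
        fields.append((name, chr(desc[11]), desc[16], desc[17]))
        pos += 32
    return {
        "version": version,
        "last_update": (1900 + yy, mm, dd),
        "record_count": n_records,
        "header_length": header_len,
        "record_length": record_len,
        "fields": fields,
    }

def make_field(name: str, ftype: str, length: int, decimals: int = 0) -> bytes:
    """Build one synthetic 32-byte field descriptor for the demonstration."""
    return (name.encode("ascii").ljust(11, b"\x00") + ftype.encode("ascii")
            + bytes(4) + bytes([length, decimals]) + bytes(14))

# Header length = 32 + 2 * 32 + 1 = 97; record length = 1 + 20 + 8 = 29.
sample = (struct.pack("<4BIHH", 0x03, 124, 5, 9, 2, 97, 29) + bytes(20)
          + make_field("NAME", "C", 20) + make_field("POP", "N", 8)
          + b"\x0D")
schema = parse_dbf_schema(sample)
print(schema["fields"])
```

A simple consistency check is that `header_length` equals 32 + 32 × (number of fields) + 1 and that `record_length` equals 1 plus the sum of the field lengths, as in the sample.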

Spatial Index Format (.sbn and .sbx)

The .sbn and .sbx files constitute an optional spatial indexing mechanism for shapefiles, enabling faster retrieval of features by geographic location during queries. These files implement an R-tree data structure, which organizes the minimum bounding rectangles (MBRs) of the geometries stored in the corresponding .shp file into a balanced hierarchy of nodes. This approach reduces the number of features examined in spatial operations on points, lines, and polygons, which is particularly beneficial for datasets with many thousands of records.

The .sbn file holds the core index data in a binary format with variable-length records representing internal nodes and leaf nodes. Each node holds MBRs that approximate the extent of its child nodes or of individual features, along with pointers to facilitate traversal. Leaf nodes reference specific shapefile records by record number, so the index can guide searches without loading the full geometry data. The R-tree design gives logarithmic-time query performance by pruning irrelevant branches early, though it permits some overlap between MBRs to keep the tree balanced during insertions. These indexes are generated by Esri's software during shapefile creation or optimization, and although they are not always present, they significantly improve rendering and analysis speed in compatible tools.

Complementing the .sbn file, the .sbx file serves as a fixed-length index, akin to the .shx file used for random access in the .shp file, mapping record numbers to byte offsets and content lengths within the .sbn file. This pairing allows efficient random access to nodes. Compatibility is limited to shapefiles at version 1000 or higher, where the spatial extent is initially partitioned into bins to seed the index construction and promote an even distribution across tree levels. Open-source libraries like GDAL can read these indexes to exploit their acceleration benefits, though creating them remains proprietary to Esri tools.
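The pruning idea behind a spatial index rests on a cheap bounding-rectangle test that runs before any geometry is decoded. The sketch below shows only that primitive as a linear scan. It is not the .sbn file layout or an R-tree implementation, and the function names are hypothetical.

```python
def bounding_box(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a point list."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def boxes_intersect(a, b):
    """True if two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def candidate_records(feature_boxes, query_box):
    """Record numbers (starting at 1) whose MBR meets the query window."""
    return [n for n, box in enumerate(feature_boxes, start=1)
            if boxes_intersect(box, query_box)]

features = [
    [(0, 0), (2, 2)],
    [(10, 10), (12, 15)],
    [(1, 1), (5, 1), (5, 4)],
]
boxes = [bounding_box(pts) for pts in features]
hits = candidate_records(boxes, (4, 0, 6, 3))
print(hits)  # [3]
```

A tree-structured index applies the same test to node rectangles, so whole groups of features can be rejected at once instead of one by one.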

Shape Types and Records

Supported Geometry Types

The shapefile format defines a set of geometry types to represent spatial features, each identified by a unique integer code stored in the file header and at the start of each record. These types cover simple points, linear features, polygonal areas, and multi-part collections, with extensions for elevation (Z) values and linear measures (M). All non-null geometries within a single shapefile must share the same type. The specification defines 14 shape type codes, including the Null shape, and reserves the remaining codes for future use. The following table lists the supported shape types, their codes, and their basic composition:
| Code | Type | Description and Composition |
|---|---|---|
| 0 | Null | No geometric content; serves as a placeholder record with no coordinates. |
| 1 | Point | A single 2D point defined by X and Y double-precision coordinates. |
| 3 | Polyline | One or more parts, where each part is an array of connected 2D points (doubles for X,Y); represents open linear features. |
| 5 | Polygon | One or more closed rings, each an array of 2D points (at least four per ring, first and last identical); represents areal features. |
| 8 | MultiPoint | A collection of non-connected 2D points within a bounding box, stored as an array of X,Y doubles. |
| 11 | PointZ | A single 3D point with X,Y,Z doubles; optional M value follows. |
| 13 | PolylineZ | Polyline with Z-enabled points (X,Y,Z doubles per point); includes Z range and optional M range/array. |
| 15 | PolygonZ | Polygon with Z-enabled points; includes Z range and optional M range/array per ring. |
| 18 | MultiPointZ | MultiPoint with Z-enabled points; includes Z range and optional M range/array. |
| 21 | PointM | A single 2D point with an associated M double-precision measure. |
| 23 | PolylineM | Polyline with M values per point or segment; includes M range and array. |
| 25 | PolygonM | Polygon with M values; includes M range and array per ring. |
| 28 | MultiPointM | MultiPoint with M values per point; includes M range and array. |
| 31 | MultiPatch | A complex 3D surface composed of patches (e.g., triangle strips, fans, rings) using X,Y,Z coordinates; supports optional M and represents volumetric objects like buildings. |
Point types consist of straightforward coordinate tuples, while polyline and polygon types use integer arrays to define the number and offsets of parts or rings, followed by double arrays for the points themselves. MultiPoint types aggregate independent points without connectivity. The Z variants add a Z range and Z array for elevation, and the M variants add measure data for attributes such as distance along a path; Z types can optionally include M arrays, providing combined ZM support without separate codes. Curves, splines, and true surfaces are not supported; everything is built from linear segments, including the patches of a MultiPatch.

For polygons and their Z/M variants, rings must be closed and non-self-intersecting, with outer rings oriented clockwise and interior (hole) rings counterclockwise to distinguish boundaries. The even-odd rule determines inclusion of areas between overlapping rings. These conventions help ensure consistent rendering and interpretation in GIS applications.

The core 2D types, the Z and M variants, and MultiPatch are all defined in the same published specification. The Z and M variants accommodate elevation and measure data, and MultiPatch serves as an advanced type for 3D feature representation, while the reserved codes leave room for later additions.
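Ring orientation can be tested with the shoelace formula: in ordinary X-right, Y-up coordinates, a negative signed area means the vertices run clockwise. The sketch below applies that to the outer-ring and hole convention described above; the function names are illustrative.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0

def classify_ring(ring):
    """Label a closed ring as 'outer' (clockwise) or 'hole' (counterclockwise)."""
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("a ring needs at least four points and must be closed")
    area = signed_area(ring)
    if area == 0:
        raise ValueError("degenerate ring with zero area")
    return "outer" if area < 0 else "hole"

outer = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]   # clockwise
hole = [(2, 2), (4, 2), (4, 4), (2, 4), (2, 2)]        # counterclockwise
print(classify_ring(outer), signed_area(outer))  # outer -100.0
print(classify_ring(hole), signed_area(hole))    # hole 4.0
```

Reversing a ring's vertex order flips the sign of the area, which is how a writer can normalize rings that arrive in the wrong orientation.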

Record Contents and Encoding

Each shapefile record in the .shp file begins with an 8-byte header consisting of a 4-byte record number (starting from 1) in big-endian byte order, followed by a 4-byte content length in big-endian byte order. The length is measured in 2-byte words, so the byte length of the content is twice the stated value. The content immediately follows the header and starts with a 4-byte little-endian integer giving the shape type, which determines the structure of the remaining data; all subsequent fields, including coordinates and counts, are also little-endian. Because lengths are expressed in 16-bit words, every record occupies an even number of bytes.

Basic geometric primitives, such as points, are encoded starting at byte 4 of the content (after the shape type): a point consists of two 8-byte double-precision floating-point values for the X and Y coordinates. For more complex types like polylines and polygons, the content includes a 32-byte bounding box (four doubles: Xmin, Ymin, Xmax, Ymax), followed by a 4-byte integer for the number of parts (for example, rings or line segments), a 4-byte integer for the total number of points, and an array of 4-byte integers (one per part) giving each part's starting index in the points array. The points themselves follow as an array of double pairs. Polylines may have open ends, whereas polygons require closed rings (first and last point identical), with outer rings oriented clockwise and interior rings (holes) counterclockwise. Null shapes, indicated by shape type 0, have no geometric content beyond the 4-byte type identifier, giving a content length of 2 words (4 bytes).
Optional Z (elevation) and M (measure) values extend the supported types into 3D or measured variants. For Z-enabled multipoint, polyline, and polygon shapes, a 16-byte Z range (Zmin and Zmax as two doubles) and an array of Z doubles, one per point, follow the XY points. M values, used for linear referencing along features (for example, distance or time), are appended in the same way as a 16-byte M range followed by an array of M doubles; any M value below -10^38 is treated as "no data".

When reading records, software uses the content length from the record header to advance the file pointer, which allows unwanted records to be skipped without full decoding. The .shx index file provides offsets, measured in 16-bit words from the start of the .shp file, to support random access. Throughout, the binary encoding uses IEEE 754 for doubles and two's complement for integers, with big-endian used only for the file code, the file length, record numbers, and content lengths.
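To make the part-index mechanism concrete, the following sketch decodes a single 2D PolyLine or Polygon record, including its 8-byte record header. It assumes the layout described in this section and handles neither Z nor M variants; `parse_poly_record` and the synthetic two-part polyline are illustrative.

```python
import struct

def parse_poly_record(record: bytes) -> dict:
    """Decode one PolyLine (3) or Polygon (5) record, record header included."""
    rec_no, content_words = struct.unpack(">2i", record[:8])  # big-endian
    content = record[8:8 + content_words * 2]
    shape_type, = struct.unpack("<i", content[:4])
    if shape_type not in (3, 5):
        raise ValueError(f"unsupported shape type {shape_type}")
    bbox = struct.unpack("<4d", content[4:36])
    num_parts, num_points = struct.unpack("<2i", content[36:44])
    parts_end = 44 + 4 * num_parts
    part_starts = struct.unpack(f"<{num_parts}i", content[44:parts_end])
    flat = struct.unpack(f"<{2 * num_points}d",
                         content[parts_end:parts_end + 16 * num_points])
    points = list(zip(flat[0::2], flat[1::2]))
    # Each part runs from its start index to the next part's start index.
    bounds = list(part_starts) + [num_points]
    parts = [points[bounds[i]:bounds[i + 1]] for i in range(num_parts)]
    return {"record_number": rec_no, "shape_type": shape_type,
            "bbox": bbox, "parts": parts}

# Synthetic two-part polyline: 2 points in part 0, 3 points in part 1.
pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 2.0), (4.0, 2.0)]
content = (struct.pack("<i", 3)
           + struct.pack("<4d", 0.0, 0.0, 4.0, 2.0)
           + struct.pack("<2i", 2, 5)
           + struct.pack("<2i", 0, 2)
           + b"".join(struct.pack("<2d", x, y) for x, y in pts))
record = struct.pack(">2i", 1, len(content) // 2) + content
shape = parse_poly_record(record)
print(len(content), [len(p) for p in shape["parts"]])  # 132 [2, 3]
```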

Limitations

Storage and Size Constraints

The shapefile format imposes strict limits on file sizes because it stores lengths and offsets in 32-bit fields, interpreted as counts of 16-bit words. Specifically, the .shp file has a practical maximum size of 2 GB, as enforced by Esri software for compatibility, although the file length field, a 32-bit signed integer counting 16-bit words, theoretically allows up to approximately 4 GB. Similarly, the .dbf attribute file is constrained to a maximum size of 2 GB. These limits stem from the format's design in the 1990s, when 2 GB was a common boundary for file systems and addressing.

In terms of records, the format theoretically supports up to 2^31 records (about 2.1 billion), limited by the 32-bit record numbering and offset fields in the .shp and .shx files. In practice the overall file size cap allows far fewer; for example, a shapefile of simple point features reaches the 2 GB limit at on the order of 70 million records, assuming minimal attribute data. Additionally, individual attribute fields in the .dbf file are restricted to a maximum length of 254 characters for character types, with a total of 255 fields permitted per table.

Coordinate precision is determined by the use of 64-bit IEEE double-precision floating-point numbers for X and Y values, which offer approximately 15-16 significant digits. Absolute precision scales with coordinate magnitude because floating-point representation has a fixed relative error, but it remains sufficient (sub-millimeter) at continental or global scales in standard GIS projections.

Some software libraries, such as GDAL/OGR, can exceed the 2 GB limit by not enforcing it (GDAL's 2GB_LIMIT layer creation option controls whether the limit is enforced), enabling files up to approximately 4 GB in theory, though this sacrifices interoperability with strict implementations like Esri's software.
Esri does not officially support shapefiles larger than 2 GB and instead recommends alternatives like the File Geodatabase format, which removes these size restrictions. A common workaround is to partition large datasets into multiple smaller shapefiles by geographic region or theme. These limitations make shapefiles unsuitable for very large datasets, such as national-scale point clouds exceeding hundreds of millions of points, where file splitting becomes necessary to maintain usability and avoid corruption risks near the size thresholds.
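The record-count ceiling for point data follows from simple arithmetic on the sizes given in this article: a 100-byte file header, an 8-byte record header, and 20 bytes of Point content (a 4-byte shape type plus two 8-byte doubles). The sketch below works through it, assuming "2 GB" means 2^31 bytes and ignoring the .dbf file entirely; the constant and function names are illustrative.

```python
HEADER_BYTES = 100            # fixed .shp file header
RECORD_HEADER_BYTES = 8       # record number + content length
POINT_CONTENT_BYTES = 4 + 8 + 8   # shape type + X + Y
SIZE_LIMIT_BYTES = 2 ** 31    # assumption: "2 GB" read as 2^31 bytes

def max_point_records(limit_bytes=SIZE_LIMIT_BYTES):
    """Largest number of Point records that fits in a .shp under the limit."""
    per_record = RECORD_HEADER_BYTES + POINT_CONTENT_BYTES  # 28 bytes
    return (limit_bytes - HEADER_BYTES) // per_record

def max_file_bytes_from_length_field():
    """Largest size a signed 32-bit count of 16-bit words can express."""
    return (2 ** 31 - 1) * 2

n = max_point_records()
print(f"{n:,} point records fit below the limit")
print(f"{max_file_bytes_from_length_field() / 2**30:.2f} GiB "
      "addressable by the length field")
```

Under these assumptions the .shp file alone allows just under 77 million points, the same order of magnitude as the figure quoted above. Attribute data in the .dbf file, which has its own 2 GB ceiling, can lower the practical count further.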

Topology and Multi-Type Issues

Shapefiles are inherently nontopological data structures, meaning they do not explicitly store or maintain relationships between features, such as shared edges or nodes among adjacent polygons. Instead, geometries are stored as independent collections of coordinates, so vertices along common boundaries are duplicated across features. This lack of topology enforcement can result in gaps, overlaps, or slivers during data creation or editing, because there is no automatic validation or correction of spatial integrity, unlike topological formats such as ARC/INFO coverages, which enforce planarity and adjacency rules.

When shapefiles are processed for topological analysis, such as identifying shared boundaries or ensuring space-filling coverage, software must compute intersections on the fly. This adds computational overhead and can introduce errors if geometries are not clean, for instance polygons with self-intersections or incorrect ring orientations (outer rings clockwise, interior rings counterclockwise). Editing shapefiles with nontopological methods may further degrade planar relationships, potentially skewing spatial queries or overlay operations that assume topological consistency.

Regarding multi-type issues, all non-null shapes within a single shapefile must have the same geometry type, as specified in the file header; mixing types such as points, polylines, and polygons in one shapefile is not supported. This limitation stems from the format's design, in which the shape type field (for example, 1 for Point, 3 for PolyLine, 5 for Polygon) defines a uniform structure for all records and so rules out the heterogeneous collections that complex datasets may need. Although the specification notes that future versions may allow mixed types, current implementations enforce homogeneity, so users often have to split data into separate files or convert to alternative formats like GeoPackage for multi-geometry support.
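Because each shapefile holds one geometry type, a mixed collection has to be partitioned before export. The sketch below groups in-memory features by shape type code so that each group could be written to its own shapefile. The feature dictionaries and the helper name are hypothetical, and no files are written.

```python
from collections import defaultdict

# Shape type codes for the basic 2D types, as listed in this article.
SHAPE_TYPE_CODES = {"Point": 1, "PolyLine": 3, "Polygon": 5, "MultiPoint": 8}

def group_by_shape_type(features):
    """Partition features so that each group has a single shape type code."""
    groups = defaultdict(list)
    for feature in features:
        code = SHAPE_TYPE_CODES[feature["type"]]
        groups[code].append(feature)
    return dict(groups)

features = [
    {"type": "Point", "coords": (1.0, 2.0), "name": "well"},
    {"type": "Polygon", "coords": [(0, 0), (0, 1), (1, 1), (0, 0)],
     "name": "parcel"},
    {"type": "Point", "coords": (3.0, 4.0), "name": "spring"},
]
groups = group_by_shape_type(features)
for code, members in sorted(groups.items()):
    print(code, [f["name"] for f in members])
```

Each resulting group maps to one output dataset, for example a points shapefile and a polygons shapefile with matching attribute schemas.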
