Simple Features
Simple Features, officially known as Simple Feature Access, is an Open Geospatial Consortium (OGC) and International Organization for Standardization (ISO) standard that defines a common architecture for the storage, access, and manipulation of simple two-dimensional geometric features and their associated attributes in geographic information systems (GIS).[1][2] It specifies a set of basic geometry types (Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection) along with spatial reference systems to enable interoperability across software applications, databases, and file formats for vector-based spatial data.[1][3]
Originally approved as OGC's first implementation standard in 1997, Simple Features emerged from early efforts to standardize geospatial data handling amid growing needs for compatible GIS technologies. It was formalized internationally through ISO 19125 in 2004, divided into Part 1 (Common Architecture), which outlines the conceptual model and terminology, and Part 2 (SQL Option), which provides an SQL schema for database storage, retrieval, querying, and updating of feature collections.[2][5] This dual structure ensures compatibility with relational database systems via standards like SQL/MM Part 3 (Spatial).[3]
The standard's key strength lies in its focus on simplicity and portability: the core model restricts geometries to linear interpolation and excludes complex curves and 3D representations, though extensions for measured geometries and ongoing updates address advanced needs like 3D coordinates.[1] It is widely implemented in open-source and commercial GIS tools, such as PostGIS for PostgreSQL and libraries like GEOS, and underpins modern spatial data exchange in applications ranging from mapping software to urban planning systems.[1]
As of 2025, the ISO 19125 working group continues revisions to align with evolving geospatial requirements, including enhanced support for dynamic and multi-dimensional data.[1] In November 2025, the OGC proposed restructuring the standard into multiple parts for improved modularity, with comments open until December 2025.[4]
History and Development
Origins
In the early 1990s, the geographic information systems (GIS) sector grappled with fragmentation caused by proprietary data formats and software architectures from leading vendors, including ESRI's ArcInfo and Intergraph's MGE, which restricted data sharing and integration across platforms.[6] This lack of interoperability impeded the broader adoption of GIS in enterprise environments, as organizations struggled to exchange vector-based geographic features without custom conversions or vendor-specific tools.[7]
To address these challenges, the Open GIS Consortium (later renamed the Open Geospatial Consortium, or OGC, in 2004) was established in 1994 as a voluntary organization dedicated to fostering open standards for geospatial data and services.[7] In 1996, the OGC issued a Request for Technology (RT) under its Open Geodata Model Working Group, soliciting proposals from industry stakeholders to define a common abstract model for simple vector geometries.[6] Submissions from key players, including ESRI, Intergraph, MapInfo, and others, culminated in the approval of the first Simple Features specification in August 1997, marking OGC's inaugural standard and establishing a foundational framework for 2D vector geometry representation.[7][6]
The primary motivations for Simple Features were to standardize the storage, retrieval, and manipulation of geographic features, such as points, lines, and polygons, in relational databases and applications, thereby promoting vendor-neutral interoperability and reducing dependency on proprietary systems.[6] This approach enabled seamless data exchange in corporate information infrastructures, aligning GIS with mainstream computing paradigms like SQL-based databases while minimizing disruptions to existing vendor implementations.[6] From the outset, the specification emphasized planar, non-curved geometries with basic topological relationships, deliberately limiting scope to essential 2D vector elements to ensure rapid adoption and practicality in diverse GIS workflows.
Standardization Process
Following the initial 1997 approval, the Open Geospatial Consortium (OGC) released the Simple Features Specification for SQL, Revision 1.1 (OGC 99-049), in 1999. This document defined an abstract model for representing and accessing geospatial features and their associated geometries, emphasizing interoperability across software systems and databases.[8]
In 2004, the International Organization for Standardization (ISO) adopted and formalized this model through ISO 19125-1: Geographic information — Simple feature access — Part 1: Common architecture. This standard established a unified framework for simple feature geometry, including the Dimensionally Extended Nine-Intersection Model (DE-9IM) for spatial relationships and integration with spatial referencing systems as defined in ISO 19111. Concurrently, ISO 19125-2: Geographic information — Simple feature access — Part 2: SQL option was published, specifying a SQL schema for storing, retrieving, querying, and updating simple geographic feature collections in relational databases.[2][5]
The OGC refined the standard in 2011 with the release of OpenGIS Simple Features Access, Part 1: Common Architecture.[9] This update superseded prior versions, incorporated enhancements to the geometry model while maintaining compatibility with ISO 19125, and addressed implementation details for distributed computing environments, which facilitated broader adoption. Although the core Simple Features model focuses on linear geometries, revisions aligned with broader ISO spatial schemas (such as ISO 19107) enabled support for curved geometries like CircularString in extended implementations.[9]
In the 2020s, the ISO/OGC Simple Features Standards Working Group (SWG) has been actively revising ISO 19125 to enhance interoperability. As of November 2025, efforts include a proposal to restructure the standard into a multipart format, with parts covering common architecture, Well-Known Text (WKT), and Well-Known Binary (WKB) representations, to improve modularity and alignment with modern geospatial technologies while preserving backward compatibility.[4]
Simple Features maintains a close relationship with ISO/IEC 13249-3 (SQL/MM Part 3: Spatial), an international standard for spatial extensions in SQL databases that extends the Simple Features model with additional geometry subtypes, routines, and functions for advanced spatial data management.
Core Model
Feature Definition
In the Simple Features standard, a feature represents an abstraction of real-world phenomena, restricted to two-dimensional geometry with linear interpolation between vertices, and consisting of both spatial and non-spatial attributes.[2] This combination forms a tuple-like structure where the geometry provides the locational aspect, while attributes capture descriptive properties such as identifiers, measurements, or categorical data.[10] The model emphasizes interoperability by standardizing how these elements integrate to model geographic entities like buildings, roads, or natural features.
The abstract feature model establishes a root class, typically denoted as Feature, which serves as the base for all feature types in a hierarchy.[11] This root class includes a mandatory geometry attribute of type Geometry, enabling the association of spatial information with non-spatial properties defined by the feature type. Feature attributes are characteristics with specified names, data types (e.g., integer, string, or real), and value domains, allowing for extensible schemas tailored to application needs.[11] Features support both single geometries and multi-geometries through the Geometry type, which can be a primitive object or a GeometryCollection aggregating multiple elements, all sharing the same spatial context.
A prerequisite for interpreting feature geometries is their association with a spatial reference system (SRS), which defines the coordinate space and ensures consistent positioning relative to a datum or projection.[11] Without an SRS, geometries lack unambiguous real-world reference, rendering spatial operations unreliable. For instance, a road feature might employ a LineString geometry to delineate its path, paired with attributes such as name (string), length (real), and surface type (enumerated), forming a complete representation for mapping or analysis.[10]
Geometry Model
The Simple Features geometry model defines an abstract hierarchy for representing spatial objects in geographic information systems, with Geometry serving as the root class.[12] This hierarchy includes instantiable subclasses restricted to zero-, one-, and two-dimensional geometric objects embedded in two-, three-, or four-dimensional coordinate space.[12] The primary primitive subclasses are Point (0D), Curve (1D), and Surface (2D), while aggregates such as MultiPoint, MultiCurve, MultiSurface, and GeometryCollection allow composition of multiple elements.[12] Each geometry object is associated with a spatial reference system, ensuring consistent positioning and measurement across datasets.[12]
Topological invariants form the core of the model, emphasizing the structure and relationships within and between geometries.[12] Dimension is a key invariant, classifying objects as 0D (points with no extent), 1D (curves with length but no area), or 2D (surfaces with area).[12] The model defines each geometry in terms of three topological components: the interior (the main body), the boundary (the perimeter or edges), and the exterior (the infinite surrounding space).[12] Geometries are topologically closed, meaning they include their boundary, which supports consistent computation of relationships; for example, the boundary of a closed Curve is empty, while aggregates like MultiCurve apply a mod-2 union rule for boundaries.[13] Empty geometries represent null or absent spatial extent, such as an empty Point or the boundary of a MultiPoint, which has no elements.[12]
Validity rules ensure geometries are "simple," prohibiting self-intersections except at endpoints for curves and requiring connected interiors for surfaces without cut lines, spikes, or punctures.[12] For aggregates like MultiPolygon, interiors must not overlap, though boundaries may touch at discrete points.[13] These rules maintain topological integrity.
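The mod-2 boundary rule mentioned above can be made concrete with a small sketch. The function name `multilinestring_boundary` and the representation of a LineString as a list of `(x, y)` tuples are assumptions made for illustration only; they are not part of the standard or of any particular library.

```python
from collections import Counter


def multilinestring_boundary(lines):
    """Return the boundary points of a MultiLineString under the mod-2 rule.

    `lines` is a list of LineStrings, each a list of (x, y) tuples
    (a hypothetical representation chosen for this sketch).
    A point belongs to the boundary when it is an endpoint of an odd
    number of member LineStrings.
    """
    counts = Counter()
    for line in lines:
        if len(line) < 2:
            raise ValueError("a LineString needs at least two points")
        start, end = line[0], line[-1]
        if start == end:
            # A closed LineString has an empty boundary.
            continue
        counts[start] += 1
        counts[end] += 1
    return {point for point, n in counts.items() if n % 2 == 1}


# Two segments joined end to end: the shared endpoint cancels out.
chain = [[(0, 0), (1, 1)], [(1, 1), (2, 0)]]
print(multilinestring_boundary(chain))

# Three segments meeting at (1, 1): that point is used three times, so it stays.
star = chain + [[(1, 1), (1, 5)]]
print(multilinestring_boundary(star))
```

Counting endpoints and keeping the odd ones is all the rule requires, which is why a closed ring, or two lines meeting end to end, contributes no boundary at the shared point.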
The model's emphasis on interior, boundary, and exterior components provides the foundational primitives for spatial predicates defined in the Dimensionally Extended Nine-Intersection Model (DE-9IM), enabling queries like intersection or containment.[12]
Geometry Types
Primitive Types
The primitive types in the Simple Features specification form the foundational geometric objects for representing spatial data in a vector model, consisting of atomic, non-composite elements that capture basic locations, paths, and areas without aggregation.[9] These types adhere to a hierarchical geometry model where each primitive is defined by coordinates in a 2D or optionally 3D space, ensuring interoperability across geospatial systems.[9] They are designed for simplicity, with linear interpolation between points, and serve as building blocks for more complex structures like multi-geometries.[9]
The Point is a zero-dimensional primitive representing a single location, defined by a pair of coordinates (x, y), optionally extended with z for altitude and m for measures like time or distance.[9] It has no boundary and is used to model discrete positions, such as the coordinates of cities or landmarks.[9] Validity for a Point requires only that its coordinates are provided, with no simplicity rules beyond the absence of anomalous values.[9]
The LineString is a one-dimensional primitive consisting of a sequence of at least two Points connected by straight-line segments, forming a continuous path with linear interpolation between vertices.[9] It models linear features like roads or rivers, where the sequence defines the direction from start to end Point.[9] For validity, a LineString must be simple, meaning it does not intersect itself except possibly at endpoints if closed (where the first and last Points coincide), and it should avoid duplicate consecutive Points.[9] A closed LineString with these properties constitutes a LinearRing, which serves as the boundary for areal primitives.[9]
The Polygon is a two-dimensional primitive representing a bounded planar surface, composed of one exterior LinearRing and zero or more interior LinearRings (holes) that do not cross each other or the exterior.[9] It is used for features like land parcels or lakes, where the exterior ring encloses the area and interior rings define exclusions.[9] Orientation follows the right-hand rule: the exterior ring is traversed counterclockwise when viewed from above, while interior rings are clockwise, ensuring consistent topology.[9] Validity requires no self-intersections in any ring, closure of all rings (first and last Points identical), and that interior rings lie within the exterior without crossing it; rings may touch at a single point, provided the interior of the Polygon remains connected.[9]
Extensions in the 2011 revision of the specification introduced abstract supertypes and specialized primitives to support non-linear geometries while maintaining compatibility with core types.[9] The Curve is an abstract one-dimensional type generalizing LineString, allowing for continuous paths that may include curved segments, with simplicity defined by no self-intersections except at endpoints for closed instances.[9] Similarly, the Surface is an abstract two-dimensional type generalizing Polygon, comprising one exterior boundary and optional interior boundaries formed by closed Curves, requiring non-intersecting interiors for validity.[9] A concrete extension is the CircularString, a Curve subtype defined by a sequence of at least three Points that form circular arc segments between them, enabling representation of rounded features like arcs without linear approximation.[9] These extensions preserve the validity principles of no self-intersections and proper closure for rings.[9]
Aggregate and Collection Types
In the Simple Features model, aggregate types extend the primitive geometry types by combining multiple instances into cohesive structures, enabling the representation of complex spatial features without imposing strict topological relationships beyond basic simplicity rules.[9] These aggregates serve as building blocks for modeling real-world phenomena that consist of multiple disconnected or loosely related components, such as clusters of points or disjoint areas.[9]
MultiPoint is a zero-dimensional aggregate consisting of a collection of Point instances, where the points are neither connected nor ordered.[9] It forms a specialized GeometryCollection restricted to points, with no topological constraints between members other than the requirement for simplicity, meaning no two points coincide.[9] The boundary of a MultiPoint is an empty set, as it lacks edges.[9] This type is commonly used to represent discrete locations, such as a set of cities or sensor positions, allowing efficient storage and querying of multiple independent points as a single feature.[9]
MultiLineString aggregates multiple LineString primitives into a one-dimensional collection, modeling sets of linear features that may touch at endpoints but whose interiors do not overlap.[9] For simplicity, each LineString must be simple, and any intersections between them occur only at boundary points, with the overall boundary computed as the symmetric difference (mod 2 rule) of the individual boundaries.[9] This structure supports representations like road networks or river systems, where individual segments form a route without requiring connectivity.[9]
MultiPolygon combines multiple Polygon instances into a two-dimensional aggregate, suitable for areas composed of disjoint components with non-intersecting interiors.[9] Boundaries of the polygons may touch only at a finite number of points, not along shared edges, ensuring the collection is topologically closed, and simplicity requires that no anomalous overlaps or self-intersections occur within or between polygons.[9] Typical applications include archipelagos or fragmented land parcels, where each polygon represents a separate but related area, facilitating unified management in geographic information systems.[9]
GeometryCollection provides a heterogeneous and recursive container for mixing primitive and aggregate geometries, such as Points, LineStrings, and Polygons, without dimensional restrictions.[9] All elements must share the same spatial reference system, but no inherent topology is enforced beyond the simplicity of individual components; it can nest other collections.[9] This type is ideal for complex features requiring diverse geometric elements, like urban planning datasets combining buildings (Polygons), roads (LineStrings), and landmarks (Points).[9]
Spatial Operations and Queries
DE-9IM Model
The Dimensionally Extended 9-Intersection Model (DE-9IM) is a topological framework used in the Simple Features specification to define and query spatial relationships between two geometric objects by examining the intersections of their interiors, boundaries, and exteriors. It builds on the original 9-Intersection Model (9IM) by incorporating dimension information to handle relationships in planar space more precisely, allowing for distinctions based on whether the intersection consists of points, lines, or areas. This model is essential for implementing robust spatial queries that capture topological invariants, independent of coordinate transformations or rotations.[14]
The core of DE-9IM is a 3x3 matrix that represents the pairwise intersections between the topological components of two geometries, a and b. Each geometry is decomposed into its interior (I), boundary (B), and exterior (E), where the interior is the set of points strictly inside the geometry, the boundary is the set of points separating the interior from the exterior, and the exterior encompasses all other space. Each matrix entry records the dimension of the corresponding intersection: 0 for point-like (0-dimensional), 1 for line-like (1-dimensional), 2 for area-like (2-dimensional), and F for false (empty intersection). Patterns that are matched against a matrix additionally use T for any non-empty intersection (0, 1, or 2) and * for "don't care" (any dimension or empty). The structure of the matrix is as follows:
| | I(b) | B(b) | E(b) |
|---|---|---|---|
| I(a) | I(a) ∩ I(b) | I(a) ∩ B(b) | I(a) ∩ E(b) |
| B(a) | B(a) ∩ I(b) | B(a) ∩ B(b) | B(a) ∩ E(b) |
| E(a) | E(a) ∩ I(b) | E(a) ∩ B(b) | E(a) ∩ E(b) |
The named spatial predicates are defined as nine-character patterns matched against this matrix, read row by row from I(a) ∩ I(b) to E(a) ∩ E(b):
- Equals: The geometries are topologically identical, sharing the same interior, boundary, and exterior points. Matrix pattern: `T*F**FFF*` (for two equal area geometries, the full matrix is `2FFF1FFF2`).[14]
- Disjoint: No points in common between the geometries. Matrix pattern: `FF*FF****`.[14]
- Intersects: The geometries share at least one point (negation of Disjoint). Equivalently, the matrix matches at least one of `T********`, `*T*******`, `***T*****`, or `****T****`.[14]
- Touches: The geometries share boundary points but not interior points. Matrix patterns vary by case, e.g., `FT*******`, `F**T*****`, or `F***T****`.[14]
- Crosses: The interiors intersect, neither geometry is fully contained in the other, and the intersection has lower dimension than the higher-dimensional of the two geometries. Matrix patterns include `T*T******` for point-line, point-area, or line-area cases, and `0********` for line-line cases.[14]
- Within: Geometry a is completely inside b: the interiors intersect, and neither the interior nor the boundary of a intersects the exterior of b. Matrix pattern: `T*F**F***`.[14]
- Contains: The inverse of Within, where b is inside a. Matrix pattern: `T*****FF*` (the Within pattern with the roles of a and b swapped).[14]
- Overlaps: The interiors intersect in the same dimension as the geometries themselves, but neither contains the other. Matrix patterns include `T*T***T**` for point-point or area-area, and `1*T***T**` for line-line.[14]
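Pattern matching against a DE-9IM matrix is simple enough to sketch directly. The following is a minimal illustration, not an implementation from the standard or from any GIS library: the names `matches`, `PATTERNS`, and `relate` are hypothetical, and the matrix is assumed to be already available as a nine-character string such as `2FFF1FFF2`. Crosses and Overlaps are omitted because their patterns depend on the dimensions of the input geometries.

```python
MATRIX_CHARS = set("012F")
PATTERN_CHARS = set("012FT*")


def matches(matrix: str, pattern: str) -> bool:
    """Check a 9-character DE-9IM matrix against a 9-character pattern."""
    matrix, pattern = matrix.upper(), pattern.upper()
    if len(matrix) != 9 or not set(matrix) <= MATRIX_CHARS:
        raise ValueError(f"invalid DE-9IM matrix: {matrix!r}")
    if len(pattern) != 9 or not set(pattern) <= PATTERN_CHARS:
        raise ValueError(f"invalid DE-9IM pattern: {pattern!r}")
    for cell, want in zip(matrix, pattern):
        if want == "*":
            continue              # don't care
        if want == "T":
            if cell == "F":       # T requires a non-empty intersection
                return False
        elif cell != want:        # 0, 1, 2, F must match exactly
            return False
    return True


# Dimension-independent predicates only (hypothetical lookup table).
PATTERNS = {
    "equals": ["T*F**FFF*"],
    "disjoint": ["FF*FF****"],
    "touches": ["FT*******", "F**T*****", "F***T****"],
    "within": ["T*F**F***"],
    "contains": ["T*****FF*"],
}


def relate(matrix: str, predicate: str) -> bool:
    """Evaluate a named predicate; 'intersects' is the negation of 'disjoint'."""
    if predicate == "intersects":
        return not relate(matrix, "disjoint")
    return any(matches(matrix, p) for p in PATTERNS[predicate])


# Two polygons sharing an edge have the matrix FF2F11212.
print(relate("FF2F11212", "touches"))     # True
print(relate("FF2F11212", "disjoint"))    # False
print(relate("2FFF1FFF2", "equals"))      # True
```

Keeping the patterns in a table makes it clear that each named predicate is only a shorthand for one or more masks over the same nine cells.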
SQL/MM Spatial Functions
The SQL/MM spatial functions, defined in ISO 19125-2, provide a standardized set of SQL-invoked routines prefixed with "ST_" for creating, manipulating, and querying Simple Features geometries within relational database management systems (RDBMS). These functions enable the storage, retrieval, and analysis of spatial data in tables where geometry columns are explicitly typed as ST_Geometry or subtypes, with metadata tracked in system tables such as GEOMETRY_COLUMNS and SPATIAL_REF_SYS to specify geometry types, dimensions, and spatial reference system identifiers (SRIDs).
Basic accessor and measurement functions support inspection and quantification of geometries. ST_GeometryType returns the specific type of a given geometry as a text string, such as "ST_Point" or "ST_Polygon". ST_Area computes the measure of a surface geometry, like a Polygon or MultiPolygon, in the units of the spatial reference system. Similarly, ST_Length calculates the measure of a curve geometry, such as a LineString, also in reference system units. For serialization, ST_AsText outputs a geometry in its Well-Known Text (WKT) representation, facilitating human-readable export without including SRID metadata.
Spatial relationship predicates evaluate topological interactions between geometries, implementing the Dimensionally Extended Nine-Intersection Model (DE-9IM) through standardized SQL functions. Key examples include ST_Intersects, which returns true if two geometries have at least one point in common; ST_Contains, which checks if the second geometry is completely inside the first with shared interior points; and ST_Within, which verifies if the first geometry is entirely within the second under similar conditions. Additional predicates cover relations like ST_Touches, ST_Crosses, and ST_Overlaps, enabling efficient spatial joins and filtering in queries.
Constructor functions allow programmatic creation of geometries from scalar values or text inputs. ST_Point constructs a Point geometry from x and y coordinates, optionally including an SRID. ST_LineFromText builds a LineString (or subtype) by parsing a WKT string, with an optional SRID parameter. ST_MakePolygon, found in implementations such as PostGIS rather than in the standard itself, creates a Polygon from a LinearRing shell and optional interior rings for holes.
Aggregation functions support combining multiple geometries into composite types for analysis. ST_Collect, likewise an implementation-level function, aggregates a set of geometries into a GeometryCollection or Multi-geometry, preserving distinct elements without resolving overlaps. In contrast, ST_Union merges a collection of geometries into a single geometry by computing their point-set union, dissolving boundaries where they overlap. These are typically used with SQL GROUP BY clauses to summarize spatial data across rows.
Data Representations
Well-Known Text
Well-Known Text (WKT) is a human-readable textual format for representing Simple Features geometries, enabling the exchange and visualization of spatial data in a portable, platform-independent manner. Defined in the Open Geospatial Consortium (OGC) Simple Feature Access standard (Part 1: Common Architecture), WKT uses a structured string syntax that specifies the geometry type followed by its coordinate components enclosed in parentheses. This format supports all core Simple Features geometry types, including points, linestrings, polygons, and their multi-part and collection variants, while restricting geometries to 2D coordinate space with optional measures.[2]
The basic syntax begins with the geometry type name (e.g., POINT, LINESTRING), which is case-insensitive but conventionally written in uppercase for readability, followed by an opening parenthesis, the coordinate data, and a closing parenthesis. Each coordinate is a pair of numeric values (x y) separated by a space, and successive coordinates are separated by commas; decimal numbers are permitted and no units are specified in the core format. For linear rings in polygons, the syntax requires explicit closure by repeating the first coordinate at the end of each ring. Empty geometries are represented by appending EMPTY to the type name, such as POINT EMPTY. This structure ensures topological closure for all valid instances, including boundaries.
Examples of primitive geometry types in WKT include:
- A point: POINT (10 10)
- A linestring: LINESTRING (0 0, 1 1, 2 2, 3 3)
- A polygon with one exterior ring: POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0)) (interior rings, if any, follow separated by commas)
Multi-part types wrap their members in an additional level of parentheses. A multipoint is written as MULTIPOINT ((10 10), (20 20)), while a multilinestring might be MULTILINESTRING ((0 0, 1 1), (2 2, 3 3)), and a multipolygon MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0), (5 5, 5 7, 7 7, 7 5, 5 5))) for a single member polygon with one exterior ring and one interior hole. GeometryCollection allows heterogeneous nesting, as in GEOMETRYCOLLECTION (POINT (10 10), LINESTRING (0 0, 1 1)), where each element is a full WKT-tagged subgeometry separated by commas. These formats facilitate debugging and manual inspection of geometries during development and data interchange.
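The syntax rules above can be illustrated with a small serializer. This is a minimal sketch for 2D geometries only; the function names (`point_wkt`, `linestring_wkt`, `polygon_wkt`, `multipoint_wkt`) and the use of plain `(x, y)` tuples are assumptions made for the example and do not correspond to any particular library's API.

```python
def _num(value):
    """Format a number without a trailing '.0' for whole values."""
    f = float(value)
    return str(int(f)) if f.is_integer() else repr(f)


def _coords(points):
    """Join (x, y) pairs as 'x y, x y, ...'."""
    return ", ".join(f"{_num(x)} {_num(y)}" for x, y in points)


def point_wkt(point=None):
    return "POINT EMPTY" if point is None else f"POINT ({_coords([point])})"


def linestring_wkt(points):
    if not points:
        return "LINESTRING EMPTY"
    if len(points) < 2:
        raise ValueError("a LineString needs at least two points")
    return f"LINESTRING ({_coords(points)})"


def polygon_wkt(rings):
    """`rings` is [exterior, hole1, hole2, ...]; every ring must be closed."""
    if not rings:
        return "POLYGON EMPTY"
    for ring in rings:
        if len(ring) < 4 or tuple(ring[0]) != tuple(ring[-1]):
            raise ValueError("each ring must repeat its first point at the end")
    return "POLYGON (" + ", ".join(f"({_coords(r)})" for r in rings) + ")"


def multipoint_wkt(points):
    if not points:
        return "MULTIPOINT EMPTY"
    return "MULTIPOINT (" + ", ".join(f"({_coords([p])})" for p in points) + ")"


print(point_wkt((10, 10)))
print(linestring_wkt([(0, 0), (1, 1), (2, 2), (3, 3)]))
print(polygon_wkt([[(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]]))
print(multipoint_wkt([(10, 10), (20, 20)]))
```

The ring check in `polygon_wkt` enforces only the explicit-closure rule; it does not test for self-intersection or ring orientation, which a full validity check would also need.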
An extension to WKT, known as Extended Well-Known Text (EWKT), incorporates spatial reference system information via an SRID prefix in the format SRID=<id>;<WKT>. For example, SRID=4326;POINT(0 0) denotes a point in the WGS 84 geographic coordinate system. EWKT is not part of the core Simple Features specification but is widely adopted in implementations for enhanced interoperability. Well-Known Binary serves as a compact binary counterpart to WKT for efficient storage and transmission.[16][17]
Well-Known Binary
The Well-Known Binary (WKB) format provides a compact, platform-independent binary encoding for Simple Features geometries, enabling efficient storage, transmission, and exchange between systems.[18] It consists of a sequence of bytes representing the geometry type, followed by the coordinates or sub-geometries, prefixed by a byte order indicator to ensure consistent interpretation across different endianness architectures.[19]
The structure begins with a single byte specifying the byte order: 0 for big-endian (XDR) or 1 for little-endian (NDR). This is followed by a 4-byte unsigned integer indicating the geometry type, which in extended variants may also carry flags for additional dimensions (Z for elevation, M for measure) and for an embedded spatial reference system identifier (SRID). Coordinates are encoded as 8-byte IEEE 754 double-precision floating-point values, with 16 bytes per 2D point (X and Y); additional dimensions add 8 bytes each if flagged.[20] For aggregate geometries like MultiPoint or GeometryCollection, the encoding includes a count of elements (4-byte integer) followed by the WKB representations of each component.[19]
Geometry type codes are assigned as follows: 1 for Point, 2 for LineString, 3 for Polygon, 4 for MultiPoint, 5 for MultiLineString, 6 for MultiPolygon, and 7 for GeometryCollection; codes above 7 are used for curve and surface types in later revisions and extended profiles.[20] In extended variants, SRID embedding is optional and indicated by a flag (e.g., 0x20000000) within the type field, followed by a 4-byte signed integer for the SRID value if present; this allows geometries to carry coordinate reference system information directly.[19]
For example, the simple 2D Point at coordinates (0, 0) in little-endian byte order is encoded as the 21-byte hexadecimal sequence 01 01000000 0000000000000000 0000000000000000 (spaces added for readability), where the first byte is 01 (little-endian), the next four bytes are 01000000 (type 1), and the remaining 16 bytes represent the zero doubles for X and Y.[19]
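The byte layout described above maps directly onto Python's standard `struct` module. The sketch below handles only a 2D Point; the helper names `point_to_wkb` and `wkb_to_point` are hypothetical and chosen for this illustration.

```python
import struct

WKB_POINT = 1  # geometry type code for Point


def point_to_wkb(x, y, little_endian=True):
    """Encode a 2D Point as 21 bytes: order flag, type code, X, Y."""
    order = "<" if little_endian else ">"
    # B = 1-byte order flag, I = 4-byte unsigned type, d = 8-byte double.
    # The explicit "<" or ">" prefix disables padding between fields.
    return struct.pack(order + "BIdd", 1 if little_endian else 0, WKB_POINT, x, y)


def wkb_to_point(data):
    """Decode a 2D Point from WKB, honouring the byte order flag."""
    if len(data) != 21:
        raise ValueError("a 2D WKB Point is exactly 21 bytes")
    if data[0] not in (0, 1):
        raise ValueError("byte order flag must be 0 or 1")
    order = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack_from(order + "Idd", data, 1)
    if geom_type != WKB_POINT:
        raise ValueError(f"expected type {WKB_POINT}, got {geom_type}")
    return x, y


encoded = point_to_wkb(0.0, 0.0)
print(len(encoded), encoded.hex())
print(wkb_to_point(point_to_wkb(10.0, 20.5, little_endian=False)))
```

Reading the order flag first and choosing the `struct` prefix from it is what lets the same decoder accept data written on either kind of architecture.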
WKB's advantages include its compactness, which reduces data size compared to textual formats, and streamlined parsing without string processing, making it well suited to database storage and network transmission.[18] An extension known as Extended WKB (EWKB) builds on this by adding optional SRID, Z, and M support via dedicated flags in the type field, as implemented in systems like PostgreSQL's PostGIS for enhanced interoperability.[21]
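As a rough illustration of how such flags can be read, the sketch below decodes the header of an EWKB value. The flag constants follow the convention used by PostGIS (0x80000000 for Z, 0x40000000 for M, 0x20000000 for SRID); the function name `parse_ewkb_header` and the returned dictionary layout are assumptions for this example, and other EWKB producers may differ.

```python
import struct

# Flag bits in the 4-byte type field, following the PostGIS convention.
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000


def parse_ewkb_header(data):
    """Read byte order, base type, dimension flags, and optional SRID."""
    if len(data) < 5 or data[0] not in (0, 1):
        raise ValueError("not a WKB/EWKB value")
    order = "<" if data[0] == 1 else ">"
    (raw_type,) = struct.unpack_from(order + "I", data, 1)
    has_srid = bool(raw_type & EWKB_SRID)
    srid = None
    if has_srid:
        # The SRID is a 4-byte signed integer right after the type field.
        (srid,) = struct.unpack_from(order + "i", data, 5)
    return {
        "type": raw_type & 0x1FFFFFFF,   # clear the three flag bits
        "has_z": bool(raw_type & EWKB_Z),
        "has_m": bool(raw_type & EWKB_M),
        "srid": srid,
        "body_offset": 9 if has_srid else 5,  # where coordinates begin
    }


# SRID=4326;POINT(0 0) in little-endian EWKB.
ewkb = bytes.fromhex("0101000020E6100000" + "00" * 16)
print(parse_ewkb_header(ewkb))
```

A plain WKB value passes through the same function unchanged: no flag bits are set, so `srid` is `None` and the coordinates start at offset 5.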