GTFS
The General Transit Feed Specification (GTFS) is a community-driven open data standard that defines a common format for public transportation schedules, routes, stops, and associated geographic information, enabling transit agencies to share static and real-time service data with software applications and riders worldwide.[1] Originally developed in 2005 through a collaboration between TriMet in Portland, Oregon, and Google to integrate transit data into online trip planners like Google Maps, GTFS began as the Google Transit Feed Specification before being renamed in 2010 to emphasize its broader, independent adoption beyond Google's ecosystem.[2] Its primary purpose is to facilitate accessible, interoperable transit information for applications such as route planning, mobile apps, and real-time alerts, rather than internal operational systems, and it has been licensed under the Apache 2.0 open-source license since its inception.[1]
GTFS consists of two main components: GTFS Schedule, which packages static data in a ZIP file containing required text files (e.g., agency.txt for agency details, stops.txt for stop locations, routes.txt for route definitions, and trips.txt for scheduled trips) along with over 15 optional files for enhancements like fares, pathways between stops, and multilingual translations; and GTFS Realtime, which uses Protocol Buffers to deliver dynamic updates such as vehicle positions, trip delays, and service alerts.[3] This structure allows for simple maintenance and validation while supporting extensions for specialized needs, such as accessibility features or fare calculations.[4] Since its release, GTFS has seen widespread adoption by thousands of transit agencies globally, powering tools from mapping services to analytics platforms and contributing to improved rider experiences through standardized data sharing.[5] As of 2025, it is used by over 10,000 agencies in more than 100 countries and has become a de facto international standard, influencing policies such as the U.S. Federal Transit Administration's requirements for GTFS data in National Transit Database reporting, which supports eligibility for federal grants.[6][7]
Overview
Definition and Purpose
The General Transit Feed Specification (GTFS) is an open data standard that defines a common format for public transportation schedules and associated geographic information, allowing transit agencies to publish static schedules, routes, stops, and fares in a machine-readable structure.[5] Developed primarily to facilitate the integration of transit data into mapping and trip planning applications, GTFS enables agencies to share information seamlessly with third-party developers and tools, thereby reducing fragmentation in transit data availability and promoting interoperability across diverse software ecosystems.[3] This format supports a wide range of transit modes, including buses, trains, subways, ferries, and paratransit services, making it adaptable to urban, suburban, and regional networks worldwide.[8]
The primary purpose of GTFS is to empower riders with accessible, accurate transit information through applications like mobile trip planners, which can aggregate data from multiple agencies to provide multimodal journey options and real-time updates when combined with extensions like GTFS Realtime.[5] By standardizing data exchange, it lowers barriers for developers to build innovative tools for route optimization, accessibility analysis, and service visualization, ultimately enhancing the overall rider experience with reliable schedules and geographic context.[8] GTFS data feeds are generally published under open licenses by transit agencies, such as Creative Commons Attribution, permitting reuse, modification, and distribution with attribution to the provider.
GTFS emerged in the mid-2000s as a response to the fragmented and proprietary nature of transit data at the time, which hindered widespread adoption of digital trip planning services.[8] Originating from a collaboration between Google and TriMet, the public transit agency in Portland, Oregon, the specification was initially designed to format TriMet's data for integration into Google Maps, addressing the need for a simple, exportable format that agencies could maintain without complex technical infrastructure.[8] This partnership quickly established GTFS as a de facto global standard, now supporting thousands of transit providers and fostering open data initiatives in public transportation.[5]
Components and Variants
GTFS consists of two primary components: GTFS Schedule, which provides static transit information, and GTFS Realtime, which delivers dynamic updates to complement the static data.[1]
GTFS Schedule is a static dataset distributed as a ZIP file containing one or more CSV-formatted text files that describe public transportation schedules, routes, stops, and associated details such as fares and accessibility features.[1] It serves as the baseline for transit planning and journey routing applications by defining fixed timetables and service patterns, enabling software to ingest and process agency data in a standardized manner.[9] At minimum, a GTFS Schedule feed includes agency.txt, routes.txt, trips.txt, stop_times.txt, stops.txt (required unless the feed defines demand-responsive zones in locations.geojson instead), and either calendar.txt or calendar_dates.txt (or both), along with optional files for enhanced functionality.[1]
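As a rough illustration of the minimum file set, the following Python sketch checks whether a feed archive contains the file names listed above. The function names and the in-memory demo archive are hypothetical, and the check is a simplification: it treats stops.txt as always required and does not look inside any file.

```python
import io
import zipfile

# Simplified file lists for illustration; not the full specification.
ALWAYS_REQUIRED = {"agency.txt", "routes.txt", "trips.txt", "stops.txt", "stop_times.txt"}
CALENDAR_FILES = {"calendar.txt", "calendar_dates.txt"}


def missing_required_files(zip_bytes: bytes) -> set[str]:
    """Return the required file names that are absent from a GTFS ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
    missing = ALWAYS_REQUIRED - names
    # At least one of the two calendar files must be present.
    if not (CALENDAR_FILES & names):
        missing.add("calendar.txt or calendar_dates.txt")
    return missing


def build_demo_zip(file_names: list[str]) -> bytes:
    """Build a tiny in-memory ZIP with placeholder files (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in file_names:
            archive.writestr(name, "placeholder\n")
    return buffer.getvalue()


demo = build_demo_zip(["agency.txt", "routes.txt"])
print(sorted(missing_required_files(demo)))
```

A production pipeline would normally rely on a full validator rather than a file-name check like this one.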
GTFS Realtime extends GTFS Schedule by providing live updates on service disruptions, using a Protocol Buffers format to encode structured data for efficiency.[10] This component includes three main message types: TripUpdate for delays or cancellations, VehiclePosition for real-time locations, and Alert for service advisories, allowing agencies to report deviations from the static schedule.[10] GTFS Realtime feeds reference the corresponding GTFS Schedule dataset to ensure contextual accuracy, with the current Protocol Buffers schema at version 2.0 as of 2024.[10]
Several variants and extensions build on these core components to address specialized transit needs. GTFS-Flex, adopted into the official GTFS specification in March 2024, extends GTFS Schedule to model demand-responsive transportation services, such as dial-a-ride or deviated routes, by adding files like booking_rules.txt and locations.geojson, along with new fields in stop_times.txt for pickup and drop-off windows.[11] It facilitates integration with trip planners by making flexible services discoverable alongside fixed-route data.[11] GTFS feeds also integrate with the Mobility Database schema, an open catalog maintained by MobilityData that standardizes metadata for over 4,000 transit and shared mobility datasets in GTFS format, promoting data sharing and validation.[12] Emerging standards like the General On-Demand Feed Specification (GOFS), launched in 2025, complement GTFS by providing a dedicated format for purely on-demand services, ensuring interoperability while keeping GTFS focused on scheduled and semi-flexible transit.[13]
History
Origins and Development
GTFS originated in 2005 as part of Google's initiative to incorporate public transit schedules into its mapping platform, addressing the fragmentation of transit data across agencies. The project began through a collaboration between Google engineers and TriMet, the transit authority in Portland, Oregon, after TriMet's IT manager Bibiana McHugh contacted Google to enable transit integration in Google Maps. This effort resulted in the creation of a simple, text-based format initially dubbed the Google Transit Feed Specification (GTFS), focused on static route and schedule information in CSV files for easy parsing and integration. The inaugural implementation launched on December 7, 2005, with Google Transit featuring only TriMet's data, marking the format's debut in a live trip-planning tool.[14][15]
Development proceeded iteratively, involving feedback from over 30 transit agencies, developers, and Google staff to refine the specification for reliability and usability. Early priorities centered on supporting multi-agency feeds and headway-based scheduling, culminating in the public release of the GTFS specification in September 2006 alongside expansion to five additional U.S. cities: Eugene, Honolulu, Pittsburgh, Seattle, and Tampa. By 2009, documentation efforts advanced with updates to the Google Code wiki, including removal of Google-specific submission guidelines and proposals to rename the format to the General Transit Feed Specification, reflecting its growing independence from Google's ecosystem. This collaborative process emphasized simplicity and openness, enabling agencies to export data without proprietary software.[3][16]
Adoption accelerated rapidly in the late 2000s, with GTFS feeds published by an estimated 261 transit agencies worldwide by 2010, covering over 250 cities and extending to international regions like Europe and parts of Asia. This growth was driven by the format's ease of implementation and its role in enhancing Google Maps' transit directions. To address limitations in static data, Google introduced GTFS Realtime in August 2011, an extension for dynamic updates on vehicle locations, trip delays, and service alerts, developed in partnership with agencies including TriMet, MBTA, and BART. A pivotal milestone came in 2013, when control of the specification's evolution shifted fully to the open community through dedicated discussion forums, solidifying GTFS as a de facto global standard beyond Google's proprietary use.[17]
Governance and Evolution
In 2013, the GTFS community began formalizing under the developers.google.com platform, shifting from initial Google-led development to collaborative maintenance of the specification through a wiki and the GitHub repository at google/transit, where stakeholders could propose and discuss changes.[18][19] This transition empowered transit agencies, developers, and other users to contribute to the evolving standard, fostering an open process for updates while Google retained oversight of the core repository.
The establishment of gtfs.org in 2022, led by the non-profit MobilityData in collaboration with community leaders like Andrew Byrd, marked a significant step toward centralized, accessible documentation for GTFS.[8][20] This platform consolidated resources previously scattered across Google Transit APIs, the GitHub repository, and other sites, providing a unified hub for specifications, best practices, and community resources to support broader adoption and maintenance.
A major advancement in governance occurred in 2025, when a new framework for GTFS Schedule took effect on July 7, following a community vote conducted from June to July 2025.[21][22] This update, proposed via Pull Request #544 and refined over two years of feedback, introduced structured pull request reviews for proposed changes and established working groups for GTFS Schedule and Realtime to facilitate discussions, categorization of change types (e.g., clarifications, additions), and voting processes.[23] These mechanisms ensure collaborative decision-making, with votes occurring in GitHub Pull Requests and summaries from working group meetings integrated into comments for transparency.
Key evolution milestones include the October 28, 2025, revision of the GTFS Schedule reference, which incorporated enhancements for fare complexities through GTFS-Fares V2 (adding files like fare_leg_rules.txt, fare_leg_join_rules.txt, and fare_transfer_rules.txt), pathways via pathways.txt for intra-station navigation, and translations with translations.txt for multilingual support.[9][24] Ongoing community efforts are tracked through monthly GTFS Digests, which highlight active proposals such as semantics clarifications for fields like trip modifications and transfer rules, encouraging participation via voting and discussion.[25][26]
Applications
Journey Planning and Routing
GTFS plays a central role in journey planning software by providing the static data necessary to model transit networks and compute optimal routes for users. Journey planners parse key files such as routes.txt, which defines route paths and types; trips.txt, which specifies individual vehicle runs along those routes; and stop_times.txt, which details arrival and departure times at stops to enable calculations of itineraries, transfer points, and total travel durations.[9][27] This parsing allows algorithms to construct graphs where stops serve as nodes and trips as edges weighted by time, facilitating efficient shortest-path computations like Dijkstra's or more specialized transit variants that account for service calendars and frequencies.[9]
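One way to see how stop_times.txt rows become routable data is a minimal connection-scan sketch in Python. This is a simplified stand-in for the algorithms named above, not the method of any particular planner. It assumes each pair of consecutive stop times on a trip has already been turned into a `Connection` (a hypothetical type), that every connection arrives strictly after it departs, that transfers take zero time, and that all trips run on the same service day.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """One hop between consecutive stops of a trip (hypothetical structure)."""
    dep_stop: str
    arr_stop: str
    dep_time: int  # seconds after midnight
    arr_time: int  # seconds after midnight


def earliest_arrivals(connections, origin, start_time):
    """Return the earliest arrival time at every stop reachable from origin."""
    earliest = {origin: start_time}
    # Scanning in departure order means each connection is seen only after
    # every connection that could have brought the rider to its first stop.
    for c in sorted(connections, key=lambda c: c.dep_time):
        reachable = earliest.get(c.dep_stop, float("inf")) <= c.dep_time
        if reachable and c.arr_time < earliest.get(c.arr_stop, float("inf")):
            earliest[c.arr_stop] = c.arr_time
    return earliest


DEMO = [
    Connection("A", "B", 28800, 29400),  # 08:00 -> 08:10
    Connection("B", "C", 29700, 30600),  # 08:15 -> 08:30
    Connection("A", "C", 29100, 31200),  # 08:05 -> 08:40 (slower direct trip)
]
print(earliest_arrivals(DEMO, "A", 28500))  # rider is at stop A at 07:55
```

In the demo data, changing at B reaches C ten minutes before the direct trip does, which is the kind of trade-off a planner evaluates. Real planners add minimum transfer times, walking links, and service-calendar filtering on top of this core loop.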
Prominent tools leverage GTFS for integrated routing in widely used applications. Google Maps incorporates GTFS feeds to deliver transit directions, combining route and schedule data with mapping layers for seamless trip suggestions across thousands of agencies.[28] Similarly, the open-source OpenTripPlanner (OTP) processes GTFS to perform multimodal routing, employing algorithms such as RAPTOR for rapid connection-based searches that optimize paths using stop sequences and calendar-constrained availability. Apple Maps also utilizes GTFS-compatible transit data for planning public transport legs within broader navigation.[29]
GTFS supports multimodal integration by allowing planners to combine transit data with pedestrian and cycling networks, enabling door-to-door routing from origin to destination. This involves augmenting GTFS-derived transit segments with walking or biking distances calculated from geographic coordinates in stops.txt and shapes.txt, while route types—such as 3 for bus or 2 for rail—help differentiate vehicle speeds and transfer rules in the overall itinerary.[30][31]
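The walking legs mentioned above are often approximated from the stop_lat and stop_lon values. The Python sketch below uses the haversine great-circle formula and an assumed walking speed of 1.4 m/s; both the speed and the function names are illustrative assumptions, and a real planner would follow the street network rather than a straight line.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def walk_seconds(distance_m: float, speed_m_per_s: float = 1.4) -> float:
    """Estimated walking time; the default speed is an assumption."""
    return distance_m / speed_m_per_s


# Straight-line walk between two hypothetical stops.
distance = haversine_m(45.5231, -122.6765, 45.5200, -122.6790)
print(round(distance), "m,", round(walk_seconds(distance)), "s")
```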
In practice, GTFS powers journey planning for over 10,000 agencies in more than 100 countries, enabling apps to offer features like estimated arrival times based on schedules and filters for accessibility, such as wheelchair-accessible routes and stops.[7] For instance, users in cities like New York or London can plan trips via Google Maps that incorporate bus, rail, and walking segments with precise transfer timings.
GTFS Realtime extends the static GTFS Schedule by providing dynamic updates on vehicle positions, trip modifications, and service alerts, enabling transit agencies to disseminate live information to improve operational decision-making and passenger experience. This specification, developed as an open protocol buffer format, allows agencies to feed data from Automatic Vehicle Location (AVL) systems into centralized platforms for real-time vehicle tracking, where GPS-enabled devices on buses or trains transmit location data to monitor adherence to routes and schedules.[32]
In operations, GTFS Realtime supports dynamic schedule adjustments by alerting dispatchers to deviations, such as traffic delays or mechanical issues, facilitating proactive rerouting or holding decisions to maintain service reliability. Fleet management benefits from this integration, as agencies use the data to optimize resource allocation, including deploying spare vehicles or adjusting driver assignments based on live telemetry. For instance, the New York City Metropolitan Transportation Authority (MTA) employs GTFS Realtime feeds for subway tracking, publishing vehicle positions and trip updates that enable control centers to respond to congestion in real time across its extensive network.[33]
Third-party applications and rider apps consume these feeds to deliver predicted arrival times, delay notifications, and alerts for service disruptions, enhancing user confidence and trip planning accuracy. Studies indicate that access to such real-time information can reduce perceived wait times by up to 20% for transit riders, as demonstrated in field experiments where mobile apps using AVL-derived data allowed users to time their arrivals more precisely, thereby minimizing idle time at stops.[34]
Key challenges in GTFS Realtime implementation include maintaining data freshness, with best practices recommending updates every 10-30 seconds for trip updates and vehicle positions to ensure timeliness, though delays beyond 90 seconds can degrade usability. Additionally, feeds must be validated against the corresponding GTFS Schedule baseline to prevent inconsistencies, such as mismatched trip IDs or route alignments, which could lead to erroneous predictions if not regularly cross-checked by agency systems.[35]
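The freshness thresholds described above can be expressed as a small check. This Python sketch assumes the consumer already holds a POSIX timestamp for the feed (for example, one read from the feed header); the labels "fresh", "aging", and "stale" and the default thresholds of 30 and 90 seconds are illustrative choices taken from the figures in the paragraph, not terms defined by the specification.

```python
def feed_age_seconds(feed_timestamp: int, now: int) -> int:
    """Age of a feed in seconds; clock skew that yields a negative age counts as 0."""
    return max(0, now - feed_timestamp)


def classify_freshness(age_seconds: int, target: int = 30, stale_after: int = 90) -> str:
    """Map a feed age onto hypothetical freshness labels."""
    if age_seconds <= target:
        return "fresh"
    if age_seconds <= stale_after:
        return "aging"
    return "stale"


now = 1_700_000_100  # example POSIX time
print(classify_freshness(feed_age_seconds(1_700_000_085, now)))
```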
Research and Accessibility Analysis
Researchers have increasingly utilized GTFS data to conduct analyses of public transit equity and accessibility, leveraging its standardized structure to quantify service disparities across urban populations. By processing GTFS feeds, studies can evaluate how transit availability intersects with socioeconomic factors, revealing patterns of underinvestment in marginalized communities. This approach has become prominent in the 2010s and 2020s, as open GTFS datasets enable scalable, reproducible research without proprietary software.[36]
A key research tool involves parsing the stop_times.txt and stops.txt files to assess service frequency, coverage gaps, and demographic impacts. The stop_times.txt file provides arrival and departure timestamps for trips at specific stops, allowing researchers to calculate headways and operational reliability; for instance, aggregating these times can identify areas with infrequent service, such as headways exceeding 30 minutes during peak hours, which correlate with reduced ridership in low-income neighborhoods. Meanwhile, stops.txt details stop locations and attributes, enabling spatial analysis of coverage; by geospatially joining this data with demographic layers, studies have shown that transit gaps disproportionately affect minority and low-income populations, with coverage deficits up to 20% higher in such areas compared to affluent ones. These analyses often integrate GTFS with external datasets like the U.S. Census Bureau's American Community Survey to map equity metrics, such as the percentage of households without access to frequent transit within a half-mile walk.[37][38][39]
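A headway calculation of the kind described above can be sketched in a few lines of Python. The example assumes the stop_times.txt content is available as text and that every listed trip runs on the same service day; it ignores calendar.txt and calendar_dates.txt entirely, which a real analysis would not. GTFS times may exceed 24:00:00 for trips that run past midnight, so they are converted to seconds rather than parsed as clock times.

```python
import csv
import io


def parse_gtfs_time(value: str) -> int:
    """Convert an HH:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def headways_minutes(stop_times_csv: str, stop_id: str) -> list[float]:
    """Gaps in minutes between consecutive departures at one stop."""
    reader = csv.DictReader(io.StringIO(stop_times_csv))
    departures = sorted(
        parse_gtfs_time(row["departure_time"])
        for row in reader
        if row["stop_id"] == stop_id
    )
    return [(later - earlier) / 60 for earlier, later in zip(departures, departures[1:])]


SAMPLE = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,07:00:00,07:00:00,S1,1
t2,07:20:00,07:20:00,S1,1
t3,08:05:00,08:05:00,S1,1
t1,07:10:00,07:10:00,S2,2
"""
print(headways_minutes(SAMPLE, "S1"))
```

With the sample rows, the second gap exceeds the 30-minute threshold mentioned above, which is the kind of result an equity study would flag.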
Accessibility studies employing GTFS focus on fields like wheelchair_boarding in stops.txt and on pathways.txt to evaluate compliance with the Americans with Disabilities Act (ADA). The wheelchair_boarding field indicates whether wheelchair boarding is possible at a stop (0 or empty for no information, 1 where at least some vehicles can be boarded by a rider in a wheelchair, 2 where wheelchair boarding is not possible), allowing audits of ADA adherence; research highlights gaps in reporting full accessibility in GTFS feeds. The pathways.txt file describes intra-station connections, such as ramps or elevators between levels, which supports modeling barrier-free routes; when combined with routing algorithms, it enables computation of accessibility metrics like the number of jobs reachable within 45 minutes by wheelchair-accessible transit. These metrics reveal disparities in accessible job access for disabled users in sprawling urban areas.[40][41][42]
Notable examples from 2020s research illustrate GTFS's role in exposing urban inequality, particularly transit deserts—areas with minimal service relative to demand. A 2021 study using GTFS data identified transit deserts affecting millions of residents, where low-frequency routes leave low-income households isolated from employment centers; this revealed how such gaps exacerbate poverty cycles in urban peripheries. Similarly, integration of GTFS with census data has supported equity audits, showing that Black and Hispanic communities in major metros have fewer accessible jobs via transit compared to white counterparts, informing targeted interventions.[43][44][36]
GTFS analyses have influenced policy by providing evidence for transit investments, notably through the U.S. Federal Transit Administration's (FTA) performance metrics frameworks. The FTA incorporates GTFS-derived indicators into its National Transit Database reporting and Mobility Performance Metrics, using them to evaluate service equity and accessibility for federal funding allocations; for example, agencies must report on accessible trips and coverage to low-income areas, guiding over $10 billion in annual grants toward closing identified gaps. This has led to policy shifts, such as prioritizing ADA-compliant expansions in under-served regions based on GTFS audits.[45][46]
Data Sharing and Registries
GTFS feeds are distributed through various registries and platforms that aggregate and provide access to publicly available datasets, facilitating open data sharing among transit agencies and developers. One prominent historical registry was TransitFeeds.com, which operated from 2013 to 2022 and hosted thousands of GTFS feeds before being archived and succeeded by the Mobility Database.[12] The Mobility Database, launched in February 2024 by MobilityData, serves as a leading open repository containing over 4,000 transit and shared mobility feeds in GTFS and related formats from more than 70 countries, allowing users to search, download, and access metadata for feeds worldwide.[47] In the United States, the Federal Transit Administration (FTA) maintains a national hub through the National Transit Database (NTD), which collects and publishes GTFS weblinks from reporting agencies, enabling geospatial analysis and service coverage data for over 1,000 U.S. transit providers as part of annual reporting requirements.[48] Additionally, the official GTFS website (gtfs.org) hosts the Canonical GTFS Schedule Validator, a free open-source tool developed by MobilityData to check feed compliance with the specification and best practices, helping agencies ensure data quality before publication.[49]
Transit agencies typically share GTFS data by publishing a ZIP file containing the required text files at a stable, public URL on their website, such as agency.org/gtfs.zip, which allows direct access without registration or proprietary systems.[9] To support version control and track updates, agencies include a feed_version field in the feed_info.txt file, often incrementing it as an integer or semantic version (e.g., "2.1") with each schedule change to help applications detect new data.[50]
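A consumer might use feed_version to decide whether a downloaded feed needs re-importing. The Python sketch below is one possible approach, with hypothetical function names; it assumes the feed_info.txt content has already been extracted from the ZIP as text and compares versions only for equality, since the field is free-form text without a guaranteed ordering.

```python
import csv
import io
from typing import Optional


def read_feed_version(feed_info_csv: str) -> Optional[str]:
    """Return feed_version from the single feed_info.txt row, or None if absent."""
    rows = list(csv.DictReader(io.StringIO(feed_info_csv)))
    if not rows:
        return None
    return rows[0].get("feed_version") or None


def has_new_version(feed_info_csv: str, last_seen: Optional[str]) -> bool:
    """True when the feed declares a version different from the last one seen."""
    current = read_feed_version(feed_info_csv)
    return current is not None and current != last_seen


FEED_INFO = (
    "feed_publisher_name,feed_publisher_url,feed_lang,feed_version\n"
    "Example Transit,https://example.org,en,2.1\n"
)
print(has_new_version(FEED_INFO, last_seen="2.0"))
```

Because feed_version is only recommended, a robust consumer would fall back to another signal, such as a hash of the archive, when the field is missing.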
By 2025, GTFS has seen widespread global adoption, with over 10,000 agencies in more than 100 countries publishing feeds to describe their services.[7] Standards for updates emphasize regular refreshes to reflect timetable changes, such as seasonal adjustments for summer schedules or holidays, with many agencies updating every few months and larger ones doing so weekly to maintain accuracy.[51][52]
These sharing mechanisms provide significant benefits by enabling third-party developers to build applications using open GTFS data, bypassing the need for costly proprietary APIs from individual agencies.[29] For instance, European Mobility as a Service (MaaS) platforms like Whim and Citymapper integrate GTFS feeds from multiple operators to offer seamless multimodal trip planning across cities such as Helsinki and London.[53]
GTFS Schedule Structure
GTFS Schedule datasets are packaged as self-contained ZIP archives containing a collection of text files, each representing a specific aspect of transit information, such as agencies, routes, and stops. All files must reside at the root level of the ZIP archive, with no subdirectories permitted, ensuring a simple and portable structure. These files are formatted as comma-separated values (CSV) with a .txt extension, encoded in UTF-8 (with an optional Byte Order Mark), and the first line of each file must contain case-sensitive header names defining the fields. This format adheres to the CSV standard outlined in RFC 4180, prohibiting elements like tabs, carriage returns, new lines within fields, or HTML tags to maintain parsability.[9]
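The encoding and header rules above translate directly into a short reader. This Python sketch, with a hypothetical function name, decodes with the standard "utf-8-sig" codec so that an optional byte order mark is dropped, then uses the first line as the field names.

```python
import csv
import io


def read_gtfs_table(raw: bytes) -> list[dict[str, str]]:
    """Parse one GTFS text file into a list of rows keyed by header name."""
    text = raw.decode("utf-8-sig")  # strips a leading BOM if one is present
    return list(csv.DictReader(io.StringIO(text, newline="")))


# A stops.txt fragment with a BOM and a quoted field that contains a comma.
raw_stops = b'\xef\xbb\xbfstop_id,stop_name\nS1,"Main St, North"\n'
print(read_gtfs_table(raw_stops))
```

Decoding with plain "utf-8" would leave the byte order mark attached to the first header name, so lookups on that column would silently fail.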
The organization of the dataset relies on unique identifiers (IDs) to establish relationships across files, enabling efficient querying and integration. For instance, the route_id field in routes.txt serves as a foreign key referenced in trips.txt to associate trips with their routes, while stop_id links stop_times.txt to stops.txt. Files are categorized as required, optional, or conditionally required based on the transit service type; mandatory files include agency.txt, routes.txt, trips.txt, and stop_times.txt, which form the core of any valid feed. Conditional requirements allow flexibility, such as making stops.txt optional when defining demand-responsive service zones via the locations.geojson file instead. This ID-based linking supports modular data management without redundancy.[9]
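The ID-based linking described above can be checked mechanically. The following Python sketch, using hypothetical helper names, reports values of a key column in a child file (such as route_id in trips.txt) that have no matching row in the parent file (routes.txt). It assumes both files are already available as text and share the same column name for the key.

```python
import csv
import io


def column_values(csv_text: str, field: str) -> set[str]:
    """Collect the distinct values of one column."""
    return {row[field] for row in csv.DictReader(io.StringIO(csv_text))}


def dangling_references(child_csv: str, parent_csv: str, key: str) -> set[str]:
    """Key values used by the child file that the parent file never defines."""
    return column_values(child_csv, key) - column_values(parent_csv, key)


ROUTES = "route_id,route_short_name,route_type\nR1,32,3\n"
TRIPS = "route_id,service_id,trip_id\nR1,WEEKDAY,T1\nR2,WEEKDAY,T2\n"
print(dangling_references(TRIPS, ROUTES, "route_id"))
```

The same helper applies to stop_id between stop_times.txt and stops.txt.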
Validation of GTFS datasets is essential to ensure compliance with the specification, typically performed using open-source tools like the Canonical GTFS Schedule Validator maintained by MobilityData. The v7.0 release, issued in March 2025, incorporates full support for extensions like GTFS Flex and Rider Categories, flagging errors such as missing mandatory files, invalid field values, or inconsistent IDs. Common error types include the absence of required files like stop_times.txt or violations of conditional rules, such as referencing undefined stop_ids.[54][55]
GTFS feeds are designed for scalability, with typical sizes ranging from 1 MB for small agencies to 100 MB or more for large metropolitan systems; for example, the Sydney transit feed exceeds 85 MB uncompressed. The stop_times.txt file, which records arrival and departure times for each trip stop, often dominates the dataset size and can contain over 1 million entries in major networks, demonstrating the format's capacity to handle extensive schedules without performance degradation in standard processing tools.[56][57]
The agency.txt file in a GTFS Schedule dataset defines the transit agencies responsible for the services included in the feed, enabling support for multiple agencies within a single dataset.[58] This file is required and uses agency_id as its primary key, with each row representing one agency.[58] All agencies in the dataset must share the same agency_timezone to ensure consistent time handling across the feed.[58]
The file includes the following fields:
| Field Name | Type | Presence | Description |
|---|---|---|---|
| agency_id | Unique ID | Conditionally required | Identifies a transit agency or brand; required if multiple agencies are present in the feed, and recommended otherwise for unique identification. |
| agency_name | Text | Required | The full name of the transit agency. |
| agency_url | URL | Required | A fully qualified URL pointing to the agency's website. |
| agency_timezone | Timezone | Required | The timezone for the agency, following the Olson naming convention (e.g., "America/New_York"); must be identical for all agencies. |
| agency_lang | Language code | Optional | The primary language used by the agency, in IETF BCP 47 format (e.g., "en" for English). |
| agency_phone | Phone number | Optional | A voice telephone number for customer contact with the agency. |
| agency_fare_url | URL | Optional | A fully qualified URL for purchasing tickets or accessing fare information. |
| agency_email | Email | Optional | An actively monitored email address for customer service inquiries. |
These fields allow consumers of the feed, such as trip planning applications, to display accurate agency information to users.[58] For multi-agency regions, such as metropolitan areas served by coordinated operators, this structure facilitates unified data distribution without separate feeds.[58]
The feed_info.txt file provides essential metadata about the entire GTFS dataset, including details on its publisher and validity period.[59] It is conditionally required—if the optional translations.txt file is present, feed_info.txt must be included; otherwise, it is recommended for all feeds.[59] The file contains only one row, with no primary key, and uses language codes in IETF BCP 47 format for textual fields.[59]
The file includes the following fields:
| Field Name | Type | Presence | Description |
|---|---|---|---|
| feed_publisher_name | Text | Required | The full name of the organization publishing the GTFS feed. |
| feed_publisher_url | URL | Required | A fully qualified URL for the publisher's website. |
| feed_lang | Language code | Required | The default language used for text in the dataset; use "mul" for multilingual feeds with translations. |
| default_lang | Language code | Optional | The fallback language to display when the user's preferred language is unknown (e.g., "en"). |
| feed_start_date | Date | Recommended | The start date of the schedule's validity period, in YYYYMMDD format. |
| feed_end_date | Date | Recommended | The end date of the schedule's validity period, in YYYYMMDD format; must not precede feed_start_date. |
| feed_version | Text | Recommended | A version identifier for the feed (e.g., "2025-11-01"), useful for tracking updates like seasonal changes. |
| feed_contact_email | Email | Optional | An email address for technical inquiries about the feed. |
| feed_contact_url | URL | Optional | A URL providing contact or support information for the feed. |
This metadata helps feed consumers understand the scope and maintenance of the dataset, such as through feed_version for detecting changes during seasonal updates.[59] Best practices recommend including at least one contact method (feed_contact_email or feed_contact_url) to address data quality issues reported by users, and specifying feed_start_date and feed_end_date to clarify the temporal coverage of the schedule.[59]
Route and Trip Definitions
In the GTFS Schedule dataset, routes are defined in the routes.txt file, which provides essential metadata for transit lines operated by agencies. Each route is identified by a unique route_id, serving as the primary key that links to other files in the feed. The agency_id field establishes a connection to the agency defined in agency.txt, allowing routes to be associated with specific operators, particularly in multi-agency feeds. Public-facing identifiers include route_short_name for abbreviated labels, such as "32", and route_long_name for descriptive titles like "32nd St. Crosstown", with at least one required for display purposes.[9]
The route_type field categorizes the mode of transportation using an enumerated system aligned with common transit standards, where 0 indicates tram/streetcar/light rail, 1 subway/metro, 2 rail, 3 bus (covering short- and long-distance services), and 4 ferry, with additional types such as 11 for trolleybus. Visual elements are supported via route_color (a hexadecimal code, e.g., "00FF00" for green) and route_text_color for contrasting text. For routes with flexible service patterns, continuous_pickup and continuous_drop_off specify whether riders may board or alight anywhere along the vehicle's path: 0 for continuous stopping, 1 or empty for no continuous stopping, 2 for service that must be arranged by phoning the agency, and 3 for service that must be coordinated with the driver; these fields are conditionally forbidden when stop_times.txt defines a pickup and drop-off window (start_pickup_drop_off_window or end_pickup_drop_off_window) for any trip of the route. As of updates in 2025, the network_id field in routes.txt, used to group routes, has been made conditionally forbidden when networks.txt or route_networks.txt files are present, promoting more structured network modeling.[9]
Trips, representing individual instances of service along a route, are detailed in the trips.txt file, which establishes a one-to-many relationship with routes through the shared route_id field. Each trip has a unique trip_id as its primary key and references a service_id from calendar.txt or calendar_dates.txt to indicate operational dates. Variants within a route, such as express versus local services, can be distinguished using fields like trip_headsign for destination signage (e.g., "Downtown via Express") or direction_id (0 for one direction of travel, such as outbound, and 1 for the opposite direction, such as inbound; conventions vary by agency). Additional attributes include shape_id, linking to path geometry in shapes.txt, and two indicators: wheelchair_accessible (0 for no information, 1 when the vehicle can accommodate at least one rider in a wheelchair, 2 when it cannot accommodate any) and bikes_allowed (0 for no information, 1 for allowed, 2 for not allowed). This structure enables modeling of diverse trip patterns under a single route, such as peak-hour variants or seasonal adjustments, without duplicating route metadata.[9]
| Field in routes.txt | Type | Presence | Description |
|---|---|---|---|
| route_id | Unique ID | Required | Unique identifier for the route. |
| agency_id | Foreign ID | Conditionally Required | References the operating agency. |
| route_short_name | Text | Conditionally Required | Short public name (e.g., "3"). |
| route_long_name | Text | Conditionally Required | Full public name (e.g., "Broadway Local"). |
| route_type | Enum | Required | Transportation mode (e.g., 3=Bus). |
| route_color | Color (Hex) | Optional | Color for route visualization. |
| continuous_pickup | Enum (0-3) | Conditionally Forbidden | Rules for ongoing boarding. |
| continuous_drop_off | Enum (0-3) | Conditionally Forbidden | Rules for ongoing alighting. |
| network_id | ID | Conditionally Forbidden | Route grouping (deprecated in favor of networks files post-2025). |

| Field in trips.txt | Type | Presence | Description |
|---|---|---|---|
| route_id | Foreign ID | Required | Links to the parent route. |
| service_id | Foreign ID | Required | Defines service days. |
| trip_id | Unique ID | Required | Unique trip identifier. |
| trip_headsign | Text | Optional | Destination display text. |
| direction_id | Enum (0/1) | Optional | Travel direction. |
| shape_id | Foreign ID | Optional | Path shape reference. |
| wheelchair_accessible | Enum (0-2) | Optional | Accessibility level. |
| bikes_allowed | Enum (0-2) | Optional | Bike policy. |
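The one-to-many relationship between routes and trips can be illustrated with a short sketch. The file contents below are hypothetical miniature examples embedded as strings, the function name group_trips_by_route is an assumption made for illustration, and only a subset of route_type values is mapped.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature files; a real feed would be read from the GTFS ZIP.
ROUTES_TXT = """route_id,route_short_name,route_type
R1,3,3
R2,Blue,1
"""
TRIPS_TXT = """route_id,service_id,trip_id,trip_headsign,direction_id
R1,WKDY,T1,Downtown,0
R1,WKDY,T2,Downtown via Express,0
R2,WKDY,T3,Airport,1
"""

# Subset of the route_type enumeration described in the text.
ROUTE_TYPE_NAMES = {0: "tram", 1: "subway", 2: "rail", 3: "bus", 4: "ferry", 11: "trolleybus"}


def group_trips_by_route(routes_txt, trips_txt):
    """Return (routes keyed by route_id, trip_ids grouped by route_id)."""
    routes = {row["route_id"]: row for row in csv.DictReader(io.StringIO(routes_txt))}
    grouped = defaultdict(list)
    for trip in csv.DictReader(io.StringIO(trips_txt)):
        # route_id in trips.txt is a foreign key into routes.txt.
        if trip["route_id"] not in routes:
            raise ValueError(f"trip {trip['trip_id']} references unknown route {trip['route_id']}")
        grouped[trip["route_id"]].append(trip["trip_id"])
    return routes, dict(grouped)


routes, trips = group_trips_by_route(ROUTES_TXT, TRIPS_TXT)
for route_id, trip_ids in trips.items():
    mode = ROUTE_TYPE_NAMES.get(int(routes[route_id]["route_type"]), "other")
    print(route_id, mode, trip_ids)
```

Route metadata is stored once per route_id, while each trip row carries only the foreign key, which is why express and local variants can coexist under one route.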
The stops.txt file in a GTFS dataset defines the physical locations of transit stops, stations, and related points, providing essential geographic and descriptive data for transit networks. Each entry includes a unique stop_id to identify the location, a stop_name that matches the agency's rider-facing nomenclature (such as on timetables or signage), and geographic coordinates via stop_lat and stop_lon in WGS84 decimal degrees, with a recommended precision of six decimal places (approximately 0.11 meters accuracy).[60] These coordinates pinpoint the boarding point, such as a bus pole or platform edge, rather than the roadway or track, ensuring relevance for passenger navigation.[60]
The location_type field categorizes entries as 0 for a stop or platform, 1 for a station (a grouping of multiple stops), or other values like 2 for entrances/exits, allowing hierarchical organization.[60] For instance, a multi-platform station can be represented as a parent entry (location_type=1) with child stops (location_type=0) linked via the parent_station field, which references the station's stop_id; this enables modeling complex sites like rail hubs with distinct platforms identified by an optional platform_code (e.g., "A1" or "Track 3").[60] Additionally, the wheelchair_boarding field indicates accessibility: 0 for no information, 1 for possible (or accessible path from parent), or 2 for not possible, supporting inclusive routing applications.[60] An optional zone_id can tag stops for fare zones, linking to fare rules without affecting core location data.[60]
The stop_times.txt file specifies scheduled arrival and departure timings for each stop along a trip, linking back to trip definitions via the required trip_id field.[61] Timings use the arrival_time and departure_time fields in HH:MM:SS format relative to the agency's timezone, supporting values exceeding 24:00:00 for overnight or multi-day services (e.g., 25:30:00 for 1:30 AM the following day).[61] The stop_sequence field orders stops sequentially for a trip (e.g., 1, 3, 5), not necessarily consecutively, while an optional stop_headsign provides destination signage that may vary per stop, overriding the trip-level headsign if needed.[61]
Pickup and drop-off rules are governed by pickup_type and drop_off_type enums: 0 (or empty) for regular service, 1 for none, 2 to arrange via agency phone, or 3 to coordinate with the driver, allowing restrictions like request-stop operations.[61] The timepoint field distinguishes exact times (1, or empty) from approximate or interpolated times (0), so consumers can tell which stops a vehicle is actually timed against and which times are only estimated between major points.[61] For exact timepoints, arrival and departure times must be specified; approximate times let producers fill in intermediate stops by interpolation.[61]
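Because stop_times.txt allows hour values of 24 and above, standard time parsers reject these strings. The following is a minimal sketch of one way to handle them; the function names are hypothetical, and the conversion to a calendar datetime simply adds the offset to midnight of the service date, which ignores daylight-saving transitions.

```python
from datetime import date, datetime, timedelta


def gtfs_time_to_seconds(value):
    """Convert an HH:MM:SS string (hours may exceed 23) to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid GTFS time: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


def to_naive_datetime(service_date, value):
    """Simplified assumption: offset from midnight of the service date, without time zone handling."""
    midnight = datetime.combine(service_date, datetime.min.time())
    return midnight + timedelta(seconds=gtfs_time_to_seconds(value))


print(gtfs_time_to_seconds("25:30:00"))
print(to_naive_datetime(date(2024, 1, 1), "25:30:00"))
```

Keeping times as seconds from the start of the service day preserves the ordering of stops on trips that run past midnight.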
Service Calendars and Exceptions
The service calendars in GTFS Schedule are defined primarily through the calendar.txt file, which specifies recurring patterns of service availability for trips on a weekly basis within a defined date range.[62] This file groups trips that operate on the same days of the week, using boolean indicators for each weekday to denote whether service is active (1) or inactive (0).[62] The key fields include service_id (a unique identifier linking to trips in trips.txt), monday through sunday (required enum fields for daily availability), start_date (the first valid service date in YYYYMMDD format), and end_date (the last valid service date in YYYYMMDD format, which must be on or after start_date).[62] For example, a weekday commuter service might set monday to 1, tuesday to 1, and so on through friday to 1, with saturday and sunday set to 0, applying this pattern from a start date to an end date spanning several months.[62]
To handle irregularities such as holidays, special events, or seasonal adjustments that deviate from the weekly pattern, the calendar_dates.txt file provides date-specific exceptions.[63] This file references the same service_id from calendar.txt (or defines standalone services if calendar.txt is absent) and includes date (a specific date in YYYYMMDD format) and exception_type (an enum where 1 adds service on that date and 2 removes it).[63] Each combination of service_id and date must be unique, allowing producers to override the default calendar—for instance, removing service on a holiday (exception_type 2) from a weekday pattern or adding extra service on an event day (exception_type 1).[63] This approach enables precise control over service validity without altering the core weekly definitions.
GTFS requires at least one of calendar.txt or calendar_dates.txt to be present in a feed, ensuring all service dates are explicitly defined; if calendar.txt is omitted, calendar_dates.txt must list every active date comprehensively.[62][63] For holidays or events, a common practice is to remove the date from the regular service pattern using exception_type 2 in calendar_dates.txt and, if applicable, add it to an alternative pattern (e.g., a reduced holiday schedule) using exception_type 1.[64] To promote feed simplicity and consumer efficiency, best practices recommend minimizing the use of exceptions in calendar_dates.txt by favoring broad weekly patterns in calendar.txt where possible, while reserving exceptions for true deviations.[9] The service_id serves as the linkage mechanism, associating these calendar definitions directly with individual trips in trips.txt to determine operational validity on any given date.[62]
| Field in calendar.txt | Type | Presence | Description |
|---|---|---|---|
| service_id | Unique ID | Required | Identifies the service pattern, referenced by trips.txt. |
| monday | 0 or 1 | Required | 1 if service operates on Mondays; 0 otherwise. |
| tuesday | 0 or 1 | Required | 1 if service operates on Tuesdays; 0 otherwise. |
| wednesday | 0 or 1 | Required | 1 if service operates on Wednesdays; 0 otherwise. |
| thursday | 0 or 1 | Required | 1 if service operates on Thursdays; 0 otherwise. |
| friday | 0 or 1 | Required | 1 if service operates on Fridays; 0 otherwise. |
| saturday | 0 or 1 | Required | 1 if service operates on Saturdays; 0 otherwise. |
| sunday | 0 or 1 | Required | 1 if service operates on Sundays; 0 otherwise. |
| start_date | Text (YYYYMMDD) | Required | Start date of the service period. |
| end_date | Text (YYYYMMDD) | Required | End date of the service period. |

| Field in calendar_dates.txt | Type | Presence | Description |
|---|---|---|---|
| service_id | ID | Required | Identifies the service for the exception, matching calendar.txt or standalone. |
| date | Text (YYYYMMDD) | Required | The specific exception date. |
| exception_type | 1 or 2 | Required | 1 adds service on this date; 2 removes it. |
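The interaction between the weekly pattern and date-specific exceptions can be sketched as follows. The rows are hypothetical examples represented as dictionaries of strings (as a CSV reader would produce them), and the function name active_services is an assumption for illustration.

```python
from datetime import date

WEEKDAY_FIELDS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# Hypothetical rows: a weekday service, with a holiday schedule replacing it on 4 July 2024.
CALENDAR = [
    {"service_id": "WKDY", "monday": "1", "tuesday": "1", "wednesday": "1", "thursday": "1",
     "friday": "1", "saturday": "0", "sunday": "0", "start_date": "20240101", "end_date": "20241231"},
]
CALENDAR_DATES = [
    {"service_id": "WKDY", "date": "20240704", "exception_type": "2"},
    {"service_id": "HOL", "date": "20240704", "exception_type": "1"},
]


def active_services(on, calendar_rows, calendar_date_rows):
    """Return the set of service_id values operating on the given date."""
    ymd = on.strftime("%Y%m%d")
    weekday = WEEKDAY_FIELDS[on.weekday()]  # date.weekday(): Monday is 0
    # YYYYMMDD strings compare correctly as plain strings.
    active = {
        row["service_id"]
        for row in calendar_rows
        if row["start_date"] <= ymd <= row["end_date"] and row[weekday] == "1"
    }
    for row in calendar_date_rows:
        if row["date"] != ymd:
            continue
        if row["exception_type"] == "1":
            active.add(row["service_id"])      # service added on this date
        elif row["exception_type"] == "2":
            active.discard(row["service_id"])  # service removed on this date
    return active


print(active_services(date(2024, 7, 4), CALENDAR, CALENDAR_DATES))
print(active_services(date(2024, 7, 5), CALENDAR, CALENDAR_DATES))
```

A trip runs on a date when its service_id is in the returned set, which also covers feeds that omit calendar.txt and list every date in calendar_dates.txt.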
Path Shapes and Frequencies
In GTFS Schedule datasets, the shapes.txt file provides a mechanism to define the geographic paths followed by transit vehicles along routes, enabling more accurate visual representations in mapping and routing applications. This optional file consists of sequences of latitude and longitude coordinates that form polylines approximating the actual travel path, which may deviate from straight lines between stops to reflect road alignments or other constraints. Each shape is identified by a unique shape_id and is linked to specific trips via the trips.txt file, allowing multiple trips to share the same shape for efficiency.[65]
The structure of shapes.txt includes the following fields:
| Field Name | Type | Presence | Description |
|---|---|---|---|
| shape_id | Unique ID | Required | Identifier for the shape, used to associate it with trips. |
| shape_pt_lat | Latitude | Required | Latitude coordinate of a point along the shape. |
| shape_pt_lon | Longitude | Required | Longitude coordinate of a point along the shape. |
| shape_pt_sequence | Non-negative integer | Required | Index defining the order of points in the shape, with values increasing from start to end (not necessarily consecutive). |
| shape_dist_traveled | Non-negative float | Optional | Actual distance traveled along the shape from the first point to the current one, in distance units consistent with other GTFS files; recommended for routes with loops or inline stops to improve position interpolation accuracy. |
These points do not need to coincide exactly with stop locations but should be positioned close to the vehicle's expected path for effective rendering. When combined with stop_times.txt, shape points allow interpolation of vehicle positions between stops, enhancing trip visualization in journey planning tools.[65]
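When a feed omits shape_dist_traveled, a consumer may derive cumulative distances from the points themselves. The sketch below is one possible approach, assuming a spherical Earth with a mean radius of 6,371 km and great-circle (haversine) distances in meters; the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cumulative_distances(points):
    """points: iterable of (shape_pt_sequence, lat, lon); returns [(sequence, meters_from_start)]."""
    result = []
    total = 0.0
    previous = None
    # shape_pt_sequence must increase along the shape but need not be consecutive.
    for sequence, lat, lon in sorted(points):
        if previous is not None:
            total += haversine_m(previous[0], previous[1], lat, lon)
        result.append((sequence, total))
        previous = (lat, lon)
    return result


# Three points along the equator, deliberately listed out of order.
SHAPE = [(5, 0.0, 0.002), (1, 0.0, 0.0), (3, 0.0, 0.001)]
print(cumulative_distances(SHAPE))
```

Sorting by shape_pt_sequence first matters because rows in shapes.txt are not guaranteed to appear in path order.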
The frequencies.txt file complements path shapes by supporting headway-based scheduling for services where trips operate at regular intervals rather than fixed times, particularly useful for high-frequency routes like buses or subways. This optional file specifies recurring trip patterns within defined time windows, reducing the need to enumerate every individual trip in trips.txt for compressed representations. It applies to trips that may already reference shapes, adding temporal regularity to the spatial definitions.[66]
Key fields in frequencies.txt are:
| Field Name | Type | Presence | Description |
|---|---|---|---|
| trip_id | Foreign ID → trips.txt | Required | Identifier of the trip pattern to which the frequency applies. |
| start_time | Time (HH:MM:SS) | Required | Start time of the frequency period, relative to the service day, when the first vehicle departs the first stop. |
| end_time | Time (HH:MM:SS) | Required | End time of the frequency period, when the headway changes or service stops. |
| headway_secs | Positive integer | Required | Interval in seconds between consecutive vehicle departures during the period; periods must not overlap. |
| exact_times | Enum (0 or 1) | Optional | Indicates service type: 0 for approximate headway-based scheduling (vehicles aim for intervals but may vary slightly); 1 for exact schedule adherence with consistent headways (default assumes 0 if omitted). |
For headway-based services (exact_times=0), this enables modeling of non-scheduled operations where arrival predictions rely on intervals rather than precise timetables. In visual routing, frequencies integrate with shapes to depict service density along paths, aiding users in understanding wait times without detailed stop-by-stop data. The optional shape_dist_traveled field in shapes.txt further refines distance calculations for such interpolated positions, promoting consistency in applications handling both spatial and temporal aspects of transit.[66]
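For schedule-based entries (exact_times=1), a consumer can expand a frequency period into individual first-stop departures. The sketch below is illustrative only: the function names are hypothetical, and it assumes that end_time is exclusive, so a departure falling exactly at end_time is treated as belonging to the next period.

```python
def to_seconds(value):
    """HH:MM:SS (hours may exceed 23) to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total):
    """Seconds since the start of the service day back to HH:MM:SS."""
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def expand_frequency(start_time, end_time, headway_secs):
    """List first-stop departure times in [start_time, end_time) spaced headway_secs apart."""
    if headway_secs <= 0:
        raise ValueError("headway_secs must be a positive integer")
    return [to_hms(t) for t in range(to_seconds(start_time), to_seconds(end_time), headway_secs)]


print(expand_frequency("06:00:00", "06:30:00", 600))
```

Each generated departure then reuses the stop-to-stop offsets of the template trip in stop_times.txt, shifted to the new start time.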
Transfers and Fare Rules
The transfers.txt file in GTFS Schedule defines rules for passenger transfers between stops, trips, or routes, enabling transit planning software to model seamless connections.[67] It is an optional file with a primary key consisting of from_stop_id, to_stop_id, from_trip_id, to_trip_id, from_route_id, and to_route_id.[67] The from_stop_id field, a foreign key referencing stops.stop_id, identifies the starting point of a transfer and is conditionally required for transfer types 1 through 3, while optional for types 4 and 5.[67] Similarly, to_stop_id references the ending stop and follows the same conditional requirement.[67]
The transfer_type field, which is required, specifies the nature of the transfer using enumerated values: 0 (or empty) for recommended transfer points without timing constraints, 1 for timed transfers where the departing vehicle is expected to wait for the arriving one, 2 for transfers requiring a minimum amount of time between arrival and departure, 3 for transfers that are not possible, 4 for in-seat transfers where passengers stay on board the same vehicle between sequential trips, and 5 where in-seat transfers are not allowed and passengers must alight and re-board.[67] An optional min_transfer_time field provides the minimum duration in seconds needed for the transfer, such as time for walking between stops or platform changes.[67] More specific transfer rules, such as those linking particular trips, override general stop-to-stop rules to ensure accurate routing.[67] For instance, in a multi-modal system, a type 2 transfer with 300 seconds minimum time might connect a bus stop to a nearby train platform, accounting for pedestrian access.[67]
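How a trip planner might apply transfer_type and min_transfer_time to a candidate connection can be sketched as follows. This is a simplified illustration rather than a reference algorithm: the function name is hypothetical, times are seconds since the start of the service day, and in-seat transfer types 4 and 5 are not modeled.

```python
def transfer_is_feasible(arrival_secs, departure_secs, transfer_type=0, min_transfer_time=None):
    """Decide whether a rider arriving at arrival_secs can catch a departure at departure_secs."""
    if transfer_type == 3:
        return False  # transfers are not possible between these locations
    wait = departure_secs - arrival_secs
    if wait < 0:
        return False  # the onward vehicle has already left
    if transfer_type == 2:
        # A missing min_transfer_time is treated as zero here, which is an assumption.
        return wait >= (min_transfer_time or 0)
    return True  # types 0 and 1: any non-negative wait is accepted in this sketch


# The bus-to-train example from the text: 300 seconds are required.
print(transfer_is_feasible(36000, 36240, transfer_type=2, min_transfer_time=300))
print(transfer_is_feasible(36000, 36300, transfer_type=2, min_transfer_time=300))
```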
Fare information in GTFS is handled through the optional fare_attributes.txt and fare_rules.txt files, forming the basis of GTFS-Fares V1 for modeling pricing structures.[68] The fare_attributes.txt file, with fare_id as its primary key, defines core fare properties.[68] The required price field is a non-negative float representing the cost in units of the specified currency_type, which uses ISO 4217 codes such as USD or EUR.[68] The payment_method field, also required, indicates timing: 0 for payment on board and 1 for payment before boarding.[68] The transfers field specifies allowed additional rides: 0 for no transfers, 1 for one additional transfer, 2 for two, or left empty for unlimited.[68] An optional transfer_duration sets the validity period in seconds for those transfers, while agency_id conditionally links to a specific agency if the feed includes multiple providers.[68]
The fare_rules.txt file links fares to specific journeys via the required fare_id foreign key and optional fields: route_id (referencing routes.route_id) and origin_id, destination_id, and contains_id, the last three of which reference stops.zone_id for zonal pricing.[69] This allows fares to apply based on routes, origin-destination pairs, or zones traversed, enabling complex tariffs such as distance-based or area-specific charges.[69] For example, a fare rule might assign a higher price for trips originating in zone A and ending in zone C, regardless of the route taken, provided the journey contains zone B.[69]
As of the October 28, 2025 revision, GTFS-Fares V2 introduces expansions like fare_leg_rules.txt to better support multi-leg journeys, where fares accumulate across transfers or segments, and enhanced zone-based pricing for more granular modeling of regional systems.[70] These updates address limitations in V1 by allowing rules for intermediate legs and variable pricing per transfer, improving accuracy for integrated ticketing in large networks.[68]
Accessibility Features and Pathways
The General Transit Feed Specification (GTFS) incorporates accessibility features through dedicated files that model the physical layout of transit stations, enabling trip planners to provide navigation guidance compliant with standards like the Americans with Disabilities Act (ADA).[9] These elements focus on multi-level stations and internal pathways, allowing users to assess barriers such as stairs or slopes before traveling. By integrating with stop and trip attributes, GTFS facilitates end-to-end accessible routing, prioritizing details like elevator availability and pathway widths for wheelchair users.[71]
The levels.txt file defines the vertical structure of stations, essential for navigating multi-level facilities like underground or elevated transit hubs. It is conditionally required when pathways include elevators (pathway_mode=5 in pathways.txt), ensuring accurate modeling of elevation changes.[72] The file uses a primary key of level_id to uniquely identify each level within a station.[72]
| Field | Type | Presence | Description |
|---|---|---|---|
| level_id | Unique ID | Required | Identifies a level within a station; must be unique across the feed. |
| level_index | Float | Required | Numeric index for vertical ordering: 0 for ground level, positive values for levels above ground, and negative for below; higher indices indicate upper positions. |
| level_name | Text | Optional | Human-readable name displayed to riders, such as "Mezzanine" or "Platform Level." |
Levels link to stops via the level_id field in stops.txt, providing context for pathways that span elevations, such as elevators connecting a concourse (level_index=0) to a platform (level_index=1).[73]
The pathways.txt file models connections between station locations as a graph, detailing modes of traversal to support precise navigation and accessibility assessments. It is optional but recommended for complex stations to represent elements like escalators and elevators, which are critical for ADA compliance by indicating barrier-free options.[74] Each pathway is identified by a unique pathway_id and connects from_stop_id to to_stop_id, both referencing entries in stops.txt (e.g., entrances, platforms, or intermediate nodes like landings).[74] The pathway_mode field specifies the type, with values 4 for escalators (often unidirectional) and 5 for elevators (typically bidirectional and key for wheelchair access).[71] Additional attributes quantify physical demands, such as length in meters for horizontal distance, traversal_time in seconds for expected duration (excluding waits), stair_count for the number of steps in stairs (positive for ascent, negative for descent), and max_slope as a ratio (e.g., 0.083 for an 8.3% incline, where values below 0.083 often denote wheelchair accessibility).[74]
| Field | Type | Presence | Description |
|---|---|---|---|
| pathway_id | Unique ID | Required | Identifies the pathway; must be unique. |
| from_stop_id | ID (from stops.txt) | Required | Starting location ID. |
| to_stop_id | ID (from stops.txt) | Required | Ending location ID. |
| pathway_mode | Enum (1-7) | Required | Traversal type: 1=walkway, 2=stairs, 3=moving sidewalk, 4=escalator, 5=elevator, 6=fare gate, 7=exit gate. |
| is_bidirectional | 0 or 1 | Required | 0 for unidirectional (e.g., escalators), 1 for bidirectional. |
| length | Non-negative float | Optional | Pathway length in meters. |
| traversal_time | Non-negative integer | Optional | Average traversal time in seconds. |
| stair_count | Integer | Optional | Number of stairs (for mode=2); sign indicates direction. |
| max_slope | Non-negative float | Optional | Maximum slope ratio (e.g., 0.083). |
| min_width | Non-negative float | Optional | Minimum clear width in meters (e.g., 1.5 for accessible paths). |
| signposted_as | Text | Optional | Signage text for directions (e.g., "Follow signs to Platform 3"). |
| reversed_signposted_as | Text | Optional | Same as signposted_as, but for travel from to_stop_id to from_stop_id. |
Fields like min_width and reversed_signposted_as enhance modeling of accessible and directional paths, such as narrow corridors or pathways that are signed differently in each direction.[74] The signposted_as field provides directional cues, like "To Street Level via Elevator," aiding visual navigation.[74]
These files integrate with accessibility flags in stops.txt (via wheelchair_boarding, where 1 indicates step-free access) and trips.txt (via wheelchair_accessible), allowing planners to filter pathways by mode—e.g., prioritizing elevators (mode=5) between levels for wheelchair users while excluding stairs (mode=2).[71] This conditional use of levels ensures elevators are only modeled with defined vertical ordering, creating comprehensive accessible routes from station entry to boarding.[72]
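Filtering pathways by mode amounts to a shortest-path search over the station graph. The sketch below uses Dijkstra's algorithm over hypothetical pathway rows; the choice to exclude both stairs (mode 2) and escalators (mode 4) for wheelchair users is an assumed policy, and traversal_time is assumed to be populated even though the field is optional.

```python
import heapq

WHEELCHAIR_EXCLUDED_MODES = frozenset({2, 4})  # stairs and escalators (assumed policy)

# Hypothetical station: entrance E, concourse C, platform P.
PATHWAYS = [
    {"from_stop_id": "E", "to_stop_id": "C", "pathway_mode": 1, "is_bidirectional": 1, "traversal_time": 30},
    {"from_stop_id": "C", "to_stop_id": "P", "pathway_mode": 2, "is_bidirectional": 1, "traversal_time": 20},
    {"from_stop_id": "C", "to_stop_id": "P", "pathway_mode": 5, "is_bidirectional": 1, "traversal_time": 60},
]


def fastest_traversal(pathways, origin, destination, excluded_modes=WHEELCHAIR_EXCLUDED_MODES):
    """Return the minimum total traversal_time in seconds, or None if no permitted path exists."""
    graph = {}
    for p in pathways:
        if p["pathway_mode"] in excluded_modes:
            continue
        graph.setdefault(p["from_stop_id"], []).append((p["to_stop_id"], p["traversal_time"]))
        if p["is_bidirectional"] == 1:
            graph.setdefault(p["to_stop_id"], []).append((p["from_stop_id"], p["traversal_time"]))
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, seconds in graph.get(node, []):
            candidate = cost + seconds
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return None


print(fastest_traversal(PATHWAYS, "E", "P"))                        # elevator route
print(fastest_traversal(PATHWAYS, "E", "P", excluded_modes=set()))  # stairs allowed
```

The same search could apply further filters, for example rejecting pathways whose min_width or max_slope falls outside a rider's stated limits.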
Translations, Attributions, and Extensions
The translations.txt file enables multilingual support in GTFS datasets by providing translations for customer-facing text fields, such as stop names or route descriptions, from the default language specified in feed_info.txt.[75] This optional file includes fields like table_name (identifying the GTFS table, e.g., stops), field_name (the translatable field, e.g., stop_name), language (an IETF BCP 47 language code, e.g., nl for Dutch), and translation (the translated text).[75] Additional conditional fields, such as record_id (to target a specific record) or field_value (to match a particular value for translation), allow precise mappings.[75] For instance, in a feed with French as the default language, translations.txt can translate a stop name like "Bruxelles-Ouest" to "Brussel-West" for Dutch users, enhancing accessibility for international riders.[76]
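A lookup against translations.txt might be sketched as below. The rows, the stop identifier S1, and the function name translate are hypothetical; the sketch checks a record_id match before a field_value match and falls back to the untranslated text, which is an assumed precedence rather than a statement of the specification.

```python
# Hypothetical rows as a CSV reader would produce them.
TRANSLATIONS = [
    {"table_name": "stops", "field_name": "stop_name", "language": "nl",
     "translation": "Brussel-West", "record_id": "", "field_value": "Bruxelles-Ouest"},
]


def translate(rows, table_name, field_name, language, original, record_id=None):
    """Return the translated text for a field, or the original text when no row matches."""
    candidates = [
        row for row in rows
        if (row["table_name"], row["field_name"], row["language"]) == (table_name, field_name, language)
    ]
    # Prefer a row that targets this exact record, then one that matches the field's value.
    for row in candidates:
        if record_id is not None and row.get("record_id") == record_id:
            return row["translation"]
    for row in candidates:
        if row.get("field_value") == original:
            return row["translation"]
    return original


print(translate(TRANSLATIONS, "stops", "stop_name", "nl", "Bruxelles-Ouest", record_id="S1"))
```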
The attributions.txt file credits organizations involved in producing or operating the transit data, particularly useful in aggregated or third-party datasets.[77] As an optional component, it features fields including attribution_id (a unique identifier), organization_name (e.g., "Rejseplanen"), is_producer (1 if the organization produced the data), is_operator (1 if it operates services), and is_authority (1 if it is the regulating authority), with at least one role flag required.[77] Optional fields like attribution_url, attribution_email, or attribution_phone provide contact details, and linkages via agency_id, route_id, or trip_id scope attributions to specific elements or the entire feed.[77] In practice, this file attributes data aggregators, such as crediting Rejseplanen for Danish transit data with a URL to their site.[78]
Extensions in GTFS address advanced use cases, including on-demand and flexible services. The booking_rules.txt file, part of the GTFS-Flex subset, specifies reservation requirements for demand-responsive transportation, such as dial-a-ride or route deviation services.[79] Key fields include booking_rule_id (unique identifier), booking_type (e.g., 0 for real-time, 1 for same-day, 2 for prior-day), and prior_notice_duration variants like prior_notice_duration_min (minimum minutes in advance, e.g., 60) or prior_notice_start_day (earliest advance days, e.g., 14).[79] Supporting fields such as message, phone_number, info_url, and booking_url guide riders on how to book, with examples like Heartland Express requiring bookings 1 to 14 days ahead between 8 AM and 3 PM on weekdays.[80] GTFS-Flex, officially adopted into the core specification in March 2024, facilitates discoverability of these flexible services by modifying files like stop_times.txt to include booking rule references and supporting point-to-zone or checkpoint deviations.[11]
The locations.geojson file introduces GeoJSON-formatted zones for pickup and drop-off in on-demand services, diverging from traditional CSV to represent polygonal areas per RFC 7946.[81] It requires a FeatureCollection structure with features containing id (an identifier that must be unique across stops.stop_id, locations.geojson, and location_groups.location_group_id), properties (e.g., stop_name), and geometry as Polygon or MultiPolygon with coordinate arrays.[81] This enables modeling service areas where riders request stops anywhere within boundaries, integrated via stop_times.txt for trips.[11] Complementing this, areas.txt defines logical groupings like fare zones with area_id and area_name, while stop_areas.txt assigns stop_id values to these areas, allowing aggregated representations without physical pathways.[82][83]
In 2025, GTFS-Fares v2 introduced rider_categories.txt to model eligibility groups for fares, such as students or elderly riders, adopted in February via community proposal.[84] This optional file includes rider_category_id, rider_category_name (e.g., "Senior"), is_default_fare_category (1 if applicable to all), and eligibility_url for details.[85] It links to fare_products.txt, which describes purchasable tickets with fields like fare_product_id, fare_product_name, amount (cost), currency (ISO 4217 code), and rider_category_id to restrict eligibility.[86] An empty rider_category_id in fare_products.txt indicates universal access, supporting complex fare structures in flexible services.[86]
GTFS Realtime
Overview and Protocol
GTFS Realtime is a feed specification designed to provide dynamic transit information, such as trip delays, vehicle positions, and service alerts, as a complement to the static GTFS Schedule dataset.[32] It enables public transportation agencies to deliver real-time updates to applications, improving rider experience by offering live departure times and alerts.[32] These feeds require a corresponding GTFS Schedule feed for contextual reference, such as route and stop definitions.[10]
The format employs binary Protocol Buffers (protobuf) for efficient serialization, defined in the gtfs-realtime.proto schema (version 2.0 and higher).[87] Feeds are delivered over HTTP or HTTPS from any web server, allowing frequent updates without complex infrastructure.[32] The protocol is licensed under Apache 2.0, promoting open adoption and implementation.[32]
A GTFS Realtime feed consists of a single FeedMessage containing a header and a list of entities.[10] The header includes a Unix timestamp indicating when the feed was generated and an incrementality field, which must be set to FULL_DATASET (value 0) as DIFFERENTIAL (value 1) is currently unsupported.[88] Entities encompass TripUpdate for schedule adjustments, VehiclePosition for location data, and Alert for service notifications, which can be combined within the same feed.[89] Validation is supported through reference implementations and language bindings generated from the protobuf schema.[90]
Trip Updates
The Trip Update message in GTFS Realtime provides real-time information about deviations from the static schedule for individual trips, such as predicted arrival and departure times, delays, cancellations, or added unscheduled trips.[91] It is designed to complement the static GTFS Schedule data by allowing transit agencies to report timetable fluctuations for trips that support real-time updates.[89] This enables applications like trip planners to adjust user predictions dynamically, for instance, notifying passengers of a bus arriving five minutes late at a specific stop.[91]
The core structure of a Trip Update includes a TripDescriptor, which identifies the affected trip by referencing static GTFS elements, followed by an array of StopTimeUpdate messages for stop-specific changes.[92] The TripDescriptor contains fields such as trip_id (a unique identifier from trips.txt), start_date (in YYYYMMDD format indicating the service date), route_id (from routes.txt), and direction_id (0 or 1 for the trip's direction).[91] It also includes a schedule_relationship enum to indicate the trip's overall status: SCHEDULED (following the static schedule with possible updates), UNSCHEDULED (a trip without a static counterpart), or CANCELED (the entire trip is canceled). As of May 2025, the enum was expanded with experimental values including NEW (extra trip unrelated to existing trips), REPLACEMENT (replaces a scheduled trip with new schedule or routing), DUPLICATED (copies an existing trip with different start date/time), and DELETED (trip removed and not shown to users); the former ADDED value is deprecated.[93][94] An optional VehicleDescriptor may link to vehicle position data, though it is not required for trip updates.[91]
Each StopTimeUpdate in the array targets a specific stop on the trip, identified by stop_sequence (the order from stop_times.txt) or stop_id (from stops.txt).[95] Its arrival and departure fields are StopTimeEvent messages, each carrying either time (an absolute timestamp in seconds since the Unix epoch) or delay (a seconds offset from the static schedule, positive for late and negative for early).[91] The schedule_relationship for the stop can be SCHEDULED (update applies normally), SKIPPED (the stop is bypassed), or NO_DATA (no prediction available).[95] These updates integrate with the static schedule by using the baseline times from stop_times.txt as a reference; for example, a +300-second delay at stop sequence 5 adjusts the predicted arrival relative to the scheduled time.[91]
Additional fields under TripProperties allow modifications like changing the trip_headsign (destination sign text from trips.txt), enabling reports of route adjustments without altering the core trip identity.[96] GTFS Realtime feeds use FULL_DATASET incrementality, meaning each published feed replaces the real-time information from earlier ones; true differential mode remains unsupported.[88] Within a trip, however, updates can be sparse: a delay reported at one stop carries forward to later stops until another StopTimeUpdate overrides it, so a change such as a single-stop delay does not require listing every stop.[89]
Common use cases include predicting arrivals with delays (e.g., reporting a +5-minute offset via the delay field), handling cancellations by setting the trip's schedule_relationship to CANCELED, or updating headsigns for detour announcements.[91] For instance, if a trip is running ahead, a negative delay like -120 seconds can be applied to earlier stops, while later ones use absolute times for precision.[97]
A representative example in Protocol Buffer text format (decoded for readability) illustrates a delayed trip:
entity {
  id: "simple-trip"
  trip_update {
    trip {
      trip_id: "trip1"
      start_time: "14:05:00"
      start_date: "20220628"
      route_id: "ROUTE1"
      direction_id: 0
      schedule_relationship: SCHEDULED
    }
    stop_time_update {
      stop_sequence: 3
      arrival {
        delay: 5
      }
      departure {
        delay: 5
      }
      schedule_relationship: SCHEDULED
    }
    stop_time_update {
      stop_sequence: 12
      arrival {
        delay: -2
      }
      departure {
        delay: -2
      }
      schedule_relationship: SCHEDULED
    }
  }
}
This protobuf snippet updates stop sequences 3 and 12 for trip1 on June 28, 2022, showing a 5-second delay early in the route and 2 seconds ahead later, relative to static stop_times.txt.[97] In JSON encoding (supported by some tools for interoperability), the structure mirrors this, with fields like "delay": 5 under arrival/departure objects.
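How a consumer might combine such delays with the static timetable can be sketched in a few lines. The scheduled times below are hypothetical, plain dictionaries stand in for decoded protobuf messages, and the sketch assumes that a delay carries forward to later stops until another update replaces it, while stops before the first update receive no prediction.

```python
def predicted_arrivals(scheduled, delays):
    """
    scheduled: {stop_sequence: scheduled arrival in seconds since the start of the service day}
    delays:    {stop_sequence: delay in seconds from a StopTimeUpdate}
    Returns predictions only for stops at or after the first reported delay.
    """
    predictions = {}
    current_delay = None
    for stop_sequence in sorted(scheduled):
        if stop_sequence in delays:
            current_delay = delays[stop_sequence]
        if current_delay is not None:
            predictions[stop_sequence] = scheduled[stop_sequence] + current_delay
    return predictions


# Hypothetical scheduled arrivals for four stops of trip1.
SCHEDULED = {1: 50700, 3: 51000, 8: 51600, 12: 52200}
# Delays taken from the example above: +5 s at stop 3, -2 s at stop 12.
print(predicted_arrivals(SCHEDULED, {3: 5, 12: -2}))
```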
Vehicle Positions
The VehiclePosition message in GTFS Realtime provides real-time location data for transit vehicles, enabling applications to track their movements and statuses dynamically. This feed entity is distinct from trip updates, which primarily handle timing modifications, by focusing on geospatial and operational details derived from onboard systems like GPS. Agencies publish these messages in a Protocol Buffers format, typically updated frequently to reflect current conditions, with data recommended to be no older than 90 seconds for accuracy.[98][99]
The core structure of a VehiclePosition includes several key components. The VehicleDescriptor identifies the vehicle with fields such as id (a unique identifier), label (a user-facing name), and license_plate (optional vehicle registration). The Position is mandatory and contains latitude and longitude in WGS-84 degrees, along with optional details like bearing (direction in degrees clockwise from true north), speed (in meters per second), and odometer (cumulative distance traveled in meters). Linkage to scheduled service occurs via the TripDescriptor, which references the static GTFS trip ID, schedule relationship (e.g., scheduled or added), and route or trip details. Additionally, the StopStatus indicates the vehicle's proximity to stops, using enums like INCOMING_AT (arriving), STOPPED_AT (at the stop), or IN_TRANSIT_TO (en route, default).[98][100]
Specialized fields enhance situational awareness. The current_status field specifies the vehicle's state relative to its next stop, taking its value from the VehicleStopStatus enum, which sharpens arrival predictions. The congestion_level field categorizes traffic conditions using enums such as UNKNOWN_CONGESTION_LEVEL, RUNNING_SMOOTHLY, STOP_AND_GO, CONGESTION, or SEVERE_CONGESTION, helping to contextualize delays. The experimental occupancy_status provides crowding information via enums including EMPTY, MANY_SEATS_AVAILABLE, FEW_SEATS_AVAILABLE, STANDING_ROOM_ONLY, CRUSHED_STANDING_ROOM_ONLY, FULL, and NOT_ACCEPTING_PASSENGERS, allowing apps to display passenger load levels. A timestamp records when the position was measured, separate from the feed's generation time.[100][101]
Vehicle positions integrate with Automatic Vehicle Location (AVL) systems to power live tracking maps in transit applications, such as displaying bus icons on Google Maps with real-time routes and estimated times based on GPS data. Occupancy data supports crowding alerts, enabling users to avoid overloaded vehicles; for instance, Google Maps visualizes levels as icons indicating low, medium, high crowding, or full capacity on Android and iOS devices. These features improve rider decision-making by combining location with capacity insights.[102][103]
Regarding precision, positions rely on GPS coordinates in the WGS-84 system, with typical AVL devices providing accuracy within 10-20 meters under urban conditions. The specification mandates no strict accuracy requirement, but agencies are encouraged to use reliable hardware for consistent updates. Bus tracking apps such as Transit, along with agency-specific tools, use vehicle positions for features like next-bus notifications, often snapping GPS points to static route shapes for smoother visualization. Research on GTFS Realtime accuracy highlights that positional errors can affect on-time performance metrics, underscoring the need for timely and precise feeds.[98][104]
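Snapping a GPS fix to a route shape can be approximated by projecting the point onto each segment of the shape polyline and keeping the nearest result. The sketch below is one simple way to do this, assuming a local flat-earth (equirectangular) approximation that is adequate over short distances; the function names and sample coordinates are hypothetical, and real applications may use more robust map-matching.

```python
import math

EARTH_RADIUS_M = 6371000.0


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Approximate meters east/north of a reference point (equirectangular)."""
    x = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def snap_to_shape(point, shape):
    """Return (lat, lon, distance_m) of the point on the polyline nearest to `point`."""
    ref_lat, ref_lon = point  # the GPS fix becomes the local origin (0, 0)
    best = None
    for (lat1, lon1), (lat2, lon2) in zip(shape, shape[1:]):
        ax, ay = to_local_xy(lat1, lon1, ref_lat, ref_lon)
        bx, by = to_local_xy(lat2, lon2, ref_lat, ref_lon)
        dx, dy = bx - ax, by - ay
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0:
            t = 0.0  # degenerate segment: both endpoints coincide
        else:
            # Project the origin onto the segment and clamp to its endpoints
            t = max(0.0, min(1.0, -(ax * dx + ay * dy) / seg_len_sq))
        px, py = ax + t * dx, ay + t * dy
        dist = math.hypot(px, py)
        if best is None or dist < best[2]:
            best = (lat1 + t * (lat2 - lat1), lon1 + t * (lon2 - lon1), dist)
    return best


# Hypothetical two-point shape running east-west, with a fix slightly north of it
shape = [(45.0, -122.0), (45.0, -121.99)]
snapped = snap_to_shape((45.0005, -121.995), shape)
print(snapped)
```

The returned distance can also serve as a sanity check: a fix that lies far from every segment may indicate a detour or a faulty reading rather than a point to snap.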
Service Alerts
Service Alerts in GTFS Realtime provide a mechanism for transit agencies to communicate disruptions and other important information affecting public transportation services, such as station closures, route detours, or accessibility issues.[105] These alerts are distinct from trip updates or vehicle positions, focusing instead on broader network impacts that may not be tied to specific vehicle movements.[10] The Alert message structure allows for targeted notifications by specifying affected entities and including descriptive text, enabling applications to display relevant warnings to users in real time.[10]
The core of a Service Alert is the informed_entity field, whose EntitySelector entries identify the affected components of the transit network using selectors for agencies, routes, route types, trips, or stops from the corresponding GTFS Schedule feed.[10] Multiple entries can be used to cover various impacts, with the fields within a single entry combined via logical AND (e.g., a specific route and stop).[105] The Alert message also carries enums for cause (e.g., UNKNOWN_CAUSE=1, ACCIDENT=6, WEATHER=8, CONSTRUCTION=10), effect (e.g., NO_SERVICE=1, DETOUR=4, ACCESSIBILITY_ISSUE=11), and severity_level (e.g., INFO=2, WARNING=3, SEVERE=4), helping applications prioritize and categorize the alert.[10]
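The selector logic can be sketched as follows, with dictionaries standing in for decoded selector entries. Fields within one entry are combined with AND as described above; treating separate entries as alternatives (logical OR) is this sketch's interpretation of "multiple entries", and the helper names and sample IDs are hypothetical.

```python
def selector_matches(selector, context):
    """True if every field set in the selector equals the context value (AND)."""
    # An empty selector names nothing, so it is treated as matching nothing
    return bool(selector) and all(
        context.get(field) == value for field, value in selector.items()
    )


def alert_applies(informed_entities, context):
    """True if at least one selector entry matches the rider's context (OR)."""
    return any(selector_matches(sel, context) for sel in informed_entities)


# Hypothetical alert: stop S10 on route 1, plus all of route 7
informed_entities = [
    {"route_id": "1", "stop_id": "S10"},
    {"route_id": "7"},
]

at_affected_stop = alert_applies(informed_entities, {"route_id": "1", "stop_id": "S10", "trip_id": "t1"})
at_other_stop = alert_applies(informed_entities, {"route_id": "1", "stop_id": "S11"})
on_route_7 = alert_applies(informed_entities, {"route_id": "7", "stop_id": "S11"})
print(at_affected_stop, at_other_stop, on_route_7)
```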
Descriptive content is delivered via text fields, including header_text for a concise summary and description_text for detailed explanations, both of type TranslatedString, which supports multiple languages through Translation sub-messages with BCP-47 language codes.[10] Additional fields include url (also a TranslatedString) for linking to external details, such as an agency webpage, and active_period TimeRange entries to define when the alert should be displayed, with multiple ranges possible for recurring issues; if active_period is omitted, the alert is shown for as long as it appears in the feed.[10]
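A consumer typically has to pick one translation and decide whether the alert is currently active. The sketch below shows one possible approach using dictionaries shaped like the JSON form of an Alert. The fallback order in `pick_translation` is an assumption made for illustration, as are the sample texts and timestamps; the time check treats a range as active from its start up to, but not including, its end, with a missing bound treated as open-ended.

```python
def pick_translation(translated_string, language):
    """Return text for `language`, else an entry with no language, else the first entry."""
    translations = translated_string["translation"]
    for entry in translations:
        if entry.get("language") == language:
            return entry["text"]
    for entry in translations:
        if "language" not in entry:
            return entry["text"]
    return translations[0]["text"]


def is_active(alert, now):
    """True if `now` (POSIX seconds) falls inside any active_period range."""
    periods = alert.get("active_period", [])
    if not periods:
        # No active_period: show the alert for as long as it is in the feed
        return True
    return any(
        p.get("start", float("-inf")) <= now < p.get("end", float("inf"))
        for p in periods
    )


# Hypothetical alert with English and Spanish header text
alert = {
    "header_text": {
        "translation": [
            {"text": "Station closed", "language": "en"},
            {"text": "Estación cerrada", "language": "es"},
        ]
    },
    "active_period": [{"start": 1000, "end": 2000}],
}

spanish = pick_translation(alert["header_text"], "es")
fallback = pick_translation(alert["header_text"], "fr")
print(spanish, fallback, is_active(alert, 1500))
```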
Common use cases for Service Alerts include notifying users of station closures due to maintenance (cause=MAINTENANCE, effect=NO_SERVICE) or line-wide detours from accidents (cause=ACCIDENT, effect=DETOUR), with multi-language translations ensuring accessibility for diverse riders.[105] For instance, an alert might target a specific stop on route "1" affected by weather (cause=WEATHER), integrating with GTFS Schedule entities to limit notifications to relevant trips.[105] In practice, these alerts power push notifications in mobile apps or on-screen displays in stations, enhancing rider awareness during disruptions.[106]