GTFS

The General Transit Feed Specification (GTFS) is a community-driven open data standard that defines a common format for public transportation schedules, routes, stops, and associated geographic information, enabling transit agencies to share static and real-time service data with software applications and riders worldwide.^[1] Originally developed in 2005 through a collaboration between TriMet in Portland, Oregon, and Google to integrate transit data into online trip planners like Google Maps, GTFS began as the Google Transit Feed Specification before being renamed in 2010 to emphasize its broader, independent adoption beyond Google's ecosystem.^[2] Its primary purpose is to facilitate accessible, interoperable transit information for applications such as route planning, mobile apps, and real-time alerts, rather than internal operational systems, and it has been licensed under the Apache 2.0 open-source license since its inception.^[1]
GTFS consists of two main components: GTFS Schedule, which packages static data in a ZIP file containing required text files (e.g., agency.txt for agency details, stops.txt for stop locations, routes.txt for route definitions, and trips.txt for scheduled trips) along with over 15 optional files for enhancements like fares, pathways between stops, and multilingual translations; and GTFS Realtime, which uses Protocol Buffers to deliver dynamic updates such as vehicle positions, trip delays, and service alerts.^[3] This structure allows for simple maintenance and validation while supporting extensions for specialized needs, such as accessibility features or fare calculations.^[4]
Since its release, GTFS has seen widespread adoption by thousands of transit agencies globally, powering tools from mapping services to analytics platforms and contributing to improved rider experiences through standardized data sharing.^[5] As of 2025, it is used by over 10,000 agencies in more than 100 countries and has become a de facto international standard, influencing policies such as the U.S. Federal Transit Administration's requirements for GTFS data in National Transit Database reporting, which supports eligibility for federal grants.^[6]^[7]

Overview

Definition and Purpose

The General Transit Feed Specification (GTFS) is an open data standard that defines a common format for public transportation schedules and associated geographic information, allowing transit agencies to publish static schedules, routes, stops, and fares in a machine-readable structure.^[5] Developed primarily to facilitate the integration of transit data into mapping and trip planning applications, GTFS enables agencies to share information seamlessly with third-party developers and tools, thereby reducing fragmentation in transit data availability and promoting interoperability across diverse software ecosystems.^[3] This format supports a wide range of transit modes, including buses, trains, subways, ferries, and paratransit services, making it adaptable to urban, suburban, and regional networks worldwide.^[8] The primary purpose of GTFS is to empower riders with accessible, accurate transit information through applications like mobile trip planners, which can aggregate data from multiple agencies to provide multimodal journey options and real-time updates when combined with extensions like GTFS Realtime.^[5] By standardizing data exchange, it lowers barriers for developers to build innovative tools for route optimization, accessibility analysis, and service visualization, ultimately enhancing the overall rider experience with reliable schedules and geographic context.^[8] GTFS data feeds are generally published under open licenses by transit agencies, such as Creative Commons Attribution, permitting reuse, modification, and distribution with attribution to the provider. 
GTFS emerged in the mid-2000s as a response to the fragmented and proprietary nature of transit data at the time, which hindered widespread adoption of digital trip planning services.^[8] Originating from a collaboration between Google and TriMet, the public transit agency in Portland, Oregon, the specification was initially designed to format TriMet's data for integration into Google Maps, addressing the need for a simple, exportable format that agencies could maintain without complex technical infrastructure.^[8] This partnership quickly established GTFS as a de facto global standard, now supporting thousands of transit providers and fostering open data initiatives in public transportation.^[5]

Components and Variants

GTFS consists of two primary components: GTFS Schedule, which provides static transit information, and GTFS Realtime, which delivers dynamic updates to complement the static data.^[1] GTFS Schedule is a static dataset distributed as a ZIP file containing one or more CSV-formatted text files that describe public transportation schedules, routes, stops, and associated details such as fares and accessibility features.^[1] It serves as the baseline for transit planning and journey routing applications by defining fixed timetables and service patterns, enabling software to ingest and process agency data in a standardized manner.^[9] At minimum, a GTFS Schedule feed includes the required files agency.txt, routes.txt, trips.txt, stops.txt, stop_times.txt, and either calendar.txt or calendar_dates.txt (or both), along with optional files for enhanced functionality.^[1] GTFS Realtime extends GTFS Schedule by providing live updates on service disruptions, using a Protocol Buffers format to encode structured data for efficiency.^[10] This component includes three main message types: TripUpdate for delays or cancellations, VehiclePosition for real-time locations, and Alert for service advisories, allowing agencies to report deviations from the static schedule.^[10] GTFS Realtime feeds reference the corresponding GTFS Schedule dataset to ensure contextual accuracy, with the current Protocol Buffers schema at version 2.0 as of 2024.^[10] Several variants and extensions build on these core components to address specialized transit needs. 
GTFS-Flex, adopted into the official GTFS specification in March 2024, extends GTFS Schedule to model demand-responsive transportation services, such as dial-a-ride or deviated routes, by adding files like booking_rules.txt and locations.geojson, along with new fields in stop_times.txt for pickup and drop-off windows.^[11] It facilitates integration with trip planners by making flexible services discoverable alongside fixed-route data.^[11] GTFS feeds also integrate with the Mobility Database schema, an open catalog maintained by MobilityData that standardizes metadata for over 4,000 transit and shared mobility datasets in GTFS format, promoting data sharing and validation.^[12] Emerging standards like the General On-Demand Feed Specification (GOFS), launched in 2025, complement GTFS by providing a dedicated format for purely on-demand services, ensuring interoperability while keeping GTFS focused on scheduled and semi-flexible transit.^[13]

History

Origins and Development

GTFS originated in 2005 as part of Google's initiative to incorporate public transit schedules into its mapping platform, addressing the fragmentation of transit data across agencies. The project began through a collaboration between Google engineers and TriMet, the transit authority in Portland, Oregon, after TriMet's IT manager Bibiana McHugh contacted Google to enable transit integration in Google Maps. This effort resulted in the creation of a simple, text-based format initially dubbed the Google Transit Feed Specification (GTFS), focused on static route and schedule information in CSV files for easy parsing and integration. The inaugural implementation launched on December 7, 2005, with Google Transit featuring only TriMet's data, marking the format's debut in a live trip-planning tool.^[14]^[15] Development proceeded iteratively, involving feedback from over 30 transit agencies, developers, and Google staff to refine the specification for reliability and usability. Early priorities centered on supporting multi-agency feeds and headway-based scheduling, culminating in the public release of the GTFS specification in September 2006 alongside expansion to five additional U.S. cities: Eugene, Honolulu, Pittsburgh, Seattle, and Tampa. By 2009, documentation efforts advanced with updates to the Google Code wiki, including removal of Google-specific submission guidelines and proposals to rename the format to the General Transit Feed Specification, reflecting its growing independence from Google's ecosystem. This collaborative process emphasized simplicity and openness, enabling agencies to export data without proprietary software.^[3]^[16] Adoption accelerated rapidly in the late 2000s, with GTFS feeds published by an estimated 261 transit agencies worldwide by 2010, covering over 250 cities and extending to international regions like Europe and parts of Asia. 
This growth was driven by the format's ease of implementation and its role in enhancing Google Maps' transit directions. To address limitations in static data, Google introduced GTFS Realtime in August 2011, an extension for dynamic updates on vehicle locations, trip delays, and service alerts, developed in partnership with agencies including TriMet, MBTA, and BART. A pivotal milestone came in 2013, when control of the specification's evolution shifted fully to the open community through dedicated discussion forums, solidifying GTFS as a de facto global standard beyond Google's proprietary use.^[17]

Governance and Evolution

In 2013, the GTFS community began formalizing under the developers.google.com platform, shifting from initial Google-led development to collaborative maintenance of the specification through a wiki and the GitHub repository at google/transit, where stakeholders could propose and discuss changes.^[18]^[19] This transition empowered transit agencies, developers, and other users to contribute to the evolving standard, fostering an open process for updates while Google retained oversight of the core repository. The establishment of gtfs.org in 2022, led by the non-profit MobilityData in collaboration with community leaders like Andrew Byrd, marked a significant step toward centralized, accessible documentation for GTFS.^[8]^[20] This platform consolidated resources previously scattered across Google Transit APIs, the GitHub repository, and other sites, providing a unified hub for specifications, best practices, and community resources to support broader adoption and maintenance. A major advancement in governance occurred in 2025, when a new framework for GTFS Schedule took effect on July 7, following a community vote conducted from June to July 2025.^[21]^[22] This update, proposed via Pull Request #544 and refined over two years of feedback, introduced structured pull request reviews for proposed changes and established working groups for GTFS Schedule and Realtime to facilitate discussions, categorization of change types (e.g., clarifications, additions), and voting processes.^[23] These mechanisms ensure collaborative decision-making, with votes occurring in GitHub Pull Requests and summaries from working group meetings integrated into comments for transparency. 
Key evolution milestones include the October 28, 2025, revision of the GTFS Schedule reference, which incorporated enhancements for fare complexities through GTFS-Fares V2 (adding files like fare_leg_rules.txt, fare_leg_join_rules.txt, and fare_transfer_rules.txt), pathways via pathways.txt for intra-station navigation, and translations with translations.txt for multilingual support.^[9]^[24] Ongoing community efforts are tracked through monthly GTFS Digests, which highlight active proposals such as semantics clarifications for fields like trip modifications and transfer rules, encouraging participation via voting and discussion.^[25]^[26]

Applications

Journey Planning and Routing

GTFS plays a central role in journey planning software by providing the static data necessary to model transit networks and compute optimal routes for users. Journey planners parse key files such as routes.txt, which defines route paths and types; trips.txt, which specifies individual vehicle runs along those routes; and stop_times.txt, which details arrival and departure times at stops to enable calculations of itineraries, transfer points, and total travel durations.^[9]^[27] This parsing allows algorithms to construct graphs where stops serve as nodes and trips as edges weighted by time, facilitating efficient shortest-path computations like Dijkstra's or more specialized transit variants that account for service calendars and frequencies.^[9]
Prominent tools leverage GTFS for integrated routing in widely used applications. Google Maps incorporates GTFS feeds to deliver transit directions, combining route and schedule data with mapping layers for seamless trip suggestions across thousands of agencies.^[28] Similarly, the open-source OpenTripPlanner (OTP) processes GTFS to perform multimodal routing, employing algorithms such as RAPTOR, a round-based search that optimizes paths using stop sequences and calendar-constrained availability. Apple Maps also utilizes GTFS-compatible transit data for planning public transport legs within broader navigation.^[29]
GTFS supports multimodal integration by allowing planners to combine transit data with pedestrian and cycling networks, enabling door-to-door routing from origin to destination. This involves augmenting GTFS-derived transit segments with walking or biking distances calculated from geographic coordinates in stops.txt and shapes.txt, while route types (such as 3 for bus or 2 for rail) help differentiate vehicle speeds and transfer rules in the overall itinerary.^[30]^[31]
In practice, GTFS powers journey planning for over 10,000 agencies in more than 100 countries, enabling apps to offer features like estimated arrival times based on schedules and filters for accessibility, such as wheelchair-accessible routes and stops.^[7] For instance, users in cities like New York or London can plan trips via Google Maps that incorporate bus, rail, and walking segments with precise transfer timings.
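The idea of treating stop_times.txt as a time-weighted network can be illustrated with a minimal sketch. It uses a simplified connection scan rather than Dijkstra's algorithm or RAPTOR, and it is not how any particular planner is implemented. It assumes rows have already been read into dicts keyed by GTFS field names, and it ignores service calendars, minimum transfer times, and walking links. The function names are hypothetical.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """One vehicle hop between consecutive stops of a trip (times in seconds)."""
    dep_stop: str
    arr_stop: str
    dep_time: int
    arr_time: int


def to_seconds(hhmmss):
    """Convert a GTFS HH:MM:SS time (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def connections_from_stop_times(rows):
    """Turn stop_times-like rows into connections sorted by departure time."""
    by_trip = {}
    for row in rows:
        by_trip.setdefault(row["trip_id"], []).append(row)
    connections = []
    for trip_rows in by_trip.values():
        trip_rows.sort(key=lambda r: int(r["stop_sequence"]))
        for here, there in zip(trip_rows, trip_rows[1:]):
            connections.append(Connection(
                here["stop_id"], there["stop_id"],
                to_seconds(here["departure_time"]),
                to_seconds(there["arrival_time"]),
            ))
    return sorted(connections, key=lambda c: c.dep_time)


def earliest_arrival(connections, origin, destination, start_time):
    """Scan connections in departure order, keeping the best arrival per stop."""
    unreachable = float("inf")
    best = {origin: start_time}
    for c in connections:
        # A connection is usable if we can be at its departure stop in time.
        if best.get(c.dep_stop, unreachable) <= c.dep_time:
            if c.arr_time < best.get(c.arr_stop, unreachable):
                best[c.arr_stop] = c.arr_time
    return best.get(destination)  # None when the destination is unreachable
```

Calling `earliest_arrival(connections_from_stop_times(rows), "A", "D", to_seconds("07:55:00"))` returns the arrival time in seconds, or `None` if no sequence of trips reaches the destination. Production planners add calendar filtering, transfer buffers, and street-network legs on top of this core loop.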

Real-Time Information and Operations

GTFS Realtime extends the static GTFS Schedule by providing dynamic updates on vehicle positions, trip modifications, and service alerts, enabling transit agencies to disseminate live information to improve operational decision-making and passenger experience. This specification, developed as an open protocol buffer format, allows agencies to feed data from Automatic Vehicle Location (AVL) systems into centralized platforms for real-time vehicle tracking, where GPS-enabled devices on buses or trains transmit location data to monitor adherence to routes and schedules.^[32] In operations, GTFS Realtime supports dynamic schedule adjustments by alerting dispatchers to deviations, such as traffic delays or mechanical issues, facilitating proactive rerouting or holding decisions to maintain service reliability. Fleet management benefits from this integration, as agencies use the data to optimize resource allocation, including deploying spare vehicles or adjusting driver assignments based on live telemetry. For instance, the New York City Metropolitan Transportation Authority (MTA) employs GTFS Realtime feeds for subway tracking, publishing vehicle positions and trip updates that enable control centers to respond to congestion in real time across its extensive network.^[33] Third-party applications and rider apps consume these feeds to deliver predicted arrival times, delay notifications, and alerts for service disruptions, enhancing user confidence and trip planning accuracy. 
Studies indicate that access to such real-time information can reduce perceived wait times by up to 20% for transit riders, as demonstrated in field experiments where mobile apps using AVL-derived data allowed users to time their arrivals more precisely, thereby minimizing idle time at stops.^[34] Key challenges in GTFS Realtime implementation include maintaining data freshness, with best practices recommending updates every 10-30 seconds for trip updates and vehicle positions to ensure timeliness, though delays beyond 90 seconds can degrade usability. Additionally, feeds must be validated against the corresponding GTFS Schedule baseline to prevent inconsistencies, such as mismatched trip IDs or route alignments, which could lead to erroneous predictions if not regularly cross-checked by agency systems.^[35]
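The two checks described above, feed freshness and consistency with the static baseline, can be sketched as follows. This is an illustrative outline only: it assumes the Protocol Buffers message has already been decoded elsewhere, so the header timestamp and trip IDs arrive as plain Python values. The function name and the 90-second default are assumptions taken from the figures in the text, and trips that an agency deliberately adds outside the schedule would need to be exempted from the second check.

```python
import time


def check_realtime_feed(header_timestamp, trip_ids_in_feed, schedule_trip_ids,
                        now=None, max_age_seconds=90):
    """Return a list of problems found in an already-decoded realtime feed.

    header_timestamp: POSIX time taken from the feed header.
    trip_ids_in_feed: trip_id values referenced by the realtime entities.
    schedule_trip_ids: trip_id values from trips.txt in the static feed.
    """
    now = time.time() if now is None else now
    problems = []

    # Freshness: flag feeds older than the tolerated age.
    age = now - header_timestamp
    if age > max_age_seconds:
        problems.append(f"feed is {age:.0f}s old (limit {max_age_seconds}s)")

    # Consistency: every referenced trip should exist in the static schedule.
    for trip_id in sorted(set(trip_ids_in_feed) - set(schedule_trip_ids)):
        problems.append(f"trip_id {trip_id!r} not found in GTFS Schedule")
    return problems
```

An empty list means neither check found a problem; a consumer might log the returned messages or fall back to scheduled times when the list is non-empty.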

Research and Accessibility Analysis

Researchers have increasingly utilized GTFS data to conduct analyses of public transit equity and accessibility, leveraging its standardized structure to quantify service disparities across urban populations. By processing GTFS feeds, studies can evaluate how transit availability intersects with socioeconomic factors, revealing patterns of underinvestment in marginalized communities. This approach has become prominent in the 2010s and 2020s, as open GTFS datasets enable scalable, reproducible research without proprietary software.^[36]
A common technique involves parsing the stop_times.txt and stops.txt files to assess service frequency, coverage gaps, and demographic impacts. The stop_times.txt file provides arrival and departure timestamps for trips at specific stops, allowing researchers to calculate headways and operational reliability; for instance, aggregating these times can identify areas with infrequent service, such as headways exceeding 30 minutes during peak hours, which correlate with reduced ridership in low-income neighborhoods. Meanwhile, stops.txt details stop locations and attributes, enabling spatial analysis of coverage; by geospatially joining this data with demographic layers, studies have shown that transit gaps disproportionately affect minority and low-income populations, with coverage deficits up to 20% higher in such areas compared to affluent ones. These analyses often integrate GTFS with external datasets like the U.S. Census Bureau's American Community Survey to map equity metrics, such as the percentage of households without access to frequent transit within a half-mile walk.^[37]^[38]^[39]
Accessibility studies employing GTFS focus on fields like wheelchair_boarding in stops.txt and pathways.txt to evaluate compliance with the Americans with Disabilities Act (ADA). The wheelchair_boarding field indicates whether wheelchair boarding is possible at a stop (0 or empty for no information, 1 for possible, 2 for not possible), allowing audits of ADA adherence; research highlights gaps in reporting full accessibility in GTFS feeds. The pathways.txt file describes intra-station connections, such as ramps or elevators between levels, which supports modeling barrier-free routes; when combined with routing algorithms, it enables computation of accessibility metrics like the number of jobs reachable within 45 minutes by wheelchair-accessible transit. These metrics reveal disparities in accessible job access for disabled users in sprawling urban areas.^[40]^[41]^[42]
Notable examples from 2020s research illustrate GTFS's role in exposing urban inequality, particularly transit deserts, which are areas with minimal service relative to demand. A 2021 study using GTFS data identified transit deserts affecting millions of residents, where low-frequency routes leave low-income households isolated from employment centers; this revealed how such gaps exacerbate poverty cycles in urban peripheries. Similarly, integration of GTFS with census data has supported equity audits, showing that Black and Hispanic communities in major metros have fewer accessible jobs via transit compared to white counterparts, informing targeted interventions.^[43]^[44]^[36]
GTFS analyses have influenced policy by providing evidence for transit investments, notably through the U.S. Federal Transit Administration's (FTA) performance metrics frameworks. The FTA incorporates GTFS-derived indicators into its National Transit Database reporting and Mobility Performance Metrics, using them to evaluate service equity and accessibility for federal funding allocations; for example, agencies must report on accessible trips and coverage to low-income areas, guiding over $10 billion in annual grants toward closing identified gaps.
This has led to policy shifts, such as prioritizing ADA-compliant expansions in under-served regions based on GTFS audits.^[45]^[46]
GTFS feeds are distributed through various registries and platforms that aggregate and provide access to publicly available datasets, facilitating open data sharing among transit agencies and developers. One prominent historical registry was TransitFeeds.com, which operated from 2013 to 2022 and hosted thousands of GTFS feeds before being archived and succeeded by the Mobility Database.^[12] The Mobility Database, launched in February 2024 by MobilityData, serves as a leading open repository containing over 4,000 transit and shared mobility feeds in GTFS and related formats from more than 70 countries, allowing users to search, download, and access metadata for feeds worldwide.^[47]
In the United States, the Federal Transit Administration (FTA) maintains a national hub through the National Transit Database (NTD), which collects and publishes GTFS weblinks from reporting agencies, enabling geospatial analysis and service coverage data for over 1,000 U.S. transit providers as part of annual reporting requirements.^[48] Additionally, the official GTFS website (gtfs.org) hosts the Canonical GTFS Schedule Validator, a free open-source tool developed by MobilityData to check feed compliance with the specification and best practices, helping agencies ensure data quality before publication.^[49]
Transit agencies typically share GTFS data by publishing a ZIP file containing the required text files at a stable, public URL on their website, such as agency.org/gtfs.zip, which allows direct access without registration or proprietary systems.^[9] To support version control and track updates, agencies include a feed_version field in the feed_info.txt file, often incrementing it as an integer or semantic version (e.g., "2.1") with each schedule change to help applications detect new data.^[50] By 2025, GTFS has seen widespread global adoption, with over 10,000 agencies in more than 100 countries publishing feeds to describe their services.^[7] Standards for updates emphasize regular refreshes to reflect timetable changes, such as seasonal adjustments for summer schedules or holidays, with many agencies updating every few months and larger ones doing so weekly to maintain accuracy.^[51]^[52]
These sharing mechanisms provide significant benefits by enabling third-party developers to build applications using open GTFS data, bypassing the need for costly proprietary APIs from individual agencies.^[29] For instance, European Mobility as a Service (MaaS) platforms like Whim and Citymapper integrate GTFS feeds from multiple operators to offer seamless multimodal trip planning across cities such as Helsinki and London.^[53]
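Once a feed has been obtained from one of these sources, the service-frequency analysis described under Research and Accessibility Analysis can be approximated in a few lines. The sketch below is a deliberately simplified illustration, not a method taken from any cited study: it assumes stop_times.txt rows are already loaded as dicts, pools all routes and service days at each stop, and uses the 30-minute threshold mentioned earlier as a default. A real analysis would first filter by service_id, route, direction, and time window. The function names are hypothetical.

```python
from collections import defaultdict


def parse_gtfs_time(value):
    """Convert HH:MM:SS to seconds; hours above 24 denote next-day service."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def headways_by_stop(stop_time_rows):
    """Map stop_id to the gaps, in minutes, between consecutive departures."""
    departures = defaultdict(list)
    for row in stop_time_rows:
        departures[row["stop_id"]].append(parse_gtfs_time(row["departure_time"]))
    result = {}
    for stop_id, times in departures.items():
        times.sort()
        result[stop_id] = [(later - earlier) / 60
                           for earlier, later in zip(times, times[1:])]
    return result


def infrequent_stops(stop_time_rows, threshold_minutes=30):
    """List stops whose largest gap between departures exceeds the threshold."""
    return sorted(
        stop_id
        for stop_id, gaps in headways_by_stop(stop_time_rows).items()
        if gaps and max(gaps) > threshold_minutes
    )
```

The resulting stop IDs can then be joined to stops.txt coordinates and overlaid on demographic layers.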

GTFS Schedule Structure

File Format and Dataset Organization

GTFS Schedule datasets are packaged as self-contained ZIP archives containing a collection of text files, each representing a specific aspect of transit information, such as agencies, routes, and stops. All files must reside at the root level of the ZIP archive, with no subdirectories permitted, ensuring a simple and portable structure. These files are formatted as comma-separated values (CSV) with a .txt extension, encoded in UTF-8 (with an optional Byte Order Mark), and the first line of each file must contain case-sensitive header names defining the fields. This format adheres to the CSV standard outlined in RFC 4180, prohibiting elements like tabs, carriage returns, new lines within fields, or HTML tags to maintain parsability.^[9] The organization of the dataset relies on unique identifiers (IDs) to establish relationships across files, enabling efficient querying and integration. For instance, the route_id field in routes.txt serves as a foreign key referenced in trips.txt to associate trips with their routes, while stop_id links stop_times.txt to stops.txt. Files are categorized as required, optional, or conditionally required based on the transit service type; mandatory files include agency.txt, routes.txt, trips.txt, and stop_times.txt, which form the core of any valid feed. Conditional requirements allow flexibility, such as making stops.txt optional when defining demand-responsive service zones via the locations.geojson file instead. This ID-based linking supports modular data management without redundancy.^[9] Validation of GTFS datasets is essential to ensure compliance with the specification, typically performed using open-source tools like the Canonical GTFS Schedule Validator maintained by MobilityData. The v7.0 release, issued in March 2025, incorporates full support for extensions like GTFS Flex and Rider Categories, flagging errors such as missing mandatory files, invalid field values, or inconsistent IDs. 
Common error types include the absence of required files like stop_times.txt or violations of conditional rules, such as referencing undefined stop_ids.^[54]^[55] GTFS feeds are designed for scalability, with typical sizes ranging from 1 MB for small agencies to 100 MB or more for large metropolitan systems; for example, the Sydney transit feed exceeds 85 MB uncompressed. The stop_times.txt file, which records arrival and departure times for each trip stop, often dominates the dataset size and can contain over 1 million entries in major networks, demonstrating the format's capacity to handle extensive schedules without performance degradation in standard processing tools.^[56]^[57]
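A few of the structural rules above (files at the archive root, presence of the core files, UTF-8 with an optional byte order mark, and ID references between files) can be checked with the Python standard library alone. The following is a minimal sketch and not a substitute for the Canonical GTFS Schedule Validator; the function names and message wording are illustrative, and only the route_id link between trips.txt and routes.txt is verified.

```python
import csv
import io
import zipfile

# Core files named in the text; conditionally required files are not checked.
REQUIRED_FILES = ("agency.txt", "routes.txt", "trips.txt", "stop_times.txt")


def read_table(archive, name):
    """Parse one GTFS text file into a list of dicts keyed by header name."""
    with archive.open(name) as raw:
        # utf-8-sig drops an optional byte order mark before the first header.
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def check_feed(zip_bytes):
    """Return a list of structural problems found in a GTFS Schedule ZIP."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        problems += [f"{n}: not at archive root" for n in names if "/" in n]
        missing = [n for n in REQUIRED_FILES if n not in names]
        problems += [f"{n}: required file missing" for n in missing]
        if missing:
            return problems

        # Foreign-key check: every trip must reference a defined route.
        route_ids = {row["route_id"] for row in read_table(archive, "routes.txt")}
        for row in read_table(archive, "trips.txt"):
            if row["route_id"] not in route_ids:
                problems.append(
                    f"trips.txt: trip {row['trip_id']} references "
                    f"unknown route_id {row['route_id']}"
                )
    return problems
```

Passing the bytes of a downloaded feed to `check_feed` returns an empty list when these particular checks pass. The same pattern extends to other links, such as stop_id between stop_times.txt and stops.txt.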

Agency and Feed Metadata

The agency.txt file in a GTFS Schedule dataset defines the transit agencies responsible for the services included in the feed, enabling support for multiple agencies within a single dataset.^[58] This file is required and uses agency_id as its primary key, with each row representing one agency.^[58] All agencies in the dataset must share the same agency_timezone to ensure consistent time handling across the feed.^[58] The file includes the following fields:

Field Name	Type	Presence	Description
`agency_id`	Unique ID	Conditionally required	Identifies a transit agency or brand; required if multiple agencies are present in the feed, and recommended otherwise for unique identification.
`agency_name`	Text	Required	The full name of the transit agency.
`agency_url`	URL	Required	A fully qualified URL pointing to the agency's website.
`agency_timezone`	Timezone	Required	The timezone for the agency, following the Olson naming convention (e.g., "America/New_York"); must be identical for all agencies.
`agency_lang`	Language code	Optional	The primary language used by the agency, in IETF BCP 47 format (e.g., "en" for English).
`agency_phone`	Phone number	Optional	A voice telephone number for customer contact with the agency.
`agency_fare_url`	URL	Optional	A fully qualified URL for purchasing tickets or accessing fare information.
`agency_email`	Email	Optional	An actively monitored email address for customer service inquiries.

These fields allow consumers of the feed, such as trip planning applications, to display accurate agency information to users.^[58] For multi-agency regions, such as metropolitan areas served by coordinated operators, this structure facilitates unified data distribution without separate feeds.^[58] The feed_info.txt file provides essential metadata about the entire GTFS dataset, including details on its publisher and validity period.^[59] It is conditionally required—if the optional translations.txt file is present, feed_info.txt must be included; otherwise, it is recommended for all feeds.^[59] The file contains only one row, with no primary key, and uses language codes in IETF BCP 47 format for textual fields.^[59] The file includes the following fields:

Field Name	Type	Presence	Description
`feed_publisher_name`	Text	Required	The full name of the organization publishing the GTFS feed.
`feed_publisher_url`	URL	Required	A fully qualified URL for the publisher's website.
`feed_lang`	Language code	Required	The default language used for text in the dataset; use "mul" for multilingual feeds with translations.
`default_lang`	Language code	Optional	The fallback language to display when the user's preferred language is unknown (e.g., "en").
`feed_start_date`	Date	Recommended	The start date of the schedule's validity period, in YYYYMMDD format.
`feed_end_date`	Date	Recommended	The end date of the schedule's validity period, in YYYYMMDD format; must not precede `feed_start_date`.
`feed_version`	Text	Recommended	A version identifier for the feed (e.g., "2025-11-01"), useful for tracking updates like seasonal changes.
`feed_contact_email`	Email	Optional	An email address for technical inquiries about the feed.
`feed_contact_url`	URL	Optional	A URL providing contact or support information for the feed.

This metadata helps feed consumers understand the scope and maintenance of the dataset, such as through feed_version for detecting changes during seasonal updates.^[59] Best practices recommend including at least one contact method (feed_contact_email or feed_contact_url) to address data quality issues reported by users, and specifying feed_start_date and feed_end_date to clarify the temporal coverage of the schedule.^[59]
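The constraints in the two tables above lend themselves to simple automated checks. The sketch below assumes agency.txt and feed_info.txt have already been parsed into dicts keyed by field name; the function names are hypothetical, and the version comparison is just one plausible way a consumer might detect a new feed.

```python
from datetime import datetime


def parse_gtfs_date(value):
    """Parse a GTFS date in YYYYMMDD format."""
    return datetime.strptime(value, "%Y%m%d").date()


def check_metadata(agency_rows, feed_info_row):
    """Return problems with agency timezones, agency IDs, and feed dates."""
    problems = []

    # All agencies in one feed must share a single timezone.
    timezones = {row["agency_timezone"] for row in agency_rows}
    if len(timezones) > 1:
        problems.append(f"agencies use different timezones: {sorted(timezones)}")

    # agency_id becomes required once more than one agency is listed.
    if len(agency_rows) > 1 and any(not row.get("agency_id") for row in agency_rows):
        problems.append("agency_id is required when the feed has multiple agencies")

    # feed_end_date must not precede feed_start_date when both are given.
    start = feed_info_row.get("feed_start_date")
    end = feed_info_row.get("feed_end_date")
    if start and end and parse_gtfs_date(end) < parse_gtfs_date(start):
        problems.append("feed_end_date precedes feed_start_date")
    return problems


def feed_changed(feed_info_row, last_seen_version):
    """True when feed_version differs from the version processed previously."""
    return feed_info_row.get("feed_version") != last_seen_version
```

Because feed_version is free text, the comparison only tests for inequality and does not try to order versions.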

Route and Trip Definitions

In the GTFS Schedule dataset, routes are defined in the routes.txt file, which provides essential metadata for transit lines operated by agencies. Each route is identified by a unique route_id, serving as the primary key that links to other files in the feed. The agency_id field establishes a connection to the agency defined in agency.txt, allowing routes to be associated with specific operators, particularly in multi-agency feeds. Public-facing identifiers include route_short_name for abbreviated labels, such as "32", and route_long_name for descriptive titles like "32nd St. Crosstown", with at least one required for display purposes.^[9]
The route_type field categorizes the mode of transportation using an enumerated system aligned with common transit standards, where values such as 0 indicate tram/streetcar/light rail, 1 for subway/metro, 2 for rail, 3 for bus (covering short- and long-distance services), 4 for ferry, and additional types like 11 for trolleybus. Visual elements are supported via route_color (a hexadecimal code, e.g., "00FF00" for green) and route_text_color for contrasting text. For routes with flexible service patterns, continuous_pickup and continuous_drop_off specify boarding and alighting rules along the path: 0 for continuous stopping, 1 or empty for no continuous stopping, 2 when riders must phone the agency to arrange it, or 3 when riders must coordinate with the driver; these fields are conditionally forbidden when pickup and drop-off windows (start_pickup_drop_off_window or end_pickup_drop_off_window) are defined in stop_times.txt. As of updates in 2025, the network_id field in routes.txt, which is used to group routes, has been made conditionally forbidden when networks.txt or route_networks.txt files are present, promoting more structured network modeling.^[9]
Trips, representing individual instances of service along a route, are detailed in the trips.txt file, which establishes a one-to-many relationship with routes through the shared route_id field. Each trip has a unique trip_id as its primary key and references a service_id from calendar.txt or calendar_dates.txt to indicate operational dates. Variants within a route, such as express versus local services, can be distinguished using fields like trip_headsign for destination signage (e.g., "Downtown via Express") or direction_id (0 for outbound, 1 for inbound, though conventions vary by agency). Additional attributes include shape_id linking to path geometry in shapes.txt, and accessibility indicators: wheelchair_accessible (0 or empty for no information, 1 when the vehicle can accommodate at least one rider in a wheelchair, 2 when no riders in wheelchairs can be accommodated) and bikes_allowed (0 for no information, 1 for allowed, 2 for not allowed). This structure enables modeling of diverse trip patterns under a single route, such as peak-hour variants or seasonal adjustments, without duplicating route metadata.^[9]

Field in routes.txt	Type	Presence	Description
route_id	Unique ID	Required	Unique identifier for the route.
agency_id	Foreign ID	Conditionally Required	References the operating agency.
route_short_name	Text	Conditionally Required	Short public name (e.g., "3").
route_long_name	Text	Conditionally Required	Full public name (e.g., "Broadway Local").
route_type	Enum	Required	Transportation mode (e.g., 3=Bus).
route_color	Color (Hex)	Optional	Color for route visualization.
continuous_pickup	Enum (0-3)	Conditionally Forbidden	Rules for ongoing boarding.
continuous_drop_off	Enum (0-3)	Conditionally Forbidden	Rules for ongoing alighting.
network_id	ID	Conditionally Forbidden	Route grouping (deprecated in favor of networks files post-2025).

Field in trips.txt	Type	Presence	Description
route_id	Foreign ID	Required	Links to the parent route.
service_id	Foreign ID	Required	Defines service days.
trip_id	Unique ID	Required	Unique trip identifier.
trip_headsign	Text	Optional	Destination display text.
direction_id	Enum (0/1)	Optional	Travel direction.
shape_id	Foreign ID	Optional	Path shape reference.
wheelchair_accessible	Enum (0-2)	Optional	Accessibility level.
bikes_allowed	Enum (0-2)	Optional	Bike policy.
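
The one-to-many link between routes.txt and trips.txt can be illustrated with a minimal sketch. The table contents, the helper names read_table and trips_by_route, and all identifiers such as R32 and T1 are hypothetical; a real feed would be read from the files inside the ZIP archive.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature tables; a real feed ships these as files inside a ZIP.
ROUTES_TXT = """route_id,agency_id,route_short_name,route_long_name,route_type
R32,A1,32,32nd St. Crosstown,3
"""

TRIPS_TXT = """route_id,service_id,trip_id,trip_headsign,direction_id
R32,WKDY,T1,Downtown,0
R32,WKDY,T2,Uptown,1
R32,SAT,T3,Downtown,0
"""


def read_table(text):
    """Parse GTFS CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def trips_by_route(routes, trips):
    """Group trip_id values under their parent route_id, rejecting dangling references."""
    known_routes = {route["route_id"] for route in routes}
    grouped = defaultdict(list)
    for trip in trips:
        if trip["route_id"] not in known_routes:
            raise ValueError(
                f"trip {trip['trip_id']} references unknown route {trip['route_id']}"
            )
        grouped[trip["route_id"]].append(trip["trip_id"])
    return dict(grouped)


print(trips_by_route(read_table(ROUTES_TXT), read_table(TRIPS_TXT)))
# prints {'R32': ['T1', 'T2', 'T3']}
```

Route metadata is stored once, while each trip row carries only the foreign key, which is why variants such as T1 and T2 can differ in headsign and direction without repeating the route definition.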

Stop Locations and Timetables

The stops.txt file in a GTFS dataset defines the physical locations of transit stops, stations, and related points, providing essential geographic and descriptive data for transit networks. Each entry includes a unique stop_id to identify the location, a stop_name that matches the agency's rider-facing nomenclature (such as on timetables or signage), and geographic coordinates via stop_lat and stop_lon in WGS84 decimal degrees, with a recommended precision of six decimal places (approximately 0.11 meters accuracy).^[60] These coordinates pinpoint the boarding point, such as a bus pole or platform edge, rather than the roadway or track, ensuring relevance for passenger navigation.^[60] The location_type field categorizes entries as 0 for a stop or platform, 1 for a station (a grouping of multiple stops), or other values like 2 for entrances/exits, allowing hierarchical organization.^[60] For instance, a multi-platform station can be represented as a parent entry (location_type=1) with child stops (location_type=0) linked via the parent_station field, which references the station's stop_id; this enables modeling complex sites like rail hubs with distinct platforms identified by an optional platform_code (e.g., "A1" or "Track 3").^[60] Additionally, the wheelchair_boarding field indicates accessibility: 0 for no information, 1 for possible (or accessible path from parent), or 2 for not possible, supporting inclusive routing applications.^[60] An optional zone_id can tag stops for fare zones, linking to fare rules without affecting core location data.^[60]

The stop_times.txt file specifies scheduled arrival and departure timings for each stop along a trip, linking back to trip definitions via the required trip_id field.^[61] Timings use the arrival_time and departure_time fields in HH:MM:SS format, expressed as local time in the agency's timezone for the service day, and support values exceeding 24:00:00 for trips that run past midnight (e.g., 25:30:00 for 1:30 AM on the day after the service day begins).^[61] The stop_sequence field orders stops along a trip; values must increase but need not be consecutive (e.g., 1, 3, 5). An optional stop_headsign provides destination signage that may vary per stop, overriding the trip-level headsign if needed.^[61] Pickup and drop-off rules are governed by the pickup_type and drop_off_type enums: 0 (or empty) for regular service, 1 for none, 2 to arrange by phoning the agency, or 3 to coordinate with the driver, allowing restrictions like request-stop operations.^[61] The timepoint field distinguishes exact times (1 or empty) from approximate or interpolated times (0), which matters when vehicles adhere strictly to the schedule only at selected major stops and the times in between are estimates.^[61] For exact timepoints, both arrival and departure times must be specified; approximate ones leave room for variable running times between those points.^[61]
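
Because times may exceed 24:00:00, they cannot be parsed with ordinary clock-time parsers. The sketch below shows one way to handle them; the function names are hypothetical, and the conversion to a calendar datetime simply counts from midnight of the service day, a simplification that ignores daylight-saving transitions.

```python
from datetime import date, datetime, time, timedelta


def gtfs_time_to_seconds(value):
    """Convert a GTFS HH:MM:SS string (hours may be 24 or more) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid GTFS time: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


def gtfs_time_to_datetime(service_date, value):
    """Resolve a GTFS time against a service date (assumption: offset from midnight)."""
    midnight = datetime.combine(service_date, time.min)
    return midnight + timedelta(seconds=gtfs_time_to_seconds(value))


print(gtfs_time_to_seconds("25:30:00"))                       # prints 91800
print(gtfs_time_to_datetime(date(2022, 6, 28), "25:30:00"))  # prints 2022-06-29 01:30:00
```

Keeping times as seconds since the start of the service day preserves their ordering within a trip, which a wall-clock representation would lose once a trip crosses midnight.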

Service Calendars and Exceptions

The service calendars in GTFS Schedule are defined primarily through the calendar.txt file, which specifies recurring patterns of service availability for trips on a weekly basis within a defined date range.^[62] This file groups trips that operate on the same days of the week, using boolean indicators for each weekday to denote whether service is active (1) or inactive (0).^[62] The key fields include service_id (a unique identifier linking to trips in trips.txt), monday through sunday (required enum fields for daily availability), start_date (the first valid service date in YYYYMMDD format), and end_date (the last valid service date in YYYYMMDD format, which must be on or after start_date).^[62] For example, a weekday commuter service might set monday to 1, tuesday to 1, and so on through friday to 1, with saturday and sunday set to 0, applying this pattern from a start date to an end date spanning several months.^[62] To handle irregularities such as holidays, special events, or seasonal adjustments that deviate from the weekly pattern, the calendar_dates.txt file provides date-specific exceptions.^[63] This file references the same service_id from calendar.txt (or defines standalone services if calendar.txt is absent) and includes date (a specific date in YYYYMMDD format) and exception_type (an enum where 1 adds service on that date and 2 removes it).^[63] Each combination of service_id and date must be unique, allowing producers to override the default calendar—for instance, removing service on a holiday (exception_type 2) from a weekday pattern or adding extra service on an event day (exception_type 1).^[63] This approach enables precise control over service validity without altering the core weekly definitions. 
GTFS requires at least one of calendar.txt or calendar_dates.txt to be present in a feed, ensuring all service dates are explicitly defined; if calendar.txt is omitted, calendar_dates.txt must list every active date comprehensively.^[62]^[63] For holidays or events, a common practice is to remove the date from the regular service pattern using exception_type 2 in calendar_dates.txt and, if applicable, add it to an alternative pattern (e.g., a reduced holiday schedule) using exception_type 1.^[64] To promote feed simplicity and consumer efficiency, best practices recommend minimizing the use of exceptions in calendar_dates.txt by favoring broad weekly patterns in calendar.txt where possible, while reserving exceptions for true deviations.^[9] The service_id serves as the linkage mechanism, associating these calendar definitions directly with individual trips in trips.txt to determine operational validity on any given date.^[62]

Field in calendar.txt	Type	Presence	Description
service_id	Unique ID	Required	Identifies the service pattern, referenced by `trips.txt`.
monday	0 or 1	Required	1 if service operates on Mondays; 0 otherwise.
tuesday	0 or 1	Required	1 if service operates on Tuesdays; 0 otherwise.
wednesday	0 or 1	Required	1 if service operates on Wednesdays; 0 otherwise.
thursday	0 or 1	Required	1 if service operates on Thursdays; 0 otherwise.
friday	0 or 1	Required	1 if service operates on Fridays; 0 otherwise.
saturday	0 or 1	Required	1 if service operates on Saturdays; 0 otherwise.
sunday	0 or 1	Required	1 if service operates on Sundays; 0 otherwise.
start_date	Text (YYYYMMDD)	Required	Start date of the service period.
end_date	Text (YYYYMMDD)	Required	End date of the service period.

Field in calendar_dates.txt	Type	Presence	Description
service_id	ID	Required	Identifies the service for the exception, matching `calendar.txt` or standalone.
date	Text (YYYYMMDD)	Required	The specific exception date.
exception_type	1 or 2	Required	1 adds service on this date; 2 removes it.
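
The interaction between the weekly pattern and date-specific exceptions can be sketched as a lookup in which an exception, when present, takes precedence over calendar.txt. The rows below are hypothetical sample data and is_service_active is an illustrative helper, not part of any GTFS library.

```python
from datetime import date

WEEKDAY_FIELDS = ["monday", "tuesday", "wednesday", "thursday",
                  "friday", "saturday", "sunday"]

# Hypothetical rows, as csv.DictReader would yield them (all values are strings).
CALENDAR = [{
    "service_id": "WKDY", "monday": "1", "tuesday": "1", "wednesday": "1",
    "thursday": "1", "friday": "1", "saturday": "0", "sunday": "0",
    "start_date": "20250101", "end_date": "20251231",
}]
CALENDAR_DATES = [
    {"service_id": "WKDY", "date": "20250704", "exception_type": "2"},  # removed
    {"service_id": "HOL", "date": "20250704", "exception_type": "1"},   # added
]


def is_service_active(service_id, day, calendar, calendar_dates):
    """Return True if the service runs on `day`; exceptions override the weekly pattern."""
    ymd = day.strftime("%Y%m%d")
    for row in calendar_dates:
        if row["service_id"] == service_id and row["date"] == ymd:
            return row["exception_type"] == "1"
    for row in calendar:
        in_range = row["start_date"] <= ymd <= row["end_date"]  # YYYYMMDD sorts as text
        if row["service_id"] == service_id and in_range:
            return row[WEEKDAY_FIELDS[day.weekday()]] == "1"
    return False


print(is_service_active("WKDY", date(2025, 7, 7), CALENDAR, CALENDAR_DATES))  # prints True
print(is_service_active("WKDY", date(2025, 7, 4), CALENDAR, CALENDAR_DATES))  # prints False
print(is_service_active("HOL", date(2025, 7, 4), CALENDAR, CALENDAR_DATES))   # prints True
```

The HOL service has no calendar.txt row at all, which mirrors the standalone use of calendar_dates.txt described above.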

Path Shapes and Frequencies

In GTFS Schedule datasets, the shapes.txt file provides a mechanism to define the geographic paths followed by transit vehicles along routes, enabling more accurate visual representations in mapping and routing applications. This optional file consists of sequences of latitude and longitude coordinates that form polylines approximating the actual travel path, which may deviate from straight lines between stops to reflect road alignments or other constraints. Each shape is identified by a unique shape_id and is linked to specific trips via the trips.txt file, allowing multiple trips to share the same shape for efficiency.^[65] The structure of shapes.txt includes the following fields:

Field Name	Type	Presence	Description
`shape_id`	Unique ID	Required	Identifier for the shape, used to associate it with trips.
`shape_pt_lat`	Latitude	Required	Latitude coordinate of a point along the shape.
`shape_pt_lon`	Longitude	Required	Longitude coordinate of a point along the shape.
`shape_pt_sequence`	Non-negative integer	Required	Index defining the order of points in the shape, with values increasing from start to end (not necessarily consecutive).
`shape_dist_traveled`	Non-negative float	Optional	Actual distance traveled along the shape from the first point to the current one, in distance units consistent with other GTFS files; recommended for routes with loops or inline stops to improve position interpolation accuracy.

These points do not need to coincide exactly with stop locations but should be positioned close to the vehicle's expected path for effective rendering. When combined with stop_times.txt, shape points allow interpolation of vehicle positions between stops, enhancing trip visualization in journey planning tools.^[65]

The frequencies.txt file complements path shapes by supporting headway-based scheduling for services where trips operate at regular intervals rather than fixed times, particularly useful for high-frequency routes like buses or subways. This optional file specifies recurring trip patterns within defined time windows, so that a single entry in trips.txt can stand for many departures instead of each one being enumerated individually. It applies to trips that may already reference shapes, adding temporal regularity to the spatial definitions.^[66] Key fields in frequencies.txt are:

Field Name	Type	Presence	Description
`trip_id`	Foreign ID → trips.txt	Required	Identifier of the trip pattern to which the frequency applies.
`start_time`	Time (HH:MM:SS)	Required	Start time of the frequency period, relative to the service day, when the first vehicle departs the first stop.
`end_time`	Time (HH:MM:SS)	Required	End time of the frequency period, when the headway changes or service stops.
`headway_secs`	Positive integer	Required	Interval in seconds between consecutive vehicle departures during the period; periods must not overlap.
`exact_times`	Enum (0 or 1)	Optional	Indicates service type: 0 for approximate headway-based scheduling (vehicles aim for intervals but may vary slightly); 1 for exact schedule adherence with consistent headways (default assumes 0 if omitted).

For headway-based services (exact_times=0), this enables modeling of non-scheduled operations where arrival predictions rely on intervals rather than precise timetables. In visual routing, frequencies integrate with shapes to depict service density along paths, aiding users in understanding wait times without detailed stop-by-stop data. The optional shape_dist_traveled field in shapes.txt further refines distance calculations for such interpolated positions, promoting consistency in applications handling both spatial and temporal aspects of transit.^[66]
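
For schedule-based entries (exact_times=1), a frequency row can be expanded into individual departure times from the first stop. The following is a minimal sketch with hypothetical helper names; it assumes end_time is exclusive, meaning no departure is generated at or after it.

```python
def to_seconds(value):
    """Convert HH:MM:SS (hours may exceed 23) to seconds since the service day began."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hhmmss(total):
    """Format seconds as HH:MM:SS, keeping hours of 24 or more as GTFS does."""
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def expand_frequency(row):
    """List first-stop departure times implied by one frequencies.txt row."""
    start = to_seconds(row["start_time"])
    end = to_seconds(row["end_time"])
    headway = int(row["headway_secs"])
    if headway <= 0:
        raise ValueError("headway_secs must be a positive integer")
    # Assumption: a departure exactly at end_time belongs to the next period.
    return [to_hhmmss(t) for t in range(start, end, headway)]


row = {"trip_id": "T1", "start_time": "06:00:00",
       "end_time": "07:00:00", "headway_secs": "600"}
print(expand_frequency(row))
# prints ['06:00:00', '06:10:00', '06:20:00', '06:30:00', '06:40:00', '06:50:00']
```

For exact_times=0 the same arithmetic gives only an expected interval between vehicles, so a consumer would typically present "every 10 minutes" rather than a list of times.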

Transfers and Fare Rules

The transfers.txt file in GTFS Schedule defines rules for passenger transfers between stops, trips, or routes, enabling transit planning software to model seamless connections.^[67] It is an optional file with a primary key consisting of from_stop_id, to_stop_id, from_trip_id, to_trip_id, from_route_id, and to_route_id.^[67] The from_stop_id field, a foreign key referencing stops.stop_id, identifies the starting point of a transfer and is conditionally required for transfer types 1 through 3, while optional for types 4 and 5.^[67] Similarly, to_stop_id references the ending stop and follows the same conditional requirement.^[67] The transfer_type field, which is required, specifies the nature of the transfer using enumerated values: 0 (or empty) for a recommended transfer point, 1 for a timed transfer in which the departing vehicle is expected to wait for the arriving one, 2 for transfers requiring a minimum amount of time between arrival and departure, 3 for transfers that are not possible, 4 for in-seat transfers in which riders stay on the same vehicle as it continues onto another trip, and 5 for sequential trips where in-seat transfer is not allowed and riders must alight and re-board.^[67] An optional min_transfer_time field provides the minimum duration in seconds needed for the transfer, such as time for walking between stops or platform changes.^[67] More specific transfer rules, such as those linking particular trips, override general stop-to-stop rules to ensure accurate routing.^[67] For instance, in a multi-modal system, a type 2 transfer with 300 seconds minimum time might connect a bus stop to a nearby train platform, accounting for pedestrian access.^[67]

Fare information in GTFS is handled through the optional fare_attributes.txt and fare_rules.txt files, forming the basis of GTFS-Fares V1 for modeling pricing structures.^[68] The fare_attributes.txt file, with fare_id as its primary key, defines core fare properties.^[68] The required price field is a non-negative float representing the cost in units of the specified currency_type, which uses ISO 4217 codes such as USD or EUR.^[68] The payment_method field, also required, indicates timing: 0 for payment on board and 1 for payment before boarding.^[68] The transfers field specifies how many transfers the fare permits: 0 for none, 1 for one, 2 for two, or left empty for unlimited.^[68] An optional transfer_duration sets the validity period in seconds for those transfers, while agency_id conditionally links to a specific agency if the feed includes multiple providers.^[68] The fare_rules.txt file links fares to specific journeys via the required fare_id foreign key and optional fields: route_id (referencing routes.route_id) and origin_id, destination_id, and contains_id, the latter three referencing stops.zone_id for zonal pricing.^[69] This allows fares to apply based on routes, origin-destination pairs, or zones traversed, enabling complex tariffs such as distance-based or area-specific charges.^[69] For example, a fare rule might assign a higher price for trips originating in zone A and ending in zone C, regardless of the route taken, provided the journey contains zone B.^[69]

As of the October 28, 2025 revision, GTFS-Fares V2 introduces expansions like fare_leg_rules.txt to better support multi-leg journeys, where fares accumulate across transfers or segments, and enhanced zone-based pricing for more granular modeling of regional systems.^[70] These updates address limitations in V1 by allowing rules for intermediate legs and variable pricing per transfer, improving accuracy for integrated ticketing in large networks.^[68]
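
How a trip planner might apply transfer_type and min_transfer_time to a candidate connection can be sketched as follows. The function name and the rule dictionary are hypothetical, only types 0 through 3 are handled, and treating a timed transfer (type 1) as always feasible once the rider arrives before departure is a simplifying assumption.

```python
def transfer_is_feasible(rule, arrival_secs, departure_secs):
    """Decide whether a rider arriving at arrival_secs can board at departure_secs.

    `rule` is one transfers.txt row as a dict of strings; times are seconds
    since the start of the service day.
    """
    transfer_type = rule.get("transfer_type") or "0"  # empty is treated as 0
    if transfer_type == "3":
        return False  # transfers are not possible between these locations
    if departure_secs < arrival_secs:
        return False  # the onward vehicle has already left
    if transfer_type == "2":
        required = int(rule.get("min_transfer_time") or 0)
        return departure_secs - arrival_secs >= required
    return True  # types 0 and 1: no minimum time is enforced in this sketch


# Hypothetical rule: a bus stop linked to a nearby train platform.
rule = {"from_stop_id": "BUS_12", "to_stop_id": "RAIL_P1",
        "transfer_type": "2", "min_transfer_time": "300"}
arrival = 8 * 3600  # 08:00:00
print(transfer_is_feasible(rule, arrival, arrival + 240))  # prints False
print(transfer_is_feasible(rule, arrival, arrival + 300))  # prints True
```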

Accessibility Features and Pathways

The General Transit Feed Specification (GTFS) incorporates accessibility features through dedicated files that model the physical layout of transit stations, enabling trip planners to provide navigation guidance compliant with standards like the Americans with Disabilities Act (ADA).^[9] These elements focus on multi-level stations and internal pathways, allowing users to assess barriers such as stairs or slopes before traveling. By integrating with stop and trip attributes, GTFS facilitates end-to-end accessible routing, prioritizing details like elevator availability and pathway widths for wheelchair users.^[71] The levels.txt file defines the vertical structure of stations, essential for navigating multi-level facilities like underground or elevated transit hubs. It is conditionally required when pathways include elevators (pathway_mode=5 in pathways.txt), ensuring accurate modeling of elevation changes.^[72] The file uses a primary key of level_id to uniquely identify each level within a station.^[72]

Field in levels.txt	Type	Presence	Description
level_id	Unique ID	Required	Identifies a level within a station; must be unique across the feed.
level_index	Float	Required	Numeric index for vertical ordering: 0 for ground level, positive values for levels above ground, and negative for below; higher indices indicate upper positions.
level_name	Text	Optional	Human-readable name displayed to riders, such as "Mezzanine" or "Platform Level."

Levels link to stops via the level_id field in stops.txt, providing context for pathways that span elevations, such as elevators connecting a concourse (level_index=0) to a platform (level_index=1).^[73] The pathways.txt file models connections between station locations as a graph, detailing modes of traversal to support precise navigation and accessibility assessments. It is optional but recommended for complex stations to represent elements like escalators and elevators, which are critical for ADA compliance by indicating barrier-free options.^[74] Each pathway is identified by a unique pathway_id and connects from_stop_id to to_stop_id, both referencing entries in stops.txt (e.g., entrances, platforms, or intermediate nodes like landings).^[74] The pathway_mode field specifies the type, with values 4 for escalators (often unidirectional) and 5 for elevators (typically bidirectional and key for wheelchair access).^[71] Additional attributes quantify physical demands, such as length in meters for horizontal distance, traversal_time in seconds for expected duration (excluding waits), stair_count for the number of steps in stairs (positive for ascent, negative for descent), and max_slope as a ratio (e.g., 0.083 for an 8.3% incline, where values below 0.083 often denote wheelchair accessibility).^[74]

Field in pathways.txt	Type	Presence	Description
pathway_id	Unique ID	Required	Identifies the pathway; must be unique.
from_stop_id	ID (from stops.txt)	Required	Starting location ID.
to_stop_id	ID (from stops.txt)	Required	Ending location ID.
pathway_mode	Enum (1-7)	Required	Traversal type: 1=walkway, 2=stairs, 3=moving sidewalk, 4=escalator, 5=elevator, 6=fare gate, 7=exit gate.
is_bidirectional	0 or 1	Required	0 for unidirectional (e.g., escalators), 1 for bidirectional.
length	Non-negative float	Optional	Pathway length in meters.
traversal_time	Non-negative integer	Optional	Average traversal time in seconds.
stair_count	Integer	Optional	Number of stairs (for mode=2); sign indicates direction.
max_slope	Non-negative float	Optional	Maximum slope ratio (e.g., 0.083).
min_width	Non-negative float	Optional	Minimum clear width in meters (e.g., 1.5 for accessible paths).
signposted_as	Text	Optional	Signage text for directions (e.g., "Follow signs to Platform 3").
reversed_signposted_as	Text	Optional	Signage text shown to riders traversing the pathway in the reverse direction (from to_stop_id to from_stop_id).

Fields like min_width and reversed_signposted_as enhance modeling of accessible and bidirectional paths, such as narrow corridors or passages that are signed differently in each direction.^[74] The signposted_as field provides directional cues, like "To Street Level via Elevator," aiding visual navigation.^[74] These files integrate with accessibility flags in stops.txt (via wheelchair_boarding, where 1 indicates that wheelchair boarding is possible) and trips.txt (via wheelchair_accessible), allowing planners to filter pathways by mode, for example prioritizing elevators (mode=5) between levels for wheelchair users while excluding stairs (mode=2).^[71] This conditional use of levels ensures elevators are only modeled with defined vertical ordering, creating comprehensive accessible routes from station entry to boarding.^[72]
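
Treating pathways.txt as a directed graph makes step-free routing a matter of filtering edges by pathway_mode before searching. The station layout, stop identifiers, and function names below are hypothetical, and the search is a plain breadth-first search that ignores traversal_time, so it finds a path with the fewest pathways rather than the fastest one.

```python
from collections import deque

STAIRS, ESCALATOR = "2", "4"

# Hypothetical station: entrance E1, concourse CONC, elevator landing LIFT, platform PLAT.
PATHWAYS = [
    {"pathway_id": "p1", "from_stop_id": "E1", "to_stop_id": "CONC",
     "pathway_mode": "1", "is_bidirectional": "1"},
    {"pathway_id": "p2", "from_stop_id": "CONC", "to_stop_id": "PLAT",
     "pathway_mode": "2", "is_bidirectional": "1"},
    {"pathway_id": "p3", "from_stop_id": "CONC", "to_stop_id": "LIFT",
     "pathway_mode": "1", "is_bidirectional": "1"},
    {"pathway_id": "p4", "from_stop_id": "LIFT", "to_stop_id": "PLAT",
     "pathway_mode": "5", "is_bidirectional": "1"},
]


def build_graph(pathways, excluded_modes):
    """Build an adjacency list, honoring is_bidirectional and skipping excluded modes."""
    graph = {}
    for pathway in pathways:
        if pathway["pathway_mode"] in excluded_modes:
            continue
        graph.setdefault(pathway["from_stop_id"], []).append(pathway["to_stop_id"])
        if pathway["is_bidirectional"] == "1":
            graph.setdefault(pathway["to_stop_id"], []).append(pathway["from_stop_id"])
    return graph


def find_path(pathways, origin, destination, excluded_modes=frozenset()):
    """Breadth-first search; returns a list of stop_ids or None if unreachable."""
    graph = build_graph(pathways, excluded_modes)
    queue, seen = deque([[origin]]), {origin}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path(PATHWAYS, "E1", "PLAT"))
# prints ['E1', 'CONC', 'PLAT']
print(find_path(PATHWAYS, "E1", "PLAT", excluded_modes={STAIRS, ESCALATOR}))
# prints ['E1', 'CONC', 'LIFT', 'PLAT']
```

A fuller implementation could also reject pathways whose max_slope or min_width falls outside a rider's stated limits, using the same edge-filtering step.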

Translations, Attributions, and Extensions

The translations.txt file enables multilingual support in GTFS datasets by providing translations for customer-facing text fields, such as stop names or route descriptions, from the default language specified in feed_info.txt.^[75] This optional file includes fields like table_name (identifying the GTFS table, e.g., stops), field_name (the translatable field, e.g., stop_name), language (an IETF BCP 47 language code, e.g., nl for Dutch), and translation (the translated text).^[75] Additional conditional fields, such as record_id (to target a specific record) or field_value (to match a particular value for translation), allow precise mappings.^[75] For instance, in a feed with French as the default language, translations.txt can translate a stop name like "Bruxelles-Ouest" to "Brussel-West" for Dutch users, enhancing accessibility for international riders.^[76]

The attributions.txt file credits organizations involved in producing or operating the transit data, particularly useful in aggregated or third-party datasets.^[77] As an optional component, it features fields including attribution_id (a unique identifier), organization_name (e.g., "Rejseplanen"), is_producer (1 if the organization produced the data), is_operator (1 if it operates services), and is_authority (1 if it is the regulating authority), with at least one role flag required.^[77] Optional fields like attribution_url, attribution_email, or attribution_phone provide contact details, and linkages via agency_id, route_id, or trip_id scope attributions to specific elements or the entire feed.^[77] In practice, this file attributes data aggregators, such as crediting Rejseplanen for Danish transit data with a URL to their site.^[78]

Extensions in GTFS address advanced use cases, including on-demand and flexible services. The booking_rules.txt file, part of the GTFS-Flex subset, specifies reservation requirements for demand-responsive transportation, such as dial-a-ride or route deviation services.^[79] Key fields include booking_rule_id (unique identifier), booking_type (0 for real-time booking, 1 for same-day booking with advance notice, 2 for booking up to prior days), and prior_notice_duration variants like prior_notice_duration_min (minimum minutes in advance, e.g., 60) or prior_notice_start_day (earliest advance days, e.g., 14).^[79] Supporting fields such as message, phone_number, info_url, and booking_url guide riders on how to book, with examples like Heartland Express requiring bookings 1 to 14 days ahead between 8 AM and 3 PM on weekdays.^[80] GTFS-Flex, officially adopted into the core specification in March 2024, facilitates discoverability of these flexible services by modifying files like stop_times.txt to include booking rule references and supporting point-to-zone or checkpoint deviations.^[11]

The locations.geojson file introduces GeoJSON-formatted zones for pickup and drop-off in on-demand services, diverging from traditional CSV to represent polygonal areas per RFC 7946.^[81] It requires a FeatureCollection structure with features containing id (a unique identifier that shares its namespace with stops.stop_id and location group identifiers), properties (e.g., stop_name), and geometry as Polygon or MultiPolygon with coordinate arrays.^[81] This enables modeling service areas where riders request stops anywhere within boundaries, integrated via stop_times.txt for trips.^[11] Complementing this, areas.txt defines logical groupings like fare zones with area_id and area_name, while stop_areas.txt assigns stop_id values to these areas, allowing aggregated representations without physical pathways.^[82]^[83]

In 2025, GTFS-Fares v2 introduced rider_categories.txt to model eligibility groups for fares, such as students or elderly riders, adopted in February via community proposal.^[84] This optional file includes rider_category_id, rider_category_name (e.g., "Senior"), is_default_fare_category (1 if applicable to all), and eligibility_url for details.^[85] It links to fare_products.txt, which describes purchasable tickets with fields like fare_product_id, fare_product_name, amount (cost), currency (ISO 4217 code), and rider_category_id to restrict eligibility.^[86] An empty rider_category_id in fare_products.txt indicates universal access, supporting complex fare structures in flexible services.^[86]
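
The two targeting methods in translations.txt, by record_id or by field_value, can be sketched as a lookup that falls back to the original text when no translation exists. The rows, stop identifier S1, and the translate function are hypothetical; preferring a record_id match over a field_value match is the precedence this sketch assumes.

```python
# Hypothetical rows, as csv.DictReader would yield them.
TRANSLATIONS = [
    {"table_name": "stops", "field_name": "stop_name", "language": "nl",
     "translation": "Brussel-West", "record_id": "", "field_value": "Bruxelles-Ouest"},
    {"table_name": "stops", "field_name": "stop_name", "language": "nl",
     "translation": "Brussel-West (perron 2)", "record_id": "S1", "field_value": ""},
]


def translate(rows, table_name, field_name, language, record_id, original):
    """Return the translated text, preferring a record_id match over a field_value match."""
    by_value = None
    for row in rows:
        key = (row["table_name"], row["field_name"], row["language"])
        if key != (table_name, field_name, language):
            continue
        if record_id and row.get("record_id") == record_id:
            return row["translation"]
        if row.get("field_value") and row["field_value"] == original:
            by_value = row["translation"]
    return by_value if by_value is not None else original


print(translate(TRANSLATIONS, "stops", "stop_name", "nl", "S9", "Bruxelles-Ouest"))
# prints Brussel-West
print(translate(TRANSLATIONS, "stops", "stop_name", "nl", "S1", "Bruxelles-Ouest"))
# prints Brussel-West (perron 2)
print(translate(TRANSLATIONS, "stops", "stop_name", "de", "S9", "Bruxelles-Ouest"))
# prints Bruxelles-Ouest
```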

GTFS Realtime

Overview and Protocol

GTFS Realtime is a feed specification designed to provide dynamic transit information, such as trip delays, vehicle positions, and service alerts, as a complement to the static GTFS Schedule dataset.^[32] It enables public transportation agencies to deliver real-time updates to applications, improving rider experience by offering live departure times and alerts.^[32] These feeds require a corresponding GTFS Schedule feed for contextual reference, such as route and stop definitions.^[10] The format employs binary Protocol Buffers (protobuf) for efficient serialization, defined in the gtfs-realtime.proto schema (version 2.0 and higher).^[87] Feeds are delivered over HTTP or HTTPS from any web server, allowing frequent updates without complex infrastructure.^[32] The protocol is licensed under Apache 2.0, promoting open adoption and implementation.^[32] A GTFS Realtime feed consists of a single FeedMessage containing a header and a list of entities.^[10] The header includes a Unix timestamp indicating when the feed was generated and an incrementality field, which must be set to FULL_DATASET (value 0) as DIFFERENTIAL (value 1) is currently unsupported.^[88] Entities encompass TripUpdate for schedule adjustments, VehiclePosition for location data, and Alert for service notifications, which can be combined within the same feed.^[89] Validation is supported through reference implementations and language bindings generated from the protobuf schema.^[90]

Trip Updates

The Trip Update message in GTFS Realtime provides real-time information about deviations from the static schedule for individual trips, such as predicted arrival and departure times, delays, cancellations, or added unscheduled trips.^[91] It is designed to complement the static GTFS Schedule data by allowing transit agencies to report timetable fluctuations for trips that support real-time updates.^[89] This enables applications like trip planners to adjust user predictions dynamically, for instance, notifying passengers of a bus arriving five minutes late at a specific stop.^[91]

The core structure of a Trip Update includes a TripDescriptor, which identifies the affected trip by referencing static GTFS elements, followed by an array of StopTimeUpdate messages for stop-specific changes.^[92] The TripDescriptor contains fields such as trip_id (a unique identifier from trips.txt), start_date (in YYYYMMDD format indicating the service date), route_id (from routes.txt), and direction_id (0 or 1 for the trip's direction).^[91] It also includes a schedule_relationship enum to indicate the trip's overall status: SCHEDULED (following the static schedule with possible updates), UNSCHEDULED (a trip without a static counterpart), or CANCELED (the entire trip is canceled). As of May 2025, the enum was expanded with experimental values including NEW (extra trip unrelated to existing trips), REPLACEMENT (replaces a scheduled trip with new schedule or routing), DUPLICATED (copies an existing trip with different start date/time), and DELETED (trip removed and not shown to users); the former ADDED value is deprecated.^[93]^[94] An optional VehicleDescriptor may link to vehicle position data, though it is not required for trip updates.^[91]

Each StopTimeUpdate in the array targets a specific stop on the trip, identified by stop_sequence (the order from stop_times.txt) or stop_id (from stops.txt).^[95] Its arrival and departure fields are each a StopTimeEvent, which carries either time (an absolute timestamp in seconds since the Unix epoch) or delay (an offset in seconds from the static schedule, positive for late and negative for early).^[91] The schedule_relationship for the stop can be SCHEDULED (update applies normally), SKIPPED (the stop is bypassed), or NO_DATA (no prediction available).^[95] These updates integrate with the static schedule by using the baseline times from stop_times.txt as a reference; for example, a +300-second delay at stop sequence 5 adjusts the predicted arrival relative to the scheduled time.^[91] Additional fields under TripProperties allow modifications like changing the trip_headsign (destination sign text from trips.txt), enabling reports of route adjustments without altering the core trip identity.^[96]

GTFS Realtime feeds use FULL_DATASET incrementality, so each published message replaces the information in the previous one; true differential mode remains unsupported.^[88] Within a trip, however, a TripUpdate need not list every stop: a reported delay carries forward to subsequent stops until the next StopTimeUpdate, so a change such as a single-stop delay can be expressed compactly.^[89] Common use cases include predicting arrivals with delays (e.g., reporting a +5-minute offset via the delay field), handling cancellations by setting the trip's schedule_relationship to CANCELED, or updating headsigns for detour announcements.^[91] For instance, if a trip is running ahead, a negative delay like -120 seconds can be applied to earlier stops, while later ones use absolute times for precision.^[97] A representative example in Protocol Buffer text format (decoded for readability) illustrates a delayed trip:

entity {
  id: "simple-trip"
  trip_update {
    trip {
      trip_id: "trip1"
      start_time: "14:05:00"
      start_date: "20220628"
      route_id: "ROUTE1"
      direction_id: 0
      schedule_relationship: SCHEDULED
    }
    stop_time_update {
      stop_sequence: 3
      arrival {
        delay: 5
      }
      departure {
        delay: 5
      }
      schedule_relationship: SCHEDULED
    }
    stop_time_update {
      stop_sequence: 12
      arrival {
        delay: -2
      }
      departure {
        delay: -2
      }
      schedule_relationship: SCHEDULED
    }
  }
}

This protobuf snippet updates stop sequences 3 and 12 for trip1 on June 28, 2022, showing a 5-second delay early in the route and 2 seconds ahead later, relative to static stop_times.txt.^[97] In JSON encoding (supported by some tools for interoperability), the structure mirrors this, with fields like "delay": 5 under arrival/departure objects.
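
A consumer combines such delays with the static schedule to obtain predicted times. The sketch below uses plain Python structures rather than the protobuf bindings; the scheduled times and the apply_delays helper are hypothetical. It follows the GTFS Realtime convention that a delay carries forward to later stops until another update appears, and it leaves stops before the first update at their scheduled times.

```python
def apply_delays(scheduled, delays):
    """Return predicted arrival seconds per stop_sequence.

    scheduled: list of (stop_sequence, arrival_secs) sorted by stop_sequence.
    delays: dict mapping stop_sequence to a delay in seconds (negative = early).
    """
    predicted = {}
    current_delay = None  # no real-time information before the first update
    for stop_sequence, arrival_secs in scheduled:
        if stop_sequence in delays:
            current_delay = delays[stop_sequence]
        predicted[stop_sequence] = arrival_secs + (current_delay or 0)
    return predicted


# Hypothetical static arrivals for trip1, in seconds since the service day began.
scheduled = [(1, 50700), (3, 51000), (8, 51600), (12, 52200), (15, 52500)]
print(apply_delays(scheduled, {3: 5, 12: -2}))
# prints {1: 50700, 3: 51005, 8: 51605, 12: 52198, 15: 52498}
```

Stop sequence 8 inherits the 5-second delay reported at sequence 3, and sequence 15 inherits the 2-second early running reported at sequence 12.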

Vehicle Positions

The VehiclePosition message in GTFS Realtime provides real-time location data for transit vehicles, enabling applications to track their movements and statuses dynamically. This feed entity is distinct from trip updates, which primarily handle timing modifications, by focusing on geospatial and operational details derived from onboard systems like GPS. Agencies publish these messages in Protocol Buffers format and refresh them frequently, with data recommended to be no older than 90 seconds.^[98]^[99]

The core structure of a VehiclePosition includes several key components. The VehicleDescriptor identifies the vehicle with fields such as id (a unique identifier), label (a user-facing name), and license_plate (optional vehicle registration). The Position message carries latitude and longitude in WGS-84 degrees, both of which it requires, along with optional details like bearing (direction in degrees clockwise from true north), speed (in meters per second), and odometer (cumulative distance traveled in meters). Linkage to scheduled service occurs via the TripDescriptor, which references the static GTFS trip ID, schedule relationship (e.g., scheduled or added), and route or trip details. Additionally, the current_status field, typed as the VehicleStopStatus enum, indicates the vehicle's state relative to a stop: INCOMING_AT (about to arrive), STOPPED_AT (standing at the stop), or IN_TRANSIT_TO (en route, the default).^[98]^[100]

Specialized fields enhance situational awareness. The congestion_level field categorizes traffic conditions using enums such as UNKNOWN_CONGESTION_LEVEL, RUNNING_SMOOTHLY, STOP_AND_GO, CONGESTION, or SEVERE_CONGESTION, helping to contextualize delays.
The experimental occupancy_status provides crowding information via enums including EMPTY, MANY_SEATS_AVAILABLE, FEW_SEATS_AVAILABLE, STANDING_ROOM_ONLY, CRUSHED_STANDING_ROOM_ONLY, FULL, and NOT_ACCEPTING_PASSENGERS, allowing apps to display passenger load levels. A timestamp records when the position was captured, separate from the feed's generation time.^[100]^[101] Vehicle positions integrate with Automatic Vehicle Location (AVL) systems to power live tracking maps in transit applications, such as displaying bus icons on Google Maps with real-time routes and estimated times based on GPS data. Occupancy data supports crowding alerts, enabling users to avoid overloaded vehicles; for instance, Google Maps visualizes levels as icons indicating low, medium, high crowding, or full capacity on Android and iOS devices. These features improve rider decision-making by combining location with capacity insights.^[102]^[103] Regarding precision, positions rely on GPS coordinates in the WGS-84 system, with typical AVL devices providing accuracy within 10-20 meters under urban conditions, though no strict requirement is mandated in the spec—agencies are encouraged to use reliable hardware for consistent updates. Examples include bus tracking apps like those from Transit or agency-specific tools, where vehicle positions enable features like next-bus notifications by snapping GPS points to static route shapes for smoother visualization. Research on GTFS Realtime accuracy highlights that positional errors can affect on-time performance metrics, underscoring the need for timely and precise feeds.^[98]^[104]
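Two of the points above, the 90-second freshness guideline and the snapping of GPS points to route shapes, can be sketched in Python as follows. The VehiclePosition dataclass is a simplified stand-in for the protobuf message, and the helper names is_fresh and snap_to_shape, the sample coordinates, and the flat-earth approximation are assumptions for illustration rather than anything the specification prescribes.

```python
import math
from dataclasses import dataclass

MAX_AGE_SECONDS = 90  # freshness guideline mentioned in the text


@dataclass
class VehiclePosition:
    """Simplified stand-in for the GTFS Realtime message (not the real binding)."""
    vehicle_id: str
    latitude: float
    longitude: float
    timestamp: int  # POSIX seconds when the position was measured
    current_status: str = "IN_TRANSIT_TO"


def is_fresh(vp, now, max_age=MAX_AGE_SECONDS):
    """Return True if the position was measured at most max_age seconds before now."""
    return now - vp.timestamp <= max_age


def snap_to_shape(lat, lon, shape):
    """Return the closest point on a polyline of (lat, lon) vertices.

    Uses a local flat-earth approximation, adequate only for short distances.
    Returns None if the shape has fewer than two vertices.
    """
    k = math.cos(math.radians(lat))  # scale longitude degrees to latitude degrees
    best, best_d2 = None, math.inf
    for (lat1, lon1), (lat2, lon2) in zip(shape, shape[1:]):
        # Segment endpoints relative to the GPS point, which sits at the origin.
        ax, ay = (lon1 - lon) * k, lat1 - lat
        bx, by = (lon2 - lon) * k, lat2 - lat
        dx, dy = bx - ax, by - ay
        seg2 = dx * dx + dy * dy
        # Projection parameter along the segment, clamped to [0, 1].
        t = 0.0 if seg2 == 0 else max(0.0, min(1.0, -(ax * dx + ay * dy) / seg2))
        px, py = ax + t * dx, ay + t * dy
        d2 = px * px + py * py
        if d2 < best_d2:
            best_d2 = d2
            best = (lat1 + t * (lat2 - lat1), lon1 + t * (lon2 - lon1))
    return best


vp = VehiclePosition("bus-1", 45.0005, -121.995, timestamp=1000)
shape = [(45.0, -122.0), (45.0, -121.99)]  # hypothetical points from shapes.txt
print(is_fresh(vp, now=1060), snap_to_shape(vp.latitude, vp.longitude, shape))
```

A production consumer would more likely use a geodesic library and match against the specific shape of the vehicle's trip; the sketch only shows the projection step.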

Service Alerts

Service Alerts in GTFS Realtime let transit agencies communicate disruptions and other important information affecting service, such as station closures, route detours, or accessibility issues.^[105] They differ from trip updates and vehicle positions in that they describe broader network impacts that may not be tied to specific vehicle movements.^[10] The Alert message specifies the affected entities and carries descriptive text, so applications can show relevant warnings to users in real time.^[10]

The core of a Service Alert is the informed_entity field, a list of EntitySelector messages that identify the affected parts of the network by agency, route, route type, trip, or stop from the corresponding GTFS Schedule feed.^[10] Multiple entries can be listed to cover several impacts; the fields within a single entry are combined with logical AND (e.g., a specific route at a specific stop).^[105] The alert is categorized through three enums: cause (e.g., UNKNOWN_CAUSE=1, ACCIDENT=6, WEATHER=8, MAINTENANCE=9, CONSTRUCTION=10), effect (e.g., NO_SERVICE=1, DETOUR=4, ACCESSIBILITY_ISSUE=11), and severity_level (e.g., INFO=2, WARNING=3, SEVERE=4), which help applications prioritize and group alerts.^[10]

Descriptive content is carried in header_text, a concise summary, and description_text, a detailed explanation. Both are TranslatedString values, which support multiple languages through Translation sub-messages tagged with BCP-47 language codes.^[10] Additional fields include url (also a TranslatedString) for linking to external details such as an agency webpage, and active_period, a list of TimeRange entries defining when the alert should be displayed. Several ranges can be given for recurring issues, and if active_period is omitted the alert is shown for as long as it appears in the feed.^[10]

Common use cases include notifying users of station closures due to maintenance (cause=MAINTENANCE=9, effect=NO_SERVICE=1) or line-wide detours after accidents (cause=ACCIDENT=6, effect=DETOUR=4), with translations making the message accessible to diverse riders.^[105] For instance, an alert might target a specific stop on route "1" affected by weather (cause=WEATHER=8), relying on GTFS Schedule entities to limit notifications to relevant trips.^[105] In practice, these alerts power push notifications in mobile apps and on-screen displays in stations, improving rider awareness during disruptions.^[106]
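The matching rules described above (fields within one selector combined with AND, several selectors acting as alternatives, and an alert without active_period treated as active) can be sketched in Python as follows. The dataclasses are simplified stand-ins for the protobuf messages, trips are matched by a plain trip_id rather than a full TripDescriptor, and the helper names and sample identifiers are assumptions for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class EntitySelector:
    """Simplified selector; unset (None) fields are not part of the match."""
    agency_id: Optional[str] = None
    route_id: Optional[str] = None
    route_type: Optional[int] = None
    trip_id: Optional[str] = None
    stop_id: Optional[str] = None


@dataclass
class Alert:
    header_text: str
    informed_entity: List[EntitySelector]
    # (start, end) pairs in POSIX seconds; None means unbounded on that side.
    active_period: List[Tuple[Optional[int], Optional[int]]] = field(default_factory=list)


def selector_matches(selector, context):
    """All set fields must equal the context values (logical AND)."""
    wanted = {k: v for k, v in asdict(selector).items() if v is not None}
    return bool(wanted) and all(context.get(k) == v for k, v in wanted.items())


def is_active(alert, now):
    """An alert with no active_period counts as active while it is in the feed."""
    if not alert.active_period:
        return True
    return any(
        (start is None or start <= now) and (end is None or now < end)
        for start, end in alert.active_period
    )


def applies_to(alert, now, **context):
    """Active at `now` and at least one selector matches (logical OR across entries)."""
    return is_active(alert, now) and any(
        selector_matches(sel, context) for sel in alert.informed_entity
    )


# Hypothetical weather alert for stop "S5" on route "1".
alert = Alert(
    header_text="Stop closed due to weather",
    informed_entity=[EntitySelector(route_id="1", stop_id="S5")],
    active_period=[(100, 200)],
)
print(applies_to(alert, 150, route_id="1", stop_id="S5"))  # same route and stop
print(applies_to(alert, 150, route_id="1", stop_id="S6"))  # different stop
```

The first call prints True and the second False, because the single selector requires both the route and the stop to match.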

References

[1]
Overview - General Transit Feed Specification
The General Transit Feed Specification (GTFS) is an Open Standard used to distribute relevant information about transit systems to riders.
[2]
Background - General Transit Feed Specification
This transit data format was originally known as the Google Transit Feed Specification (GTFS). ... In 2010, the GTFS format name was changed to the General ...
[3]
GTFS Static Overview - Transit - Google for Developers
Oct 16, 2024 · A GTFS feed is composed of a series of text files collected in a ZIP file. Each file models a particular aspect of transit information: stops, routes, trips, ...
[4]
https://gtfs.org/documentation/schedule/reference
[5]
What is GTFS? - General Transit Feed Specification
GTFS is a standardized data format that provides a structure for public transit agencies to describe the details of their services such as schedules, stops, ...
[6]
STOPS - General Transit Feed Specification (GTFS) Data | FTA
Feb 15, 2019 · The GTFS format is used by many transit agencies to communicate their schedules to on-line mapping programs and smartphone/tablet applications ...
[7]
About - General Transit Feed Specification
GTFS evolution¶. GTFS started with a collaboration between TriMet in Portland, Oregon, and Google. TriMet worked with Google to format their transit data into ...
[8]
Reference - General Transit Feed Specification
This document defines the format and structure of the files that comprise a GTFS dataset. Table of Contents¶. Document Conventions; Dataset Files; File ...
[9]
GTFS Realtime Reference
A GTFS Realtime feed lets transit agencies provide consumers with realtime information about disruptions to their service (stations closed, lines not operating, ...
[10]
Flex - General Transit Feed Specification
GTFS Flex is a GTFS Schedule extension project that aims to facilitate the discoverability of Demand Responsive Transportation Services.
[11]
Frequently Asked Questions (FAQ) - Mobility Database
The Mobility Database is an open database containing over 4000+ transit and shared mobility feeds in GTFS, GTFS Realtime, and GBFS formats.
[12]
GOFS: A New Chapter for On-Demand Transportation Data
Sep 30, 2025 · This evolution of GOFS is important because it allows GTFS to remain focused on public transit while ensuring interoperability between the two ...
[13]
https://mobilitydata.org/gofs-a-new-chapter-for-on-demand-transportation-data/
[14]
Pioneering Open Data Standards: The GTFS Story
We eventually changed the name from Google Transit Feed Specification to General Transit Feed Specification—and the effect was transformative. It greatly ...
[15]
Full revision history - General Transit Feed Specification
February 28, 2007¶. Addition of frequencies.txt for headway-based schedule support. Multiple agencies now allowed in the same feed. Also added new agency_id ...
[16]
[PDF] THE MANY USES OF GTFS DATA – OPENING THE DOOR TO ...
Feb 8, 2010 · There are an estimated 261 transit agencies worldwide, including 227 transit agencies in the U.S., that share their GTFS data openly with the ...
[17]
Static Transit - Community - Google for Developers
Oct 16, 2024 · Google Transit Partners Help Center · Latest GTFS specification on GitHub (English only) · TransitWiki.org "General Transit Feed Specification" ...
[18]
google/transit - GitHub
The General Transit Feed Specification (GTFS) is an Open Standard used to distribute relevant information about transit systems to riders.
[19]
Welcome to the new GTFS.org - MobilityData
Mar 1, 2022 · We've gathered everything in one place from sources like GTFS.org, Google Transit APIs, gtfs.mobilitydata.org, and the GTFS GitHub repository.
[20]
Introduction to GTFS Governance
GTFS governance is a framework that guides how the GTFS is maintained, updated, and developed, ensuring it remains open and collaborative.
[21]
GTFS Digest - Vote on Governance and Check Out a New Field!
Jul 2, 2025 · The June 2025 GTFS Digest is here! This month, the vote for the new GTFS Governance structure began with a deadline of July 6, 2025.
[22]
Change Process - General Transit Feed Specification
Discussions held in Working Group meetings should be summarized in the Pull Request comments. ... If the vote passes, the Maintainer merges the voted Pull Request ...
[23]
https://gtfs.org/community/governance/gtfs-schedule-governance/change-process/
[24]
GTFS Digest - August 2025 - Vote on a Semantics Clarification and ...
Sep 2, 2025 · Below is a list of proposals that are currently being voted on. We invite you to take a look and participate in the voting process. Proposal ...
[25]
GTFS Digest - General Transit Feed Specification
This month, the vote for the new GTFS Governance structure began with a deadline of July 6, 2025. Be sure to check out the discussion and get your votes in! ...
[26]
GTFS for trip planning - TransitFare
Public-facing info about route names, types. trips.txt. Service runs (i.e. trips). stop_times.txt.
[27]
https://transitfare.com/blog/gtfs-for-trip-planning-apps
[28]
Publish - General Transit Feed Specification
Integrating your feed into mobile and web-based trip-planning applications, allowing riders to plan trips on your system; Submitting your feed to a GTFS ...
[29]
Extended GTFS Route Types | Static Transit - Google for Developers
Oct 16, 2024 · GTFS defines a number of route types that developers can use to describe the type of service for a particular route, such as bus versus rail ...
[30]
The effects of bike-sharing-transit integration on accessibility equity
We included BSS data within GTFS files to enable integration with existing computational tools capable of PT routing using GTFS files.
[31]
General Transit Feed Specification: Home
GTFS is a community-driven open standard for rider-facing transit information. Get Started Documentation. Why use GTFS? Improved Rider Experience. GTFS ...
[32]
GTFS Realtime Overview - Transit - Google for Developers
Oct 16, 2024 · GTFS Realtime is a feed specification that allows public transportation agencies to provide realtime updates about their fleet to application developers.
[33]
[PDF] GTFS-realtime Reference for the New York City Subway - MTA
This document defines how the GTFS-realtime feed is implemented by the New York City Transit Subway. Elements not specified below are not used in the NYC ...
[34]
Where Is My Bus? Impact of mobile real-time information on the ...
... Where Is My Bus? Impact of mobile real-time information on the perceived and actual wait time of transit riders. Kari Edison ...
[35]
Realtime validation errors and warnings - Google for Developers
Oct 16, 2024 · This page provides a list of validation errors and warnings for the Realtime Transit feed, as well as tips on how to troubleshoot these issues.
[36]
Using General Transit Feed Specification (GTFS) Data as a Basis for ...
Meanwhile, the development of General Transit Feed Specification (GTFS), an open standard format, provides new opportunities for transit performance measurement ...
[37]
[PDF] using general transit feed specification (gtfs) data as a ... - ROSA P
2.4.1 History and Development of GTFS ... firstly created by TriMet and Google in 2005 for the Google Transit Web-based trip planner,.
[38]
[PDF] Analysis of demand–supply gaps in public transit systems based on ...
This study uses GTFS static data, which is distributed in a common format for public transportation schedules with the associated geographic information.
[39]
[PDF] Defining and Measuring Equity in Public Transportation
“Using General Transit Feed Specification (GTFS) Data as a Basis for Evaluating and Improving Public Transit Equity.” Charlotte, NC: Center for Advanced ...
[40]
[PDF] GTFS Wheelchair Accessibility Data - Cal-ITP
GTFS wheelchair accessibility data uses `wheelchair_boarding` in `stops.txt` and `wheelchair_accessible` in `trips.txt` to specify accessibility, with values 0 ...
[41]
[PDF] GTFS for accessibility - Transportation Research Board
Overlaying such a shortest path tree of 30 or 45 minutes with population and employment data can be used to compute accessibility measures for employment and ...
[42]
15-02 Estimating and Enhancing Public Transit Accessibility for ...
Integrates transit stop ADA compliance survey data together with customized GTFS wheelchair_boarding and wheelchair_accessibility and OpenTripPlanner ...
[43]
A Comprehensive Transit Accessibility and Equity Dashboard
Jul 2, 2021 · The TransitCenter Equity Dashboard tracks how well public transit systems in seven densely populated urban regions in the United States serve their riders.
[44]
[PDF] Access to Opportunity through Equitable Transportation
Oct 21, 2020 · Source: Authors' analysis of LEHD data, decennial census data, American Community Survey data, and GTFS data. Note: Darker colors denote ...
[45]
[PDF] Mobility Performance Metrics (MPM) for Integrated Mobility and ...
Federal Transit Administration (FTA) Office of Budget and Policy. NTD Policy Manual. 2015 Report Year. 2015. 50. Transit Cooperative Research Program (TCRP).
[46]
[PDF] USE OF THE GENERAL TRANSIT FEED SPECIFICATION (GTFS ...
The General Transit Feed Specification (GTFS), first introduced in 2005, is the result of a project between Google and TriMet in Portland to create a transit ...
[47]
Feeds - Mobility Database
Access GTFS, GTFS Realtime, GBFS transit data with over 4000 feeds from 70+ countries on the web's leading transit data platform.
[48]
2023 Annual Database General Transit Feed Specification (GTFS ...
Contains GTFS data for fixed routes to collect NTD reporters' geographic service area coverage data. Keywords: agency, stops, routes, trips, stop times, ...
[49]
Validate - General Transit Feed Specification
The free and open-source Canonical GTFS Schedule validator maintained by MobilityData ensures your GTFS data is compliant with the official GTFS Schedule ...
[50]
Feed information - General Transit Feed Specification
On top of providing information about agencies and their services, it is possible to provide information about the GTFS dataset using the file feed_info.txt.
[51]
[PDF] Understanding GTFS: An intro and overview - Cal-ITP
GTFS is used by over 10,000 transit agencies in over 100 countries. Most transit agencies have heard of GTFS, and it has quickly become an industry standard.
[52]
National Transit Database: Reporting Changes and Clarifications
Mar 3, 2023 · FTA therefore expects agencies to maintain accurate, up to date GTFS data throughout the year. Agencies that experience changes in service will ...
[53]
Enable Multimodal Transport with NeTEx and GTFS Data Standards
Aug 1, 2023 · GTFS enables seamless integration and interoperability between different transit systems and software platforms, facilitating the development ...
[54]
[GTFS Digest] March 2025 - Community Weighs In on Governance ...
Apr 1, 2025 · The March 2025 GTFS Digest highlights the community review of the GTFS Schedule Governance Proposal and the release of GTFS Validator v7.0.
[55]
Canonical GTFS Schedule Validator - MobilityData
Canonical GTFS Schedule Validator. Evaluate your dataset against the official GTFS Reference and Best Practices.
[56]
Re: [transit-developers] Compression strategies for GTFS data
My issue is that GTFS data sets are usually quite large (85MB uncompressed for the city of Sydney for example). I've done a bit of reverse engineering on other ...
[57]
From Raw GPS to GTFS: A Real-World Open Dataset for Bus Travel ...
Each trip was matched to its corresponding stop sequence, resulting in 785,976 records in stop_times.txt. These stop-level events are proportionally distributed ...
[58]
https://gtfs.org/documentation/schedule/reference/#agencytxt
[59]
https://gtfs.org/documentation/schedule/reference/#feed_infotxt
[60]
Reference - General Transit Feed Specification
[61]
Reference - General Transit Feed Specification
[62]
https://gtfs.org/reference/static/#calendar_txt
[63]
https://gtfs.org/reference/static/#calendar_dates_txt
[64]
https://support.google.com/transitpartners/answer/6377399
[65]
https://gtfs.org/documentation/schedule/reference/#shapestxt
[66]
https://gtfs.org/documentation/schedule/reference/#frequenciestxt
[67]
https://gtfs.org/schedule/reference/#transferstxt
[68]
https://gtfs.org/schedule/reference/#fare_attributestxt
[69]
https://gtfs.org/schedule/reference/#fare_rulestxt
[70]
https://gtfs.org/schedule/change-history/revision-history
[71]
Pathways - General Transit Feed Specification
GTFS-Pathways represents transit station details, helping riders understand transfer capabilities. It uses pathways.txt and levels.txt files.
[72]
https://gtfs.org/documentation/schedule/reference/#levelstxt
[73]
Pathways - General Transit Feed Specification
Pathway Connections At its foundational level, Pathways offers basic functionality to connect key areas defined in Location Types within a station.
[74]
https://gtfs.org/documentation/schedule/reference/#pathwaystxt
[75]
https://gtfs.org/documentation/schedule/reference/#translationstxt
[76]
Translations - General Transit Feed Specification
The file translations.txt is then used to translate the station names from the default agency language (French in this case) to Dutch.
[77]
https://gtfs.org/documentation/schedule/reference/#attributionstxt
[78]
Attributions - General Transit Feed Specification
In order to attribute Rejseplanen as a data producer, the file attributions.txt is used, where an attribution ID is defined alongside the name and URL of the ...
[79]
https://gtfs.org/documentation/schedule/reference/#booking_rulestxt
[80]
Demand responsive services - General Transit Feed Specification
GTFS Flex is a GTFS extension project which was adopted officially into the GTFS specification in March 2024, its aims to facilitate the discoverability of ...
[81]
https://gtfs.org/documentation/schedule/reference/#locationsgeojsonfile
[82]
https://gtfs.org/documentation/schedule/reference/#areastxt
[83]
https://gtfs.org/documentation/schedule/reference/#stop_areastxt
[84]
GTFS Digest - February 2025 - Rider Categories, Adopted!
Mar 6, 2025 · This group is discussing the GTFS Realtime, asking questions, and proposing changes. GTFS.org: The official GTFS documentation website. Here ...
[85]
https://gtfs.org/documentation/schedule/reference/#rider_categoriestxt
[86]
https://gtfs.org/documentation/schedule/reference/#fare_productstxt
[87]
Protobuf - General Transit Feed Specification
GTFS Realtime lets transit agencies provide consumers with realtime information about disruptions to their service (stations closed, lines not operating, ...
[88]
Google Transit GTFS Realtime Reference and Differences
Oct 16, 2024 · This page describes the major differences between the official GTFS Realtime and Google Transit's implementation of the specification in the following areas.
[89]
GTFS Realtime Bindings
GTFS Realtime data is encoded and decoded using Protocol Buffers, a compact binary representation designed for fast and efficient processing. The data ...
[90]
https://gtfs.org/documentation/realtime/language-bindings/overview/
[91]
https://gtfs.org/realtime/feed-entities/#trip-updates
[92]
https://developers.google.com/transit/gtfs-realtime/reference#tripdescriptor
[93]
https://gtfs.org/documentation/realtime/reference/#enum-schedulerelationship
[94]
Full Trip Update example | Realtime Transit - Google for Developers
Oct 16, 2024 · This document provides an example of a full-dataset Trip Update feed in the General Transit Feed Specification Realtime (GTFS-realtime) format.
[95]
Vehicle Positions - General Transit Feed Specification
Documentation is provided below. A timestamp denoting the time when the position reading was taken can be provided. Note that this is different from the ...
[96]
GTFS Realtime Best Practices
All GTFS Realtime feeds should be "2.0" or higher, as early version of GTFS Realtime did not require all fields needed to represent various transit situations ...
[97]
https://developers.google.com/transit/gtfs-realtime/examples/trip-updates-full
[98]
https://gtfs.org/documentation/realtime/feed-entities/vehicle-positions/
[99]
[PDF] Assessing GTFS Accuracy - Mineta Transportation Institute
Furthermore, realtime arrival predictions allow customers to minimize transit wait times, typically seen as the most onerous component of a transit trip, and.
[100]
Provide vehicle occupancy data with GTFS - Transit Partners Help
To indicate how crowded a vehicle is, you can provide vehicle occupancy data. This data appears in Google Maps on Android and iOS.
[101]
Algorithmic Analysis of GTFS-RT vehicle position accuracy - arXiv
Jun 6, 2025 · However, GTFS and GTFS-RT data have little research done on their accuracy. Using this data, we can determine the accuracy of the real-time data ...
[102]
Service Alerts - General Transit Feed Specification
Service alerts allow you to provide updates whenever there is disruption on the network. Delays and cancellations of individual trips should usually be ...
[103]
Service alerts - General Transit Feed Specification
Service alerts in GTFS include a header, entity type, active period, affected entities, cause, effect, and a description. For example, a stop is closed due to ...