tz database
The tz database, also known as the IANA Time Zone Database or zoneinfo database, is a public-domain collection of text data files and C source code that records the past, present, and predicted future of civil time scales for representative locations worldwide, including UTC offsets, time zone boundaries, and rules for daylight saving time as established by political authorities.[1] It partitions the globe into over 400 regions whose local clocks have agreed since 1970, enabling accurate conversion between UTC and local times while accounting for historical irregularities such as wartime adjustments or irregular DST transitions.[2]
Originally developed in the late 1970s by Arthur David Olson to support portable time zone handling in Unix systems, the database has evolved into a standardized resource maintained under the Internet Assigned Numbers Authority (IANA) since 2011, following procedures outlined in RFC 6557 (BCP 175).[2][3] Primary maintenance is handled by TZ Coordinator Paul Eggert, with assistance from Tim Parenti, through a community-driven process involving contributions via the tz mailing list and periodic releases (typically 3 to 5 per year) that incorporate verified changes from governments and legal sources.[1][3] The current version, 2025b, released on March 22, 2025, includes updates such as a new time zone for Chile's Aysén Region and adjustments for Paraguay's permanent adoption of UTC-03.[1]
The database is widely adopted in major operating systems such as Linux, macOS, iOS, Android, and Windows (via extensions), as well as in programming libraries such as the GNU C Library and ICU. It provides consistent time zone computations across software ecosystems, though its predictions beyond official announcements remain provisional, because the maintainers avoid speculating about unverified policy.[2][4]
Introduction
Purpose and scope
The tz database, formally known as the IANA Time Zone Database, is a public-domain compilation of rules and data for civil time across global regions, initiated in the late 1970s to standardize representations of local time for computational purposes.[1] It records the history and anticipated future of time scales, including offsets from Coordinated Universal Time (UTC) and adjustments for daylight saving time (DST), enabling software to convert between UTC and local times accurately.
The scope of the tz database encompasses over 400 representative time zones, partitioning the world into regions whose local clocks have agreed since the POSIX epoch of January 1, 1970.[5] This includes detailed transitions for DST observance, permanent offsets, and historical shifts due to political or legal decisions, though pre-1970 data is provided selectively for continuity in location-specific zones rather than as exhaustive coverage. By focusing on verifiable civil time rules, it supports applications in fields such as scheduling, logging, and internationalization without relying on vendor-specific or proprietary datasets.[1]
Maintained collaboratively under procedures outlined in RFC 6557, the database tracks worldwide legal changes to time observance, such as new DST policies or boundary adjustments, with releases issued several times a year to keep pace with those changes.[1] This ongoing effort reflects its role as a neutral, authoritative resource for developers and systems integrating time zone functionality.
Key concepts
In the tz database, a timezone is defined as a geographical region whose civil-time clocks have agreed on their offsets from Coordinated Universal Time (UTC) and on daylight saving time (DST) observance since 1970, ensuring uniform local time across the area. This includes a base UTC offset, which represents the standard difference from UTC (e.g., -5 hours for Eastern Standard Time), and DST rules that specify temporary adjustments, such as advancing clocks by one hour during designated periods to extend evening daylight.[1] These elements capture political and legal decisions on timekeeping, rather than natural solar variations, allowing software to compute accurate local times for historical and future dates within the zone.
Civil time, as tracked by the tz database, refers to the legally mandated time observed in everyday life and commerce, distinct from astronomical time, which is derived from the Earth's rotation and solar position (e.g., mean solar time). The database focuses exclusively on civil time scales, incorporating adjustments like leap seconds and political changes to align with societal observance, whereas astronomical time prioritizes celestial events without regard for legal standards. This emphasis ensures that computations reflect the time people actually use, such as for scheduling or records, rather than purely scientific measurements.
In tz database computations, POSIX time typically denotes a UTC-based timestamp (e.g., seconds since the Unix epoch of 1970-01-01 00:00:00 UTC), serving as a neutral, absolute reference for global synchronization. Wall time, in contrast, is the local civil time displayed on clocks in a specific timezone, incorporating the UTC offset and any active DST to produce the observable "wall clock" reading. The tz code converts between the two, enabling applications to interpret POSIX timestamps as wall times while handling ambiguities at DST transitions, where a single wall time might correspond to multiple POSIX instants.
The tz database distinguishes zones, which are logical groupings of regions sharing identical timekeeping rules and are often named after a representative city (e.g., "America/New_York" for the Eastern Time Zone), from locations, which are specific geographical points or areas mapped to those zones.[2] Zones encapsulate the abstract rules and history applicable to multiple locations, promoting consistency in software, while locations provide the concrete geographical context, such as assigning "Europe/London" to the United Kingdom's territory.[1] This separation allows the database to model complex boundaries and changes without duplicating rule sets for every minor area.
Data Structure
Timezone definitions
In the tz database, a timezone is conceptually defined as a sequence of UTC offsets and associated transition rules that dictate how local time is computed for a given location over historical and future periods. This structure captures the evolution of local time, including changes due to daylight saving time (DST) and permanent adjustments to standard offsets, ensuring that the database reflects accurate timestamps from the POSIX epoch onward.[6]
The database logically partitions the world into distinct timezones, where all locations within a single zone maintain identical local clock times at any given moment after 1970-01-01 00:00:00 UTC. This partitioning prioritizes regions where clocks synchronize uniformly, often aligning with political jurisdictions rather than strict geographical features, as governmental decisions govern time observance.[1]
Timezone definitions include standardized abbreviations, such as EST for Eastern Standard Time or EDT for Eastern Daylight Time, to denote the local time in use during specific periods. However, these abbreviations are not unique identifiers, as the same three- or four-letter codes can apply to multiple unrelated zones (e.g., CST for Central Standard Time in North America versus China Standard Time), rendering them ambiguous for precise identification.[7] Instead, the database relies on location-based names, such as America/New_York, to unambiguously label these definitions.
File formats
The tz database is distributed in multiple file formats to support both human-readable source data and efficient runtime access. The primary source files are plain text, written in a syntax that the zic compiler processes to generate binary zoneinfo files. Auxiliary tables (zone.tab, zone1970.tab, and zonenow.tab) map countries and coordinates to zones: zone.tab lists one country code per entry, zone1970.tab lists only zones whose clocks have differed since 1970 and allows several country codes per entry, and zonenow.tab lists only zones that differ in current and predicted timestamps. Specialized files handle historical and leap second data.[8]
The source text files, such as those for individual regions (e.g., northamerica), use a structured input format for the zic compiler. These consist of Zone lines, which define timezone names, standard offsets from UTC, references to rule sets, abbreviation formats, and optional until clauses for transitions, and Rule lines, which specify daylight saving time rules including start and end years, months, days, times, and abbreviation suffixes. This syntax enables the compilation of historical and future timezone transitions into a compact representation.[9]
Compiled zoneinfo files follow the Time Zone Information Format (TZif), a binary standard that supports both 32-bit and 64-bit representations. In TZif version 1, transition times are stored as 32-bit signed integers representing seconds since the POSIX epoch (1970-01-01 00:00:00 UTC), limiting the range to approximately 1901–2038; versions 2 and 3 extend this to 64-bit integers for a vastly larger temporal scope of about 292 billion years in either direction, while version 4 (as of 2024) adds optional leap-second table truncation and expiration features. Within each local time type record, the offset from UTC is encoded as a 32-bit signed integer in seconds and the daylight saving indicator as a one-byte flag, alongside an index into the designations (abbreviations). The file also carries transition counts, type indices, and leap second corrections. The format includes a fixed header with version and count fields, followed by data blocks and, in versions 2 and later, a footer holding a POSIX-style TZ string.[10]
The zone.tab file serves as a tabular index mapping ISO 3166-1 alpha-2 country codes to timezone identifiers, including latitude and longitude coordinates in ISO 6709 sign-degrees-minutes or sign-degrees-minutes-seconds notation (e.g., +404251-0740023 for 40°42'51"N 74°00'23"W) and optional comments. Each line contains up to four tab-separated fields: country code, coordinates, timezone name, and comments, facilitating applications that need location-based timezone lookups.[5]
Pre-1970 historical data, which is less standardized and potentially less accurate, is maintained in a separate source file called backzone to isolate it from the main post-1970 coverage. Leap seconds are tracked in a dedicated leapseconds file, formatted as lines with fields for the leap occurrence date (year, month, day, time as 23:59:60 UTC), correction sign (+ for insertion), and status (S for stationary), derived from IERS bulletins. This file supports systems requiring precise UTC-to-TAI conversions but is not integrated into standard zoneinfo binaries.[8][11]
Zone naming
The zone names in the tz database follow a standardized hierarchical format of AREA/LOCATION, where AREA typically denotes a continent, ocean, or broad region (such as America for both North and South America, Africa, Asia, Europe, Atlantic, Indian, Pacific, Australia, Antarctica, or Arctic), and LOCATION specifies a city or other representative place within that area.[8] This convention, designed by Paul Eggert, ensures that each name uniquely identifies a timezone applicable since 1970, when coordinated universal time became prevalent, while indicating a typical location of clocks observing that timezone for expert users.[8]
The selection of names adheres to specific criteria to promote stability and neutrality. Names must be robust against political changes by avoiding direct ties to country names. They must comply with POSIX filename rules: each component uses only ASCII letters, periods, hyphens, and underscores, is at most 14 characters long, contains no digits, and does not begin with a hyphen. They should also prioritize compact, populous, and well-known locations, often the largest city in the zone, to serve as stable representatives.[8] For instance, America/New_York is chosen over other U.S. Eastern Time cities due to New York's prominence, and Asia/Shanghai represents China Standard Time rather than the capital Beijing for its larger population and historical significance.[8] These guidelines evolved from earlier rules, such as requiring at least one name per ISO 3166-1 country code, which were abandoned to better accommodate complex regional variations.[8]
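The component rules listed above are mechanical enough to check in code. The following is a minimal sketch in Python, assuming the rules exactly as stated here; the function name follows_naming_guidelines is hypothetical and is not part of any tz tooling.

```python
import re

# One name component: ASCII letters, '.', '-' or '_' only; 1 to 14 characters;
# no digits (they are simply absent from the character class); no leading '-'.
_COMPONENT = re.compile(r"^(?!-)[A-Za-z._-]{1,14}$")


def follows_naming_guidelines(name):
    """Hypothetical helper: True if every '/'-separated component passes."""
    return all(_COMPONENT.match(part) is not None for part in name.split("/"))


checks = {
    "America/New_York": follows_naming_guidelines("America/New_York"),
    "Asia/Shanghai": follows_naming_guidelines("Asia/Shanghai"),
    "Etc/GMT+5": follows_naming_guidelines("Etc/GMT+5"),
    "Area/-Leading": follows_naming_guidelines("Area/-Leading"),
}
```

A name containing digits, such as Etc/GMT+5, is rejected by this sketch even though the database ships such fixed-offset names, so the check models the guidelines for location-based names rather than the full set of existing identifiers.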
Special cases include the Etc/ area for fixed-offset zones and UTC variants, defined in the etcetera source file to support platforms requiring leap second information or simple offsets like Etc/UTC and Etc/GMT (noting that GMT here denotes a fixed offset, not Greenwich Mean Time).[8] Backward compatibility is maintained through the backward source file, which creates symbolic links from obsolete names (e.g., US/Eastern or Asia/Calcutta) to current ones (e.g., America/New_York or Asia/Kolkata, renamed in 2008 to reflect official spelling changes), ensuring legacy software continues to function without disruption.[8] Name changes are rare and occur only when necessary, such as for accuracy or geopolitical updates, with old aliases preserved indefinitely.[8]
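The alias mechanism can be sketched by reading Link lines, which zic documents as Link TARGET LINK-NAME. The excerpt below is an illustrative sample written in the style of the backward file, not a verbatim copy, and the helper names parse_links and canonical are hypothetical.

```python
# Illustrative sample in the style of the 'backward' file (not a verbatim copy).
SAMPLE = """\
# Link  TARGET              LINK-NAME
Link    America/New_York    US/Eastern
Link    Asia/Kolkata        Asia/Calcutta   # renamed in 2008
"""


def parse_links(text):
    """Map each obsolete name (LINK-NAME) to its current name (TARGET)."""
    aliases = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # strip comments, split on whitespace
        if len(fields) == 3 and fields[0] == "Link":
            _, target, link_name = fields
            aliases[link_name] = target
    return aliases


def canonical(name, aliases):
    """Follow links until reaching a name that is not itself an alias."""
    seen = set()
    while name in aliases and name not in seen:  # 'seen' guards against cycles
        seen.add(name)
        name = aliases[name]
    return name


aliases = parse_links(SAMPLE)
resolved = canonical("Asia/Calcutta", aliases)
```

Keeping the alias table separate from the zone data mirrors the idea in the text: old names keep resolving, while the rules live under a single current name.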
DST rules
In the tz database, daylight saving time (DST) transitions are defined using Rule lines in the source files, which specify the dates, times, and offsets at which clocks are advanced from or set back to standard time. These rules enable the database to model periods of DST observance, where local time is typically shifted forward by one hour (or sometimes more) during warmer months to conserve energy or align with social patterns. Each Rule line applies to a contiguous range of years and can represent either the onset of DST (spring-forward) or its cessation (fall-back), with the SAVE field giving the amount added to standard time, such as 1:00 for a one-hour advance.[8]
The structure of a Rule line is: Rule <NAME> <FROM> <TO> <TYPE> <IN> <ON> <AT> <SAVE> <LETTER>/<S>, where <NAME> identifies the rule set (e.g., US for United States rules), <FROM> and <TO> delimit the applicable years (e.g., 1987 or max for ongoing), <TYPE> is typically a hyphen, <IN> specifies the month (e.g., Mar), <ON> defines the day (e.g., lastSun or 15), <AT> sets the transition time (e.g., 2:00), <SAVE> denotes the time shift, and <LETTER> provides the abbreviation suffix (e.g., D for DST). Rule sets consist of one or more such lines to cover historical and predicted changes, allowing for complex patterns like double DST or wartime adjustments.[8]
DST rules fall into several types based on recurrence: annual patterns using relative day specifiers like lastSun in a given month for recurring transitions (e.g., the last Sunday of October); fixed-date rules for one-off or periodic exact dates (e.g., Sep 30); or no rules at all for zones without DST, indicated by a hyphen in the RULES column of the Zone line. For instance, the recurring start of European summer time is written as Rule EU 1981 max - Mar lastSun 1:00u 1:00 S, where the u suffix gives the transition time in UTC and S is the letter for summer time. These types accommodate diverse global practices, from biannual shifts in North America to year-round DST in some regions.[8]
Transition times are interpreted as local wall clock time by default, meaning the specified hour (e.g., 2:00) refers to the time shown on clocks just before the change, while a 'u' suffix (e.g., 2:00u) interprets it in UTC so that the instant is the same across zones. This distinction matters during fall-back transitions, where clocks retreat (e.g., from 3:00 DST to 2:00 standard), creating a one-hour period of wall time that occurs twice. The transition itself is a single UTC instant, so the first pass through the repeated hour is daylight time and the second pass is standard time. Spring-forward transitions create no such overlap; instead, the skipped hour never appears on the wall clock.[8]
Rules are often shared rather than duplicated, with Zone entries referencing a rule name in their RULES field (e.g., -5:00 US E%sT) to apply the same DST logic across multiple locations, promoting consistency for regions like the European Union or Australia that follow unified policies. This referencing mechanism supports the database's goal of covering over 400 zones while minimizing redundancy in DST specifications.[8]
Example entries
To illustrate the structure of the tz database, consider the Zone entry for America/New_York, which represents the Eastern Time Zone in the United States. A representative, simplified Zone entry from the source files is: Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58, followed by the continuation line -5:00 US E%sT. This defines the initial local mean time (LMT) offset of -4:56:02 until November 18, 1883, at 12:03:58 local mean time (17:00:00 UTC), after which the zone adopts a -5:00 standard offset (Eastern Standard Time, EST) and applies the US rules for transitions, with the abbreviation formatted as EST or EDT (Eastern Daylight Time).[12]
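The arithmetic behind this entry can be checked directly. zic reads the UNTIL column as local wall-clock time by default, so the sketch below (with the hypothetical helper parse_offset) converts 12:03:58 under the LMT offset to UTC and then to the new standard time.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(text):
    """Hypothetical helper: turn '-4:56:02' or '-5:00' into a timedelta."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    parts += [0] * (3 - len(parts))  # pad missing minutes and seconds
    hours, minutes, seconds = parts
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


lmt = parse_offset("-4:56:02")  # offset in effect before the transition
est = parse_offset("-5:00")     # offset in effect afterwards

# UNTIL is local time under the line it closes, so UTC = local - offset.
until_local = datetime(1883, 11, 18, 12, 3, 58)
until_utc = (until_local - lmt).replace(tzinfo=timezone.utc)

# The same instant read on a clock that has switched to the -5:00 offset.
new_wall = (until_utc + est).replace(tzinfo=None)
clock_set_back_by = until_local - new_wall
```

Under these assumptions the transition falls at exactly 17:00:00 UTC, and clocks in the zone were set back by 3 minutes 58 seconds to read 12:00:00.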
Complementing this, Rule lines specify daylight saving time (DST) transitions. For the period from 1967 to 2006, key US rules include: Rule US 1967 2006 - Oct lastSun 2:00 0 S for ending DST on the last Sunday in October at 2:00 local wall-clock time, which is still daylight time at that moment (saving 0 hours, letter S for standard), and Rule US 1967 1973 - Apr lastSun 2:00 1:00 D for starting DST on the last Sunday in April at 2:00 local wall-clock time, which is standard time at that moment (saving 1 hour, letter D for daylight). Later refinements, such as Rule US 1987 2006 - Apr Sun>=1 2:00 1:00 D, moved the spring transition to the first Sunday on or after April 1. These rules link to the Zone entry via the "US" reference, dictating when offsets shift between -5:00 (EST) and -4:00 (EDT).[12]
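To make the field layout and the ON specifiers concrete, the following sketch splits a Rule line into named fields and resolves day specifiers of the forms used above (a plain day number, lastSun, and Sun>=N). The names parse_rule and resolve_on are hypothetical, and other forms that zic accepts, such as Sun<=N, are deliberately left out.

```python
from datetime import date, timedelta
import calendar

MONTHS = {m: i + 1 for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}
WEEKDAYS = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}
FIELDS = ["keyword", "name", "from", "to", "type", "in", "on", "at", "save", "letter"]


def parse_rule(line):
    """Split a whitespace-separated Rule line into a dict of named fields."""
    return dict(zip(FIELDS, line.split()))


def resolve_on(year, month, on):
    """Resolve an ON field ('15', 'lastSun', 'Sun>=8') to a calendar date."""
    if on.isdigit():
        return date(year, month, int(on))
    if on.startswith("last"):
        last = date(year, month, calendar.monthrange(year, month)[1])
        back = (last.weekday() - WEEKDAYS[on[4:]]) % 7
        return last - timedelta(days=back)
    weekday, day = on.split(">=")
    start = date(year, month, int(day))
    forward = (WEEKDAYS[weekday] - start.weekday()) % 7
    return start + timedelta(days=forward)  # may roll into the next month


rule = parse_rule("Rule US 2007 max - Mar Sun>=8 2:00 1:00 D")
start_2007 = resolve_on(2007, MONTHS[rule["in"]], rule["on"])
```

Resolving the date is only the first step; the AT field and its suffix then determine the exact instant, which this sketch does not attempt.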
The zic compiler processes these Zone and Rule lines into binary zoneinfo files, compiling them into a sequence of transitions represented as timestamps (in UTC) with associated offsets, abbreviations, and DST indicators. For America/New_York, this results in discrete entries for each historical change, such as the 1967-04-30 07:00 UTC transition to EDT (corresponding to 2:00 local standard time on the last Sunday in April, since EST is five hours behind UTC). The binary format stores these as an array of transition times followed by type descriptors (offset, DST flag, abbreviation index), enabling efficient lookups for any given date without recalculating rules.[12][13]
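The lookup described here can be sketched with a hand-built, version-1 style TZif blob holding only the two 1967 transitions (2:00 EST is 07:00 UTC in April; 2:00 EDT is 06:00 UTC in October). This is an illustration under simplifying assumptions: real files contain many more transitions, later versions add a 64-bit block and a footer, and the helper names ts, parse_tzif_v1, and local_type are hypothetical.

```python
import struct
from bisect import bisect_right
from datetime import datetime, timezone


def ts(*fields):
    """UTC calendar fields -> POSIX seconds."""
    return int(datetime(*fields, tzinfo=timezone.utc).timestamp())


# Sample data: transition times, the type that follows each, and the types.
transitions = [ts(1967, 4, 30, 7), ts(1967, 10, 29, 6)]
type_index = [1, 0]
types = [(-18000, 0, 0), (-14400, 1, 4)]  # (utoff seconds, isdst, abbrev index)
designations = b"EST\0EDT\0"

# 44-byte header: magic, version, 15 reserved bytes, six big-endian counts
# (isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt).
header = struct.pack(">4sc15x6l", b"TZif", b"\0", 0, 0, 0,
                     len(transitions), len(types), len(designations))
blob = (header + struct.pack(">2l", *transitions) + bytes(type_index)
        + b"".join(struct.pack(">lBB", *t) for t in types) + designations)


def parse_tzif_v1(data):
    """Read the header and the first three arrays of a version-1 data block."""
    magic, _version, _isut, _isstd, _leap, timecnt, typecnt, charcnt = \
        struct.unpack_from(">4sc15x6l", data, 0)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    pos = 44
    times = list(struct.unpack_from(f">{timecnt}l", data, pos))
    pos += 4 * timecnt
    index = list(data[pos:pos + timecnt])
    pos += timecnt
    ttinfo = [struct.unpack_from(">lBB", data, pos + 6 * i) for i in range(typecnt)]
    pos += 6 * typecnt
    return times, index, ttinfo, data[pos:pos + charcnt]


def local_type(data, when):
    """Binary-search the transitions for the type in effect at POSIX time 'when'."""
    times, index, ttinfo, abbrevs = parse_tzif_v1(data)
    i = bisect_right(times, when)
    utoff, isdst, desig = ttinfo[index[i - 1]] if i else ttinfo[0]
    name = abbrevs[desig:abbrevs.index(b"\0", desig)].decode("ascii")
    return utoff, bool(isdst), name


summer = local_type(blob, ts(1967, 7, 4, 12))
winter = local_type(blob, ts(1967, 12, 25, 12))
```

Because the transitions are sorted, a binary search finds the applicable type in logarithmic time, which is the efficiency the compiled format is meant to provide.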
Examples like these highlight significant policy shifts, such as the 2007 US DST extension under the Energy Policy Act of 2005, which advanced the start to the second Sunday in March (Rule US 2007 max - Mar Sun>=8 2:00 1:00 D) and delayed the end to the first Sunday in November (Rule US 2007 max - Nov Sun>=1 2:00 0 S), adding about a month of DST annually and requiring tz database updates to reflect the new transitions in affected zones.[12][14]
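The "about a month" figure can be checked by counting the days between the start and end dates under each rule set. The sketch below hard-codes the two US rule pairs quoted above (Apr Sun>=1 to Oct lastSun, and Mar Sun>=8 to Nov Sun>=1); the function names are hypothetical, and the count uses calendar dates only, ignoring the 2:00 transition hour.

```python
from datetime import date, timedelta


def sunday_on_or_after(year, month, day):
    """First Sunday on or after the given date (zic's 'Sun>=N')."""
    d = date(year, month, day)
    return d + timedelta(days=(6 - d.weekday()) % 7)


def last_sunday(year, month):
    """Last Sunday of the month (zic's 'lastSun')."""
    first_of_next = date(year + (month == 12), month % 12 + 1, 1)
    d = first_of_next - timedelta(days=1)
    return d - timedelta(days=(d.weekday() - 6) % 7)


def dst_days_old(year):
    """Days of DST under the 1987-2006 rules."""
    return (last_sunday(year, 10) - sunday_on_or_after(year, 4, 1)).days


def dst_days_new(year):
    """Days of DST under the rules in force from 2007."""
    return (sunday_on_or_after(year, 11, 1) - sunday_on_or_after(year, 3, 8)).days


extra_days_2007 = dst_days_new(2007) - dst_days_old(2007)
```

For 2007 the difference comes to four weeks; the exact figure depends on where the Sundays fall in a given year.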