HTTP referer
The HTTP Referer header (the name is a misspelling of the English word "referrer" that the specification retains) is an optional HTTP request header field that identifies the URI of the resource from which the user agent (such as a web browser) initiated the request, allowing the receiving server to determine the referring page or context.[1] This header was first defined in HTTP/1.0 and formalized in subsequent standards, enabling servers to track navigation origins for purposes like analytics, logging, maintenance of back-link lists, and basic security checks against unauthorized access.[1] The value is typically a full URI or a relative reference to the previous resource, but its inclusion is not mandatory and depends on user agent configuration, protocol security levels, and policy settings.
While useful for legitimate web operations, the Referer header raises significant privacy and security concerns because it can inadvertently disclose sensitive information, such as query parameters containing personal data, authentication tokens, or full URLs from secure (HTTPS) contexts when requests are made to insecure (HTTP) endpoints.[2] For instance, clicking a link from a banking site to an external resource might expose the full referring URL, including account details, to third-party servers, potentially enabling tracking or phishing attacks.[3] To mitigate these risks, modern web standards introduce the Referrer-Policy response header, which lets site owners control how much referrer information is sent: options range from the full URL, to the origin only, to no referrer at all. User agents also often default to stricter policies for cross-origin requests or when a request downgrades from HTTPS to HTTP. Additionally, browsers may omit the header entirely in private browsing modes or when navigating from secure to insecure sites to protect user privacy.[2]
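The effect of a policy on the value that is sent can be sketched in a few lines of Python. This is a simplified illustration rather than the algorithm from the Referrer Policy specification: the function name `referrer_for` and the example URLs are hypothetical, only four policy values are modelled, and "secure" is approximated as "the scheme is https".

```python
from urllib.parse import urlsplit, urlunsplit


def _origin(url: str) -> tuple:
    """Return (scheme, host, port) with the default port filled in."""
    parts = urlsplit(url)
    default = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default)


def referrer_for(policy: str, source: str, target: str):
    """Return the Referer value to send, or None to omit the header."""
    src = urlsplit(source)
    host = src.hostname or ""
    if ":" in host:  # IPv6 literal: restore the brackets
        host = f"[{host}]"
    netloc = host if src.port is None else f"{host}:{src.port}"
    # Fragment and userinfo are dropped whatever the policy says.
    full = urlunsplit((src.scheme, netloc, src.path or "/", src.query, ""))
    origin_only = urlunsplit((src.scheme, netloc, "/", "", ""))
    # Simplification: a downgrade is an https page requesting a non-https URL.
    downgrade = src.scheme == "https" and urlsplit(target).scheme != "https"

    if policy == "no-referrer":
        return None
    if policy == "origin":
        return origin_only
    if policy == "unsafe-url":
        return full
    if policy == "strict-origin-when-cross-origin":
        if downgrade:
            return None
        return full if _origin(source) == _origin(target) else origin_only
    raise ValueError(f"policy not modelled in this sketch: {policy}")
```

Under these assumptions, a page at `https://bank.example/accounts?id=42#summary` linking to `https://other.example/` with `strict-origin-when-cross-origin` would send only `https://bank.example/`, while the same link to an `http:` target would send nothing.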
The header's syntax is given in ABNF notation as Referer = absolute-URI / partial-URI. The header is sent only in requests, not responses, and its field name is case-insensitive, though implementations typically write "Referer".[1] Despite its utility in web analytics and anti-bot measures, evolving privacy regulations like GDPR and browser features such as Intelligent Tracking Prevention have led to reduced reliance on Referer data, with alternatives like the Fetch Metadata Request Headers providing safer context without full URL exposure.[3] Overall, while the Referer remains a core part of HTTP semantics, its use is increasingly balanced against user privacy protections in contemporary web development.
Origins and Etymology
Etymology
The HTTP "Referer" header derives its name from a misspelling of the standard English term "referrer," which denotes the source providing a reference or directing a user to another resource. This typographical error occurred in the initial proposal for the header by computer scientist Phillip Hallam-Baker, who suggested incorporating the field into the HTTP specification. Hallam-Baker later acknowledged responsibility for the error in a 1995 IETF mailing list discussion, noting that he had suggested the field without initially naming it, and Roy Fielding proposed the misspelled "Referer" as the header name.[4]
The misspelling persisted into the formal HTTP/1.0 specification published as RFC 1945 in 1996, where it was defined without correction due to emerging implementations already adopting the term. Subsequent standards, including HTTP/1.1 in RFC 2616 (1999) and later revisions, retained "Referer" to ensure backward compatibility with existing web infrastructure, avoiding disruptions to deployed systems. Early contributors from the IETF working group, including Tim Berners-Lee, prioritized protocol stability over orthographic accuracy, solidifying the non-standard spelling across the web ecosystem.
Linguistically, "referer" functions as a back-formation from "referral," treating the header as an agent of the referral process, but it deviates from conventional English morphology where "referrer" preserves the double 'r' from the verb "refer" (as in "programmer" from "program"). This irregularity mirrors other technical terms where variant spellings have become entrenched, such as "advisor" (American English) versus "adviser" (British English), both accepted despite differing conventions.[5] The persistence of "referer" underscores how protocol design often favors practical consistency over linguistic precision in computing standards.
Historical Development
The Referer header was first standardized as an optional request header in the HTTP/1.0 specification, RFC 1945, published in May 1996, although it had already appeared in earlier HTTP drafts, including the W3C request headers specification updated on May 3, 1994.[6] Section 10.13 of RFC 1945 defined it as the URI of the originating resource (typically the page containing a hyperlink or form) and intended it to help servers generate back-links, log navigation patterns, optimize caching, and trace problems such as obsolete links. The syntax allowed absolute or relative URIs but excluded fragments, and the header was not to be sent for requests lacking a clear origin, such as an address typed on the keyboard. The authors, Tim Berners-Lee, Roy T. Fielding, and Henrik Frystyk Nielsen, emphasized user control over its transmission due to early privacy considerations.[7]
The header persisted without alteration in the HTTP/1.1 specification, formalized in RFC 2616 and published in June 1999, maintaining its role in enabling server-side analysis while recommending against its use in non-secure requests following secure ones to protect sensitive URIs. During the mid-1990s, initial browser support emerged with implementations in early web clients like Netscape Navigator (version 1.0 released in December 1994) and successors, aligning with the growing adoption of HTTP for hypermedia systems. In the late 1990s, the referer became instrumental in rudimentary web analytics, as server log analyzers such as Analog (launched in 1995) parsed it to track referral sources and user paths on nascent websites.[8][9][10]
By the early 2000s, growing recognition of privacy risks—such as unintended disclosure of browsing history or secure site details—prompted partial browser implementations, including options to suppress or strip the header in cross-protocol transitions (e.g., HTTPS to HTTP). This evolution culminated in the 2014 update via RFC 7231, which redefined HTTP/1.1 semantics across multiple documents but retained the referer header's name and function unchanged for backward compatibility, despite ongoing discussions within the IETF HTTP Working Group about potential renaming to "referrer" that were ultimately rejected to avoid breaking existing systems. The specification reiterated safeguards, advising against its inclusion in downgraded security contexts.[11][12]
Technical Functionality
Header Syntax and Values
The Referer header field follows the general HTTP header syntax, consisting of the field name "Referer" followed by a colon and a space, then the field value as a URI reference that identifies the referring resource from which the request originated.[13] The value adheres to the URI-reference production defined in RFC 3986, allowing either an absolute URI or a partial URI (relative reference). This structure enables servers to contextualize requests without requiring a full URL in all cases.[13]
Possible values for the Referer header include absolute URLs, such as https://example.com/page.html, which specify the complete scheme, host, path, and optional query parameters.[13] Partial URIs, like /path/to/resource?query=value, represent relative paths from the same origin, omitting the scheme and authority components.[13] An empty value or omission of the header indicates no referrer information is provided, often due to policy or lack of a referring context.[13]
Encoding rules for the Referer value require percent-encoding of special characters as per URI syntax in RFC 3986; for instance, spaces are encoded as %20, and non-ASCII characters are percent-encoded from their UTF-8 bytes. Query parameters are included if present in the referring URI, but the fragment identifier (e.g., #section) must not be sent, as it is client-side and irrelevant to the server; the userinfo component (a user name or password embedded in the URI) must likewise be omitted.[13] This ensures the value remains a valid, parsable reference without extraneous components.
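A minimal Python sketch of these rules follows. The function name `referer_value` and the sets of characters left unencoded are assumptions made for illustration; the sketch also assumes an ASCII host name and that any "%" already present begins a valid escape sequence.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched: "%" keeps existing escapes intact, and the rest
# are delimiters that RFC 3986 allows unencoded in a path or query.
_PATH_SAFE = "/%:@!$&'()*+,;="
_QUERY_SAFE = _PATH_SAFE + "?"


def referer_value(referring_url: str) -> str:
    """Build a Referer field value from a referring URL (ASCII host assumed)."""
    parts = urlsplit(referring_url)
    host = parts.hostname or ""           # hostname excludes any userinfo
    if ":" in host:                       # IPv6 literal: restore the brackets
        host = f"[{host}]"
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # quote() percent-encodes spaces as %20 and non-ASCII text as UTF-8 bytes.
    path = quote(parts.path, safe=_PATH_SAFE)
    query = quote(parts.query, safe=_QUERY_SAFE)
    # The empty last element drops the fragment.
    return urlunsplit((parts.scheme, netloc, path, query, ""))
```

For example, `referer_value("https://user:pw@example.com/my page/café?q=a b#frag")` yields `https://example.com/my%20page/caf%C3%A9?q=a%20b`: the credentials and fragment are gone, and the space and "é" are percent-encoded.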
In edge cases involving HTTPS origins requesting mixed content over HTTP, user agents SHOULD NOT send the Referer header to prevent leakage of secure URL details.[14] For invalid or malformed values, such as non-URI strings, HTTP servers handle them as opaque strings per general field parsing rules, often ignoring or logging them without error.
The formal grammar for the Referer header, as specified in RFC 9110 (which obsoletes RFC 7231), is given in Augmented Backus-Naur Form (ABNF) below:
Referer = absolute-URI / partial-URI
absolute-URI = scheme ":" hier-part [ "?" query ]
partial-URI = relative-part [ "?" query ]
These productions reference additional definitions from RFC 9110 (e.g., query, hier-part, relative-part) and RFC 3986 for complete URI components.[13]
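To make the two alternatives of the grammar concrete, the following Python sketch sorts a received field value into absolute-URI, partial-URI, empty, or invalid. It is an approximation under stated assumptions, not a full RFC 3986 validator: the function name `classify_referer` and its return labels are hypothetical, and individual characters are checked only loosely.

```python
import re
from urllib.parse import urlsplit

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )   (RFC 3986, Section 3.1)
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def classify_referer(value: str) -> str:
    """Return 'absolute-URI', 'partial-URI', 'empty', or 'invalid'."""
    value = value.strip()
    if not value:
        return "empty"
    # URIs are printable ASCII without spaces; a fragment is not allowed here.
    if any(ord(ch) < 0x21 or ord(ch) > 0x7E for ch in value) or "#" in value:
        return "invalid"
    try:
        parts = urlsplit(value)
    except ValueError:                    # e.g. an unbalanced "[" in the host
        return "invalid"
    if "@" in parts.netloc:               # userinfo must not be sent
        return "invalid"
    if _SCHEME.match(value):
        return "absolute-URI"
    # A relative reference may not have ":" in its first path segment.
    first_segment = value.split("/", 1)[0].split("?", 1)[0]
    if ":" in first_segment:
        return "invalid"
    return "partial-URI"
```

A server would normally treat an "invalid" result as an opaque string to log or ignore rather than as a reason to reject the request.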
Usage in HTTP Requests and Responses
The HTTP Referer header is exclusively a request header field sent by user agents, such as web browsers, to indicate the address of the referring resource from which the current request originated.[15] It is included in outgoing HTTP requests when the user agent fetches a resource that was invoked or obtained from another URI, enabling the recipient server to identify the source for purposes like logging or link validation. According to the HTTP/1.1 specification, user agents SHOULD send the Referer field whenever they have knowledge of the referring URI, but its inclusion is optional and subject to implementation choices.[15]
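On the receiving side, reading the field for logging can be sketched as follows. The helper names `get_referer` and `referring_host` and the header list format are assumptions for illustration; because clients may omit or alter the field, the result should be treated as untrusted input.

```python
from typing import Iterable, Optional, Tuple
from urllib.parse import urlsplit


def get_referer(headers: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the Referer field value, matching the name case-insensitively."""
    for name, value in headers:
        if name.lower() == "referer":     # note the single-r spelling
            return value.strip() or None
    return None


def referring_host(headers: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the host of the referring page, or None if it is unavailable."""
    value = get_referer(headers)
    if value is None:
        return None
    try:
        # hostname is None for a partial URI such as "/local/page".
        return urlsplit(value).hostname
    except ValueError:                    # malformed value: treat as absent
        return None
```

A request carrying `REFERER: https://news.example/article?id=7` would give `news.example` here, while a request with no Referer field, or with only a relative reference, gives `None`.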
User agents trigger the Referer header in various resource fetch scenarios, including navigation via hyperlinks (e.g., clicking an <a> tag to perform a GET request), form submissions (e.g., POST requests from