Percent-encoding
Percent-encoding, also known as URL encoding, is a mechanism for encoding special or non-ASCII characters in Uniform Resource Identifiers (URIs) by replacing them with a percent sign (%) followed by two hexadecimal digits representing the octet value of the character in its encoding (typically UTF-8 for non-ASCII characters).[1] This process ensures that URIs remain syntactically correct and unambiguous across different systems, preventing reserved characters from being misinterpreted as delimiters or structural elements.[2] It is essential for representing arbitrary data in web addresses, such as spaces in query parameters or non-Latin characters in paths, while maintaining compatibility with the limited US-ASCII character set required for safe transmission over the internet.[3] In percent-encoding, only unreserved characters—letters (A-Z, a-z), digits (0-9), hyphen (-), period (.), underscore (_), and tilde (~)—may appear literally without encoding in most URI components, as they pose no risk of confusion.[4] Reserved characters, divided into generic delimiters (such as : / ? # [ ] @) and sub-delimiters (! $ & ' ( ) * + , ; =), must be percent-encoded when used as data rather than for their syntactic roles, to avoid altering the URI's structure.[5] For example, a space is encoded as %20, a hash (#) as %23, and a non-ASCII character such as é is encoded by percent-encoding its UTF-8 octet sequence, yielding %C3%A9.[1] Decoding reverses this by converting %HH sequences back to their original octets, with uppercase hexadecimal digits (A-F) preferred when producing encodings for consistency.[1]
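As a brief, non-normative illustration using Python's standard urllib.parse module, encoding and decoding are inverse operations on these escaped octets:

    # Round-trip example using Python's urllib.parse (illustrative only).
    from urllib.parse import quote, unquote

    encoded = quote("café #1")        # space -> %20, '#' -> %23, 'é' -> %C3%A9
    print(encoded)                    # caf%C3%A9%20%231
    print(unquote(encoded))           # café #1
    print(unquote("caf%c3%a9"))       # lowercase hex digits decode identically: café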
The concept originated in the early development of the web, first specified in RFC 1738 for Uniform Resource Locators (URLs) in 1994, which introduced percent-encoding to handle unsafe characters in network paths.[6] It was refined in RFC 2396 for generic URIs in 1998 and fully standardized in RFC 3986 in 2005, which obsoleted prior versions and clarified rules for internationalized resource identifiers (IRIs) via UTF-8 encoding.[7] Today, percent-encoding is widely implemented in web technologies, including HTML forms, HTTP requests, and JavaScript APIs, ensuring robust handling of user-generated content in URLs.[2]
Overview
Definition and Purpose
Percent-encoding is a mechanism for representing data octets within components of a Uniform Resource Identifier (URI) or similar contexts by replacing characters outside the allowed set or those that could interfere with parsing—such as reserved delimiters—with a percent sign (%) followed by two uppercase hexadecimal digits corresponding to the octet's value.[1] This method, formally defined as pct-encoded = "%" HEXDIG HEXDIG, ensures that the encoded form adheres to the US-ASCII subset while preserving the original data.[8]
The primary purpose of percent-encoding is to enable the safe and unambiguous transmission of arbitrary binary data or non-standard characters across network protocols that rely on specific characters as structural delimiters, such as in URIs where characters like /, ?, or # define components.[9] By transforming potentially conflicting characters into a standardized escape sequence, it prevents misinterpretation by parsers, servers, or intermediaries, thereby supporting the inclusion of spaces, non-ASCII text, or other unsafe elements in web addresses and queries without disrupting protocol syntax.[1]
A common example is the encoding of a space character (ASCII 32), which becomes %20 to avoid its interpretation as a query separator.[9] For non-ASCII characters, the process involves first converting the character to its UTF-8 byte sequence and then percent-encoding each byte; for instance, the accented letter é (Unicode U+00E9) is UTF-8 encoded as the bytes C3 A9, resulting in %C3%A9.[10]
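A minimal sketch of this two-step conversion in Python, used here only to expose the UTF-8 octets (the variable names are illustrative):

    # Percent-encode a single non-ASCII character by first obtaining its UTF-8 octets.
    char = "é"                                    # Unicode U+00E9
    octets = char.encode("utf-8")                 # b'\xc3\xa9'
    encoded = "".join(f"%{b:02X}" for b in octets)
    print(encoded)                                # %C3%A9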
Although frequently called URL encoding in informal contexts, percent-encoding specifically refers to this %-hexadecimal escape mechanism and differs from specialized encodings like Punycode, which handles internationalized domain names by mapping Unicode to ASCII-compatible strings.[2][11]
History and Development
Percent-encoding emerged in the early 1990s alongside the foundational development of the World-Wide Web, serving as a mechanism to represent non-ASCII and unsafe characters in network addresses while ensuring compatibility with 7-bit ASCII transmission. Initially implemented in WWW software as early as 1990, it addressed the need to avoid corruption of URIs in environments with limited character sets, such as those using only printable US-ASCII characters. This approach drew from broader traditions of encoding binary or 8-bit data for safe transit over text-based protocols, predating the widespread adoption of UTF-8 for internationalization.[12]
The technique saw its first informal applications around 1993 with the introduction of the Common Gateway Interface (CGI) by the National Center for Supercomputing Applications (NCSA), where it facilitated the encoding of form data in query strings passed to server-side scripts. As web usage expanded rapidly following the release of graphical browsers like NCSA Mosaic in 1993, variations in how percent-encoding was handled across early implementations highlighted the need for clearer guidelines to prevent interoperability issues. These early uses built on draft specifications for uniform resource locators, emphasizing the percent sign ("%") followed by two hexadecimal digits to escape problematic octets without conflicting with existing syntax like Unix paths or attribute-value pairs.[13][14]
A key milestone came in 1994 with RFC 1630, authored by Tim Berners-Lee, which formally introduced percent-encoding for relative uniform resource identifiers (URIs) in the WWW context, defining it as a way to encode reserved and unsafe characters while maintaining hierarchical structure. This was further refined in RFC 1738 later that year, specifying mandatory escaping for certain octets and reserved delimiters to ensure consistent interpretation across Internet protocols. By 1998, RFC 2396 provided a comprehensive generic syntax for URIs, clarifying ambiguities from prior documents like RFC 1738 and RFC 1808, and expanding the set of unreserved characters to include tilde (~) while standardizing the escaping mechanism for broader applicability.[12][13][15]
The evolution continued with RFC 3986 in 2005, which updated the URI syntax to better support internationalization by integrating UTF-8 encoding: non-ASCII characters are first converted to UTF-8 octets, then percent-encoded if not unreserved. This revision recommended uppercase hexadecimal digits for consistency, addressed normalization challenges, and replaced terms like "escaped" with "percent-encoded" for precision, reflecting lessons from two decades of web deployment and the shift toward global character handling. These IETF efforts, driven by practical inconsistencies in early browser and server implementations, solidified percent-encoding as a core component of web standards.[7]
Encoding Mechanism
Encoding Process
The encoding process for percent-encoding involves transforming input data, typically represented as a sequence of characters, into a sequence of bytes using UTF-8 encoding, then replacing each byte that does not belong to the unreserved set with a percent sign (%) followed by its two-digit uppercase hexadecimal representation.[10] This mechanism ensures that the encoded data can be safely transmitted within URI components without ambiguity, as defined in the URI generic syntax standard.[1] For instance, a space character (Unicode U+0020), which encodes to the byte 0x20 in UTF-8, becomes %20.[1] The algorithmic steps for encoding are as follows (a sketch implementing them appears after the list):
- Convert the input string to a byte sequence using UTF-8 encoding.[10]
- For each byte in the sequence, determine if it corresponds to an unreserved character (A-Z, a-z, 0-9, hyphen '-', period '.', underscore '_', or tilde '~').[4]
- If the byte is unreserved, output it directly as a character.[1]
- If the byte is not unreserved, output a percent sign (%) immediately followed by the uppercase hexadecimal representation of the byte value, padded to two digits (e.g., byte 0xA3 becomes %A3).[1]
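A minimal Python sketch implementing these steps; the function name percent_encode and the spelled-out unreserved set are illustrative, not a standard API:

    # Illustrative percent-encoder following the steps above (not a library routine).
    UNRESERVED = set(
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789-._~"
    )

    def percent_encode(text: str) -> str:
        result = []
        for byte in text.encode("utf-8"):      # step 1: UTF-8 byte sequence
            char = chr(byte)
            if char in UNRESERVED:             # steps 2-3: unreserved bytes pass through
                result.append(char)
            else:                              # step 4: escape as %HH with uppercase hex
                result.append(f"%{byte:02X}")
        return "".join(result)

    print(percent_encode("a space and é"))     # a%20space%20and%20%C3%A9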
Standard library routines perform this process in practice: JavaScript's encodeURIComponent() function implements percent-encoding for URI components by UTF-8 encoding the input and escaping non-unreserved bytes as %HH, per the ECMAScript specification.[17] Similarly, Python's urllib.parse.quote() applies percent-encoding to strings using UTF-8 bytes, leaving specified safe characters unencoded.[18]
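For example, urllib.parse.quote might be used as follows; the outputs shown assume the documented defaults (in particular safe='/'):

    from urllib.parse import quote

    print(quote("hello world"))          # hello%20world  ('/' is left unescaped by default)
    print(quote("a/b&c", safe=""))       # a%2Fb%26c      (safe="" escapes the reserved '/' too)
    print(quote("é"))                    # %C3%A9         (non-ASCII input is UTF-8 encoded first)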
Character Classification
In percent-encoding, characters are classified based on their syntactic roles and compatibility within protocols like URIs, determining whether they can remain literal or must be encoded as a sequence of three octets: a percent sign (%) followed by two hexadecimal digits representing the byte value.[1] This classification ensures unambiguous transmission of data across systems, preventing misinterpretation of delimiters or invalid characters.[1]
Unreserved characters are those that pose no risk of confusion in most contexts and thus can be transmitted literally without percent-encoding. These include uppercase and lowercase letters (A-Z and a-z), digits (0-9), and the symbols hyphen (-), period (.), underscore (_), and tilde (~).[4] For example, the string "example.com" can appear unchanged, as all its characters fall into this set.[4] Although unreserved characters may optionally be percent-encoded (e.g., "A" as %41), such forms are considered equivalent and should be decoded for normalization to avoid redundancy.[4]
Reserved characters, by contrast, have special syntactic purposes in protocols such as URIs and must be percent-encoded when used in a data role rather than their delimiter function, to avoid altering the structure.[5] They are divided into two subsets: general delimiters (gen-delims), which include colon (:), slash (/), question mark (?), number sign (#), left and right square brackets ([ and ]), and commercial at (@); and sub-delimiters (sub-delims), which encompass exclamation mark (!), dollar sign ($), ampersand (&), single quote ('), left and right parentheses ( ( and ) ), asterisk (*), plus sign (+), comma (,), semicolon (;), and equals sign (=).[5] For instance, a slash (/) in a path component might delimit segments but requires encoding as %2F if intended as literal data within a segment.[5]
Characters outside the unreserved and reserved sets—such as non-ASCII characters, control characters (e.g., NUL, carriage return, or line feed), or any octet not in the US-ASCII range (0-127)—must always be percent-encoded for safety and compatibility.[1] Non-ASCII and control characters are first converted to their UTF-8 byte representation before encoding, ensuring portability across character sets like EBCDIC.[10] This byte-level approach, rather than character-level, allows percent-encoding to handle international text by representing each UTF-8 octet individually (e.g., the accented character "à" becomes %C3%A0), supporting global interoperability while adhering to URI octet-sequence assumptions.[10]
The percent sign (%) itself receives special treatment and is never used literally in encoded data; it must always be encoded as %25 to prevent ambiguity with the start of a percent-encoded sequence.[1] This rule applies universally, even if % appears in unreserved or reserved contexts, safeguarding the integrity of the encoding mechanism.[1]
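These character classes can be spelled out explicitly; the sketch below mirrors the RFC 3986 sets and flags which octets of a string would need encoding (the helper function and its name are illustrative, not part of any standard library):

    # Character classes from RFC 3986, written out for illustration.
    UNRESERVED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")
    GEN_DELIMS = set(":/?#[]@")
    SUB_DELIMS = set("!$&'()*+,;=")

    def must_encode(byte: int, keep_reserved: bool = False) -> bool:
        """Return True if this octet must be percent-encoded when used as data."""
        char = chr(byte)
        if char in UNRESERVED:
            return False
        if keep_reserved and char in GEN_DELIMS | SUB_DELIMS:
            return False          # reserved characters kept only in their delimiter role
        return True               # reserved-as-data, '%', controls, and non-ASCII octets

    print([f"%{b:02X}" for b in "à%".encode("utf-8") if must_encode(b)])
    # ['%C3', '%A0', '%25']  -- the UTF-8 octets of 'à' plus '%' itself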
Applications
In Uniform Resource Identifiers (URIs)
Percent-encoding plays a crucial role in Uniform Resource Identifiers (URIs) by allowing the inclusion of characters that could otherwise interfere with the syntactic structure of URI components. According to RFC 3986, URIs consist of components such as the scheme, authority, path, query, and fragment, where percent-encoding is applied selectively to preserve delimiters and ensure unambiguous parsing. Reserved characters, which include generic delimiters like ":", "/", "?", "#", "[", "]", "@" and sub-delimiters like "!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "=", must be percent-encoded when used as data within a component to avoid confusion with their syntactic roles. Unreserved characters, such as alphanumeric characters and "-", ".", "_", "~", may be left unencoded but can be percent-encoded without altering equivalence.[5][4]
In the path component, percent-encoding is used to encode characters outside the allowed pchar production, which permits unreserved characters, percent-encoded octets, sub-delimiters, ":", and "@". For instance, spaces or other reserved characters in a file path must be encoded to prevent misinterpretation as path separators. The URI http://example.com/path with space/to file would be encoded as http://example.com/path%20with%20space/to%20file, where the forward slash "/" remains unencoded as it serves as the path delimiter. This preserves each path segment intact while leaving the hierarchy of segments unchanged.[19]
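Reproducing the path example with Python's urllib.parse.quote, whose documented default safe='/' leaves the path delimiter unescaped:

    from urllib.parse import quote

    path = "/path with space/to file"
    print(quote(path))               # /path%20with%20space/to%20file  (slashes kept as delimiters)
    print(quote("a/b", safe=""))     # a%2Fb  (a literal slash inside one segment must be escaped)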
The query component, following the "?" delimiter, employs percent-encoding more permissively, allowing pchar, "/", and "?" while encoding characters that might conflict with parameter delimiters like "&" and "=". For example, a parameter value containing an ampersand, as in name=John&Doe, must have the ampersand encoded as name=John%26Doe so that the query string is not parsed as multiple parameters. In practice, implementations often encode all non-alphanumeric characters except those explicitly needed for structure, though the RFC advises encoding only when necessary to avoid conflicts.[20]
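As a sketch, Python's urllib.parse.urlencode with quote_via=quote produces the %20-style query encoding discussed here (rather than the + convention used for form submission):

    from urllib.parse import urlencode, quote

    params = {"q": "hello world", "name": "John&Doe"}
    print(urlencode(params, quote_via=quote))
    # q=hello%20world&name=John%26Doe  -- the literal '&' in the value no longer splits parameters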
For the fragment component, introduced by "#", similar rules apply using the same allowed characters as the query, with percent-encoding for any data that could mimic delimiters. Browsers and user agents may automatically apply percent-encoding during URI navigation to handle fragments safely, but the semantics of fragments are media-type dependent rather than scheme-specific. The scheme and authority components generally prohibit or limit percent-encoding: schemes use no encoding and are case-insensitive, while authority parts like userinfo and registered names allow it for non-ASCII or reserved data via UTF-8 octet encoding.[21][22][23]
Common pitfalls in URI percent-encoding include double-encoding, where an already encoded sequence like "%20" is re-encoded to "%2520", leading to incorrect dereferencing. Implementations must avoid encoding or decoding the same string multiple times, as the percent character "%" itself requires encoding as "%25" when used as data. Hexadecimal digits in percent-encodings are case-insensitive per the RFC, but uppercase is conventionally preferred for consistency across systems. A complete example is the URI http://example.com/search?q=hello%20world#results, where the space in the query is encoded as "%20" to maintain parameter integrity, and the fragment remains unencoded unless containing reserved data.[9][1]
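The double-encoding pitfall can be reproduced with a short sketch; note that a single decode no longer recovers the original once the string has been encoded twice:

    from urllib.parse import quote, unquote

    once = quote("hello world")      # hello%20world
    twice = quote(once)              # hello%2520world  -- the '%' is re-encoded as %25
    print(twice)
    print(unquote(twice))            # hello%20world    -- one decode is no longer enough
    print(unquote(unquote(twice)))   # hello world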
In Form Data Submission
In the submission of HTML form data using the application/x-www-form-urlencoded media type, percent-encoding ensures that special characters in user input do not interfere with the structured transmission over HTTP. This format, the default for form enctype attributes, serializes form fields as a sequence of key-value pairs joined by & delimiters, with each pair formatted as key=value. Keys and values undergo percent-encoding to escape reserved characters, preventing conflicts with the syntax of the encoded string.[24][25]
The encoding process follows rules akin to those for URI query strings but incorporates a key distinction: spaces are replaced with + symbols, a convention established in early HTML specifications rather than the strict %20 used elsewhere. Other special characters, such as &, =, and non-ASCII bytes, are represented as % followed by two hexadecimal digits corresponding to their UTF-8 byte values. Non-ASCII characters are first converted to UTF-8 bytes, which are then individually percent-encoded if they belong to the form-urlencoded percent-encode set—encompassing all code points except ASCII alphanumerics and the symbols *, -, ., _. This set ensures safe transmission while minimizing encoding overhead for common characters. The historical use of + for spaces stems from conventions in initial web form handling, as defined in HTML 2.0.[26][27][28]
For instance, a form containing fields name with value "John Doe" and age with value "30" serializes to name=John+Doe&age=30. Here, the space in "John Doe" becomes +, while no further encoding is needed for the numeric "30" or the alphanumeric "John". If the name included a special character like "&", it would encode as name=John%26Doe. Upon receipt, servers or parsers decode + back to spaces and expand %HH sequences to their original bytes, reconstructing the form data.[29][30]
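For illustration, Python's urllib.parse.urlencode follows the form-urlencoded convention by default (spaces become +), and parse_qs reverses the serialization:

    from urllib.parse import urlencode, parse_qs

    body = urlencode({"name": "John Doe", "age": "30"})
    print(body)                       # name=John+Doe&age=30
    print(parse_qs(body))             # {'name': ['John Doe'], 'age': ['30']}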
Unlike the multipart/form-data format, which partitions data into labeled parts suitable for binary files and avoids universal percent-encoding, application/x-www-form-urlencoded treats all content as text and mandates encoding of potentially unsafe characters across the entire payload. This makes it more compact for simple textual submissions but less versatile. Modern web browsers handle this encoding automatically during form submission via GET (appending to the URI query) or POST (in the request body), ensuring compliance with the standard. However, for forms including file inputs, browsers default to multipart/form-data to accommodate binary uploads without encoding distortion.[31][32]
A primary limitation of this format is its incompatibility with binary data, as percent-encoding can inflate payload sizes significantly (e.g., each non-ASCII byte becomes three characters) and risks corruption if unencoded binary sequences coincide with syntactically significant characters such as % or &. It is thus recommended only for short, text-only forms, with multipart/form-data preferred for complex or binary-inclusive submissions.