Internationalized domain name

An Internationalized Domain Name (IDN) is a domain name that incorporates non-ASCII characters from Unicode, allowing registration and use of domain names, including top-level domains (TLDs), in scripts and languages beyond the basic Latin alphabet, such as Arabic, Chinese, Cyrillic, or Devanagari. These names enable internet users to access websites using familiar local scripts, promoting a more inclusive and multilingual global internet. IDNs are stored and transmitted in the Domain Name System (DNS) via an ASCII-compatible encoding called Punycode, prefixed with "xn--", while applications display the original Unicode form to users.
The core framework for IDNs is the Internationalizing Domain Names in Applications (IDNA) protocol, first standardized by the Internet Engineering Task Force (IETF) in 2003 as IDNA2003 (RFC 3490) to handle non-ASCII domain names without modifying the underlying DNS infrastructure. It was superseded by IDNA2008 (RFCs 5890–5894, published in 2010), which strengthened protections against spoofing attacks, supported newer Unicode versions, and refined character validation rules, including bidirectional text handling and context-specific restrictions. Punycode (RFC 3492) serves as the encoding algorithm, reversibly transforming Unicode strings into ASCII for DNS compatibility and ensuring interoperability across systems.
Development of IDNs began in the late 1990s through IETF working groups addressing the limitations of ASCII-only domain names, with initial guidelines emerging in 2003. The Internet Corporation for Assigned Names and Numbers (ICANN) endorsed these standards in March 2003 and published its first IDN Implementation Guidelines in June 2003, authorizing generic TLD (gTLD) registries to offer IDNs at the second level. ICANN's IDN ccTLD Fast Track Process, approved in October 2009, led to the delegation of the first IDN country-code TLDs (ccTLDs) into the DNS root zone in 2010, such as .рф for Russia and .الاردن for Jordan. By 2013, IDNs were integrated into the New gTLD Program, expanding availability to generic TLDs like .みんな (Japanese for "everyone").
ICANN continues to oversee IDN implementation through community-driven processes, including the Root Zone Label Generation Rules (RZ-LGR), which define permissible scripts and variants to prevent conflicts, with Version 6 released in September 2025 covering 27 scripts. As of June 2025, 61 IDN ccTLDs and 90 IDN gTLDs are operational, totaling 151 IDN TLDs, though adoption varies by region due to factors like script complexity and localization efforts. Ongoing IETF updates, including one published in 2020, keep IDNA compatible with evolving standards while maintaining stability and security.

Overview and Purpose

Definition and Technical Scope

An Internationalized Domain Name (IDN) is a domain name that permits the use of a wider range of characters than the traditional ASCII set, specifically incorporating characters from various scripts to represent labels in languages other than English. These non-ASCII characters are encoded into an ASCII-compatible encoding (ACE) format using Punycode, which transforms the Unicode string into a valid ASCII label prefixed with "xn--", ensuring compatibility with the existing Domain Name System (DNS) infrastructure. This encoding allows IDNs to be registered, resolved, and used without modifications to the core DNS protocols.
IDNs support a diverse array of scripts beyond the basic Latin alphabet used in ASCII-only domains, including extended Latin characters (e.g., with diacritics), Cyrillic (e.g., for Russian and Bulgarian), Arabic (e.g., for right-to-left languages), Chinese (e.g., Han ideographs), Devanagari (e.g., for Hindi), and many others defined in the Unicode standard. In contrast, ASCII-only domains are limited to the 26 letters, 10 digits, and hyphen from the US-ASCII repertoire, excluding scripts that require non-Latin glyphs. The scope of supported scripts is determined by the Unicode Consortium's character properties and ICANN's guidelines, which evaluate scripts for stability, confusability, and linguistic viability in domain labels.
IDNs integrate into the hierarchical structure of the DNS, where domain names are parsed from right to left across zones (e.g., top-level domains, second-level domains), with each label encoded as an A-label in the zone files and wire format. This approach preserves the DNS's reliance on ASCII for transmission and storage, mapping U-labels (the user-facing form) bidirectionally to A-labels during resolution without requiring protocol changes. Applications handle the conversion transparently, displaying U-labels to users while querying the DNS with A-labels.
A critical aspect of IDN validity involves Unicode normalization, particularly Normalization Form C (NFC), which canonically decomposes and recomposes characters to ensure consistent representation (e.g., combining diacritics into precomposed forms like "é" instead of separate "e" and acute accent). Strings must be normalized to NFC before processing to prevent equivalence issues, such as multiple encodings of the same label (e.g., NFC vs. NFD forms), thereby maintaining uniqueness and security in registrations. Normalization Form D (NFD) may appear in input but is converted to NFC for IDNA compliance.
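The normalization step described above can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch rather than part of any IDNA implementation; the helper name `is_nfc` is introduced here purely for illustration.

```python
import unicodedata

composed = "caf\u00e9"      # "é" as one precomposed code point (U+00E9)
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# The two strings render identically but are different code point sequences.
different_sequences = composed != decomposed

# NFC recombines the base letter and the combining accent.
nfc = unicodedata.normalize("NFC", decomposed)
# NFD splits the precomposed letter back into base plus accent.
nfd = unicodedata.normalize("NFD", composed)


def is_nfc(label: str) -> bool:
    """Return True if the label is already in Normalization Form C."""
    return unicodedata.normalize("NFC", label) == label
```

A registry or application could use a check like `is_nfc` to reject or normalize input before any further IDNA processing, so that both spellings of "café" resolve to a single label.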

Historical Context and Benefits

In the pre-IDN era, the Domain Name System (DNS) was limited to the ASCII character set, which supported only Latin-based scripts and effectively excluded non-English languages such as Arabic, Chinese, Cyrillic, and Devanagari from domain names. This restriction forced users in non-Latin script regions to rely on transliteration, approximating native terms with Roman characters, which often resulted in ambiguities, misspellings, and challenges for accurate domain entry, particularly in the 1990s for languages like Japanese and Arabic. For instance, Japanese users faced difficulties with romanized addresses that did not intuitively match their native script, hindering effective internet navigation.
The drive for IDNs accelerated with the explosive growth of internet adoption in non-English speaking regions after 1995, including Asia and the Middle East, where billions of potential users encountered barriers due to the English-centric DNS. Early multilingual initiatives, such as the Tamil Internet project in 1995 and Chinese script-based email systems that same year, underscored the urgency for native script support in web addressing. In response, the Internet Engineering Task Force (IETF) and the Internet Corporation for Assigned Names and Numbers (ICANN) began addressing these multilingual needs; the IETF formed its IDN Working Group in 2000 to standardize approaches, while ICANN collaborated through the Multilingual Internet Names Consortium (MINC), founded that July, to advocate for global inclusivity.
Pioneering proposals emerged in 1998, including an IETF draft that outlined internationalizing host names via UTF-5 encoding to enable non-ASCII characters in domains. Concurrently, the Asia Pacific Networking Group (APNG) launched a testbed in the second half of 1998, providing one of the first practical demonstrations of IDN functionality and paving the way for broader experimentation.
IDNs offer key benefits by enhancing usability for native speakers, allowing them to enter and recall domain names in familiar scripts like Cyrillic or Hangul, which simplifies online navigation compared to ASCII transliterations. They promote cultural relevance by enabling domains that reflect local identities and languages, such as .рф for Russian content or .中国 for Chinese users, fostering a more representative digital presence. Furthermore, IDNs reduce entry errors for non-English users by eliminating the need to approximate scripts, and they advance digital inclusion by empowering approximately 75% of the global internet population—who primarily use non-English languages (as of 2024)—to participate fully in the online ecosystem.

Technical Standards

IDNA Protocol Fundamentals

The Internationalized Domain Names in Applications (IDNA) protocol suite provides the foundational standards for incorporating non-ASCII characters into domain names, enabling their use across the internet while preserving compatibility with the existing Domain Name System (DNS), which is limited to ASCII characters. Developed by the Internet Engineering Task Force (IETF), IDNA operates at the application layer, converting internationalized labels into an ASCII-Compatible Encoding (ACE) form that can be processed by DNS resolvers without requiring modifications to the DNS protocol itself. This approach ensures that domain names in scripts such as Arabic, Chinese, or Cyrillic can be registered, resolved, and displayed in user-facing applications.
A key prerequisite for IDNA is the Unicode standard, which defines a universal character repertoire encompassing over 140,000 characters from various writing systems, serving as the basis for representing internationalized labels. IDNA relies on Unicode code points (ranging from 0x0000 to 0x10FFFF) to identify valid characters, with normalization to Normalization Form C (NFC) often applied in applications to ensure consistent representation. Without this Unicode foundation, the protocol could not systematically map diverse scripts to DNS-compatible formats.
The initial IDNA2003 specification, outlined in RFC 3490 along with supporting documents RFC 3491 (Nameprep), RFC 3492 (Punycode), and RFC 3454 (Stringprep), introduced the core mechanism for encoding Unicode labels into Punycode-based ACE strings prefixed with "xn--", allowing backward compatibility with ASCII-only systems.
In contrast, IDNA2008, detailed in RFC 5890 (definitions and document framework), RFC 5891 (protocol), RFC 5892 (Unicode code points and IDNA), RFC 5893 (right-to-left scripts), and RFC 5894 (background, explanation, and rationale), obsoletes much of IDNA2003 by removing dependencies like Stringprep, rejecting unassigned Unicode code points, and introducing stricter rules for context-dependent characters such as zero-width joiners. Transitioning between versions posed challenges, including interoperability issues for existing registrations, as IDNA2008 alters the validity of certain labels (e.g., disallowing some symbols previously permitted) and shifts normalization responsibilities to applications rather than the protocol core.
At its heart, IDNA revolves around bidirectional conversion between user-readable Unicode (U-labels) and DNS-transmittable ACE (A-labels), ensuring that only valid, non-problematic characters are processed by excluding disallowed code points such as private-use characters or those causing visual confusion. These exclusions, defined in the code point tables of RFC 5892 for IDNA2008, prevent invalid or ambiguous strings from entering the DNS, promoting stability and security. In applications such as web browsers, email clients, and DNS resolvers, IDNA processes input strings to generate ACE for queries and reverses the process for output, thereby supporting IDNs without exposing users to the underlying encoding. This application-centric design allows global deployment of IDNs while minimizing disruptions to the ASCII-dominated internet infrastructure.
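The U-label to A-label conversion at the core of IDNA can be sketched with the `punycode` codec that ships with Python. The function names below are illustrative, and the sketch deliberately performs no validation, normalization, or length checking, so it shows only the encoding layer and not a complete IDNA implementation.

```python
ACE_PREFIX = "xn--"


def u_label_to_a_label(u_label: str) -> str:
    """Punycode-encode a non-ASCII label and add the ACE prefix (no validation)."""
    if u_label.isascii():
        return u_label  # plain ASCII labels are passed through unchanged
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def a_label_to_u_label(a_label: str) -> str:
    """Strip the ACE prefix and Punycode-decode the remainder (no validation)."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label  # not an A-label, nothing to decode
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


cafe_ace = u_label_to_a_label("caf\u00e9")          # "café"
rf_ace = u_label_to_a_label("\u0440\u0444")         # "рф"
```

Because Punycode is reversible, `a_label_to_u_label(u_label_to_a_label(x))` returns `x` for any label the codec can encode, which is the property the DNS relies on when applications display U-labels but query with A-labels.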

ToASCII and ToUnicode Processes

The ToASCII and ToUnicode processes form the core bidirectional conversion mechanisms in the Internationalized Domain Names in Applications (IDNA) protocol, enabling the transformation of Unicode-based domain labels (U-labels) into ASCII-Compatible Encoding (ACE) format for DNS compatibility and vice versa. These algorithms ensure that non-ASCII labels can be registered and resolved in the DNS while maintaining reversibility, with ToASCII producing an A-label from a U-label and ToUnicode performing the inverse operation. The processes differ between IDNA2003 (RFC 3490) and IDNA2008 (RFC 5891), with the latter introducing stricter validity rules and eliminating certain mappings present in the former.
In IDNA2003, the ToASCII algorithm processes an input sequence of Unicode code points with optional flags for allowing unassigned code points and enforcing STD3 ASCII rules. First, if the input consists entirely of ASCII characters (code points 0x00-0x7F), Nameprep is skipped; otherwise, the algorithm applies the Nameprep profile (RFC 3491), which includes case mapping, normalization, and prohibition of certain characters, and fails if errors occur. Next, if the UseSTD3ASCIIRules flag is set, it verifies compliance with STD3 restrictions: no non-LDH (Letter-Digit-Hyphen) ASCII characters (code points 0x00-0x2C, 0x2E-0x2F, 0x3A-0x40, 0x5B-0x60, and 0x7B-0x7F) and no leading or trailing hyphen (U+002D). For inputs containing non-ASCII code points, it confirms the sequence does not begin with the ACE prefix "xn--", then encodes the sequence using Punycode (RFC 3492) and prepends the "xn--" prefix. Finally, it checks that the resulting string length is between 1 and 63 characters. Failure at any step results in an error, preventing invalid labels from proceeding.
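The IDNA2003 steps above can be sketched in Python using the `nameprep` helper from the standard library's `encodings.idna` module together with the built-in `punycode` codec. This is a simplified illustration under stated assumptions: the function name `to_ascii_2003` and its `use_std3_ascii_rules` parameter are hypothetical, and the AllowUnassigned flag is not modeled.

```python
from encodings.idna import nameprep  # stdlib implementation of the Nameprep profile

ACE_PREFIX = "xn--"
LDH = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-")


def to_ascii_2003(label: str, use_std3_ascii_rules: bool = True) -> str:
    """Simplified sketch of RFC 3490 ToASCII for a single label."""
    # Steps 1-2: Nameprep runs only when the label has non-ASCII code points.
    if not label.isascii():
        label = nameprep(label)  # raises UnicodeError on prohibited input

    # Step 3: optional STD3 checks on the ASCII code points and the hyphens.
    if use_std3_ascii_rules:
        if any(ch.isascii() and ch not in LDH for ch in label):
            raise ValueError("non-LDH ASCII character in label")
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("leading or trailing hyphen")

    # Steps 4-7: Punycode-encode non-ASCII labels and prepend the ACE prefix.
    if not label.isascii():
        if label.lower().startswith(ACE_PREFIX):
            raise ValueError("label already begins with the ACE prefix")
        label = ACE_PREFIX + label.encode("punycode").decode("ascii")

    # Step 8: the final ASCII label must be 1 to 63 characters long.
    if not 1 <= len(label) <= 63:
        raise ValueError("label length must be between 1 and 63")
    return label
```

For example, `to_ascii_2003("Café")` lowercases the label through Nameprep before encoding, whereas an all-ASCII label such as `"example"` is returned unchanged after the STD3 and length checks.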
IDNA2008 refines ToASCII with a more rigorous structure divided into preparation, validity, and encoding phases, emphasizing Unicode Normalization Form C (NFC) as input and removing the mapping steps from Nameprep. Preparation ensures the input is in NFC and identifies whether it is a U-label (containing non-ASCII characters) or an A-label. Validity checks (Section 4.2) are stricter than in IDNA2003: the label must contain only permitted code points (PVALID, excluding DISALLOWED and UNASSIGNED per RFC 5892), with no leading or trailing hyphens, no "--" in the third and fourth positions, and no leading combining marks. Additionally, it enforces contextual rules defined in RFC 5892: CONTEXTJ permits joiner characters such as U+200C ZERO WIDTH NON-JOINER only in specific contexts, and CONTEXTO covers other characters that are valid only in specific contexts. For labels involving bidirectional (Bidi) scripts, it applies the Bidi rule from RFC 5893, which requires, among other criteria, that the first character have Bidi class L, R, or AL, that a right-to-left label (e.g., Arabic or Hebrew) end in an R, AL, EN, or AN character optionally followed by nonspacing marks, and that European and Arabic-Indic digits not be mixed in the same label. These criteria help prevent visual spoofing. Upon passing validity, non-ASCII U-labels are encoded via Punycode to produce an A-label, prefixed with "xn--". Unlike IDNA2003, which applied mappings to normalize inputs (e.g., case folding or canonical equivalents), IDNA2008 rejects invalid inputs outright without mapping, ensuring greater consistency but potentially higher rejection rates.
The ToUnicode operation reverses ToASCII, converting an A-label back to a U-label while validating to maintain protocol integrity. It first checks whether the input is an A-label by verifying the "xn--" prefix and the validity of the Punycode that follows; if the input is not prefixed or decoding fails, it treats the input as an all-ASCII label and returns it without alteration. For valid A-labels, decoding yields a Unicode string in NFC, which then undergoes the same validity checks as ToASCII, including code point categories, hyphen rules, contextual rules (CONTEXTJ/CONTEXTO), and Bidi rules. Invalid results, such as non-NFC forms, prohibited code points, or failed contextual or Bidi tests, cause the conversion to fail, returning the original input unchanged as a fallback to avoid breaking legacy ASCII domains. In IDNA2003, ToUnicode was simpler, relying on Punycode decoding without the extensive validity rules of IDNA2008, and it did not enforce NFC or Bidi rules explicitly. This symmetric design in both versions ensures that ToUnicode(ToASCII(U-label)) recovers the original U-label if valid, supporting reliable DNS operations.
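To illustrate the "reject rather than map" behavior of IDNA2008, the sketch below implements only the structural checks named above (NFC, hyphen placement, leading combining marks, no uppercase) using the standard library. It is an assumption-laden simplification: the function names are hypothetical, and the RFC 5892 code point categories, the CONTEXTJ/CONTEXTO rules, and the RFC 5893 Bidi rule are intentionally omitted.

```python
import unicodedata

ACE_PREFIX = "xn--"


def check_structure_2008(label: str) -> None:
    """Raise ValueError if a U-label fails the structural checks sketched here."""
    if not label:
        raise ValueError("empty label")
    if unicodedata.normalize("NFC", label) != label:
        raise ValueError("label is not in NFC")
    if any(ch.isupper() for ch in label):
        # IDNA2008 does not case-map; uppercase input is rejected, not folded.
        raise ValueError("uppercase characters are not permitted")
    if label.startswith("-") or label.endswith("-"):
        raise ValueError("leading or trailing hyphen")
    if label[2:4] == "--":
        raise ValueError("hyphens in both third and fourth positions")
    if unicodedata.category(label[0]).startswith("M"):
        raise ValueError("label begins with a combining mark")


def to_a_label_2008(label: str) -> str:
    """Validate structurally, then Punycode-encode non-ASCII labels."""
    check_structure_2008(label)
    if label.isascii():
        return label
    a_label = ACE_PREFIX + label.encode("punycode").decode("ascii")
    if len(a_label) > 63:
        raise ValueError("A-label longer than 63 characters")
    return a_label
```

With this sketch, `to_a_label_2008("café")` succeeds, while `"Café"` and the decomposed form `"cafe\u0301"` are rejected instead of being silently mapped, which mirrors the contrast with IDNA2003 described above.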

Encoding and Normalization Examples

Internationalized domain names (IDNs) require normalization to ensure consistency across different input methods and systems, typically using Unicode Normalization Form C (NFC), which composes characters where possible to create a canonical representation. This step precedes encoding into Punycode, the ASCII-Compatible Encoding (ACE) format prefixed with "xn--", allowing non-ASCII characters to be represented in the Domain Name System (DNS). For instance, the domain "café.com", where "é" is the precomposed Latin small letter e with acute (U+00E9), is already in NFC and encodes to "xn--caf-dma.com". If entered in decomposed form as "caf\u0065\u0301.com" (e followed by combining acute accent), NFC recombines it to the same precomposed "é", yielding identical output and preventing duplicate registrations.
For scripts with bidirectional text like Arabic, normalization and encoding must also adhere to bidirectional rules to maintain readability and security. The domain "مثال.مثال" (meaning "example.example" in Arabic) normalizes to NFC, ensuring consistent character composition, and encodes to "xn--mgbh0fb.xn--mgbh0fb". Arabic labels, being right-to-left (RTL), must begin with a right-to-left character and end with a right-to-left character or a digit, and a label cannot mix LTR and RTL characters in ways that violate the Bidi Rule, for example by placing Latin letters inside an RTL label. This prevents visual spoofing attacks in which reordered characters could confuse users.
Edge cases illustrate how IDNA handles variations. A label may not begin with a combining mark, and combining sequences that have a precomposed equivalent are composed under NFC before encoding; for example, "résumé.com" (with an acute accent on each e) encodes to "xn--rsum-bpad.com". Case mapping converts uppercase to lowercase (e.g., "Café.com" becomes "café.com" before encoding), so that domain names are insensitive to case. Invalid sequences, such as disallowed characters (e.g., spaces or emojis) or a leading combining mark, trigger errors in the ToASCII process, and the label is rejected.
Standard libraries facilitate encoding and decoding. In Python, the idna package (implementing IDNA2008, with optional UTS #46 compatibility mapping) can encode "café.com" using idna.encode('café.com'), returning b'xn--caf-dma.com'; when UTS #46 mapping is enabled, it also applies case mapping and NFC normalization to the input. Similarly, idna.decode(b'xn--mgbh0fb.xn--mgbh0fb') recovers the Arabic "مثال.مثال", demonstrating round-trip conversion while flagging edge cases such as Bidi rule violations.
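The worked examples in this section can be reproduced without third-party packages by using Python's built-in `idna` codec. Note the assumption: that codec implements the older IDNA2003 rules (Nameprep with NFKC), not IDNA2008, so it maps case and normalizes input rather than rejecting it. For these particular labels the resulting A-labels are the same.

```python
# Precomposed and decomposed spellings of "café.com".
composed = "caf\u00e9.com"
decomposed = "cafe\u0301.com"

# Nameprep normalizes both spellings, so they yield one A-label.
ace_composed = composed.encode("idna")
ace_decomposed = decomposed.encode("idna")

# Uppercase input is case-mapped before encoding under IDNA2003.
ace_upper = "Caf\u00e9.com".encode("idna")

# "résumé.com" with two precomposed e-acute characters.
ace_resume = "r\u00e9sum\u00e9.com".encode("idna")

# Arabic "example.example" (meem, theh, alef, lam in each label).
arabic = "\u0645\u062b\u0627\u0644.\u0645\u062b\u0627\u0644"
ace_arabic = arabic.encode("idna")
round_trip = ace_arabic.decode("idna")
```

The codec splits the name on dots and converts each label separately, which is why the ASCII "com" label passes through unchanged while the non-ASCII labels gain the "xn--" prefix.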

Implementation Frameworks

ICANN Guidelines and Updates

The Internationalized Domain Name (IDN) Guidelines were first published in 2003 as version 1.0, with subsequent updates including version 2.1 in 2006, establishing initial standards for second-level IDNs in generic top-level domains (gTLDs), emphasizing compliance with IETF protocols and measures to prevent script mixing unless linguistically justified. Over the subsequent years, the guidelines underwent iterative updates to address emerging challenges in global deployment, culminating in version 4.0 proposals that led to version 4.1, approved on 22 September 2022 and published in 2022. Version 4.1, which defers certain elements from 4.0 such as specific variant allocation rules (guidelines 11, 12, 13), became effective with full compliance required from registry operators by 30 April 2025, as announced on 28 October 2024. This evolution prioritizes enhanced variant handling to mitigate user confusion and ensures consistent operation across diverse scripts.
Key components of the guidelines impose specific requirements on TLD registries. Registries must validate IDN labels in strict adherence to the IETF's IDNA2008 standards (RFCs 5890–5893), prohibit labels with hyphens in both the third and fourth positions except in A-labels, and publish their supported Unicode repertoires in the IANA Repository of IDN Practices. For variant bundling, registries are mandated to allocate variant labels only to the same registrant or block them entirely, promoting the "same entity" principle to avoid fragmentation. Display rules further require that all code points within a label belong to the same Unicode script per Unicode Standard Annex #24, with limited exceptions for established orthographies, while minimizing risks from homoglyphs and whole-script confusables as defined in Unicode Technical Reports #36 and #39.
In October 2024, the Expedited Policy Development Process (EPDP) on IDNs Phase 2 released its final report, adopted by the GNSO Council on 13 November 2024, integrating rights protection mechanisms tailored to IDN contexts. This update aligns existing tools like the Uniform Domain-Name Dispute-Resolution Policy (UDRP), Uniform Rapid Suspension System (URS), and Trademark Clearinghouse (TMCH) with IDN variants, ensuring that suspensions or transfers under these mechanisms encompass entire variant sets while upholding the "same entity" principle, without expanding TMCH matching to include variants beyond exact matches. It also mandates harmonized IDN tables across variant gTLDs and outreach to educate stakeholders on variant impacts. In October 2025, ICANN launched a public comment on string similarity evaluation data for the next gTLD round, focusing on IDN variant assessments to enhance security and usability.
Recent advancements include ICANN's Universal Acceptance (UA) initiatives, which aim to guarantee IDN compatibility across software applications and systems. By July 2025, ICANN achieved a milestone in UA by enabling its account systems to fully support internationalized email addresses (Email Address Internationalization, or EAI), allowing sending and receiving of emails with non-ASCII domains. These efforts, ongoing through 2025, involve evaluating software readiness and forming expert working groups to develop implementation guidelines, ensuring that IDNs and new TLDs are treated equally. Complementing these guidelines, Root Zone Label Generation Rules (LGRs) provide script-specific tools for consistent label validation.

Root Zone Label Generation Rules (LGRs)

The Root Zone Label Generation Rules (RZ-LGRs) serve as standardized rulesets that define the permissible code points, variants, and validation criteria for Internationalized Domain Name (IDN) labels in the DNS root zone. These rules ensure a secure and stable operation of the root zone by specifying which characters from various scripts can form valid top-level domains (TLDs) and their associated variants, thereby minimizing risks of label confusion across different writing systems. For instance, in the Chinese (Han) script, LGRs generate variants that account for differences between simplified and traditional forms, allowing related labels to be treated as equivalents or blocked to prevent conflicts.
The development of RZ-LGRs follows a structured process involving script-specific Generation Panels composed of experts from relevant linguistic communities, who propose rules tailored to their scripts based on Unicode standards. These proposals are then reviewed and integrated by a centralized Integration Panel, appointed by ICANN, which ensures consistency across scripts while adhering to principles such as inclusion, stability, and conservatism. The process includes public comment periods for community feedback, culminating in ICANN's approval and publication of the unified ruleset. A notable example is the release of RZ-LGR-6 in September 2025, following a public comment proceeding initiated in June 2025, which integrated the Thaana script and provided updates for Bangla (Bengali), Japanese, and Khmer scripts to refine variant handling and code point repertoires.
Reference LGRs, which provide baseline rules adaptable for both root and second-level domains, have expanded to include new scripts and languages, such as Balinese and others added in November 2024, bringing the total to 27 script-based and 32 language-based reference LGRs. As of June 2025, over 11,000 IDN tables, each representing permitted code points for specific scripts or languages, have been published in the IANA Repository of IDN Practices, reflecting the cumulative output of these rulesets and supporting IDN deployment.
Developing LGRs presents challenges in balancing inclusivity, which promotes broad representation of scripts and languages to foster a multilingual Internet, with the need for stability in multi-script environments to avoid usability issues or security vulnerabilities like visual similarity attacks. The Integration Panel's methodology applies principles of inclusion alongside stability and conservatism to reconcile community-driven proposals, ensuring that expansions do not compromise DNS integrity.
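The way an LGR combines a code point repertoire with variant mappings and dispositions can be sketched with a toy ruleset. Everything in the block below is hypothetical illustration data: the three-character repertoire, the single simplified/traditional variant pair, and the "allocatable" policy set are not taken from the actual Root Zone LGR, and real LGRs also include whole-label evaluation rules that are not modeled here.

```python
from itertools import product

# Hypothetical toy repertoire and variant table (not the real RZ-LGR data).
REPERTOIRE = {"中", "国", "國"}
VARIANTS = {"国": {"國"}, "國": {"国"}}

# Hypothetical policy: variant labels listed here may be allocated to the
# same applicant; every other generated variant is blocked.
ALLOCATABLE = {("中国", "中國")}


def is_valid(label: str) -> bool:
    """A label is valid if every code point is in the repertoire."""
    return bool(label) and all(ch in REPERTOIRE for ch in label)


def variant_labels(label: str) -> set:
    """Generate all variant labels by substituting code point variants."""
    if not is_valid(label):
        raise ValueError("label contains code points outside the repertoire")
    choices = [sorted({ch} | VARIANTS.get(ch, set())) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def dispositions(label: str) -> dict:
    """Map each variant label to 'allocatable' or 'blocked'."""
    return {
        v: "allocatable" if (label, v) in ALLOCATABLE else "blocked"
        for v in variant_labels(label)
    }
```

In this toy model, applying for "中国" generates the variant "中國" and marks it allocatable, while applying for "中國" generates "中国" and leaves it blocked, which shows how dispositions can depend on which label is the primary one.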

Global Deployment

IDN Top-Level Domains (TLDs)

Internationalized country-code top-level domains (IDN ccTLDs) are managed through ICANN's IDN ccTLD Fast Track Process, which was launched on November 16, 2009, to enable eligible countries and territories to request and deploy non-Latin script TLDs representing their names in local languages. This involves rigorous evaluation, including checks for visual similarity to existing TLDs and adherence to script-specific guidelines. In 2024, Libya's Arabic-script IDN ccTLD .ليبيا successfully completed string evaluation and became eligible for delegation, extending support for Arabic-speaking users.
For generic top-level domains (gTLDs), IDN strings were introduced as part of ICANN's 2012 New gTLD Program, allowing applicants to propose TLDs in scripts beyond Latin, such as Cyrillic and Chinese. By 2025, 90 IDN gTLDs had been delegated, such as .みんな. Prominent country-code counterparts include .рф for Russia (delegated in 2010 to represent "RF" in Cyrillic) and .中国 for China (denoting the country in Simplified Chinese characters). These delegations expand the global DNS to better serve non-English-speaking communities. Overall, as of 2025, there are 151 IDN TLDs in the root zone, comprising 61 IDN ccTLDs and 90 IDN gTLDs, out of a total of 1,440 TLDs, covering 37 languages and 23 scripts.
Registry operators for these IDN TLDs must comply with ICANN's operational requirements, including standardized registry agreements that provide for Internationalized Domain Names (IDNs) at the second level and proper handling of variants to prevent conflicts. Variant management involves harmonizing IDN tables across related TLDs and integrating bundling where applicable, as outlined in ICANN's IDN Variant TLD Implementation recommendations.
As of June 2025, there were approximately 4.4 million IDN registrations across all TLDs worldwide. In gTLDs, second-level IDN registrations stood at 1.396 million as of March 2025, reflecting a decline of 4.84% from 1.467 million the previous year.
The distribution of IDN registrations in gTLDs highlights dominance by certain scripts, with Chinese accounting for 49% (about 681,000 registrations), followed by Latin extensions at 28% (393,000 registrations), Cyrillic at roughly 65,000, and Arabic at 14,000. Regionally, deployment is concentrated in two regions, one with seven root server providers supporting IDN services and the other hosting ten such providers, underscoring these areas as hotspots for multilingual deployment. Adoption trends reveal a contrast between country-code TLDs (ccTLDs) and gTLDs: while the 61 IDN ccTLDs demonstrate slow but steady growth, gTLD registrations continue to decline amid broader market shifts. The push for Universal Acceptance, which ensures that systems handle non-ASCII characters correctly, has played a pivotal role in bolstering IDN uptake by addressing compatibility barriers in applications and networks. Looking ahead, the ICANN IDN Annual Report 2025 projects continued multilingual expansion through ongoing development of Label Generation Rules (LGRs) for additional scripts and languages, aiming to further integrate IDNs into the global domain ecosystem.
Script             Percentage (gTLDs)   Approximate registrations (March 2025)
Chinese            49%                  681,000
Latin extensions   28%                  393,000
Cyrillic           ~5%                  65,000
Arabic             ~1%                  14,000
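The figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text and table above; the variable names are illustrative.

```python
# TLD counts quoted in the text.
idn_cctlds, idn_gtlds, all_tlds = 61, 90, 1440
total_idn_tlds = idn_cctlds + idn_gtlds
idn_share_of_root_pct = total_idn_tlds / all_tlds * 100

# Year-over-year change in second-level gTLD IDN registrations.
previous_year, march_2025 = 1_467_000, 1_396_000
decline_pct = (previous_year - march_2025) / previous_year * 100

# Script shares recomputed from the approximate registration counts.
by_script = {
    "Chinese": 681_000,
    "Latin extensions": 393_000,
    "Cyrillic": 65_000,
    "Arabic": 14_000,
}
share_pct = {script: n / march_2025 * 100 for script, n in by_script.items()}
```

Rounded to whole percentages, the recomputed shares agree with the table, and the four listed scripts together account for roughly 83% of gTLD IDN registrations, with the remainder spread across other scripts.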

Non-ICANN Registries Supporting Non-ASCII Names

Non-ICANN registries supporting non-ASCII names operate primarily within national or private frameworks, handling the registration and resolution of internationalized domain names through localized infrastructure rather than the global root. These registries often adhere to IDNA standards for compatibility but emphasize regional control, allowing operators to tailor policies to linguistic and cultural needs.
Alternative approaches outside IDNA include raw Unicode in private networks or custom national DNS implementations, where servers are modified to handle non-ASCII labels natively without Punycode conversion. For instance, RFC 6055 discusses how UTF-8 or other non-ASCII encodings can be used privately over DNS, though this risks incompatibility with the broader Internet. A historical example is Thailand's ThaiURL system, which used its own encoding for Thai-script .com domains, bypassing ASCII restrictions but requiring client plugins for resolution and limiting global access.
These registries offer advantages such as greater flexibility in script-specific rules and faster rollout, fostering regional adoption after the 2010s. However, they face disadvantages including potential interoperability challenges with the global DNS if deviations from IDNA occur, as well as fragmented user experiences across borders. Overall, their scale is significantly smaller than that of the ICANN-coordinated namespace, with millions of registrations confined to specific locales rather than deployed worldwide.

Specialized Initiatives

Arabic Script IDN Working Group (ASIWG)

The Arabic Script IDN Working Group (ASIWG) was formed in 2008 as a self-organizing, community-led initiative involving experts from governments, intergovernmental organizations, and technical communities to facilitate the implementation of Internationalized Domain Names (IDNs) using the Arabic script. Sponsored initially by the United Nations Economic and Social Commission for Western Asia (UN-ESCWA) together with partner organizations, the group held its first workshop in the UAE to establish a framework for handling Arabic-specific challenges in domain names.
The ASIWG addressed hurdles inherent to the Arabic script, including its right-to-left (BiDi) rendering, contextual shaping (where letter forms change based on position in a word), and the use of elongation characters like tatweel (U+0640), which can alter visual similarity in domain labels. The group also tackled issues with ligatures, such as the mandatory joining of certain combinations (e.g., لام-الف forming لا), ensuring consistent handling across diverse Arabic dialects and related scripts like Persian and Urdu. These efforts produced guidelines for BiDi domain validation and handling shared characters, preventing confusability in mixed-script environments.
A primary output of the ASIWG was its foundational work leading to the Arabic script proposal for Root Zone Label Generation Rules (LGR), submitted in 2015 after the group's activities from 2008 to 2012. This was integrated into the Root Zone Label Generation Rules Version 2 (RZ-LGR-2) in June 2017, defining a repertoire of 128 code points and 192 variant mappings (of which 26 are allocatable and 166 are blocked) to support secure IDN top-level domains (TLDs). The LGR has since been updated, with a reference LGR for second-level domains released in 2023 to align with Unicode 11.0 and expand variant handling for broader deployment. The ASIWG's contributions enabled the delegation of 22 Arabic script country code TLDs (ccTLDs) through ICANN's IDN Fast Track Process (as of November 2025), including .مصر (Egypt), .السعودية (Saudi Arabia), and .امارات (United Arab Emirates), enhancing accessibility for over 400 million Arabic speakers.
Ongoing efforts, transitioned to script generation panels post-2012, continue to refine variant mechanisms for tatweel and ligatures, ensuring stability in the Root Zone LGR updates as of 2025.
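Two of the Arabic-specific concerns discussed above, bidirectional validity and tatweel, can be checked mechanically. The sketch below uses Python's standard `unicodedata` module to test a simplified version of the RFC 5893 conditions for right-to-left labels and to flag tatweel; the function names are hypothetical, and the check is not a substitute for a full IDNA2008 or LGR validation.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL, the elongation character

# Bidi classes permitted anywhere in a right-to-left label (RFC 5893).
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}


def passes_rtl_bidi_rule(label: str) -> bool:
    """Simplified RFC 5893 check for a label intended to be right-to-left."""
    classes = [unicodedata.bidirectional(ch) for ch in label]
    if not classes or classes[0] not in {"R", "AL"}:
        return False  # must start with a right-to-left character
    if any(c not in RTL_ALLOWED for c in classes):
        return False  # e.g. Latin letters (class L) are not allowed
    trimmed = list(classes)
    while trimmed and trimmed[-1] == "NSM":
        trimmed.pop()  # trailing nonspacing marks are ignored
    if not trimmed or trimmed[-1] not in {"R", "AL", "EN", "AN"}:
        return False
    # European and Arabic-Indic digits may not appear in the same label.
    return not ("EN" in classes and "AN" in classes)


def contains_tatweel(label: str) -> bool:
    """Flag labels that use the elongation character."""
    return TATWEEL in label


example = "\u0645\u062b\u0627\u0644"                   # "مثال"
stretched = "\u0645\u0640\u062b\u0627\u0644"           # same word with tatweel
mixed = example + "abc"                                # Arabic followed by Latin
```

Here `example` passes the Bidi check, `mixed` fails because it contains left-to-right letters, and `stretched` is flagged by `contains_tatweel` even though it renders as a visually similar word, which is the confusability concern described above.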

Other Script-Specific Developments

In the Chinese script community, significant efforts have focused on managing variants between simplified and traditional characters to ensure consistent representation in domain names. The Reference Label Generation Rules (LGR) for the Chinese script, updated in 2024, incorporate a full variant set that resolves labels into allocatable or blocked categories, addressing the complexities that arise where a single code point may represent multiple glyphs across writing systems. This approach prevents confusability while allowing flexibility for users of different Chinese variants, as highlighted in ongoing Root Zone LGR discussions.
For the Cyrillic script, community-driven proposals have emphasized stability in label generation, particularly for the Russian IDN ccTLD .рф, which has seen work on repertoire alignment since its delegation. The Second-Level Reference LGR for Russian, published in January 2024, maintains a consistent repertoire for new Cyrillic TLDs in Russia, incorporating acute accents and ensuring compatibility with existing deployments to promote secure and stable operations. Similarly, in the Devanagari script, community generation panels have proposed updates to enhance stability, with the Root Zone LGR for Devanagari revised in September 2025 to refine whole-label evaluation rules and variant handling for Indic languages.
Recent advancements include the publication of Second-Level Reference LGRs for additional scripts in 2024, developed through community panels to define valid labels and variants, thereby supporting registry operations and reviews for these underrepresented scripts. Additionally, the Root Zone LGR 6 (RZ-LGR-6), released following public comment in 2025, incorporates updated rules for the Khmer and Japanese scripts; the Khmer LGR adds rules for subjoined consonants to mitigate rendering issues, while the Japanese LGR expands to 6,532 code points with refined variant sets for Hiragana, Katakana, and Kanji.
Cross-script coordination advanced through collaborations between ICANN and the IETF, particularly in standardizing the IDNA protocols that underpin LGRs for Thai and Indic scripts. For instance, the 2024 Reference LGR for Thai integrates IETF-defined normalization to handle tone marks and ligatures consistently, while Indic scripts benefit from shared Unicode stability guidelines developed in joint efforts to avoid cross-script confusability.
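The way a variant set resolves labels into allocatable or blocked categories can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-entry `VARIANTS` table, the function names, and the toy policy are hypothetical and are not taken from any published LGR, which defines far larger repertoires and richer dispositions.

```python
from itertools import product

# Hypothetical variant table (illustration only): each code point maps to the
# set of code points treated as its variants, including itself.
VARIANTS = {
    "国": {"国", "國"},  # simplified form
    "國": {"国", "國"},  # traditional form
}


def variant_labels(label):
    """Return every label reachable by swapping code points for their variants."""
    choices = [sorted(VARIANTS.get(ch, {ch})) for ch in label]
    return {"".join(combo) for combo in product(*choices)}


def dispositions(applied_for, allocatable):
    """Tag each variant label under a toy policy: variants listed in
    `allocatable` may go to the same registrant, all others are blocked."""
    result = {}
    for label in variant_labels(applied_for):
        if label == applied_for:
            result[label] = "applied-for"
        elif label in allocatable:
            result[label] = "allocatable"
        else:
            result[label] = "blocked"
    return result


print(dispositions("中国", allocatable={"中國"}))
```

With an empty `allocatable` set, the same call marks the traditional-character variant as blocked, which mirrors the idea that a variant label is either withheld entirely or reserved for the holder of the applied-for label.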

Security and Challenges

Homoglyph Attacks and Spoofing Risks

Homoglyph attacks, also known as IDN homograph attacks, exploit visually similar characters (homoglyphs) from different writing scripts to create deceptive domain names that mimic legitimate ones, primarily for phishing. These attacks use Internationalized Domain Names (IDNs) to register domains in which a character such as the Latin lowercase 'a' (U+0061) is replaced by the visually identical Cyrillic 'а' (U+0430), tricking users into visiting malicious sites. For instance, the domain "exаmple.com", whose third character is the Cyrillic 'а', is visually indistinguishable from "example.com" in many fonts and browsers.
In mixed-script environments, ASCII spoofing occurs when IDNs combine Latin characters with confusable ones from other scripts, such as Cyrillic or Greek, to form deceptive labels under the IDNA protocol. In the DNS the spoofed label exists only in its Punycode form (e.g., "xn--exmple-4nf.com" for the homoglyph example above), but applications that render the Unicode form show users a name that appears trustworthy, allowing attackers to host phishing pages or malware behind it. Such risks are amplified in cross-script domains, where subtle visual differences evade casual inspection.
Historical incidents trace back to the early 2000s, coinciding with the initial rollout of IDN support by the IETF and IANA around 2003, when browsers like Internet Explorer and early Firefox versions lacked safeguards against mixed-script displays. A seminal demonstration in 2002 highlighted vulnerabilities by registering "micrоsоft.com" using Cyrillic 'о' (U+043E) and 'с' (U+0441) to spoof "microsoft.com," exploiting browsers' failure to flag non-Latin substitutions. These early exploits, including a 2000 hoax mimicking Bloomberg.com with similar tactics, underscored how IDN adoption without visual verification enabled widespread spoofing.
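The relationship between a spoofed Unicode label and its ASCII form can be checked with Python's standard library. The sketch below is illustrative only: the helper names `to_ace` and `scripts_in` are hypothetical, and the script detection is a rough heuristic based on Unicode character names rather than the Unicode Script property that production tools would use.

```python
import unicodedata


def to_ace(label):
    """Encode a single label with Punycode and add the 'xn--' ACE prefix.

    Uses the raw 'punycode' codec, so no IDNA mapping or validation is applied.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def scripts_in(label):
    """Approximate the scripts in a label from the first word of each
    character name, e.g. 'LATIN' or 'CYRILLIC' (heuristic, not exhaustive)."""
    return {unicodedata.name(ch, "UNKNOWN").split(" ")[0] for ch in label}


spoof = "ex\u0430mple"       # third character is CYRILLIC SMALL LETTER A
print(to_ace(spoof))         # xn--exmple-4nf
print(scripts_in(spoof))     # contains both 'LATIN' and 'CYRILLIC'
print(scripts_in("example")) # {'LATIN'}
```

A label whose characters fall into more than one script group, as the spoofed one does here, is the kind of input that display policies treat as suspicious.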
As of 2025, homoglyph threats persist and intensify with growing IDN adoption: phishing remains the leading attack vector, and homoglyph spoofing contributes to breaches averaging $4.44 million in costs. Increased registration under non-Latin TLDs (151 IDN TLDs were delegated as of 2025) heightens exposure, as attackers continue to aim cross-script domains at high-value targets, evading outdated policies in some implementations.
Key risk factors include bidirectional (BiDi) scripts, such as Arabic or Hebrew: mixing right-to-left and left-to-right text can reorder the displayed characters and produce misleading URLs, as seen in "BiDi Swap" techniques that mask malicious paths. Additionally, zero-width joiners (ZWJ, U+200D) enable deceptive labels by invisibly linking characters in cursive scripts, potentially altering visual rendering without detection and facilitating single-script or mixed-script spoofing. These elements compound vulnerabilities in IDN resolution, particularly where context rules for joining are inconsistently applied across systems.
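The two risk factors described above, invisible join controls and mixed text direction, can be flagged with a short check. This is a minimal sketch under stated assumptions: `label_risk_flags` is a hypothetical helper, and it only reports the presence of these characters; it does not implement the contextual rules that IDNA2008 applies to joiners or the full BiDi rule for labels.

```python
import unicodedata

# ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER are invisible when rendered.
JOIN_CONTROLS = {"\u200c", "\u200d"}


def label_risk_flags(label):
    """Report invisible join controls and mixed text direction in a label."""
    bidi_classes = {unicodedata.bidirectional(ch) for ch in label}
    has_rtl = bool(bidi_classes & {"R", "AL"})  # Hebrew-type or Arabic-type letters
    has_ltr = "L" in bidi_classes               # left-to-right letters
    return {
        "join_control": any(ch in JOIN_CONTROLS for ch in label),
        "rtl": has_rtl,
        "mixed_direction": has_rtl and has_ltr,
    }


print(label_risk_flags("abc"))          # nothing flagged
print(label_risk_flags("ab\u200dc"))    # join_control is True
print(label_risk_flags("abc\u05d0"))    # mixed_direction is True
```

A flagged label is not necessarily malicious, since joiners are legitimate in some scripts; the flags only mark labels that deserve the stricter contextual checks.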

Mitigation Measures and Best Practices

Technical mitigations for IDN security risks primarily involve protocol-level restrictions and client-side rendering policies to prevent visual spoofing. The IDNA2008 specification disallows certain classes of code points, such as symbols and most punctuation, excluding inherently problematic Unicode elements; restrictions on mixing scripts within a single label are left to registry policy and client display rules. Browsers implement such display rules as a fallback mechanism; for instance, if an IDN label mixes scripts or numbering systems, it is rendered in Punycode (e.g., "xn--...") rather than in native Unicode characters, alerting users to potential risks.
Policy measures at the registry level complement these technical safeguards by enforcing restrictions on registration. Registries are required to block variant labels (those that are visually or functionally equivalent) or allocate them only to the same registrant, as outlined in ICANN's IDN Implementation Guidelines, preventing unauthorized use of confusables. ICANN's Label Generation Rules (LGRs) incorporate variant policies that define permissible code points and cross-script exclusions for specific scripts, ensuring that IDN TLDs and second-level domains avoid allocatable confusables through automated validation tools.
Best practices extend these measures to operational and user-facing strategies. User education campaigns emphasize verifying domain authenticity by checking for Punycode displays or using trusted sources, while organizations are advised to maintain software updates that support Universal Acceptance (UA) standards, enabling seamless handling of IDNs without truncation or rejection. Tools such as ICANN's LGR Toolset allow registries and developers to validate labels against updated rulesets in real time, identifying potential variants before delegation.
IDN Implementation Guidelines version 4.1, effective April 30, 2025, strengthen safeguards against confusingly similar labels, with ongoing efforts such as the October 2025 public comment period on string similarity evaluation data aiming to integrate community-sourced confusable data into LGRs, further minimizing DNS abuse risks through refined string comparison algorithms.
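The display fallback described in this section, where a label that mixes scripts or numbering systems is shown in Punycode, can be sketched as follows. This is a simplified illustration, not the policy of any particular browser: the function names are hypothetical, scripts are approximated from Unicode character names, and real implementations also permit certain legitimate script combinations that this sketch would reject.

```python
import unicodedata


def letter_scripts(label):
    """Approximate scripts of the letters in a label (first word of the name)."""
    return {
        unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        for ch in label
        if unicodedata.category(ch).startswith("L")
    }


def digit_systems(label):
    """Group decimal digits by the part of their name before 'DIGIT';
    ASCII digits yield an empty prefix."""
    return {
        unicodedata.name(ch, "UNKNOWN").partition("DIGIT")[0].strip()
        for ch in label
        if unicodedata.category(ch) == "Nd"
    }


def display_label(label):
    """Return the Unicode label, or its Punycode form if it looks mixed."""
    if label.isascii():
        return label
    if len(letter_scripts(label)) > 1 or len(digit_systems(label)) > 1:
        return "xn--" + label.encode("punycode").decode("ascii")
    return label


def display_domain(domain):
    """Apply the display rule to each label of a dotted domain name."""
    return ".".join(display_label(part) for part in domain.split("."))


print(display_domain("ex\u0430mple.com"))  # mixed Latin/Cyrillic label shown as xn--exmple-4nf
print(display_domain("пример.рф"))         # single-script labels stay in Unicode
```

Applying the rule per label rather than per domain keeps a legitimate native-script name readable even when it sits under an ASCII TLD.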

Historical Timeline

Pre-2003 Foundations

The Domain Name System (DNS), established in the 1980s, was inherently limited to the ASCII character set, restricting domain names to letters (a-z), digits (0-9), and hyphens, which excluded scripts and characters from most non-Latin languages. This constraint became increasingly problematic as the Internet expanded globally in the 1990s, prompting early recognition that the DNS needed adaptation to support multilingual access. The publication of Unicode in 1991 served as a foundational precursor, providing a universal encoding standard, since grown to over 100,000 characters across scripts, which laid the groundwork for handling non-ASCII text in networked applications.
In the mid-1990s, the Internet Engineering Task Force (IETF) began exploring solutions through drafts on internationalizing domain names, such as Martin Dürst's proposal for a "zero-level domain" to map non-ASCII characters into ASCII-compatible labels without altering the core DNS protocol. These efforts culminated in the formation of the IETF IDN Working Group, which produced numerous drafts addressing requirements for non-ASCII names. However, proposals for native Unicode support in the DNS were rejected due to the risks of disrupting the vast installed base of DNS infrastructure, leading instead to application-layer approaches that preserved ASCII compatibility.
The formation of the Internet Corporation for Assigned Names and Numbers (ICANN) in 1998 intensified discussions on IDNs, as it highlighted the need for equitable access beyond English-centric systems. Key figures like Tan Tin Wee, an early advocate, contributed significantly; his team at the National University of Singapore built on proposals such as Martin Dürst's and developed a working implementation starting in 1998. ICANN's establishment sparked broader stakeholder engagement, emphasizing the benefits of IDNs in enabling native-language domain usage to promote inclusivity.
Proof-of-concept systems emerged in the late 1990s, including experimental resolvers that operated in parallel to the standard DNS to handle non-ASCII queries, such as Tan Tin Wee's iDNS.net trials, which demonstrated feasibility through proxy-based resolution without protocol overhauls. These efforts validated the potential for IDNs while underscoring the challenges of interoperability in a predominantly ASCII infrastructure.

2003-2020 Milestones

In March 2003, the Internet Engineering Task Force (IETF) published the foundational RFCs for Internationalizing Domain Names in Applications (IDNA2003): RFC 3490 defining the IDNA protocol, RFC 3491 specifying Nameprep name preparation, and RFC 3492 introducing Punycode as the encoding for non-ASCII characters in domain labels, all building on the Stringprep string-preparation framework of RFC 3454 (December 2002). These standards enabled the representation of Unicode characters in domain names while maintaining compatibility with the ASCII-based Domain Name System (DNS). Following their publication, early software implementations supporting IDNA emerged, including versions of web browsers such as Mozilla 1.4 and Opera 7.11, which allowed users to enter and resolve internationalized domain names.
On November 16, 2009, the Internet Corporation for Assigned Names and Numbers (ICANN) launched the IDN ccTLD Fast Track Process to facilitate the delegation of internationalized country code top-level domains (IDN ccTLDs) for countries and territories seeking native-script representations of their existing ASCII ccTLDs. This initiative addressed the need for localized top-level domains while ensuring stability through string evaluation for uniqueness and script compatibility. The first delegations occurred in 2010, with examples including Egypt's .مصر (xn--wgbl6a) and Russia's .рф (xn--p1ai) in May, and Thailand's .ไทย (xn--o3cw4h) later in the year. By the end of 2010, at least seven IDN ccTLDs had been delegated, marking the practical deployment of IDNs at the root level of the DNS.
In 2010, the IETF updated the IDNA specifications with IDNA2008 (RFCs 5890 through 5894), refining character handling, validation rules, and related processes to better align with evolving Unicode standards and mitigate limitations in the 2003 specification, such as inconsistent treatment of certain scripts and diacritics. ICANN issued guidelines in 2011 to guide registries transitioning to IDNA2008, emphasizing compatibility with IDNA2003 to avoid disruptions.
However, the shift introduced challenges, including compatibility issues where some names valid under IDNA2003 became invalid or resolved differently under IDNA2008 (for instance, names containing the German sharp s, ß, or the Greek final sigma, ς), leading to software bugs, resolution failures, and potential security vulnerabilities during the transition period in the 2010s. These incompatibilities affected client applications and registries, prompting the development of transitional mechanisms like Unicode Technical Standard #46 to ensure backward compatibility.
In January 2012, ICANN opened the application period for the New Generic Top-Level Domain (gTLD) Program, which explicitly included support for IDN gTLDs alongside ASCII ones, resulting in over 1,900 applications and the eventual delegation of numerous internationalized generic domains. Concurrently, the Arabic Script IDN Working Group (ASIWG) was formalized within ICANN's community efforts to coordinate script-specific policies, building on earlier regional collaborations to harmonize Arabic character variants for domain use.
Throughout the 2010s, adoption of IDNA2008 grew among software vendors and DNS resolvers, while the Fast Track Process expanded, leading to over 50 IDN ccTLDs delegated by 2020 across various scripts, including Cyrillic, Arabic, Chinese, and Devanagari. This period solidified IDNs as a core feature of the global DNS, though ongoing challenges like variant management and software updates persisted.
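The ß incompatibility mentioned above can be observed with Python's standard library, whose built-in `idna` codec implements the IDNA2003 rules. The sketch below contrasts that mapping with the raw Punycode of the unmapped label, which is the form an IDNA2008 registration of the same name would use. The helper names are hypothetical, and the raw encoding performs no IDNA2008 validation; it only shows that the two ASCII forms differ.

```python
def idna2003_ace(domain):
    """ASCII form under IDNA2003, via Python's built-in 'idna' codec
    (Nameprep case-folds 'ß' to 'ss' before encoding)."""
    return domain.encode("idna").decode("ascii")


def raw_ace(label):
    """Punycode of the label exactly as written, with the 'xn--' prefix.
    No mapping or validation is applied (illustration only)."""
    return "xn--" + label.encode("punycode").decode("ascii")


print(idna2003_ace("straße.de"))  # strasse.de
print(raw_ace("straße"))          # xn--strae-oqa
```

Because the two forms are different DNS names, a client using the older mapping and a client preserving ß can reach different hosts for the same typed name, which is the ambiguity that transitional processing was introduced to manage.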

2021-2025 Recent Advances

In 2021, the Generic Names Supporting Organization (GNSO) initiated the Expedited Policy Development Process (EPDP) on Internationalized Domain Names (IDNs) to address protections for IDN top-level domains (TLDs), focusing on variant management and stability in the root zone. EPDP Phase 1 examined issues such as the delegation of variant TLDs and the adaptation of existing policies for IDN contexts, culminating in an Initial Report published in 2023 that proposed recommendations to ensure equitable treatment. The Phase 1 Final Report, submitted to the GNSO Council in November 2023, included 69 policy recommendations on topics like variant bundling and registry obligations, which were later adopted by the ICANN Board in June 2024.
During the same period, ICANN expanded the Label Generation Rules (LGR) framework to support additional scripts, enhancing IDN compatibility at the second level. In January 2023, seven new script-based Reference LGRs were published for Armenian, Cyrillic, Greek, Latin, Japanese, Korean, and Myanmar, incorporating community-driven variant mappings to prevent visual similarities. These expansions built on prior efforts, such as the 2021 release of LGRs for Arabic, Hebrew, and Sinhala, facilitating broader script integration in domain registrations.
In October 2024, ICANN announced the successful string evaluation for Libya's Arabic-script country code TLD (ccTLD), ليبيا, under the IDN ccTLD Fast Track Process, marking a key advancement in non-Latin representations for country-code domains. This string, representing the country's name in Arabic, underwent linguistic and technical reviews as part of the evaluation. Also in October, the EPDP Phase 2 Final Report was published, addressing second-level IDN variant management with 20 outputs, including 14 recommendations, and emphasizing bundling mechanisms and stability measures; the GNSO Council adopted it in November. In November, ICANN released additional Reference LGRs for the Balinese script, Thaana script, and Inuktitut language, along with updates to the Myanmar script LGRs, expanding support to 27 script-based and 32 language-based rules for second-level domains.
By April 2025, IDN Implementation Guidelines Version 4.1 took full effect, requiring registry operators to comply with enhanced protections against consumer confusion and DNS abuse, including stricter variant handling and reservation policies. In June 2025, ICANN launched a public comment period for Root Zone Label Generation Rules Version 6 (RZ-LGR-6), which integrated the Thaana script as the 27th supported script and updated the Maximal Starting Repertoire to accommodate emerging needs. RZ-LGR Version 6 was finalized and published on September 25, 2025, supporting 27 scripts and including updates to Devanagari and Bengali.
The ICANN IDN Report released in August 2025 noted that 151 TLDs had been delegated as IDNs, spanning 37 languages and 23 scripts, with the Chinese script dominating at 59 delegations (7 ccTLDs and 52 gTLDs). Ongoing efforts included Universal Acceptance (UA) campaigns, with UA Day engaging thousands worldwide through co-hosted events to promote support for IDNs and internationalized email addresses in software and systems. Amid a noted decline in gTLD registrations, including IDNs, ICANN emphasized refinement and implementation of EPDP outcomes, with IDN gTLD registrations decreasing at a slower rate.

References

  1. [1]
    Internationalized Domain Names - ICANN
    Internationalized Domain Names ( IDNs ) enable people around the world to use domain names in local languages and scripts. IDNs are formed using characters ...Missing: IETF | Show results with:IETF
  2. [2]
    Internationalized Domain Names | ICANN New gTLDs
    Internationalized Domain Names (IDNs) are domain names using characters other than traditional ASCII, such as non-Latin scripts like Chinese or Arabic.Missing: IETF | Show results with:IETF
  3. [3]
    RFC 5891 - Internationalized Domain Names in Applications (IDNA)
    RFC 5891 defines IDNA, a protocol for registering and looking up IDNs without changing DNS, using ASCII strings for non-ASCII domain names.
  4. [4]
    RFC 3490 - Internationalizing Domain Names in Applications (IDNA)
    This document defines internationalized domain names (IDNs) and a mechanism called Internationalizing Domain Names in Applications (IDNA) for handling them in ...
  5. [5]
    RFC 5890 - Internationalized Domain Names for Applications (IDNA)
    This document is one of a collection that, together, describe the protocol and usage context for a revision of Internationalized Domain Names for Applications ...
  6. [6]
    [PDF] Internationalized Domain Name (IDN) Annual Report 2022 - icann
    The IDNA Standard was developed by the Internet Engineering Task Force (IETF) in 2003, and later updated in 2008 (IDNA2008). The technical community also issued ...
  7. [7]
    IDN ccTLD Fast Track Process - icann
    Please contact IDNProgram@icann.org for any inquiries about the IDN ccTLD Fast Track Process. FIP has been revised multiple times since its initial approval.
  8. [8]
    Root Zone Label Generation Rules Version 6 (RZ- LGR - icann
    Jun 21, 2015 · Root Zone Label Generation Rules (RZ- LGR ) provide a conservative mechanism to determine valid IDN TLDs and their variant labels, for stable and secure ...
  9. [9]
    RFC 8753 - Internationalized Domain Names for Applications (IDNA ...
    Apr 15, 2020 · Internationalized Domain Names for Applications (IDNA) Review for New Unicode Versions (RFC 8753, April 2020)
  10. [10]
    Workshop 297 Report: Digital Inclusion Through a Multilingual Internet
    Jun 7, 2024 · The Internet's naming system was designed to help people find resources – websites, videos, email servers, images – on the Internet. Normally, ...
  11. [11]
    [PDF] The History of Internationalised Domain Names (IDN) - icann
    Jul 21, 2004 · The IDN Movement first started in a big way in 1998 with the first working implementation of a “primitive” ASCII.Missing: timeline | Show results with:timeline
  12. [12]
    [PDF] IDNs: Internationalized Domain Names - icann
    The IETF is creating the standards for using non-ASCII characters in the. Domain Name System. In 2003, an international IDN working group devel- oped the IDNA ...Missing: timeline | Show results with:timeline
  13. [13]
  14. [14]
  15. [15]
  16. [16]
  17. [17]
  18. [18]
  19. [19]
  20. [20]
  21. [21]
  22. [22]
  23. [23]
  24. [24]
  25. [25]
  26. [26]
  27. [27]
    Punycode converter (IDN converter), Punycode to Unicode
    A tool that converts a text with special characters (Unicode) to the Punycode encoding (just ASCII). Used for internationalized domain names (IDN).URL encoder & decoder · See also · Internationalized domain... · About
  28. [28]
    idna - PyPI
    Oct 12, 2025 · This library provides support for Unicode IDNA Compatibility Processing which normalizes input from different potential ways a user may input a ...0.2 Jul 16, 2013 · 0.6 Apr 29, 2014 · 0.3 Jul 18, 2013 · 0.4 Jan 7, 2014
  29. [29]
    [PDF] idn-guidelines-22sep22-en.pdf - icann
    Sep 22, 2022 · These guidelines are for implementing IDNs, mainly for TLD registries, and require compliance with IETF protocols, and focus on owner-name of ...
  30. [30]
    ICANN Announces IDN Guidelines Version 4.1 Implementation ...
    Oct 28, 2024 · The IDN Guideline 4.1 is in effect and ROs must comply fully with the IDN Guidelines 4.1 no later than 30 April 2025. As of this announcement, ...
  31. [31]
    Repository of IDN Practices - Internet Assigned Numbers Authority
    Repository of IDN Practices. We maintain a collection of “IDN tables”, which represent permitted code points (letters) allowed for Internationalised Domain ...Missing: total | Show results with:total<|separator|>
  32. [32]
  33. [33]
  34. [34]
    [PDF] Phase 2 Final Report on the Internationalized Domain Names ...
    Oct 7, 2024 · 213 By 2 October 2024, no objection was received from EPDP members to the Leadership Team's Proposed Consensus Designations.214.
  35. [35]
    ICANN Achieves Key Milestone in Universal Acceptance
    Jul 2, 2025 · ICANN Account now supports internationalized email addresses, also known as Email Address Internationalization (EAI).Missing: compatibility | Show results with:compatibility
  36. [36]
    ICANN Call for Nominations: Universal Acceptance Expert Working ...
    May 7, 2025 · Tech Developers and System Administrators: Ensure existing and new software applications support UA and Internationalized Domain Names (IDNs).Missing: compatibility | Show results with:compatibility
  37. [37]
    IDN Implementation Guidelines - icann
    From the set of guidelines proposed in version 4.0, version 4.1 defers guidelines 6a, 11, 12, 13 and 18, as resolved by the ICANN Board.
  38. [38]
    [PDF] Procedure to Develop and Maintain the Label Generation Rules for ...
    This document provides a procedure for establishing some of the label generation rules for the root zone. The label generation rules govern the way a zone ...
  39. [39]
    ICANN Publishes Root Zone Label Generation Rules Version 6
    Sep 25, 2025 · The RZ-LGR-6, finalized after its Public Comment proceeding, integrates the Thaana script and makes additional updates in Devanagari, Bengali ( ...Missing: June | Show results with:June
  40. [40]
    ICANN Publishes Three Additional Second-Level Reference Label ...
    ICANN has published Second-Level Reference Label Generation Rules (LGRs) for the Balinese script, the Thaana script, and the Inuktitut language.Missing: additions | Show results with:additions
  41. [41]
    [PDF] ONE WORLD, ONE INTERNET - icann
    Oct 21, 2025 · Over 11,000 IDN tables have been published in the IANA Repository, demonstrating broad support for a ... As of 30 June 2025, ICANN managed a total ...
  42. [42]
    [PDF] Maximal Starting Repertoire — MSR-5 Overview and Rationale - icann
    Jun 24, 2021 · The methodology followed by the Integration Panel ensures that the Stability, Inclusion and. Conservatism Principles may be fully applied to the ...
  43. [43]
    IDN ccTLD Fast Track Process Launch - icann
    Nov 16, 2009 · Starting November 16, 2009 at 00:00UTC ICANN will accept requests from representatives of countries and territories around the world for new ...
  44. [44]
    ICANN Announces Successful String Evaluation for Libya IDN ccTLD
    String delegation: Requests successfully meeting string evaluation criteria are eligible to apply for delegation following the same ICANN IANA ...
  45. [45]
    [PDF] Internationalized Domain Name (IDN) Report - June 2024 | ICANN
    Jul 31, 2024 · This report provides an overview of the status of IDNs and the implementation of IDN work from January 2023 to June 2024. As of June 2024, the ...
  46. [46]
    Recommendations for Managing IDN Variant Top-Level Domains
    Jul 25, 2018 · Recommendations for managing Internationalized Domain Name (IDN) variant labels for top level domains (TLDs) have been developed by ICANN ...
  47. [47]
    [PDF] IDN Variant TLD Implementation: Recommendations and Analysis
    Jan 25, 2019 · The registry agreement must also require all the different IDN tables being used by the IDN TLD and its variant TLDs to be harmonized, meaning ...
  48. [48]
    None
    ### Summary of ICANN IDN Annual Report 2025
  49. [49]
    .RF (domain) - TAdviser
    Aug 11, 2025 · In 2023, 244,266 domain names were registered in the RF. Read more here. 700 thousand registered names. By, in to data Coordination center of ...
  50. [50]
    Russian Domain Space 2023 report published
    Apr 3, 2024 · In 2023, 1,709,718 new domains were registered in .RU. By the end of the year, the registration rate almost doubled: 123,000 domains were ...
  51. [51]
    Delegation of the .中国 and .中國 (“Zhongguo”) domains ...
    ICANN has received a request to delegate 中国 and 中國 as country-code top-level domains representing China, to China Internet Network Information Center.
  52. [52]
    Understanding the .CN Top-Level Domain - Nominus.com
    The top-level domain (TLD) .cn is the country code TLD of China. It serves as an identifier for Chinese businesses and entities online.
  53. [53]
    Chinese Domain Name - 中国互联网络信息中心
    The .中国 domain names are Chinese character top-level domain names representing China on the Internet, and same as the .CN domain names.Missing: non- ICANN
  54. [54]
    Internet Domains - ایرنیک
    Domain Registration (under .ایران). Before requesting to register a domain under .ایران TLD, you should read and be aware of followings rules and agreements.
  55. [55]
    Cheapest .ایران Domain Registration, Renewal, Transfer ... - TLD-List
    The .ایران domain extension is a country-specific top-level domain (ccTLD) for Iran. This domain extension is intended for use by individuals, businesses, ...
  56. [56]
    IAB Thoughts on Encodings for Internationalized Domain Names
    This document explores issues with Internationalized Domain Names (IDNs) that result from the use of various encoding schemes such as UTF-8 and the ASCII- ...
  57. [57]
    ThaiURL - ประวัติความเป็นมา
    ### Summary of ThaiURL Encoding from http://www.thaiurl.com/19/background.htm
  58. [58]
    ThaiURL - ข้อมูลทั่วไป
    ### Summary of ThaiURL for Non-ASCII Domains
  59. [59]
    [PDF] Untitled - Sched
    Arabic Script IDN Working Group (ASIWG) – April 2008-2012. • Develop a unified IDN table for the Arabic script, allowing Arabic-speaking internet users to ...
  60. [60]
    Global Harmonization of Arabic Script Use in Domain Names, 4th ...
    Hence, the Arabic Script in IDNs Working Group (ASIWG) was established, and three meeting were organized in 2008 by UN-ESCWA in partnership with Public ...Missing: date ICANN<|control11|><|separator|>
  61. [61]
    [PDF] Arabic Script IDN Working Group (ASIWG) - ICANN
    Jun 26, 2008 · Internationalized Domain Names. Page 7. Initial Goals. Initial Goals. • Establish a framework for implementation of. IDNs in Arabic Script which ...Missing: formation | Show results with:formation
  62. [62]
    [PDF] Proposal for Arabic Script Root Zone LGR | icann
    Nov 18, 2015 · This report was preceded by work done through a self-formed and community-led group called Arabic Script IDN Working. Group (ASIWG). These ...
  63. [63]
    RFC 5564
    This document constitutes a technical specification for the implementation of the IDN standards in the case of the Arabic language. It will allow the use of ...
  64. [64]
    Label Generation Rules for the Root Zone Version 2 (RZ-LGR-2)
    Jun 6, 2017 · RZ-LGR-2 [PDF, 891 KB] contains rules for six scripts, including Arabic, Ethiopic, Georgian, Lao, Khmer and Thai, based on the proposals ...
  65. [65]
    [PDF] Reference Label Generation Rules (LGR) for the Second Level - icann
    Jan 12, 2023 · All other LGRs, including script LGRs, are derived from the Root Zone LGR for the corresponding script. This process the repertoire to ...
  66. [66]
    IDN ccTLD Fast Track String Evaluation Completion - ICANN
    ### Summary of IDN ccTLDs Using Arabic Script
  67. [67]
    Root Zone Label Generation Rules for the Arabic Script - icann
    Sep 23, 2025 · This file contains a set of Label Generation Rules (LGR) for the Arabic script for the Root Zone. For more details on this LGR and additional ...
  68. [68]
    Reference LGR for script: Chinese (Hani) - icann
    These instructions cover how to adopt an LGR based on this reference LGR for a given zone and how to prepare the file for deposit in the IANA Repository of IDN ...
  69. [69]
    [PDF] Root Zone Label Generation Rules (RZ LGR-6) Overview ... - icann
    Sep 23, 2025 · 1.1 Label Generation Rules. A set of label generation rules for a zone governs the set of labels that may be allocated and eventually.
  70. [70]
    Reference LGR for language: Russian (ru) - icann
    Jan 24, 2024 · There are other new Cyrillic TLDs created in Russia, but they all use the same repertoire as .ru. In Russian, the acute accent may be used as a ...Missing: proposals stability .рф
  71. [71]
    Root Zone LGR for script: Devanagari (Deva) - icann
    Sep 23, 2025 · This file contains a set of Label Generation Rules (LGR) for the Devanagari script for the Root Zone. For more details on this LGR and ...
  72. [72]
    Thai: Reference LGR for script - icann
    This document specifies a set of Label Generation Rules (LGR) for the Thai script for the second level domain or domains identified above.Missing: collaboration | Show results with:collaboration
  73. [73]
    What Is a Homoglyph Attack? 2025 Guide to Unicode Spoofing ...
    Homoglyph (homograph) attacks exploit visually identical characters often from different scripts to spoof domains, emails, or filenames.Missing: bidi | Show results with:bidi
  74. [74]
    Out of character: Homograph attacks explained | Malwarebytes Labs
    Oct 6, 2017 · In an internationalized domain name (IDN) homograph attack, a threat actor creates and registers one or several fake domains using at least ...
  75. [75]
    Homograph attacks: How hackers exploit look-alike domains
    Apr 16, 2025 · A homograph attack is a type of phishing technique where attackers exploit the similarities in character appearance to deceive users.What are homograph attacks? · How to identify and prevent...
  76. [76]
    Watch Your Step: The Prevalence of IDN Homograph Attacks - Akamai
    May 27, 2020 · IDN homograph attacks are used by attackers to form domain names that look trustworthy to victims in order to serve phishing pages and malware.
  77. [77]
    HTTP Spoofing (IDN Homograph Attacks) - Invicti
    Discover all about HTTP spoofing, which means creating fake domain names that look just like real ones by using internationalized domain names (IDNs).
  78. [78]
    (PDF) The Homograph Attack - ResearchGate
    Aug 6, 2025 · Let us begin with a short recourse to history. On April 7, 2000 an anonymous site published a bogus story intimating that the company ...
  79. [79]
    [PDF] ShamFinder: An Automated Frameworkfor Detecting IDN Homographs
    Sep 17, 2019 · 2.2 IDN Homograph Attack. As mentioned in the previous section, the history of IDN homograph attacks can be traced back to the early 2000s.
  80. [80]
    The Subtle Art of Domain Impersonation using IDN homographic ...
    Sep 14, 2025 · What is an IDN homograph attack? IDN homograph attack swaps characters with look-alikes from other alphabets (Cyrillic, Greek, Armenian, etc.) ...
  81. [81]
    BIDI Swap: Unmasking the Art of URL Misleading with Bidirectional ...
    Varonis reveals a decade-old Unicode flaw that enables BiDi URL spoofing and poses phishing risks. Learn how attackers exploit RTL/LTR scripts and browser gaps.Missing: IDN | Show results with:IDN
  82. [82]
  83. [83]
    UTS #39: Unicode Security Mechanisms
    U+200C ZERO WIDTH NON-JOINER (ZWNJ) U+200D ZERO WIDTH JOINER (ZWJ). There are also two global conditions incorporated in each of A1, A2, and B: Script ...
  84. [84]
    Puny How!?: How internationalized domain names work in browsers
    Oct 13, 2025 · It includes rules for which characters are allowed, how to normalize them, and how to convert between Unicode and ASCII forms. Speaking of which ...
  85. [85]
    IDN is crazy | daniel.haxx.se
    Dec 14, 2022 · IDN works by having apps convert the Unicode name into the ASCII based punycode version under the hood, and then use that with DNS etc. The puny ...
  86. [86]
    Universal Acceptance (UA) - ICANN
    UA ensures that all domain names, including new top-level domains (TLDs), Internationalized Domain Names ( IDNs ), and email addresses are treated equally.Missing: software | Show results with:software
  87. [87]
    Label Generation Rules Tool - ICANN
    Jun 21, 2015 · The Label Generation Rules Tool, provided by ICANN, helps with reviewing IDN tables, validating labels against a Label Generation Rule (LGR), and developing ...<|separator|>
  88. [88]
    ICANN Highlights IDN Progress With Release of IDN Annual Report ...
    Aug 5, 2025 · ICANN published the annual report on Internationalized Domain Names, which help advance a more linguistically diverse online experience.Icann Highlights Idn... · Key Milestones And Trends · Policy Progress
  89. [89]
    ICANN Seeks Input on String Similarity Evaluation Data
    Oct 16, 2025 · ICANN org collected confusable data from script experts to be used in the string similarity evaluation and is seeking the community's ...Missing: IDN Guidelines 4.1
  90. [90]
    2 The Domain Name System: Emergence and Evolution
    The Domain Name System (DNS) was designed and deployed in the 1980s to overcome technical and operational constraints of its predecessor, the HOSTS.
  91. [91]
    Early Years of Unicode
    Mar 26, 2015 · Ground work for the Unicode project began in late 1987 with initial discussions between three software engineers -- Joe Becker of Xerox Corporation, Lee ...
  92. [92]
    draft-duerst-dns-i18n-00 - IETF Datatracker
    This document is an Internet-Draft (I-D). Anyone may submit an I-D to the IETF. This I-D is not endorsed by the IETF and has no formal standing in the IETF ...
  93. [93]
    Internationalized Domain Name (idn) - IETF Datatracker
    Group history ; 2008-04-23, (System), Concluded group ; 2000-10-10, (System), Changed milestone "Final discussion on the requirement document", resolved as "Done".
  94. [94]
  95. [95]
    ICANN's Historical Relationship with the U.S. Government
    ICANN grew out of a 1998 commitment from the US Government to transfer the management of the domain name system to a new non-profit corporation based in the US.
  96. [96]
    Official Biography: Tan Tin Wee - Internet Hall of Fame
    ... IDNs, a working system of which he and his team invented and implemented since 1998 despite numerous challenges. In 1990 to 1992, when Dr. Tan returned home ...
  97. [97]
    Singapore's father of IDNs | APNIC Blog
    Sep 16, 2022 · Dr Tan Tin Wee has had a huge impact on Internet development in Singapore. While practising as an academic at the National University of Singapore (NUS)
  98. [98]
    PROMOTING THE MULTILINGUAL INTERNET - ITU
    ... IDN. Its founder, and an initiator of the IDN movement in the late 1990s, was Tan Tin Wee, an Associate Professor at the University of Singapore, who also ...
  99. [99]
    RFC 3492 - Punycode: A Bootstring encoding of Unicode for ...
    Jan 21, 2020 · Punycode is a simple and efficient transfer encoding syntax designed for use with Internationalized Domain Names in Applications (IDNA).
  100. [100]
    [PDF] IDNA 2003 & IDNA2008
    3492 Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA). A. Costello. March 2003. (Format: TXT=67439 bytes) ...
  101. [101]
    Delegation of the .ไทย (“Thai”) domain representing Thailand in Thai ...
    In December 2009, an application was made to the "IDN Fast Track" process to have the string “ไทย” recognised as representing Thailand. The request was ...
  102. [102]
    First IDN ccTLDs Available - ICANN
    May 5, 2010 · These are the first IDN ccTLDs to appear online as a result of the IDN ccTLD Fast Track Process which was approved by the ICANN Board.
  103. [103]
    Guidelines for the Implementation of Internationalized Domain Names
    Internationalized Domain Name (IDN): IDNs are domain names that include characters used in the local representation of languages that are not written with the ...
  104. [104]
  105. [105]
    UTS #46: Unicode IDNA Compatibility Processing
    The series of RFCs collectively known as IDNA2003 [IDNA2003] allows domain names to contain non-ASCII Unicode characters, which includes not only the characters ...
  106. [106]
    History of the New gTLD Program - ICANN
    In January 2012, ICANN launched the New gTLD Program and subsequently received 1,930 gTLD applications. ...
  107. [107]
    ICANN Publishes Phase 1 Initial Report on the Internationalized ...
    Apr 24, 2023 · This Public Comment proceeding seeks input on the Phase 1 Initial Report on the Internationalized Domain Names Expedited Policy Development Process.
  108. [108]
    [PDF] Phase 1 Final Report GNSO Council Presentation - 16 Nov 2023
    Nov 16, 2023 · EPDP-IDNs Team Overview (Cont.) Difficult Topics: ○ Whether to impose a ceiling on the number of variants that can be delegated. ○ Adapt ...
  109. [109]
    Phase 1 Final Report of the EPDP on Internationalized Domain Names
    Jan 23, 2024 · The ICANN Board is seeking the community's input on the Phase 1 Final Recommendations from the Expedited Policy Development Process (EPDP) on Internationalized ...
  110. [110]
    ICANN Publishes Seven Additional Script-Based Reference Label ...
    Jan 19, 2023 · ICANN has developed seven additional script-based reference Label Generation Rules (LGRs) for the second level for Armenian, Cyrillic, Greek, Latin, Japanese, ...
  111. [111]
    Additional Reference Label Generation Rulesets (LGRs) for ... - ICANN
    Jan 28, 2021 · Four LGRs are being released for Public Comment, including Arabic, Hebrew, and Sinhala script-based LGRs, and the Hebrew language-based LGR.
  112. [112]
    ICANN seeks feedback on updated Root Zone Label Generation ...
    Jun 17, 2025 · ICANN is launching a Public Comment on the Root Zone Label Generation Rules Version 6 (RZ-LGR-6). It is the latest update to the guidelines that determine ...
  113. [113]
    UA Day 2025 Report: Thousands Join to Advance Universal ... - ICANN
    Jun 26, 2025 · ICANN, UNESCO, and thousands globally participated in UA Day 2025 to advance the Universal Acceptance of domain names in Internet-enabled ...