Latin Extended Additional
Latin Extended Additional is a block of the Unicode Standard that provides 256 additional characters for the Latin script, spanning the code point range U+1E00 to U+1EFF.[1] These characters consist primarily of precomposed combinations of Latin letters with one or more diacritical marks, supporting the orthographic needs of various languages and transliteration schemes.[2] Introduced in Version 1.1 of the Unicode Standard, the block addresses gaps in earlier Latin extensions by including characters for specific linguistic and historical applications.[3] Notable subsets include dotted consonants for Irish Gaelic (such as ḃ and ṗ), Vietnamese quốc ngữ (precomposed vowels with tone marks, including the dot below, in the range U+1EA0–U+1EF9), and Indic transliteration (letters with underdots, like ḍ and ṇ).[1] It also encompasses specialized forms for medievalist studies (such as long s with a dot above, ẛ) and German typography (the capital sharp S, ẞ at U+1E9E, used in uppercased contexts like titles and signage).[2]
Many characters in the block are canonically decomposable into a base letter and combining diacritics from the Combining Diacritical Marks block (U+0300–U+036F), which keeps them compatible with normalization processes.[2] The block's design emphasizes practical extensions for phonetic accuracy, regional orthographies, and scholarly transcription, ensuring broad support for Latin-based writing systems without relying solely on dynamic composition.[1] Characters without a canonical decomposition, such as the small letter a with right half ring (U+1E9A ẚ), further cater to unique typographic requirements in historical and linguistic contexts.[2] As part of the Basic Multilingual Plane, Latin Extended Additional remains integral to digital text processing for European, African, and Asian languages using Latin scripts.[4]
Overview
Purpose and Design
The Latin Extended Additional block, designated as U+1E00–U+1EFF in the Unicode Standard, encompasses 256 assigned code points dedicated to Latin letters, providing an extension to the Basic Latin and Latin Extended-A blocks for representing accented and diacritic-bearing characters in various orthographies.[1] This block focuses on encoding uppercase and lowercase forms of letters modified by diacritical marks such as dots above or below, macrons, acutes, graves, cedillas, and circumflexes, enabling direct support for phonetic and orthographic needs in languages like Irish, Vietnamese, and others without duplication of simpler forms already covered in prior blocks.[5][2]
A core design principle of this block is the use of precomposed characters, which integrate base Latin letters with one or more diacritical marks into single code points, thereby avoiding the need for complex sequences of combining marks that could complicate text processing, rendering, and collation in systems handling languages with stacked or multiple diacritics.[2] This approach enhances efficiency in digital typography and software implementations, particularly for legacy systems or environments where normalization to decomposed forms might introduce errors or performance issues.[1] Most characters in the block (approximately 246) are general Latin letters with such diacritics, supplemented by specialized subsets for applications like medievalist transcriptions and certain phonetic notations.[5]
The block's allocation underscores a commitment to phonetic accuracy in both native orthographies and scholarly transcriptions, allowing precise representation of sounds that cannot be adequately conveyed using only the unadorned letters from Basic Latin (U+0000–U+007F) or Latin Extended-A (U+0100–U+017F).[2] By prioritizing these precomposed forms, the design facilitates seamless integration into broader Latin script ecosystems, supporting diverse linguistic traditions while maintaining compatibility with Unicode's normalization algorithms for interconversion between composed and decomposed representations where needed.[1]
Unicode Allocation
The Latin Extended Additional block occupies the hexadecimal range U+1E00 to U+1EFF, encompassing exactly 256 code points dedicated to extended Latin characters with diacritics and other modifications.[1] As of Unicode 17.0, released in September 2025, all 256 positions within this block are fully assigned, with no unassigned or reserved code points remaining.[1][6] This block is positioned in the Unicode repertoire after the Latin Extended-A (U+0100–U+017F) and Latin Extended-B (U+0180–U+024F) blocks, continuing the sequential expansion of Latin script support within the Basic Multilingual Plane (BMP).[4][7] The placement in the BMP (plane 0, U+0000 to U+FFFF) ensures compatibility with systems limited to 16-bit encoding, distinguishing it from later Latin extensions in the Supplementary Multilingual Plane (plane 1).[6]
Linguistic Applications
Support for European and Insular Languages
The Latin Extended Additional Unicode block (U+1E00–U+1EFF) encodes a range of precomposed characters essential for representing the orthographies of various European languages, with particular emphasis on Insular Celtic traditions such as Irish Gaelic. These characters facilitate accurate digital rendering of diacritic-modified letters used in historical and minority language contexts across Europe. The block's design prioritizes compatibility with combining diacritical marks while providing standalone forms for efficient typography.[1]
A key application is in Irish Gaelic, where the block supplies characters for the traditional orthography's lenition markers: dots above consonants that indicate aspiration or softening. Representative examples include Ḃ (U+1E02, LATIN CAPITAL LETTER B WITH DOT ABOVE) and ḃ (U+1E03, LATIN SMALL LETTER B WITH DOT ABOVE) for lenited /b/, Ḋ (U+1E0A) and ḋ (U+1E0B) for lenited /d/, Ḟ (U+1E1E) and ḟ (U+1E1F) for lenited /f/, Ṗ (U+1E56) and ṗ (U+1E57) for lenited /p/, Ṡ (U+1E60) and ṡ (U+1E61) for lenited /s/, and Ṫ (U+1E6A) and ṫ (U+1E6B) for lenited /t/. These dot-above forms, proposed for encoding to support Irish Gaelic texts, differ from International Phonetic Alphabet symbols by serving purely orthographic roles in lenition rather than broad phonetic transcription.[1] Although modern Irish primarily uses an adjacent "h" for lenition (e.g., bh for /v/), the dotted forms remain vital for historical manuscripts, scholarly editions, and certain typographic styles.[1]
Beyond Irish, the block supports other European orthographies, including pre-1921 Latvian usage with characters like Ḑ (U+1E10, LATIN CAPITAL LETTER D WITH CEDILLA) and ḑ (U+1E11, LATIN SMALL LETTER D WITH CEDILLA) for palatalized /d/, as well as ẜ (U+1E9C, LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE) in Blackletter-influenced texts. It also aids minority languages through diacritics like rings below (e.g., Ḁ U+1E00 for A) and other marks such as acutes and circumflexes.
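The relationship between the dotted consonants and the modern "h" spelling mentioned above can be illustrated with a minimal Python sketch. It assumes that a dot above on any of the consonants b, c, d, f, g, m, p, s, or t marks lenition (ċ and ġ are encoded in Latin Extended-A rather than in this block). The function name `dots_to_h` is hypothetical, and the conversion is a naive character-level rewrite, not a complete orthographic converter.

```python
import unicodedata

DOT_ABOVE = "\u0307"             # COMBINING DOT ABOVE
LENITABLE = set("bcdfgmpst")     # assumption: consonants that take the lenition dot

def dots_to_h(text):
    """Rewrite dotted consonants (e.g. ḃ) as consonant + h (e.g. bh)."""
    # NFD splits each precomposed letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for i, ch in enumerate(decomposed):
        follows_consonant = i > 0 and decomposed[i - 1].lower() in LENITABLE
        out.append("h" if ch == DOT_ABOVE and follows_consonant else ch)
    # NFC restores any other accented letters (such as á) to precomposed form.
    return unicodedata.normalize("NFC", "".join(out))

print(dots_to_h("an \u1E03ean"))  # an bhean
```

Because the dot is only replaced when it directly follows one of the listed base consonants, letters such as ẛ (long s with dot above) pass through unchanged.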
In modern typography, over 200 of the block's 256 characters are precomposed accented forms for capitals and lowercase letters, incorporating dots (52 characters), rings (2), and additional marks like acutes and circumflexes to ensure scalable support in digital fonts for these languages.[1]
Support for Vietnamese and Southeast Asian Scripts
The Latin Extended Additional Unicode block plays a crucial role in supporting the Vietnamese orthography, known as quốc ngữ, by providing 90 precomposed characters that integrate Latin letters with diacritics essential to the language's phonology.[8] These characters enable the representation of complex vowel modifications, including the horn diacritic on letters like ơ and ư, combined with tone indicators, which are vital for distinguishing the six tones of Vietnamese. For instance, forms such as Ớ (U+1EDA, Latin capital letter O with horn and acute) and ự (U+1EF1, Latin small letter U with horn and dot below) exemplify how the block accommodates stacked diacritics without relying solely on combining marks.[1]
Integration with tone marks is a key feature, offering precomposed variants for acute, grave, hook above, tilde, and dot below on modified vowels, such as ầ (U+1EA7, Latin small letter A with circumflex and grave) and ẩ (U+1EA9, Latin small letter A with circumflex and hook above). This design supports the full range of tonal distinctions directly within the orthography: the unmarked level tone (ngang) and the five marked tones written with the acute (sắc), grave (huyền), hook above (hỏi), tilde (ngã), and dot below (nặng). By prioritizing these precomposed sequences, the block ensures compatibility with legacy systems and input methods that may struggle with multiple combining diacritics.[1][9]
In regional usage, these characters are indispensable for digital text in Vietnam, facilitating accurate rendering of quốc ngữ in documents, websites, and software without decomposition challenges that could arise from separate combining marks. This avoids inconsistencies in diacritic positioning or normalization forms, promoting seamless adoption in computing environments across Southeast Asia where Vietnamese is predominant.[8] Compared to the Latin Extended-B block, which provides base forms like Ơ (U+01A0, Latin capital letter O with horn) and Ư (U+01AF, Latin capital letter U with horn) but lacks tone combinations, Latin Extended Additional adds these horned vowels, along with the circumflex and breve vowels, in combination with tone marks, completing the set needed for modern Vietnamese.[1]
Historical, Medieval, and Scholarly Uses
The Latin Extended Additional block provides essential support for reconstructing and transcribing historical and medieval Latin-based texts, enabling scholars to preserve the original graphemic forms without alteration. These characters facilitate paleographic analysis of manuscripts from medieval Europe, where Latin was adapted for various vernaculars and abbreviations. In particular, Unicode 5.1 (2008) introduced 10 dedicated characters for medievalist applications, proposed by experts including Michael Everson to address gaps in encoding medieval Welsh and other Insular traditions, as well as abbreviation systems common in paleography.[10][11]
Key medievalist characters include U+1E9C (ẜ, LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE) and U+1E9D (ẝ, LATIN SMALL LETTER LONG S WITH HIGH STROKE), which represent variant forms of the long s used in medieval abbreviations for words like "sanctus" or "sit." These modified long s forms allow precise reproduction of scribal conventions in 13th- to 15th-century manuscripts, aiding philologists in studying textual evolution. Similarly, U+1E9E (ẞ, LATIN CAPITAL LETTER SHARP S) encodes the uppercase sharp s, derived from medieval ligatures of long s and z, historically employed in German Blackletter printing and now standard for uppercase ß in scholarly editions of early modern texts.[1][12]
For paleographic support in Insular languages, characters such as U+1EFA (Ỻ, LATIN CAPITAL LETTER MIDDLE-WELSH LL) and U+1EFB (ỻ, LATIN SMALL LETTER MIDDLE-WELSH LL) denote the voiceless lateral fricative (/ɬ/) in medieval Welsh orthography, appearing in manuscripts like the Red Book of Hergest. These extend to related Insular uses, including notations for lenition in Old Irish contexts where similar sounds occur. Additional variants like U+1EFE (Ỿ, LATIN CAPITAL LETTER Y WITH LOOP) and U+1EFF (ỿ, LATIN SMALL LETTER Y WITH LOOP) mark the schwa sound (/ə/) in Middle Welsh, essential for accurate transcription of poetic and legal documents from the period.[1][13]
In scholarly applications, particularly historical linguistics, characters like U+1E9F (ẟ, LATIN SMALL LETTER DELTA) serve as phonetic symbols for dental or alveolar approximants in reconstructions of ancient and medieval languages. Likewise, U+1EFC (Ỽ, LATIN CAPITAL LETTER MIDDLE-WELSH V) and U+1EFD (ỽ, LATIN SMALL LETTER MIDDLE-WELSH V) represent velar fricatives or labial variants in Insular Celtic studies, supporting detailed phonetic analyses of sound changes over time. These encodings, drawn from the Medieval Unicode Font Initiative (MUFI), ensure that academic transcriptions maintain fidelity to source materials, promoting interdisciplinary research in paleography and etymology.[1][12]
Development and Encoding
Historical Evolution
The Latin Extended Additional block was first introduced in Unicode 1.1 in June 1993, establishing an initial repertoire of 245 characters primarily to support precomposed forms with diacritics for various European languages, including Irish Gaelic orthography (such as dotted letters for lenition, e.g., U+1E02 Ḃ LATIN CAPITAL LETTER B WITH DOT ABOVE) and Vietnamese vowels with tone marks (the range U+1EA0–U+1EF9, with letters like Ạ and ả).[14] This allocation aligned closely with the inaugural edition of ISO/IEC 10646:1993, incorporating amendments from international standardization efforts to extend the Latin script beyond earlier blocks like Latin Extended-A and Latin Extended-B.[3]
In Unicode 2.0, released in July 1996, the block received a single addition: U+1E9B ẛ LATIN SMALL LETTER LONG S WITH DOT ABOVE, which provided a specialized form for historical and phonetic transcriptions in Gaelic scripts, serving as a variant of the modern dotted s (U+1E61 ṡ) used in Irish standardization.[15][14] This refinement addressed needs from linguistic communities focused on Insular scripts, drawing on input from scholars standardizing representations of medieval and early modern texts.[16]
The block reached its current total of 256 characters with the final expansions in Unicode 5.1 in April 2008, incorporating 10 new characters tailored for medievalist and scholarly applications. These included variants of the long s (U+1E9C ẜ LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE and U+1E9D ẝ LATIN SMALL LETTER LONG S WITH HIGH STROKE), the uppercase sharp s (U+1E9E ẞ LATIN CAPITAL LETTER SHARP S for German typography), a phonetic delta (U+1E9F ẟ LATIN SMALL LETTER DELTA), and Middle Welsh letters (U+1EFA Ỻ LATIN CAPITAL LETTER MIDDLE-WELSH LL, U+1EFB ỻ LATIN SMALL LETTER MIDDLE-WELSH LL, U+1EFC Ỽ LATIN CAPITAL LETTER MIDDLE-WELSH V, U+1EFD ỽ LATIN SMALL LETTER MIDDLE-WELSH V) along with a looped y (U+1EFE Ỿ LATIN CAPITAL LETTER Y WITH LOOP, U+1EFF ỿ LATIN SMALL LETTER Y WITH LOOP).[10][14] These additions stemmed from proposals by the Medieval Unicode Font Initiative (MUFI) and other experts, emphasizing compatibility with historical manuscripts and paleographic needs while integrating feedback from ISO/IEC 10646 Amendment 5.[17][11][18]
Technical Implementation
The Latin Extended Additional block, spanning code points U+1E00 to U+1EFF, necessitates robust font implementations to ensure accurate rendering of its precomposed characters, many of which incorporate diacritics such as dots, strokes, and rings. Fonts supporting this block typically include OpenType features like 'ccmp' (glyph composition/decomposition) to handle composition and decomposition of glyphs, particularly for scenarios involving diacritic attachment or substitution in legacy or decomposed forms.[19] Additionally, the 'mark' feature (mark positioning) is essential for placing diacritics correctly when text is normalized to decomposed forms, preventing visual overlaps or misalignments in stacked combinations.[19] Comprehensive support is evident in open-source fonts such as Noto Sans, which covers 100% of the block's 256 characters across its weights and styles, enabling seamless display in applications handling European and Southeast Asian scripts.[20][21]
Compatibility with existing Unicode infrastructure is facilitated through canonical decomposition mappings, where most characters in the block break down into base letters from the Basic Latin block (U+0000–U+007F) combined with diacritical marks from the Combining Diacritical Marks block (U+0300–U+036F).[1] For instance, under Normalization Form D (NFD), precomposed forms like U+1E02 (Ḃ) decompose to U+0042 (B) followed by U+0307 (combining dot above), allowing interoperability with systems that prefer separate components for processing.[22] In contrast, Normalization Form C (NFC) recomposes these into the original precomposed characters, preserving the block's intended single-code-point representations and ensuring round-trip fidelity in storage and transmission.[22] This dual compatibility supports migration from older encodings while maintaining semantic equivalence in modern Unicode-aware environments.
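The NFD/NFC round trip described above can be checked with Python's standard `unicodedata` module. The following is a minimal sketch; the helper name `nfd_code_points` is hypothetical and exists only for this illustration.

```python
import unicodedata

def nfd_code_points(text):
    """Return the code points of text after canonical decomposition (NFD)."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", text)]

b_dot = "\u1E02"  # Ḃ LATIN CAPITAL LETTER B WITH DOT ABOVE

# NFD splits the precomposed letter into base letter + combining mark.
print(nfd_code_points(b_dot))                              # ['U+0042', 'U+0307']

# NFC recomposes the sequence into the original single code point.
print(unicodedata.normalize("NFC", "B\u0307") == b_dot)    # True

# A Vietnamese letter with two marks decomposes into three code points.
print(nfd_code_points("\u1EF1"))                           # ['U+0075', 'U+031B', 'U+0323']

# ẞ (U+1E9E) has no decomposition, so NFD leaves it unchanged.
print(nfd_code_points("\u1E9E"))                           # ['U+1E9E']
```

Comparing strings only after normalizing both sides to the same form is the usual way to treat precomposed and decomposed spellings as equal.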
Input methods for the block vary by language but generally leverage keyboard layouts with dead keys or mnemonic conventions to generate code points efficiently. For Irish Gaelic, standard Windows and macOS layouts incorporate dead keys, such as the dot above key, to produce characters like U+1E02 (Ḃ) by sequencing the modifier before the base letter B, facilitating direct Unicode input without requiring custom software.[23] Similarly, dedicated Gaelic keyboards extend this mechanism to cover the full range of lenited and dotted forms used in traditional orthography. For Vietnamese, input method editors (IMEs) like those supporting the Telex and VIQR schemes map alphabetic sequences to precomposed tones and diacritics in the block, such as typing "ar" in Telex to yield U+1EA3 (ả).[24] These methods, integrated into operating systems via frameworks like IBus or the Windows IME, convert user input on the fly to Unicode, with Telex emphasizing letter-based shortcuts (e.g., 'w' for horns) and VIQR using ASCII approximations for broader compatibility.[24]
Implementation challenges arise primarily from legacy systems, where partial overlaps with code pages like Windows-1258 limit support to a subset of Vietnamese characters in the block, excluding rarer historical or extended forms.[25] This can result in mojibake or substitution errors during conversion, as Windows-1258 prioritizes single-byte mappings for common tones and does not cover all of the block's 256 characters.[25] Modern UTF-8 environments, however, provide complete coverage, with libraries in languages like Java and Python offering built-in normalization support to mitigate these issues, so that the block's characters can be processed across platforms without data loss.[26]
Reference Materials
Full Character Chart
The Latin Extended Additional Unicode block encompasses 256 code points in the range U+1E00 to U+1EFF, all of which are assigned characters, designed to support precomposed Latin letters with diacritical marks for various orthographies and transliterations. All characters in this block are assigned the bidirectional class L (Left-to-Right), ensuring consistent rendering in left-to-right text flows. The characters are primarily uppercase and lowercase letters modified with diacritics such as dots, rings, macrons, acutes, graves, and cedillas, facilitating accurate representation in languages like Irish, Latvian, Lithuanian, Vietnamese, and scholarly transliterations of non-Latin scripts. The following tables present the full chart, grouped into subranges for clarity, with columns for the hexadecimal code point, glyph, official character name, and a brief category denoting the primary linguistic or typographic function (e.g., "Accented Uppercase" for diacritic-modified capital letters used in specific orthographies). Data verified as of Unicode 17.0.[1]
U+1E00–U+1E3F: Accented Letters (A–M)
| Code Point | Glyph | Name | Category |
|---|---|---|---|
| U+1E00 | Ḁ | LATIN CAPITAL LETTER A WITH RING BELOW | Accented Uppercase |
| U+1E01 | ḁ | LATIN SMALL LETTER A WITH RING BELOW | Accented Lowercase |
| U+1E02 | Ḃ | LATIN CAPITAL LETTER B WITH DOT ABOVE | Accented Uppercase |
| U+1E03 | ḃ | LATIN SMALL LETTER B WITH DOT ABOVE | Accented Lowercase |
| U+1E04 | Ḅ | LATIN CAPITAL LETTER B WITH DOT BELOW | Accented Uppercase |
| U+1E05 | ḅ | LATIN SMALL LETTER B WITH DOT BELOW | Accented Lowercase |
| U+1E06 | Ḇ | LATIN CAPITAL LETTER B WITH LINE BELOW | Accented Uppercase |
| U+1E07 | ḇ | LATIN SMALL LETTER B WITH LINE BELOW | Accented Lowercase |
| U+1E08 | Ḉ | LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE | Accented Uppercase |
| U+1E09 | ḉ | LATIN SMALL LETTER C WITH CEDILLA AND ACUTE | Accented Lowercase |
| U+1E0A | Ḋ | LATIN CAPITAL LETTER D WITH DOT ABOVE | Accented Uppercase |
| U+1E0B | ḋ | LATIN SMALL LETTER D WITH DOT ABOVE | Accented Lowercase |
| U+1E0C | Ḍ | LATIN CAPITAL LETTER D WITH DOT BELOW | Accented Uppercase |
| U+1E0D | ḍ | LATIN SMALL LETTER D WITH DOT BELOW | Accented Lowercase |
| U+1E0E | Ḏ | LATIN CAPITAL LETTER D WITH LINE BELOW | Accented Uppercase |
| U+1E0F | ḏ | LATIN SMALL LETTER D WITH LINE BELOW | Accented Lowercase |
| U+1E10 | Ḑ | LATIN CAPITAL LETTER D WITH CEDILLA | Accented Uppercase |
| U+1E11 | ḑ | LATIN SMALL LETTER D WITH CEDILLA | Accented Lowercase |
| U+1E12 | Ḓ | LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW | Accented Uppercase |
| U+1E13 | ḓ | LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW | Accented Lowercase |
| U+1E14 | Ḕ | LATIN CAPITAL LETTER E WITH MACRON AND GRAVE | Accented Uppercase |
| U+1E15 | ḕ | LATIN SMALL LETTER E WITH MACRON AND GRAVE | Accented Lowercase |
| U+1E16 | Ḗ | LATIN CAPITAL LETTER E WITH MACRON AND ACUTE | Accented Uppercase |
| U+1E17 | ḗ | LATIN SMALL LETTER E WITH MACRON AND ACUTE | Accented Lowercase |
| U+1E18 | Ḙ | LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW | Accented Uppercase |
| U+1E19 | ḙ | LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW | Accented Lowercase |
| U+1E1A | Ḛ | LATIN CAPITAL LETTER E WITH TILDE BELOW | Accented Uppercase |
| U+1E1B | ḛ | LATIN SMALL LETTER E WITH TILDE BELOW | Accented Lowercase |
| U+1E1C | Ḝ | LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE | Accented Uppercase |
| U+1E1D | ḝ | LATIN SMALL LETTER E WITH CEDILLA AND BREVE | Accented Lowercase |
| U+1E1E | Ḟ | LATIN CAPITAL LETTER F WITH DOT ABOVE | Accented Uppercase |
| U+1E1F | ḟ | LATIN SMALL LETTER F WITH DOT ABOVE | Accented Lowercase |
| U+1E20 | Ḡ | LATIN CAPITAL LETTER G WITH MACRON | Accented Uppercase |
| U+1E21 | ḡ | LATIN SMALL LETTER G WITH MACRON | Accented Lowercase |
| U+1E22 | Ḣ | LATIN CAPITAL LETTER H WITH DOT ABOVE | Accented Uppercase |
| U+1E23 | ḣ | LATIN SMALL LETTER H WITH DOT ABOVE | Accented Lowercase |
| U+1E24 | Ḥ | LATIN CAPITAL LETTER H WITH DOT BELOW | Accented Uppercase |
| U+1E25 | ḥ | LATIN SMALL LETTER H WITH DOT BELOW | Accented Lowercase |
| U+1E26 | Ḧ | LATIN CAPITAL LETTER H WITH DIAERESIS | Accented Uppercase |
| U+1E27 | ḧ | LATIN SMALL LETTER H WITH DIAERESIS | Accented Lowercase |
| U+1E28 | Ḩ | LATIN CAPITAL LETTER H WITH CEDILLA | Accented Uppercase |
| U+1E29 | ḩ | LATIN SMALL LETTER H WITH CEDILLA | Accented Lowercase |
| U+1E2A | Ḫ | LATIN CAPITAL LETTER H WITH BREVE BELOW | Accented Uppercase |
| U+1E2B | ḫ | LATIN SMALL LETTER H WITH BREVE BELOW | Accented Lowercase |
| U+1E2C | Ḭ | LATIN CAPITAL LETTER I WITH TILDE BELOW | Accented Uppercase |
| U+1E2D | ḭ | LATIN SMALL LETTER I WITH TILDE BELOW | Accented Lowercase |
| U+1E2E | Ḯ | LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE | Accented Uppercase |
| U+1E2F | ḯ | LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE | Accented Lowercase |
| U+1E30 | Ḱ | LATIN CAPITAL LETTER K WITH ACUTE | Accented Uppercase |
| U+1E31 | ḱ | LATIN SMALL LETTER K WITH ACUTE | Accented Lowercase |
| U+1E32 | Ḳ | LATIN CAPITAL LETTER K WITH DOT BELOW | Accented Uppercase |
| U+1E33 | ḳ | LATIN SMALL LETTER K WITH DOT BELOW | Accented Lowercase |
| U+1E34 | Ḵ | LATIN CAPITAL LETTER K WITH LINE BELOW | Accented Uppercase |
| U+1E35 | ḵ | LATIN SMALL LETTER K WITH LINE BELOW | Accented Lowercase |
| U+1E36 | Ḷ | LATIN CAPITAL LETTER L WITH DOT BELOW | Accented Uppercase |
| U+1E37 | ḷ | LATIN SMALL LETTER L WITH DOT BELOW | Accented Lowercase |
| U+1E38 | Ḹ | LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON | Accented Uppercase |
| U+1E39 | ḹ | LATIN SMALL LETTER L WITH DOT BELOW AND MACRON | Accented Lowercase |
| U+1E3A | Ḻ | LATIN CAPITAL LETTER L WITH LINE BELOW | Accented Uppercase |
| U+1E3B | ḻ | LATIN SMALL LETTER L WITH LINE BELOW | Accented Lowercase |
| U+1E3C | Ḽ | LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW | Accented Uppercase |
| U+1E3D | ḽ | LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW | Accented Lowercase |
| U+1E3E | Ḿ | LATIN CAPITAL LETTER M WITH ACUTE | Accented Uppercase |
| U+1E3F | ḿ | LATIN SMALL LETTER M WITH ACUTE | Accented Lowercase |
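Rows like those in the table above can be cross-checked against the Unicode Character Database bundled with a programming language. The sketch below uses Python's standard `unicodedata` module; the function name `chart_rows` is hypothetical, and the Category column is omitted because it is an editorial label rather than a Unicode property.

```python
import unicodedata

def chart_rows(start, end):
    """Yield 'code point | glyph | name' table rows for start..end inclusive."""
    for cp in range(start, end + 1):
        ch = chr(cp)
        # The second argument is returned for code points that have no name.
        name = unicodedata.name(ch, "<unassigned>")
        yield f"| U+{cp:04X} | {ch} | {name} |"

for row in chart_rows(0x1E00, 0x1E01):
    print(row)
# | U+1E00 | Ḁ | LATIN CAPITAL LETTER A WITH RING BELOW |
# | U+1E01 | ḁ | LATIN SMALL LETTER A WITH RING BELOW |
```

The names reported depend on the Unicode version that the Python build ships with, although every code point in this block has been assigned since Unicode 5.1.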
U+1E40–U+1E7F: Accented Letters (M–V)
| Code Point | Glyph | Name | Category |
|---|---|---|---|
| U+1E40 | Ṁ | LATIN CAPITAL LETTER M WITH DOT ABOVE | Accented Uppercase |
| U+1E41 | ṁ | LATIN SMALL LETTER M WITH DOT ABOVE | Accented Lowercase |
| U+1E42 | Ṃ | LATIN CAPITAL LETTER M WITH DOT BELOW | Accented Uppercase |
| U+1E43 | ṃ | LATIN SMALL LETTER M WITH DOT BELOW | Accented Lowercase |
| U+1E44 | Ṅ | LATIN CAPITAL LETTER N WITH DOT ABOVE | Accented Uppercase |
| U+1E45 | ṅ | LATIN SMALL LETTER N WITH DOT ABOVE | Accented Lowercase |
| U+1E46 | Ṇ | LATIN CAPITAL LETTER N WITH DOT BELOW | Accented Uppercase |
| U+1E47 | ṇ | LATIN SMALL LETTER N WITH DOT BELOW | Accented Lowercase |
| U+1E48 | Ṉ | LATIN CAPITAL LETTER N WITH LINE BELOW | Accented Uppercase |
| U+1E49 | ṉ | LATIN SMALL LETTER N WITH LINE BELOW | Accented Lowercase |
| U+1E4A | Ṋ | LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW | Accented Uppercase |
| U+1E4B | ṋ | LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW | Accented Lowercase |
| U+1E4C | Ṍ | LATIN CAPITAL LETTER O WITH TILDE AND ACUTE | Accented Uppercase |
| U+1E4D | ṍ | LATIN SMALL LETTER O WITH TILDE AND ACUTE | Accented Lowercase |
| U+1E4E | Ṏ | LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS | Accented Uppercase |
| U+1E4F | ṏ | LATIN SMALL LETTER O WITH TILDE AND DIAERESIS | Accented Lowercase |
| U+1E50 | Ṑ | LATIN CAPITAL LETTER O WITH MACRON AND GRAVE | Accented Uppercase |
| U+1E51 | ṑ | LATIN SMALL LETTER O WITH MACRON AND GRAVE | Accented Lowercase |
| U+1E52 | Ṓ | LATIN CAPITAL LETTER O WITH MACRON AND ACUTE | Accented Uppercase |
| U+1E53 | ṓ | LATIN SMALL LETTER O WITH MACRON AND ACUTE | Accented Lowercase |
| U+1E54 | Ṕ | LATIN CAPITAL LETTER P WITH ACUTE | Accented Uppercase |
| U+1E55 | ṕ | LATIN SMALL LETTER P WITH ACUTE | Accented Lowercase |
| U+1E56 | Ṗ | LATIN CAPITAL LETTER P WITH DOT ABOVE | Accented Uppercase |
| U+1E57 | ṗ | LATIN SMALL LETTER P WITH DOT ABOVE | Accented Lowercase |
| U+1E58 | Ṙ | LATIN CAPITAL LETTER R WITH DOT ABOVE | Accented Uppercase |
| U+1E59 | ṙ | LATIN SMALL LETTER R WITH DOT ABOVE | Accented Lowercase |
| U+1E5A | Ṛ | LATIN CAPITAL LETTER R WITH DOT BELOW | Accented Uppercase |
| U+1E5B | ṛ | LATIN SMALL LETTER R WITH DOT BELOW | Accented Lowercase |
| U+1E5C | Ṝ | LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON | Accented Uppercase |
| U+1E5D | ṝ | LATIN SMALL LETTER R WITH DOT BELOW AND MACRON | Accented Lowercase |
| U+1E5E | Ṟ | LATIN CAPITAL LETTER R WITH LINE BELOW | Accented Uppercase |
| U+1E5F | ṟ | LATIN SMALL LETTER R WITH LINE BELOW | Accented Lowercase |
| U+1E60 | Ṡ | LATIN CAPITAL LETTER S WITH DOT ABOVE | Accented Uppercase |
| U+1E61 | ṡ | LATIN SMALL LETTER S WITH DOT ABOVE | Accented Lowercase |
| U+1E62 | Ṣ | LATIN CAPITAL LETTER S WITH DOT BELOW | Accented Uppercase |
| U+1E63 | ṣ | LATIN SMALL LETTER S WITH DOT BELOW | Accented Lowercase |
| U+1E64 | Ṥ | LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE | Accented Uppercase |
| U+1E65 | ṥ | LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE | Accented Lowercase |
| U+1E66 | Ṧ | LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE | Accented Uppercase |
| U+1E67 | ṧ | LATIN SMALL LETTER S WITH CARON AND DOT ABOVE | Accented Lowercase |
| U+1E68 | Ṩ | LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE | Accented Uppercase |
| U+1E69 | ṩ | LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE | Accented Lowercase |
| U+1E6A | Ṫ | LATIN CAPITAL LETTER T WITH DOT ABOVE | Accented Uppercase |
| U+1E6B | ṫ | LATIN SMALL LETTER T WITH DOT ABOVE | Accented Lowercase |
| U+1E6C | Ṭ | LATIN CAPITAL LETTER T WITH DOT BELOW | Accented Uppercase |
| U+1E6D | ṭ | LATIN SMALL LETTER T WITH DOT BELOW | Accented Lowercase |
| U+1E6E | Ṯ | LATIN CAPITAL LETTER T WITH LINE BELOW | Accented Uppercase |
| U+1E6F | ṯ | LATIN SMALL LETTER T WITH LINE BELOW | Accented Lowercase |
| U+1E70 | Ṱ | LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW | Accented Uppercase |
| U+1E71 | ṱ | LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW | Accented Lowercase |
| U+1E72 | Ṳ | LATIN CAPITAL LETTER U WITH DIAERESIS BELOW | Accented Uppercase |
| U+1E73 | ṳ | LATIN SMALL LETTER U WITH DIAERESIS BELOW | Accented Lowercase |
| U+1E74 | Ṵ | LATIN CAPITAL LETTER U WITH TILDE BELOW | Accented Uppercase |
| U+1E75 | ṵ | LATIN SMALL LETTER U WITH TILDE BELOW | Accented Lowercase |
| U+1E76 | Ṷ | LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW | Accented Uppercase |
| U+1E77 | ṷ | LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW | Accented Lowercase |
| U+1E78 | Ṹ | LATIN CAPITAL LETTER U WITH TILDE AND ACUTE | Accented Uppercase |
| U+1E79 | ṹ | LATIN SMALL LETTER U WITH TILDE AND ACUTE | Accented Lowercase |
| U+1E7A | Ṻ | LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS | Accented Uppercase |
| U+1E7B | ṻ | LATIN SMALL LETTER U WITH MACRON AND DIAERESIS | Accented Lowercase |
| U+1E7C | Ṽ | LATIN CAPITAL LETTER V WITH TILDE | Accented Uppercase |
| U+1E7D | ṽ | LATIN SMALL LETTER V WITH TILDE | Accented Lowercase |
| U+1E7E | Ṿ | LATIN CAPITAL LETTER V WITH DOT BELOW | Accented Uppercase |
| U+1E7F | ṿ | LATIN SMALL LETTER V WITH DOT BELOW | Accented Lowercase |
U+1E80–U+1EBF: Additional Accented Letters and Vietnamese-Specific
| Code Point | Glyph | Name | Category |
|---|---|---|---|
| U+1E80 | Ẁ | LATIN CAPITAL LETTER W WITH GRAVE | Accented Uppercase |
| U+1E81 | ẁ | LATIN SMALL LETTER W WITH GRAVE | Accented Lowercase |
| U+1E82 | Ẃ | LATIN CAPITAL LETTER W WITH ACUTE | Accented Uppercase |
| U+1E83 | ẃ | LATIN SMALL LETTER W WITH ACUTE | Accented Lowercase |
| U+1E84 | Ẅ | LATIN CAPITAL LETTER W WITH DIAERESIS | Accented Uppercase |
| U+1E85 | ẅ | LATIN SMALL LETTER W WITH DIAERESIS | Accented Lowercase |
| U+1E86 | Ẇ | LATIN CAPITAL LETTER W WITH DOT ABOVE | Accented Uppercase |
| U+1E87 | ẇ | LATIN SMALL LETTER W WITH DOT ABOVE | Accented Lowercase |
| U+1E88 | Ẉ | LATIN CAPITAL LETTER W WITH DOT BELOW | Accented Uppercase |
| U+1E89 | ẉ | LATIN SMALL LETTER W WITH DOT BELOW | Accented Lowercase |
| U+1E8A | Ẋ | LATIN CAPITAL LETTER X WITH DOT ABOVE | Accented Uppercase |
| U+1E8B | ẋ | LATIN SMALL LETTER X WITH DOT ABOVE | Accented Lowercase |
| U+1E8C | Ẍ | LATIN CAPITAL LETTER X WITH DIAERESIS | Accented Uppercase |
| U+1E8D | ẍ | LATIN SMALL LETTER X WITH DIAERESIS | Accented Lowercase |
| U+1E8E | Ẏ | LATIN CAPITAL LETTER Y WITH DOT ABOVE | Accented Uppercase |
| U+1E8F | ẏ | LATIN SMALL LETTER Y WITH DOT ABOVE | Accented Lowercase |
| U+1E90 | Ẑ | LATIN CAPITAL LETTER Z WITH CIRCUMFLEX | Accented Uppercase |
| U+1E91 | ẑ | LATIN SMALL LETTER Z WITH CIRCUMFLEX | Accented Lowercase |
| U+1E92 | Ẓ | LATIN CAPITAL LETTER Z WITH DOT BELOW | Accented Uppercase |
| U+1E93 | ẓ | LATIN SMALL LETTER Z WITH DOT BELOW | Accented Lowercase |
| U+1E94 | Ẕ | LATIN CAPITAL LETTER Z WITH LINE BELOW | Accented Uppercase |
| U+1E95 | ẕ | LATIN SMALL LETTER Z WITH LINE BELOW | Accented Lowercase |
| U+1E96 | ẖ | LATIN SMALL LETTER H WITH LINE BELOW | Legacy Variant |
| U+1E97 | ẗ | LATIN SMALL LETTER T WITH DIAERESIS | Legacy Variant |
| U+1E98 | ẘ | LATIN SMALL LETTER W WITH RING ABOVE | Legacy Variant |
| U+1E99 | ẙ | LATIN SMALL LETTER Y WITH RING ABOVE | Legacy Variant |
| U+1E9A | ẚ | LATIN SMALL LETTER A WITH RIGHT HALF RING | Legacy Variant |
| U+1E9B | ẛ | LATIN SMALL LETTER LONG S WITH DOT ABOVE | Legacy Variant |
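| U+1E9C | ẜ | LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE | Medievalist Lowercase |
| U+1E9D | ẝ | LATIN SMALL LETTER LONG S WITH HIGH STROKE | Medievalist Lowercase |
| U+1E9E | ẞ | LATIN CAPITAL LETTER SHARP S | German Uppercase |
| U+1E9F | ẟ | LATIN SMALL LETTER DELTA | Medievalist Lowercase |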
| U+1EA0 | Ạ | LATIN CAPITAL LETTER A WITH DOT BELOW | Vietnamese Uppercase |
| U+1EA1 | ạ | LATIN SMALL LETTER A WITH DOT BELOW | Vietnamese Lowercase |
| U+1EA2 | Ả | LATIN CAPITAL LETTER A WITH HOOK ABOVE | Vietnamese Uppercase |
| U+1EA3 | ả | LATIN SMALL LETTER A WITH HOOK ABOVE | Vietnamese Lowercase |
| U+1EA4 | Ấ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE | Vietnamese Uppercase |
| U+1EA5 | ấ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE | Vietnamese Lowercase |
| U+1EA6 | Ầ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE | Vietnamese Uppercase |
| U+1EA7 | ầ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE | Vietnamese Lowercase |
| U+1EA8 | Ẩ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE | Vietnamese Uppercase |
| U+1EA9 | ẩ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE | Vietnamese Lowercase |
| U+1EAA | Ẫ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE | Vietnamese Uppercase |
| U+1EAB | ẫ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE | Vietnamese Lowercase |
| U+1EAC | Ậ | LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW | Vietnamese Uppercase |
| U+1EAD | ậ | LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW | Vietnamese Lowercase |
| U+1EAE | Ắ | LATIN CAPITAL LETTER A WITH BREVE AND ACUTE | Vietnamese Uppercase |
| U+1EAF | ắ | LATIN SMALL LETTER A WITH BREVE AND ACUTE | Vietnamese Lowercase |
| U+1EB0 | Ằ | LATIN CAPITAL LETTER A WITH BREVE AND GRAVE | Vietnamese Uppercase |
| U+1EB1 | ằ | LATIN SMALL LETTER A WITH BREVE AND GRAVE | Vietnamese Lowercase |
| U+1EB2 | Ẳ | LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE | Vietnamese Uppercase |
| U+1EB3 | ẳ | LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE | Vietnamese Lowercase |
| U+1EB4 | Ẵ | LATIN CAPITAL LETTER A WITH BREVE AND TILDE | Vietnamese Uppercase |
| U+1EB5 | ẵ | LATIN SMALL LETTER A WITH BREVE AND TILDE | Vietnamese Lowercase |
| U+1EB6 | Ặ | LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW | Vietnamese Uppercase |
| U+1EB7 | ặ | LATIN SMALL LETTER A WITH BREVE AND DOT BELOW | Vietnamese Lowercase |
| U+1EB8 | Ẹ | LATIN CAPITAL LETTER E WITH DOT BELOW | Vietnamese Uppercase |
| U+1EB9 | ẹ | LATIN SMALL LETTER E WITH DOT BELOW | Vietnamese Lowercase |
| U+1EBA | Ẻ | LATIN CAPITAL LETTER E WITH HOOK ABOVE | Vietnamese Uppercase |
| U+1EBB | ẻ | LATIN SMALL LETTER E WITH HOOK ABOVE | Vietnamese Lowercase |
| U+1EBC | Ẽ | LATIN CAPITAL LETTER E WITH TILDE | Vietnamese Uppercase |
| U+1EBD | ẽ | LATIN SMALL LETTER E WITH TILDE | Vietnamese Lowercase |
| U+1EBE | Ế | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE | Vietnamese Uppercase |
| U+1EBF | ế | LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE | Vietnamese Lowercase |
U+1EC0–U+1EFF: Vietnamese Tones and Medievalist Additions
| Code Point | Glyph | Name | Category |
|---|---|---|---|
| U+1EC0 | Ề | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE | Vietnamese Uppercase |
| U+1EC1 | ề | LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE | Vietnamese Lowercase |
| U+1EC2 | Ể | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE | Vietnamese Uppercase |
| U+1EC3 | ể | LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE | Vietnamese Lowercase |
| U+1EC4 | Ễ | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE | Vietnamese Uppercase |
| U+1EC5 | ễ | LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE | Vietnamese Lowercase |
| U+1EC6 | Ệ | LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW | Vietnamese Uppercase |
| U+1EC7 | ệ | LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW | Vietnamese Lowercase |
| U+1EC8 | Ỉ | LATIN CAPITAL LETTER I WITH HOOK ABOVE | Vietnamese Uppercase |
| U+1EC9 | ỉ | LATIN SMALL LETTER I WITH HOOK ABOVE | Vietnamese Lowercase |
| U+1ECA | Ị | LATIN CAPITAL LETTER I WITH DOT BELOW | Vietnamese Uppercase |
| U+1ECB | ị | LATIN SMALL LETTER I WITH DOT BELOW | Vietnamese Lowercase |
| U+1ECC | Ọ | LATIN CAPITAL LETTER O WITH DOT BELOW | Vietnamese Uppercase |
| U+1ECD | ọ | LATIN SMALL LETTER O WITH DOT BELOW | Vietnamese Lowercase |
| U+1ECE | Ỏ | LATIN CAPITAL LETTER O WITH HOOK ABOVE | Vietnamese Uppercase |
| U+1ECF | ỏ | LATIN SMALL LETTER O WITH HOOK ABOVE | Vietnamese Lowercase |
| U+1ED0 | Ố | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE | Vietnamese Uppercase |
| U+1ED1 | ố | LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE | Vietnamese Lowercase |
| U+1ED2 | Ồ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE | Vietnamese Uppercase |
| U+1ED3 | ồ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE | Vietnamese Lowercase |
| U+1ED4 | Ổ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE | Vietnamese Uppercase |
| U+1ED5 | ổ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE | Vietnamese Lowercase |
| U+1ED6 | Ỗ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE | Vietnamese Uppercase |
| U+1ED7 | ỗ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE | Vietnamese Lowercase |
| U+1ED8 | Ộ | LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW | Vietnamese Uppercase |
| U+1ED9 | ộ | LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW | Vietnamese Lowercase |
| U+1EDA | Ớ | LATIN CAPITAL LETTER O WITH HORN AND ACUTE | Vietnamese Uppercase |
| U+1EDB | ớ | LATIN SMALL LETTER O WITH HORN AND ACUTE | Vietnamese Lowercase |
| U+1EDC | Ờ | LATIN CAPITAL LETTER O WITH HORN AND GRAVE | Vietnamese Uppercase |
| U+1EDD | ờ | LATIN SMALL LETTER O WITH HORN AND GRAVE | Vietnamese Lowercase |
| U+1EDE | Ở | LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE | Vietnamese Uppercase |
| U+1EDF | ở | LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE | Vietnamese Lowercase |
| U+1EE0 | Ỡ | LATIN CAPITAL LETTER O WITH HORN AND TILDE | Vietnamese Uppercase |
| U+1EE1 | ỡ | LATIN SMALL LETTER O WITH HORN AND TILDE | Vietnamese Lowercase |
| U+1EE2 | Ợ | LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW | Vietnamese Uppercase |
| U+1EE3 | ợ | LATIN SMALL LETTER O WITH HORN AND DOT BELOW | Vietnamese Lowercase |
| U+1EE4 | Ụ | LATIN CAPITAL LETTER U WITH DOT BELOW | Vietnamese Uppercase |
| U+1EE5 | ụ | LATIN SMALL LETTER U WITH DOT BELOW | Vietnamese Lowercase |
| U+1EE6 | Ủ | LATIN CAPITAL LETTER U WITH HOOK ABOVE | Vietnamese Uppercase |
| U+1EE7 | ủ | LATIN SMALL LETTER U WITH HOOK ABOVE | Vietnamese Lowercase |
| U+1EE8 | Ứ | LATIN CAPITAL LETTER U WITH HORN AND ACUTE | Vietnamese Uppercase |
| U+1EE9 | ứ | LATIN SMALL LETTER U WITH HORN AND ACUTE | Vietnamese Lowercase |
| U+1EEA | Ừ | LATIN CAPITAL LETTER U WITH HORN AND GRAVE | Vietnamese Uppercase |
| U+1EEB | ừ | LATIN SMALL LETTER U WITH HORN AND GRAVE | Vietnamese Lowercase |
| U+1EEC | Ử | LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE | Vietnamese Uppercase |
| U+1EED | ử | LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE | Vietnamese Lowercase |
| U+1EEE | Ữ | LATIN CAPITAL LETTER U WITH HORN AND TILDE | Vietnamese Uppercase |
| U+1EEF | ữ | LATIN SMALL LETTER U WITH HORN AND TILDE | Vietnamese Lowercase |
| U+1EF0 | Ự | LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW | Vietnamese Uppercase |
| U+1EF1 | ự | LATIN SMALL LETTER U WITH HORN AND DOT BELOW | Vietnamese Lowercase |
| U+1EF2 | Ỳ | LATIN CAPITAL LETTER Y WITH GRAVE | Vietnamese Uppercase |
| U+1EF3 | ỳ | LATIN SMALL LETTER Y WITH GRAVE | Vietnamese Lowercase |
| U+1EF4 | Ỵ | LATIN CAPITAL LETTER Y WITH DOT BELOW | Vietnamese Uppercase |
| U+1EF5 | ỵ | LATIN SMALL LETTER Y WITH DOT BELOW | Vietnamese Lowercase |
| U+1EF6 | Ỷ | LATIN CAPITAL LETTER Y WITH HOOK ABOVE | Vietnamese Uppercase |
| U+1EF7 | ỷ | LATIN SMALL LETTER Y WITH HOOK ABOVE | Vietnamese Lowercase |
| U+1EF8 | Ỹ | LATIN CAPITAL LETTER Y WITH TILDE | Vietnamese Uppercase |
| U+1EF9 | ỹ | LATIN SMALL LETTER Y WITH TILDE | Vietnamese Lowercase |
| U+1EFA | Ỻ | LATIN CAPITAL LETTER MIDDLE-WELSH LL | Medievalist Uppercase |
| U+1EFB | ỻ | LATIN SMALL LETTER MIDDLE-WELSH LL | Medievalist Lowercase |
| U+1EFC | Ỽ | LATIN CAPITAL LETTER MIDDLE-WELSH V | Medievalist Uppercase |
| U+1EFD | ỽ | LATIN SMALL LETTER MIDDLE-WELSH V | Medievalist Lowercase |
| U+1EFE | Ỿ | LATIN CAPITAL LETTER Y WITH LOOP | Medievalist Uppercase |
| U+1EFF | ỿ | LATIN SMALL LETTER Y WITH LOOP | Medievalist Lowercase |
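Rows like those above can be cross-checked against the Unicode Character Database. The sketch below assumes Python's standard `unicodedata` module, whose data version depends on the interpreter; the helper name `describe` is illustrative and not part of any library.

```python
import unicodedata


def describe(cp: int) -> str:
    """Build a table-style row for one code point from the interpreter's bundled UCD."""
    ch = chr(cp)
    # Passing a default avoids the ValueError raised for code points without a name.
    name = unicodedata.name(ch, "<unassigned>")
    return f"| U+{cp:04X} | {ch} | {name} |"


# One row per code point in U+1EC0..U+1EFF.
rows = [describe(cp) for cp in range(0x1EC0, 0x1F00)]
for row in rows[:4]:
    print(row)
```

Comparing the generated rows with a hand-maintained table is a quick way to catch shifted code points or mismatched glyphs.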
Compact Listing
The Latin Extended Additional block (U+1E00–U+1EFF) provides 256 code points for extended Latin characters, primarily precomposed forms with diacritics for linguistic and typographic needs. Most decompose canonically to a base letter plus one or more combining marks (e.g., U+1E02 Ḃ decomposes to B + U+0307 COMBINING DOT ABOVE). All 256 code points are assigned as of Unicode 5.1, which filled U+1E9C–U+1E9F and U+1EFA–U+1EFF. This compact listing groups characters by diacritic and shows representative pairs for quick reference. Counts per category are based on character names, so a letter carrying two marks is counted under each. The listing serves developers and typographers working on font implementation, normalization, and collation; a few characters, such as U+1E9A, have only a compatibility decomposition, which matters under NFKC and NFKD.[1][27] Each entry shows the code point, glyph, and an abbreviated name derived from the official name.[1]
Dot Above (33 characters: 16 uppercase, 17 lowercase)
These include letters like B, D, and F with a single dot above, used in Irish and other orthographies. The count also covers three S combinations (U+1E64–U+1E69) and the unpaired U+1E9B ẛ.
| Code Point | Glyph | Short Name |
|---|---|---|
| 1E02 | Ḃ | B DOT ABOVE |
| 1E03 | ḃ | b DOT ABOVE |
| 1E0A | Ḋ | D DOT ABOVE |
| 1E0B | ḋ | d DOT ABOVE |
| 1E1E | Ḟ | F DOT ABOVE |
| 1E1F | ḟ | f DOT ABOVE |
| ... | ... | (13 more pairs, plus ẛ) |
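The decomposition behavior of the dot-above letters listed here can be inspected with a normalization library. This is a minimal sketch, assuming Python's standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

precomposed = "\u1E02"  # LATIN CAPITAL LETTER B WITH DOT ABOVE
decomposed = unicodedata.normalize("NFD", precomposed)
print([f"U+{ord(c):04X}" for c in decomposed])  # ['U+0042', 'U+0307']

# NFC recomposes the sequence into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True

# U+1E9A has only a compatibility decomposition:
# NFD leaves it unchanged, while NFKD splits it into a + U+02BE.
half_ring = "\u1E9A"
nfd_half_ring = unicodedata.normalize("NFD", half_ring)
nfkd_half_ring = unicodedata.normalize("NFKD", half_ring)
print(nfd_half_ring == half_ring)                   # True
print([f"U+{ord(c):04X}" for c in nfkd_half_ring])  # ['U+0061', 'U+02BE']
```

Normalizing both sides to the same form before comparing strings avoids mismatches between precomposed and decomposed spellings of the same letter.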
Dot Below (56 characters: 28 uppercase, 28 lowercase)
Common in Vietnamese, Indic transliteration, and African languages, applied to vowels and consonants like A, B, D. The count includes combinations such as circumflex and dot below.
| Code Point | Glyph | Short Name |
|---|---|---|
| 1E0C | Ḍ | D DOT BELOW |
| 1E0D | ḍ | d DOT BELOW |
| 1EA0 | Ạ | A DOT BELOW |
| 1EA1 | ạ | a DOT BELOW |
| 1EAC | Ậ | A CIRCUMFLEX DOT BELOW |
| 1EAD | ậ | a CIRCUMFLEX DOT BELOW |
| ... | ... | (25 more pairs) |
Macron (16 characters: 8 uppercase, 8 lowercase)
For length marking, for example in Sanskrit transliteration (ṝ, ḹ), on letters like E, G, O, and U. Most forms combine the macron with a second mark.
| Code Point | Glyph | Short Name |
|---|---|---|
| 1E20 | Ḡ | G MACRON |
| 1E21 | ḡ | g MACRON |
| 1E14 | Ḕ | E MACRON GRAVE |
| 1E15 | ḕ | e MACRON GRAVE |
| 1E52 | Ṓ | O MACRON ACUTE |
| 1E53 | ṓ | o MACRON ACUTE |
| ... | ... | (5 more pairs) |
Acute Accent (34 characters: 17 uppercase, 17 lowercase, often combined)
Seen alone on consonants such as K, M, P, and W (Ẃ is used in Welsh) and combined with other marks in Vietnamese.
| Code Point | Glyph | Short Name |
|---|---|---|
| 1E30 | Ḱ | K ACUTE |
| 1E31 | ḱ | k ACUTE |
| 1E54 | Ṕ | P ACUTE |
| 1E55 | ṕ | p ACUTE |
| 1E82 | Ẃ | W ACUTE |
| 1E83 | ẃ | w ACUTE |
| ... | ... | (14 more pairs) |
Tilde (30 characters: 15 uppercase, 15 lowercase)
Marks the ngã tone in Vietnamese and nasalization in some other orthographies, on A, E, O, etc. The count includes tilde below.
| Code Point | Glyph | Short Name |
|---|---|---|
| 1E74 | Ṵ | U TILDE BELOW |
| 1E75 | ṵ | u TILDE BELOW |
| 1E4C | Ṍ | O TILDE ACUTE |
| 1E4D | ṍ | o TILDE ACUTE |
| 1E7C | Ṽ | V TILDE |
| 1E7D | ṽ | v TILDE |
| ... | ... | (12 more pairs) |
Other Diacritics (e.g., Circumflex, Hook Above, Ring Below)
Includes diverse forms such as ring below (2), circumflex below (12), line below (17), and the Vietnamese circumflex, breve, hook above, and horn combinations. Examples:
| Code Point | Glyph | Short Name |
|---|---|---|
| 1E00 | Ḁ | A RING BELOW |
| 1E01 | ḁ | a RING BELOW |
| 1EA2 | Ả | A HOOK ABOVE |
| 1EA3 | ả | a HOOK ABOVE |
| 1EA6 | Ầ | A CIRCUMFLEX GRAVE |
| 1EA7 | ầ | a CIRCUMFLEX GRAVE |
| 1ED0 | Ố | O CIRCUMFLEX ACUTE |
| 1ED1 | ố | o CIRCUMFLEX ACUTE |
| ... | ... | (remaining characters omitted) |
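A per-diacritic tally for the block can be derived from the official character names. The sketch below assumes Python's standard `unicodedata` module and counts by name substring, so a letter carrying two marks is counted under each mark and "TILDE" also matches "TILDE BELOW". `BLOCK`, `MARKS`, and `tally` are illustrative names, not part of any library.

```python
import unicodedata
from collections import Counter

BLOCK = range(0x1E00, 0x1F00)  # Latin Extended Additional
MARKS = ["DOT ABOVE", "DOT BELOW", "MACRON", "ACUTE", "TILDE"]


def tally(marks=MARKS):
    """Count characters in the block whose official name mentions each mark."""
    counts = Counter()
    for cp in BLOCK:
        # An empty default keeps unnamed code points from raising ValueError.
        name = unicodedata.name(chr(cp), "")
        for mark in marks:
            if mark in name:
                counts[mark] += 1
    return counts


counts = tally()
for mark in MARKS:
    print(f"{mark}: {counts[mark]}")
```

Because the tally relies on the interpreter's bundled character database, results reflect whichever Unicode version that interpreter ships.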