
Latin Extended Additional

Latin Extended Additional is a block of the Unicode Standard that provides 256 additional characters for the Latin script, spanning the code point range U+1E00 to U+1EFF.^[1] These characters consist primarily of precomposed combinations of Latin letters with one or more diacritical marks, supporting the orthographies and transliteration schemes of a range of languages.^[2] Introduced in Version 1.1 of the Unicode Standard, the block addresses gaps in earlier Latin extensions by including characters for specific linguistic and historical applications.^[3]

Notable subsets include diacritic variations for Irish Gaelic (such as letters with dots above, like ḃ and ṗ), Vietnamese quốc ngữ (precomposed vowels with tone marks and dots below, in the range U+1EA0–U+1EF9), and Indic transliteration (letters with underdots, like ḍ and ṇ).^[1] It also encompasses specialized forms for medievalist studies (such as long s with a dot above, ẛ) and German typography (the capital sharp S, ẞ at U+1E9E, used in uppercased contexts like titles and signage).^[2]

Many characters in the block are canonically decomposable into a base letter and combining diacritics from the Combining Diacritical Marks block (U+0300–U+036F), facilitating compatibility with normalization processes.^[2] The block's design emphasizes practical extensions for phonetic accuracy, regional orthographies, and scholarly transcription, ensuring broad support for Latin-based writing systems without relying solely on dynamic composition.^[1] A few characters have no canonical decomposition, such as the small letter a with right half ring (U+1E9A ẚ), which has only a compatibility decomposition; these cater to unique typographic requirements in historical and linguistic contexts.^[2] As part of the Basic Multilingual Plane, Latin Extended Additional remains integral to digital text processing for European, African, and Asian languages using Latin scripts.^[4]

Overview

Purpose and Design

The Latin Extended Additional block, designated as U+1E00–U+1EFF in the Unicode Standard, encompasses 256 assigned code points dedicated to Latin letters, extending the Basic Latin and Latin Extended-A blocks to represent accented and diacritic-bearing characters in various orthographies.^[1] The block encodes uppercase and lowercase forms of letters modified by diacritical marks such as dots above or below, macrons, acutes, graves, cedillas, and circumflexes, supporting the phonetic and orthographic needs of languages such as Irish and Vietnamese without duplicating simpler forms already covered in earlier blocks.^[5]^[2]

A core design principle of this block is the use of precomposed characters, which integrate base Latin letters with one or more diacritical marks into single code points, thereby avoiding sequences of combining marks that could complicate text processing, rendering, and collation in systems handling languages with stacked or multiple diacritics.^[2] This approach enhances efficiency in digital typography and software implementations, particularly for legacy systems or environments where normalization to decomposed forms might introduce errors or performance issues.^[1] Most characters in the block (approximately 246) are general Latin letters with such diacritics, supplemented by specialized subsets for applications like medievalist transcriptions and certain phonetic notations.^[5]

The block's allocation underscores a commitment to phonetic accuracy in both native orthographies and scholarly transcriptions, allowing precise representation of sounds that cannot be adequately conveyed using only the unadorned letters from Basic Latin (U+0000–U+007F) or Latin Extended-A (U+0100–U+017F).^[2] By prioritizing these precomposed forms, the design facilitates integration into broader Latin script ecosystems, supporting diverse linguistic traditions while maintaining compatibility with Unicode's normalization algorithms for interconversion between composed and decomposed representations where needed.^[1]
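As a rough illustration of the precomposed design described above, the following sketch splits the block into characters that have a canonical decomposition and those that do not. It relies only on Python's standard `unicodedata` module; the helper name `has_canonical_decomposition` is an illustrative choice, and the exact counts depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

BLOCK = range(0x1E00, 0x1F00)  # U+1E00..U+1EFF inclusive

def has_canonical_decomposition(cp: int) -> bool:
    """True if the code point has a canonical (not compatibility) mapping."""
    mapping = unicodedata.decomposition(chr(cp))
    # Compatibility mappings are tagged, e.g. "<compat> 0061 02BE".
    return bool(mapping) and not mapping.startswith("<")

precomposed = [cp for cp in BLOCK if has_canonical_decomposition(cp)]
standalone = [cp for cp in BLOCK if not has_canonical_decomposition(cp)]

print(len(precomposed), "characters decompose canonically")
print("no canonical decomposition:",
      " ".join(f"U+{cp:04X}" for cp in standalone))
```

Characters in the second list, such as U+1E9E, stay as single code points under every canonical normalization form.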

Unicode Allocation

The Latin Extended Additional block occupies the hexadecimal range U+1E00 to U+1EFF, encompassing exactly 256 code points dedicated to extended Latin characters with diacritics and other modifications.^[1] As of Unicode 17.0, released in September 2025, all 256 positions within this block are fully assigned, with no unassigned or reserved code points remaining.^[1]^[6] This block is positioned in the Unicode repertoire after the Latin Extended-A (U+0100–U+017F) and Latin Extended-B (U+0180–U+024F) blocks, continuing the sequential expansion of Latin script support within the Basic Multilingual Plane (BMP).^[4]^[7] The placement in the BMP (plane 0, U+0000 to U+FFFF) ensures compatibility with systems limited to 16-bit encoding, distinguishing it from later Latin extensions in the Supplementary Multilingual Plane (plane 1).^[6]
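The range and assignment claims above can be checked with a short sketch. It assumes only Python's standard `unicodedata` module; the function names are illustrative, and the count reflects whatever Unicode version the interpreter ships with.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x1E00, 0x1EFF

def in_latin_extended_additional(ch: str) -> bool:
    """Check whether a single character falls inside the block's range."""
    return BLOCK_START <= ord(ch) <= BLOCK_END

def assigned_count() -> int:
    """Count code points in the block that carry a character name."""
    count = 0
    for cp in range(BLOCK_START, BLOCK_END + 1):
        # An empty default is returned for unassigned code points.
        if unicodedata.name(chr(cp), ""):
            count += 1
    return count

print(in_latin_extended_additional("\u1E9E"))  # capital sharp S
print(in_latin_extended_additional("\u00DF"))  # small sharp s (Latin-1 Supplement)
print(assigned_count())
```

Because the whole range lies below U+FFFF, each character occupies a single 16-bit code unit in UTF-16.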

Linguistic Applications

Support for European and Insular Languages

The Latin Extended Additional Unicode block (U+1E00–U+1EFF) encodes a range of precomposed characters essential for representing the orthographies of various European languages, with particular emphasis on Insular Celtic traditions such as Irish Gaelic. These characters facilitate accurate digital rendering of diacritic-modified letters used in historical and minority language contexts across Europe. The block's design prioritizes compatibility with combining diacritical marks while providing standalone forms for efficient typography.^[1] A key application is in Irish Gaelic, where the block supplies characters for the traditional orthography's lenition markers—dots above consonants indicating aspiration or softening. Representative examples include Ḃ (U+1E02, LATIN CAPITAL LETTER B WITH DOT ABOVE) and ḃ (U+1E03, LATIN SMALL LETTER B WITH DOT ABOVE) for lenited /b/, Ḋ (U+1E0A) and ḋ (U+1E0B) for lenited /d/, Ḟ (U+1E1E) and ḟ (U+1E1F) for lenited /f/, Ṗ (U+1E56) and ṗ (U+1E57) for lenited /p/, Ṡ (U+1E60) and ṡ (U+1E61) for lenited /s/, and Ṫ (U+1E6A) and ṫ (U+1E6B) for lenited /t/. These dot-above forms, proposed for encoding to support Irish Gaelic texts, differ from International Phonetic Alphabet symbols by serving purely orthographic roles in lenition rather than broad phonetic transcription.^[1] Although modern Irish primarily uses an adjacent "h" for lenition (e.g., bh for /v/), the dotted forms remain vital for historical manuscripts, scholarly editions, and certain typographic styles.^[1] Beyond Irish, the block supports other European orthographies, including pre-1921 Latvian usage with characters like Ḑ (U+1E10, LATIN CAPITAL LETTER D WITH CEDILLA) and ḑ (U+1E11, LATIN SMALL LETTER D WITH CEDILLA) for palatalized /d/, as well as ẜ (U+1E9C, LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE) in Blackletter-influenced texts. It also aids minority languages through diacritics like rings below (e.g., Ḁ U+1E00 for A) and other marks such as acutes and circumflexes. 
In modern typography, over 200 of the block's 256 characters are precomposed accented forms for capitals and lowercase letters, incorporating dots (52 characters), rings (2), and additional marks like acutes and circumflexes to ensure scalable support in digital fonts for these languages.^[1]
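The relationship between the traditional dotted consonants and the modern "h" spelling can be sketched in a few lines. This is a simplified illustration, not a full orthographic converter: the function name `dots_to_h` and the set of nine lenitable consonants are assumptions made for the example (ċ and ġ live outside this block), and the input is assumed to use precomposed characters.

```python
import unicodedata

DOT_ABOVE = "\u0307"          # COMBINING DOT ABOVE
LENITABLE = set("bcdfgmpst")  # assumed set of consonants that take the dot

def dots_to_h(text: str) -> str:
    """Rewrite dotted consonants (e.g. U+1E03) as consonant + 'h'."""
    out = []
    for ch in text:
        parts = unicodedata.normalize("NFD", ch)
        if (len(parts) == 2 and parts[1] == DOT_ABOVE
                and parts[0].lower() in LENITABLE):
            out.append(parts[0] + "h")  # keep the base letter's case
        else:
            out.append(ch)
    return "".join(out)

print(dots_to_h("\u1E1Fuinneog"))  # dotted f + "uinneog"
```

The sketch leans on canonical decomposition to find the base letter, so it needs no hand-written table of dotted forms.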

Support for Vietnamese and Southeast Asian Scripts

The Latin Extended Additional Unicode block plays a crucial role in supporting the Vietnamese orthography, known as quốc ngữ, by providing 90 precomposed characters (U+1EA0–U+1EF9) that integrate Latin letters with diacritics essential to the language's phonology.^[8] These characters enable the representation of complex vowel modifications, including the horn diacritic on letters like ơ and ư, combined with tone indicators, which are vital for distinguishing the six tones of Vietnamese. For instance, forms such as Ớ (U+1EDA, Latin capital letter O with horn and acute) and ự (U+1EF1, Latin small letter U with horn and dot below) exemplify how the block accommodates stacked diacritics without relying solely on combining marks.^[1]

Integration with tone marks is a key feature, offering precomposed variants for acute, grave, hook above, tilde, and dot below on modified vowels, such as ầ (U+1EA7, Latin small letter A with circumflex and grave) and ẩ (U+1EA9, Latin small letter A with circumflex and hook above). This design supports the full range of tonal distinctions directly within the orthography: the unmarked level tone (ngang), the rising tone marked with an acute (sắc), the falling tone marked with a grave (huyền), the dipping tone marked with a hook above (hỏi), the glottalized rising tone marked with a tilde (ngã), and the low glottalized tone marked with a dot below (nặng). By prioritizing these precomposed sequences, the block ensures compatibility with legacy systems and input methods that may struggle with multiple combining diacritics.^[1]^[9]

In regional usage, these characters are indispensable for digital text in Vietnam, facilitating accurate rendering of quốc ngữ in documents, websites, and software without the decomposition challenges that could arise from separate combining marks. This avoids inconsistencies in diacritic positioning or normalization forms and eases adoption in computing environments wherever Vietnamese is used.^[8] Compared to the Latin Extended-B block, which provides base forms like Ơ (U+01A0, Latin capital letter O with horn) and Ư (U+01AF, Latin capital letter U with horn) but lacks tone combinations, Latin Extended Additional adds the horn, circumflex, and breve vowels in combination with tone marks, completing the set needed for modern Vietnamese.^[1]
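To make the idea of stacked diacritics concrete, the sketch below fully decomposes a few Vietnamese letters from the block and lists the resulting code points by name. It uses only Python's standard `unicodedata` module; `describe` and `VIETNAMESE_RANGE` are names chosen for this example.

```python
import unicodedata

VIETNAMESE_RANGE = range(0x1EA0, 0x1EF9 + 1)  # precomposed Vietnamese letters

def describe(ch: str) -> list:
    """Names of the code points in the full canonical decomposition."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]

# a with circumflex and grave, O with horn and acute, u with horn and dot below
for ch in ("\u1EA7", "\u1EDA", "\u1EF1"):
    print(f"U+{ord(ch):04X} {ch}: " + " + ".join(describe(ch)))

print(len(VIETNAMESE_RANGE), "code points in the Vietnamese range")
```

Each letter expands to a base letter plus two combining marks, which is the sequence a renderer has to position when the precomposed form is not used.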

Historical, Medieval, and Scholarly Uses

The Latin Extended Additional block provides essential support for reconstructing and transcribing historical and medieval Latin-based texts, enabling scholars to preserve the original graphemic forms without alteration. These characters facilitate paleographic analysis of manuscripts from medieval Europe, where Latin was adapted for various vernaculars and abbreviations. In particular, Unicode 5.1 (2008) introduced 10 dedicated characters for medievalist applications, proposed by experts including Michael Everson to address gaps in encoding medieval Welsh and other Insular traditions, as well as abbreviation systems common in paleography.^[10]^[11] Key medievalist characters include U+1E9C (ẜ, LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE) and U+1E9D (ẝ, LATIN SMALL LETTER LONG S WITH HIGH STROKE), which represent variant forms of the long s used in medieval abbreviations for words like "sanctus" or "sit." These diacritic-modified longs allow precise reproduction of scribal conventions in 13th- to 15th-century manuscripts, aiding philologists in studying textual evolution. Similarly, U+1E9E (ẞ, LATIN CAPITAL LETTER SHARP S) encodes the uppercase sharp s, derived from medieval ligatures of long s and z, historically employed in German Blackletter printing and now standard for uppercase ß in scholarly editions of early modern texts.^[1]^[12] For paleographic support in Insular languages, characters such as U+1EFA (Ỻ, LATIN CAPITAL LETTER MIDDLE-WELSH LL) and U+1EFB (ỻ, LATIN SMALL LETTER MIDDLE-WELSH LL) denote the voiceless lateral fricative (/ɬ/) in medieval Welsh orthography, appearing in manuscripts like the Red Book of Hergest. These extend to related Insular uses, including notations for lenition in Old Irish contexts where similar sounds occur. 
Additional variants like U+1EFE (Ỿ, LATIN CAPITAL LETTER Y WITH LOOP) and U+1EFF (ỿ, LATIN SMALL LETTER Y WITH LOOP) mark the schwa sound (/ə/) in Middle Welsh, essential for accurate transcription of poetic and legal documents from the period.^[1]^[13]

In scholarly applications, particularly historical linguistics, characters like U+1E9F (ẟ, LATIN SMALL LETTER DELTA) serve as phonetic symbols for dental or alveolar approximants in reconstructions of ancient and medieval languages. Likewise, U+1EFC (Ỽ, LATIN CAPITAL LETTER MIDDLE-WELSH V) and U+1EFD (ỽ, LATIN SMALL LETTER MIDDLE-WELSH V) represent velar fricatives or labial variants in Insular Celtic studies, supporting detailed phonetic analyses of sound changes over time. These encodings, drawn from the Medieval Unicode Font Initiative (MUFI), ensure that academic transcriptions maintain fidelity to source materials, promoting interdisciplinary research in paleography and etymology.^[1]^[12]
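Transcription tools often need to change case without losing these letters. The following sketch, which uses only Python's built-in string methods, shows how the default case mappings treat the medievalist characters named above; the grouping into `CASE_PAIRS` and `CASELESS` is an arrangement made for this example.

```python
# (uppercase, lowercase) pairs among the characters discussed above
CASE_PAIRS = [
    ("\u1E9E", "\u00DF"),  # capital sharp S -> small sharp s
    ("\u1EFA", "\u1EFB"),  # Middle-Welsh LL
    ("\u1EFC", "\u1EFD"),  # Middle-Welsh V
    ("\u1EFE", "\u1EFF"),  # Y with loop
]
# small letters with no uppercase counterpart in the default mappings
CASELESS = ["\u1E9C", "\u1E9D", "\u1E9F"]

for upper, lower in CASE_PAIRS:
    print(f"U+{ord(upper):04X} lowercases to U+{ord(upper.lower()):04X}")

for ch in CASELESS:
    print(f"U+{ord(ch):04X} unchanged by upper():", ch.upper() == ch)

# The default uppercase of the small sharp s is still the two-letter "SS".
print("\u00DF".upper())
```

Code that should produce the capital sharp S when uppercasing therefore has to map U+00DF to U+1E9E explicitly.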

Development and Encoding

Historical Evolution

The Latin Extended Additional block was first introduced in Unicode 1.1 in June 1993, establishing an initial repertoire of 245 characters primarily to support precomposed forms with diacritics for various European languages, including Irish Gaelic orthography (such as dotted letters for lenition, e.g., U+1E02 Ḃ LATIN CAPITAL LETTER B WITH DOT ABOVE) and basic Vietnamese tone marks (e.g., the range U+1EA0–U+1EF9 for letters like Ạ and ả).^[14] This allocation aligned closely with the inaugural edition of ISO/IEC 10646:1993, incorporating amendments from international standardization efforts to extend the Latin script beyond earlier blocks like Latin Extended-A and Latin Extended-B.^[3] In Unicode 2.0, released in July 1996, the block received a single addition: U+1E9B ẛ LATIN SMALL LETTER LONG S WITH DOT ABOVE, which provided a specialized form for historical and phonetic transcriptions in Gaelic scripts, serving as a variant of the modern dotted s (U+1E61 ṡ) used in Irish standardization.^[15]^[14] This refinement addressed needs from linguistic communities focused on Insular scripts, drawing on input from scholars standardizing representations of medieval and early modern texts.^[16] The block reached its current total of 256 characters with the final expansions in Unicode 5.1 in April 2008, incorporating 10 new characters tailored for medievalist and scholarly applications. 
These included variants of the long s (U+1E9C ẜ LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE and U+1E9D ẝ LATIN SMALL LETTER LONG S WITH HIGH STROKE), the sharp s uppercase (U+1E9E ẞ LATIN CAPITAL LETTER SHARP S for German typography), a phonetic delta (U+1E9F ẟ LATIN SMALL LETTER DELTA), and Middle Welsh letters (U+1EFA Ỻ LATIN CAPITAL LETTER MIDDLE-WELSH LL, U+1EFB ỻ LATIN SMALL LETTER MIDDLE-WELSH LL, U+1EFC Ỽ LATIN CAPITAL LETTER MIDDLE-WELSH V, U+1EFD ỽ LATIN SMALL LETTER MIDDLE-WELSH V) along with a looped y (U+1EFE Ỿ LATIN CAPITAL LETTER Y WITH LOOP, U+1EFF ỿ LATIN SMALL LETTER Y WITH LOOP).^[10]^[14] These additions stemmed from proposals by the Medieval Unicode Font Initiative (MUFI) and other experts, emphasizing compatibility with historical manuscripts and paleographic needs while integrating feedback from ISO/IEC 10646 Amendment 5.^[17]^[11]^[18]

Technical Implementation

The Latin Extended Additional block, spanning code points U+1E00 to U+1EFF, requires robust font implementations to ensure accurate rendering of its precomposed characters, many of which incorporate diacritics such as dots, strokes, and rings. Fonts supporting this block typically include OpenType features like 'ccmp' (glyph composition/decomposition) to handle composition and decomposition of glyphs, particularly for diacritic attachment or substitution in legacy or decomposed forms.^[19] Additionally, the 'mark' feature (mark positioning), together with 'mkmk' (mark-to-mark positioning), is essential for proper positioning and stacking of multiple diacritics when text is normalized to decomposed forms, preventing visual overlaps or misalignments in stacked diacritics.^[19] Comprehensive support is evident in open-source fonts such as Noto Sans, which covers 100% of the block's 256 characters across its weights and styles, enabling display in applications handling European and Southeast Asian scripts.^[20]^[21]

Compatibility with existing Unicode infrastructure is provided through canonical decomposition mappings, where most characters in the block fully decompose into base letters from the Basic Latin block (U+0000–U+007F) followed by diacritical marks from the Combining Diacritical Marks block (U+0300–U+036F).^[1] For instance, under Normalization Form D (NFD), precomposed forms like U+1E02 (Ḃ) decompose to U+0042 (B) followed by U+0307 (combining dot above), allowing interoperability with systems that prefer separate components for processing.^[22] In contrast, Normalization Form C (NFC) recomposes these into the original precomposed characters, preserving the block's intended single-code-point representations and ensuring round-trip fidelity in storage and transmission.^[22] This dual compatibility supports migration from older encodings while maintaining semantic equivalence in modern Unicode-aware environments.
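The NFD and NFC behaviour described above can be reproduced with Python's standard `unicodedata` module. The sketch below is illustrative only; the helper `same_text` is a name introduced here to show why comparisons should be made on normalized strings.

```python
import unicodedata

precomposed = "\u1E02"       # B with dot above, one code point
decomposed = "\u0042\u0307"  # B followed by COMBINING DOT ABOVE

nfd = unicodedata.normalize("NFD", precomposed)
nfc = unicodedata.normalize("NFC", decomposed)

print([f"U+{ord(c):04X}" for c in nfd])  # code points after decomposition
print([f"U+{ord(c):04X}" for c in nfc])  # code points after recomposition

def same_text(a: str, b: str) -> bool:
    """Compare two strings after bringing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == decomposed)           # raw comparison differs
print(same_text(precomposed, decomposed))  # canonical equivalence holds
```

Normalizing both sides before comparing, hashing, or indexing avoids treating canonically equivalent spellings as different text.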
Input methods for the block vary by language but generally rely on keyboard layouts with dead keys or mnemonic conventions to generate code points efficiently. For Irish Gaelic, standard Windows and macOS layouts incorporate dead keys, such as a dot-above key, to produce characters like U+1E02 (Ḃ) by typing the modifier before the base letter B, allowing direct Unicode input without custom software.^[23] Similarly, dedicated Gaelic keyboards extend this mechanism to cover the full range of lenited and dotted forms used in traditional orthography. For Vietnamese, input method editors (IMEs) like those supporting Telex and VIQR schemes map alphabetic sequences to precomposed tones and diacritics in the block, such as typing "ar" in Telex to yield U+1EA3 (ả).^[24] These methods, integrated into operating systems via frameworks like IBus or the Windows IME, convert user input on the fly to Unicode, with Telex emphasizing letter-based shortcuts (e.g., 'w' for horns) and VIQR using ASCII punctuation for broader compatibility.^[24]

Implementation challenges arise primarily from legacy systems, where partial overlaps with code pages like Windows-1258 limit support to a subset of Vietnamese characters in the block, excluding rarer historical or extended forms.^[25] This can result in mojibake or substitution errors during conversion, because Windows-1258 represents tones with single-byte combining marks rather than precomposed letters and has no mapping for most of the block's 256 entries.^[25] Modern UTF-8 environments, however, provide complete coverage, with libraries in languages like Java and Python offering built-in normalization to mitigate these issues, so that the block's characters can be stored and exchanged across platforms without data loss.^[26]
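The legacy code page issue can be illustrated with Python's built-in `cp1258` codec. This is a minimal sketch for a single letter, not a general converter: it assumes that decomposing to NFD first is enough, which holds for U+1EA3 but not for every Vietnamese letter.

```python
import unicodedata

letter = "\u1EA3"  # a with hook above, precomposed

# Direct encoding fails: the code page has no slot for the precomposed letter.
try:
    letter.encode("cp1258")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False

# Decomposing first yields a base letter plus a combining tone mark,
# both of which the code page can represent.
legacy_bytes = unicodedata.normalize("NFD", letter).encode("cp1258")

# Decoding and recomposing restores the original code point.
restored = unicodedata.normalize("NFC", legacy_bytes.decode("cp1258"))

print(direct_ok, len(legacy_bytes), restored == letter)
```

Plain NFD is not sufficient for letters such as U+1EA5 (a with circumflex and acute), because the code page stores the circumflexed vowel as a single byte and has no combining circumflex; a real converter would need a decomposition tailored to the code page.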

Reference Materials

Full Character Chart

The Latin Extended Additional Unicode block encompasses 256 code points in the range U+1E00 to U+1EFF, all of which are assigned characters, designed to support precomposed Latin letters with diacritical marks for various orthographies and transliterations. All characters in this block are assigned the bidirectional class L (Left-to-Right), ensuring consistent rendering in left-to-right text flows. The characters are primarily uppercase and lowercase letters modified with diacritics such as dots, rings, macrons, acutes, graves, and cedillas, facilitating accurate representation in languages like Irish, Latvian, Lithuanian, Vietnamese, and scholarly transliterations of non-Latin scripts. The following tables present the full chart, grouped into subranges for clarity, with columns for the hexadecimal code point, glyph (rendered in a monospaced font for precision), official character name, and a brief category denoting the primary linguistic or typographic function (e.g., "Accented Uppercase" for diacritic-modified capital letters used in specific orthographies). Data verified as of Unicode 17.0.^[1]
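The code point, glyph, and name columns described above can be regenerated from the Unicode Character Database as a cross-check. The sketch below uses Python's standard `unicodedata` module; `chart_rows` is an illustrative name, the category column is omitted because it is an editorial label rather than a Unicode property, and the names returned reflect the Unicode version bundled with the interpreter.

```python
import unicodedata

def chart_rows(start=0x1E00, end=0x1EFF):
    """Yield (code point, glyph, name, bidi class) for each code point."""
    for cp in range(start, end + 1):
        ch = chr(cp)
        yield (
            f"U+{cp:04X}",
            ch,
            unicodedata.name(ch, "<unassigned>"),
            unicodedata.bidirectional(ch),
        )

rows = list(chart_rows())
for row in rows[:4]:
    print("\t".join(row))

print(all(bidi == "L" for _, _, _, bidi in rows))
```

Comparing generated names against a hand-maintained chart is a quick way to catch rows whose glyph and name have drifted apart.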

U+1E00–U+1E3F: Accented Letters (A–M Variants)

Code Point	Glyph	Name	Category
U+1E00	Ḁ	LATIN CAPITAL LETTER A WITH RING BELOW	Accented Uppercase
U+1E01	ḁ	LATIN SMALL LETTER A WITH RING BELOW	Accented Lowercase
U+1E02	Ḃ	LATIN CAPITAL LETTER B WITH DOT ABOVE	Accented Uppercase
U+1E03	ḃ	LATIN SMALL LETTER B WITH DOT ABOVE	Accented Lowercase
U+1E04	Ḅ	LATIN CAPITAL LETTER B WITH DOT BELOW	Accented Uppercase
U+1E05	ḅ	LATIN SMALL LETTER B WITH DOT BELOW	Accented Lowercase
U+1E06	Ḇ	LATIN CAPITAL LETTER B WITH LINE BELOW	Accented Uppercase
U+1E07	ḇ	LATIN SMALL LETTER B WITH LINE BELOW	Accented Lowercase
U+1E08	Ḉ	LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE	Accented Uppercase
U+1E09	ḉ	LATIN SMALL LETTER C WITH CEDILLA AND ACUTE	Accented Lowercase
U+1E0A	Ḋ	LATIN CAPITAL LETTER D WITH DOT ABOVE	Accented Uppercase
U+1E0B	ḋ	LATIN SMALL LETTER D WITH DOT ABOVE	Accented Lowercase
U+1E0C	Ḍ	LATIN CAPITAL LETTER D WITH DOT BELOW	Accented Uppercase
U+1E0D	ḍ	LATIN SMALL LETTER D WITH DOT BELOW	Accented Lowercase
U+1E0E	Ḏ	LATIN CAPITAL LETTER D WITH LINE BELOW	Accented Uppercase
U+1E0F	ḏ	LATIN SMALL LETTER D WITH LINE BELOW	Accented Lowercase
U+1E10	Ḑ	LATIN CAPITAL LETTER D WITH CEDILLA	Accented Uppercase
U+1E11	ḑ	LATIN SMALL LETTER D WITH CEDILLA	Accented Lowercase
U+1E12	Ḓ	LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW	Accented Uppercase
U+1E13	ḓ	LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW	Accented Lowercase
U+1E14	Ḕ	LATIN CAPITAL LETTER E WITH MACRON AND GRAVE	Accented Uppercase
U+1E15	ḕ	LATIN SMALL LETTER E WITH MACRON AND GRAVE	Accented Lowercase
U+1E16	Ḗ	LATIN CAPITAL LETTER E WITH MACRON AND ACUTE	Accented Uppercase
U+1E17	ḗ	LATIN SMALL LETTER E WITH MACRON AND ACUTE	Accented Lowercase
U+1E18	Ḙ	LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW	Accented Uppercase
U+1E19	ḙ	LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW	Accented Lowercase
U+1E1A	Ḛ	LATIN CAPITAL LETTER E WITH TILDE BELOW	Accented Uppercase
U+1E1B	ḛ	LATIN SMALL LETTER E WITH TILDE BELOW	Accented Lowercase
U+1E1C	Ḝ	LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE	Accented Uppercase
U+1E1D	ḝ	LATIN SMALL LETTER E WITH CEDILLA AND BREVE	Accented Lowercase
U+1E1E	Ḟ	LATIN CAPITAL LETTER F WITH DOT ABOVE	Accented Uppercase
U+1E1F	ḟ	LATIN SMALL LETTER F WITH DOT ABOVE	Accented Lowercase
U+1E20	Ḡ	LATIN CAPITAL LETTER G WITH MACRON	Accented Uppercase
U+1E21	ḡ	LATIN SMALL LETTER G WITH MACRON	Accented Lowercase
U+1E22	Ḣ	LATIN CAPITAL LETTER H WITH DOT ABOVE	Accented Uppercase
U+1E23	ḣ	LATIN SMALL LETTER H WITH DOT ABOVE	Accented Lowercase
U+1E24	Ḥ	LATIN CAPITAL LETTER H WITH DOT BELOW	Accented Uppercase
U+1E25	ḥ	LATIN SMALL LETTER H WITH DOT BELOW	Accented Lowercase
U+1E26	Ḧ	LATIN CAPITAL LETTER H WITH DIAERESIS	Accented Uppercase
U+1E27	ḧ	LATIN SMALL LETTER H WITH DIAERESIS	Accented Lowercase
U+1E28	Ḩ	LATIN CAPITAL LETTER H WITH CEDILLA	Accented Uppercase
U+1E29	ḩ	LATIN SMALL LETTER H WITH CEDILLA	Accented Lowercase
U+1E2A	Ḫ	LATIN CAPITAL LETTER H WITH BREVE BELOW	Accented Uppercase
U+1E2B	ḫ	LATIN SMALL LETTER H WITH BREVE BELOW	Accented Lowercase
U+1E2C	Ḭ	LATIN CAPITAL LETTER I WITH TILDE BELOW	Accented Uppercase
U+1E2D	ḭ	LATIN SMALL LETTER I WITH TILDE BELOW	Accented Lowercase
U+1E2E	Ḯ	LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE	Accented Uppercase
U+1E2F	ḯ	LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE	Accented Lowercase
U+1E30	Ḱ	LATIN CAPITAL LETTER K WITH ACUTE	Accented Uppercase
U+1E31	ḱ	LATIN SMALL LETTER K WITH ACUTE	Accented Lowercase
U+1E32	Ḳ	LATIN CAPITAL LETTER K WITH DOT BELOW	Accented Uppercase
U+1E33	ḳ	LATIN SMALL LETTER K WITH DOT BELOW	Accented Lowercase
U+1E34	Ḵ	LATIN CAPITAL LETTER K WITH LINE BELOW	Accented Uppercase
U+1E35	ḵ	LATIN SMALL LETTER K WITH LINE BELOW	Accented Lowercase
U+1E36	Ḷ	LATIN CAPITAL LETTER L WITH DOT BELOW	Accented Uppercase
U+1E37	ḷ	LATIN SMALL LETTER L WITH DOT BELOW	Accented Lowercase
U+1E38	Ḹ	LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON	Accented Uppercase
U+1E39	ḹ	LATIN SMALL LETTER L WITH DOT BELOW AND MACRON	Accented Lowercase
U+1E3A	Ḻ	LATIN CAPITAL LETTER L WITH LINE BELOW	Accented Uppercase
U+1E3B	ḻ	LATIN SMALL LETTER L WITH LINE BELOW	Accented Lowercase
U+1E3C	Ḽ	LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW	Accented Uppercase
U+1E3D	ḽ	LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW	Accented Lowercase
U+1E3E	Ḿ	LATIN CAPITAL LETTER M WITH ACUTE	Accented Uppercase
U+1E3F	ḿ	LATIN SMALL LETTER M WITH ACUTE	Accented Lowercase

U+1E40–U+1E7F: Accented Letters (M–V Variants)

Code Point	Glyph	Name	Category
U+1E40	Ṁ	LATIN CAPITAL LETTER M WITH DOT ABOVE	Accented Uppercase
U+1E41	ṁ	LATIN SMALL LETTER M WITH DOT ABOVE	Accented Lowercase
U+1E42	Ṃ	LATIN CAPITAL LETTER M WITH DOT BELOW	Accented Uppercase
U+1E43	ṃ	LATIN SMALL LETTER M WITH DOT BELOW	Accented Lowercase
U+1E44	Ṅ	LATIN CAPITAL LETTER N WITH DOT ABOVE	Accented Uppercase
U+1E45	ṅ	LATIN SMALL LETTER N WITH DOT ABOVE	Accented Lowercase
U+1E46	Ṇ	LATIN CAPITAL LETTER N WITH DOT BELOW	Accented Uppercase
U+1E47	ṇ	LATIN SMALL LETTER N WITH DOT BELOW	Accented Lowercase
U+1E48	Ṉ	LATIN CAPITAL LETTER N WITH LINE BELOW	Accented Uppercase
U+1E49	ṉ	LATIN SMALL LETTER N WITH LINE BELOW	Accented Lowercase
U+1E4A	Ṋ	LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW	Accented Uppercase
U+1E4B	ṋ	LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW	Accented Lowercase
U+1E4C	Ṍ	LATIN CAPITAL LETTER O WITH TILDE AND ACUTE	Accented Uppercase
U+1E4D	ṍ	LATIN SMALL LETTER O WITH TILDE AND ACUTE	Accented Lowercase
U+1E4E	Ṏ	LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS	Accented Uppercase
U+1E4F	ṏ	LATIN SMALL LETTER O WITH TILDE AND DIAERESIS	Accented Lowercase
U+1E50	Ṑ	LATIN CAPITAL LETTER O WITH MACRON AND GRAVE	Accented Uppercase
U+1E51	ṑ	LATIN SMALL LETTER O WITH MACRON AND GRAVE	Accented Lowercase
U+1E52	Ṓ	LATIN CAPITAL LETTER O WITH MACRON AND ACUTE	Accented Uppercase
U+1E53	ṓ	LATIN SMALL LETTER O WITH MACRON AND ACUTE	Accented Lowercase
U+1E54	Ṕ	LATIN CAPITAL LETTER P WITH ACUTE	Accented Uppercase
U+1E55	ṕ	LATIN SMALL LETTER P WITH ACUTE	Accented Lowercase
U+1E56	Ṗ	LATIN CAPITAL LETTER P WITH DOT ABOVE	Accented Uppercase
U+1E57	ṗ	LATIN SMALL LETTER P WITH DOT ABOVE	Accented Lowercase
U+1E58	Ṙ	LATIN CAPITAL LETTER R WITH DOT ABOVE	Accented Uppercase
U+1E59	ṙ	LATIN SMALL LETTER R WITH DOT ABOVE	Accented Lowercase
U+1E5A	Ṛ	LATIN CAPITAL LETTER R WITH DOT BELOW	Accented Uppercase
U+1E5B	ṛ	LATIN SMALL LETTER R WITH DOT BELOW	Accented Lowercase
U+1E5C	Ṝ	LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON	Accented Uppercase
U+1E5D	ṝ	LATIN SMALL LETTER R WITH DOT BELOW AND MACRON	Accented Lowercase
U+1E5E	Ṟ	LATIN CAPITAL LETTER R WITH LINE BELOW	Accented Uppercase
U+1E5F	ṟ	LATIN SMALL LETTER R WITH LINE BELOW	Accented Lowercase
U+1E60	Ṡ	LATIN CAPITAL LETTER S WITH DOT ABOVE	Accented Uppercase
U+1E61	ṡ	LATIN SMALL LETTER S WITH DOT ABOVE	Accented Lowercase
U+1E62	Ṣ	LATIN CAPITAL LETTER S WITH DOT BELOW	Accented Uppercase
U+1E63	ṣ	LATIN SMALL LETTER S WITH DOT BELOW	Accented Lowercase
U+1E64	Ṥ	LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE	Accented Uppercase
U+1E65	ṥ	LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE	Accented Lowercase
U+1E66	Ṧ	LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE	Accented Uppercase
U+1E67	ṧ	LATIN SMALL LETTER S WITH CARON AND DOT ABOVE	Accented Lowercase
U+1E68	Ṩ	LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE	Accented Uppercase
U+1E69	ṩ	LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE	Accented Lowercase
U+1E6A	Ṫ	LATIN CAPITAL LETTER T WITH DOT ABOVE	Accented Uppercase
U+1E6B	ṫ	LATIN SMALL LETTER T WITH DOT ABOVE	Accented Lowercase
U+1E6C	Ṭ	LATIN CAPITAL LETTER T WITH DOT BELOW	Accented Uppercase
U+1E6D	ṭ	LATIN SMALL LETTER T WITH DOT BELOW	Accented Lowercase
U+1E6E	Ṯ	LATIN CAPITAL LETTER T WITH LINE BELOW	Accented Uppercase
U+1E6F	ṯ	LATIN SMALL LETTER T WITH LINE BELOW	Accented Lowercase
U+1E70	Ṱ	LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW	Accented Uppercase
U+1E71	ṱ	LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW	Accented Lowercase
U+1E72	Ṳ	LATIN CAPITAL LETTER U WITH DIAERESIS BELOW	Accented Uppercase
U+1E73	ṳ	LATIN SMALL LETTER U WITH DIAERESIS BELOW	Accented Lowercase
U+1E74	Ṵ	LATIN CAPITAL LETTER U WITH TILDE BELOW	Accented Uppercase
U+1E75	ṵ	LATIN SMALL LETTER U WITH TILDE BELOW	Accented Lowercase
U+1E76	Ṷ	LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW	Accented Uppercase
U+1E77	ṷ	LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW	Accented Lowercase
U+1E78	Ṹ	LATIN CAPITAL LETTER U WITH TILDE AND ACUTE	Accented Uppercase
U+1E79	ṹ	LATIN SMALL LETTER U WITH TILDE AND ACUTE	Accented Lowercase
U+1E7A	Ṻ	LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS	Accented Uppercase
U+1E7B	ṻ	LATIN SMALL LETTER U WITH MACRON AND DIAERESIS	Accented Lowercase
U+1E7C	Ṽ	LATIN CAPITAL LETTER V WITH TILDE	Accented Uppercase
U+1E7D	ṽ	LATIN SMALL LETTER V WITH TILDE	Accented Lowercase
U+1E7E	Ṿ	LATIN CAPITAL LETTER V WITH DOT BELOW	Accented Uppercase
U+1E7F	ṿ	LATIN SMALL LETTER V WITH DOT BELOW	Accented Lowercase

U+1E80–U+1EBF: Additional Accented Letters and Vietnamese-Specific

Code Point	Glyph	Name	Category
U+1E80	Ẁ	LATIN CAPITAL LETTER W WITH GRAVE	Accented Uppercase
U+1E81	ẁ	LATIN SMALL LETTER W WITH GRAVE	Accented Lowercase
U+1E82	Ẃ	LATIN CAPITAL LETTER W WITH ACUTE	Accented Uppercase
U+1E83	ẃ	LATIN SMALL LETTER W WITH ACUTE	Accented Lowercase
U+1E84	Ẅ	LATIN CAPITAL LETTER W WITH DIAERESIS	Accented Uppercase
U+1E85	ẅ	LATIN SMALL LETTER W WITH DIAERESIS	Accented Lowercase
U+1E86	Ẇ	LATIN CAPITAL LETTER W WITH DOT ABOVE	Accented Uppercase
U+1E87	ẇ	LATIN SMALL LETTER W WITH DOT ABOVE	Accented Lowercase
U+1E88	Ẉ	LATIN CAPITAL LETTER W WITH DOT BELOW	Accented Uppercase
U+1E89	ẉ	LATIN SMALL LETTER W WITH DOT BELOW	Accented Lowercase
U+1E8A	Ẋ	LATIN CAPITAL LETTER X WITH DOT ABOVE	Accented Uppercase
U+1E8B	ẋ	LATIN SMALL LETTER X WITH DOT ABOVE	Accented Lowercase
U+1E8C	Ẍ	LATIN CAPITAL LETTER X WITH DIAERESIS	Accented Uppercase
U+1E8D	ẍ	LATIN SMALL LETTER X WITH DIAERESIS	Accented Lowercase
U+1E8E	Ẏ	LATIN CAPITAL LETTER Y WITH DOT ABOVE	Accented Uppercase
U+1E8F	ẏ	LATIN SMALL LETTER Y WITH DOT ABOVE	Accented Lowercase
U+1E90	Ẑ	LATIN CAPITAL LETTER Z WITH CIRCUMFLEX	Accented Uppercase
U+1E91	ẑ	LATIN SMALL LETTER Z WITH CIRCUMFLEX	Accented Lowercase
U+1E92	Ẓ	LATIN CAPITAL LETTER Z WITH DOT BELOW	Accented Uppercase
U+1E93	ẓ	LATIN SMALL LETTER Z WITH DOT BELOW	Accented Lowercase
U+1E94	Ẕ	LATIN CAPITAL LETTER Z WITH LINE BELOW	Accented Uppercase
U+1E95	ẕ	LATIN SMALL LETTER Z WITH LINE BELOW	Accented Lowercase
U+1E96	ẖ	LATIN SMALL LETTER H WITH LINE BELOW	Legacy Variant
U+1E97	ẗ	LATIN SMALL LETTER T WITH DIAERESIS	Legacy Variant
U+1E98	ẘ	LATIN SMALL LETTER W WITH RING ABOVE	Legacy Variant
U+1E99	ẙ	LATIN SMALL LETTER Y WITH RING ABOVE	Legacy Variant
U+1E9A	ẚ	LATIN SMALL LETTER A WITH RIGHT HALF RING	Legacy Variant
U+1E9B	ẛ	LATIN SMALL LETTER LONG S WITH DOT ABOVE	Legacy Variant
U+1E9C	ẜ	LATIN SMALL LETTER LONG S WITH DIAGONAL STROKE	Medievalist
U+1E9D	ẝ	LATIN SMALL LETTER LONG S WITH HIGH STROKE	Medievalist
U+1E9E	ẞ	LATIN CAPITAL LETTER SHARP S	German Uppercase
U+1E9F	ẟ	LATIN SMALL LETTER DELTA	Medievalist
U+1EA0	Ạ	LATIN CAPITAL LETTER A WITH DOT BELOW	Vietnamese Uppercase
U+1EA1	ạ	LATIN SMALL LETTER A WITH DOT BELOW	Vietnamese Lowercase
U+1EA2	Ả	LATIN CAPITAL LETTER A WITH HOOK ABOVE	Vietnamese Uppercase
U+1EA3	ả	LATIN SMALL LETTER A WITH HOOK ABOVE	Vietnamese Lowercase
U+1EA4	Ấ	LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE	Vietnamese Uppercase
U+1EA5	ấ	LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE	Vietnamese Lowercase
U+1EA6	Ầ	LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE	Vietnamese Uppercase
U+1EA7	ầ	LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE	Vietnamese Lowercase
U+1EA8	Ẩ	LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE	Vietnamese Uppercase
U+1EA9	ẩ	LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE	Vietnamese Lowercase
U+1EAA	Ẫ	LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE	Vietnamese Uppercase
U+1EAB	ẫ	LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE	Vietnamese Lowercase
U+1EAC	Ậ	LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW	Vietnamese Uppercase
U+1EAD	ậ	LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW	Vietnamese Lowercase
U+1EAE	Ắ	LATIN CAPITAL LETTER A WITH BREVE AND ACUTE	Vietnamese Uppercase
U+1EAF	ắ	LATIN SMALL LETTER A WITH BREVE AND ACUTE	Vietnamese Lowercase
U+1EB0	Ằ	LATIN CAPITAL LETTER A WITH BREVE AND GRAVE	Vietnamese Uppercase
U+1EB1	ằ	LATIN SMALL LETTER A WITH BREVE AND GRAVE	Vietnamese Lowercase
U+1EB2	Ẳ	LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE	Vietnamese Uppercase
U+1EB3	ẳ	LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE	Vietnamese Lowercase
U+1EB4	Ẵ	LATIN CAPITAL LETTER A WITH BREVE AND TILDE	Vietnamese Uppercase
U+1EB5	ẵ	LATIN SMALL LETTER A WITH BREVE AND TILDE	Vietnamese Lowercase
U+1EB6	Ặ	LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW	Vietnamese Uppercase
U+1EB7	ặ	LATIN SMALL LETTER A WITH BREVE AND DOT BELOW	Vietnamese Lowercase
U+1EB8	Ẹ	LATIN CAPITAL LETTER E WITH DOT BELOW	Vietnamese Uppercase
U+1EB9	ẹ	LATIN SMALL LETTER E WITH DOT BELOW	Vietnamese Lowercase
U+1EBA	Ẻ	LATIN CAPITAL LETTER E WITH HOOK ABOVE	Vietnamese Uppercase
U+1EBB	ẻ	LATIN SMALL LETTER E WITH HOOK ABOVE	Vietnamese Lowercase
U+1EBC	Ẽ	LATIN CAPITAL LETTER E WITH TILDE	Vietnamese Uppercase
U+1EBD	ẽ	LATIN SMALL LETTER E WITH TILDE	Vietnamese Lowercase
U+1EBE	Ế	LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE	Vietnamese Uppercase
U+1EBF	ế	LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE	Vietnamese Lowercase

U+1EC0–U+1EFF: Vietnamese Tones and Legacy Forms

Code Point	Glyph	Name	Category
U+1EC0	Ề	LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE	Vietnamese Uppercase
U+1EC1	ề	LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE	Vietnamese Lowercase
U+1EC2	Ể	LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE	Vietnamese Uppercase
U+1EC3	ể	LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE	Vietnamese Lowercase
U+1EC4	Ễ	LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE	Vietnamese Uppercase
U+1EC5	ễ	LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE	Vietnamese Lowercase
U+1EC6	Ệ	LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW	Vietnamese Uppercase
U+1EC7	ệ	LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW	Vietnamese Lowercase
U+1EC8	Ỉ	LATIN CAPITAL LETTER I WITH HOOK ABOVE	Vietnamese Uppercase
U+1EC9	ỉ	LATIN SMALL LETTER I WITH HOOK ABOVE	Vietnamese Lowercase
U+1ECA	Ị	LATIN CAPITAL LETTER I WITH DOT BELOW	Vietnamese Uppercase
U+1ECB	ị	LATIN SMALL LETTER I WITH DOT BELOW	Vietnamese Lowercase
U+1ECC	Ọ	LATIN CAPITAL LETTER O WITH DOT BELOW	Vietnamese Uppercase
U+1ECD	ọ	LATIN SMALL LETTER O WITH DOT BELOW	Vietnamese Lowercase
U+1ECE	Ỏ	LATIN CAPITAL LETTER O WITH HOOK ABOVE	Vietnamese Uppercase
U+1ECF	ỏ	LATIN SMALL LETTER O WITH HOOK ABOVE	Vietnamese Lowercase
U+1ED0	Ố	LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE	Vietnamese Uppercase
U+1ED1	ố	LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE	Vietnamese Lowercase
U+1ED2	Ồ	LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE	Vietnamese Uppercase
U+1ED3	ồ	LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE	Vietnamese Lowercase
U+1ED4	Ổ	LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE	Vietnamese Uppercase
U+1ED5	ổ	LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE	Vietnamese Lowercase
U+1ED6	Ỗ	LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE	Vietnamese Uppercase
U+1ED7	ỗ	LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE	Vietnamese Lowercase
U+1ED8	Ộ	LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW	Vietnamese Uppercase
U+1ED9	ộ	LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW	Vietnamese Lowercase
U+1EDA	Ớ	LATIN CAPITAL LETTER O WITH HORN AND ACUTE	Vietnamese Uppercase
U+1EDB	ớ	LATIN SMALL LETTER O WITH HORN AND ACUTE	Vietnamese Lowercase
U+1EDC	Ờ	LATIN CAPITAL LETTER O WITH HORN AND GRAVE	Vietnamese Uppercase
U+1EDD	ờ	LATIN SMALL LETTER O WITH HORN AND GRAVE	Vietnamese Lowercase
U+1EDE	Ở	LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE	Vietnamese Uppercase
U+1EDF	ở	LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE	Vietnamese Lowercase
U+1EE0	Ỡ	LATIN CAPITAL LETTER O WITH HORN AND TILDE	Vietnamese Uppercase
U+1EE1	ỡ	LATIN SMALL LETTER O WITH HORN AND TILDE	Vietnamese Lowercase
U+1EE2	Ợ	LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW	Vietnamese Uppercase
U+1EE3	ợ	LATIN SMALL LETTER O WITH HORN AND DOT BELOW	Vietnamese Lowercase
U+1EE4	Ụ	LATIN CAPITAL LETTER U WITH DOT BELOW	Vietnamese Uppercase
U+1EE5	ụ	LATIN SMALL LETTER U WITH DOT BELOW	Vietnamese Lowercase
U+1EE6	Ủ	LATIN CAPITAL LETTER U WITH HOOK ABOVE	Vietnamese Uppercase
U+1EE7	ủ	LATIN SMALL LETTER U WITH HOOK ABOVE	Vietnamese Lowercase
U+1EE8	Ứ	LATIN CAPITAL LETTER U WITH HORN AND ACUTE	Vietnamese Uppercase
U+1EE9	ứ	LATIN SMALL LETTER U WITH HORN AND ACUTE	Vietnamese Lowercase
U+1EEA	Ừ	LATIN CAPITAL LETTER U WITH HORN AND GRAVE	Vietnamese Uppercase
U+1EEB	ừ	LATIN SMALL LETTER U WITH HORN AND GRAVE	Vietnamese Lowercase
U+1EEC	Ử	LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE	Vietnamese Uppercase
U+1EED	ử	LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE	Vietnamese Lowercase
U+1EEE	Ữ	LATIN CAPITAL LETTER U WITH HORN AND TILDE	Vietnamese Uppercase
U+1EEF	ữ	LATIN SMALL LETTER U WITH HORN AND TILDE	Vietnamese Lowercase
U+1EF0	Ự	LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW	Vietnamese Uppercase
U+1EF1	ự	LATIN SMALL LETTER U WITH HORN AND DOT BELOW	Vietnamese Lowercase
U+1EF2	Ỳ	LATIN CAPITAL LETTER Y WITH GRAVE	Vietnamese Uppercase
U+1EF3	ỳ	LATIN SMALL LETTER Y WITH GRAVE	Vietnamese Lowercase
U+1EF4	Ỵ	LATIN CAPITAL LETTER Y WITH DOT BELOW	Vietnamese Uppercase
U+1EF5	ỵ	LATIN SMALL LETTER Y WITH DOT BELOW	Vietnamese Lowercase
U+1EF6	Ỷ	LATIN CAPITAL LETTER Y WITH HOOK ABOVE	Vietnamese Uppercase
U+1EF7	ỷ	LATIN SMALL LETTER Y WITH HOOK ABOVE	Vietnamese Lowercase
U+1EF8	Ỹ	LATIN CAPITAL LETTER Y WITH TILDE	Vietnamese Uppercase
U+1EF9	ỹ	LATIN SMALL LETTER Y WITH TILDE	Vietnamese Lowercase
U+1EFA	Ỻ	LATIN CAPITAL LETTER MIDDLE-WELSH LL	Medievalist Uppercase
U+1EFB	ỻ	LATIN SMALL LETTER MIDDLE-WELSH LL	Medievalist Lowercase
U+1EFC	Ỽ	LATIN CAPITAL LETTER MIDDLE-WELSH V	Medievalist Uppercase
U+1EFD	ỽ	LATIN SMALL LETTER MIDDLE-WELSH V	Medievalist Lowercase
U+1EFE	Ỿ	LATIN CAPITAL LETTER Y WITH LOOP	Medievalist Uppercase
U+1EFF	ỿ	LATIN SMALL LETTER Y WITH LOOP	Medievalist Lowercase

Note: Every code point in the block is assigned. The last positions to be filled, U+1E9C–U+1E9F and U+1EFA–U+1EFF, were added in Unicode 5.1.^[10] All data is derived directly from the Unicode Standard, Version 17.0 (released September 2025). For subranges with multiple similar forms, refer to the official chart for visual precision.^[1]
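The rows of these tables can be cross-checked against a local copy of the Unicode Character Database. The following is a minimal sketch in Python, assuming the standard-library `unicodedata` module, whose data version depends on the Python release and may lag behind Version 17.0. The helper name `block_rows` is illustrative only.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x1E00, 0x1EFF  # Latin Extended Additional

def block_rows(start=BLOCK_START, end=BLOCK_END):
    """Yield (code point label, glyph, name) for every named code point."""
    for cp in range(start, end + 1):
        ch = chr(cp)
        # name() returns the supplied default (None here) when the local
        # database has no name for the code point.
        name = unicodedata.name(ch, None)
        if name is not None:
            yield f"U+{cp:04X}", ch, name

rows = list(block_rows())
print("Unicode data version:", unicodedata.unidata_version)
print("Named code points in block:", len(rows))
for label, glyph, name in rows[-6:]:
    print(label, glyph, name, sep="\t")
```

Printing the tab-separated tuples gives output in the same column order as the tables above, which makes a line-by-line comparison straightforward.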

Compact Listing

The Latin Extended Additional block (U+1E00–U+1EFF) provides 256 code points for extended Latin characters, primarily precomposed forms with diacritics for linguistic and typographic needs. Most decompose canonically to a base letter plus one or more combining marks (e.g., U+1E02 Ḃ decomposes to B + COMBINING DOT ABOVE). This compact listing groups the characters by diacritic and, within each group, alphabetically by base letter for quick reference. All 256 code points are assigned, and each group heading gives a character count to highlight distribution; a character that carries two marks is counted under each. The listing serves developers and typographers working on font implementation, normalization, and collation. Only U+1E9A ẚ has a compatibility decomposition of its own (a + U+02BE MODIFIER LETTER RIGHT HALF RING), so the compatibility forms NFKC and NFKD replace it.^[1]^[27] Each table shows the code point, glyph, and an abbreviated name derived from the official nomenclature, with uppercase and lowercase pairs listed together.^[1]
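The difference between canonical and compatibility decomposition can be inspected directly. The sketch below assumes Python's standard `unicodedata` module; `describe` is a hypothetical helper that returns the raw decomposition mapping alongside the NFD and NFKC forms of a character.

```python
import unicodedata

def describe(ch):
    """Return (raw decomposition mapping, NFD code points, NFKC code points)."""
    mapping = unicodedata.decomposition(ch)  # '' when there is no mapping
    nfd = [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]
    nfkc = [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKC", ch)]
    return mapping, nfd, nfkc

# U+1E02: canonical mapping; U+1E9A: <compat> mapping; U+1E9E: no mapping.
for ch in ("\u1E02", "\u1E9A", "\u1E9E"):
    mapping, nfd, nfkc = describe(ch)
    print(f"U+{ord(ch):04X}", repr(mapping), nfd, nfkc)
```

A mapping that begins with a tag such as `<compat>` is a compatibility mapping and is applied only by NFKC and NFKD; untagged mappings are canonical and are applied by all four normalization forms.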

Dot Above (33 characters: 16 uppercase, 17 lowercase)

These include letters like B, D, F, etc., with a dot above, used in Irish and other orthographies. The count includes three S pairs that carry a second mark, and the unpaired lowercase U+1E9B ẛ.

Code Point	Glyph	Short Name
1E02	Ḃ	B DOT ABOVE
1E03	ḃ	b DOT ABOVE
1E0A	Ḋ	D DOT ABOVE
1E0B	ḋ	d DOT ABOVE
1E1E	Ḟ	F DOT ABOVE
1E1F	ḟ	f DOT ABOVE
...	...	(13 more pairs, plus 1E9B ẛ)

Decompositions: e.g., Ḃ = B + U+0307 COMBINING DOT ABOVE.^[1]
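Because membership in a diacritic group follows from the canonical decomposition, a group can be rebuilt by searching each character's NFD form for the relevant combining mark. This is a sketch under the assumption that Python's `unicodedata` reflects the current block contents; `block_chars_with_mark` is an illustrative name, not part of any library.

```python
import unicodedata

def block_chars_with_mark(mark, start=0x1E00, end=0x1EFF):
    """Return block characters whose full canonical decomposition contains `mark`."""
    return [
        chr(cp)
        for cp in range(start, end + 1)
        if mark in unicodedata.normalize("NFD", chr(cp))
    ]

dot_above = block_chars_with_mark("\u0307")  # COMBINING DOT ABOVE
macron = block_chars_with_mark("\u0304")     # COMBINING MACRON
print(len(dot_above), "".join(dot_above))
print(len(macron), "".join(macron))
```

Matching on the decomposition rather than on the character name also catches characters whose base is itself decomposable, such as ẛ (long s plus dot above).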

Dot Below (56 characters: 28 uppercase, 28 lowercase)

Common in Vietnamese, African languages, and Indic transliteration, applied to vowels and consonants like A, B, D.

Code Point	Glyph	Short Name
1EA0	Ạ	A DOT BELOW
1EA1	ạ	a DOT BELOW
1EAC	Ậ	A CIRCUMFLEX DOT BELOW
1EAD	ậ	a CIRCUMFLEX DOT BELOW
1E0C	Ḍ	D DOT BELOW
1E0D	ḍ	d DOT BELOW
...	...	(25 more pairs)

Decompositions: e.g., Ḍ = D + U+0323 COMBINING DOT BELOW.^[1]

Macron (16 characters: 8 uppercase, 8 lowercase)

For length marking, chiefly in transliteration and scholarly transcription, on letters like E, G, O.

Code Point	Glyph	Short Name
1E14	Ḕ	E MACRON GRAVE
1E15	ḕ	e MACRON GRAVE
1E20	Ḡ	G MACRON
1E21	ḡ	g MACRON
1E52	Ṓ	O MACRON ACUTE
1E53	ṓ	o MACRON ACUTE
...	...	(5 more pairs)

Decompositions: e.g., Ḡ = G + U+0304 COMBINING MACRON.^[1]

Acute Accent (34 characters: 17 uppercase, 17 lowercase, often combined)

Seen on Vietnamese tone-marked vowels and on consonants, e.g., K, M, P.

Code Point	Glyph	Short Name
1E30	Ḱ	K ACUTE
1E31	ḱ	k ACUTE
1E3E	Ḿ	M ACUTE
1E3F	ḿ	m ACUTE
1E54	Ṕ	P ACUTE
1E55	ṕ	p ACUTE
...	...	(14 more pairs)

Decompositions: e.g., Ḱ = K + U+0301 COMBINING ACUTE ACCENT.^[1]

Tilde (30 characters: 15 uppercase, 15 lowercase, including tilde below)

Marks a tone in Vietnamese and nasalization in other orthographies, on A, E, O, etc.

Code Point	Glyph	Short Name
1E4C	Ṍ	O TILDE ACUTE
1E4D	ṍ	o TILDE ACUTE
1E74	Ṵ	U TILDE BELOW
1E75	ṵ	u TILDE BELOW
1E7C	Ṽ	V TILDE
1E7D	ṽ	v TILDE
...	...	(12 more pairs)

Decompositions: e.g., Ṽ = V + U+0303 COMBINING TILDE.^[1]

Other Diacritics (e.g., Circumflex, Hook Above, Ring Below)

Includes diverse forms such as ring below (2), line below (17), circumflex below (12), and the Vietnamese hook above (24) and horn (20) forms. Examples:

Code Point	Glyph	Short Name
1E00	Ḁ	A RING BELOW
1E01	ḁ	a RING BELOW
1E12	Ḓ	D CIRCUMFLEX BELOW
1E13	ḓ	d CIRCUMFLEX BELOW
1EA6	Ầ	A CIRCUMFLEX GRAVE
1EA7	ầ	a CIRCUMFLEX GRAVE
1ED0	Ố	O CIRCUMFLEX ACUTE
1ED1	ố	o CIRCUMFLEX ACUTE
...	...	(further entries omitted)

Decompositions: e.g., Ḁ = A + U+0325 COMBINING RING BELOW; many Vietnamese forms combine multiple marks (e.g., Ầ = A + U+0302 + U+0300).^[1] The block has no unassigned code points: U+1E9C–U+1E9F and U+1EFA–U+1EFF hold medievalist letters and the capital sharp S, none of which has a canonical decomposition. For full visual charts and exact decompositions, refer to the official Unicode data files.^[27]
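For characters carrying two marks, normalization also fixes the order of the combining marks according to their canonical combining classes. A short sketch, again assuming Python's standard `unicodedata` module (the helper `codepoints` is illustrative), shows that U+1EAC Ậ decomposes with the dot below ahead of the circumflex, and that a sequence typed in the opposite order still composes to the same precomposed character.

```python
import unicodedata

def codepoints(s):
    """Format a string as a list of U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in s]

precomposed = "\u1EAC"  # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
decomposed = unicodedata.normalize("NFD", precomposed)
print(codepoints(decomposed))
# Combining classes: 0 for the base letter, then the class of each mark.
print([unicodedata.combining(c) for c in decomposed])

# Marks entered as circumflex first, dot below second.
typed = "A\u0302\u0323"
print(unicodedata.normalize("NFC", typed) == precomposed)

# Two marks above the letter keep their relative order: U+1EA6.
print(codepoints(unicodedata.normalize("NFD", "\u1EA6")))
```

Marks placed below the letter have a lower combining class than marks above it, so they sort first in the decomposed form. Comparing strings only after normalizing both sides to the same form avoids mismatches between precomposed and decomposed Vietnamese text.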

References

[1]
[PDF] Latin Extended Additional - The Unicode Standard, Version 17.0
204. 1EFF. Latin Extended Additional. 1E00. 1E0 1E1 1E2 1E3 1E4 1E5 1E6 1E7 1E8 1E9 1EA 1EB 1EC 1ED 1EE 1EF. Ḁ ḁ. Ḃ ḃ. Ḅ ḅ. Ḇ ḇ. Ḉ ḉ. Ḋ ḋ. Ḍ ḍ. Ḏ ḏ. Ḑ ḑ. Ḓ ḓ. Ḕ.
[2]
Chapter 7 – Unicode 17.0.0
7 Latin Extended Additional: U+1E00–U+1EFF. The characters in this block are mostly precomposed combinations of Latin letters with one or more general ...
[3]
[PDF] Block Names - Unicode
The Unicode Standard, Version 1.1. Appendix E Block Names. Start Stop ... LATIN EXTENDED ADDITIONAL. 1E00 1EFF. 1F00 1FFF. GREEK EXTENDED. 2000 206F. 2070 ...
[4]
Unicode 17.0 Character Code Charts
Latin Extended Additional · Latin Ligatures · Fullwidth Latin Letters · IPA Extensions · Phonetic Extensions · Phonetic Extensions Supplement · Linear A. Linear ...
[5]
Latin Extended Additional - Unicode
Latin Extended Additional ; Latin general use extensions ; 1E00, Ḁ, Latin Capital Letter A With Ring Below ; ≡ ; ↓ ; 1E01, ḁ, Latin Small Letter A With Ring Below.
[6]
Unicode 17.0.0
Sep 9, 2025 · This page summarizes the important changes for the Unicode Standard, Version 17.0.0. This version supersedes all previous versions of the Unicode Standard.
[7]
Unicode Blocks - Compart
List of Blocks ; U+0000 - U+007F. Basic Latin ; U+0080 - U+00FF. Latin-1 Supplement ; U+0100 - U+017F. Latin Extended-A ; U+0180 - U+024F. Latin Extended-B ; U+0250 ...
[8]
Unicode & Existing Vietnamese Character Encodings
The Vietnamese alphabets are listed in several noncontiguous Unicode ranges: Basic Latin {U+0000..U+007F}, Latin-1 Supplement {U+0080..U+00FF}, Latin Extended-A ...
[9]
Vietnamese Writing System - CJVLang
Modern Vietnamese is written with the Latin alphabet, known as quoc ngu (quốc ngữ) in Vietnamese. Quoc ngu consists of 29 letters.
[10]
https://www.unicode.org/versions/Unicode5.1.0/
[11]
[PDF] ISO/IEC JTC1/SC2/WG2 N3027 L2/06-027 - Unicode
Jan 30, 2006 · Accurate transcriptions of medieval texts allow scholars to quote medieval texts without distorting their graphemic content, and allow the ...
[12]
https://mufi.info/
[13]
https://mufi.info/q.php?p=mufi/chars/unichar/7934
[14]
None
[15]
https://www.unicode.org/versions/Unicode2.0.0/
[16]
ẛ U+1E9B LATIN SMALL LETTER LONG S ... - Unicode Explorer
ẛ U+1E9B LATIN SMALL LETTER LONG S WITH DOT ABOVE, copy and paste, unicode character symbol info, in current use in Gaelic types (as glyph variant of 1E61)
[17]
[PDF] ISO/IEC JTC1/SC2/WG2 N2957 L2/05-183 - Unicode
Aug 2, 2005 · Accurate transcriptions of medieval texts allow scholars to quote medieval texts without distorting their graphemic content, and allow the ...
[18]
[PDF] L2/07-157 - Unicode
we are in favour of supporting the assignment of a codepoint for the letter “Capital Sharp-S”. (Latin Capital Letter Sharp S) to the character standard ISO ...
[19]
Registered features, a-e (OpenType 1.9.1) - Typography
Jul 6, 2024 · Tag: 'ccmp' This feature permits such composition/decomposition. The feature should be processed as the first feature processed, and should be ...
[20]
Noto Sans - Font Families - openSUSE
ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει. ...
[21]
Font Support for Unicode Block 'Latin Extended Additional'
This is a list of fonts that support characters in the Latin Extended Additional Unicode block. Detail. Font, Support. 100%, Arial · 100% (256 of 256).
[22]
https://www.unicode.org/reports/tr15/
[23]
Gaelic Keyboards for MS-Windows
A number of free layouts exist for keyboarding Gaelic. They are summarised in the following table, and individually described afterwards.
[24]
Vietnamese Unicode FAQs
They typically support the three most common Vietnamese input methods: Telex, VNI, and VIQR. For them to work, the applications that they are to be used ...
[25]
Code Page Identifiers - Win32 apps - Microsoft Learn
Jan 7, 2021 · For the most consistent results, applications should use Unicode, such as UTF-8 or UTF-16, instead of a specific code page. Expand table ...
[26]
Supported Encodings
String classes, and classes in the java.nio.charset package can convert between Unicode and a number of other character encodings. The supported encodings vary ...
[27]
Latin Extended Additional - Unicode
Latin Extended Additional, 1EFF. Ḁ 1E00, ḁ. 1E01, Ḃ 1E02, ḃ. 1E03, Ḅ 1E04, ḅ. 1E05, Ḇ 1E06, ḇ. 1E07, Ḉ 1E08, ḉ. 1E09, Ḋ 1E0A, ḋ. 1E0B, Ḍ 1E0C, ḍ. 1E0D, Ḏ 1E0E ...