Latin Extended-G
Latin Extended-G is a block of the Unicode Standard located in the Supplementary Multilingual Plane (Plane 1). It provides additional Latin-script characters for advanced phonetic transcription and specialized linguistic notation.[1] The block spans the code point range U+1DF00 to U+1DFFF, 256 positions in total.[1] It was introduced in Unicode 14.0, released in September 2021, with 31 characters focused on extensions to the International Phonetic Alphabet (IPA) for representing disordered speech, such as atypical articulations in clinical linguistics.[2] Unicode 15.0 (September 2022) added 6 characters for Malayalam transliteration. As of Unicode 17.0 (September 2025), 37 characters are assigned, covering IPA extensions, clicks, laterals, letters with palatal or retroflex hooks, and letters for Malayalam transliteration.[1] Examples include 𝼀 (U+1DF00, Latin small letter feng digraph with trill) for disordered speech trills and 𝼝 (U+1DF1D, Latin small letter c with retroflex hook) for retroflex consonants in phonetic analysis.[1] These characters support precise documentation in phonetics, speech therapy, and dialectology, and fill gaps in earlier Latin extension blocks by encoding rare modified letters and digraphs that cannot be composed from combining marks.[1] Pending proposals would add further characters to the block, such as modifier letters for implosives and symbols from historical alphabets like the Initial Teaching Alphabet.[3][4]
Overview
Description
Latin Extended-G is a Unicode block that provides additional Latin characters, primarily for phonetic transcription systems. It occupies the code point range U+1DF00 to U+1DFFF in the Supplementary Multilingual Plane (SMP), making it one of the first Latin-script blocks encoded outside the Basic Multilingual Plane (BMP).[1] The block supports specialized phonetic notations, including symbols for transcribing African languages, click consonants, extensions to the International Phonetic Alphabet (IPA) for disordered speech, and characters for Malayalam transliteration. These characters allow precise representation of sounds that earlier Latin blocks and combining diacritics do not adequately cover.[1] For linguistic and academic transcription, the benefit is a set of standalone precomposed characters that improve compatibility and readability in digital text without relying on complex combining sequences. Together with the Latin Extended-F block, it contains the first Latin characters encoded in the SMP. As of Unicode 17.0, the block allocates 256 code points, of which 37 are assigned.[1][5]
Unicode Block Details
The Latin Extended-G block is allocated the code point range U+1DF00 to U+1DFFF, 256 consecutive positions in the Unicode Standard.[1] This range lies in the Supplementary Multilingual Plane (SMP), designated Plane 1, which follows the first 65,536 code points that make up the Basic Multilingual Plane (BMP).[6] The block is categorized under the Latin script and extends the earlier Latin blocks, most of which are located in the BMP.[1] As of Unicode version 17.0, 37 characters in the block have been assigned, leaving 219 positions unassigned and reserved for potential future allocations.[1] The official Unicode block name is "Latin Extended-G," with no widely recognized aliases in use.[6] Earlier Latin Extended blocks such as Latin Extended-A (U+0100–U+017F) and Latin Extended-D (U+A720–U+A7FF) reside in the BMP, so each of their characters fits in a single UTF-16 code unit. Latin Extended-G, like Latin Extended-F, lies above U+FFFF, so each of its characters is encoded as a surrogate pair in UTF-16, as four bytes in UTF-8, or as a single code unit in UTF-32. This placement reflects the growing need for Latin-based character sets in the supplementary planes.[1]
Characters
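Because every character in this block lies above U+FFFF, it can help to see how one of its code points is split into a UTF-16 surrogate pair. The following is a minimal sketch of the standard UTF-16 arithmetic; the function name `utf16_surrogates` is illustrative and not part of any library.

```python
def utf16_surrogates(cp: int) -> tuple:
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = cp - 0x10000              # 20-bit value to distribute
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> low surrogate
    return high, low


first = 0x1DF00  # first code point of Latin Extended-G
high, low = utf16_surrogates(first)
print(f"U+{first:04X} -> UTF-16 {high:04X} {low:04X}")

# Cross-check against Python's built-in codecs.
print(chr(first).encode("utf-16-be").hex(" "))  # d8 37 df 00
print(chr(first).encode("utf-8").hex(" "))      # f0 9d bc 80
```

The built-in codecs produce the same bytes as the manual calculation, so in practice the hand-written function is only needed for illustration.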
Assigned Characters
The Latin Extended-G block assigns 37 characters within the range U+1DF00–U+1DFFF; the code points U+1DF1F–U+1DF24 and U+1DF2B–U+1DFFF remain unassigned.[1] The characters consist largely of small letters with distinctive modifications such as hooks, curls, belts, and reversals, designed for precise phonetic notation in linguistic applications.[7] The primary sub-range U+1DF00–U+1DF1E holds 31 characters, grouped by phonetic function: extended IPA for disordered speech (U+1DF00–U+1DF07), IPA extensions including retroflex and click notations (U+1DF08–U+1DF10), laterals (U+1DF11), letters with palatal hook (U+1DF12–U+1DF18), letters with retroflex hook (U+1DF19–U+1DF1D), and one further IPA extension (U+1DF1E). A secondary sub-range U+1DF25–U+1DF2A holds six letters with mid-height left hooks for Malayalam transliteration. The complete list follows, with brief descriptions of glyph appearance and primary phonetic role based on their official categories.[7]

| Code Point | Glyph | Name | Description |
|---|---|---|---|
| U+1DF00 | 𝼀 | LATIN SMALL LETTER FENG DIGRAPH WITH TRILL | A ligature-like small letter combining f and ng with a trill mark; used in extended IPA for disordered speech to denote fricative trills.[7] |
| U+1DF01 | 𝼁 | LATIN SMALL LETTER REVERSED SCRIPT G | A small reversed script-style g; represents specific velar or uvular articulations in disordered speech per extended IPA.[7] |
| U+1DF02 | 𝼂 | LATIN LETTER SMALL CAPITAL TURNED G | A small capital turned g; denotes backed or uvular fricatives in extended IPA for disordered speech.[7] |
| U+1DF03 | 𝼃 | LATIN SMALL LETTER REVERSED K | A small reversed k; used for retroflex or uvular stops in extended IPA disordered speech notation.[7] |
| U+1DF04 | 𝼄 | LATIN LETTER SMALL CAPITAL L WITH BELT | A small capital l crossed by a horizontal belt; indicates lateral fricatives in extended IPA for disordered speech.[7] |
| U+1DF05 | 𝼅 | LATIN SMALL LETTER LEZH WITH RETROFLEX HOOK | A small ɮ (lezh) with a retroflex hook; represents a voiced retroflex lateral fricative in extended IPA.[7] |
| U+1DF06 | 𝼆 | LATIN SMALL LETTER TURNED Y WITH BELT | A small turned y crossed by a belt; used for lateral approximants in extended IPA disordered speech.[7] |
| U+1DF07 | 𝼇 | LATIN SMALL LETTER REVERSED ENG | A small reversed ŋ (eng); denotes nasal sounds in extended IPA for disordered speech.[7] |
| U+1DF08 | 𝼈 | LATIN SMALL LETTER TURNED R WITH LONG LEG AND RETROFLEX HOOK | A small turned r with extended leg and retroflex hook; for retroflex rhotics in IPA extensions.[7] |
| U+1DF09 | 𝼉 | LATIN SMALL LETTER T WITH HOOK AND RETROFLEX HOOK | A small t with hook (ƭ) carrying an additional retroflex hook; represents a voiceless retroflex implosive in IPA extensions.[7] |
| U+1DF0A | 𝼊 | LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK | A small retroflex click symbol with hook; used for retroflex click consonants.[7] |
| U+1DF0B | 𝼋 | LATIN SMALL LETTER ESH WITH DOUBLE BAR | A small ʃ (esh) with double vertical bar; denotes fricative clicks.[7] |
| U+1DF0C | 𝼌 | LATIN SMALL LETTER ESH WITH DOUBLE BAR AND CURL | A small ʃ with double bar and rightward curl; for ejective or delayed release fricatives in clicks.[7] |
| U+1DF0D | 𝼍 | LATIN SMALL LETTER TURNED T WITH CURL | A small turned t with curl; represents alveolar clicks.[7] |
| U+1DF0E | 𝼎 | LATIN LETTER INVERTED GLOTTAL STOP WITH CURL | An inverted small glottal stop with curl; used for glottalized clicks.[7] |
| U+1DF0F | 𝼏 | LATIN LETTER STRETCHED C WITH CURL | A stretched small c with curl; denotes palatal or alveolar lateral clicks.[7] |
| U+1DF10 | 𝼐 | LATIN LETTER SMALL CAPITAL TURNED K | A small capital turned k; for voiceless lateral clicks.[7] |
| U+1DF11 | 𝼑 | LATIN SMALL LETTER L WITH FISHHOOK | A small l with a fishhook curl; represents alveolar lateral approximant.[7] |
| U+1DF12 | 𝼒 | LATIN SMALL LETTER DEZH DIGRAPH WITH PALATAL HOOK | A small dʒ (dezh) digraph with palatal hook; for palatalized postalveolar affricates.[7] |
| U+1DF13 | 𝼓 | LATIN SMALL LETTER L WITH BELT AND PALATAL HOOK | A small l with belt and palatal hook; denotes palatalized lateral fricatives.[7] |
| U+1DF14 | 𝼔 | LATIN SMALL LETTER ENG WITH PALATAL HOOK | A small ŋ with palatal hook; for palatalized velar nasals.[7] |
| U+1DF15 | 𝼕 | LATIN SMALL LETTER TURNED R WITH PALATAL HOOK | A small turned r with palatal hook; represents palatalized rhotics.[7] |
| U+1DF16 | 𝼖 | LATIN SMALL LETTER R WITH FISHHOOK AND PALATAL HOOK | A small r with fishhook and palatal hook; for palatalized alveolar trills.[7] |
| U+1DF17 | 𝼗 | LATIN SMALL LETTER TESH DIGRAPH WITH PALATAL HOOK | A small tʃ (tesh) digraph with palatal hook; denotes palatalized postalveolar affricates.[7] |
| U+1DF18 | 𝼘 | LATIN SMALL LETTER EZH WITH PALATAL HOOK | A small ʒ with palatal hook; for palatalized postalveolar fricatives.[7] |
| U+1DF19 | 𝼙 | LATIN SMALL LETTER DEZH DIGRAPH WITH RETROFLEX HOOK | A small dʒ with retroflex hook; represents retroflex postalveolar affricates.[7] |
| U+1DF1A | 𝼚 | LATIN SMALL LETTER I WITH STROKE AND RETROFLEX HOOK | A small i with stroke and retroflex hook; for retroflex vowels or approximants.[7] |
| U+1DF1B | 𝼛 | LATIN SMALL LETTER O WITH RETROFLEX HOOK | A small o with retroflex hook; denotes retroflex rounded vowels.[7] |
| U+1DF1C | 𝼜 | LATIN SMALL LETTER TESH DIGRAPH WITH RETROFLEX HOOK | A small tʃ with retroflex hook; for retroflex postalveolar affricates.[7] |
| U+1DF1D | 𝼝 | LATIN SMALL LETTER C WITH RETROFLEX HOOK | A small c with retroflex hook; represents retroflex alveolo-palatal fricatives as an IPA extension.[7] |
| U+1DF1E | 𝼞 | LATIN SMALL LETTER S WITH CURL | A small s with rightward curl; used for a sibilant fricative as an IPA extension.[7] |
| U+1DF25 | 𝼥 | LATIN SMALL LETTER D WITH MID-HEIGHT LEFT HOOK | A small d with mid-height left-pointing hook; for voiced dental stops in Malayalam transliteration.[7] |
| U+1DF26 | 𝼦 | LATIN SMALL LETTER L WITH MID-HEIGHT LEFT HOOK | A small l with mid-height left hook; represents retroflex laterals in Malayalam transliteration.[7] |
| U+1DF27 | 𝼧 | LATIN SMALL LETTER N WITH MID-HEIGHT LEFT HOOK | A small n with mid-height left hook; for retroflex nasals in Malayalam transliteration.[7] |
| U+1DF28 | 𝼨 | LATIN SMALL LETTER R WITH MID-HEIGHT LEFT HOOK | A small r with mid-height left hook; denotes retroflex flaps in Malayalam transliteration.[7] |
| U+1DF29 | 𝼩 | LATIN SMALL LETTER S WITH MID-HEIGHT LEFT HOOK | A small s with mid-height left hook; for retroflex sibilants in Malayalam transliteration.[7] |
| U+1DF2A | 𝼪 | LATIN SMALL LETTER T WITH MID-HEIGHT LEFT HOOK | A small t with mid-height left hook; represents retroflex stops in Malayalam transliteration.[7] |
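
The counts quoted for the block can be reproduced from the two assigned sub-ranges in the table. The sketch below assumes only the range boundaries stated in this article; the names `ASSIGNED_RANGES`, `in_block`, and `is_assigned` are hypothetical helpers, not part of any Unicode library.

```python
BLOCK_START, BLOCK_END = 0x1DF00, 0x1DFFF

# Assigned sub-ranges as listed in the table (inclusive bounds).
ASSIGNED_RANGES = [
    (0x1DF00, 0x1DF1E),  # phonetic letters added in Unicode 14.0
    (0x1DF25, 0x1DF2A),  # mid-height left hook letters added in Unicode 15.0
]


def in_block(cp: int) -> bool:
    """True if the code point falls inside the Latin Extended-G block."""
    return BLOCK_START <= cp <= BLOCK_END


def is_assigned(cp: int) -> bool:
    """True if the code point is in one of the assigned sub-ranges."""
    return any(lo <= cp <= hi for lo, hi in ASSIGNED_RANGES)


block_size = BLOCK_END - BLOCK_START + 1
assigned = sum(hi - lo + 1 for lo, hi in ASSIGNED_RANGES)
unassigned = block_size - assigned

print(block_size, assigned, unassigned)  # 256 37 219
print(is_assigned(0x1DF1F))              # False: first code point of the gap
```

A range table like this stays valid only for the Unicode version it was written against; later versions may assign more of the reserved positions.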