Old Turkic
Old Turkic is the earliest attested stage of the Turkic languages, a family spoken across Eurasia. It is documented from the 6th to the 13th centuries CE, primarily in Central Asia, Mongolia, and Siberia.[1] It encompasses dialects such as East Old Turkic (including Orkhon Turkic), Old Uyghur, and Karakhanid, and represents a relatively uniform early form before significant divergence into modern branches like Oghuz, Kipchak, and Uyghur.[2] The language is preserved in a diverse corpus of approximately 700 runiform inscriptions from the 7th to 10th centuries, extensive Uyghur manuscripts from the 9th century onward, and Karakhanid literary texts from the 11th to 12th centuries.[1][3]

The historical context of Old Turkic is tied to the political and cultural expansions of Turkic-speaking peoples, beginning with the Göktürk Khaganates in the 6th century and extending through the Uyghur Khaganate and the Karakhanid Khanate.[1] The oldest documents, such as the Orkhon inscriptions erected in the early 8th century in Mongolia's Orkhon Valley, commemorate rulers like Bilge Khagan and Kül Tigin, providing insights into governance, warfare, and shamanistic beliefs.[4] Later texts reflect religious influences, including Manichaean, Buddhist, and Christian manuscripts from the Uyghur period in the Tarim Basin, as well as Islamic works like the Qutadghu Bilig under the Karakhanids.[2] Decipherment of these sources began in the late 19th century, when Vilhelm Thomsen deciphered the runiform script in 1893.[1]

Linguistically, Old Turkic is agglutinative with vowel harmony, featuring a nine-vowel system and distinctions in consonants like /p/ versus /b/ or /v/.[1] It employs a rich morphology, including up to 12 nominal cases (e.g., nominative, genitive, dative, ablative), possessive suffixes, and complex verb forms such as preterite (-dI), aorist (-Ur), and converbs for subordinate clauses.[1] Syntax favors topic-comment structure with flexible word order, relative clauses marked by participles like -gU, and evidential markers in the perfect tense (-mIš).[1]

The scripts evolved from the indigenous runiform script of the early inscriptions (runic-like, possibly influenced by tribal marks or of Semitic origin) to borrowed systems: the Sogdian-derived Uyghur script for manuscripts, the Manichaean and Syriac scripts, and eventually the Arabic script for Karakhanid texts, each adapted to phonetic needs, for example with diacritics for vowels.[1]

Old Turkic holds foundational significance for Turkic linguistics, enabling reconstructions of Proto-Turkic and tracing evolutions in phonology, lexicon, and syntax across the family.[1] It reveals early contacts with Iranian (Sogdian), Chinese, and Tocharian languages, which influenced vocabulary and cultural exchanges during the steppe empires and the Mongol era.[2] Studies of its corpus continue to inform historical, anthropological, and comparative research on Turkic identity and migrations.[1]

History
Origins and Periodization
Old Turkic, the earliest attested stage of the Turkic language family, evolved from Proto-Turkic, with its roots in the Central Asian and northern Eurasian steppes, particularly around the Altai Mountains and Mongolia.[1] The language's origins are linked to the emergence of early Turkic-speaking nomadic groups, with initial evidence appearing in the 6th century CE through contacts with neighboring languages such as Sogdian and other Iranian languages, Mongolic, Chinese, and Tocharian, resulting in loanwords and calques.[1] Phonological characteristics, including the retention of /h/ and /ñ/ in some dialects (e.g., Khalaj) and the absence of initial /p/ or /š/ in native vocabulary (possibly evolving from *h-), distinguish it from later stages and reflect its Proto-Turkic heritage.[1] Scholarly consensus places the formative period in the 5th–6th centuries CE, coinciding with the establishment of the First Turkic Khaganate around 552 CE, though direct attestation begins slightly later.[5]

The periodization of Old Turkic generally spans from the 6th to the 13th centuries CE, marking the transition from pre-Islamic nomadic societies to more settled, literate communities influenced by Buddhism, Manichaeism, and eventually Islam.[1] This era ends with the Mongol invasions of the 13th century, which disrupted Turkic polities and led to the emergence of Middle Turkic varieties.[1] Within this framework, scholars like Marcel Erdal subdivide Old Turkic into three phases: the early Orkhon Turkic (6th–8th centuries), characterized by runic inscriptions from the Göktürk period; the middle Uyghur Turkic (9th–13th centuries), featuring diverse manuscript traditions after the Uyghurs' migration to the Tarim Basin in 840 CE; and the late Karakhanid Turkic (11th–13th centuries), represented by Arabic-script literary works blending eastern and western dialects.[1] Alternative classifications, such as that of N. A. Baskakov, align the Old Turkic phase with the 5th–10th centuries, emphasizing its pre-Islamic normativity across Köktürk, Uyghur, and early Kyrgyz dialects, while extending some features into the 13th century before the Mongol impact.[6]

Attestation of Old Turkic primarily derives from epigraphic and manuscript sources, beginning with the Sogdian-language Bugut inscription (ca. 580 CE) containing Turkic phrases and culminating in the Orkhon-Yenisei runic monuments of the 8th century, such as the Kül Tigin (732 CE) and Bilge Khagan (735 CE) inscriptions, which represent the oldest extensive readable texts.[1] Over 200 runic inscriptions from Mongolia, South Siberia, and the Yenisei region document official and funerary language. Uyghur-era materials from the 9th century onward, including Buddhist, Manichaean, and Nestorian Christian texts in scripts like Sogdian-derived Uyghur and Brahmi-based systems, provide the bulk of the corpus, with key examples like the Irk Bitig in runic script and the Maitrisimit nom bitig in the Sogdian-derived Uyghur script.[1] Later Karakhanid works, such as Mahmud al-Kashgari's Diwan Lughat al-Turk (ca. 1072–1074 CE) and Yusuf Balasaghuni's Qutadghu Bilig (1069–1070 CE), illustrate dialectal koines incorporating Oghuz, Karluk, and Kipchak elements, highlighting the language's role as a lingua franca in Central Asia.[1] These sources, first deciphered by scholars like Vasily Radlov and Vilhelm Thomsen in the late 19th century, form the foundation for reconstructing Old Turkic grammar and vocabulary.[1]

Discovery and Research
The Orkhon inscriptions were discovered in 1889 during an expedition led by Nikolai M. Yadrintsev, organized by the East Siberian branch of the Russian Geographical Society, in the Orkhon River basin of northern Mongolia.[7] This expedition uncovered the famous Orkhon monuments, including the memorials to Kül Tigin (erected 732 CE) and Bilge Khagan (735 CE), along with the Tonyukuk inscription (ca. 716–725 CE), which are among the earliest extensive texts in the runiform script.[1] These findings were subsequently published and described by Vasily Radlov, who recognized their significance for Turkic studies.[1]

The decipherment of the runiform script was achieved in 1893 by Danish philologist Vilhelm Thomsen, who successfully read the Orkhon inscriptions and identified them as Turkic, linking the script to historical Chinese records of the Göktürks from the 6th–8th centuries CE.[1] Thomsen's breakthrough, detailed in his publications of 1896 and 1916, established the phonetic values of the runes and revealed distinctions such as /t/ versus /d/, enabling the translation of key texts like the Bilge Khagan inscription.[1] Radlov contributed early editions of related materials, including his 1891 publication of the Qutadghu Bilig, a Karakhanid Turkic work that provided comparative context for the older runic corpus.[1]

Early 20th-century research expanded with the editing of Uyghur manuscripts from expeditions in Turfan by scholars such as Friedrich W. K. Müller, Albert von Le Coq, Wilhelm Bang, and Thomsen himself, who analyzed phonetic and morphological features in Manichaean and Buddhist texts.[1] Bang and Annemarie von Gabain advanced dialectal studies in the 1920s–1930s, identifying variations such as the shift from ñ to n in Manichaean Uyghur versus y in Buddhist texts, while Gabain's Alttürkische Grammatik (1941) synthesized grammar based on the inscriptions.[1] Soviet scholars like Sergei E. Malov refined translations of the Orkhon texts between 1893 and 1945, focusing on linguistic accuracy, followed by phonological analyses by Andrey N. Kononov and dialect classifications by Edkham R. Tenishev.[8]

Post-World War II developments included Talât Tekin's A Grammar of Orkhon Turkic (1968), which detailed vowel harmony and case systems from the runic corpus, and Gerhard Doerfer's etymological work on reflexes like /h/ in ethnonyms (1980).[1] Marcel Erdal's comprehensive A Grammar of Old Turkic (2004) integrated the full corpus, including Orkhon, Yenisei, and Uyghur materials, emphasizing diachronic syntax, borrowings from Iranian and Chinese, and anthropocentric linguistic approaches.[1] Contemporary research continues through international projects, such as the documentation of Altai runic inscriptions and digital corpora, building on these foundations to explore Proto-Turkic reconstruction and cultural contexts.[8]

Classification and Dialects
Linguistic Classification
Old Turkic represents the earliest attested stage of the Turkic language family, encompassing inscriptions and texts from the 7th to 13th centuries CE, including the Orkhon-Yenisei runic inscriptions and Uyghur manuscripts. It is classified as a direct descendant of Proto-Turkic, the reconstructed ancestor of all Turkic languages, and serves as the primary source for its reconstruction due to its retention of archaic phonological, morphological, and syntactic features.[1][9] The Turkic family itself comprises at least 35 languages spoken by over 200 million people across Eurasia (as of 2025), characterized by agglutinative morphology, vowel harmony, and subject-object-verb word order.[4]

Within the Turkic family, Old Turkic is often subdivided into East Old Turkic, a koiné form from the 8th century onward exemplified by the standardized language of the Orkhon inscriptions and early Uyghur texts, and less well-attested West Old Turkic varieties.[9] This East Old Turkic phase predates the major branch divergences and exhibits features shared across the family, such as the plural suffix +lAr and dative case +kA, while showing dialectal variations like the n/y alternation in Uyghur compared to runic texts.[1] The family's internal classification typically recognizes branches including Oghuz (Southwestern, e.g., Turkish), Kipchak (Northwestern, e.g., Kazakh), Karluk (Southeastern, e.g., Uyghur), Siberian (Northeastern, e.g., Yakut), and Oghur (e.g., Chuvash), with Old Turkic providing evidence for Common Turkic innovations before these splits.[9][4]

Broader linguistic affiliations place the Turkic family within the controversial Altaic macrofamily hypothesis, which posits genetic ties to Mongolic, Tungusic, Korean, and Japonic languages based on shared typological features like agglutination and vowel harmony, as well as potential etymological correspondences (e.g., Turkic ordo 'court' akin to Mongolic ordō).[1] However, this classification remains debated, with many scholars attributing similarities to prolonged language contact and areal diffusion rather than common ancestry, emphasizing instead the robust internal coherence of the Turkic family.[1] Proto-Turkic is estimated to date to around the 1st millennium BCE, with varying scholarly estimates (e.g., ca. 500 BCE to early CE), and Old Turkic texts offer direct attestation from the mid-1st millennium CE.[9][10]

Dialectal Variations
Old Turkic, spanning roughly from the 7th to the 13th century, displays notable dialectal variations reflecting regional, temporal, and cultural influences across Central Asia and Siberia. These variations are attested in runic inscriptions, Buddhist and Manichaean manuscripts, and literary texts, with distinctions emerging in phonology, morphology, and syntax. Scholars divide Old Turkic into early (7th–8th centuries), middle (8th–11th centuries), and late (11th–13th centuries) periods, during which dialects evolved from a relatively homogeneous base into more diverse forms, influenced by interactions with Indo-European, Iranian, and Mongolic languages.[11][1]

The primary dialects include Orkhon Turkic, Uyghur, Qarakhanid, and others such as Argu, Khotan, and Manichaean Turkic, each tied to specific geographical areas and text corpora.

Orkhon Turkic, the earliest attested form from 7th–10th century runic inscriptions in Mongolia and South Siberia (e.g., the Orkhon and Yenisei texts), features strict vowel harmony (synharmonism), retention of the vowel /e/, and morphological elements like the -sAr converb, -zUn imperative, and -mIš/-dOk participles, without a distinct -gAy future form. It uses +nX for the genitive and limits the +lAr plural marker primarily to humans.[1]

Uyghur, documented from the 9th century onward in manuscripts from Xinjiang, Gansu, and the Tarim Basin (e.g., Buddhist texts like the Suyut Tngri and Maitrisimit), shows innovations such as the shift of /ñ/ to /y/, vowel lowering near /g/ or /r/, introduction of the -gAy future and -nI accusative, flexible suffix ordering, and broader use of +lAr for all plurals. It also employs the conjunction takï and fourfold vowel harmony in some instrumental forms, indicating greater syntactic complexity with increased subordination via kim relative clauses.[1]

Qarakhanid, emerging in the 11th century in West Turkistan and Central Asia through texts like the Qutadghu Bilig and Dīwān Lughāt al-Turk, exhibits fused verb forms (e.g., alumadï), retention of /d/ in the +dA locative, -sUn imperative, -mAs negative aorist, and -gAn habitual participle, alongside a genitive in +nXg. It fronted /ï/ to /i/ near palatals and lacked the p > f shift before /š/, marking a transition toward Karluk varieties.[1]

Lesser-attested dialects include Argu and Khotan, from Sogdian-script manuscripts in the Khotan and Argu regions, characterized by /ñ/ > /n/ and suffix rounding (e.g., kurtgar-dum); Manichaean Turkic, overlapping with Uyghur areas, with /ñ/ > /n/, +dAn ablative, and -sXk participles; early Oghuz traits like pronunciation for /b/; and Khalaj in northern Afghanistan, retaining /h/ and /ñ/ > /n/. These variations underscore Old Turkic's heterogeneity: it was not a single uniform language but a continuum shaped by geography and borrowing, which challenges notions of a monolithic "Common Turkic."[1][11]

The following table summarizes key dialectal features for comparison:

| Dialect | Region | Key Phonological Traits | Key Morphological Traits |
|---|---|---|---|
| Orkhon | Mongolia, South Siberia | Strict synharmonism, retains /e/ | -zUn imperative, no -gAy future, +nX genitive |
| Uyghur | Xinjiang, Tarim Basin | /ñ/ > /y/, vowel lowering | -gAy future, +nI accusative, takï conjunction |
| Qarakhanid | West Turkistan | /ï/ > /i/ near palatals, retains /d/ | -sUn imperative, +nXg genitive |
| Argu/Khotan | Khotan region | /ñ/ > /n/, suffix rounding | Vowel lowering in forms |
| Manichaean | Uyghur-influenced areas | /ñ/ > /n/ | +dAn ablative, -sXk participles |
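
The /ñ/ reflexes in the table can be read as a simple correspondence rule. The sketch below is only an illustration under stated assumptions: the mapping covers the dialects for which the text gives a reflex, the Orkhon entry (treated here as retaining /ñ/) is an assumption, and the input form koñ is a hypothetical sample rather than a citation from a specific manuscript.

```python
import unicodedata

N_TILDE = "\u00f1"  # precomposed ñ, so comparisons do not depend on input encoding

# Reflex of /ñ/ per dialect, following the text and table above.
# The Orkhon entry (retention) is an assumption made for this sketch.
N_REFLEX = {
    "Orkhon": N_TILDE,
    "Uyghur": "y",
    "Argu/Khotan": "n",
    "Manichaean": "n",
}


def dialect_form(form: str, dialect: str) -> str:
    """Replace every ñ in a transcribed form with the dialect's reflex."""
    form = unicodedata.normalize("NFC", form)
    return form.replace(N_TILDE, N_REFLEX[dialect])


if __name__ == "__main__":
    sample = "ko" + N_TILDE  # hypothetical sample form
    for dialect in N_REFLEX:
        print(f"{dialect:12} {dialect_form(sample, dialect)}")
```

A lookup table like this only captures one sound correspondence; the morphological differences in the table would need separate rules.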
Phonology
Consonants
The consonant system of Old Turkic features a symmetrical inventory of stops, fricatives, nasals, laterals, and affricates, characterized by distinctions in voicing, place of articulation, and palatalization, with allophonic variations influenced by vowel harmony and positional context.[1] This system, reconstructed primarily from runic inscriptions and manuscripts, includes 18-20 phonemes depending on dialectal analysis, reflecting a Proto-Turkic heritage with minimal clusters and a preference for CV syllable structure.[1][12]

The stops comprise voiceless-voiced pairs at bilabial, dental, and velar/uvular places: /p b/, /t d/, /k g/, and /q/ (uvular, often positional with /k/). Fricatives include /s z/, /š ž/ (sibilants with palatal variants), /f v/ (labiodental, rare initially), and /x ɣ h/ (velar/uvular/glottal). Nasals are /m n ŋ ñ/ (with /ñ/ palatal and dialectally unstable, shifting to /y/ in some varieties such as Uyghur). Liquids /l r/ function as sonants, while affricates /č ǰ/ and the glide /y/ complete the set. The following table summarizes the core inventory:

| Manner / Place | Bilabial | Labiodental | Dental/Alveolar | Palatal | Velar | Uvular | Glottal |
|---|---|---|---|---|---|---|---|
| Stops | p, b | | t, d | | k, g | q | |
| Affricates | | | | č, ǰ | | | |
| Fricatives | | f, v | s, z | š, ž | x | ɣ | h |
| Nasals | m | | n | ñ | ŋ | | |
| Liquids | | | l, r | | | | |
| Glide | | | | y | | | |
Vowels and Harmony
Old Turkic possessed a vowel system consisting of nine phonemes: the back unrounded vowels /a/ (low) and /ï/ (high); the front unrounded vowels /ä/ (low), /e/ (mid-low), and /i/ (high); the back rounded vowels /o/ (mid) and /u/ (high); and the front rounded vowels /ö/ (mid) and /ü/ (high). This inventory reflects a reduction from the reconstructed Proto-Turkic system of 16 vowels (distinguishing long and short variants), with length distinctions largely lost or non-phonemic in attested Old Turkic texts. The mid vowel /e/ often arose from lengthening of /ä/ or through dialectal developments, as seen in words like elig 'king' (from Proto-Turkic */älïg/).[1]

Vowel length was phonemic in Proto-Turkic but became marginal or sub-phonemic in most Old Turkic dialects, and its exact status remains debated; the evidence comes primarily from runiform inscriptions and comparative data from modern languages like Yakut and Khalaj. For instance, potential minimal pairs such as kan 'blood' versus xan 'ruler' suggest length may have played a role in some contexts, though it is not consistently marked in Orkhon or Uyghur texts. Syncopation frequently affected non-initial vowels, as in tolup > tolp 'completely', further complicating length analysis.

The hallmark of Old Turkic phonology is vowel harmony, a system enforcing agreement in backness and rounding across syllables within a word. Backness harmony requires all vowels to align as either back (/a, ï, o, u/) or front (/ä, e, i, ö, ü/), operating progressively from the root vowel to affixes, a process known as synharmonism. For example, the back-harmony word oxša- 'to caress' takes back-vowel suffixes like -ïz 'our', yielding oxšadïz, while the front-harmony word elig 'hand' selects front vowels, as in eligimizdä 'in our hand'.
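
The backness classes just listed lend themselves to a mechanical check. The following is a minimal sketch, assuming words are given in the Latin transcription used in this article; the function name `backness_profile` and the four labels it returns are choices made for this illustration only.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise so that ï, ä, ö, ü are single code points."""
    return unicodedata.normalize("NFC", text)


# Vowel classes as given in the text above.
BACK_VOWELS = set(nfc("aïou"))
FRONT_VOWELS = set(nfc("äeiöü"))


def backness_profile(word: str) -> str:
    """Return 'back', 'front', 'mixed', or 'none' for a transcribed word."""
    vowels = [ch for ch in nfc(word.lower()) if ch in BACK_VOWELS | FRONT_VOWELS]
    if not vowels:
        return "none"
    has_back = any(v in BACK_VOWELS for v in vowels)
    has_front = any(v in FRONT_VOWELS for v in vowels)
    if has_back and has_front:
        return "mixed"  # violates backness harmony, as loanwords may
    return "back" if has_back else "front"


if __name__ == "__main__":
    for word in ["eligimizdä", "yana", "yänä", "bodisatva"]:
        print(word, backness_profile(word))
```

A "mixed" result does not by itself mean a form is wrong; as discussed in this section, loanwords and some dialectal forms legitimately break harmony.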
Rounding harmony complements backness by assimilating rounded features, primarily progressively but occasionally regressively, especially through labial consonants. Rounded vowels (/o, ö, u, ü/) trigger harmony in following syllables, whereas unrounded ones (/a, ä, i, ï, e/) do not; for instance, suv 'water' (from earlier sïv) shows rounding extension, and kurtgar-dum illustrates labial-mediated rounding.

In morphological contexts, suffixes employ archiphonemes that resolve according to the root's harmony: /A/ becomes /a/ after back vowels or /ä/ after front (/barï-dA/ 'there is-[connective]' vs. /kälmišimiz-dä/ 'having come-[locative]'); /I/ yields /i/ or /ï/; /U/ gives /u/ or /ü/; and /O/ produces /o/ or /ö/. This system ensures morphological cohesion, as in tap-un-tïlar 'they find-[negative]-[plural]'.

Exceptions to harmony occur mainly in loanwords and early or dialectal texts, disrupting the otherwise strict rules. Borrowings from Indo-Iranian or Chinese sources, such as bodisatva (Sanskrit bodhisattva) or lenxwa 'lotus', retain non-harmonizing vowels like front /e/ in back contexts. Sporadic fronting of /ï/ to /i/ near palatals (e.g., bïš- > biš- 'five') or unconditioned lowering (e.g., beside /g/ or /r/) appears in pre-classical inscriptions, reflecting transitional Proto-Turkic features. Dialectal variations, such as in Uyghur (yänä 'again' with front harmony) versus Orkhon Turkic (yana with back), and suffix fluctuations (e.g., Qarakhanid -mAs vs. -mAz), highlight evolving harmony patterns across Old Turkic corpora.

| Harmony Type | Key Features | Example (Back) | Example (Front) |
|---|---|---|---|
| Backness | Agreement in back/front quality | qara qoy 'black sheep' (/a, o/) | öt 'grass' (/ö, e/) |
| Rounding | Progressive assimilation of rounding | süök 'bone' (/ü, ö/) | özün 'identity' (/ö, ü/) |
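
The archiphoneme notation described in this section (/A/, /I/, /U/, /O/) can be resolved mechanically once the harmony class of the stem is known. The sketch below is an illustration, not a full morphological model: it decides backness from the last vowel of the stem, treats every uppercase A, I, U, or O in a suffix as an archiphoneme, and ignores consonant assimilation and syncope. The function names are hypothetical.

```python
import unicodedata


def nfc(text: str) -> str:
    return unicodedata.normalize("NFC", text)


BACK_VOWELS = set(nfc("aïou"))
FRONT_VOWELS = set(nfc("äeiöü"))

# archiphoneme -> (form after a back vowel, form after a front vowel)
ARCHIPHONEMES = {
    "A": ("a", nfc("ä")),
    "I": (nfc("ï"), "i"),
    "U": ("u", nfc("ü")),
    "O": ("o", nfc("ö")),
}


def stem_is_back(stem: str) -> bool:
    """Decide the harmony class from the last vowel of the stem."""
    for ch in reversed(nfc(stem)):
        if ch in BACK_VOWELS:
            return True
        if ch in FRONT_VOWELS:
            return False
    raise ValueError(f"no vowel found in {stem!r}")


def attach(stem: str, suffix: str) -> str:
    """Attach a suffix written with archiphonemes, e.g. '+lAr' or '-dI'."""
    column = 0 if stem_is_back(stem) else 1
    body = suffix.lstrip("+-")
    resolved = "".join(
        ARCHIPHONEMES[ch][column] if ch in ARCHIPHONEMES else ch for ch in body
    )
    return nfc(stem) + resolved


if __name__ == "__main__":
    for stem, suffix in [("kiši", "+lAr"), ("tag", "+dA"), ("bar", "-dI"), ("bar", "-mIš")]:
        print(stem, suffix, "->", attach(stem, suffix))
```

Using the last stem vowel rather than the first is a design choice of this sketch: it lets a suffix follow an earlier suffix without special handling.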
Writing Systems
Runic Script
The Old Turkic runic script, also known as the Orkhon script, Göktürk script, or Orkhon-Yenisei script, is an ancient alphabetic writing system employed to record the Old Turkic language during the 7th to 10th centuries CE.[13] It was primarily used by the Göktürk and other early Turkic khanates across regions including Mongolia, Siberia, and Central Asia.[13] The script's name derives from the Orkhon Valley in Mongolia, where the earliest known inscriptions were discovered in the late 19th century. Although it is termed "runic" because of a superficial resemblance of its angular forms to Germanic runes, scholars have criticized this label as misleading, since the script is a distinct alphabet with no direct connection to runic traditions.

The script's origins are often traced to the Aramaic writing system, likely transmitted through intermediate Iranian scripts such as Sogdian and adapted to the phonetic needs of Turkic around the 7th century CE.[13] It emerged during the Second Turkic Khaganate (ca. 682–744 CE) and persisted into the Uyghur Khaganate period, serving as a marker of Turkic imperial identity.[13] The system was deciphered in 1893 by Danish linguist Vilhelm Thomsen, who used bilingual parallels from the Kül Tigin inscription (erected 732 CE) alongside Chinese translations to identify its phonetic values. Thomsen's breakthrough, detailed in his publication Déchiffrement des inscriptions de l’Orkhon et de l’Jénissei, revealed an alphabet of approximately 38–39 primary signs, accounting for consonantal and select vocalic distinctions.

Key features of the script include its vertical orientation, typically written in columns from right to left, with lines progressing from bottom to top in ancient usage, though modern transcriptions often reverse this to top-to-bottom for readability.[13] It is largely alphabetic, with signs representing consonants and a smaller set for vowels, but vowels are often implied through context and the language's vowel harmony system, which features front/back and rounded/unrounded distinctions.[13] Allographs (variant forms) exist for certain consonants, such as velar/palatal pairs (e.g., k/q and g/ğ), reflecting dialectal or harmonic variations without separate symbols for every phoneme. Punctuation is minimal, employing a two-dot leader (similar to modern U+205A) for word breaks and occasional ring points (U+2E30) for sentence ends.[13] Rare instances of boustrophedon writing occur, where alternate lines reverse direction and mirror glyph shapes.[13] Two main variants are recognized: the eastern Orkhon style, more angular and uniform, and the western Yenisei style, with curvier forms and regional adaptations.[13]

The script was chiefly utilized for monumental inscriptions on stone stelae, commemorating rulers, victories, and funerary rites, as seen in the famous Orkhon inscriptions such as those of Bilge Khagan and Kül Tigin (732–735 CE). These texts, often bilingual with Sogdian or Chinese, provide the primary corpus, totaling over 200 known examples from sites like the Orkhon and Yenisei valleys.[13] Additional uses include graffiti on rocks, border markers, and rare wooden or metallic artifacts, extending to some Iranian-language texts in Turkic contexts.[13] Manuscripts in the script are scarce, but fragments suggest limited administrative or literary application before the widespread adoption of the Sogdian-derived Uyghur script in the 8th-9th centuries.
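
The code points cited for the punctuation marks, and the script's right-to-left directionality, can be inspected with Python's standard `unicodedata` module. This is a small sketch rather than a text-processing recipe; it relies on the Old Turkic block occupying U+10C00 to U+10C4F in the Unicode Standard, and the helper names are hypothetical.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME (bidi=CLASS)' for one code point."""
    ch = chr(code_point)
    name = unicodedata.name(ch, "<unassigned>")
    bidi = unicodedata.bidirectional(ch) or "-"
    return f"U+{code_point:04X} {name} (bidi={bidi})"


def assigned_old_turkic() -> list[int]:
    """Code points in the Old Turkic block that carry a character name."""
    return [
        cp
        for cp in range(0x10C00, 0x10C50)
        if unicodedata.name(chr(cp), "").startswith("OLD TURKIC")
    ]


if __name__ == "__main__":
    # Word separator, sentence mark, and the first letter of the block.
    for cp in (0x205A, 0x2E30, 0x10C00):
        print(describe(cp))
    print("assigned letters:", len(assigned_old_turkic()))
```

The bidirectional class reported for the letters is what tells rendering software to lay the text out from right to left.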
For Unicode encoding, 73 characters were standardized in 2009 (U+10C00–U+10C48), facilitating digital representation while preserving historical variants.[13]

Representative examples illustrate the script's phonetic rendering. The word tengri ("deity" or "heaven") is written as 𐱅𐰭𐰼𐰃, combining signs for /t/, /ŋ/, /r/, and /i/ with harmonic implication.[13] Similarly, the spelling of otboq äč ("ferocious bull") showcases consonant clusters and vowel elision typical of the system's efficiency.[13] These forms highlight how the script prioritizes consonantal skeletons, relying on readers' knowledge of Turkic morphology for full interpretation. The decipherment and study of this script have profoundly influenced Turkology, enabling reconstruction of early Turkic phonology and history, as explored in seminal works like Talât Tekin's A Grammar of Orkhon Turkic (1968).

Sogdian-Derived Uyghur Script
The Sogdian-derived Uyghur script, also known as the Old Uyghur script, was the primary writing system for Old Uyghur texts from the 8th to the 13th centuries CE.[1] It originated from the Sogdian script, an Aramaic-derived system used by Sogdian merchants along the Silk Road, and was adapted by the Uyghurs from around the 8th century, in the period of their conversion to Manichaeism and before their migration to the Tarim Basin.[1] This cursive, right-to-left script evolved into a more fluid form suitable for paper and wood, becoming the standard for administrative, literary, and religious documents in the Uyghur Kingdom of Qocho (9th-13th centuries).[2]

The script is alphabetic with 22 consonants and 7-8 vowels, featuring ligatures and diacritics to denote Turkic vowel harmony and specific sounds absent in Sogdian, such as /ö/ and /ü/.[1] Unlike an abugida, it has no inherent vowels, so vowels are written out with their own signs. Thousands of manuscripts survive from Turfan and Dunhuang, including Buddhist sutras, Manichaean hymns, legal contracts, and medical texts, often in multilingual contexts with Chinese or Tibetan.[2] Notable works include the Maitrisimit nom bitig and translations of the Diamond Sutra.

The script influenced later Central Asian writing systems and was gradually supplanted by the Arabic script under Islamic influence in the 13th-14th centuries.[1] Its study, advanced by scholars like Peter Zieme and Larry Clark, provides key insights into Old Uyghur literature and culture. For digital use, it is encoded in Unicode as the "Old Uyghur" block (U+10F70–U+10FAF), added in Unicode 14.0 (2021).

Other Religious Scripts
Old Turkic texts also employed specialized scripts for religious purposes. The Manichaean script, derived from Syriac and adapted in the 8th century, was used for Manichaean literature during the Uyghur Khaganate, featuring a cursive form with vowel indicators for Turkic.[1] Surviving fragments from Turfan include sermons and cosmological texts. The Syriac script, introduced via Nestorian Christianity in the 8th-10th centuries, appears in rare Christian manuscripts blending Syriac and Old Uyghur, such as prayers and Bible excerpts.[2] By the Karakhanid period (11th-12th centuries), the Arabic script was adapted with vowel diacritics for the Qutadghu Bilig and other Islamic works.[1] These scripts reflect the religious diversity of Turkic societies.

Brahmi-Derived Scripts
The Brahmi-derived scripts employed for Old Turkic, specifically Old Uyghur, consist primarily of the North Turkestan Brahmi (NTB) and its specialized Uyghur variety, which developed in the oases of the Tarim Basin and Turfan region (modern Xinjiang, China) as part of the Buddhist cultural transmission along the northern Silk Road.[14] This script family traces its roots to the Gupta-derived Brahmi traditions of northern India, introduced to Central Asia around the 5th–7th centuries CE through missionary activities and trade, evolving into a slanting, cursive form suited to local writing materials like wood, paper, and stone. Attested from the early 7th century CE, NTB served multiple languages in the region, including Tocharian, Sanskrit, Khotanese Saka, and Old Uyghur, with the Uyghur adaptation emerging prominently after the Uyghur Khaganate's conversion to Manichaeism and later Buddhism in the 8th–9th centuries CE.[15] The script remained in use until about the 14th century, alongside the dominant Sogdian-derived Uyghur script, before declining.[16]

The Uyghur variety of NTB modified standard Brahmi characters to accommodate Turkic phonological features, such as vowel harmony and specific consonants, through diacritical marks and ligatures, while retaining the abugida structure where consonants carry an inherent vowel (typically /a/) that could be suppressed or altered.[14] This adaptation is evident in over 40 known fragments from collections like those of Berezovsky and Krotkov (acquired 1905–1907 in Turfan), now held at the Institute of Oriental Manuscripts, Russian Academy of Sciences, which include calligraphic manuscripts on reused Chinese paper scrolls.[16] A dated example is a 1277/78 CE manuscript, highlighting the script's longevity in Buddhist monastic contexts.[15] Inscriptions in Uyghur Brahmi, collected during the German Turfan Expeditions (1902–1914) from sites like Turfan and Kucha, often feature short dedicatory or donor formulas carved on wood or stone, reflecting everyday religious practices.

Most surviving texts in Uyghur NTB are Buddhist literature, underscoring the script's association with Mahayana traditions transmitted from India via Kucha.[14] Representative examples include fragments of the Abhidharmadīpavibhāṣāprabhāvṛtti, a commentary on Abhidharma philosophy; the Prajñāpāramitā sutra; and the Suvarṇabhāsottamasūtra, a protective text, all partially inscribed in Brahmi alongside Uyghur script portions.[15] Other notable pieces comprise a confession of sins (No. 21 in the Berezovsky-Krotkov corpus), bilingual Sanskrit-Uyghur excerpts from the Prasādapratibhodbhava (No. 33), and Tocharian B-Uyghur hybrids possibly relating to medicinal or narrative content (Nos. 37–38), such as a prophecy of the Arhat Candravasu.[16] These documents, often brief (2–5 lines, 2–3 cm in size), demonstrate the script's role in interlinguistic translation and adaptation within multilingual Uyghur Buddhist communities.[14]

Scholarly analysis, beginning with early 20th-century expeditions, has revealed NTB's stemma as a distinct branch from southern Central Asian Brahmi varieties, with the Uyghur form showing innovations like simplified strokes for efficiency on portable media. Peter Zieme's foundational study documented its sporadic but deliberate use among Uyghurs, often as a prestige script for sacred texts, contrasting with the more common Sogdian-derived Uyghur alphabet. Recent editions, such as those by Dieter Maue and Olga Lundysheva, emphasize the script's cultural hybridity, bridging Indic, Indo-European, and Turkic traditions in pre-Islamic Central Asia.[15]

Grammar
Nominal System
The nominal system of Old Turkic encompasses nouns, adjectives, and pronouns, which exhibit agglutinative morphology without grammatical gender. Nouns and adjectives inflect for case and number, while possession is marked by person-specific suffixes that precede case endings. Vowel harmony governs suffix selection, ensuring front or back vowels match the stem, and consonant assimilation occurs at morpheme boundaries. Adjectives function attributively or predicatively, agreeing optionally with the nouns they modify in case and number, but they lack a distinct paradigm and often derive from verbs or nouns via suffixes like +lIg (e.g., tapag-lïg "revered"). Pronouns decline similarly to nouns, with personal forms showing dialectal variations such as bän/män "I". This system reflects the language's synthetic nature, where multiple suffixes stack sequentially on stems.

Number is binary, distinguishing singular (unmarked) from plural, primarily via the suffix +lAr, which harmonizes with the stem's vowels (e.g., kiši-lär "people" from kiši "person"). Plural marking is optional, especially for non-human referents, and does not require agreement with predicates (e.g., plural subjects take singular verbs like kapagï biz "we entered"). Rare alternatives include +(U)t for collectives (e.g., tegit "princes", the plural of the title tegin) and +s in loanwords (e.g., išvara-s "gods"). Adjectives and pronouns follow the same pattern, with plurals like bolar "these" from demonstrative bo "this".

Old Turkic features a rich case system with up to 12 cases, combining primary spatial and relational functions with secondary ones. The nominative is unmarked, serving as the base for subjects and direct objects in context. The other case suffixes follow the stem, often after possessive or plural markers, and may double for emphasis (e.g., muntada "herefrom"). The following table summarizes key cases, their suffixes, functions, and examples:

| Case | Suffix | Function | Example |
|---|---|---|---|
| Nominative | (none) | Subject, direct object, vocative | bo "this"; kiši "person" |
| Genitive | +(n)Xŋ | Possession, attribution | bäg-iŋ "of the lord"; mäniŋ "my" |
| Accusative | +(X)g, +nI | Direct object (specific) | kiši-ni "the person"; nom-um-ïn "my book" |
| Dative | +kA, +gA | Direction, beneficiary | kägän-ïm-e "to my khan"; tä yïgïlur-lar "they gather there" |
| Locative | +dA | Location, time | kögmän tag-da "on Kögmän mountain"; bokünki kün-tä "today" |
| Ablative | +dIn, +dAn | Source, origin | ögüz-dän "from the river"; antadïn "from there" |
| Instrumental | +(X)n, +(I)n | Means, instrument | ok-un "with an arrow"; almat-ïn "with a diamond" |
| Directive | +gArU | Motion toward (rare) | ötükän yïš-garu "to Ötüken" |
| Comitative | +lXgU | Accompaniment ("with") | ini-ligü "with the younger sibling" |
| Similative | +lAyU | Comparison ("like") | op-layu "like an ox" |
| Equative | +čA, +lAyU | Equality, quantity | munu-layu "thus"; barča "all" |
| Partitive-Locative | +rA | Partitive or locative | töpör-ä "on the axe" |

Possessive suffixes, which precede case endings, vary by person and number:

| Person | Singular Suffix | Plural Suffix | Example |
|---|---|---|---|
| 1st | +(X)m | +(X)mXz | ogl-um "my son"; kälmiš-imiz-dä "when we came" |
| 2nd | +(X)ñ, +(X)g | +(X)ñXz, +(X)gXz | anaŋ "your mother"; sizi "you (pl.)" |
| 3rd | +(s)I(n) | +(s)I(n)lArI(n) | suv-ïn "his water"; kïz-ïn "his daughter" |
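
The way harmonizing suffixes stack on a nominal stem can be sketched in a few lines of Python. This is only an illustrative sketch, not a full morphological model: the function names `last_vowel_is_front` and `attach` are hypothetical, only the archiphonemes A (a/ä) and I (ï/i) are resolved, and consonant assimilation, rounding harmony (the X archiphoneme), and stem syncope are deliberately left out.

```python
# Vowel classes used for harmony (nine-vowel system: a ï o u / ä e i ö ü).
FRONT_VOWELS = set("äeiöü")
BACK_VOWELS = set("aïou")


def last_vowel_is_front(word: str) -> bool:
    """Return True if the last vowel of `word` is a front vowel."""
    for ch in reversed(word):
        if ch in FRONT_VOWELS:
            return True
        if ch in BACK_VOWELS:
            return False
    raise ValueError(f"no vowel found in {word!r}")


def attach(stem: str, *suffixes: str) -> str:
    """Attach suffix templates in order, resolving A and I by harmony.

    Assumption: templates use only the archiphonemes A (a/ä) and I (ï/i);
    other alternations are not modeled here.
    """
    word = stem
    for template in suffixes:
        front = last_vowel_is_front(word)
        realized = template.replace("A", "ä" if front else "a")
        realized = realized.replace("I", "i" if front else "ï")
        word += realized
    return word


print(attach("kiši", "lAr"))        # plural of kiši "person"
print(attach("tag", "dA"))          # locative of tag "mountain"
print(attach("kiši", "lAr", "dA"))  # plural followed by locative
```

Each suffix is harmonized against the word as built so far, which mirrors the sequential stacking described above: number first, then case.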
Verbal System
The verbal system of Old Turkic is agglutinative, with suffixes marking tense, mood, aspect, voice, and person, adhering strictly to vowel harmony and consonant assimilation rules.[1] Verbs are derived through denominal and deverbal processes, forming a rich inventory of lexical items. Denominal verbs arise from nouns or adjectives via suffixes such as +lA- or +A- for transitives and intransitives, +U- or +(A)d- for causatives, and +gAr- for inchoatives, as in taš+gar- "to get out" from taš "outside."[1] Deverbal derivations include causatives with -Xt-, -It-, -tUr-, or -Ar- (e.g., adart- "to cause to bite" from at- "to bite"), passives with -(X)l- or -tXl- (e.g., kör-l- "to be seen" from kör- "to see"), reflexives with -(X)n- or -lXn-, reciprocals with -Xš-, and desideratives with -(X)gsA-.[1] Syncopation is frequent in these formations, such as äšidil- becoming eštil-.[1]

Finite verb forms are constructed by adding tense/aspect/mood suffixes to the stem, followed by personal endings that indicate subject person and number.[1] The system distinguishes tenses like the present imperfective (marked by -A/-I, -r, -Ur, or -yUr, e.g., bar-ïr "he goes"), past constative (-dI, e.g., bar-dï "he went"), inferential past (-mIš, e.g., bar-mïš "he has gone"), and future (-gAy, e.g., bar-gay "he will go").[1]

Moods include the indicative (default), imperative (zero for 2nd singular, -gIl for emphasis, e.g., bar-gïl "go!"), optative (-gAy, e.g., bar-gay "may he go"), conditional (-sAr, e.g., ärsär "if it is"), and necessitative (-gU).[1] Aspects encompass imperfective (ongoing, via -A/-I or -Ur, e.g., kïlur "he does"), perfective (completed, -mIš), durative (-Ar or -tur-), and prospective (-gU).[1] Voices feature active (unmarked), passive (-(X)l-), causative (-tUr-), and reciprocal (-Xš-).[1]

Personal endings follow the tense/mood markers and vary by conjugation class, influenced by vowel harmony; they often resemble possessive suffixes in past tenses.[1] The following table illustrates representative singular and plural endings across persons:

| Person/Number | Singular Example | Plural Example |
|---|---|---|
| 1st | -m (e.g., bar-ïm "I go") | -mIz (e.g., bar-ïmïz "we go") |
| 2nd | -ñ (e.g., bar-ïñ "you go") | -ñIz (e.g., bar-ïñïz "you pl. go") |
| 3rd | -Ø or -r (e.g., bar-ïr "he goes") | -lAr (e.g., bar-ïr-lar "they go") |
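
The ordering of stem, tense marker, and personal ending can be illustrated with a small paradigm generator for the past constative in -dI. This is a hedged sketch under stated assumptions: `resolve` and `past_paradigm` are hypothetical names, the endings are taken from the table above, and only stems with unrounded vowels are accepted because rounding harmony and consonant assimilation are not modeled.

```python
FRONT_UNROUNDED = set("äei")
BACK_UNROUNDED = set("aï")
ROUNDED = set("oöuü")

# Personal endings as listed in the table above (possessive-like, used after -dI).
PERSON_ENDINGS = [
    ("1sg", "m"), ("2sg", "ñ"), ("3sg", ""),
    ("1pl", "mIz"), ("2pl", "ñIz"), ("3pl", "lAr"),
]


def resolve(template: str, word: str) -> str:
    """Realize A (a/ä) and I (ï/i) in `template` from the last vowel of `word`."""
    for ch in reversed(word):
        if ch in FRONT_UNROUNDED:
            front = True
            break
        if ch in BACK_UNROUNDED:
            front = False
            break
    else:
        raise ValueError(f"no unrounded vowel found in {word!r}")
    return template.replace("A", "ä" if front else "a").replace("I", "i" if front else "ï")


def past_paradigm(stem: str) -> dict:
    """Build stem + -dI + personal ending for all six person/number slots.

    Assumption: rounded stems are rejected, since their rounding harmony
    is outside the scope of this sketch.
    """
    if any(ch in ROUNDED for ch in stem):
        raise ValueError("rounded stems are not handled in this sketch")
    base = stem + resolve("dI", stem)
    return {label: base + resolve(ending, base) for label, ending in PERSON_ENDINGS}


for label, form in past_paradigm("bar").items():
    print(label, form)
```

The forms are produced mechanically from the harmony rule and should be read as an illustration of slot order, not as a list of verified attestations.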
Derivational Morphology
Old Turkic derivational morphology is predominantly suffixal, with affixes attaching to roots or stems to form new words by altering lexical category, semantic nuance, or relational meaning, while adhering to principles of vowel harmony and agglutination.[1] This system is highly productive, enabling the creation of nouns from verbs or adjectives, verbs from nouns, and various relational forms, often reflecting an ergative pattern in deverbal derivations where intransitive verbs yield subject-oriented nouns and transitive verbs yield object-oriented ones.[17] Compounding and rarer processes like zero-derivation and apophony supplement suffixation, contributing to a rich lexicon attested in runic inscriptions from the 7th to 10th centuries.[18]

Nominal derivation encompasses several subtypes. From nouns to nouns, suffixes such as +lXg denote location or possession, as in suv + lag "watering place" from suv "water."[1] Diminutives or caritatives use +kIñA/+kIyA, exemplified by ata + qïña "dear father" from ata "father."[1] Relational or similative forms employ +sIg or +lI, such as öñü + sig "distinct" from öñü "forehead" or tärsli oñli "wrong or right" from tärs "wrong" and oñ "right."[1] Deverbal nominals, forming action or result nouns, include +mXr as in yagmur "rain" from yag- "to rain," or -(X)g for events like közï yüm-ügüg "with closed eyes" from yüm- "to close."[1] Habitual or resultative nouns arise with -gAn, e.g., tutgan "rapacious" from tut- "to hold," and perfective forms use -mIš, as in bititmiš "having written" from bit- "to write."[1] From adjectives to nouns, abstract qualities are derived via +lXgU or -lIg, such as sädräklig "sparseness" from sädräk "sparse" or adgülüg "friendship" from adgü "friendly."[1] Privative adjectives, functioning nominally in some contexts, use +sXz, e.g., küčsüz "powerless" from küč "power."[18]

Verbal derivation primarily converts nouns and adjectives into verbs, often expressing inchoative or causative senses. Denominal verbs frequently employ +lA-, as in kök + lA- "to become blue" from kök "blue," or sözlä- "to speak" from söz "word."[1] Directional or factitive derivations use -gAr-, exemplified by and + gar- "to make swear" from and "oath."[1] Transitive or privative verbs form with +A-, such as sïrA- "to be without" from a nominal base implying absence.[1] Causative suffixes like -t- or -Xrt- derive transitives from nouns, though specific Old Turkic examples are contextually integrated into broader verbal stems.[1] De-adjectival verbs mirror this with +lA- for inchoatives, e.g., yavlakïla- "to worsen" from yavlakï "bad," or -gUr- for state changes like ärgür- "to become man" from är "man."[1] Other forms include -gA- for suitability, as in yara-gï- "to be suitable" from yara "fitting," and -rA- for becoming, e.g., sädrä- "to become sparse" from sädrä "sparse."[1]

Beyond affixation, compounding is a key process, combining elements like nouns with nouns (türk bodun "Turk people" from türk "Turk" and bodun "people") or nouns with adjectives (yagïz-elig "brave hand" from yagïz "brave" and el "hand").[1] Zero-derivation allows nouns or adjectives to function as verbs without overt marking, following an ergative alignment: intransitives yield subject nouns (e.g., ḳarï "old" from ḳarï- "to become old"), and transitives yield objects (e.g., kes "piece" from kes- "to cut").[17] Such pairs exist, including ač "hunger" and ač- "to be hungry."[17] Apophony, involving vowel lengthening, serves similarly: intransitives produce subject nouns (e.g., tïn "breath" from tïn- "to breathe" via tï̄n), and transitives produce objects (e.g., yār "cliff" from yar- "to split").[17] Such pairs are attested, indicating pre-Old Turkic origins integrated into the morphological system.[17]

Phonological adaptations, such as vowel harmony and consonant assimilation, condition suffix allomorphy (e.g., +lXg as +lïg or +lag), ensuring harmony with the stem while preserving semantic productivity.[18] These processes underscore Old Turkic's agglutinative nature, where derivation builds layered stems before inflectional suffixes attach.[1]

Lexicon
Core Vocabulary
The core vocabulary of Old Turkic encompasses the fundamental lexicon attested in early runic inscriptions, such as those from the Orkhon Valley (8th century CE), and later texts like the Dīwān Lughāt al-Turk (11th century), reflecting basic concepts related to nature, society, kinship, and human actions. These terms, often monosyllabic or disyllabic, demonstrate the language's agglutinative nature and vowel harmony, with many surviving in descendant languages like modern Turkish and Kazakh. Scholarly reconstructions, such as those in Marcel Erdal's A Grammar of Old Turkic (2004), identify over 1,000 lexical items from these sources, prioritizing native Turkic roots over borrowings.[1] The Swadesh-inspired basic word lists for Old Turkic, derived from inscriptional data, highlight stability in core items like body parts and numerals, with etymological analyses in Gerard Clauson's An Etymological Dictionary of Pre-Thirteenth-Century Turkish (1972) tracing them to Proto-Turkic forms.[19]

Representative examples of core nouns illustrate the language's focus on nomadic and steppe life. Terms for natural elements include suv ('water'), yer ('earth' or 'land'), kün ('sun' or 'day'), and täŋri ('god' or 'sky deity'), frequently appearing in commemorative inscriptions to invoke divine favor.[1] Kinship and social terms feature ana ('mother'), ata ('father'), bodun ('people' or 'tribe'), and el ('country' or 'realm'), underscoring communal identity in texts like the Bilge Khagan inscription.[1] Body parts are denoted by baš ('head'), köz ('eye'), and qol ('hand' or 'arm'), attested across Orkhon and Uyghur sources in both literal and metaphorical use.[19]

Core verbs capture essential actions, many forming the backbone of narrative in inscriptions. Basic motion and interaction verbs include bar- ('to go'), kel- ('to come'), al- ('to take'), and ber- ('to give'), used in phrases describing migrations and alliances, as in the Tonyukuk inscription.[1] Communication and state-change verbs such as ay- ('to say', homophonous with the noun ay 'moon'), bol- ('to become'), kïl- ('to do' or 'make'), and öl- ('to die') appear ubiquitously, with aytu ('to speak') evolving into modern forms.[19] Perception verbs like kör- ('to see') and existential tur- ('to stand' or 'be') structure clauses in administrative and epic contexts.[1]

Adjectives and qualifiers in the core lexicon emphasize quality and quantity, often derived from verbs or nouns. Positive attributes include ädgü ('good'), kutlug ('fortunate' or 'blessed'), ulug ('great'), and bilgä ('wise'), applied to rulers in inscriptions like those of Kül Tigin.[1] Descriptive terms cover qara ('black'), aq ('white'), and sarïγ ('yellow'), reflecting color symbolism in Turkic culture, alongside köŋül ('heart' or 'mind'), a noun used for emotional states.[19] Numerals form a stable set: bir ('one'), eki ('two'), üč ('three'), tört ('four'), and beš ('five'), used for counting livestock and years in historical records.[1]

The following table summarizes selected core vocabulary across categories, drawn from primary inscriptional and manuscript evidence, with attestations noted for context:
This selection prioritizes high-frequency items from the earliest attestations, avoiding exhaustive listings while illustrating phonological and semantic patterns, such as front-back vowel harmony in pairs like ädgü (front) versus adak ('foot', back).[1]
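
The front-back contrast mentioned above can be checked mechanically. The following is a minimal sketch, assuming the transcription used in this article and precomposed Unicode characters; the function name `harmony_class` is hypothetical, and the "mixed" label is simply a convenience for words, such as some loanwords, whose vowels do not all belong to one class.

```python
FRONT_VOWELS = set("äeiöü")
BACK_VOWELS = set("aïou")


def harmony_class(word: str) -> str:
    """Classify a transcribed word as 'front', 'back', 'mixed', or 'none'."""
    letters = word.lower()
    has_front = any(ch in FRONT_VOWELS for ch in letters)
    has_back = any(ch in BACK_VOWELS for ch in letters)
    if has_front and has_back:
        return "mixed"
    if has_front:
        return "front"
    if has_back:
        return "back"
    return "none"


for w in ["ädgü", "adak", "täŋri", "bodun", "išvara"]:
    print(w, harmony_class(w))
```

Native items from the lists above fall cleanly into one class, while the loanword išvara combines front and back vowels.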