Kra languages

The Kra languages, also known as Kadai or Gēyāng languages, form a primary branch of the Kra-Dai language family, consisting of approximately six to eight closely related but diverse tonal languages spoken by small communities. These languages are characterized by isolating morphology, subject-verb-object word order, classifiers, and serial verb constructions, with rich consonant inventories and tonal systems of up to nine lexical tones. With an estimated total of around 22,000 speakers, the Kra branch is one of the smaller and less-documented subgroups within the Kra-Dai family, which overall encompasses over 90 languages and more than 100 million speakers across Southeast Asia and southern China.

The Kra languages are primarily distributed in the mountainous regions of southern China, including the provinces of Guizhou, Guangxi, and Yunnan, as well as northern Vietnam, in provinces such as Hà Giang, Lào Cai, and Sơn La. This geographic concentration reflects the early divergence of the Kra branch from Proto-Kra-Dai, which likely originated in southern China before some groups migrated southward; linguistic evidence suggests interactions with neighboring Austroasiatic and Hmong-Mien languages.

Key languages in the branch include Gelao (around 5,000 speakers as of 2025, with three main dialect varieties: Southwestern, Central, and Northern), Lachi (approximately 10,000 speakers, mostly in Vietnam), Laha (about 1,400 speakers), Buyang (roughly 2,000 speakers across four villages), Qabiao (fewer than 1,000 speakers), and smaller varieties such as Bē and En (also known as Nùng Vên). These languages are often endangered due to assimilation pressures from the dominant Chinese and Vietnamese societies, and little documentation is available in English or other widely accessible formats.

Linguistically, the Kra languages exhibit distinctive features that set them apart within Kra-Dai, such as a proto-tonal system with four categories (*A, *B, *C, *D) that evolved into high and low registers, often marked by glottal constriction in certain vocabularies, and contrasts in vowel length before stop codas like /-p, -t, -k/. For instance, Laha uniquely preserves liquid codas (*-l, *-r), while Buyang displays sesquisyllabic word structures combining monosyllabic roots with prefixes. Reconstruction efforts, notably Weera Ostapirat's 2000 phonological study of Proto-Kra, have identified shared innovations like initial consonant clusters and a core lexicon that supports the branch's coherence, while highlighting its basal position in the Kra-Dai family tree, predating the diversification of larger branches like Tai and Kam-Sui. Recent phylogenetic analyses further indicate an early split for Kra around 5,000–6,000 years ago, aligning with archaeological evidence of cultural expansions in the region.

Introduction

Names

The Kra languages derive their name from the reconstructed Proto-Kra form *kra<sup>C</sup>, an autonym meaning "human being," which appears in various descendant languages as forms such as *kra, *ka, *fa, or *ha. This nomenclature was proposed by linguist Weera Ostapirat in his reconstruction of Proto-Kra, highlighting the group's internal self-designation for "person" or "people." Within China, the languages are commonly termed the Geyang (仡央) branch, a designation coined by Chinese scholars Min and Zhang by combining "Ge" from Gelao and "Yang" from Buyang to represent major subgroups. This name reflects official classifications in Chinese linguistic and ethnographic contexts, where it encompasses languages spoken primarily in Guizhou, Guangxi, and Yunnan provinces. Historically, the group was included under the broader label "Kadai," an older term introduced by Paul K. Benedict in 1942 to denote the entire Kra-Dai family, though it has since been narrowed or replaced in favor of "Kra" for this specific branch.

Significance

The Kra languages, as an early-diverging branch of the Kra-Dai family, play a pivotal role in reconstructing the proto-phonology and historical development of the broader language group. Their divergent features, including distinct tonal systems and consonant inventories, provide critical evidence for linking the Kra branch to other subgroups like Kam-Sui and Tai, revealing shared innovations such as glottal constrictions in certain tone categories and vowel length contrasts before codas. This has enabled linguists to trace the family's internal diversification, with phylogenetic analyses estimating the Proto-Kra divergence around 2,435 years before present, highlighting an ancient split that informs the overall timeline of Kra-Dai expansion from southern China.

The systematic study of Kra languages culminated in Weera Ostapirat's seminal reconstruction of Proto-Kra, which not only solidified their position within Kra-Dai but also inspired the modern name "Kra-Dai," derived from the reconstructed autonyms of the Kra and Tai branches. This proposal marked a shift from earlier terms like "Tai-Kadai," emphasizing a more balanced representation of the family's structure and challenging prior views that marginalized the Kra languages as mere outliers. By demonstrating regular sound correspondences across Kra varieties, such as the development of four tone categories into high/low reflexes, Ostapirat's work (2000) established a foundation for comparative studies, underscoring Kra's value in resolving debates on the family's genetic coherence.

Beyond reconstruction, Kra languages hold sociolinguistic significance as the heritage of small ethnic minority groups in southern China and northern Vietnam, with total speakers numbering around 22,000 across seven languages, many of which are endangered due to assimilation pressures. Preservation efforts contribute to documenting cultural diversity in the region, where Kra-Dai languages, including Kra, act as vectors for areal features like tone and classifiers, influencing neighboring families such as Austroasiatic. This understudied branch thus aids broader understanding of migration and identity in East and Southeast Asia.

Reconstruction

Proto-Kra phonology

The phonology of Proto-Kra, the reconstructed ancestor of the Kra branch of the Kra-Dai language family, was first systematically reconstructed by Weera Ostapirat in his 2000 monograph. This reconstruction draws on comparative data from six representative Kra languages and their dialects: Gelao (various varieties including A'ou and Aqaw), Lachi, Laha, Buyang, Paha, and Pubiao. Ostapirat's analysis identifies a syllable structure of the form (C₁)(C₂)V(C₃), where C₁ is a main initial, C₂ a medial or preinitial (often glottal or liquid), V a vocalic nucleus, and C₃ a final consonant or tone-bearing coda. The system reflects typical Kra-Dai areal features, such as sesquisyllabicity in some forms and the development of tones from earlier segmental contrasts, but with innovations like a robust retroflex series unique to the branch.

Proto-Kra features a large inventory of 32 consonant phonemes, including series of voiceless aspirated and unaspirated stops, voiced stops, nasals, affricates, fricatives, laterals, rhotics, and glides. Notably, it includes a full set of seven simple retroflex initials (*ʈ, *ʈʰ, *ɖ, *ʈʂ, *ʈʂʰ, *ɖʐ, *ɳ) and eleven complex retroflex clusters (e.g., *ʈ-l-, *ɖ-l-, *ʔɳ-), as well as retroflex rhotics (*hr-, *r-). These retroflexes, which do not survive as distinct sounds in any modern Kra language, likely arose from earlier alveolar or palatal contacts and merged with alveolar or palatal series in daughter languages; for instance, Proto-Kra *mʈa^A 'eye' corresponds to alveolar reflexes like Gelao mta^1. Only seven consonants occur as finals: voiceless stops *-p, *-t, *-k; nasals *-m, *-n, *-ŋ; and possibly a glide or nasalized coda. Preinitial glottal stops (*ʔ-) and liquids (*l-, *r-) frequently form clusters, contributing to sesquisyllabic onsets in words like *ʔɳəŋ^B 'salty'. Subsequent scholarship has questioned the retroflex series, proposing disyllabic origins for some forms (e.g., *ma.ta^A 'eye') to explain the lack of direct reflexes without invoking unattested mergers.

The vowel system is modest, with six monophthongs forming a symmetrical trapezoidal pattern: high *i and *u, mid *e and *o, central *ə, and low *a. These occur in both open and closed syllables, with length potentially contrastive in some environments though not fully distinguished in the reconstruction. Four diphthongs are posited (*ai, *aɯ, *ui, *au), restricted to open syllables, as in *kau^A 'forest'. Vowel qualities show regular correspondences across Kra languages, such as *ə merging with *a in some daughter branches.

Lexical tones number four, labeled A, B, C, and D in the conventional Kra-Dai system, arising from the split of earlier proto-final consonants and phonation types in Pre-Proto-Kra-Dai. Tone A is typically high rising or level, B low falling, C high with creaky or glottalized phonation (reflecting a proto-glottal stop), and D low level or falling, confined to checked syllables with stop codas. For example, *na^A 'thick' bears tone A in an open syllable, whereas tone D occurs only in checked syllables. This tonal system, while shared with other Kra-Dai branches, shows branch-specific innovations in the C tone and in the restriction of D to non-open syllables. Ostapirat's reconstruction ties these tones to higher-level Kra-Dai etyma, supporting the family's internal coherence.

Proto-Kra vocabulary

The reconstruction of Proto-Kra vocabulary relies on the comparative method, using data from six primary Kra languages: Lachi, three varieties of Gelao, Buyang, Laha, Paha, and Pubiao (Qabiao). Weera Ostapirat's 2000 monograph provides the foundational lexicon, comprising around 250 etyma drawn from basic vocabulary domains such as body parts, numerals, kinship, natural phenomena, and daily activities. These reconstructions emphasize monosyllabic roots with tonal distinctions (marked as A–D, corresponding to level, rising, falling, and checked tones) and presyllabic prefixes (e.g., *C- for presyllables), reflecting the phonological system outlined in parallel studies.

Ostapirat's etyma demonstrate regular correspondences across daughter languages, enabling the identification of innovations and retentions. For example, body part terms often preserve initial clusters or liquids, as in *krai B 'head' (reflected as /xɯi/ in Lachi and /kʰlɛ/ in Gelao) and *m-ʈa A 'eye' (cognate with /mta/ in Buyang and /mtaː/ in Pubiao). Such forms highlight Proto-Kra's sesquisyllabic tendencies in some roots, though most are reduced to monosyllables in modern reflexes. Kinship vocabulary includes *mai C 'mother' (seen in /mɛ/ Lachi and /mɔj/ Gelao) and *pa B 'father' (/pʰa/ Buyang, /pa/ Pubiao), underscoring the conservatism of familial terms. Natural phenomena etyma, like *ʔuŋ C 'water' (/ʔuŋ/ Lachi, /ʔɔŋ/ Gelao), reveal shared semantic fields with higher-level Kra-Dai reconstructions.

Numeral systems are among the most stable, providing crucial evidence for subgrouping and external affiliations. Ostapirat reconstructs a decimal base with forms showing initial variation and tonal contours:

| Numeral | Proto-Kra form | Example reflexes |
|---|---|---|
| one | *tʂəm C | /tʃʰam/ (Lachi), /tsʰaŋ/ (Gelao) |
| two | *sa A | /saj/ (Buyang), /sa/ (Pubiao) |
| three | *tu A | /tʰu/ (Pubiao), /to/ (Gelao) |
| four | *pə A | /pə/ (Lachi), /fa/ (Buyang) |
| five | *r-ma A | /ŋma/ (Gelao), /ma/ (Laha) |
| six | *x-nəm A | /snam/ (Lachi), /nɛm/ (Pubiao) |
| seven | *t-ru A | /tʰɯ/ (Buyang), /sru/ (Gelao) |
| eight | *m-ru A | /mɯ/ (Paha), /pʰru/ (Lachi) |
| nine | *s-ɣwa B | /sŋwa/ (Gelao), /kwa/ (Pubiao) |
| ten | *pwlot D | /pʷlɔt/ (Buyang), /plɔt/ (Lachi) |

These numerals exhibit potential irregularities, such as the velar fricative preinitial in *x-nəm A 'six', which aligns with Proto-Kra-Dai patterns but suggests pre-Proto-Kra variation. Some vocabulary items indicate possible loans or substratal influences. For instance, *m-səm A is flagged as a potential borrowing due to irregular correspondences, while *za C 'dry field' (noted in broader Kra-Dai contexts) may reflect early contact with Sino-Tibetan speakers.

Overall, the lexicon supports Kra's position as an early-diverging Kra-Dai branch, with limited but notable parallels to Austronesian (e.g., *sa A 'two' resembling *əsa 'one' in some analyses). Subsequent works, such as Ostapirat's 2018 Proto-Kra-Dai efforts, refine select etyma but largely build on the 2000 foundation without overhauling the core vocabulary.

Classification

Ostapirat (2000)

In 2000, Weera Ostapirat published a seminal reconstruction of Proto-Kra in his dissertation, establishing the Kra languages as a primary branch of the Kra-Dai family distinct from Tai, Kam–Sui, and Hlai. Drawing on comparative data from phonological correspondences and shared vocabulary, Ostapirat identified Kra as a coherent genetic unit supported by approximately 40 lexical innovations unique to the group, such as reflexes of proto-forms not found in other Kra-Dai branches. This work shifted the understanding of Kra from Benedict's earlier "Kadai" outliers to a well-defined branch, emphasizing innovations like complex initial clusters and tonal developments.

Ostapirat proposed a classification into four main subgroups based on systematic sound changes and lexical retentions, treating Gelao, Lachi, and Laha as having internal dialectal divisions while others remain more uniform. The Western Kra subgroup includes Gelao (with northern, southern, and southwestern varieties) and Lachi (with northern, southern, and southwestern varieties), sharing innovations such as merged initial stops and specific vowel shifts. Southern Kra is represented by Laha (northern and southern dialects), characterized by retained aspirated stops and distinct tonal contours. Central Kra consists solely of Paha, a conservative variety preserving proto-initial fricatives. Eastern Kra encompasses Buyang (northern and southern dialects) and Lakkia, unified by shared retroflex initials and lexical items like *kraw for 'person'. Additionally, Ostapirat incorporated Laqua (also known as Pubiao or Qabiao) as a monotypic branch, linking it closely to Eastern Kra through phonological parallels, such as simplified syllable codas.

This structure highlights Kra's internal diversity while demonstrating its unity via proto-forms like *ʔŋaːᴬ 'I' and *mruːᴮ 'dog', reconstructed across the subgroups. Ostapirat's classification excluded languages like Sui and Kam, reassigning them to Kam–Sui, thereby refining the family's internal phylogeny and influencing subsequent research.

| Subgroup | Languages and varieties | Key innovations |
|---|---|---|
| Western Kra | Gelao (northern, southern, southwestern); Lachi (northern, southern, southwestern) | Merged voiceless stops; vowel-shift patterns |
| Southern Kra | Laha (northern, southern) | Retained aspirated stops; mid-tone developments |
| Central Kra | Paha | Preserved initial fricatives |
| Eastern Kra | Buyang (northern, southern); Lakkia | Retroflex series; shared ethnonyms |
| (Monotypic) | Laqua/Pubiao | Simplified codas; lexical ties to Eastern |

Hsiu (2014) and later updates

Andrew Hsiu advanced the classification of the Kra languages through extensive fieldwork and phylogenetic methods, building on prior work by Edmondson (2011). His 2014 analysis incorporated computational phylogenetics to refine subgroupings, emphasizing the internal diversity of key languages like Gelao and the position of Biao. Hsiu proposed that Biao, spoken in northwestern Guangdong, consists of three mutually unintelligible varieties (Shidong, Yonggu, and Dagang) that share phonological and lexical features with Lakkja, potentially forming a distinct subgroup within Kra-Dai or an independent primary branch coordinate with Kra. This placement highlights Biao's peripheral status relative to core Kra languages, with shared innovations in initial consonants and vocabulary suggesting early divergence.

Central to Hsiu's framework is a detailed subdivision of the Gelao languages, the most diverse subgroup, based on comparative wordlists and dialect surveys. He positioned Lachi as a close sister to Gelao within Northern Kra, diverging early but preserving shared Proto-Kra features such as lateral codas. Gelao itself divides into five main color-based subgroups, each encompassing multiple endangered varieties: Red Gelao (e.g., Vandu, A'ou, Bigong, Hongfeng, Houzitian), White Gelao (e.g., Judu, Moji, Wantao, Yueliangwan, Laozhai), Central Gelao (Qau and Hakei clusters), Black Gelao (Ayo, Aqao, Mulao), and Green Gelao (Dongkou, Xinzhai, Wanzi, Dagouchang). These subgroups exhibit mutual unintelligibility and varying degrees of endangerment, with Red Gelao varieties particularly vulnerable, some spoken by fewer than 50 individuals.

Hsiu's broader Kra classification aligns with and extends Edmondson's (2011) model, dividing the branch into Northern Kra (Gelao–Lachi) and Southern Kra (Laha, the Buyang complex including Paha and Ecun, and Qabiao/Pubiao). Northern Kra languages preserve archaic features like complex consonant clusters, while Southern Kra shows innovations in its vowel systems. This structure underscores Kra's basal position in Kra-Dai, with evidence of substratal influences from Hmong-Mien and Austroasiatic.

Subsequent updates to Hsiu's framework include his 2017 documentation of mixed languages like Hezhang Buyi, which reveal Kra substrata in Northern Tai varieties, supporting deeper Kra-Tai interactions. More recently, a Bayesian phylogenetic analysis using 100 Kra-Dai languages confirmed Kra's status as one of five primary branches (alongside Hlai, Ong-Be, Tai, and Kam-Sui), with divergence estimated around 4,000–5,000 years ago in southern China, linked to environmental and migratory shifts. This analysis reinforces Hsiu's subgroupings through high posterior probabilities for internal nodes, while suggesting ongoing refinement via expanded lexical datasets. Hsiu's MSEA Languages project continues to provide tentative updates, incorporating new field data on varieties like Red Gelao dialects.
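The Northern/Southern grouping described in this section can be pictured as a small tree. The following is only a minimal sketch: the names `KRA_TREE`, `path_to`, and `leaves` are hypothetical, individual Gelao varieties are omitted, and Biao is left out because its placement is described above as unresolved.

```python
# Nested mapping of the grouping as summarized in the text above.
# An empty dict marks a node with no further subdivision in this sketch.
KRA_TREE = {
    "Northern Kra": {
        "Lachi": {},
        "Gelao": {
            "Red Gelao": {},
            "White Gelao": {},
            "Central Gelao": {},
            "Black Gelao": {},
            "Green Gelao": {},
        },
    },
    "Southern Kra": {
        "Laha": {},
        "Buyang complex": {"Paha": {}, "Ecun": {}},
        "Qabiao (Pubiao)": {},
    },
}


def path_to(name, tree=KRA_TREE, prefix=("Kra",)):
    """Return the chain of groups leading to `name`, or None if it is absent."""
    for label, children in tree.items():
        here = prefix + (label,)
        if label == name:
            return list(here)
        found = path_to(name, children, here)
        if found:
            return found
    return None


def leaves(tree=KRA_TREE):
    """List terminal nodes (those without children) in depth-first order."""
    out = []
    for label, children in tree.items():
        out.extend(leaves(children) if children else [label])
    return out
```

For instance, `path_to("Paha")` walks from the branch down through Southern Kra and the Buyang complex, while `path_to("Biao")` returns `None` because Biao is not placed inside the tree in this sketch.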

Substrata

The Kra languages, spoken primarily in southern China, show evidence of substrate influences from adjacent language families due to historical contact in multilingual regions of Guizhou, Guangxi, and Yunnan provinces. These influences are most prominently attested through lexical borrowings and structural features borrowed from Northern Austroasiatic and Tibeto-Burman languages, reflecting the complex ethnolinguistic landscape of the area where Kra speakers interacted with pre-existing populations.

Northern Austroasiatic substrates are evident in basic vocabulary items across several Kra languages, such as words for 'water' and 'meat', which align with forms from branches like Khasi–Palaungic. Qabiao and Buyang (excluding the Paha dialect) exhibit particularly heavy Austroasiatic borrowing, likely from local Northern Austroasiatic varieties, including terms related to daily life and environment that integrated early into the lexicon. This suggests that Kra expansion involved assimilation of Austroasiatic-speaking groups, contributing to phonological and lexical layering in these languages.

Tibeto-Burman influences are similarly widespread, with loanwords for body parts and natural phenomena, including 'flower', 'hair', and 'mouth', appearing in core Kra varieties like Buyang and Gelao. Structural parallels include pre-verbal negators, such as *ma- in Pudi and Judu Gelao or *pi- in Paha Buyang, which mirror Tibeto-Burman patterns (e.g., *ma- in Proto-Tibeto-Burman) and are rare elsewhere in Kra-Dai, indicating early contact-mediated adoption. These features likely stem from interactions with Lolo-Burmese or Qiangic groups.

Limited Hmong-Mien substrate effects are noted in peripheral Kra languages like Biao, with borrowings for internal body parts such as 'liver', pointing to localized contact in mixed communities. Overall, these substrata show that the Kra languages, on the northern periphery of Kra-Dai, were shaped by prolonged areal contact.

Demographics

Speaker populations

The Kra languages, a small branch of the Kra-Dai family, are spoken by a relatively modest number of people, with estimates for the total speaker population ranging from approximately 10,000 to 22,000 individuals across China and Vietnam. These languages are primarily associated with ethnic minority groups facing significant pressure from dominant languages like Mandarin and Vietnamese, leading to high degrees of endangerment. Many Kra varieties are spoken only by older generations, with intergenerational transmission declining rapidly under social, policy, and economic pressures.

Speaker numbers vary widely by language, reflecting fragmented ethnic classifications and limited documentation. For instance, the Gelao languages (encompassing several dialects like A'ou, Cao Lan, and Qalao) are spoken by fewer than 6,000 people, primarily in Guizhou, China, where they constitute just 1.2% of the ethnic Gelao population of around 500,000. Recent assessments confirm this low figure, emphasizing the languages' endangered status.

The following table summarizes approximate speaker populations for major Kra languages, based on key linguistic surveys. Figures are estimates and may include ethnic populations where direct speaker counts are unavailable; the data show stability or slight decline over time:

| Language | Approximate speakers | Primary locations | Notes/Source |
|---|---|---|---|
| Gelao (various dialects) | 5,000–6,000 | Guizhou, China | Critically endangered; ethnic population much larger. |
| Buyang (including Paha) | ~2,000 | Guangxi/Yunnan, China; northern Vietnam | Small ethnic group; spoken in border villages. |
| Lachi | ~2,000 | Yunnan, China; Hà Giang, Vietnam | Ethnic La Chí population ~10,000, but speakers limited to adults. |
| Laha | ~1,400 | Lào Cai/Sơn La, Vietnam | Ethnic population ~5,700; used by older adults only. |
| Qabiao (Pubiao) | 700–1,300 | Yunnan, China; Hà Giang, Vietnam | Increasing slightly from 1989 census; endangered. |
| En (Nùng Vên) | ~250 | Cao Bằng, Vietnam | Near-extinct; minimal documentation. |
| Mulao | 0 (extinct) | Guizhou, China | Last fluent speakers deceased; ethnic classification persists. |

These populations highlight the Kra branch's vulnerability, with most languages classified as endangered or moribund by international standards. Efforts to document and revitalize them remain limited, though fieldwork by linguists like Weera Ostapirat has aided preservation.
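As a quick arithmetic check, the per-language estimates in the table can be summed and compared with the overall range quoted at the start of this section. This is a minimal sketch: the dictionary simply copies the table's figures, a single figure is treated as both the low and high bound, and the names `ESTIMATES` and `total_range` are hypothetical.

```python
# (low, high) speaker estimates copied from the table above.
ESTIMATES = {
    "Gelao": (5000, 6000),
    "Buyang": (2000, 2000),
    "Lachi": (2000, 2000),
    "Laha": (1400, 1400),
    "Qabiao": (700, 1300),
    "En": (250, 250),
    "Mulao": (0, 0),
}


def total_range(estimates):
    """Sum the low and high bounds separately across all languages."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


print(total_range(ESTIMATES))  # (11350, 12950)
```

The summed range of roughly 11,000 to 13,000 falls inside the 10,000 to 22,000 span cited above, toward its lower end.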

Geographic distribution

The Kra languages, a branch of the Kra-Dai family, are primarily distributed across southern China and northern Vietnam, with speakers concentrated in remote, mountainous regions that reflect their historical dispersal from ancestral homelands. Phylogeographic evidence indicates an early divergence and southward migration of Kra-Dai speakers, including Kra, from the Guangxi-Guangdong coastal area of China toward Southeast Asia around 4,000–3,000 years ago, driven by agricultural expansions and environmental changes. This distribution underscores the Kra languages' role as a northwestern periphery of the Kra-Dai family, with small, scattered communities often living alongside other ethnic groups like the Zhuang and Hmong-Mien.

In China, Kra languages are spoken mainly in the provinces of Guizhou, Guangxi, and Yunnan, where they form pockets in karst highlands and river valleys. Guizhou hosts the largest concentrations, particularly of Gelao varieties in counties like Longli, Duyun, and Rongjiang, with historical records tracing Gelao presence to the Tang Dynasty (7th–10th centuries CE). Guangxi features Buyang and related dialects in western areas such as Longlin and Napo counties, while Yunnan has Lachi in Jinchang, Paha in Yangliu, and Buyang in Xishuangbanna, often in border villages near Vietnam. These locations highlight the Kra's autochthonous status in pre-Han indigenous territories, with populations estimated at under 100,000 speakers total across China, many shifting to Mandarin or local dominant languages.

In Vietnam, Kra languages extend into the border provinces of Hà Giang, Lào Cai, and Sơn La, comprising a smaller but diverse set of communities amid other ethnic minorities. Gelao is spoken in Hà Giang's Yên Minh district (e.g., Bản Ma Ché village), Lachi in the nearby Đồng Văn and Quản Bạ districts (e.g., Bản Phùng), and Laha (or Pubiao variants) in Lào Cai's Bắc Hà and Sơn La's Mường La, with some Buyang influence. This transborder distribution, totaling fewer than 10,000 speakers, stems from migrations during the Qin-Han eras (221 BCE–220 CE), when Kra groups were displaced southward by Han expansions, preserving linguistic diversity in isolated highland enclaves despite assimilation pressures.

Linguistic features

Phonological characteristics

The Kra languages are characterized by a rich tonal system inherited from Proto-Kra, which featured a four-way tonal contrast labeled as tones A, B, C, and D. Tone A is associated with open endings and voiced onsets in the proto-language; tone B is linked to lax voicing features; tone C involves tense phonation; and tone D is restricted to checked syllables ending in stops. This system has undergone mergers and splits in daughter languages, resulting in 4 to 9 tones in modern varieties, with some languages like certain Gelao dialects showing tonal mergers due to contact influences.

Consonant inventories in Kra languages are complex, featuring voiceless, voiced, and aspirated stops, as well as affricates and fricatives, with Proto-Kra reconstructed with 32 consonants across labial, alveolar, postalveolar, retroflex, palatal, velar, and glottal places of articulation. Initial clusters are common, including prenasalized stops (e.g., *mb-, *nd-) and lateral clusters (e.g., *kl-, *kr-), which reflect an earlier stage of complexity before reduction in some branches. A notable feature is the presence of breathy-voiced stops in languages like Lachi and Buyang, derived from Proto-Kra voiced stops. The proto-reconstruction also proposes a retroflex series (*ʈ, *ɖ, *tʂ, *dʐ, etc.), though this has been debated as potentially arising from disyllabic forms rather than a dedicated series, given the lack of direct preservation in modern Kra languages. Final consonants are limited to eight in Proto-Kra: nasals (*-m, *-n, *-ŋ), liquids (*-l, *-r), and stops (*-p, *-t, *-k), with *-l often developing into tones or glottal stops in contemporary varieties.

The vowel system of Proto-Kra includes six monophthongs, high (*i, *u), mid (*e, *o, *ə), and low (*a), with length distinctions playing a role in tonal conditioning, particularly in open syllables. Diphthongs are restricted to four open-syllable rimes (*-ai, *-aɯ, *-ui, *-au), which often merge or shift in daughter languages; for instance, *-aɯ may shift or trigger backing in Gelao. Vowel fronting and backing patterns appear in some modern Kra languages, influenced by areal contact with Sino-Tibetan groups, but these are not systematic in the proto-level reconstruction.

Syllable structure in Kra languages follows a (C)(C)V(C) template, with sesquisyllabic or disyllabic forms emerging from historical processes or borrowing, though monosyllabicity dominates due to tone-bearing requirements. Unlike other Kra-Dai branches, Kra languages retain more conservative final consonants and clusters, contributing to their phonological diversity. These features underscore the Kra branch's early divergence within Kra-Dai, with phonological innovations often linked to substratal influences from pre-Austroasiatic or Sino-Tibetan substrates in southern China.
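The constraints stated in this section (a (C)(C)V(C) template, six monophthongs, four diphthongs limited to open syllables, eight possible finals, and tone D limited to stop-final syllables) can be written down as a small checker. This is a minimal sketch under those stated assumptions only; the class `Syllable` and function `problems` are hypothetical names, and the checker takes pre-segmented input rather than parsing transcriptions.

```python
from dataclasses import dataclass

# Inventories as listed in the text above (asterisks omitted).
MONOPHTHONGS = {"i", "u", "e", "o", "ə", "a"}
DIPHTHONGS = {"ai", "aɯ", "ui", "au"}
STOP_FINALS = {"p", "t", "k"}
OTHER_FINALS = {"m", "n", "ŋ", "l", "r"}
TONES = {"A", "B", "C", "D"}


@dataclass(frozen=True)
class Syllable:
    onset: tuple   # zero, one, or two initial consonants
    nucleus: str   # a monophthong or diphthong
    coda: str      # "" for an open syllable
    tone: str      # one of A, B, C, D


def problems(s):
    """Return a list of constraint violations; an empty list means none found."""
    issues = []
    if len(s.onset) > 2:
        issues.append("onset longer than two consonants")
    if s.nucleus not in MONOPHTHONGS | DIPHTHONGS:
        issues.append("unknown nucleus")
    if s.coda and s.coda not in STOP_FINALS | OTHER_FINALS:
        issues.append("unknown final consonant")
    if s.nucleus in DIPHTHONGS and s.coda:
        issues.append("diphthong in a closed syllable")
    if s.tone not in TONES:
        issues.append("unknown tone category")
    if s.tone == "D" and s.coda not in STOP_FINALS:
        issues.append("tone D requires a stop coda")
    return issues
```

For example, `problems(Syllable(("k",), "au", "", "A"))` returns an empty list, while an open syllable marked with tone D is flagged. The checker encodes only the one-way restriction given in the text (D implies a stop coda) and does not assume the converse.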

Numeral systems

The numeral systems in Kra languages are characterized by their retention of the ancestral Proto-Kra-Dai numerals, a feature not shared with the Tai and Kam-Sui branches, where native forms have been extensively replaced by borrowings from Chinese. This preservation allows for reliable reconstruction of early Kra-Dai numerals, primarily drawing from Kra and Hlai evidence, and highlights potential historical connections to Austronesian numeral forms, as noted in earlier studies. In Kra languages, numerals typically function with classifiers for nouns, following the analytic structure common to the family, and higher numbers beyond ten are often formed by multiplication or addition, such as combining units with terms for ten or hundred.

The Proto-Kra numeral system, reconstructed by Ostapirat (2000), provides a foundational inventory for the branch, reflecting a decimal base with distinct roots for 1–10. These forms are attested across daughter languages like Gelao, Lachi, Buyang, and Qabiao, with variations due to phonological shifts, tonal changes, and occasional prefix loss (e.g., the *r- in five). For instance, the form for "five" (*r-ma^A) appears as mpu in some Gelao varieties and ma in Buyang, while "six" (*x-nəm^A) is realized as nəm or naŋ in Gelao and Qabiao. This system underscores the conservative nature of Kra phonology and lexicon compared to more innovative branches.

| Numeral | Proto-Kra reconstruction | Tone category | Example reflex (language) |
|---|---|---|---|
| one | *tʂəm | C | tʃəm (Proto-Western Kra, e.g., Lachi) |
| two | *sa | A | su (Gelao) |
| three | *tu | A | ta (Gelao) |
| four | *pə | A | pu (Gelao) |
| five | *r-ma | A | ma (Buyang) |
| six | *x-nəm | A | nəm (Qabiao) |
| seven | *t-ru | A | ʈu (Proto-Southern Kra, e.g., Laha) |
| eight | *m-ru | A | mu (Buyang) |
| nine | *s-ɣwa | B | swa (Gelao) |
| ten | *pwlot | D | blɔt (Buyang) |
| hundred | *kjən | A | kən (Proto-Eastern Kra, e.g., Qabiao) |

Reconstructions are from Ostapirat (2000), with reflexes drawn from comparative data in the same source. The tones (A–D) correspond to the Proto-Kra system: abstract categories associated with syllable structure and laryngeal features, whose modern reflexes include level, rising, falling, and checked tones. This inventory demonstrates regular correspondences, such as the development of *x- to h- or its loss in some reflexes, and is consistent with the isolating typology of the broader Kra-Dai family in numeral usage.
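The statement that higher numbers are built by multiplication and addition on a decimal base can be illustrated with a short decomposition routine. This is a schematic sketch only: the function names `decompose` and `gloss` are hypothetical, the morpheme order and the explicit "one × ten" are assumptions, and the glosses are arithmetic notation using the reconstructed roots from the table, not attested or reconstructed compound forms.

```python
# Reconstructed roots from the table above (tone categories omitted).
PROTO_KRA = {
    1: "*tʂəm", 2: "*sa", 3: "*tu", 4: "*pə", 5: "*r-ma",
    6: "*x-nəm", 7: "*t-ru", 8: "*m-ru", 9: "*s-ɣwa",
    10: "*pwlot", 100: "*kjən",
}


def decompose(n):
    """Split 1 <= n <= 999 into (multiplier, base) pairs for bases 100, 10, 1."""
    if not 1 <= n <= 999:
        raise ValueError("only 1..999 is covered by this sketch")
    parts = []
    for base in (100, 10, 1):
        multiplier, n = divmod(n, base)
        if multiplier:
            parts.append((multiplier, base))
    return parts


def gloss(n):
    """Render the decomposition as 'multiplier × base' terms joined by '+'."""
    pieces = []
    for multiplier, base in decompose(n):
        if base == 1:
            pieces.append(PROTO_KRA[multiplier])
        else:
            pieces.append(f"{PROTO_KRA[multiplier]} × {PROTO_KRA[base]}")
    return " + ".join(pieces)
```

Here `decompose(23)` yields `[(2, 10), (3, 1)]`, that is, two tens plus three units, and `gloss(23)` renders the same structure with the reconstructed roots for 'two', 'ten', and 'three'.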

References

  1. Kra-Dai Languages.
  2. Kra or Kadai languages (PDF). ResearchGate. Nov 20, 2014.
  3. Kra-Dai Languages. Center of Excellence in Southeast Asian ... Apr 10, 2018.
  4. Reanalyzing the genetic history of Kra-Dai speakers from Thailand ... May 24, 2023.
  5. Kra-Dai and the Proto-History of South China and Vietnam (PDF).
  6. Ostapirat. Kra: The Tai Least-Known Sister Languages (PDF).
  7. Kra-Dai Languages. Oxford Research Encyclopedia of Linguistics. Jan 25, 2019.
  8. Phylogenetic evidence reveals early Kra-Dai divergence ... Nature. Oct 30, 2023.
  9. Ostapirat, Weera. Proto-Kra. eScholarship. 1999.
  10. Ostapirat, Weera. 2000. "Proto-Kra." Linguistics of the Tibeto-Burman Area, vol. 23, no. 1.
  11. ... and Pre‑Proto‑Austronesian numerals with some help from Kra‑Dai. Aug 5, 2025.
  12. Weera Ostapirat: Proto-Kra. Persée.
  13. The Biao languages of northwestern Guangdong, China. Zenodo.
  14. KRA-DAI. MSEA Languages.
  15. The Gelao languages: Preliminary classification and state of the art (PDF).
  16. Notes on the Subdivisions in Kra (PDF). ResearchGate.
  17. Hezhang Buyi: a highly endangered Northern Tai language with a ... (PDF).
  18. Potential loanwords in Kra. MSEA Languages (Google Sites).
  19. Kra or Kadai Languages. Jerold A. Edmo
  20. Gelao Language. Encyclopedia MDPI. Nov 30, 2022.
  21. Gelao: A highly marginalized language of China. Mar 3, 2025.
  22. Laha Language (LHA). Ethnologue.
  23. Mulao Language (GIU). Ethnologue.
  24. Proto-Kra, Chapter 1 (PDF).
  25. Southeast Asian tone in areal perspective (PDF). Mar 28, 2015.