Malayo-Polynesian languages
The Malayo-Polynesian languages constitute the largest branch of the Austronesian language family, encompassing over 1,200 distinct languages spoken by approximately 385 million people (as of 2025).[1][2] This branch excludes the Formosan languages of Taiwan, which represent the family's probable homeland, and instead covers the expansive dispersal of Austronesian speakers beyond Taiwan.[3] Geographically, these languages are distributed across a vast maritime region, from Madagascar in the Indian Ocean to Easter Island in the Pacific, including the Philippines, Indonesia, Malaysia, much of Melanesia, Micronesia, and Polynesia.[4][1]

Descended from Proto-Austronesian, which was spoken in Taiwan approximately 5,000 years ago, the Malayo-Polynesian languages spread through successive migrations involving seafaring and agricultural expansion, reaching many parts of their current range by around 1000 BCE.[1] They are conventionally divided into three primary subgroups: Western Malayo-Polynesian, which includes about 500–600 languages in the Philippines, western Indonesia, mainland Southeast Asia, and Madagascar (such as Malay, Javanese, and Tagalog); Central Malayo-Polynesian, comprising around 170 languages primarily in eastern Indonesia (e.g., Tetum and Buru); and Eastern Malayo-Polynesian, with about 500 languages across eastern Indonesia, Melanesia, Micronesia, and Polynesia (including Fijian, Hawaiian, and Māori).[3][5] This diversification partly reflects intense language contact with non-Austronesian populations, particularly in areas like New Guinea, leading to substrate influences and hybrid features in some languages.[1]

Linguistically, Malayo-Polynesian languages share traits inherited from Proto-Malayo-Polynesian, such as a phonological inventory with typically four vowels and about 20 consonants, along with morphological patterns involving reduplication and affixation for verb formation.[6] Notable for their typological diversity, they range from isolating structures in Malay to highly agglutinative systems in Philippine languages, and include ergative alignment in some Oceanic varieties.[7] Major languages like Indonesian (a standardized form of Malay with approximately 250 million total speakers) and Javanese (over 80 million native speakers) serve as lingua francas in multilingual societies, underscoring the branch's cultural and economic significance in Southeast Asia and the Pacific.[8][9]
Overview
Scope and Membership
The Malayo-Polynesian languages form the largest branch of the Austronesian language family, encompassing all Austronesian languages spoken outside Taiwan, and are spoken by approximately 385 million people across vast regions from Madagascar to Easter Island.[10] This branch is defined by descent from Proto-Malayo-Polynesian (PMP), the reconstructed proto-language that emerged after the Austronesian expansion from Taiwan around 4,000–5,000 years ago.[11]

Membership criteria hinge on shared innovations distinguishing PMP from Proto-Austronesian (PAN), particularly phonological mergers such as the merger of PAN *N (a consonant whose phonetic value is debated) with *n and of PAN *C (a voiceless affricate) with *t, alongside other sound changes such as the weakening of PAN *S to *h or zero in many daughter languages.[6] These innovations, systematically outlined by Robert Blust, provide robust evidence for the genetic unity of the group, though some are conditioned or irregular, requiring a cumulative "stones in the wall" approach to subgrouping.[12] As of the 27th edition of Ethnologue (2024), the branch includes 1,235 languages.[13]

Notable early-diverging languages within Malayo-Polynesian include Chamorro (spoken in the Mariana Islands) and Palauan (spoken in Palau), which are often classified as primary branches within Western Malayo-Polynesian because they diverged near the PMP stem, as evidenced by unique verb morphosyntax and limited shared innovations with other MP languages.[14] Similarly, the Moklenic languages (Moken and Moklen, spoken along the Andaman Sea coasts) have a debated position within Malayo-Polynesian but are generally included in the branch, often as part of the Malayic subgroup, based on lexical and phonological evidence.[15]

The name "Malayo-Polynesian" was coined in the 19th century to highlight representative languages from the western (Malay) and eastern (Polynesian) extremes of the branch's distribution. It first appeared in print in 1841 in the work of German linguist Franz Bopp, though the underlying concept traces to earlier comparative efforts by scholars like Wilhelm von Humboldt.[16]
Geographic Distribution and Speakers
The Malayo-Polynesian languages, as the branch of the Austronesian family spoken outside Taiwan, are distributed across a vast maritime region spanning the Indian and Pacific Oceans, from Madagascar in the west to Easter Island in the east.[3] Their core areas include island Southeast Asia (Indonesia, the Philippines, Malaysia, Brunei, Singapore, and Timor-Leste), parts of mainland Southeast Asia in Vietnam, Cambodia, and Thailand, the island of Madagascar off Africa's east coast, and Oceania, which is divided into Near Oceania (New Guinea and nearby islands) and Remote Oceania (Melanesia beyond New Guinea, Micronesia, and Polynesia).[17] Outliers include the Rapa Nui language on Easter Island, the easternmost extension of the family.[1] Collectively, these languages are spoken by approximately 385 million people worldwide, about 4.8% of the global population. Around 350 million are native (L1) speakers, and the remainder are second-language (L2) users, primarily of lingua francas like Indonesian.[10]

The majority of speakers, over 250 million, are concentrated in Southeast Asia, particularly Indonesia (home to roughly 200 million L1 speakers of languages like Javanese, Sundanese, and Indonesian) and the Philippines (about 110 million, primarily Tagalog, Cebuano, and Ilocano).[18] In Malaysia and Brunei, Malay serves as the dominant language for some 20 million L1 speakers.[19] Oceania accounts for around 3 million speakers, with higher densities in Remote Oceania (e.g., Polynesia, where languages like Hawaiian and Māori are spoken by over 1 million combined) and sparser populations in Near Oceania due to extensive contact and substrate influence from non-Austronesian Papuan languages.[20] Madagascar has approximately 29 million Malagasy speakers, nearly the entire population.[21]

Diaspora communities have grown through 20th- and 21st-century labor migration, trade, and education, forming pockets in Australia (e.g., over 100,000 speakers of Filipino and Indonesian languages per recent census data), North America (e.g., 1.7 million Tagalog speakers in the US), and Europe (e.g., smaller Malay and Polynesian communities in the UK and Netherlands).[22] These groups, estimated at several million globally in 2025, often maintain heritage languages alongside dominant local tongues.[23]

Colonialism and historical trade networks significantly shaped distribution patterns. European powers promoted lingua francas like Malay (the basis of Indonesian and Malaysian) across trade routes in Southeast Asia and the Indian Ocean, while suppressing but not eradicating local varieties in places like the Philippines under Spanish rule.[20] In Oceania, missionary activities and colonial administration further spread certain Oceanic languages as contact varieties.[1]
Terminology
Historical Terms
The concept of a "Malayan" language group emerged in the early 19th century through the work of German linguist Wilhelm von Humboldt, who in his 1836 treatise on the Kawi language of Java described a family of languages sharing structural and lexical similarities with Malay, extending across Southeast Asia and the Pacific. Humboldt's classification emphasized typological resemblances, such as agglutinative morphology and phonetic patterns, grouping languages from Madagascar to Polynesia under this broad "Malayan" umbrella, though he did not yet incorporate Formosan varieties.[24][25]

By the mid-19th century, the term evolved with contributions from German and Dutch scholars applying comparative methods. Franz Bopp formalized "Malayo-Polynesian" (as malayisch-polynesisch) in 1841 to denote the entire family, highlighting connections between western Malayic languages and eastern Polynesian ones based on shared vocabulary and syntax. Dutch orientalist Hendrik Kern advanced this in the 1880s by using "Indonesian" to refer to the expansive family spanning Indonesia, the Philippines, and beyond, through systematic comparisons that integrated Sanskrit influences and Oceanic outliers. These efforts by German comparativists like Bopp and Dutch philologists like Kern established initial groupings, though debates persisted over excluding Papuan-influenced varieties.[1][26][6]

In the 20th century, American linguist Paul K. Benedict reshaped the terminology in 1942 by employing "Malayo-Polynesian" (often interchangeably with "Indonesian") to explicitly encompass Oceanic languages within a proposed broader Austro-Thai alignment, drawing on phonological correspondences like initial consonant shifts. Alternative designations appeared in older schemes, such as "Extra-Formosan" for non-Taiwanese Austronesian branches, reflecting post-1940s recognition of Formosan distinctiveness, or "Oceanic-Austronesian" to prioritize Pacific expansions. Debates over "Malayo-Oceanic" as a potential primary node questioned the unity of western and eastern subgroups, but these remained marginal amid growing evidence for a unified Malayo-Polynesian clade.[27][28][29]

As of 2025, "Malayo-Polynesian" remains the standard term in major linguistic databases, denoting the primary non-Formosan branch of Austronesian with over 1,200 languages. Ethnologue classifies it as encompassing subgroups like Western, Central, and Eastern Malayo-Polynesian, while Glottolog maintains it as a core node without proposing renamings, underscoring its entrenched role in comparative Austronesian studies.[30]
Modern Nomenclature
In contemporary linguistics, the term "Malayo-Polynesian" (MP) serves as the standard designation for the major subgroup of Austronesian languages spoken outside Taiwan, encompassing the descendants of Proto-Malayo-Polynesian (PMP). This nomenclature is codified in authoritative databases, including Glottolog version 5.2, which assigns the glottocode "mala1545" to the subgroup and recognizes it as distinct from Formosan languages, preferring this specific label over broader or subtractive descriptions like "Austronesian minus Formosan" for its precision in phylogenetic classification.[30] Similarly, Ethnologue categorizes MP as a primary branch under Austronesian, listing 1,235 living languages within this subgroup as of its 2025 edition.

Ongoing debates in the field highlight concerns over the term's potential bias, as "Malayo-Polynesian" emphasizes languages at the western (Malayic) and eastern (Polynesian) extremes of the family's geographic range, potentially marginalizing the vast diversity of intervening languages, such as those in the Philippines and eastern Indonesia. Critics argue this naming convention perpetuates a Eurocentric or colonial-era focus derived from early 19th-century explorations, prompting proposals to reframe the group explicitly as "descendants of Proto-Malayo-Polynesian" to underscore their shared ancestry and reduce geographic bias in terminology.[29] Such discussions appear in recent comparative works, including Blust's comprehensive overview, which advocates for nomenclature that better reflects the subgroup's internal unity beyond endpoint languages.[31]

In standardized references, Glottolog's 2025 count identifies 1,254 MP languages, providing a benchmark inventory that informs global linguistic atlases and supports cross-disciplinary research. Ethnologue's parallel categorization reinforces this by nesting MP under Austronesian while detailing subgroup hierarchies, ensuring consistency in ISO 639-3 compliant codes for individual languages.[30]

Subgroup naming conventions within MP also evolve to address regional inclusivity. For instance, the traditional "Western Malayo-Polynesian" (WMP) label is increasingly supplemented or contrasted with geographic descriptors like "Insular Southeast Asian" in studies emphasizing areal linguistics over strict phylogeny. This shift aims to highlight the continuum of languages across island Southeast Asia without overemphasizing distant outliers.[32] Efforts to enhance representativeness extend to computational approaches, such as 2024 Bayesian phylogenetic analyses that integrate underrepresented Philippine and Indonesian MP varieties to refine classification models and counter name-induced biases, giving a better-balanced portrayal of the family's core diversity.[1][33]
Typological Characteristics
Phonological Features
The phonological systems of Malayo-Polynesian languages derive from the reconstructed Proto-Malayo-Polynesian (PMP) inventory, which is relatively simple, as is typical of early Austronesian languages outside Formosa. PMP is reconstructed with approximately 20 consonants, including voiceless stops *p, *t, *k, *q; voiced stops *b, *d, *z; nasals *m, *n, *ñ (palatal nasal), *ŋ; liquids *l, *r, *R (a uvular trill or flap); fricatives *s, *h; and approximants *w, *y, along with prenasalized forms like *mb, *nd, *ŋg. The vowel system comprises four phonemes: *a, *i, *u, and *ə (schwa), with no length distinctions. The canonical syllable structure is (C)V(C), allowing optional onsets and codas but prohibiting complex clusters, which promotes disyllabic roots as the norm.[31]

Key innovations distinguishing PMP from Proto-Austronesian (PAN) include several mergers and simplifications that reduced the overall inventory. Notably, PAN *C (a voiceless affricate) merged with *t and PAN *N merged with *n, while PAN *S weakened to the glottal fricative *h. In addition, PAN *Z (voiced alveolar affricate or fricative) is described as shifting to *z (voiced alveolar fricative), and PAN *D as merging into *R (uvular flap or trill). The uvular stop *q was retained, but no separate uvular fricatives or additional uvular stops are reconstructed. These changes reflect a trend toward phonetic simplification during the expansion of Malayo-Polynesian speakers out of Taiwan.[31]

Across Malayo-Polynesian languages, common phonological traits emphasize simplicity and predictability. Open syllables predominate in the many daughter languages that have lost final consonants from the PMP (C)V(C) template, resulting in strict CV structures in languages such as Hawaiian and Samoan, whereas languages like Tagalog and Malay retain final consonants.
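
As a rough illustration of the sound changes discussed in this section, the following Python sketch treats them as simple rewrite rules. It assumes the mergers *C > *t, *N > *n, and *S > *h, checks forms against the (C)V(C) syllable template, and models final-consonant loss by deleting one word-final consonant. The function names and the one-character-per-segment transcription are assumptions made for the example, and real reflexes usually involve further changes that are not modelled here.

```python
import re

# Assumed PAN > PMP mergers (illustrative subset only).
PAN_TO_PMP = {"C": "t", "N": "n", "S": "h"}

# The four PMP vowels; every other symbol is treated as a consonant.
VOWELS = set("aiuə")


def pan_to_pmp(form: str) -> str:
    """Apply the assumed mergers to a PAN form, one segment per character."""
    return "".join(PAN_TO_PMP.get(segment, segment) for segment in form)


def fits_canon(form: str) -> bool:
    """Return True if the form can be parsed as a sequence of (C)V(C) syllables."""
    skeleton = "".join("V" if segment in VOWELS else "C" for segment in form)
    return re.fullmatch(r"(?:C?VC?)+", skeleton) is not None


def drop_final_consonant(form: str) -> str:
    """Delete a single word-final consonant, leaving an open final syllable."""
    if form and form[-1] not in VOWELS:
        return form[:-1]
    return form


if __name__ == "__main__":
    # Forms are written without the leading asterisk.
    for pan in ("maCa", "Sapuy", "Caliŋa", "aNak"):
        print(pan, "->", pan_to_pmp(pan))
    for pmp in ("manuk", "inum"):
        print(pmp, "->", drop_final_consonant(pmp), fits_canon(pmp))
```

Under these assumptions, `pan_to_pmp("maCa")` gives `mata` 'eye' and `pan_to_pmp("Sapuy")` gives `hapuy` 'fire', while `drop_final_consonant("manuk")` gives `manu`, which matches Hawaiian manu 'bird'.
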
Vowel harmony appears in certain subgroups, such as height or backness assimilation in Kimaragang (Sabah, northern Borneo) or gradient co-occurrence restrictions in Oceanic languages, where non-high vowels in one syllable influence adjacent ones. Suprasegmental features are generally limited to stress, but outliers like Cham exhibit tone systems, with six contrastive tones developed through contact-induced register splits from Austroasiatic influences.[34][35][36]

Phonological variation also manifests in morphological processes that interact with the sound system. Reduplication, a hallmark for deriving plurality or intensification, copies part or all of the root, often the initial CV sequence, as in PMP *lima "five" yielding *lima-lima "fingers" in reflexes across subgroups. Nasal assimilation is prevalent in verbal prefixes, where the nasal of the actor-focus prefix (reconstructed as *maŋ-) assimilates in place of articulation to the root-initial consonant.
Grammatical and Syntactic Traits
Malayo-Polynesian languages display a spectrum of morphological complexity, ranging from predominantly isolating structures in some western varieties such as Malay to more agglutinative patterns elsewhere, particularly in the Philippines.[31] A defining feature is the elaborate voice or focus system, inherited from Proto-Austronesian, which uses verbal affixes to highlight different semantic roles such as actor, goal, patient, locative, or beneficiary, rather than relying on fixed subject-object alignments.[38] For instance, in many Philippine languages like Tagalog, infixes such as -um- mark actor voice (e.g., bili 'buy' > bumili).
Historical Development and Classification
Relation to Formosan Languages
The Malayo-Polynesian (MP) languages constitute the sole primary branch of the Austronesian family extending beyond Taiwan, forming a sister group to the diverse Formosan languages indigenous to the island. On this view, MP is the offshoot of Proto-Austronesian (PAN) that left Taiwan, diverging from the Formosan clades approximately 5,500 years ago (around 3500 BCE), based on glottochronological estimates calibrated with archaeological data.[45]

Linguistic evidence bearing on this relationship includes shared retentions from PAN, such as the reflex of the uvular stop *q as a glottal stop (ʔ) in many MP and Formosan languages, reflecting a common ancestral phonology. However, MP also exhibits exclusive innovations absent in Formosan, notably the systematic shift of the fricative *S to /h/ in initial position (e.g., PAN *Sapuy 'fire' > Proto-MP *hapuy), which demarcates the MP boundary and supports its status as a unified subgroup.[31] Reconstructions of PAN, drawing on comparative lexicon across Austronesian languages, indicate that MP shares much of the core PAN vocabulary, including basic terms for body parts, numerals, and environment, while Formosan languages display greater internal diversity and archaic retentions, underscoring Taiwan as the likely homeland.[31]

Debates persist regarding the internal structure of Formosan relative to MP. Blust (1999) argued for nine primary Formosan branches coordinate with MP as the tenth, based on shared innovations within each. In contrast, recent proposals suggest potential links between East Formosan languages and MP, indicating a more nested phylogeny through shared morphological traits; for example, a 2025 model posits "Late Malayo-Polynesian" as a revised subgrouping within Austronesian relations.[31][46]

Recent interdisciplinary studies reinforce this linguistic phylogeny through correlations between genetics and linguistics. A 2024 analysis of the dispersal of Y-chromosome haplogroup O2a2b-P164 (including O2a2b1a1) aligns the MP expansion timeline with the out-of-Taiwan model, showing genetic admixture patterns from Taiwanese indigenous populations into MP-speaking groups across Island Southeast Asia and Oceania.[47]
Migration and Expansion History
The prehistoric dispersal of Malayo-Polynesian (MP) speakers began with their expansion from Taiwan, the homeland of the broader Austronesian family, around 4,000 to 3,500 years before present (BP). This initial movement marked the divergence of Proto-MP from Formosan languages and initiated the rapid spread of Austronesian-speaking populations across Southeast Asia and beyond. Linguistic and archaeological evidence indicates that MP speakers first reached the northern Philippines by approximately 4,000 BP, establishing a key staging point for further migrations.[48][49]

Migration routes diverged in multiple directions from the Philippines. A southern pathway led through the central and southern Philippines to Borneo, Java, Sumatra, and eastern Indonesia by around 3,500 BP, facilitating the development of Western MP subgroups. To the west, groups moved into southern Vietnam and the Chamic region, influencing languages there through contact and settlement around 3,000 BP. An eastern route extended to the Bismarck Archipelago in Near Oceania by 3,500 BP, setting the stage for the Oceanic branch's expansion into Remote Oceania. These seafaring migrations relied on outrigger canoes and advanced navigation, enabling Austronesian speakers to traverse island chains over several centuries.[20][49][50]

Archaeological evidence corroborates this timeline and these routes. In Taiwan, the Dapenkeng culture (ca. 4,500–3,000 BP) provides early indicators of Austronesian maritime adaptations, including cord-marked pottery and shell tools linked to subsequent sites in the Philippines. Philippine sites, such as those in the Cagayan Valley, yield similar red-slipped pottery dated to 4,000–3,500 BP, bridging Taiwan and Island Southeast Asia. In Oceania, the Lapita culture (ca. 3,500–2,500 BP) is marked by distinctive dentate-stamped pottery in the Bismarck Archipelago and beyond, directly associated with the arrival of Proto-Oceanic speakers.[51][50]

During these expansions, MP speakers interacted with non-Austronesian populations, leading to significant linguistic and cultural exchanges. In Near Oceania, particularly the Bismarck Archipelago and Solomon Islands, Oceanic languages incorporated substrate influences from Papuan languages through prolonged contact, trade, and intermarriage starting around 3,500 BP. Farther afield, Austronesian voyagers reached Madagascar around 1,500 BP, introducing Malagasy (a Western MP language) and initiating the island's Austronesianization, in which speakers from Borneo mixed with populations of African origin.[52]

Recent advances as of 2025 have refined these timelines through integrated Bayesian phylogenetic models and genomic data. A 2024 study applying Bayesian methods to Philippine language vocabularies supports a rapid MP expansion from Taiwan, estimating divergence times around 4,000 BP with high posterior probability for a single-pulse migration through the archipelago. When combined with ancient DNA analyses, these models corroborate archaeological dates and highlight gene flow patterns, such as minimal Papuan admixture in early Oceanic settlers, providing more precise estimates for the Lapita dispersal at approximately 3,300–3,500 BP.[33][20]
Internal Classification
Western Malayo-Polynesian Subgroups
The Western Malayo-Polynesian (WMP) languages constitute the largest and most diverse branch within the Malayo-Polynesian family, encompassing approximately 500–600 languages spoken by over 200 million people across Southeast Asia, including the Philippines, mainland Southeast Asia, the Indonesian archipelago, and as far west as Madagascar.[53] This branch is characterized by its geographic concentration in continental and island Southeast Asia, distinguishing it from the more oceanic extensions of the Central-Eastern Malayo-Polynesian groups. Key subdivisions include the Philippine languages, which form a major cluster with over 100 varieties such as those in the Greater Central Philippine group (e.g., Tagalog and Cebuano); the Malayic languages, spoken in the Malay Peninsula, Sumatra, Borneo, and beyond, including Malay and Iban; and the Chamic languages of mainland Southeast Asia and Hainan, such as Cham and Jarai.[30][54][55] These subgroups exhibit evidence of historical contact, particularly in the Chamic languages, which show contact influences from Austroasiatic languages due to prolonged interaction in mainland Southeast Asia, including borrowed lexical items related to local flora and topography.[56]

Major classificatory proposals have sought to refine the internal structure of WMP. Adelaar's 2005 Malayo-Sumbawan hypothesis posits a primary subgroup uniting the Malayic and Chamic languages with the Balinese-Sasak-Sumbawa (BSS) cluster, supported by shared phonological innovations such as the merger of Proto-Austronesian *ñ and *ŋ, and lexical parallels in basic vocabulary like terms for body parts and numerals.[57] Similarly, Blust's 2010 Greater North Borneo hypothesis identifies a robust subgroup comprising many Bornean languages, North Sarawak varieties, and Southwest Sabah dialects, justified by exclusive sound changes including the split of Proto-Malayo-Polynesian *R into distinct reflexes and shared lexical innovations for coastal environments.[58]

WMP languages share several innovations that reflect their historical development, including an expanded lexicon associated with wet-rice agriculture, such as reflexes of Proto-Malayo-Polynesian *pajay ('rice in the field; rice plant') and *beRas ('husked rice'), which are retained and elaborated in Philippine and Malayic varieties to denote cultivation practices.[59] Phonologically, many WMP languages exhibit expansions in the syllable canon beyond the disyllabic roots typical of Proto-Austronesian, allowing complex onsets (e.g., prenasalized stops in Malayic) and occasional codas through lenition or borrowing, as documented in Blust's analyses of sound change patterns.[60]

Recent scholarship has increasingly scrutinized the coherence of WMP as a monolithic subgroup. Smith's 2017 comprehensive classification of Bornean languages emphasizes local linkages and rejects broader Western Indonesian groupings like Blust's, instead proposing three primary Malayo-Polynesian branches (Moken-Moklen, Champa-Malayo, and a diverse Borneo core) based on irregular sound correspondences and lexical distributions. Building on this, Smith's 2025 "Late Malayo-Polynesian" model refines the relations of western dialects by arguing against large higher-order subgroups, attributing shared traits to late innovations and contact rather than deep common ancestry, supported by phylogenetic analysis of over 200 lexical items across 150 languages.[61]
Central and Eastern Malayo-Polynesian Subgroups
The Central Malayo-Polynesian (CMP) subgroup consists of approximately 170 languages distributed across eastern Indonesia, encompassing the Lesser Sunda Islands from eastern Sumbawa eastward, the Maluku Islands, and parts of western New Guinea, with prominent examples in regions like Flores and Timor. These languages form a diverse set spoken by communities in volcanic and island environments, reflecting adaptations to maritime and agrarian lifestyles. Key subgroups include the Sumba–Flores group, which covers languages on Sumba, Flores, and nearby islands such as those spoken by the Ngada and Ende peoples, and the Timor subgroup, featuring languages like Tetum and Mambai on Timor and adjacent areas.[62]

The Eastern Malayo-Polynesian (EMP) branch bifurcates into two primary divisions: the expansive Oceanic subgroup, comprising around 450 languages spread across Melanesia, Micronesia, and Polynesia (including well-known varieties like Hawaiian, Samoan, and Fijian), and the smaller South Halmahera–West New Guinea (SHWNG) subgroup, with about 41 languages concentrated in the northern Moluccas (southern Halmahera) and the Bird's Head Peninsula of New Guinea, such as Biak and Taba.[63][64] EMP languages are distinguished from their western counterparts by shared phonological and morphosyntactic innovations, including the preposing of demonstratives before nouns (e.g., shifting from post-nominal Proto-Malayo-Polynesian patterns to initial positioning in Oceanic constructions) and a sound merger of Proto-Malayo-Polynesian *R (uvular trill) and *D (voiced dental/alveolar stop) into a single lateral or flap reflex in many EMP varieties.[65][66]

Classification proposals for CMP and EMP have evolved through linkage models emphasizing dialect continua over strict tree structures. Blust (2013) argues that CMP functions as a linkage (a network of overlapping innovations without a single proto-language) bridging western and eastern expansions, with shared retentions like nasal substitutions supporting its coherence as an intermediary stage.[31] Recent advances include Smith's (2025) "Late Malayo-Polynesian" model, which posits a dialect continuum around 3,000 years before present (BP) for non-Formosan Austronesian languages, replacing discrete CMP-EMP boundaries with a networked evolution driven by serial founder effects across Island Southeast Asia.[61] Complementing this, Bayesian phylogenetic analyses of Oceanic data in 2024 have employed cognate-based trees to date divergences, revealing rapid eastward spreads after 3,500 BP with reticulation signals in Melanesian contact zones.[67]
Major Languages
Most Widely Spoken Varieties
The most widely spoken Malayo-Polynesian languages, measured by total number of speakers (including first- and second-language users), are concentrated in Southeast Asia, particularly Indonesia and the Philippines, where they serve as national or regional lingua francas. These languages exhibit significant L2 usage due to their official or educational roles, contributing to their broad reach across diverse ethnic groups. Indonesian stands out as the dominant variety, functioning as a unifying medium in one of the world's most linguistically diverse nations.

Indonesian (Bahasa Indonesia), a standardized variety of Malay, has approximately 75 million first-language speakers and 177 million additional second-language users, for a total of 252 million. It is the official language of Indonesia, while Malaysian, a closely related standard, plays the same role in Malaysia and Brunei, promoting inter-ethnic communication in these countries.[68] Javanese, the largest by native speakers, has over 80 million first-language users primarily in Central and East Java, Indonesia, where it remains a vital community language despite the prevalence of Indonesian in formal contexts.[9] Sundanese, spoken by around 45 million people mainly in West Java, Indonesia, is another major Western Malayo-Polynesian variety with strong regional vitality.[69]

In the Philippines, Tagalog (the core of standardized Filipino) has about 30 million native speakers and reaches 90 million total users as the national language, essential for education and government.[70] Cebuano, with roughly 20 million native speakers across the Visayas and northern Mindanao, ranks as the second-most spoken language in the Philippines after Filipino.[71] Madurese, numbering approximately 12 million speakers on Madura Island and in eastern Java, Indonesia, maintains its status as a key ethnic language in those areas.[72]

The following table summarizes the top varieties by speaker estimates (2025 data where available):

| Language | First-Language Speakers (approx.) | Total Speakers (L1 + L2, approx.) | Primary Regions and Status |
|---|---|---|---|
| Indonesian | 75 million | 252 million | Indonesia (official); Malaysia, Brunei |
| Javanese | 80 million | 82 million | Central/East Java, Indonesia (regional) |
| Tagalog/Filipino | 30 million | 90 million | Philippines (national/official) |
| Sundanese | 45 million | 45 million | West Java, Indonesia (regional) |
| Cebuano | 20 million | 28 million | Visayas/Mindanao, Philippines (regional) |
| Madurese | 12 million | 12 million | Madura/East Java, Indonesia (regional) |
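
The figures in the table lend themselves to a quick consistency check. The sketch below simply copies the table's estimates (in millions) into a Python dictionary, derives second-language counts as total minus L1, and ranks the varieties by total speakers. The variable and function names are illustrative assumptions, and the numbers are only as reliable as the estimates they are copied from.

```python
# Speaker estimates in millions, copied from the table: (L1, total).
SPEAKERS = {
    "Indonesian": (75, 252),
    "Javanese": (80, 82),
    "Tagalog/Filipino": (30, 90),
    "Sundanese": (45, 45),
    "Cebuano": (20, 28),
    "Madurese": (12, 12),
}


def l2_speakers(l1: int, total: int) -> int:
    """Second-language speakers implied by the table: total minus L1."""
    return total - l1


def l2_share(l1: int, total: int) -> float:
    """Fraction of all speakers who use the language as a second language."""
    return (total - l1) / total


def ranked_by_total(speakers: dict) -> list:
    """Language names ordered from most to fewest total speakers."""
    return sorted(speakers, key=lambda name: speakers[name][1], reverse=True)


if __name__ == "__main__":
    for name in ranked_by_total(SPEAKERS):
        l1, total = SPEAKERS[name]
        print(f"{name:<18} L1={l1:>3}  L2={l2_speakers(l1, total):>3}  "
              f"total={total:>3}  L2 share={l2_share(l1, total):.0%}")
```

Ranked this way, Tagalog/Filipino moves ahead of Javanese, and Indonesian's implied L2 count (177 million) matches the figure given in the text.
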