Phonation

Phonation is the physiological process by which the vocal folds in the larynx vibrate to produce sound for speech and voice, driven by airflow from the lungs that forces the folds apart and together in rapid cycles. This generates a fundamental frequency typically ranging from 85 to 180 Hz in adult males and 165 to 255 Hz in adult females, determining pitch, while the amplitude of vibration influences loudness. The larynx, positioned in the neck between the trachea and pharynx, houses the vocal folds, multilayered structures of muscle and mucosa that adduct (close) via muscles like the lateral cricoarytenoid and abduct (open) via the posterior cricoarytenoid to regulate airflow.

The mechanics of phonation follow the myoelastic-aerodynamic theory: subglottal pressure from exhalation builds until it overcomes vocal fold tension, initiating self-sustained oscillation through the Bernoulli effect, which creates suction to draw the folds back together after they part. Each vibration cycle consists of closed, opening, open, and closing phases, with the glottis (the space between the folds) modulating airflow to produce quasi-periodic sound waves, alongside minor turbulent noise components. Control of phonation involves intrinsic laryngeal muscles, such as the cricothyroid for lengthening and tensing the folds to raise pitch, and the thyroarytenoid for shortening and relaxing them to lower pitch, coordinated with respiratory muscles for sustained output. Disruptions, like incomplete adduction, can result in dysphonia, characterized by hoarse or weak voice quality due to inefficient vibration.

Phonation exhibits variations in quality across individuals and languages, including modal (default smooth vibration), breathy (loose adduction with added noise), creaky (irregular, low-frequency vibration), and tense (pressed, high-tension) types, which convey linguistic contrasts in over 50 languages worldwide. For instance, languages in the Otomanguean family distinguish creaky from modal phonation on vowels, while others like !Xóõ employ up to five phonation types phonemically. These contrasts arise from differences in vocal fold tension, glottal closure patterns, and airflow, analyzed acoustically via measures like open quotient and spectral tilt.

In speech production, phonation serves as the primary sound source, filtered and shaped by the vocal tract (pharynx, oral, and nasal cavities) through articulation to form consonants and vowels, enabling communication. Vocal intensity is modulated by increasing subglottal pressure (typically 200–800 Pa for conversational speech) and glottal resistance, while disorders affecting phonation, such as vocal fold paralysis from nerve damage, impair voice quality and require medical intervention. Overall, phonation's precise neuromuscular and aerodynamic integration underscores its essential role in human communication and linguistic diversity.

Fundamentals

Definition and Overview

Phonation is the process by which the vocal folds, located in the larynx, produce voiced sounds through quasi-periodic vibration during the exhalation of air from the lungs. This generates a fundamental frequency and its harmonics, forming the source of sound in human speech and singing. In the basic process, subglottal air pressure from the lungs forces the vocal folds apart, allowing air to flow through the glottis; the elastic recoil of the folds then causes them to close, repeating in a self-sustaining cycle that produces sound waves. These sound waves are subsequently shaped by the vocal tract above the larynx to create distinct speech sounds.

Phonation is one of three key components in voice production, distinct from respiration, which supplies the power from the lungs, and articulation, which involves the precise shaping of sounds by the articulators in the vocal tract. The larynx plays a central role in this process by housing the vocal folds that vibrate to initiate voiced phonation.

The term phonation emerged in the field of phonetics to specifically describe vocal fold vibration, with systematic study beginning in the 19th century through the physiological work of scientists like Johannes Müller, who formalized early theories of voice production in 1848.

Anatomical Structures Involved

The larynx, a cartilaginous structure located in the anterior neck between the third and sixth cervical vertebrae, serves as the primary organ for phonation by housing the vocal folds and facilitating their vibration through airflow. It consists of nine cartilages: three unpaired (thyroid, cricoid, and epiglottis) and six paired (two arytenoids, two corniculates, and two cuneiforms). The thyroid cartilage, the largest, forms the laryngeal prominence (Adam's apple) and provides attachment for the vocal folds anteriorly. The cricoid cartilage, shaped like a signet ring, forms the base of the larynx and encircles the upper trachea, supporting the arytenoid cartilages posteriorly. The paired arytenoid cartilages, pyramid-shaped, sit atop the cricoid and feature vocal processes to which the vocal ligaments attach and muscular processes for muscle insertions, enabling rotation and movement essential for vocal fold positioning.

The vocal folds, also known as true vocal cords, are bilateral shelf-like structures extending from the thyroid cartilage anteriorly to the arytenoid cartilages posteriorly, forming the glottis, the narrowest portion of the airway, when approximated. Composed of five layers from deep to superficial, they include the thyroarytenoid muscle (vocalis portion), the vocal ligament (deep lamina propria), the intermediate lamina propria, the superficial lamina propria (Reinke's space, a gelatinous layer), and the epithelium covering the mucosa. This multilayered design allows for efficient vibration during phonation, with average lengths of approximately 16 mm in males and 10 mm in females, contributing to pitch differences because the shorter folds in females produce higher fundamental frequencies. The glottis, the space between the vocal folds, varies in size from closed (for phonation) to open (for breathing), directly influencing airflow resistance.

Intrinsic laryngeal muscles control vocal fold adduction, abduction, tension, and approximation, while extrinsic muscles position the larynx within the neck. Key intrinsic muscles include the thyroarytenoid (relaxes and shortens the folds for lower pitch), the cricothyroid (tilts the thyroid cartilage to elongate and tense the folds for higher pitch), the lateral cricoarytenoid and interarytenoid (adduct the folds), and the posterior cricoarytenoid (abducts the folds). Extrinsic muscles, such as the suprahyoid and infrahyoid groups, elevate or depress the larynx to adjust vocal tract configuration. Subglottal pressure, generated by airflow through the trachea, drives vocal fold vibration, while supraglottal structures shape the resonating airway and protect it during phonation.

Neural control of phonation is mediated by branches of the vagus nerve (cranial nerve X). The recurrent laryngeal nerve provides motor innervation to all intrinsic muscles except the cricothyroid (for adduction and abduction) and sensory innervation below the vocal folds, while the superior laryngeal nerve's external branch innervates the cricothyroid for tension regulation, and its internal branch supplies sensation above the folds. These nerves ensure precise coordination of muscle activity for vocal fold movement.

Mechanisms of Phonation

Myoelastic Aerodynamic Theory

The myoelastic-aerodynamic theory posits that phonation arises from the interaction between the elastic properties of the vocal folds, modulated by muscular tension, and the aerodynamic forces generated by subglottal pressure. The "myoelastic" component refers to the active adjustment of vocal fold tension and length, primarily through contraction of the cricothyroid muscle, which tilts the thyroid cartilage forward relative to the cricoid, elongating and stiffening the folds to enable vibration. This is complemented by the "aerodynamic" aspect, where airflow from the lungs creates a pressure differential across the glottis; as air passes through the approximated folds, the Bernoulli effect (resulting from increased velocity and decreased pressure above the folds) draws the superior edges together, facilitating closure.

The vibration cycle begins with adduction of the vocal folds, achieved by the arytenoid muscles (lateral cricoarytenoid and interarytenoid), which rotate the arytenoid cartilages medially to close the posterior glottis and bring the folds into approximation. Subglottal pressure then builds below the closed glottis until it overcomes the folds' resistance, forcing them apart, beginning at their inferior edges, and opening the glottis in a puff-like release. Elastic recoil of the tensed folds, aided by the Bernoulli effect during the closing phase, rapidly approximates the superior surfaces, completing the cycle. This self-sustained oscillation repeats at frequencies typically ranging from 100 to 200 Hz in adult males and 200 to 250 Hz in adult females, producing the periodic pulsations that generate voiced sound. Neural input modulates tension via the cricothyroid but is not required for the oscillatory timing itself.

A key mathematical approximation for the fundamental frequency $F_0$ of vocal fold vibration derives from modeling the folds as a taut string under transverse wave propagation. The derivation starts with the wave speed $c = \sqrt{T / \mu}$, where $T$ is the longitudinal tension (in newtons) and $\mu$ is the linear mass density (in kg/m), obtained from the one-dimensional wave equation $\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}$ for small-amplitude displacements $y(x,t)$. For a string fixed at both ends over length $L$ (approximating vocal fold length, typically 1.5-2 cm in adults), the fundamental mode has wavelength $\lambda = 2L$, so $F_0 = c / \lambda = \frac{1}{2L} \sqrt{\frac{T}{\mu}}$. This predicts that $F_0$ increases with tension (via cricothyroid activation) and decreases with longer or denser folds. Limitations include the model's assumption of uniform, one-dimensional motion, which ignores the mucosal wave (vertical phase differences in fold layers), three-dimensional airflow effects, and nonlinear tissue properties that cause asymmetries in real vibrations; more advanced models incorporate finite element analysis for better accuracy.

Experimental validation of self-sustained oscillation comes from high-speed imaging studies of excised larynges, where vibrations persist at physiological frequencies without neural innervation, driven solely by controlled airflow and intrinsic elasticity. These recordings, captured at rates exceeding 2000 frames per second, reveal symmetric opening-closing cycles and mucosal wave propagation consistent with the predictions of the myoelastic-aerodynamic theory, confirming the mechanism's autonomy from precise neural timing.
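The taut-string approximation above can be evaluated numerically. The following is a minimal sketch, not a physiological model: the function name `string_model_f0` and the sample tension and mass-density values are hypothetical, chosen only to land in a plausible speech range.

```python
import math


def string_model_f0(length_m: float, tension_n: float, mu_kg_per_m: float) -> float:
    """Fundamental frequency of an ideal string: F0 = (1 / 2L) * sqrt(T / mu)."""
    if length_m <= 0 or mu_kg_per_m <= 0 or tension_n < 0:
        raise ValueError("length and mass density must be positive, tension non-negative")
    wave_speed = math.sqrt(tension_n / mu_kg_per_m)  # c = sqrt(T / mu), in m/s
    return wave_speed / (2.0 * length_m)             # F0 = c / lambda, with lambda = 2L


# Hypothetical illustrative inputs (assumptions, not measured values):
# a 16 mm fold, 0.05 N of tension, 0.005 kg/m linear mass density.
f0 = string_model_f0(length_m=0.016, tension_n=0.05, mu_kg_per_m=0.005)
print(f"string-model F0: {f0:.1f} Hz")

# Quadrupling the tension doubles F0, as the square-root dependence implies.
print(f"with 4x tension: {string_model_f0(0.016, 0.20, 0.005):.1f} Hz")
```

The sketch makes the scaling explicit: frequency rises with the square root of tension and falls in inverse proportion to length, which is the qualitative behavior the model is meant to capture.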

Neurochronaxic Theory

The neurochronaxic theory of phonation, proposed by French physiologist Raoul Husson in 1950, posits that vocal fold vibration is actively controlled by discrete neural impulses originating in the brain and transmitted via the recurrent laryngeal nerve to the intrinsic laryngeal muscles, particularly the thyroarytenoid muscles. According to Husson, each cycle of vocal fold opening and closing is triggered by a specific neural signal, with the adductor muscles contracting to approximate the folds against subglottic air pressure, followed by their relaxation to allow opening and air escape. This active neuromuscular mechanism contrasts with passive models by emphasizing neural timing as the primary driver of vibration.

The term "neurochronaxic" incorporates "chronaxie," a concept from neurophysiology denoting the minimal duration of a stimulus (at twice the rheobase intensity) required to excite nerve or muscle tissue, highlighting the theory's focus on precise neural timing to achieve phonation rates. To account for human phonation frequencies of 100–300 Hz, which exceed the firing capacity of individual nerve fibers, Husson invoked the "volley principle" proposed by Wever and Bray, whereby groups of nerve fibers fire in coordinated bursts to generate higher effective impulse rates matching the desired frequency. These volleys, traveling along the recurrent laryngeal nerve, a branch of the vagus nerve (cranial nerve X), would synchronize muscle contractions to produce rhythmic vibrations independent of aerodynamic forces alone.

Despite its innovative emphasis on neural control, the neurochronaxic theory has been largely discredited by subsequent research, particularly electromyography (EMG) studies of laryngeal muscles during phonation, which reveal tonic (sustained) electrical activity rather than phasic bursts synchronized to each cycle. For instance, EMG investigations recorded steady patterns in the thyroarytenoid and other intrinsic muscles, showing no evidence of high-frequency neural firing or muscle contractions at 100–300 Hz, as required by Husson's model; instead, muscle tension sets preconditions for vibration, with frequency determined by biomechanical factors. This lack of per-cycle neural activity undermines the theory's core claim, leading to its rejection in favor of the myoelastic-aerodynamic framework, though it spurred valuable debates on neural roles in voice production.

Indirect evidence from recurrent laryngeal nerve damage supports a modulatory neural influence, as unilateral lesions can cause irregular phonation rhythms or adductor weakness, disrupting overall vocal fold coordination without implying cycle-by-cycle control. In modern views, neural impulses via the vagus nerve's recurrent branch initiate phonation by activating adductor muscles to close the glottis and build subglottic pressure, but sustained oscillation relies on aerodynamic and elastic forces rather than ongoing neural triggering. This integrated perspective acknowledges the theory's historical contribution to highlighting neural involvement in voice onset and adjustment, while affirming its limitations for explaining vibratory mechanics.

Glottal States

Voiced and Voiceless Phonation

In voiced phonation, the glottis is partially closed, allowing the vocal folds to vibrate periodically under the influence of subglottal pressure, which generates a regular pulsatile airflow and produces a periodic acoustic signal fundamental to vowel sounds and voiced consonants. This vibration typically features an open quotient (OQ), the ratio of the glottal open phase to the total vibratory cycle, of approximately 0.5 to 0.6 in modal voice, reflecting balanced opening and closing phases that contribute to efficient sound production. Subglottal pressure for sustaining this mode of phonation generally ranges from 5 to 10 cmH₂O during conversational speech, driving the myoelastic oscillations, while airflow rates average 100 to 200 ml/s.

Voiceless phonation, in contrast, occurs when the glottis is held wide open by abduction of the vocal folds, preventing vibration and permitting uninterrupted airflow through the larynx without pulsatile interruption. This configuration results in no glottal contribution to voicing, often producing aspirate qualities, as in [h], or serving as a transitional state in obstruents, where noise arises primarily from supraglottal constriction rather than laryngeal vibration. Aerodynamically, voiceless states require minimal resistance at the glottis, allowing higher airflow continuity compared to voiced modes, though subglottal pressure remains similar unless deliberately modulated.

Intermediate glottal states bridge these extremes, including breathy voice, characterized by loose vocal fold approximation that permits turbulent air leakage through incomplete adduction, adding a noisy, airy quality to the periodic signal. In pressed voice, conversely, the glottis achieves tighter closure with increased muscular tension, elevating subglottal pressure buildup and yielding a more compact, intense sound with reduced airflow. These variations are quantified using electroglottography (EGG), which measures the contact quotient (CQ), the proportion of the cycle during which the vocal folds are in contact, as a proxy for glottal closure; breathy phonation shows lower CQ (higher OQ, >0.6), while pressed phonation exhibits higher CQ (OQ <0.5). Such parameters highlight how subtle adjustments in glottal adduction influence phonatory efficiency and voice quality across linguistic and clinical contexts.
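The quotient-based distinctions above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the open quotient is approximated as OQ ≈ 1 − CQ (a simplifying assumption, since the text treats CQ only as a proxy), and the 0.5 and 0.6 cut-offs are taken directly from the figures quoted in this section rather than from any clinical standard.

```python
def contact_quotient(contact_ms: float, period_ms: float) -> float:
    """Fraction of one vibratory cycle during which the folds are in contact."""
    if period_ms <= 0 or not 0 <= contact_ms <= period_ms:
        raise ValueError("need 0 <= contact_ms <= period_ms and period_ms > 0")
    return contact_ms / period_ms


def open_quotient_from_cq(cq: float) -> float:
    """Assumed approximation: the open phase is whatever is not contact."""
    return 1.0 - cq


def classify_phonation(oq: float) -> str:
    """Label a cycle using the OQ ranges quoted in the text."""
    if oq > 0.6:
        return "breathy"
    if oq < 0.5:
        return "pressed"
    return "modal"


# Hypothetical cycle: 8 ms period (125 Hz) with 3.6 ms of fold contact.
cq = contact_quotient(contact_ms=3.6, period_ms=8.0)
oq = open_quotient_from_cq(cq)
print(f"CQ = {cq:.2f}, OQ = {oq:.2f}, type = {classify_phonation(oq)}")
```

Real EGG analysis depends on how contact onset and offset are detected in the waveform, so hard thresholds like these are best read as a way to reason about the categories, not as a diagnostic rule.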

Glottal Consonants

Glottal consonants are sounds produced at the glottis, the space between the vocal folds, through specific adjustments that create temporary obstructions or turbulence in airflow. The glottal stop, represented in the International Phonetic Alphabet (IPA) as [ʔ], involves a complete closure of the glottis, fully blocking airflow for a brief moment and resulting in an abrupt interruption of voicing. This sound appears in English in interjections like "uh-oh," where it separates the two syllables, and in Arabic as the hamza (ء), a phonemic consonant that can occur word-initially, medially, or finally, as in أَب (ʾab, "father"). The glottal fricative, IPA [h], is a voiceless sound characterized by a narrow opening at the glottis, allowing airflow to pass through and generate turbulent noise without full vibration of the vocal folds. Its voiced counterpart, [ɦ], involves similar glottal narrowing but with partial vocal fold vibration, producing a breathy quality; this occurs phonemically in Hindi, as in हवा (havā, "air").

Production of these consonants relies on the adduction of the arytenoid cartilages, which rotate to bring the vocal folds together, achieving glottal closure for the stop or a constricted opening for the fricative. The duration of the closure typically ranges from 50 to 100 ms, sufficient to perceptually distinguish it as a consonantal event without excessive pause in speech flow.

In phonemic roles, glottal stops serve as contrasts in tone languages, such as Vietnamese, where pre-stopping with [ʔ] before certain implosive consonants like [ʔɓ] and [ʔɗ] helps delineate syllable boundaries and tonal features. In English, the glottal stop functions as an allophone of /t/ in t-glottalization, particularly in syllable-final positions, as in "button" pronounced [ˈbʌʔn], a common variant in many dialects that does not alter word meaning but reflects casual speech patterns.

Specialized Phonation Types

Supraglottal Phonation

Supraglottal phonation refers to the production of sound through the vibration of laryngeal structures superior to the glottis, primarily the ventricular folds (also called false vocal folds), which are paired mucosal folds located above the true vocal folds in the larynx. These folds, separated from the true vocal folds by the laryngeal ventricle, normally assist in lubrication and air humidification but can vibrate under specific conditions to generate voice, resulting in qualities such as a growl. This mode contrasts with typical glottal phonation by involving supraglottal tissues, often in conjunction with true vocal fold activity, and is observed in both pathological and intentional vocalizations.

The mechanism of supraglottal phonation involves adduction of the ventricular folds due to elevated supraglottal pressure or laryngeal muscle tension, leading to their approximation and subsequent vibration driven by aerodynamic forces from airflow. In this mode, the folds co-oscillate irregularly with the true vocal folds, often exhibiting aperiodic or periodic motions that are aerodynamically coupled, with vibration amplitudes sufficient to influence glottal airflow. Such vibration typically requires higher phonation threshold pressures, around 16-20 cmH₂O, and is common in high-intensity vocalizations, where increased intraoral pressure facilitates fold closure.

Acoustically, supraglottal phonation produces a harsh timbre characterized by irregular vibrations, resulting in elevated jitter and shimmer, reduced harmonic-to-noise ratios, and the presence of subharmonics in spectrograms due to period-doubling or asynchronous motions. Fundamental frequencies often range lower, around 50-100 Hz in growl-like productions, though they can align with or differ from true vocal fold frequencies by integer ratios, adding roughness and high-frequency components (2-2.5 kHz).

Culturally, supraglottal phonation appears in non-Western traditions such as Tuvan throat singing, where the kargyraa style employs ventricular fold vibration to create a low, rumbling undertone at approximately half the frequency of the true vocal folds, enhancing the biphonic effect. In Western metal vocals, it is intentionally used for growl effects through controlled supraglottic narrowing and ventricular engagement, allowing sustained harsh timbres without excessive strain. Pathologically, it can manifest as diplophonia, where asynchronous vibrations of true and false folds produce dual pitches.
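The integer-ratio relationship between true and ventricular fold vibration can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the 130 Hz starting pitch are hypothetical, and the integer-ratio coupling is an idealization of what the text describes as approximate.

```python
import math


def subharmonic_hz(true_fold_f0_hz: float, ratio: int = 2) -> float:
    """Frequency of a subharmonic at 1/ratio of the true vocal fold frequency."""
    if true_fold_f0_hz <= 0 or ratio < 1:
        raise ValueError("frequency must be positive and ratio a positive integer")
    return true_fold_f0_hz / ratio


def interval_semitones(lower_hz: float, upper_hz: float) -> float:
    """Musical interval between two frequencies in equal-tempered semitones."""
    return 12.0 * math.log2(upper_hz / lower_hz)


# Hypothetical example: true folds at 130 Hz, ventricular folds at half that rate.
true_f0 = 130.0
undertone = subharmonic_hz(true_f0, ratio=2)
print(f"undertone: {undertone:.1f} Hz, "
      f"{interval_semitones(undertone, true_f0):.1f} semitones below the true-fold pitch")
```

A 1:2 ratio places the undertone exactly one octave (12 semitones) down; a 1:3 ratio would place it an octave and a fifth down, which is one way to see why different coupling ratios yield audibly different growl qualities.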

Vocal Registers

Vocal registers refer to distinct modes of vocal fold vibration that produce different perceptual qualities and pitch ranges in the human voice. These modes arise from variations in the effective mass, length, and tension of the vocal folds, primarily controlled by the coordinated action of laryngeal muscles such as the cricothyroid (CT) and thyroarytenoid (TA).

The modal register, also known as the chest register, involves vibration of thick vocal folds with substantial medial surface contact, where the TA muscle dominates to increase fold mass and ensure robust closure. This register typically spans the lowest comfortable pitch range, with fundamental frequencies up to approximately 300-400 Hz, and is the primary register used in everyday speech due to its efficient energy transfer and strong glottal airflow pulse.

In contrast, the falsetto register, or head register, features thinned vocal folds with reduced mass and lighter closure, achieved through greater CT muscle activation that elongates and stiffens the folds while minimizing TA involvement. This results in higher fundamental frequencies, generally ranging from 400-800 Hz, with incomplete glottal closure leading to a breathier tone. The physiological adjustments in CT tension and reduced effective vibrating mass allow for extension beyond the modal range, though with less intensity. Transitions between the modal and falsetto registers, known as passaggi or register breaks, occur at specific points where abrupt changes in fold configuration cause perceptual shifts, often around 300-500 Hz depending on the individual.

The whistle register represents the highest vibrational mode, characterized by edge-only vibration of the vocal fold margins with minimal mass involvement, facilitated by extreme CT tension and possible posterior cricoarytenoid assistance. This mode enables fundamental frequencies above 1000 Hz, particularly in trained sopranos reaching up to 2000 Hz, producing a flute-like, piercing sound. Physiologically, it involves maximal fold elongation and reduced contact area, often with raised laryngeal positioning.

Acoustically, vocal registers differ in spectral properties: the modal register shows moderate spectral tilt and harmonic clustering near the first formant for a fuller sound; falsetto exhibits steeper spectral tilt with fewer harmonics; and whistle displays tight harmonic clustering at high frequencies alongside pronounced tilt due to its thin, high-frequency vibration. These acoustic variations stem from differences in glottal flow and fold collision patterns across registers.

Linguistic Applications

Phonological Roles

Phonation serves as a fundamental phonological feature in many languages, most notably through voicing, which establishes a binary contrast between voiced and voiceless obstruents such as stops and fricatives. This contrast, often denoted as [±voice] in feature geometry, distinguishes minimal pairs like the voiceless bilabial stop /p/ and its voiced counterpart /b/ in English, enabling speakers to signal lexical differences through laryngeal state alone. The realization of this feature relies on precise timing of vocal fold vibration relative to oral articulation, with voice onset time (VOT) providing a key acoustic measure: voiceless stops typically exhibit long-lag positive VOT (e.g., 50-100 ms for /p/), while voiced stops show prevoicing (negative VOT) or short-lag positive VOT (0-20 ms). Such contrasts are governed by phonological rules, including assimilation processes where adjacent segments influence voicing, as seen in obstruent clusters that neutralize distinctions to maintain perceptual clarity.

In tonal languages, phonation registers extend beyond binary voicing to create multidimensional contrasts, where breathy or creaky voice functions as a phonemic category that interacts with pitch contours. Breathy phonation, characterized by incomplete glottal closure and turbulent airflow, often pairs with lower or falling tones, while creaky voice, involving irregular vocal fold vibration and glottal constriction, aligns with high or checked tones to form distinct phonemes. In the Yi language, for example, breathy voice marks mid-falling tones and creaky voice distinguishes high-falling tones, allowing these phonation types to bear lexical load independently of segmental features. These registers enhance tonal inventories by adding laryngeal dimensions, with perceptual cues like spectral tilt and harmonic-to-noise ratio differentiating them reliably across speakers.

Glottalization exemplifies phonation's role in airstream mechanisms, particularly in ejective consonants produced via glottalic egressive airflow. During ejective articulation, as in [pʼ], the glottis closes tightly along with the oral closure, trapping air between the two closures; raising the larynx then builds supraglottal pressure for an explosive release without voicing. This phonemic use of glottal closure contrasts ejectives with pulmonic stops in phonological systems, serving as a place-neutral feature that expands consonant inventories in many languages, including Native American languages.

Over historical time, phonation contrasts like voicing can erode through lenition, a weakening process that simplifies phonological systems by reducing articulatory effort. Lenition often targets voicing in intervocalic or post-vocalic positions, leading to devoicing, spirantization, or complete loss of the contrast, as consonants assimilate to adjacent sonorants or lose strength. Such developments, driven by perceptual and aerodynamic factors, result in inventory mergers (for example, voiced stops becoming fricatives or voiceless), altering historical phonologies without external borrowing, as evidenced in Indo-European branches where initial voicing persisted but medial contrasts neutralized. These shifts highlight phonation's vulnerability to gradual systemic change, prioritizing ease of articulation over contrast maintenance in evolving languages.

Cross-Linguistic Examples

In German, glottal reinforcement manifests as a glottal stop [ʔ] inserted before word-initial vowels to demarcate syllable boundaries, as in ʔAbend ('evening'), enhancing clarity in speech. This feature is a standard prosodic marker in Standard German, distinguishing it from languages without such reinforcement. In Danish, the stød represents a laryngealized or creaky phonation type, characterized by irregular vocal fold vibration and low pitch on stressed syllables in certain monosyllabic or bisyllabic words, such as hus ('house') with stød versus husene ('the houses') without. This non-modal phonation serves as a prosodic contrast, often resulting in a glottal constriction or creaky voice quality that differentiates lexical items.

Among non-European languages, White Hmong employs creaky voice as part of its register tone system, where the low-falling tone (-m) features creaky phonation with slow, irregular vocal fold vibrations, contrasting with breathy or modal tones, for example in pom (low-falling, creaky) versus pos (low, modal). In Tuvan throat singing, the kargyraa style utilizes ventricular phonation, where the ventricular folds (false vocal folds) vibrate to produce a subharmonic undertone approximately one octave below the modal voice, creating a deep, rumbling quality distinct from standard glottal phonation. Quechua languages, such as Cuzco Quechua, incorporate ejective consonants like [pʼ], [tʼ], and [kʼ], produced with glottalic egressive airflow involving simultaneous closure of the glottis and oral articulation, as in pʼaqcha ('split'), which contrasts with pulmonic stops.

Among Asian and African languages, some feature implosive consonants such as [ɓ], [ɗ], and [ɠ], which involve glottal closure followed by ingressive airflow on release, as in ɓakhU (implosive) versus bakhU (voiced), distinguishing them from voiced stops. Murmured (breathy-voiced) consonants, denoted as [bʱ] and [dʱ], occur where breathy phonation spreads from the consonant to adjacent vowels, producing turbulence, as in bʱaːɾ ('outside') contrasting with baːɾ ('load').

Acoustic analyses reveal phonation contrasts through measures like voice onset time (VOT). In Spanish, voiced stops exhibit prevoicing with negative VOT values around -100 ms to -40 ms, while voiceless stops show short-lag VOT, generally below about 30 ms, as in bala (voiced [b]) versus pala (voiceless [p]). In Jalapa Mazatec, vocal registers combine with tones, where modal, breathy, and creaky phonations produce distinct spectrographic patterns: creaky voice shows irregular pulses and low fundamental frequency (F0) with sparse harmonics, breathy voice features turbulent noise and steeper spectral tilt, and modal voice maintains steady periodicity, as seen when comparing tones such as high modal and low creaky. These acoustic signatures underscore how phonation contributes to phonological contrasts across languages.
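The VOT categories used in these comparisons can be expressed as a simple measurement and labeling step. The sketch below is a hedged illustration: the function names are hypothetical, the event times are assumed to have been annotated already (for example, by hand from a waveform), and the 30 ms boundary between short-lag and long-lag is an assumed cut-off placed between the 0-20 ms and 50-100 ms ranges cited earlier, not an established standard.

```python
def voice_onset_time_ms(burst_release_ms: float, voicing_onset_ms: float) -> float:
    """VOT is the time from stop release to the start of vocal fold vibration.

    Negative values mean voicing began before the release (prevoicing).
    """
    return voicing_onset_ms - burst_release_ms


def classify_vot(vot_ms: float, short_lag_max_ms: float = 30.0) -> str:
    """Three-way VOT category; the short-lag ceiling is an assumed boundary."""
    if vot_ms < 0:
        return "prevoiced"
    if vot_ms <= short_lag_max_ms:
        return "short-lag"
    return "long-lag"


# Hypothetical annotated tokens: (label, burst release time, voicing onset time), in ms.
tokens = [
    ("prevoiced stop", 100.0, 20.0),
    ("unaspirated stop", 100.0, 112.0),
    ("aspirated stop", 100.0, 170.0),
]
for label, burst, voicing in tokens:
    vot = voice_onset_time_ms(burst, voicing)
    print(f"{label}: VOT = {vot:+.0f} ms -> {classify_vot(vot)}")
```

Because languages place their category boundaries differently, the `short_lag_max_ms` parameter is left adjustable rather than fixed.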

Clinical and Educational Contexts

Pedagogical Approaches

Vocal pedagogy employs a variety of exercises to enhance phonation control, particularly in managing vocal registers for seamless transitions. The siren exercise, involving a smooth glide across the vocal range on a single sound, promotes register blending by encouraging gradual shifts between chest and head registers without abrupt breaks. Similarly, straw phonation, where singers vocalize through a narrow straw, reduces laryngeal tension by creating backpressure that balances vocal fold adduction and airflow, facilitating easier phonation and register coordination. Scales targeting the passaggi, the transitional zones between registers, are commonly used to build evenness; for instance, ascending and descending arpeggios on neutral sounds like [ŋ] help singers navigate these areas while maintaining consistent tone and avoiding flips.

In speech training, pedagogical approaches focus on refining voicing contrasts to aid pronunciation, especially for English as a second language (ESL) learners. Voice onset time (VOT) drills, which involve timed repetitions of minimal pairs like "pat" versus "bat" to shorten or lengthen the interval between stop release and voicing onset, improve the distinction between voiced and voiceless stops, reducing perceived foreign accent. These exercises often incorporate visual or auditory feedback to monitor progress, helping learners achieve more native-like phonation patterns in speech.

Historical methods in vocal pedagogy, such as those from the Bel canto tradition of the 18th and 19th centuries, emphasize achieving even registration across the full range through balanced breath support and resonance adjustment. Bel canto techniques, as described in treatises by pedagogues like Manuel Garcia, involve gradual scale work and portamento to unify chest, middle, and head registers, preventing discontinuities in tone quality. In modern adaptations, biofeedback tools like the VoceVista software provide real-time spectrographic visualization of phonation, allowing singers to observe formant tuning and register shifts during exercises, thereby enhancing self-correction in training sessions.

Recent developments since 2020 have integrated artificial intelligence into vocal coaching, offering precise feedback on pitch accuracy and register management. AI-driven apps, such as those employing machine learning for real-time analysis, evaluate phonation quality by detecting deviations in pitch and harmonic structure, providing personalized exercises to refine transitions between vocal registers. As of 2025, studies indicate these tools enhance student engagement and vocal skills, including performance outcomes.

Speech Pathology Considerations

Speech pathology considerations in phonation focus on identifying, diagnosing, and managing disorders that impair vocal fold vibration and glottal closure, leading to dysphonia or altered voice quality. Common disorders include vocal nodules, which arise from vocal misuse or overuse and result in breathy phonation due to incomplete glottal closure and tissue swelling on the vocal folds. Spasmodic dysphonia, a neurological condition, causes irregular glottal closure through involuntary spasms of the laryngeal muscles, producing a strained voice or voice breaks. Presbylaryngis, associated with aging, involves thinning and bowing of the vocal folds, reducing their mass and elasticity, which leads to a weak, breathy, or tremulous voice.

Diagnostic approaches emphasize multimodal assessment to evaluate phonatory function. Laryngoscopy, whether flexible or rigid, provides direct visualization of vocal fold structure and movement during phonation, identifying lesions or irregular vibration patterns. Acoustic analysis measures perturbations in the voice signal: jitter (cycle-to-cycle frequency variation) exceeding 1.04% and shimmer (cycle-to-cycle amplitude variation) above 3.81% often indicate hoarseness or dysphonia. Electroglottography (EGG) records a waveform that reflects vocal fold contact by measuring changes in electrical impedance across the larynx, revealing abnormalities in the closure phase of the vibratory cycle.

Treatment strategies are tailored to the underlying cause, combining behavioral, medical, and surgical interventions. Voice therapy, such as the resonant voice technique, promotes optimal vocal fold vibration through forward focus and reduced laryngeal tension, yielding significant improvements in voice handicap scores and perceptual quality. For unilateral vocal fold paralysis, medialization laryngoplasty surgically repositions the fold toward the midline using implants, enhancing glottal closure and phonatory efficiency. Botulinum toxin (Botox) injections into hyperactive laryngeal muscles effectively reduce spasms in spasmodic dysphonia, with patients reporting substantial voice improvement lasting 3 to 4 months per treatment. Clinical studies report that 70-80% of patients achieve meaningful gains in voice quality after voice therapy, though outcomes vary with adherence and disorder chronicity.

Recent research from the 2020s highlights emerging challenges in phonation disorders. Long COVID-associated dysphonia affects 19-28% of infected individuals in recent cohorts and persists in up to 70% of cases, with symptoms like hoarseness that have been attributed in part to laryngeal neuropathy. Sex differences show higher prevalence of voice disorders among females (14.4%) than males (10.0%), potentially linked to hormonal influences on vocal fold tissue and occupational voice demands.
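The jitter and shimmer measures mentioned in the diagnostic discussion can be sketched in a few lines. The example below assumes the common "local" (relative) definition: the mean absolute difference between consecutive cycles divided by the mean value, expressed as a percentage. The function names and sample data are hypothetical, and the cutoffs are simply the 1.04% and 3.81% figures quoted in this section, not a clinical decision rule.

```python
from statistics import mean

JITTER_THRESHOLD_PCT = 1.04   # cutoff quoted in the text
SHIMMER_THRESHOLD_PCT = 3.81  # cutoff quoted in the text


def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def assess_voice(periods_s, amplitudes):
    """Compute jitter from cycle periods and shimmer from cycle peak amplitudes."""
    jitter = relative_perturbation(periods_s)
    shimmer = relative_perturbation(amplitudes)
    return {
        "jitter_pct": jitter,
        "shimmer_pct": shimmer,
        "jitter_elevated": jitter > JITTER_THRESHOLD_PCT,
        "shimmer_elevated": shimmer > SHIMMER_THRESHOLD_PCT,
    }


# Hypothetical measurements: four glottal cycles near 100 Hz
result = assess_voice(
    periods_s=[0.0100, 0.0102, 0.0099, 0.0101],
    amplitudes=[1.00, 0.98, 1.01, 0.99],
)
```

With these made-up values, jitter comes out near 2.3% (above the quoted cutoff) and shimmer near 2.3% (below it). In practice the periods and amplitudes would be extracted from a sustained vowel recording, and results depend on the extraction algorithm and the number of cycles analyzed.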
