
Auditory illusion

Auditory illusions are perceptual distortions in which the brain interprets acoustic stimuli in a manner that deviates from their physical properties, often reorganizing or filling in sensory information to create a coherent but inaccurate auditory experience. These phenomena arise from the interplay of bottom-up sensory processing and top-down cognitive influences, demonstrating how the auditory system groups sounds based on principles like similarity, proximity, continuity, and common fate. Auditory illusions provide critical insights into human sound perception, revealing the brain's active role in constructing auditory scenes rather than passively receiving input. They occur in everyday listening, as in the phonemic restoration effect, where noise masks speech yet the mind reconstructs the missing phonemes, and they have applications in psychological research, sound design, and music composition that explore perceptual limits. Unlike hallucinations, which lack external stimuli, auditory illusions stem from real sounds but highlight vulnerabilities in perceptual organization, influenced by factors such as attention, handedness, and linguistic background. First systematically studied in the mid-20th century, particularly through the work of researchers such as Roger Shepard and Diana Deutsch in the 1960s–1970s, these illusions continue to inform research on auditory processing disorders, cognitive models, and emerging applications in AI and virtual reality as of 2025.

Overview

Definition

An auditory illusion is a misinterpretation of auditory stimuli by the brain, resulting in the perception of sound characteristics—such as pitch, location, or continuity—that differ from the actual physical properties of the sound waves presented. Unlike accurate auditory perception, which faithfully represents the incoming acoustic signals, illusions arise when the auditory system reorganizes or alters the sound information to form a coherent percept, often prioritizing perceptual coherence over literal fidelity to the stimulus. Key characteristics of auditory illusions include their subjectivity, context-dependence, and reproducibility under controlled conditions. Subjectivity manifests in individual variations in perception, influenced by factors such as linguistic background or cognitive biases, leading different listeners to experience the same stimulus differently. Context-dependence means that the illusion's strength or occurrence relies on surrounding auditory cues or environmental factors, which shape how the brain interprets ambiguous signals. Reproducibility ensures that, despite variability, the illusions can be reliably elicited in most individuals when specific stimulus parameters are met, making them valuable tools for studying auditory perception. Auditory illusions exploit the auditory system's predictive processing, a mechanism whereby the brain generates expectations based on prior knowledge and sensory context to anticipate and fill in incomplete or ambiguous auditory input. This process, akin to predictive coding frameworks, allows the system to maintain a stable auditory scene by inferring plausible continuations or resolutions, but it can lead to perceptual mismatches when predictions override the actual stimulus. Similar to visual illusions, these phenomena highlight how sensory processing is an active construction rather than passive reception.

Historical Context

The study of auditory illusions traces its roots to early 19th-century investigations into binaural hearing, which laid foundational observations for understanding perceptual discrepancies in audition. Charles Wheatstone, known primarily for his work on stereoscopic vision, contributed to auditory research through experiments demonstrating that sounds are perceived more intensely in the occluded ear, highlighting basic binaural effects that foreshadowed later illusion studies. These efforts were part of a broader physiological inquiry into sensory integration, influenced by the era's advancements in acoustics and physiology, though systematic exploration of auditory deceptions remained limited until the mid-20th century. A pivotal advancement occurred in 1964 when psychologist Roger Shepard introduced the concept of endlessly rising tones, now known as Shepard tones, through computer-generated stimuli that created an illusion of continuous pitch ascent without resolution. Published in the Journal of the Acoustical Society of America, Shepard's work demonstrated circularity in judgments of relative pitch, marking a shift toward experimental psychology's use of synthesized sounds to probe perceptual ambiguities and influencing subsequent research on auditory illusions. Post-2000 developments integrated neuroimaging techniques to elucidate the neural underpinnings of auditory illusions, confirming brain-level involvement beyond peripheral mechanisms. Functional magnetic resonance imaging (fMRI) studies in the 2010s, for instance, revealed heightened activity in the auditory cortex during illusions of sound location, such as those induced by interaural level differences, underscoring the role of cortical processing in spatial misperceptions. Similarly, fMRI investigations of the ventriloquism effect demonstrated visual dominance over auditory localization in spatial perception, with activations in associated cortical regions. Historical perspectives on auditory illusions also extend to ancient non-Western contexts, where acoustic phenomena in natural and built spaces were interpreted through cultural lenses, often evoking supernatural explanations. In prehistoric Native American sites, such as Utah's canyon locations, sound reflections and ricochets created illusory whispers or echoes that aligned with pictorial motifs of spirits, suggesting early recognition of auditory deceptions in built environments. These observations highlight a gap in Western-centric narratives, as similar effects in non-European architectural designs influenced ritualistic and artistic expressions long before formal scientific study.

Mechanisms

Physiological Basis

Auditory illusions originate at the peripheral level through the cochlea's frequency-selective processing, where inner hair cells transduce mechanical vibrations into neural signals via stereocilia deflection, establishing a tonotopic map along the basilar membrane. This organization allows precise detection of frequencies, but ambiguities arise when stimuli produce overlapping or nonlinear interactions, such as distortion products from concurrent tones that cochlear filters cannot fully resolve, leading to perceptual misrepresentations of pitch or timbre. These peripheral ambiguities propagate centrally, where incomplete frequency separation contributes to illusory qualities by exploiting the limits of frequency tuning sharpness. Central processing involves the auditory pathway, beginning with the auditory nerve (cranial nerve VIII) relaying signals from hair cells to the cochlear nucleus in the brainstem, followed by projections to the superior olivary complex for binaural integration, the inferior colliculus in the midbrain, the medial geniculate nucleus in the thalamus, and ultimately the primary auditory cortex in the temporal lobe. Within the cortex, particularly the auditory cortex, top-down processing modulates these signals through corticofugal feedback loops, incorporating prior expectations to resolve ambiguities in complex or noisy environments via predictive coding mechanisms. This hierarchical integration enhances perceptual accuracy but can generate illusions when top-down influences override or bias bottom-up inputs, as seen in delayed reconciliation of predictions in frontal-auditory interactions. Many auditory illusions stem from failures in cross-modal integration, where auditory signals are inappropriately biased by non-auditory cues, leading to mismatched multisensory representations. Recent studies in animal models during the 2020s, including 2024 research decoding contextual influences on auditory perception from primary auditory cortex activity, have causally demonstrated these effects by selectively activating or silencing cortical circuits, revealing how specific neuronal ensembles generate illusion-like perceptual alterations, such as modified sound predictions in cross-modal contexts. As of 2025, studies using auditory illusory models as proxies continue to investigate bottom-up and top-down neural networks underlying phantom perceptions like tinnitus. These findings highlight the auditory cortex's role in bridging peripheral inputs and higher-order interpretation, with implications for understanding pathological phantom perceptions such as tinnitus.

Perceptual Processes

Auditory illusions often arise from the perceptual system's reliance on organizational principles akin to those of visual Gestalt psychology, adapted to the temporal and spectral dimensions of sound. In auditory streams, grouping by proximity organizes sounds based on their temporal closeness, where successive tones or events separated by short intervals are perceived as belonging to a single coherent stream rather than separate entities. Similarly, grouping by similarity merges sounds sharing spectral characteristics, such as timbre or frequency range, leading to illusory fusions or segregations; for instance, harmonic sounds with matching frequencies may be heard as a unified source despite physical discontinuities. These principles, extended from the visual to the auditory domain, explain why ambiguous acoustic inputs can yield stable yet illusory percepts, as the brain imposes structure to resolve ambiguity. The role of expectation and attention further modulates these processes through a Bayesian inference framework, in which prior knowledge from experience biases the interpretation of sensory input. In this model, the brain generates hypotheses about likely sound sources based on learned probabilities and updates them with incoming data, often favoring interpretations that align with contextual expectations over raw acoustic evidence. This can produce illusions when priors override veridical cues, such as expecting a continuous tone or voice in noisy environments, resulting in filled-in gaps or misattributed sources. Attention selectively amplifies relevant streams, enhancing grouping by proximity or similarity while suppressing alternatives, thereby shaping the illusory outcome. A foundational concept for understanding these perceptual mechanisms is auditory scene analysis, as articulated by Albert Bregman, which describes how the auditory system segregates complex sound mixtures into perceptual streams, with illusions emerging from failures in this segregation. Stream segregation relies on cues like common fate (synchronized changes) or harmonicity, but when these conflict—such as in rapid alternations between tones—percepts may erroneously integrate or split, creating phantom continuities or fragmented illusions. Bregman's framework distinguishes primitive grouping (automatic, cue-based) from schema-based grouping (top-down, expectation-driven), and lapses in either can lead to misperceptions. Cultural influences introduce variability in these perceptual biases, particularly in pitch processing, where speakers of tone languages exhibit distinct sensitivities compared to non-tonal language users. For example, tone-language speakers, accustomed to using pitch for lexical distinction, show reduced susceptibility to certain pitch-related illusions, such as the speech-to-song effect, in which repetitive speech fragments transform into song-like melodies; the illusion weakens because prosodic structures are perceived as linguistic rather than musical, resisting the perceptual shift. Similarly, the tritone paradox—an ambiguity in perceiving ascending or descending tritones—varies by linguistic background, with tone-language speakers demonstrating altered directional judgments influenced by the pitch contours of their native spoken language. These differences underscore how cultural-linguistic experience tunes Bayesian priors, affecting illusion proneness in auditory pitch perception.
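The precision-weighted combination at the core of such Bayesian accounts can be illustrated numerically. The following Python sketch (the function name and the pitch values are hypothetical, chosen only for illustration) shows how a noisy sensory estimate is pulled toward a strong prior expectation, the kind of bias that can produce an illusory percept when the input is ambiguous.

```python
import numpy as np

def fuse_percept(sensory_mean, sensory_sd, prior_mean, prior_sd):
    """Precision-weighted (reliability-weighted) fusion of a noisy sensory
    estimate with a learned prior, as in Bayesian models of perception."""
    w_s = 1.0 / sensory_sd ** 2            # precision of the sensory evidence
    w_p = 1.0 / prior_sd ** 2              # precision of the prior expectation
    mean = (w_s * sensory_mean + w_p * prior_mean) / (w_s + w_p)
    sd = np.sqrt(1.0 / (w_s + w_p))
    return mean, sd

# A noisy pitch estimate (hypothetical values, in Hz) is drawn strongly toward
# a much more precise prior, so the reported percept sits near the expectation
# rather than the physical input.
print(fuse_percept(sensory_mean=440.0, sensory_sd=50.0,
                   prior_mean=392.0, prior_sd=10.0))
```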

Types

Pitch and Frequency Illusions

Pitch illusions represent a category of auditory illusions in which the perceived height of a sound, known as pitch, deviates from its actual physical frequency. Pitch is formally defined as the auditory sensation that allows sounds to be ordered on a scale from low to high, independent of other attributes like loudness or timbre. These illusions arise because human pitch perception is not a direct linear mapping of frequency but involves complex psychoacoustic processing influenced by contextual cues. Key types of pitch illusions include glissando illusions and those based on ambiguous tones. In the glissando illusion, a tone of constant pitch is presented simultaneously with a gliding (glissando) tone separated spatially via stereo speakers; listeners often perceive the stationary tone as rising or falling in pitch following the glissando's trajectory, due to the brain's integration of temporal and spatial auditory cues. This effect, first demonstrated by Diana Deutsch in 1995, highlights how proximity in auditory space can override frequency constancy to create illusory pitch motion. Ambiguous tones exploit the perceptual principle of octave equivalence, where tones separated by an octave (a frequency ratio of 2:1) are treated as equivalent in pitch class despite differing in absolute frequency, enabling circular representations of pitch. For instance, in the tritone paradox, pairs of synthesized tones related by a half-octave (tritone) interval are presented; the perceived direction of pitch change—ascending or descending—varies systematically across listeners based on linguistic and cultural factors, such as the tonal structure of the spoken language. This ambiguity stems from the brain's reliance on learned pitch hierarchies rather than raw frequency differences. Psychoacoustic scaling reveals why such illusions occur: perceived pitch does not scale linearly with physical frequency, as higher frequencies require larger changes in hertz for equivalent perceptual steps compared to lower ones. The mel scale, developed to quantify this nonlinearity, approximates perceived pitch by transforming frequency f (in Hz) into mels via the formula \text{mel}(f) = 2595 \log_{10} (1 + f / 700), aligning better with subjective experience than linear measures. This scale underscores the logarithmic compression in auditory processing, where equal mel intervals correspond to roughly equal perceived pitch differences. In the 2020s, sound synthesis research has increasingly employed generative models to create and study novel illusions, simulating complex interactions that push perceptual boundaries beyond traditional stimuli. These approaches, including neural networks trained on psychoacoustic data, generate ambiguous tone sequences that elicit variable pitch perceptions, aiding investigations into auditory perception.
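The mel transformation given above is easy to compute directly. The short Python sketch below (the function name and example frequencies are arbitrary choices) applies the formula and shows that an equal step in hertz corresponds to progressively fewer mels at higher frequencies, consistent with the compressive scaling described here.

```python
import math

def hz_to_mel(f_hz):
    """Mel value for a frequency in Hz: mel(f) = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

# An equal 100 Hz increment adds progressively fewer mels as the base
# frequency rises, reflecting the compressive frequency-to-pitch mapping.
for f in (100, 500, 1000, 2000, 4000):
    step = hz_to_mel(f + 100) - hz_to_mel(f)
    print(f"{f:5d} Hz: +100 Hz adds {step:5.1f} mel")
```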

Spatial and Localization Illusions

Spatial and localization illusions occur when the auditory system misperceives the position, direction, or motion of a sound source in space, often due to ambiguities or overrides in the primary cues used for localization. Humans rely on interaural time differences (ITD), the slight delay in arrival between the ears (typically 10–700 μs depending on azimuth), and interaural level differences (ILD), the disparity caused by the head's shadowing effect (up to about 20 dB at high frequencies for lateral sources), to estimate azimuth. ITD is most effective for low-frequency sounds below 1,500 Hz, where phase differences allow precise timing extraction, while ILD dominates for high frequencies above 4,000 Hz due to acoustic head shadowing. These cues fail in illusions when frequency content falls in the transitional 1,500–4,000 Hz range, where neither provides reliable information, or in reverberant environments that introduce conflicting reflections, leading to ambiguous localization on the "cone of confusion" (an approximately cone-shaped surface on which identical ITD/ILD values correspond to multiple positions). The precedence effect exemplifies how ITD and ILD cues can be overridden by temporal arrival order, causing echoes to be perceptually fused with, and localized to, the first-arriving direct sound rather than their true position. In this illusion, when a lead sound is followed by a lag (echo) within 1–10 ms, the brain suppresses the lag's spatial cues, attributing the entire auditory event to the lead's direction; this enhances localization accuracy in everyday reverberant spaces by preventing "ghost" sources from reflections. Seminal experiments with click pairs showed localization dominance persisting beyond fusion thresholds (4–7 ms), with neural correlates in the inferior colliculus inhibiting lag responses up to 10 ms in animal models. The ventriloquism effect demonstrates a cross-modal failure of auditory localization cues, where visual stimuli bias perceived position toward the visual source, overriding ITD/ILD by up to 90% of the spatial misalignment. This occurs through near-optimal Bayesian integration of sensory inputs, with vision weighted more heavily due to its superior spatial acuity (error <1° vs. auditory ~10°), such that when a sound and an incongruent visual stimulus coincide temporally, the brain computes a fused estimate closer to the visual location. Functional MRI studies confirm this visual capture in cortical regions, with the effect diminishing when visual reliability decreases (e.g., via blurring). Recent applications in virtual and augmented reality (VR/AR) leverage these illusions for immersive spatial audio, exploiting manipulated ITD/ILD and precedence cues to create synthetic soundscapes that enhance presence and navigation. Post-2020 studies show that adding co-localized auditory cues in VR homing tasks improves spatial updating accuracy by 20–30% over visual-only conditions, while reverberation simulations in AR induce precedence-like illusions to mimic real-room acoustics, aiding distance perception despite cue conflicts. Technologies like higher-order ambisonics further enable dynamic illusions, such as virtual sound motion overriding physical echoes, with evaluations in head-mounted displays confirming heightened immersion without disorientation.
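To illustrate how the ITD cue varies with source direction, the sketch below uses Woodworth's spherical-head approximation; the head radius, speed of sound, and function name are illustrative assumptions rather than values taken from the studies cited above.

```python
import math

def itd_woodworth(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Approximate interaural time difference for a distant source using
    Woodworth's spherical-head model: ITD = (r / c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# ITD grows from zero at the midline to several hundred microseconds at the
# side of the head, consistent with the range quoted above.
for az in (0, 15, 45, 90):
    print(f"{az:3d} deg -> {itd_woodworth(az) * 1e6:6.1f} µs")
```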

Temporal and Continuity Illusions

Temporal and continuity illusions disrupt the auditory system's processing of sound timing, rhythm, and continuity, often compelling listeners to perceive unbroken sequences amid interruptions or ambiguities in the input signal. These illusions arise within broader perceptual processes that prioritize scene coherence, such as auditory stream segregation and perceptual restoration, where the brain infers missing information to resolve incomplete auditory scenes. Disruptions in temporal processing can lead to misperceptions of duration, rhythmic organization, or seamless sound progression, highlighting the interplay between sensory input and cognitive expectations. The continuity illusion, also termed auditory induction, exemplifies how sounds are perceived as uninterrupted despite containing silent gaps masked by concurrent noise. In this effect, a tone interrupted by a brief noise burst—one that matches the tone's frequency region and exceeds a critical intensity threshold—triggers the auditory system to "fill in" the gap, restoring the tone as perceptually continuous; this occurs because the noise suppresses neural offset responses to the tone while providing excitatory drive that sustains activity in the relevant neural populations. Seminal demonstrations showed that such restoration applies even to speech, where missing phonemes obscured by noise are perceptually reinstated based on contextual cues. Neural models attribute this to bistable states in auditory cortical circuits, where recurrent excitation maintains activity during masking, as evidenced by sustained responses in primary auditory cortex during illusory continuity. Recent computational simulations confirm that sustained activity in neural populations, combined with masking of onset and offset transients, underlies the illusion's dynamics. Rhythmic grouping errors occur when listeners erroneously organize beats in isochronous sequences, such as metronome-like auditory pulses, particularly under perturbations that subtly alter timing without altering the overall tempo. In sensorimotor synchronization tasks, small subliminal shifts (e.g., 0.8–2% of the inter-onset interval) in a sequence prompt rapid corrections in tapping responses, but can induce perceptual illusions where the beat appears to shift or regroup, as the listener integrates the perturbation into the ongoing temporal pattern via attentional monitoring. These errors persist across various response modes, including antiphase or interrupted tapping, suggesting that perceptual oscillators detect deviations below awareness, leading to illusory alignments or drifts. Such grouping misperceptions reveal the auditory system's reliance on local timing adjustments over global recalibration, with full correction requiring multiple successive perturbations. The tau effect in audition manifests as a bias in which longer durations between successive tones lead to overestimation of the perceived separation, or spatial extent, between them. This temporal-spatial interaction, known as the auditory tau effect when time distorts perceived space, demonstrates how extended intervals bias judgments of distance, with listeners reporting greater separation for longer inter-tone durations despite fixed physical differences. Experimental evidence using three-tone sequences shows systematic distortions in separation-judgment tasks, supporting models in which imputed velocity influences spatiotemporal binding. This underscores the auditory modality's susceptibility to cross-dimensional influences, analogous to visual tau illusions. Research from the 2020s has increasingly explored temporal illusions in neurodiverse populations, such as those with autism spectrum disorder, revealing atypical processing that limits susceptibility to certain auditory timing distortions.
Autistic adults exhibit reduced sensitivity to audiovisual asynchronies in simultaneity judgments, particularly for complex social stimuli like speech or rhythmic actions, resulting in wider temporal binding windows and higher error rates for auditory-leading trials. Neuroimaging studies indicate altered neural synchrony in superior temporal regions during audiovisual temporal processing, contributing to diminished illusory effects in autism. These findings suggest that enhanced local processing in autism may impair global temporal coherence, with implications for understanding sensory integration deficits beyond typical populations.
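The classic continuity-illusion stimulus described earlier in this section is straightforward to synthesize. The Python sketch below (sampling rate, durations, and noise gain are illustrative assumptions) builds a pure tone with a silent gap and fills the gap with a louder noise burst, the configuration under which listeners typically report the tone as unbroken.

```python
import numpy as np

def continuity_stimulus(fs=44100, tone_hz=1000.0, dur_s=1.0,
                        gap_start_s=0.4, gap_dur_s=0.2, noise_gain=4.0):
    """Pure tone with a silent gap that is filled by louder broadband noise;
    with sufficiently intense noise, the tone is usually perceived as
    continuing through the interruption."""
    t = np.arange(int(fs * dur_s)) / fs
    tone = 0.3 * np.sin(2 * np.pi * tone_hz * t)
    start = int(fs * gap_start_s)
    stop = int(fs * (gap_start_s + gap_dur_s))
    tone[start:stop] = 0.0                              # physically delete the tone
    noise = np.zeros_like(tone)
    noise[start:stop] = noise_gain * 0.3 * np.random.randn(stop - start)
    return tone + noise                                 # mono signal, float array

stimulus = continuity_stimulus()   # write to a WAV file or play back to audition it
```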

Examples

Shepard Illusion

The Shepard illusion, commonly referred to as the Shepard tone or Shepard scale, is an auditory phenomenon consisting of a superposition of sine waves separated by octaves, which generates the perception of a tone that rises indefinitely in pitch without ever reaching a higher register. The illusion exemplifies pitch and frequency illusions by exploiting the logarithmic nature of pitch perception, where the overlapping components create a seamless auditory loop. Developed by psychologist Roger Shepard in 1964, the illusion is constructed by layering multiple sine waves at frequencies that are octave multiples of a base tone, with each wave's amplitude shaped by a bell-shaped envelope over log-frequency, so that components fade in as they enter at the low end and fade out as they approach the high end as the sequence progresses. This fading in and out of components ensures that the perceptual focus shifts continuously upward, mimicking an ascending scale while the overall spectral centroid remains ambiguously cyclic. The perceptual effect arises from the brain's failure to disambiguate the circular structure, as the auditory system interprets the rising components as a unidirectional pitch motion, leading listeners to experience an ascent that defies acoustic reality. This ambiguity stems from the equivalence of octaves in musical perception, where the highest audible component dominates the pitch judgment, perpetuating the illusion across repetitions. A notable variation, the Shepard-Risset glissando, was introduced by composer Jean-Claude Risset in 1969, adapting the discrete tone steps into a continuous gliding scale that can produce both endless ascents and descents through similar amplitude and frequency manipulation techniques.
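Shepard's construction lends itself to a compact digital sketch. The Python code below (base frequency, envelope center, and other parameters are illustrative choices rather than Shepard's original values) layers octave-spaced sine components under a bell-shaped log-frequency envelope and steps them upward so that the looped scale appears to rise without end.

```python
import numpy as np

def shepard_step(step, fs=44100, dur_s=0.25, base_hz=27.5,
                 n_octaves=9, center_log2=9.25, sigma=2.0):
    """One semitone step of a discrete Shepard scale: octave-spaced sine
    components, each weighted by a Gaussian envelope over log2(frequency)
    so components fade in at the low end and fade out at the high end."""
    t = np.arange(int(fs * dur_s)) / fs
    tone = np.zeros_like(t)
    for k in range(n_octaves):
        f = base_hz * 2.0 ** (k + step / 12.0)   # kth octave, shifted by `step` semitones
        w = np.exp(-0.5 * ((np.log2(f) - center_log2) / sigma) ** 2)
        tone += w * np.sin(2 * np.pi * f * t)
    return tone / np.max(np.abs(tone))

# Cycling through the 12 semitone steps and looping the result produces the
# endlessly rising scale: pitch class keeps ascending while the overall
# spectral balance stays fixed.
scale = np.concatenate([shepard_step(s) for s in range(12)])
```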

Octave Illusion

The octave illusion is an auditory phenomenon discovered by psychologist Diana Deutsch in 1973 and first reported in 1974. It arises from the dichotic presentation of two pure tones separated by an octave, typically 400 Hz and 800 Hz, alternated between the ears every 250 milliseconds through stereo headphones. In the standard sequence, the right ear receives the high tone while the left ear receives the low tone, followed by a switch in which the right ear gets the low tone and the left ear gets the high tone, creating a repeating pattern without gaps between tones. Listeners typically perceive the high tone as emanating continuously from the right ear and the low tone from the left ear, regardless of the actual input switches, resulting in an illusory sensation of the pitch ascending and descending by a full octave with each alternation. This swapped perception of pitch and location persists even when the sequence is reversed or presented monaurally, highlighting the illusion's robustness. Subjective reports from experimental participants consistently describe this octave-jumping effect, though some individuals experience variations such as the tones appearing to trade places or fuse into a single gliding pitch. The experimental setup relies on controlled dichotic listening via headphones to isolate ear-specific inputs, with participants asked to describe their perceptions after multiple repetitions of the 20-second sequence. The mechanism underlying the octave illusion involves a right-ear bias in the brain's pitch-processing pathways, where frequency information from the right ear dominates pitch perception, overriding location cues from the left ear and leading to the anomalous assignment of pitches to ears. This reflects a dissociation between "what" (pitch identification) and "where" (sound localization) streams in auditory processing. The illusion demonstrates robustness across diverse cultural groups, as evidenced by consistent replications in Western and non-Western populations, but its specific form varies systematically with handedness: right-handers predominantly report the standard high-right/low-left pattern, while left-handers exhibit more diverse or reversed perceptions, potentially linked to differences in hemispheric lateralization.
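The dichotic sequence is simple to synthesize for headphone listening. The sketch below (the amplitude and function name are arbitrary) alternates the 400 Hz and 800 Hz tones between the two channels every 250 ms with no gaps, following the pattern described above; it is a minimal approximation rather than Deutsch's original stimulus.

```python
import numpy as np

def octave_illusion(fs=44100, seconds=20.0, low_hz=400.0, high_hz=800.0,
                    tone_dur_s=0.25):
    """Stereo octave-illusion sequence: each ear alternates between the low
    and high tone every 250 ms, always opposite to the other ear.
    Headphone playback is required to keep the channels separate."""
    n = int(fs * tone_dur_s)
    t = np.arange(n) / fs
    low = 0.3 * np.sin(2 * np.pi * low_hz * t)
    high = 0.3 * np.sin(2 * np.pi * high_hz * t)
    segments = int(seconds / tone_dur_s)
    left = np.concatenate([low if i % 2 == 0 else high for i in range(segments)])
    right = np.concatenate([high if i % 2 == 0 else low for i in range(segments)])
    return np.column_stack([left, right])               # shape (samples, 2)

stereo = octave_illusion()
```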

McGurk Effect

The McGurk effect is a perceptual illusion in which conflicting auditory and visual speech cues lead to the perception of a fused or altered speech sound that is not present in either modality. First demonstrated by Harry McGurk and John MacDonald in 1976, the effect occurs when an auditory syllable, such as /ba/, is paired with a visually articulated syllable like /ga/, resulting in the perceiver often reporting an intermediate sound, such as /da/. This integration error highlights how the brain prioritizes multimodal coherence in speech perception, even when the inputs are incongruent. At the neural level, the effect involves audiovisual integration primarily in the superior temporal sulcus (STS), a region critical for multisensory processing. Functional magnetic resonance imaging (fMRI) studies show heightened activity in the left STS during the illusion, supporting its role in fusing auditory and visual inputs. Transcranial magnetic stimulation targeted at the STS disrupts the effect, confirming its causal involvement in multimodal speech integration. The strength of the McGurk effect varies based on visual salience and the perceiver's native language. Higher visual clarity, such as sharp lip movements without blurring, enhances the illusion by increasing the weight of visual cues in integration. Similarly, native-language experience modulates susceptibility; for instance, non-native speakers exhibit weaker effects for unfamiliar phonetic contrasts due to reduced audiovisual mappings in their linguistic system. Recent extensions in the 2020s have explored McGurk-like illusions beyond traditional speech, applying them to non-speech sounds and AI-generated content. Studies demonstrate that audiovisual integration for non-linguistic auditory stimuli, such as environmental noises paired with gestures, yields weaker but analogous illusions compared to speech, underscoring modality-specific mechanisms. In AI dubbing contexts, models that synchronize dubbed audio with mismatched lip movements in videos can induce McGurk illusions, raising implications for realistic synthetic media.

Applications

In Music and Sound Design

Auditory illusions play a significant role in music composition by enabling composers to manipulate perception for emotional impact. The Shepard tone, which creates the illusion of an endlessly rising or falling pitch, has been integrated into film scores to evoke escalating tension. In Christopher Nolan's Dunkirk (2017), Hans Zimmer employed Shepard tones in the soundtrack's ticking-clock motif and orchestral builds, layering octave-separated sine waves to simulate perpetual ascent without resolution. Similarly, in A Quiet Place Part II (2020), Marco Beltrami used Shepard tones with dissonant elements to drive rhythmic tension in key sequences. These applications leverage the illusion's ambiguity to heighten suspense in narrative contexts. Risset rhythms extend similar principles to tempo, producing the perception of continuous acceleration or deceleration through overlapping cyclic patterns at varying speeds. In electronic music, this illusion crafts dynamic builds that maintain momentum indefinitely, often in dance or experimental genres. For instance, martsman incorporated Risset rhythms in a 2024 remix of the track "Ting" from the album Black Plastics Pt. 5, using Pure Data to generate eternal accelerando effects that enhance rhythmic drive without fatigue. Such techniques allow composers to create hypnotic grooves that align with the genre's emphasis on perceptual motion. In sound design for video games and virtual reality, spatial auditory illusions foster immersion by exploiting localization cues to simulate 3D positioning. Binaural audio recording and rendering techniques mimic interaural time and level differences, generating the illusion of sounds originating from precise locations in virtual space. This is particularly effective in VR titles, where it aids player navigation and environmental awareness; for example, spatial audio in games like Half-Life: Alyx (2020) uses head-related transfer functions to place auditory events dynamically around the user, enhancing realism and tension. By leveraging these localization illusions, designers create believable worlds that respond to head movements via head tracking. Recent advancements in artificial intelligence have introduced tools for procedural audio generation that incorporate auditory illusions, streamlining soundscape creation in post-2020 productions. Generative models now produce dynamic soundscapes with embedded perceptual effects, tailored to gameplay. These developments enable scalable applications in games and virtual reality, where AI automates illusion-based audio to heighten engagement without predefined loops.
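A bare-bones version of the ITD/ILD-based positioning described above can be written in a few lines. The Python sketch below applies only an interaural delay and level difference (no head-related transfer function filtering); the azimuth mapping, maximum delay, and level values are rough illustrative assumptions.

```python
import numpy as np

def pan_binaural(mono, azimuth_deg, fs=44100, max_itd_s=0.00066, max_ild_db=12.0):
    """Crude binaural placement of a mono signal using only interaural time
    and level differences: the far ear receives a delayed, attenuated copy."""
    frac = np.sin(np.radians(azimuth_deg))               # -1 (full left) .. +1 (full right)
    delay = int(round(abs(frac) * max_itd_s * fs))        # interaural delay in samples
    gain = 10.0 ** (-abs(frac) * max_ild_db / 20.0)       # level drop at the far ear
    near = np.concatenate([mono, np.zeros(delay)])
    far = gain * np.concatenate([np.zeros(delay), mono])
    left, right = (far, near) if frac > 0 else (near, far)
    return np.column_stack([left, right])

# Place a 0.5 s noise burst about 45 degrees to the listener's right.
burst = 0.2 * np.random.randn(22050)
stereo = pan_binaural(burst, azimuth_deg=45)
```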

In Psychological Research

Auditory illusions have been instrumental in psychological research for elucidating the mechanisms underlying perception, cognitive processing, and neural organization. By manipulating acoustic stimuli to elicit predictable misperceptions, researchers can dissect how the brain constructs auditory percepts from ambiguous inputs, revealing principles of sensory encoding, attention, and expectation. These illusions provide controlled paradigms to test theories of perception without relying on subjective reports alone, allowing for objective measurement through behavioral responses, neuroimaging, and electrophysiological recordings. Seminal work in this domain highlights how illusions bridge low-level sensory encoding with higher-order cognitive influences, informing models of auditory scene analysis and multisensory fusion. In the realm of pitch and frequency processing, illusions such as the octave illusion and tritone paradox, pioneered by Diana Deutsch, demonstrate marked individual and cultural variations in pitch height perception. The octave illusion, in which alternating high and low tones presented dichotically are heard as a single tone switching between the ears, varies systematically between right- and left-handers, suggesting lateralized hemispheric specialization in auditory processing. Similarly, the tritone paradox shows that listeners from different linguistic backgrounds assign pitch classes differently, indicating that early exposure to tonal languages shapes categorical pitch representations. These illusions have been used to probe absolute pitch abilities and the interplay between music and language, with studies confirming shared neural substrates in the temporal lobe. Temporal and continuity illusions further illuminate auditory grouping principles, as explored in Albert Bregman's foundational framework of auditory scene analysis. The continuity illusion, in which a tone seems uninterrupted despite being masked by noise, reveals how the perceptual system restores missing information based on temporal proximity and spectral similarity, facilitating sound-source segregation in noisy environments. This effect, observed in psychophysical experiments with reaction times and perceptual ratings, underscores Gestalt-like organizational rules in audition and has influenced computational models of streaming versus integration. Bregman's 1990 monograph, cited over 10,000 times, established illusions as proxies for studying real-world listening challenges, such as cocktail party effects. Multisensory applications leverage illusions like the McGurk effect to investigate audiovisual speech integration, a process central to communication. First described by Harry McGurk and John MacDonald, this illusion occurs when incongruent auditory and visual speech cues—such as dubbing a video of /ga/ with /ba/ audio—lead perceivers to report a fused /da/, demonstrating obligatory multisensory binding in the posterior superior temporal sulcus. Developmental studies using the effect show that integration strengthens from infancy to adulthood, while clinical research links reduced susceptibility to autism spectrum disorders and schizophrenia, highlighting its role in audiovisual speech processing. With applications in over 2,000 studies, the McGurk paradigm has quantified integration strength via fusion rates, aiding diagnostics for multisensory integration deficits. Beyond typical perception, auditory illusions serve as biomarkers in psychopathology research, particularly for psychosis risk. Speech illusions, in which degraded or ambiguous auditory stimuli are interpreted as meaningful words (e.g., hearing speech in white noise), correlate with hallucination proneness in non-clinical populations.
EEG studies of these illusions reveal aberrant activity in the temporal lobes, linking top-down expectations to symptom emergence—a finding cited over 300 times. Similarly, conditioned auditory hallucinations, elicited by learned associations, model hallucination formation, informing cognitive therapies. Recent advancements integrate illusions with advanced neuroimaging to map bottom-up versus top-down influences. For instance, the Zwicker tone illusion, a faint phantom pitch perceived after exposure to notched noise, activates primary auditory cortex similarly to real tones, as shown in fMRI, while top-down modulation via attention alters its salience. These paradigms, combining psychophysical measures with neural analyses, continue to refine models of perceptual inference, with implications for AI-driven hearing aids and virtual soundscapes.

References

  1. [1]
    An auditory illusion reveals the role of streaming in the temporal ...
    We find that the illusion of alternating tones arises from the synchronous tone pairs across ears rather than sequential tones in one ear, suggesting that the ...
  2. [2]
    [PDF] AUDITORY CONFLICTS AND ILLUSIONS - USAARL
    For example, seeing lip movement in a noisy environment where no speech is present may result in the illusion of hearing speech. Another example of an auditory ...
  3. [3]
    [PDF] Understanding the science behind auditory processing using illusions
    Examples of auditory spectral-based illusions. (a) The Zwicker Tone illusion, where a broadband spectrum with a gap around a central frequency produces an ...
  4. [4]
    Auditory-based illusions for sound installations
    Jun 30, 2025 · Analogously, auditory illusions are misinterpreted percep- tions of an external sound stimulus: listeners may hear sounds which are not present ...
  5. [5]
    Ear and pitch segregation in Deutsch's octave illusion persist ...
    Oct 3, 2011 · Deutsch's octave illusion occurs when two tones that are spaced an octave apart are repeatedly presented in alternation; the sequence is ...
  6. [6]
    Circularity in Judgments of Relative Pitch - AIP Publishing
    A special set of computer‐generated complex tones is shown to lead to a complete breakdown of transitivity in judgments of relative pitch.
  7. [7]
    Hearing lips and seeing voices - Nature
    Dec 23, 1976 · The study reported here demonstrates a previously unrecognised influence of vision upon speech perception.
  8. [8]
  9. [9]
    [PDF] Auditory Illusions - Diana Deutsch
    The illusions described in this entry show that what we hear is by no means a direct reflection of the sounds that are presented to our ears; instead, high- ...
  10. [10]
    Multistability in auditory stream segregation: a predictive coding view
    Apr 5, 2012 · Predictive processing helps to maintain perceptual stability by ... auditory illusion, The Journal of the Acoustical Society of America ...
  11. [11]
    [PDF] Binaural Hearing—Before and After the Stethophone
    Left–“Stereoscopist” by Nicholas Wade. Charles Wheatstone is shown with his eyes in the mirrors of his stereoscope. Wheatstone's portrait is derived from an ...
  12. [12]
    Sensitivity to an Illusion of Sound Location in Human Auditory Cortex
    May 22, 2017 · We used functional magnetic resonance imaging (fMRI) to measure AC responses to sounds that varied in perceived location due to interaural level ...
  13. [13]
    fMRI Study of the Ventriloquism Effect | Cerebral Cortex
    Jan 9, 2015 · Abstract. In spatial perception, visual information has higher acuity than auditory information and we often misperceive sound-source ...
  14. [14]
    Auditory Illusions of Supernatural Spirits: Archaeological Evidence ...
    Oct 21, 2014 · Sound reflection, whisper galleries, reverberation, ricochets, and interference patterns were perceived in the past as eerie sounds attributed to invisible ...
  15. [15]
    Neuroanatomy, Auditory Pathway - StatPearls - NCBI Bookshelf
    Oct 24, 2023 · Hair cell depolarization sends an impulse toward the auditory nerve. Sound energy is thus converted to electrical energy and nerve signals ...
  16. [16]
  17. [17]
    How We Hear: The Perception and Neural Coding of Sound - PMC
    This organization is maintained from the cochlea via the inner hair cells and the auditory nerve, through the brainstem and midbrain, to the primary auditory ...
  18. [18]
    The Auditory Pathway - Structures of the Ear - TeachMeAnatomy
    The auditory pathway conveys the special sense of hearing. Information travels from the receptors in the organ of Corti of the inner ear (cochlear hair cells) ...
  19. [19]
    Top-Down Inference in the Auditory System: Potential Roles for ...
    We argue that corticofugal pathways contain the requisite circuitry to implement predictive coding mechanisms to facilitate perception of complex sounds.
  20. [20]
    Evidence for causal top-down frontal contributions to predictive ...
    Dec 18, 2017 · Here we show that selective neurodegeneration of human frontal speech regions results in delayed reconciliation of predictions in temporal cortex.
  21. [21]
    Adaptation in auditory processing - PMC - PubMed Central - NIH
    Adaptation is a fundamental process in the auditory system that dynamically adjusts the responses of neurons to unchanging and recurring sounds.
  22. [22]
    A cortical circuit for audio-visual predictions | Nature Neuroscience
    Dec 2, 2021 · Auditory cortex axons carry a mixture of auditory and retinotopically matched visual input to V1, and optogenetic stimulation of these axons ...
  23. [23]
    Auditory illusory models as proxies to investigate bottom-up and top ...
    Auditory phantom perception, exemplified by tinnitus, is characterized by a perceptual experience without external stimuli. This study utilized two auditory ...
  24. [24]
    Auditory and visual objects - ScienceDirect.com
    In this paper we re-examine the concept of an object in a way that overcomes the limitations of the traditional perspective.
  25. [25]
    Auditory Scene Analysis: The Perceptual Organization of Sound
    Auditory Scene Analysis addresses the problem of hearing complex auditory environments, using a series of creative analogies to describe the process requir.
  26. [26]
    The Speech-to-Song Illusion Is Reduced in Speakers of Tonal ... - NIH
    May 9, 2016 · Examining the effect of the listener's native language, tonal language native speakers experienced significantly weaker speech-to-song effects ...
  27. [27]
    The Tritone Paradox: An Influence of Language on Music Perception
    Jul 1, 1991 · The tritone paradox is produced when two tones that are related by a half- octave (or tritone) are presented in succession. Each tone is ...
  28. [28]
    The glissando illusion and handedness - ScienceDirect.com
    The present paper reports the first study of the glissando illusion, which was created and published as a sound demonstration by Deutsch (1995). To experience ...
  29. [29]
    The glissando illusion: A spatial illusory contour in hearing
    Apr 1, 2005 · In the glissando illusion (originally demonstrated by Deutsch, 1995) a synthesized oboe tone of constant pitch is played together with a ...
  30. [30]
    Tritone Paradox - Diana Deutsch
    The basic pattern that produces this illusion consists of two computer-produced tones that are related by a half-octave. (This interval is called a tritone).
  31. [31]
    A paradox of musical pitch - American Psychological Association
    Shepard, PhD, created an auditory illusion by using tones that consist of many octaves of the same pitch class note played at once--from very low to very high.
  32. [32]
  33. [33]
    Listening with generative models - ScienceDirect.com
    The results show how generative models can account for the perception of both classic illusions and everyday sensory signals.
  34. [34]
    Auditory localization: a comprehensive practical review - Frontiers
    The difference in reception times between the two ears is called the Interaural Time Difference (ITD). It constitutes the dominant cue in estimating the azimuth ...
  35. [35]
    The Precedence Effect in Sound Localization - PMC - PubMed Central
    The precedence effect, characterizing the perceptual dominance of spatial information carried by the first-arriving signal.
  36. [36]
    The ventriloquist effect results from near-optimal bimodal integration
    In this study we investigate spatial localization of audio-visual stimuli. When visual localization is good, vision does indeed dominate and capture sound.
  37. [37]
    An fMRI Study of the Ventriloquism Effect - PMC - PubMed Central
    Jan 9, 2015 · In spatial perception, visual information has higher acuity than auditory information and we often misperceive sound-source locations when ...
  38. [38]
    The addition of a spatial auditory cue improves spatial updating in a ...
    May 9, 2024 · Here, we tested whether an auditory cue co-localized with a visual target could improve spatial updating in a virtual reality homing task.
  39. [39]
    Effects of auditory distance cues and reverberation on spatial ...
    Oct 13, 2025 · This study was fundamental to better comprehending the interaction between head movements and reverberation in spatial auditory perception, even ...
  40. [40]
    Creating Auditory Illusions with Spatial-Audio Technologies
    The spatial auditory illusion (SAI) occurs when acoustic sources create a sound scene that produces a desired auditory scene over a region. ... ... In addition ...
  41. [41]
    Circularity in Judgments of Relative Pitch - AIP Publishing
    Circularity in Judgments of Relative Pitch. Roger N. Shepard. This content is only available via PDF.
  42. [42]
    Listening to the Shepard-Risset Glissando - PubMed Central - NIH
    Mar 4, 2016 · Thus, the Shepard scale creates an illusion because it contains ambiguous tones that lure the brain into making perceptual errors. Although the ...
  43. [43]
    Octave Illusion - Diana Deutsch
    The Octave Illusion involves alternating tones an octave apart, often perceived as a single tone switching between high and low pitch in different ears.
  44. [44]
  45. [45]
    What is the McGurk effect? - PMC - NIH
    The McGurk effect is a multisensory illusion where a voice is heard as a different consonant when paired with incongruent visual speech, causing a change in ...
  46. [46]
    fMRI-Guided Transcranial Magnetic Stimulation Reveals That the ...
    Feb 17, 2010 · These results demonstrate that the STS plays a critical role in the McGurk effect and auditory–visual integration of speech.
  47. [47]
    Influence of auditory and visual stimulus degradation on eye ...
    Jun 12, 2020 · Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses.
  48. [48]
    Language Experience Changes Audiovisual Perception - PMC - NIH
    May 11, 2018 · The McGurk effect refers to a curious phenomenon whereby what we see changes what we hear. Participants are presented with auditory stimuli such ...
  49. [49]
    The sound illusion that makes Dunkirk so intense - Vox
    Jul 26, 2017 · Named after cognitive scientist Roger Shepard, the sound consists of several tones separated by an octave layered on top of each other. As the ...
  50. [50]
    Marco Beltrami Used Shepard Tones to Create Tension on A Quiet ...
    May 28, 2021 · One of the sounds that was new to this movie was the dissonance between the two pianos and using that as a pulsing rhythmic element that could drive some of ...
  51. [51]
    What is The Shepard Tone? The Audio Illusion Explained with ...
    Aug 10, 2020 · What is the Shepard Tone? The Shepard Tone is an audio illusion that creates the feeling of consistent, never-ending rising or falling.
  52. [52]
    Risset rhythms: Pure Data implementation of eternal accelerando
    Sep 5, 2024 · The component is useful to set up full compositions based on Risset rhythms. As an example, I've remixed my track “Ting” from Black Plastics Pt.
  53. [53]
    Infinite Acceleration: Risset Rhythms - Point at Infinity - WordPress.com
    Nov 20, 2017 · Such a rhythm is known as a Risset rhythm. I coded up some very basic examples on Supercollider. Here's an accelerating Risset rhythm: And a ...
  54. [54]
    Spatial Audio For VR - Meegle
    Dec 27, 2024 · When played back through headphones, binaural audio can create a realistic 3D sound environment, making it ideal for VR applications where users ...
  55. [55]
    Spatial Sound in VR - Showtime VR
    Mar 5, 2025 · Spatial sound in VR recreates a sound field, giving cues about sound position, distance, and movement, replicating natural psychoacoustic ...
  56. [56]
  57. [57]
    [PDF] Artificial intelligence in creating, representing or expressing an ...
    Apr 10, 2025 · Our approach involved proposing a systematic process, starting with AI-driven sound generation, leading to the development of an immersive ...
  58. [58]
    AI in Sound Design and Automated Post-Production
    Sep 25, 2025 · Moreover, AI assists in creative manipulations—transforming sounds, layering textures, or generating entirely new auditory elements—offering ...
  59. [59]
    Illusions and Research - Diana Deutsch
    Auditory Illusion 4: Mystery Melody Youtube Video. Auditory Illusion 3: Phantom Words Youtube Video. Octave Illusion: Auditory Illusion 1 (headphones needed!)
  60. [60]
  61. [61]
    Auditory Scene Analysis: The Perceptual Organization of Sound
    Aug 6, 2025 · Auditory scene analysis (ASA) is the process by which the auditory system separates the individual sounds in natural-world situations.
  62. [62]
    From Speech Illusions to Onset of Psychotic Disorder: Applying ...
    May 27, 2020 · Earlier studies have found experimentally assessed speech illusions to be associated with positive symptoms in patients with psychotic disorders ...
  63. [63]
    Cortical processes of speech illusions in the general population
    Oct 18, 2016 · There is evidence that experimentally elicited auditory illusions in the general population index risk for psychotic symptoms.
  64. [64]
    Auditory illusory models as proxies to investigate bottom-up and top ...
    May 9, 2025 · This induces a temporary and auditory illusion which is an afterimage of the missing central frequency called the Zwicker Tone (ZT) (Zwicker, ...