Multi-Band Excitation

Multi-Band Excitation (MBE) is a model for speech analysis and synthesis used in vocoders. It represents the short-time speech spectrum as the product of a slowly varying spectral envelope and a rapidly varying excitation spectrum, with the excitation declared voiced or unvoiced independently in each of multiple frequency bands around the harmonics of the fundamental frequency, which lets it handle mixed-voicing regions in speech. Developed by Daniel W. Griffin and Jae S. Lim at MIT, the model was introduced in their seminal 1988 paper and addresses limitations of earlier single-band approaches, such as those in LPC-10 vocoders, by allowing fine-grained control over voicing decisions to reduce synthesis artifacts like buzziness during transitions between voiced and unvoiced segments.

In the MBE framework, the key parameters extracted from the speech signal are the fundamental frequency (f_0), binary voiced/unvoiced (V/UV) decisions for each band (typically 20–30 bands up to 4 kHz), and the spectral envelope magnitudes sampled at the harmonic frequencies. For synthesis, voiced excitation is generated as a periodic impulse train with phases derived from the original signal and shaped by the spectral envelope, while unvoiced excitation uses band-limited noise mixed in selectively per band; the two components are then summed to produce natural-sounding speech at low bit rates, such as 8 kbps for high-quality output. This dual-domain approach, with time-domain synthesis for the periodic component and frequency-domain synthesis for the noise component, enables efficient coding and modification of speech parameters.

In reported evaluations, MBE vocoders improved Diagnostic Rhyme Test (DRT) scores by up to 12 points compared to single-band excitation systems, particularly in noisy conditions at a signal-to-noise ratio of 5 dB.

The model has been foundational for subsequent developments, including commercial implementations like Improved Multi-Band Excitation (IMBE) and Advanced Multi-Band Excitation (AMBE) by Digital Voice Systems, Inc., which are standardized in applications such as digital land mobile radio (e.g., APCO Project 25), satellite communications, and secure voice systems for their robustness and low-latency synthesis. These extensions maintain the core principles while incorporating enhancements like vector quantization for further bit-rate reduction down to 2.4 kbps without significant quality loss.

Introduction

Definition and Principles

Multi-Band Excitation (MBE) is a model for speech coding that represents the speech signal as a sum of harmonic (periodic/voiced) and noise (aperiodic/unvoiced) components distributed across multiple frequency bands. This approach allows voicing to be treated independently in different spectral regions, improving quality over simpler models that assume uniform voicing across the entire spectrum.

The basic principle of MBE is to divide the speech spectrum into multiple bands centered on the harmonics of the fundamental frequency, each band having a width approximately equal to the fundamental frequency. Each band is independently classified as voiced or unvoiced based on its spectral characteristics, such as the fit between the observed spectrum and a harmonic model. This band-wise decision captures the mixed voicing nature of natural speech, where lower frequencies may be predominantly voiced while higher frequencies exhibit more noise-like properties due to frication or aspiration. The model was developed in the 1980s at MIT as an advancement in speech analysis/synthesis techniques.

A key concept in MBE is the estimation of the fundamental frequency (f_0) and per-band voicing decisions, which together approximate human speech production more accurately than single-band models by accounting for frequency-dependent periodicity. Pitch estimation is performed globally by minimizing the overall spectral reconstruction error, while voicing is determined locally within each band by comparing the fit error of a voiced (harmonic) model against a threshold.

The mathematical representation of the speech signal s(n) in MBE approximates voiced components as a sum of sinusoids at the harmonics of the fundamental frequency: s(n) \approx \sum_k A_k \cos\left( \frac{2\pi k f_0 n}{F_s} + \phi_k \right) for voiced bands, plus additive noise excitation for unvoiced bands, where f_0 is the fundamental frequency, A_k and \phi_k are the amplitude and phase of the k-th harmonic, and F_s is the sampling rate. This time-domain representation allows efficient reconstruction of the signal from the extracted parameters.
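The harmonic sum for the voiced component can be evaluated directly, as in the minimal Python sketch below. The function names and the example values (a 200 Hz fundamental, three harmonics, zero phases) are illustrative assumptions, not parameters from the MBE literature, and the noise term for unvoiced bands is omitted.

```python
import math


def voiced_sample(n, f0, fs, amps, phases):
    """Evaluate sum_k A_k * cos(2*pi*k*f0*n/fs + phi_k) at sample index n.

    amps[0] and phases[0] belong to harmonic k = 1, amps[1] to k = 2, etc.
    """
    return sum(
        a * math.cos(2.0 * math.pi * k * f0 * n / fs + phi)
        for k, (a, phi) in enumerate(zip(amps, phases), start=1)
    )


def voiced_frame(num_samples, f0, fs, amps, phases):
    """Return num_samples consecutive samples of the voiced component."""
    return [voiced_sample(n, f0, fs, amps, phases) for n in range(num_samples)]


# Illustrative values only: 200 Hz fundamental at 8 kHz, three harmonics.
frame = voiced_frame(160, f0=200.0, fs=8000.0,
                     amps=[1.0, 0.5, 0.25], phases=[0.0, 0.0, 0.0])
```

With these values the waveform repeats every F_s / f_0 = 40 samples, which is the periodicity the voiced bands are meant to capture.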

Historical Development

The multi-band excitation (MBE) model originated from research at the Massachusetts Institute of Technology (MIT) in the mid-1980s, where Daniel W. Griffin and Jae S. Lim developed a novel speech model to improve low-bitrate coding performance over traditional linear predictive coding (LPC) vocoders, which often suffered from buzziness and muffled artifacts in voiced speech. Griffin's Ph.D. thesis introduced the core MBE framework, treating the short-time speech spectrum as the product of a harmonic series modulated by band-specific voicing decisions, enabling more natural synthesis at rates around 8 kbps. This work addressed limitations in single-band excitation models by allowing independent voiced/unvoiced classifications for frequency bands around each harmonic, a principle that became foundational to subsequent advancements.

A pivotal paper by John C. Hardwick and Jae S. Lim detailed a practical implementation of the model as a 4.8 kbps speech coder, presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). This coder quantized spectral magnitudes and voicing parameters efficiently, achieving superior subjective quality compared to contemporaneous LPC-based systems at similar bitrates, with informal listening tests highlighting reduced distortion in transitional speech segments. The paper established MBE as a viable alternative for bandwidth-constrained applications, influencing government and commercial interest in parametric vocoding.

Digital Voice Systems, Inc. (DVSI) was founded in 1988 by a team including Jae S. Lim to commercialize the MBE technology emerging from MIT research. Building on the academic foundations, DVSI refined the model into the Improved Multi-Band Excitation (IMBE) vocoder in the early 1990s, incorporating enhanced parameter estimation and error protection for robust performance in noisy environments. IMBE was selected as the vocoder standard for the APCO Project 25 (P25) digital land mobile radio system in 1993, marking its adoption in public safety communications during the mid-1990s. By 1997, DVSI evolved IMBE further into the Advanced Multi-Band Excitation (AMBE) vocoder, optimizing for even lower bitrates (down to 2.4 kbps) while maintaining high intelligibility through advanced vector quantization of voicing and pitch.

Military evaluations underscored MBE's advantages, with tests demonstrating superior Diagnostic Rhyme Test (DRT) intelligibility scores over single-band excitation vocoders like LPC-10e; for instance, early MBE implementations at 4.8 kbps achieved DRT improvements of up to 12 points over LPC-10e, validating their efficacy for secure voice over error-prone channels. These assessments, conducted by U.S. Department of Defense entities, contributed to MBE's integration into standards like Inmarsat-M for satellite-based military voice transmission.

Technical Model

Speech Production Framework

The Multi-Band Excitation (MBE) model draws from the source-filter theory of speech production, mimicking the physiological processes of human voice generation by separating the glottal source excitation from the vocal tract's filtering effects. In voiced speech, the glottis produces quasi-periodic pulses during vocal fold vibration, resulting in a harmonic spectrum, while unvoiced sounds arise from turbulent airflow generating noise-like excitation. MBE replicates this by modeling voiced excitation as a periodic impulse train at the fundamental frequency and unvoiced excitation as noise, both filtered by the vocal tract to produce the resonances that shape the overall spectrum. This approach aligns with the acoustic behavior of the larynx and supralaryngeal structures, enabling accurate representation of natural speech transitions without assuming global periodicity across the entire spectrum.

Acoustically, the MBE framework employs frequency-domain analysis, typically via the short-time Fourier transform, to decompose the speech signal into its spectral envelope and fine structure components. The spectral envelope, representing the vocal tract's resonant characteristics, is modeled using linear prediction coefficients (LPC) or direct sampling of the smoothed speech spectrum, capturing the slowly varying amplitude profile across frequencies. The fine structure is then separated into harmonic components for periodic (voiced) regions, modeled as a series of sinusoids at integer multiples of the fundamental frequency, and noise for turbulent (unvoiced) regions, allowing independent parameterization of periodicity and randomness. This separation ensures that the model preserves both the smooth envelope structure and the detailed temporal variations inherent in human utterance.

The rationale for dividing the spectrum into multiple bands (typically over 20 narrow bands centered on harmonics) stems from the mixed excitation observed in human speech, particularly in transition regions such as 2-4 kHz where fricatives and aspirations exhibit partial voicing. In these areas, not all frequencies are uniformly periodic or noisy; instead, lower bands may show strong harmonics from glottal pulses, while higher bands display noise from airflow turbulence, leading to a blended spectrum that traditional single-decision models distort into overly periodic or noisy outputs. By enabling per-band voiced/unvoiced decisions, MBE captures this hybrid nature, reducing artifacts like buzziness in synthesized fricatives and improving fidelity to physiological speech production.

Central to MBE's efficiency is the concept of gain-shape separation, where the envelope's shape (the normalized profile) is parameterized independently from the excitation's gain (overall scaling) and voicing (harmonic versus noise contributions). This decoupling allows the envelope to be coded efficiently as a low-dimensional representation, such as LPC coefficients or interpolated samples, while the excitation handles pitch-synchronous details, facilitating compact coding without redundancy. Such parameterization mirrors the physiological separation of vocal tract shaping and glottal excitation, optimizing the model for low-bitrate applications while maintaining perceptual quality.
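The band layout described above, with one band centered on each harmonic and a width equal to the fundamental frequency, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name, the 4 kHz limit, and the clipping of the last band are choices made here, and practical coders may group several harmonics into one band.

```python
def harmonic_bands(f0, bandwidth_hz=4000.0):
    """Return (low, high) edges in Hz for bands centered on harmonics of f0.

    Band k spans (k - 0.5) * f0 to (k + 0.5) * f0, so its width equals f0.
    Only harmonics whose center frequency lies below bandwidth_hz are kept,
    and the upper edge of the last band is clipped to bandwidth_hz.
    """
    bands = []
    k = 1
    while k * f0 < bandwidth_hz:
        low = (k - 0.5) * f0
        high = min((k + 0.5) * f0, bandwidth_hz)
        bands.append((low, high))
        k += 1
    return bands


# Illustrative fundamental of 150 Hz: 26 harmonics fall below 4 kHz.
bands = harmonic_bands(150.0)
```

Because the number of bands depends on f_0, a lower-pitched voice yields more bands (and more V/UV decisions) than a higher-pitched one over the same bandwidth.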

Parameter Estimation Process

The parameter estimation process in Multi-Band Excitation (MBE) vocoders extracts the key parameters (pitch, voicing decisions, and spectral envelope) from input speech through frame-by-frame analysis, typically every 10-20 ms, using a tapered analysis window to reduce spectral leakage. The process relies on the harmonic model of voiced speech: the goal is to minimize the least-squares error between the magnitude spectrum of the original signal and that of a synthesized signal assuming periodic excitation at the estimated fundamental frequency. The computation occurs in the frequency domain via FFT, with parameters refined iteratively to achieve the best fit.

Pitch estimation employs a closed-loop search that evaluates candidate pitch periods by computing the error in the spectral match, starting with a coarse set of periods and refining to sub-sample accuracy through local optimization. This is often supplemented by time-domain techniques such as the normalized autocorrelation of the windowed speech signal or the average magnitude difference function (AMDF), which identifies periodicity by minimizing differences between delayed versions of the signal. The search range for pitch periods is approximately 2.5 to 20 ms, covering typical human fundamental frequencies from 50 to 400 Hz.

Voicing decisions are determined independently for each frequency band (typically 10-20 bands across the spectrum up to 4 kHz), classifying bands as voiced or unvoiced based on how well the harmonic structure fits the observed spectrum; the original model uses binary decisions. Common measures include the harmonic-to-noise ratio (HNR), which quantifies the strength of periodic components relative to noise, and maximum likelihood voicing (MLV) approaches that probabilistically assign voicing states by modeling the spectrum as a mixture of harmonic and noise processes. In the basic scheme, the normalized error between the spectrum of the original signal and the spectrum synthesized assuming voiced excitation is compared to a threshold (typically around 0.2); an error below the threshold indicates a voiced band. This per-band resolution captures transitions in natural speech, such as in vowels with aspirated onsets.

Spectral envelope extraction involves estimating the smooth amplitude contour that modulates the harmonics, using either 10th-order linear predictive coding (LPC) analysis to derive all-pole model coefficients or direct FFT-based magnitude computation at harmonic locations followed by interpolation. The resulting envelope parameters are quantized for transmission, often as line spectral pairs (LSPs) derived from LPC roots for efficient quantization and stability, or as band-specific gains for unvoiced regions. This step integrates with the overall error minimization, adjusting envelope estimates to reduce discrepancies with the input spectrum while preserving harmonic structure.
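Two of the steps above lend themselves to a short illustration: the AMDF pitch search over the 2.5 to 20 ms range, and the per-band threshold test on the normalized spectral error. The Python sketch below is a simplified stand-in, not the closed-loop spectral search of the MBE analyzer. The function names, the tie-breaking tolerance, and the representation of bands as bin ranges are assumptions made for the example.

```python
import math


def amdf_pitch_period(signal, fs=8000, min_ms=2.5, max_ms=20.0, tol=1e-9):
    """Return the lag in samples that minimizes the AMDF.

    Lags cover min_ms..max_ms. A later lag replaces the current best only
    if it improves the AMDF by more than tol, so that among near-ties the
    shortest lag wins (this avoids picking a multiple of the true period).
    """
    min_lag = int(round(fs * min_ms / 1000.0))
    max_lag = int(round(fs * max_ms / 1000.0))
    best_lag, best_val = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        count = len(signal) - lag
        if count <= 0:
            break  # signal too short for this lag
        val = sum(abs(signal[i] - signal[i + lag]) for i in range(count)) / count
        if val < best_val - tol:
            best_lag, best_val = lag, val
    return best_lag


def band_voicing(original_mag, voiced_mag, bands, threshold=0.2):
    """Classify each band as voiced (True) or unvoiced (False).

    original_mag and voiced_mag are magnitude spectra on the same bin grid;
    bands is a list of half-open (start_bin, end_bin) ranges. A band is
    voiced when its normalized squared error is below the threshold.
    """
    decisions = []
    for start, end in bands:
        err = sum((original_mag[i] - voiced_mag[i]) ** 2 for i in range(start, end))
        energy = sum(original_mag[i] ** 2 for i in range(start, end))
        normalized = err / energy if energy > 0.0 else 1.0
        decisions.append(normalized < threshold)
    return decisions


# Illustrative test signal: 200 Hz fundamental plus one harmonic at 8 kHz.
test_signal = [math.sin(2 * math.pi * 200 * n / 8000)
               + 0.5 * math.sin(2 * math.pi * 400 * n / 8000)
               for n in range(400)]
period = amdf_pitch_period(test_signal)
```

At 8 kHz the search covers lags of 20 to 160 samples, and a 200 Hz fundamental corresponds to a period of 40 samples.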

Synthesis Mechanism

In the synthesis mechanism of Multi-Band Excitation (MBE), the reconstructed speech signal is generated from the decoded parameters: the fundamental frequency f_0, voiced/unvoiced (V/UV) decisions per frequency band, spectral magnitudes, and phases for voiced components. The process begins with excitation generation, where voiced bands are modeled as a periodic signal composed of sinusoids at the harmonics of f_0. Specifically, the voiced component is synthesized in the time domain as a sum of cosine waves: \hat{s}_v(n) = \sum_{k=1}^{K} A_k \cos\left( \frac{2\pi k f_0 n}{F_s} + \phi_k \right), where A_k is the spectral amplitude at the k-th harmonic frequency k f_0, \phi_k is the corresponding phase, K is the number of harmonics within the bandwidth, n is the time index, and F_s is the sampling frequency (typically 8 kHz for narrowband speech). The amplitudes A_k are obtained by sampling the spectral envelope at the harmonic locations.

For unvoiced bands, excitation is produced by generating white noise, bandpass-filtering it to isolate the unvoiced frequency regions, and scaling its spectrum to match the interpolated envelope magnitudes in those bands, ensuring appropriate energy levels without periodicity. The overall signal is then formed by adding the voiced and unvoiced components, which inherently provides a mixed excitation model that captures the quasi-periodic nature of speech and mitigates artifacts like buzziness from purely sinusoidal or purely noise-based excitations.

The spectral envelope, which shapes the excitation to produce the final speech timbre, is interpolated from the provided parameters: either direct gains (amplitudes) at harmonic frequencies in the original formulation, or line spectral pairs (LSPs) in derivative implementations for smoother interpolation and quantization efficiency. The envelope is applied by multiplying the noise spectrum (in the frequency domain, for unvoiced parts) or by incorporating it directly into the A_k values for voiced synthesis.

To ensure continuity across analysis frames (typically 20-40 ms in duration), the signals are windowed using a Hamming window and overlap-added in the time domain, preventing discontinuities at frame boundaries. Phase modeling plays a key role in this process; for voiced harmonics, phases are cumulatively interpolated between frames using a linear frequency trajectory w_k(t), with \theta_k(t) = \phi_{k0} + \int w_k(t) \, dt, where \phi_{k0} is the initial phase and w_k(t) bridges the instantaneous frequencies of adjacent frames, promoting smooth waveform transitions and natural prosody. The resulting time-domain signal forms the synthesized speech output at the target sampling rate, such as 8 kHz for standard telephone bandwidth.

In practice, direct time-domain synthesis is preferred for the voiced component, though frequency-domain filtering can be used in some variants. Post-processing may include bandwidth extension techniques to reconstruct higher frequencies beyond 4 kHz, enhancing perceived quality in wideband applications, though this is not part of the core MBE mechanism. This approach leverages the multi-band V/UV decisions to achieve high-fidelity synthesis at low bit rates, distinguishing MBE from single-band models.
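As a worked illustration of the phase interpolation, assume the frequency of one harmonic moves linearly from its value at the start of a frame to its value at the start of the next frame. Integrating that linear trajectory gives a quadratic phase, \theta(n) = \phi_0 + w_0 n + (w_1 - w_0) n^2 / (2N), with w_0 and w_1 in radians per sample and N the frame length. The Python sketch below evaluates this expression; the function name and the example frequencies are assumptions, and practical decoders may add further phase corrections.

```python
import math


def harmonic_phase_track(phi0, f_start, f_end, fs, frame_len):
    """Phase of one harmonic across a frame with linearly varying frequency.

    f_start and f_end are the harmonic's frequencies in Hz at the start of
    this frame and of the next one. Returns frame_len + 1 phase values in
    radians, for n = 0 .. frame_len inclusive.
    """
    w0 = 2.0 * math.pi * f_start / fs  # radians per sample at frame start
    w1 = 2.0 * math.pi * f_end / fs    # radians per sample at frame end
    return [
        phi0 + w0 * n + (w1 - w0) * n * n / (2.0 * frame_len)
        for n in range(frame_len + 1)
    ]


# Illustrative case: first harmonic gliding from 200 Hz to 220 Hz over a
# 160-sample (20 ms at 8 kHz) frame, starting at phase 0.5 rad.
track = harmonic_phase_track(0.5, 200.0, 220.0, 8000.0, 160)
```

The final value equals the initial phase plus the average frequency times the frame length, so it can be carried forward as the initial phase of the next frame to keep the waveform continuous.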

Implementations

Improved Multi-Band Excitation (IMBE)

Improved Multi-Band Excitation (IMBE) is the first commercial instantiation of the Multi-Band Excitation (MBE) speech coding model, developed by Digital Voice Systems, Inc. (DVSI) and introduced in the early 1990s to support low-bitrate digital voice transmission in public safety communications. Designed for rates ranging from 2.4 to 4.8 kbps, IMBE was selected as the vocoder for the APCO Project 25 (P25) standard in 1993 following competitive evaluations, enabling interoperable digital radio systems for emergency responders. It also formed the basis for the EIA-567 standard in the 1990s, integrating into systems like the Enhanced Digital Access Communications System (EDACS) for land mobile radio applications in the United States.

Key enhancements in IMBE over the foundational model include the application of vector quantization to spectral parameters for more efficient bit allocation and reduced quantization error, alongside dedicated error protection coding that allocates approximately 2.8 kbps for forward error correction using techniques like Golay and Hamming codes to mitigate channel impairments in noisy environments. The algorithm employs a multi-band excitation framework, which refines the handling of mixed voicing by determining voiced/unvoiced (V/UV) decisions on a per-band basis, improving naturalness in transitional speech segments compared to coarser uniform voicing models. Parameter estimation occurs over 20 ms frames, with pitch encoded using 8 bits, V/UV decisions requiring 3 to 12 bits depending on the number of active bands, and the remaining bits dedicated to quantized spectral magnitudes and gain via hybrid scalar-vector methods.

In performance evaluations, IMBE at 4.4 kbps achieves Mean Opinion Scores (MOS) around 3.4, indicating acceptable communication quality suitable for mobile and tactical radio use, with robust resilience to bit error rates up to 1% without significant degradation. This variant's synthesis leverages the core mechanism of harmonic reconstruction with band-specific excitation but incorporates these optimizations for real-world deployment in U.S. public safety radios, where it remains a foundational vocoder for Phase 1 P25 systems.
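The figures quoted above imply a simple per-frame bit budget, which the following Python sketch works through. It is arithmetic on the numbers in the text (4.4 kbps of speech data, about 2.8 kbps of error protection, 20 ms frames, 8 pitch bits, 3 to 12 V/UV bits) and assumes the frame carries no other fields, which an actual bitstream may not satisfy. The names are illustrative.

```python
FRAME_MS = 20            # frame duration quoted in the text
VOICE_RATE_BPS = 4400    # speech-data rate quoted in the text
FEC_RATE_BPS = 2800      # approximate error-protection rate quoted in the text
PITCH_BITS = 8


def bits_per_frame(rate_bps, frame_ms=FRAME_MS):
    """Number of bits carried in one frame at the given bit rate."""
    return rate_bps * frame_ms // 1000


voice_bits = bits_per_frame(VOICE_RATE_BPS)  # speech bits per frame
fec_bits = bits_per_frame(FEC_RATE_BPS)      # error-protection bits per frame


def spectral_bits(vuv_bits):
    """Bits left for spectral magnitudes and gain after pitch and V/UV.

    Assumes pitch, V/UV and spectral data are the only fields in the frame.
    """
    if not 3 <= vuv_bits <= 12:
        raise ValueError("V/UV bits are expected to lie between 3 and 12")
    return voice_bits - PITCH_BITS - vuv_bits
```

Under these assumptions each 20 ms frame holds 88 speech bits and 56 protection bits, and the allocation for spectral magnitudes and gain varies between 68 and 77 bits as the number of V/UV bits changes.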

Advanced Multi-Band Excitation (AMBE)

Advanced Multi-Band Excitation (AMBE) is a speech coding algorithm developed by Digital Voice Systems, Inc. (DVSI) as an enhancement to the Improved Multi-Band Excitation (IMBE) vocoder, with the AMBE-1000 released in January 1997. Operating at bit rates from 2.0 to 9.6 kbps, AMBE employs enhanced vector quantization for efficient parameter encoding and adaptive spectral enhancement to improve perceptual quality, enabling natural-sounding speech at reduced data rates while preserving intelligibility.

A core advancement in AMBE is its enhanced excitation model, which refines the frequency-domain analysis beyond IMBE's capabilities, allowing for more precise modeling of harmonic and noise components in speech. Dynamic bit allocation adapts the distribution of bits across parameters based on signal characteristics, optimizing efficiency for varying speech content. Additionally, noise interpolation in unvoiced regions generates smooth transitions between voiced and unvoiced segments, reducing artifacts and bitrate demands without compromising naturalness. These features collectively lower the bit rate compared to CELP-based coders by eliminating the need to encode a residual waveform.

AMBE supports variable rates adjustable in 50 bps increments, processes audio in 20 ms frames at an 8 kHz sampling rate, and includes integrated forward error correction to maintain performance under channel errors up to a 5% bit error rate (BER). It has been integrated into standards such as Digital Mobile Radio (DMR), where its evolved form AMBE+2 serves as the preferred vocoder, as well as satellite systems like Iridium.

In performance evaluations, AMBE exhibits superior intelligibility in noisy environments due to its multi-band excitation model, which handles background noise and channel impairments better than IMBE, contributing to its selection in high-reliability communication systems.

Mixed-Excitation Linear Prediction (MELP) is a speech coding standard developed by the U.S. Department of Defense (DoD) and standardized in 1997 as a federal standard at 2.4 kbps, primarily for military applications. It builds upon the multi-band excitation (MBE) model by incorporating multi-band voicing decisions across five frequency bands (0-500 Hz, 500-1000 Hz, 1000-2000 Hz, 2000-3000 Hz, and 3000-4000 Hz) to model mixed excitation, where portions of the spectrum can be voiced or unvoiced independently. MELP integrates 10th-order linear predictive coding (LPC) to represent the spectral envelope and quantizes ten Fourier magnitudes from the LPC residual using an 8-bit vector quantizer to generate the mixed excitation signal, enhancing robustness in noisy environments compared to pure MBE vocoders.

The INMARSAT-M vocoder, standardized for satellite voice communications, operates at a source coding rate of 4.15 kbps (with a total rate of 6.4 kbps including 2.25 kbps of channel coding) and employs Improved Multi-Band Excitation (IMBE) principles to determine voicing states across multiple frequency bands without full reliance on LPC modeling. This approach allows efficient transmission of speech parameters over satellite links, prioritizing low latency and error resilience.

Other MBE-derived vocoders include a 960 bps variant presented at the 1994 International Conference on Spoken Language Processing (ICSLP), which uses the MBE model with optimized parameter quantization for high-quality speech at ultra-low rates, suitable for bandwidth-constrained secure communications. Additionally, quad-band excitation (QBE) extensions to MBE, introduced in 1996, limit the voicing decisions to four variable-length, non-overlapping bands across the telephone bandwidth to further reduce the bit rate while improving modeling of mixed voicing and noisy speech in vocoders. These derivatives highlight MBE's adaptability for rates below 1 kbps, though they trade off some naturalness for efficiency in specialized applications.
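The five fixed voicing bands listed above for MELP can be represented as a sorted list of band edges, with a binary search mapping any frequency to its band. The Python sketch below shows only this lookup. The names are illustrative, and treating each band as closed at its lower edge and open at its upper edge is an assumption, since the text does not say which band owns a shared edge.

```python
from bisect import bisect_right

# Edges of the five voicing bands quoted in the text, in Hz.
MELP_BAND_EDGES_HZ = [0, 500, 1000, 2000, 3000, 4000]


def melp_band_index(freq_hz):
    """Return the 0-based band index containing freq_hz.

    Each band includes its lower edge and excludes its upper edge.
    """
    if not 0 <= freq_hz < MELP_BAND_EDGES_HZ[-1]:
        raise ValueError("frequency outside the 0-4000 Hz range")
    return bisect_right(MELP_BAND_EDGES_HZ, freq_hz) - 1


def is_voiced_at(freq_hz, band_flags):
    """Look up the voicing flag (True = voiced) for the band holding freq_hz."""
    return band_flags[melp_band_index(freq_hz)]


# Illustrative frame: voiced below 2 kHz, unvoiced above.
flags = [True, True, True, False, False]
```

Compared with the harmonic-centered bands of MBE, these band edges are fixed and do not depend on the fundamental frequency, so five flags describe the voicing of every frame.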

Applications and Usage

Secure and Military Communications

Multi-Band Excitation (MBE) vocoders, including the Improved Multi-Band Excitation (IMBE) and Advanced Multi-Band Excitation (AMBE) variants, play a key role in secure and military communications due to their ability to deliver intelligible speech at low bitrates while maintaining robustness in noisy or error-prone channels. These vocoders enable efficient transmission over limited-bandwidth links, facilitating integration with encryption protocols to support tactical voice networks.

In the United States, IMBE serves as the baseline vocoder for Project 25 (P25) Phase I systems, which are deployed in secure communications for military first responders and public safety agencies requiring encrypted voice. P25 Phase II systems upgrade to the AMBE+2 vocoder, offering improved noise suppression and interoperability with IMBE for enhanced tactical reliability in handheld and vehicle-mounted radios. These implementations provide low latency, with frame processing under 20 ms contributing to end-to-end delays below 100 ms, essential for real-time conversation.

MBE-based vocoders perform well in military environments through their resilience to bit errors and acoustic noise, outperforming traditional coders in the tandem configurations common to secure pipelines. For instance, AMBE achieves 92.6% intelligibility for clear speech in isolation and remains effective when paired with legacy waveform coders like CVSD, supporting integration in encrypted systems. This robustness extends to satellite links, where AMBE operates at 2.4 kbps in the Iridium network, enabling global secure voice for drones, handheld devices, and remote operations. Modern tactical systems, such as Motorola's SRX 2200 combat radio, incorporate AMBE for dual-microphone noise cancellation and clear transmission in high-noise scenarios, representing the evolution from early prototypes to current battlefield applications.

Commercial and Broadcasting Systems

Multi-Band Excitation (MBE) technology, particularly its advanced variants like AMBE, has found widespread adoption in civilian communication systems, enabling efficient low-bitrate voice coding for bandwidth-constrained environments. Digital Voice Systems, Inc. (DVSI) provides hardware solutions such as the AMBE-2020 and AMBE-4020 chips, which support data rates from 2.0 to 9.6 kbps and are integrated into commercial products for voice transmission. These chips facilitate half-duplex operations and include features like voice activity detection and comfort noise insertion, making them suitable for applications requiring robust speech quality.

In amateur radio, MBE-based vocoders power digital voice modes such as D-STAR, supported by transceivers like the Icom IC-705, which uses DVSI's AMBE+2 technology for interoperability in bandwidth-limited setups. Similarly, in Digital Mobile Radio (DMR) systems, the ETSI standard for Tier II (conventional) and Tier III (trunked) operations incorporates DVSI's AMBE+2 at 3.6 kbps, ensuring compatibility across networks. Manufacturers like Tait implement this vocoder to convert analog voice into digital data, supporting two-slot TDMA in 12.5 kHz channels.

For broadcasting, MBE technology is used for speech transmission in digital broadcast systems, where a 4.0 kbps AMBE vocoder carries live traffic and weather announcements to conserve bandwidth. DVSI's Net-2000 Voice Compression Unit further extends MBE applications to VoIP and teleconferencing platforms, providing toll-quality speech compression for internet-based communications in consumer devices. Since the 2000s, MBE adoption has expanded significantly into consumer and commercial sectors, with DVSI reporting over 350 million implementations of its AMBE vocoder technology worldwide by the 2020s, reflecting its integration into diverse non-military voice products.

Performance and Comparisons

Advantages Over Traditional Methods

Multi-Band Excitation (MBE) vocoders offer superior naturalness compared to traditional methods like linear predictive coding (LPC) and code-excited linear prediction (CELP), particularly in handling mixed voicing scenarios such as vowels with frication. By determining voiced/unvoiced decisions independently for each frequency band, MBE avoids the "buzzy" artifacts common in single-band excitation models, where unvoiced energy is inaccurately replaced by periodic harmonics across the entire spectrum. This results in more accurate reproduction of aperiodic components and higher subjective quality scores; for instance, the Improved Multi-Band Excitation (IMBE) variant achieves a mean opinion score (MOS) of approximately 3.4 at 4.15 kbps, outperforming 4.8 kbps CELP implementations with MOS ratings around 3.17 by providing clearer, less synthetic-sounding speech in both clean and transitional phonemes.

The original MBE model operates efficiently at 8 kbps, providing intelligible and natural speech that outperforms traditional LPC-10 vocoders at 2.4 kbps, which achieve lower intelligibility due to their single-band limitations. Later MBE-based implementations like IMBE enable even lower rates, down to 2.4 kbps, while maintaining superior quality. This is facilitated by the model's harmonic-plus-noise structure, which efficiently quantizes spectral envelopes and excitation parameters per band without relying on computationally intensive codebook searches as in CELP. The approach is particularly suited for bandwidth-constrained channels, such as satellite communications, where IMBE has been standardized for rates between 2.4 and 4.15 kbps while maintaining quality superior to LPC-10e at equivalent rates.

In terms of robustness, MBE performs well in noisy environments and error-prone channels due to its independent band processing, which isolates and preserves unvoiced components better than global excitation models in LPC or CELP. Reported evaluations show MBE achieving a Diagnostic Rhyme Test (DRT) score of 58.0 in 5 dB SNR conditions, a 12-point improvement over single-band excitation's 46.0, indicating enhanced intelligibility under adverse noise. Additionally, MBE's flexibility supports variable-rate operation and integration with error correction schemes, allowing adaptive allocation of bits to critical bands for optimized performance across diverse transmission scenarios.

Limitations and Alternatives

Despite its advantages in low-bitrate speech synthesis, Multi-Band Excitation (MBE) suffers from high computational complexity, particularly for real-time implementations, as simultaneous estimation of all model parameters is prohibitive and requires a multi-step process that demands significantly more resources than simpler linear predictive coding (LPC) methods. Additionally, MBE is sensitive to pitch estimation errors, which become more frequent under noisy conditions and cause noticeable degradations in synthesized speech quality. While the original MBE model developed at MIT is not proprietary, the proprietary nature of its commercial extensions like IMBE and AMBE, developed by Digital Voice Systems, Inc. (DVSI), restricts widespread adoption, as licensing requirements limit integration into open-source projects and foster reliance on licensed hardware or software. At bitrates below 2 kbps, quality in MBE-based vocoders like IMBE degrades noticeably, with increased artifacts from limited parameter resolution, making them less suitable for ultra-low-rate applications than at 2.4–4.8 kbps.

Alternatives to MBE include code-excited linear prediction (CELP), which operates effectively at 4–8 kbps to deliver more natural-sounding speech, though it requires more computation and is less efficient at very low rates. Neural vocoders, such as WaveNet, introduced in 2016, provide superior perceptual quality at comparable bitrates by generating waveforms directly from acoustic features, but they incur substantially higher computational cost, often running orders of magnitude slower than traditional parametric methods like MBE during inference. In comparisons, MBE-based vocoders like IMBE perform best in secure, low-rate communications for clean speech (e.g., 2.4 kbps) due to their parametric efficiency, but they lag behind CELP in handling mixed speech-music signals, where CELP's analysis-by-synthesis approach preserves more details despite higher bitrates.

Licensing

Proprietary Framework

Digital Voice Systems, Inc. (DVSI), founded in 1988, owns the intellectual property for its commercial variants of Multi-Band Excitation (MBE) technology, including Improved Multi-Band Excitation (IMBE) and Advanced Multi-Band Excitation (AMBE). These technologies stem from DVSI's enhancements to the original MBE speech model, with patents covering the core encoding and decoding algorithms. The patent portfolio includes U.S. patents issued in 1998 (spectral magnitude representation in MBE speech coders) and 2001 (joint quantization of voicing parameters), among others. These patents expire at different dates under the standard 20-year term from filing, but DVSI retains control over proprietary implementations through ongoing licensing agreements that ensure compliance with protected methods.

Exclusivity is maintained by DVSI through the absence of any open-source releases of its MBE variants, positioning the company as the sole authority for standards compliance in systems like P25 and DMR, where certified vocoders are required. This framework prevents unauthorized replication while enabling integration in secure communications and commercial applications. Licensing forms the economic backbone of DVSI's proprietary model, generating revenue via agreements that often include royalties per device or system deployed.

Implementation Access

Digital Voice Systems, Inc. (DVSI) facilitates access to (MBE) technology through a combination of products, software libraries, and licensing agreements tailored for developers and integrators. solutions, such as the AMBE-3000 chip, are available off-the-shelf without upfront licensing fees or royalties, making them suitable for low- and high- deployments in systems. for the AMBE-3000 series typically ranges from $25 to $32 per for orders of 250 or more pieces, while evaluation boards like the AMBE-3000-HDK are priced at $765 for initial quantities of 1-9 s to support prototyping and testing. Software access involves licensing DVSI's proprietary libraries, which implement variants such as AMBE and IMBE, through formal agreements that allow customization for specific use cases. These libraries are optimized for integration into embedded platforms, including digital signal processors (DSPs) from , such as the TMS320C6000 series, via provided and implementations. For standards compliance, such as APCO (P25), DVSI offers reference designs and P25-specific hardware configurations, like the USB-3000 P25 interface, to streamline development for public safety communications. Certain aspects of MBE technology have become more accessible due to expirations, particularly for the original Improved Multi-Band Excitation (IMBE) , whose core s lapsed around 2017, enabling partial open-source reimplementations for research and non-commercial purposes. Community-driven projects, such as MBELib and Open-AMBE efforts for compatibility, provide approximations of IMBE and early AMBE decoding but lack the full fidelity and noise robustness of DVSI's licensed implementations, often including disclaimers regarding remaining elements. Full high-quality MBE , especially for advanced variants like AMBE+2, still requires DVSI approval and licensing to avoid infringement on active s. 
As of 2025, DVSI continues to support MBE vocoders in bandwidth-constrained environments, including devices and legacy radio systems, with hardware and software updates focused on low-power integration rather than direct Voice over New Radio (VoNR) adoption. Hybrid MBE-neural vocoding concepts are emerging in academic research, but DVSI's commercial offerings rely on established parametric methods, and no neural hybrids are documented.
