
MPEG-4

MPEG-4, formally known as ISO/IEC 14496, is a suite of international standards developed by the Moving Picture Experts Group (MPEG) for the coding, delivery, and management of multimedia content across fixed and mobile networks. It enables the representation of audio-visual scenes as compositions of objects, supporting interactive and scalable applications. The standard emphasizes object-based coding, allowing individual elements such as video objects, audio streams, and graphics to be manipulated independently for enhanced interactivity and efficiency.

The development of MPEG-4 built upon the successes of earlier MPEG standards, such as MPEG-1 for digital storage media and MPEG-2 for broadcast television, with work beginning in the mid-1990s to address emerging needs for interactivity and low-bitrate multimedia communication. The first core parts of the standard were published in early 1999, marking its formal ratification as ISO/IEC 14496 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Subsequent amendments and additional parts have been released over the years, with ongoing updates to incorporate advancements like improved compression algorithms.

MPEG-4 comprises over 30 parts, each addressing specific aspects of multimedia handling. Key components include Part 1 (Systems) for scene description and delivery, Part 2 (Visual) for video object compression, Part 3 (Audio) for speech and general audio coding, Part 10 (Advanced Video Coding, or AVC/H.264) for high-efficiency video, and Part 14 (MP4 file format) for storing timed media streams. These parts support a range of profiles and levels tailored to applications, from low-bitrate mobile streaming to high-definition broadcasting.

Notable features of MPEG-4 include its support for 2D and 3D graphics, synthetic content generation, text rendering, and the Binary Format for Scenes (BIFS) to enable dynamic scene composition and user interaction. It provides superior compression efficiency compared to prior standards, facilitating bandwidth and storage savings in diverse environments. Applications encompass digital television, video-on-demand services, mobile devices, videoconferencing, surveillance systems, and web-based delivery.
The MP4 file format, in particular, has become ubiquitous for storing and streaming video files across platforms.
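The MP4 container is an instance of the ISO Base Media File Format (Part 12), which organizes media into length-prefixed "boxes" (also called atoms). As a rough illustration rather than a full Part 14 implementation, the following sketch walks the top-level box structure of an in-memory file; the example data is a minimal, hand-built `ftyp` box:

```python
import struct

def parse_boxes(data: bytes):
    """Yield (box_type, payload) for top-level ISO BMFF boxes in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, = struct.unpack(">I", data[offset:offset + 4])
        box_type = data[offset + 4:offset + 8].decode("ascii", "replace")
        if size == 1:   # 64-bit "largesize" follows the type field
            size, = struct.unpack(">Q", data[offset + 8:offset + 16])
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = len(data) - offset
            header = 8
        else:
            header = 8
        yield box_type, data[offset + header:offset + size]
        offset += size

# Minimal example: an 'ftyp' box declaring the 'mp42' major brand.
ftyp = struct.pack(">I4s", 16, b"ftyp") + b"mp42" + struct.pack(">I", 0)
print([t for t, _ in parse_boxes(ftyp)])  # ['ftyp']
```

Real files nest boxes (e.g., `moov` containing `trak`), so a complete reader would recurse into container boxes; the size/type header logic, however, is the same at every level.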

History and Development

Origins and Goals

The Moving Picture Experts Group (MPEG) was established in 1988 as a working group (WG11) under the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) Joint Technical Committee 1, Subcommittee 29 (ISO/IEC JTC 1/SC 29), initially focused on developing standards for the compression of moving pictures and associated audio. Building on the success of MPEG-1, which targeted storage on digital storage media such as CD-ROM, and MPEG-2, designed for broadcasting and DVD, the committee expanded its scope with MPEG-4 to encompass broader multimedia applications beyond traditional video and audio coding.

In July 1993, MPEG issued a call for requirements to define the objectives of a new standard aimed at addressing emerging needs in multimedia communication and delivery, particularly over heterogeneous networks. The primary goals included achieving improved compression efficiency, targeting approximately 50% better performance than existing standards in terms of bitrate reduction for equivalent subjective quality, to enable efficient transmission and storage. Additional objectives encompassed support for content-based interactivity, allowing users to manipulate individual elements within scenes, and content-based manipulation, facilitating access and editing of specific objects rather than entire frames; these features were essential for applications over low-bitrate channels, such as mobile networks operating at bitrates as low as 10 kbit/s.

A core emphasis of the MPEG-4 goals was object-based coding, which treats audio-visual data as composable objects, either natural (e.g., captured video) or synthetic (e.g., computer-generated graphics), to enable scalable and robust delivery across diverse devices and conditions. This approach supported robustness in error-prone environments and adaptability to varying bandwidths, promoting content delivery to a wide range of users from desktop computers to portable devices.
Key milestones included the issuance of the call for proposals in July 1995, soliciting technologies aligned with the defined requirements, followed by the evaluation and selection of core technologies at the MPEG meeting in Munich in January 1996, marking the integration of proposals into the initial verification model.

Standardization Timeline

The standardization process for MPEG-4, designated as ISO/IEC 14496, began in July 1993 under the Moving Picture Experts Group (MPEG), part of ISO/IEC JTC 1/SC 29, with the call for proposals issued in July 1995, a working draft released in November 1996, and the committee draft published in late 1997. The effort culminated in the approval of the initial standard in late 1998, with formal publication of the first editions in 1999.

Phase 1 of MPEG-4 development, covering the period from 1996 to 1999, concentrated on establishing core tools for object-based coding of natural and synthetic content, facial animation parameters, and basic video and audio functionalities. This phase resulted in the first editions of Parts 1 through 3 of ISO/IEC 14496, published as International Standards in December 1999: Part 1 (Systems) for multiplexing and scene description, Part 2 (Visual) for object-based video coding, and Part 3 (Audio) for advanced audio coding, with Part 4 (Conformance Testing) for verification procedures following in 2000.

Phase 2, extending from 2000 to 2002, expanded the standard with advanced features such as fine granularity scalability, error resilience, and support for higher-resolution content, incorporating amendments to existing parts and introducing new ones. This included Parts 5 (Reference Software), 6 (Delivery Multimedia Integration Framework), 7 (Optimized Reference Software), 8 (Carriage over IP Networks), and 9 (Reference Hardware Description), with their initial editions published between 2002 and 2004. A major milestone was Part 10 (Advanced Video Coding, or AVC), developed in collaboration with the ITU-T Video Coding Experts Group (VCEG) as H.264, which was finalized and published in May 2003 as the first edition of ISO/IEC 14496-10.

Following Phase 2, subsequent development phases from 2004 onward addressed emerging needs in multimedia delivery and content representation, leading to Parts 11 through 30. Notable additions included Part 12 (ISO Base Media File Format), whose first edition was published in 2004, providing a foundational structure for media storage shared with other standards, and Part 16 (Animation Framework eXtension, or AFX), published in February 2004, enabling advanced graphics and animation tools.
The MPEG group has continued maintenance of the MPEG-4 parts, with amendments and new editions focusing on corrections, conformance, and interoperability with modern applications; for instance, updates to Part 10 for enhanced compression were incorporated through successive editions, and as of 2025 recent editions continue to refine the file formats and their alignment with modern standards. This ongoing collaboration between ISO, IEC, and ITU-T ensures MPEG-4's adaptability across fixed and mobile environments.

Technical Overview

Core Principles

MPEG-4 adopts an object-based representation for scenes, treating multimedia content as discrete, reusable objects such as video objects, audio streams, or graphics elements, each described by metadata for properties like shape, texture, and temporal behavior. These objects are composed hierarchically within a scene graph, allowing independent manipulation, synchronization, and delivery, which enables applications like content editing, selective transmission, and user interaction without decoding the entire stream. This approach contrasts with frame-based methods in prior standards by emphasizing semantic structure over pixel-level processing, facilitating scalability across diverse devices and networks.

Central to this architecture is the Binary Format for Scenes (BIFS), a compact syntax derived from VRML that describes the spatiotemporal organization of objects in 2D or 3D scenes using nodes for grouping, positioning, transformations, and event handling. BIFS supports dynamic scene updates through command streams, enabling real-time interactivity such as object selection or animation, while its binary encoding ensures efficient storage and transmission. This scene description mechanism promotes content reusability and adaptability, allowing scenes to scale from low-bitrate mobile viewing to high-resolution immersive experiences.

The Delivery Multimedia Integration Framework (DMIF) serves as a unified framework for accessing and delivering MPEG-4 content across heterogeneous environments, including networks, storage media, and broadcasts, via a standardized session protocol and application interface. DMIF handles resource negotiation, quality-of-service management, and transparent delivery of object streams, insulating applications from underlying transport specifics like RTP/IP or MPEG-2 transport streams. This framework ensures seamless integration of delivery technologies, supporting both pull (interactive) and push (broadcast) modes.
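BIFS itself is a normative binary syntax, but the underlying idea of a scene graph with independently addressable objects can be sketched with an illustrative Python structure. The node names below are hypothetical, not the normative BIFS node set; the point is that an update command can retarget one object without touching the rest of the scene:

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    # Illustrative stand-in for a grouping/transform node in a scene graph.
    name: str
    position: tuple = (0.0, 0.0)
    children: list = field(default_factory=list)

    def find(self, name):
        """Depth-first lookup of a named object in the scene tree."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit:
                return hit
        return None

# Compose a scene from independent objects.
scene = SceneNode("root", children=[
    SceneNode("background_video"),
    SceneNode("presenter", position=(0.3, 0.1)),
    SceneNode("logo_overlay", position=(0.9, 0.9)),
])

# A BIFS-style update command moves one object; the other streams are untouched.
scene.find("logo_overlay").position = (0.1, 0.9)
print(scene.find("logo_overlay").position)  # (0.1, 0.9)
```

In actual BIFS, such updates arrive as compactly encoded commands in their own elementary stream, which is what makes server-driven or user-driven scene changes cheap relative to re-sending video.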
MPEG-4 inherently supports hybrid natural and synthetic content by integrating compressed streams of real-world media, such as video and audio, with computer-generated elements like 2D/3D graphics, facial animation parameters, and synthetic audio within the same framework. This unification allows for blended audiovisual experiences, where synthetic objects can overlay or interact with natural ones, enhancing applications in virtual environments, gaming, and augmented communication. The design accommodates bitrates from low (e.g., 2-5 kbit/s for speech) to high (up to 10 Mbit/s for video), prioritizing efficient coding for mixed content types.

To ensure robustness in transmission channels, particularly error-prone ones like wireless networks, MPEG-4 incorporates error-resilience features at the systems level, including resynchronization markers in elementary streams and scalable object hierarchies that permit graceful degradation. These mechanisms, combined with BIFS updates for error recovery, maintain scene integrity without full retransmission, supporting reliable operation at bitrates below 64 kbit/s.

Key Innovations

MPEG-4 introduced content-based interactivity as a core advancement, enabling the manipulation of individual audiovisual objects within a scene, such as selecting, editing, or scaling specific video objects without requiring re-encoding of the entire multimedia stream. This object-based approach allows for applications like user-driven content customization, where elements can be interacted with independently, distinguishing it from frame-based standards like MPEG-2.

The standard's scalability features represent another major innovation, supporting temporal scalability through layered bitstreams to adjust frame rates, spatial scalability for varying resolutions via enhancement layers, and quality (SNR) scalability using fine granularity scalability (FGS) techniques that enable progressive refinement of video quality. These mechanisms facilitate adaptive streaming over heterogeneous networks, allowing bitstreams to be tailored to fluctuating conditions without full re-transmission.

Universal multimedia access was achieved through tools optimized for low-bitrate delivery, supporting rates as low as 5 kbit/s for basic video while enabling high-quality rendering on diverse devices ranging from mobile phones to high-definition displays. This versatility ensures accessibility across varying computational resources and network capabilities, promoting widespread adoption in early streaming and mobile applications.

MPEG-4 integrated synthetic media by defining Facial Animation Parameters (FAP) and Body Animation Parameters (BAP) in Parts 1 and 2, providing a parametric framework for animating 3D face and body models with minimal data overhead: 68 FAPs control deformations of 84 facial feature points to produce expressions, while 196 BAPs define joint rotations and movements for the body model. These parameters enable realistic synthesis of virtual characters, blending seamlessly with natural video for hybrid content creation.
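Temporal scalability can be illustrated with a toy layer-assignment scheme: each frame is tagged with a layer index, and a decoder that keeps only layer 0 plays the sequence at a reduced frame rate, adding higher layers to refine it. This is a simplified sketch, not the normative MPEG-4 layering syntax:

```python
def assign_temporal_layers(num_frames: int, num_layers: int = 2):
    """Tag each frame with a temporal layer. Layer 0 (the base layer) holds
    every 2**(num_layers-1)-th frame; each higher layer doubles the rate."""
    layers = []
    for i in range(num_frames):
        layer = 0
        step = 2 ** (num_layers - 1)
        while step > 1 and i % step != 0:
            layer += 1
            step //= 2
        layers.append(layer)
    return layers

layers = assign_temporal_layers(8, num_layers=3)
# Decoding only the base layer yields a quarter of the full frame rate here.
base = [i for i, l in enumerate(layers) if l == 0]
print(layers)  # [0, 2, 1, 2, 0, 2, 1, 2]
print(base)    # [0, 4]
```

Spatial and SNR scalability follow the same base-plus-enhancement pattern, except the enhancement layers carry extra resolution or extra coefficient precision instead of extra frames.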
Hybrid coding in MPEG-4 combines traditional block-based DCT techniques with wavelet-based methods, particularly in the Synthetic and Natural Hybrid Coding (SNHC) tools, to enhance efficiency for textured and composite scenes. This approach leverages discrete wavelet transforms for scalable texture representation, achieving better compression ratios than purely block-based DCT coding, especially for irregular or synthetic-natural hybrid imagery.
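The scalability benefit of wavelets can be seen with a one-level Haar transform, used here as a simplified stand-in for the wavelet filters actually employed by MPEG-4 texture coding: the low band alone serves as a half-resolution preview, and the high band refines it back to the original losslessly:

```python
def haar_1d(signal):
    """One level of the Haar wavelet transform on an even-length sequence:
    pairwise averages form the low band, pairwise half-differences the high band."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def haar_inverse(low, high):
    """Perfectly reconstruct the original sequence from both bands."""
    out = []
    for a, d in zip(low, high):
        out += [a + d, a - d]
    return out

row = [8, 10, 9, 11, 20, 22, 30, 28]  # one row of texture samples
low, high = haar_1d(row)
print(low)                      # [9.0, 10.0, 21.0, 29.0] (half-resolution preview)
print(haar_inverse(low, high))  # reconstructs the original row
```

Repeating the transform on the low band yields a multiresolution pyramid, which is what lets a wavelet-coded texture be truncated at any resolution or quality point, unlike a fixed 8×8 block transform.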

Video Compression

MPEG-4 encompasses multiple video compression standards defined in ISO/IEC 14496, with Parts 2 and 10 being central to its video coding capabilities.

Part 2, known as MPEG-4 Visual, provides a flexible toolkit for compressing rectangular frame-based video, supporting bitrates ranging from 5 kbit/s to over 1 Gbit/s. It accommodates progressive and interlaced formats, resolutions from sub-QCIF (128×96) to 4096×4096 pixels, and chroma formats such as 4:2:0, 4:2:2, and 4:4:4. This part builds on earlier MPEG technologies by introducing object-based coding, allowing video objects to be independently encoded and manipulated, which enhances interactivity in applications.

The core of MPEG-4 Visual employs a hybrid coding framework that integrates motion-compensated prediction with discrete cosine transform (DCT) coding and quantization. Intra-coded video object planes (I-VOPs) use spatial prediction within the frame, while predictive (P-VOPs) and bidirectional (B-VOPs) planes leverage temporal prediction from reference frames. Innovations include quarter-pixel motion accuracy for smoother motion representation, global motion compensation for camera panning effects, and variable block sizes (e.g., 16×16 or 8×8) to adapt to content complexity. Additional tools support spatial (resolution), temporal (frame rate), and quality (SNR) scalability for adaptive streaming, as well as error resilience features like resynchronization markers, data partitioning, and reversible variable-length coding (RVLC) for transmission over unreliable networks. Profiles such as Simple, Advanced Simple, and Core provide tailored constraints for applications from mobile video to broadcast.

MPEG-4 Part 10, or Advanced Video Coding (AVC), also standardized as H.264, represents a major advancement in video compression efficiency within the MPEG-4 family. Developed jointly by MPEG and ITU-T's Video Coding Experts Group (VCEG) from 2001 onward, it achieves approximately 50% better compression than MPEG-2 at equivalent quality levels, supporting high-definition video at bit rates as low as 1 Mbit/s for 720p content.
AVC uses an enhanced hybrid approach with more sophisticated intra and inter prediction modes: intra prediction employs directional modes (e.g., 9 for 4×4 blocks), while inter prediction supports multiple reference frames, weighted prediction, and sub-pixel accuracy up to 1/4 pixel. The transform stage applies a 4×4 DCT approximation for reduced complexity, followed by context-adaptive binary arithmetic coding (CABAC) or context-adaptive variable-length coding (CAVLC) for entropy coding, which significantly improves rate-distortion performance.

Key innovations in AVC include in-loop deblocking filters that reduce blocking artifacts and improve prediction accuracy, and flexible macroblock partitioning (from 16×16 down to 4×4) for efficient handling of diverse video content. It supports scalability through extensions such as SVC (Scalable Video Coding, in Part 10 amendments) and multiview coding for stereoscopic video. Profiles such as Baseline (for low-latency applications like video conferencing), Main (for broadcast), and High (for high-definition content, with 8×8 transforms) ensure broad applicability, from mobile devices to professional cinema. AVC's widespread adoption stems from its balance of compression efficiency, up to twice that of MPEG-4 Visual, and computational feasibility, powering formats like Blu-ray discs and streaming services.

Beyond core 2D video, MPEG-4 includes specialized compression for synthetic content, such as Face and Body Animation (Parts 1 and 2), which encodes facial expressions and body poses at low bit rates (2–3 kbit/s for faces, up to 40 kbit/s for bodies) using parameter-based models. 3D Mesh Coding (Part 16) achieves 30:1 to 40:1 compression ratios for triangular mesh models through wavelet decomposition and efficient encoding of connectivity and geometry. These extensions enable immersive applications like virtual reality and animation, integrating seamlessly with the Binary Format for Scenes (BIFS) in MPEG-4 systems.

Overall, MPEG-4 video compression prioritizes versatility, enabling content creation, delivery, and interaction across diverse platforms.
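The energy-compaction property that hybrid DCT coders exploit can be demonstrated with a naive 8×8 type-II DCT. Real codecs use fast integer approximations of this transform; the version below is purely illustrative:

```python
import math

def dct_2d(block):
    """Naive orthonormal 8x8 type-II DCT, the transform used conceptually
    in MPEG-4 Visual (production encoders use fast integer variants)."""
    n = 8
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            cu = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
            cv = math.sqrt(1 / n) if v == 0 else math.sqrt(2 / n)
            s = sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n) for y in range(n)
            )
            out[u][v] = cu * cv * s
    return out

# A flat block concentrates all energy in the DC coefficient, which is why
# coarse quantization of the AC (high-frequency) coefficients costs little.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0]))  # 1024 (DC term = 8 * mean sample value)
print(round(coeffs[3][4]))  # 0    (no AC energy in a flat block)
```

Smooth image regions behave similarly: most energy lands in a few low-frequency coefficients, and quantization plus entropy coding (CAVLC/CABAC in AVC) then spends bits only where the signal actually has detail.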
