
Multiview Video Coding

Multiview Video Coding (MVC) is a video compression standard designed to efficiently encode multiple synchronized video streams captured from different viewpoints of the same scene, exploiting both temporal redundancies within each view and statistical dependencies between views to achieve significant bitrate savings compared to simulcast (independent) encoding of each view. Developed as an extension to the H.264/MPEG-4 AVC standard, MVC enables backward compatibility with single-view decoders through a base layer that conforms to the existing AVC syntax, while additional views are encoded using inter-view prediction mechanisms. The development of MVC was a collaborative effort between the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG), culminating in its approval as an amendment to H.264/AVC in July 2008 and its integration into ISO/IEC 14496-10 as Annex H. This extension introduced flexible reference picture management for inter-view prediction without altering the core AVC syntax at lower levels, allowing an average bitrate reduction of approximately 25% over simulcast for typical multiview content. Following its standardization, MVC saw applications in the Blu-ray 3D disc format and broadcast stereoscopic video, supported by the subsequent development of file format and transport specifications that facilitated broader adoption. Later work extended multiview coding principles to newer baselines, notably Multiview High Efficiency Video Coding (MV-HEVC), which applies similar inter-view prediction techniques to the High Efficiency Video Coding (HEVC) standard finalized in 2013. MV-HEVC, finalized in 2014 (ITU-T) and 2015 (ISO/IEC), supports efficient coding of multiple camera views with or without depth information, offering substantial compression gains for 3D and free-viewpoint applications while maintaining compatibility with single-view HEVC decoders. These extensions have been pivotal for immersive media, including autostereoscopic displays and free-viewpoint television systems, where capturing and rendering scenes from numerous angles is essential. More recent standards such as Versatile Video Coding (VVC) incorporate multilayer profiles that further enhance multiview capabilities, building on the foundational efficiency of MVC for higher resolutions and more complex scene geometries. Further extensions include MPEG Immersive Video (MIV), standardized in 2023, which supports compression of multiview-plus-depth data for six-degrees-of-freedom immersive experiences.

Introduction

Definition and Scope

Multiview Video Coding (MVC) is a video compression extension to the H.264/Advanced Video Coding (AVC) standard, specified in Annex H, designed to encode multiple synchronized video sequences captured simultaneously from different camera angles into a single efficient bitstream. This approach allows joint compression of multiview content, enabling immersive viewing experiences while building upon the block-based hybrid coding framework of H.264/AVC. MVC adopts a "2D plus delta" coding paradigm, in which a base view is encoded using conventional H.264/AVC methods to ensure full compatibility with existing legacy 2D decoders and players. Dependent views are then coded as enhancements relative to the base view, using differential techniques that minimize additional bitrate overhead by referencing shared content across perspectives. The scope of MVC includes stereoscopic scenarios with two views for basic 3D viewing as well as broader multiview configurations involving many cameras, supporting applications that require spatial freedom in viewpoint selection. Approved in July 2008 by the Joint Video Team (JVT), a collaboration between the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG), with the Stereo High Profile finalized in 2009, MVC fundamentally exploits spatial redundancies between views through disparity compensation mechanisms, contrasting with the temporal redundancies targeted in single-view coding.

Applications

Multiview Video Coding (MVC) has found primary applications in stereoscopic broadcasting and the Blu-ray 3D disc format, where it enables efficient compression of paired video views to deliver immersive 3D content on compatible displays. The Stereo High Profile of MVC, standardized in 2009, was adopted by the Blu-ray Disc Association in 2009 as the mandatory codec for high-definition 3D content, ensuring backward compatibility with 2D playback while supporting full resolution per eye. Advanced applications extend MVC to free-viewpoint television (FTV), allowing users to interactively select and navigate viewpoints within a scene captured by multiple synchronized cameras, which facilitates realistic exploration beyond fixed stereo pairs. In surveillance systems, MVC supports multi-camera setups for enhanced monitoring, enabling efficient encoding of overlapping views to provide comprehensive spatial coverage and viewpoint navigation for security analysis. Additionally, MVC contributes to immersive displays in virtual reality (VR) and augmented reality (AR) environments, such as 3D teleconferencing with motion parallax effects that simulate natural head movements for deeper spatial immersion. A key benefit of MVC is its compression efficiency, achieving up to 50% bitrate reduction for stereo pairs relative to independent (simulcast) encoding of each view, which preserves video quality while enabling transmission over bandwidth-constrained channels such as broadcast or mobile networks; average savings range from 20-30% for the dependent view in typical scenarios. In emerging contexts as of 2025, MVC and its extensions such as MV-HEVC are being integrated into streaming services for 3D content delivery, supporting platforms such as the Apple Vision Pro, which launched in 2024 with over 150 native 3D titles from providers including Disney+ and Amazon Prime Video, using advanced stereoscopic formats for high-quality playback.

History and Standardization

Development of MVC for H.264/AVC

The development of Multiview Video Coding (MVC) as an extension to H.264/AVC originated from a joint initiative by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) under the Joint Video Team (JVT), beginning in 2005 to address the increasing demand for efficient compression of multiview video content driven by emerging 3D and free-viewpoint applications. In October 2005, MPEG issued a Call for Proposals (CfP) for multiview video coding technologies based on H.264/AVC, aiming to exploit both temporal and inter-view redundancies while maintaining compatibility with existing single-view decoders. This effort was motivated by the limitations of simulcast approaches, which encoded multiple views independently and resulted in inefficient bandwidth usage for 3D content distribution. Key milestones included the evaluation of CfP responses in 2006, where subjective tests demonstrated significant quality improvements over simulcast, achieving up to 3 mean opinion score (MOS) points better at low to medium bitrates. Following this, the JVT developed the Joint Multiview Video Model (JMVM), starting with version 1.0 in 2006 based on a selected proposal and iterating through versions up to JMVM 8.0 by 2008, incorporating refinements such as time-first coding order for better efficiency. The process culminated in the approval of Joint Draft 8 in July 2008, with the final specification of MVC as Annex H of H.264/AVC and Amendment 1 to ISO/IEC 14496-10:2008, published in 2009, enabling the Multiview High Profile. A subsequent amendment in July 2009 added the Stereo High Profile for simplified stereoscopic applications. Leading contributions came from researchers affiliated with Mitsubishi Electric Research Laboratories and the Fraunhofer Heinrich Hertz Institute (HHI), including key figures such as Anthony Vetro, Thomas Wiegand, and Gary J. Sullivan, who coordinated the algorithmic design and testing. Additional input was provided by other industry players, particularly through proposal submissions and an emphasis on practical deployment for 3D services. A primary focus throughout was ensuring backward compatibility, whereby the base view could be decoded using standard H.264/AVC profiles while dependent views used extension network abstraction layer (NAL) units. This design allowed MVC bitstreams to support legacy single-view playback without requiring full multiview decoder upgrades. Initial performance goals targeted a 20-50% bitrate reduction compared to simulcast for multiview sequences, verified through objective tests on standard test sequences such as "Rena" and "Race1." For instance, MVC achieved an average 20% bitrate saving across up to eight views under common test conditions, with peak gains of up to 50% (equivalent to a 3 dB PSNR improvement) in favorable inter-view prediction scenarios. These results validated the efficacy of inter-view prediction, which leverages spatial correlations between views to enhance compression without altering the core H.264/AVC intra- and inter-frame tools.

Evolution to MV-HEVC and Beyond

The transition from the earlier MVC extension of H.264/AVC to HEVC-based multiview coding was driven by the need for improved compression efficiency with higher-resolution multiview content. In 2012, the Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V), formed by ISO/IEC MPEG and ITU-T VCEG, began developing multiview extensions to the High Efficiency Video Coding (HEVC) standard, building on the foundational work of the JCT-VC initiated in 2010. This effort culminated in the standardization of Multiview HEVC (MV-HEVC) as Annex G of Recommendation ITU-T H.265 in October 2014 and as part of the second edition of ISO/IEC 23008-2 in May 2015. MV-HEVC represents a key advancement through its high-level syntax (HLS) extension to the base HEVC standard, enabling efficient coding of multiple camera views via inter-view prediction without altering the core decoding engine. This design supports up to 16 views at full HD (1920×1080) resolution, facilitating applications such as stereoscopic 3D and free-viewpoint video. Furthermore, MV-HEVC is complemented by the 3D-HEVC extension, which adds specialized tools for depth map coding, allowing depth-aided rendering for enhanced view synthesis and immersive experiences. Following its standardization, MV-HEVC saw adoption in consumer ecosystems for stereoscopic playback; in streaming and broadcast it has been integrated into platforms for 3D and VR content delivery, most prominently Apple's Vision Pro ecosystem for spatial video. By 2025, hardware advancements included NVIDIA's Video Codec SDK 13.0, released in February 2025, providing GPU-accelerated MV-HEVC encoding for stereo and multiview workflows in applications such as automotive displays and immersive headsets. Ongoing explorations within MPEG and ITU-T are extending Versatile Video Coding (VVC, H.266) to multiview scenarios, with demonstrations at IBC 2025 showcasing solutions such as Tencent's MultiView266 for binocular 4K compression in next-generation immersive video. These evolutions address critical challenges in multiview coding, such as supporting up to 8K per view—enabled by HEVC's flexible profile and level structures—and improving efficiency for asymmetric configurations where views differ in resolution, frame rate, or quality. For instance, MV-HEVC's layered structure allows independent decoding of the base view while predicting dependent ones, reducing bandwidth requirements in heterogeneous networks.

Technical Fundamentals

Multiview Video Representation

Multiview video is represented as a sequence of access units, where each access unit comprises pictures from multiple synchronized views captured at the same time instant by an array of cameras. These views provide different perspectives of the same scene, enabling stereoscopic 3D viewing or free-viewpoint navigation upon decoding and rendering. For stereoscopic applications, this typically involves two views (left and right), while more advanced setups can include dozens of views for immersive experiences. Camera arrangements for capturing multiview video are designed to minimize distortions and maximize overlap, commonly configured in linear (side-by-side) formations for basic setups, arc-shaped arrays for intermediate viewpoints, or spherical distributions for full free-viewpoint television (FTV). The baseline between adjacent cameras is typically set to about 6.5 cm in stereoscopic configurations to approximate the human inter-pupillary distance, ensuring natural depth cues. Such arrangements capture highly correlated imagery across views because of the shared scene content. The raw data from these captures is formatted in standard color spaces such as RGB or YCbCr (typically 4:2:0 subsampled for efficiency), with support for either progressive or interlaced scanning depending on the display requirements. In extensions like multiview video plus depth (MVD), each view is augmented with corresponding depth maps, which represent per-pixel distances from the camera plane and facilitate view synthesis for novel viewpoints. A key characteristic of multiview video is the high spatial redundancy between adjacent views, arising from overlapping scene projections. This redundancy is primarily quantified and exploited through disparity vectors, which describe horizontal shifts in pixel positions across views caused by parallax from the camera separation.
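For an ideal rectified, parallel camera pair, the disparity of a scene point is inversely proportional to its depth (disparity = focal length × baseline / depth). The following minimal Python sketch is illustrative only and not part of any MVC specification; the focal length value is an arbitrary example, and the 6.5 cm default baseline echoes the figure mentioned above.

```python
# Illustrative sketch: disparity of a scene point for a rectified,
# parallel stereo pair, disparity = focal_length * baseline / depth.

def disparity_px(depth_m: float, baseline_m: float = 0.065,
                 focal_length_px: float = 1200.0) -> float:
    """Return the horizontal disparity in pixels for a point at depth_m metres.

    Assumes rectified cameras with identical intrinsics. baseline_m defaults
    to the 6.5 cm inter-camera distance mentioned above; focal_length_px is
    an arbitrary example value, not a standardized parameter.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_length_px * baseline_m / depth_m


if __name__ == "__main__":
    for z in (1.0, 2.0, 5.0, 10.0):
        print(f"depth {z:5.1f} m -> disparity {disparity_px(z):6.1f} px")
```

As the example shows, nearby objects exhibit large horizontal shifts between views while distant objects barely move, which is why inter-view correlation (and hence the usefulness of disparity vectors) varies strongly with scene depth.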

Core Prediction Mechanisms

Multiview Video Coding (MVC) extends the prediction frameworks of single-view video coding standards like H.264/AVC by incorporating both temporal and inter-view dimensions to exploit redundancies in multi-camera captures. These core mechanisms enable efficient compression of correlated video sequences from multiple viewpoints, typically arranged in a parallel camera setup, while maintaining compatibility with legacy decoders for the base view. The prediction process operates on a block basis, such as macroblocks in H.264/AVC, to minimize residual errors after motion or disparity compensation. Temporal prediction in MVC relies on motion-compensated prediction within individual views, mirroring the intra-frame and inter-frame techniques of single-view H.264/AVC. This involves dividing frames into blocks and estimating motion vectors against previously decoded pictures in the same view's temporal sequence, thereby reducing redundancies due to object movement over time. The process uses variable block sizes and multiple reference frames for flexibility, achieving high efficiency for dynamic scenes captured from a single viewpoint. Inter-view prediction complements temporal methods by addressing spatial redundancies across adjacent views, primarily through disparity-compensated prediction. Disparity vectors, analogous to motion vectors but capturing horizontal shifts induced by the fixed baseline between cameras, are estimated by matching blocks to corresponding regions in decoded pictures from neighboring views within the same time instant (access unit). This mechanism is particularly effective for static or slowly moving content, where inter-view correlations dominate, and is applied selectively to avoid artifacts in areas with occlusions or depth discontinuities. A hybrid approach integrates these predictions at the block level, allowing rate-distortion optimized mode selection among intra, temporal inter, and inter-view prediction. Encoders evaluate candidate modes against reconstructed pictures, with inter-view reference pictures explicitly marked and managed via slice headers to ensure accessibility without altering the base view's decoding process. This adaptive selection balances complexity and compression gains, often favoring inter-view modes in non-anchor frames for enhanced efficiency. The effectiveness of these mechanisms is quantified through rate-distortion (RD) optimization, where the cost of a prediction mode is computed as J = D + λR, with D representing distortion (e.g., mean squared error) and R the bitrate, weighted by the Lagrange multiplier λ. Inter-view prediction typically yields 20-50% bitrate savings over independent view coding, depending on view count and scene correlation, establishing MVC's impact on multiview compression benchmarks.
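The mode decision described above can be expressed compactly as a Lagrangian cost comparison. The sketch below is illustrative only: the mode names, distortion values, and rate figures are placeholders for quantities an encoder would obtain by actually coding the block in each candidate mode.

```python
# Minimal sketch of Lagrangian rate-distortion mode selection: pick the
# candidate prediction mode (intra, temporal inter, inter-view) that
# minimises J = D + lambda * R.

from dataclasses import dataclass


@dataclass
class ModeCandidate:
    name: str          # e.g. "intra", "temporal", "inter-view"
    distortion: float  # e.g. SSD between original and reconstructed block
    rate_bits: float   # bits to signal the mode, vectors and residual


def select_mode(candidates: list[ModeCandidate], lam: float) -> ModeCandidate:
    """Return the candidate with the smallest RD cost J = D + lam * R."""
    return min(candidates, key=lambda c: c.distortion + lam * c.rate_bits)


if __name__ == "__main__":
    block_modes = [
        ModeCandidate("intra",      distortion=5400.0, rate_bits=620.0),
        ModeCandidate("temporal",   distortion=1800.0, rate_bits=240.0),
        ModeCandidate("inter-view", distortion=1500.0, rate_bits=310.0),
    ]
    lam = 4.0  # Lagrange multiplier, in practice derived from the QP
    print("selected mode:", select_mode(block_modes, lam).name)
```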

MVC for H.264/AVC

Coding Structure and Layers

The coding structure of Multiview Video Coding (MVC) for H.264/AVC organizes the bitstream into a base layer and one or more dependent layers to enable efficient representation of multiple views while maintaining compatibility with existing single-view decoders. The base layer encodes a single view using the unmodified H.264/AVC syntax, allowing it to be decoded by legacy H.264/AVC decoders without any MVC-specific extensions. Dependent layers encode additional views that rely on both temporal prediction within the same view and inter-view prediction from other views, including the base layer, to achieve compression gains. This hierarchical approach supports multiple views, with the standard allowing up to 1024 views via 10-bit view IDs, though typical applications and tests use up to 8 views; the base layer serves as the foundational independent stream, and dependent layers provide delta-coded enhancements for the remaining views. The syntax incorporates MVC-specific descriptors within sequence parameter sets (SPS) and subset sequence parameter sets (subset SPS, NAL unit type 15) to define the multiview configuration. The MVC extension includes critical parameters such as view identification, which assigns a unique view ID (a 10-bit value from 0 to 1023) to each view in a predefined order essential for decoding; view dependency information, outlining inter-view prediction relationships between views; and level indices specifying operation points that determine decodable subsets of the bitstream. The subset SPS further refines this by associating specific parameter sets with individual views or groups of views, ensuring flexible extraction of operation points for targeted decoding, such as a single dependent view. MVC extends the group of pictures (GOP) structure from single-view H.264/AVC to accommodate multiview content, employing hierarchical B-frames across both the temporal and inter-view dimensions for improved efficiency. Within a GOP, anchor pictures—typically I- or P-frames that do not rely on temporal prediction—enable direct inter-view prediction, facilitating random access and symmetric coding in which all views maintain equivalent resolutions and frame rates. For asymmetric configurations, such as mixed-quality stereoscopic video, non-anchor pictures use hierarchical temporal prediction combined with inter-view references, while inter-view GOP structures allow tailored dependencies that can prioritize higher quality for the base view over dependent views to optimize bitrate allocation. Backward compatibility is ensured by confining MVC extensions to new network abstraction layer (NAL) unit types (e.g., type 20 for the coded slice extension), which standard H.264/AVC decoders ignore, allowing seamless extraction and playback of the base layer without modification. This design permits MVC bitstreams to be transported in systems originally supporting only single-view H.264/AVC, such as Blu-ray Disc stereoscopic playback.
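As a concrete illustration of this NAL-unit-level separation, the following Python sketch (a deliberately simplified assumption-laden example, not a conforming bitstream parser) walks an Annex B byte stream and discards the MVC-specific NAL unit types — 14 (prefix NAL unit), 15 (subset SPS), and 20 (coded slice extension) — which is essentially what a legacy H.264/AVC decoder does implicitly when presented with an MVC stream.

```python
# Illustrative sketch: extract the H.264/AVC base view from an MVC Annex B
# byte stream by dropping the MVC-specific NAL unit types. Start-code and
# trailing-byte handling is simplified; do not treat this as a real parser.

MVC_ONLY_NAL_TYPES = {14, 15, 20}  # prefix NAL, subset SPS, coded slice ext.


def split_nal_units(stream: bytes):
    """Yield NAL unit payloads (without start codes) from an Annex B stream."""
    starts, i = [], 0
    while True:
        i = stream.find(b"\x00\x00\x01", i)
        if i < 0:
            break
        starts.append(i + 3)
        i += 3
    for k, s in enumerate(starts):
        end = (starts[k + 1] - 3) if k + 1 < len(starts) else len(stream)
        nal = stream[s:end].rstrip(b"\x00")  # drop trailing zero/filler bytes
        if nal:
            yield nal


def extract_base_view(stream: bytes) -> bytes:
    """Keep only the NAL units a plain H.264/AVC decoder understands."""
    out = bytearray()
    for nal in split_nal_units(stream):
        nal_unit_type = nal[0] & 0x1F  # low 5 bits of the NAL header byte
        if nal_unit_type not in MVC_ONLY_NAL_TYPES:
            out += b"\x00\x00\x00\x01" + nal
    return bytes(out)
```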

Inter-View Prediction Techniques

In H.264 MVC, inter-view prediction exploits spatial correlations between views by estimating disparity vectors through block-matching algorithms applied across different camera perspectives. Disparity estimation in typical encoder implementations operates at integer-pel accuracy, mirroring the integer-pel stage of motion estimation in single-view H.264/AVC, where candidate blocks from a reference view are compared to the current block using metrics such as the sum of absolute differences (SAD). The resulting disparity vectors are encoded and stored in the same way as motion vectors, enabling disparity-compensated prediction that reduces redundancy in non-base views. Encoders often skip sub-pel refinement for inter-view searches to maintain computational efficiency, reserving sub-pel accuracy primarily for temporal motion compensation. The reference picture list in H.264 MVC is extended to include both temporal reference pictures from the same view and inter-view pictures from adjacent views within the same access unit, allowing flexible reordering to prioritize the most relevant references. To account for global camera translations, a global disparity adjustment mechanism shifts the reference frame position by a disparity derived from camera parameters or estimated from the sequence, ensuring accurate prediction across views without per-block computation. This adjustment is applied at the slice or picture level, enhancing prediction accuracy for scenes with parallel camera setups. Specific modes in H.264 MVC facilitate efficient inter-view coding, including inter-view skip and direct modes that enable zero-residual coding by inheriting motion or disparity information from co-located blocks in reference views. In inter-view skip mode, applicable to P-slices in dependent views, the motion vectors and residuals are inferred implicitly from the corresponding block in the adjacent view, bypassing explicit signaling for regions that are stationary across perspectives. The inter-view direct mode, used in B-slices, derives bidirectional prediction from inter-view references, further reducing bitrate for static multiview content. Additionally, weighted prediction is employed to compensate for illumination differences between views, applying explicit weights and offsets to the prediction signal based on view-specific parameters, which is particularly beneficial in non-parallel camera configurations. These inter-view techniques demonstrate significant gains in H.264 MVC, achieving approximately 30% bitrate reduction for stereo video compared to simulcast encoding of independent views, as validated in Joint Video Team (JVT) core experiments on standard test sequences. For multiview setups with more cameras, average savings reach around 20-30% depending on view count and content, highlighting the efficacy of disparity-based prediction in reducing inter-view redundancy.
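The block-matching search described above can be sketched as follows. This is an illustrative Python/NumPy example rather than encoder reference code: it assumes 8-bit grayscale views, restricts the search to horizontal integer-pel offsets (the dominant direction for parallel camera rigs), and simply returns the lowest-SAD candidate; a real MVC encoder would fold this result into its rate-distortion mode decision.

```python
# Illustrative sketch of integer-pel disparity estimation by SAD block
# matching between the current view and a decoded reference view.

import numpy as np


def estimate_disparity(cur_block: np.ndarray, ref_view: np.ndarray,
                       x: int, y: int, max_disp: int = 64) -> tuple[int, float]:
    """Return (best_disparity, best_sad) for the block whose top-left corner
    is at column x, row y of the current view, searched against ref_view
    along the horizontal axis only."""
    h, w = cur_block.shape
    best_d, best_sad = 0, float("inf")
    cur = cur_block.astype(np.int32)
    for d in range(-max_disp, max_disp + 1):
        rx = x + d
        if rx < 0 or rx + w > ref_view.shape[1]:
            continue  # candidate block would fall outside the reference view
        cand = ref_view[y:y + h, rx:rx + w].astype(np.int32)
        sad = float(np.abs(cur - cand).sum())
        if sad < best_sad:
            best_d, best_sad = d, sad
    return best_d, best_sad
```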

MV-HEVC Extension

Architectural Differences

MV-HEVC was standardized in October 2014 as part of H.265, with subsequent amendments in 2015, introducing multiview capabilities to the base High Efficiency Video Coding (HEVC) standard primarily through modifications at the high-level syntax and leaving the core decoding processes below the slice level unchanged. This design reuses existing single-layer HEVC decoding engines without alterations to coding tree units (CTUs) or prediction units (PUs), enabling efficient inter-view prediction while preserving compatibility with monoscopic HEVC bitstreams. The Video Parameter Set (VPS) is extended to signal layer dependencies, view identifiers, and scalability information, while the sequence parameter set (SPS) and picture parameter set (PPS) include additional syntax elements—such as multiview-specific extensions—to support parameter sharing across layers or per-layer customization, including constraints on disparity vectors. In terms of view scalability, MV-HEVC supports coding of multiple texture views and can accommodate depth views as additional texture-like layers, facilitating applications such as multiview-plus-depth formats for 3D rendering. Temporal sub-layers, inherited from base HEVC, are managed through VPS signaling of dependency information, allowing temporal scalability within each view. The structure merges all views into a single layered stream, where layers are identified by network abstraction layer (NAL) unit header layer IDs and mapped to view IDs in the VPS, enabling flexible extraction of subsets for decoding. This contrasts with base HEVC's single-view focus by introducing layer sets and output layer sets for multiview operation, theoretically supporting dozens of views depending on layer allocation. Compared to the original Multiview Video Coding (MVC) extension of H.264/AVC, MV-HEVC maintains a similar hierarchical prediction structure but leverages HEVC's more advanced tools, including coding tree units of up to 64×64 samples versus AVC's 16×16 macroblocks. The base view in MV-HEVC remains fully decodable by a standard single-layer HEVC decoder, ensuring backward compatibility akin to MVC's base view support, but with enhanced compression efficiency for multiview scenarios owing to HEVC's underlying improvements.
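Because MV-HEVC operates at the high-level syntax, a dependent view is distinguished from the base view simply by the nuh_layer_id field of the two-byte HEVC NAL unit header. The sketch below is illustrative only: it decodes that header, but the mapping of layer IDs to view IDs and view order indices lives in the VPS extension, which is not parsed here.

```python
# Illustrative sketch: decode the 2-byte HEVC/MV-HEVC NAL unit header.
# Layout: forbidden_zero_bit (1) | nal_unit_type (6) | nuh_layer_id (6)
#         | nuh_temporal_id_plus1 (3).

def parse_hevc_nal_header(header: bytes) -> dict:
    """Return the fields of a two-byte HEVC NAL unit header."""
    b0, b1 = header[0], header[1]
    return {
        "forbidden_zero_bit": (b0 >> 7) & 0x1,
        "nal_unit_type": (b0 >> 1) & 0x3F,
        "nuh_layer_id": ((b0 & 0x1) << 5) | (b1 >> 3),
        "temporal_id": (b1 & 0x7) - 1,  # nuh_temporal_id_plus1 minus 1
    }


if __name__ == "__main__":
    # Hypothetical example: nal_unit_type 1 (trailing picture),
    # nuh_layer_id 1 (a dependent view), temporal_id 0.
    print(parse_hevc_nal_header(bytes([0x02, 0x09])))
```

A base-view (layer 0) NAL unit parsed this way is exactly what a plain single-layer HEVC decoder consumes, which is how the backward compatibility described above is achieved without touching the block-level decoding process.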

Enhanced Coding Tools

MV-HEVC introduces several advanced coding tools that leverage the foundational capabilities of HEVC while addressing multiview-specific challenges, such as inter-view redundancies and geometric distortions. These enhancements build upon inter-view prediction mechanisms by incorporating higher precision and adaptive techniques, enabling more efficient compression for multi-camera setups. Key among them are improvements in disparity handling, depth-based processing, and compensation for environmental variations. Improved disparity compensation in MV-HEVC achieves sub-pixel accuracy for inter-view motion compensation, utilizing quarter-sample precision with HEVC's advanced interpolation filters, such as the 7-tap and 8-tap luma filters, to refine disparity-compensated predictions. This is complemented by an adaptation of the Advanced Motion Vector Prediction (AMVP) scheme, which incorporates inter-view candidates into the candidate list for merge and AMVP modes, allowing efficient derivation of disparity vectors without explicit scaling thanks to the use of long-term reference pictures for inter-view references. These modifications reduce artifacts in synthesized views and enhance prediction accuracy across camera geometries. Depth-based processing represents a significant advancement through integration with 3D-HEVC, where depth maps are employed to facilitate view synthesis prediction (VSP). In this approach, depth information enables backward warping of texture samples from reference views to generate predictive blocks for target views, with VSP merge candidates signaled at the block level to select synthesized textures. This tool exploits geometric relationships derived from depth data, allowing more precise inter-view predictions, particularly in scenarios with occlusions or sparse camera arrangements, and is especially effective when combined with sub-block partitioning for finer granularity. To mitigate inconsistencies arising from differing lighting conditions across cameras, illumination compensation—specified among the related 3D-HEVC tools—uses local and global linear models. These models apply a scaling factor and offset, estimated from neighboring reconstructed samples, to adjust predicted samples from reference views, thereby aligning intensity levels and reducing residual errors in inter-view prediction. The local model operates on a per-block basis for fine-tuned adaptation, while global variants provide broader corrections, improving coding efficiency especially for sequences with non-uniform illumination. These enhanced tools collectively yield substantial efficiency gains, with MV-HEVC demonstrating 25-40% bitrate reductions compared to simulcast HEVC encoding for high-definition multiview content, as verified under JCT-3V common test conditions using sequences such as "BookArrival." For instance, in two-view configurations, average savings reach approximately 28%, rising to about 38% for three views, highlighting the impact on practical deployment for immersive video applications.
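A minimal sketch of the linear illumination-compensation idea follows, assuming 8-bit samples and using simple mean/variance matching over neighboring reconstructed samples to estimate the scale and offset. The normative derivation in 3D-HEVC differs in detail, so treat this purely as an illustration of the a·p + b model described above.

```python
# Illustrative sketch of linear illumination compensation: adjust samples
# predicted from a reference view as a * p + b, with (a, b) estimated from
# already-reconstructed neighbouring samples so nothing extra is signalled.

import numpy as np


def estimate_ic_params(neigh_cur: np.ndarray, neigh_ref: np.ndarray):
    """Estimate (a, b) so that a * neigh_ref + b approximates neigh_cur."""
    ref = neigh_ref.astype(np.float64).ravel()
    cur = neigh_cur.astype(np.float64).ravel()
    var_ref = ref.var()
    if var_ref == 0:
        a = 1.0
    else:
        a = float(np.cov(ref, cur, bias=True)[0, 1] / var_ref)
    b = float(cur.mean() - a * ref.mean())
    return a, b


def apply_ic(pred_block: np.ndarray, a: float, b: float) -> np.ndarray:
    """Apply the compensated prediction and clip to the 8-bit sample range."""
    return np.clip(a * pred_block.astype(np.float64) + b, 0, 255).astype(np.uint8)
```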

Implementation and Ecosystem

Software and Hardware Support

Commercial encoders for Multiview Video Coding (MVC) and its extensions, such as Multiview HEVC (MV-HEVC), are provided by specialized vendors to support professional and consumer workflows. Elecard's StreamEye suite includes tools for analyzing MV-HEVC video streams, enabling quality assessment and debugging of multiview content through metrics such as PSNR and SSIM. MainConcept offers a dedicated MV-HEVC Encoder SDK add-on, which facilitates the creation of stereoscopic content compatible with platforms such as Apple Vision Pro, supporting Main 10 profiles, the associated metadata signaling, and integration with formats such as MP4 and HLS. Hardware acceleration for MV-HEVC encoding has been integrated into NVIDIA's NVENC engine via the Video Codec SDK, with support introduced in version 13.0 in early 2025 to enable efficient stereo encoding for applications in broadcast, automotive, and AR/VR. This hardware-based approach leverages GPU resources to handle the demands of multiview compression, providing improved throughput and efficiency over software-only methods. Decoding support for MVC has been embedded in consumer hardware since the finalization of the Blu-ray 3D specification in December 2009, which mandates MVC as the codec for stereoscopic video on optical discs, allowing backward compatibility with standard H.264/AVC players. For MV-HEVC, Apple devices such as the Vision Pro gained native support for stereoscopic playback through the HEVC Stereo Video interoperability profile introduced in 2023, enabling seamless rendering of multiview streams on compatible displays. In broadcasting, MVC has been used for experimental 3D content transmission. For streaming, Dolby Vision Profile 20, introduced in 2023, incorporates MV-HEVC to deliver immersive stereoscopic video over IP networks, supporting 3D experiences on HDR-enabled devices. A key challenge in deploying MVC and MV-HEVC is the elevated decoding complexity, which can be two to three times higher than single-view HEVC due to inter-view prediction and multi-loop processing, necessitating optimized hardware or software to maintain performance. Open-source alternatives, such as those based on reference software, provide complementary implementation options and are detailed separately.

Open Source Developments

Early open-source implementations of Multiview Video Coding (MVC) faced significant gaps, particularly in comprehensive encoding and decoding support within widely used projects such as FFmpeg and x264. Until around 2016, FFmpeg and x264 lacked full MVC capabilities, with x264 focused on single-view H.264/AVC encoding without native multiview extensions. Partial MVC decoding emerged through external filters such as LAV Filters, which introduced H.264 MVC 3D demuxing and basic decoding support in version 0.68.0, released on March 8, 2016, enabling playback of MVC streams from formats like 3D MKV and Blu-ray 3D but requiring integration with compatible renderers such as madVR for full functionality. By 2025, open-source support for Multiview HEVC (MV-HEVC) had advanced considerably, primarily through integrations in FFmpeg and the x265 encoder. FFmpeg now provides encode and decode paths for MV-HEVC via libx265, with the encoder limited to at most two views, targeting stereoscopic content. Multiview support was incorporated into x265 starting with version 4.0 in September 2024, following community patches that added compile-time configuration for MV-HEVC. The command-line interface includes options such as --multiview-config for specifying multiview encoding parameters, including view counts and per-view inputs, allowing users to generate stereoscopic HEVC bitstreams. Several open-source tools facilitate playback and processing of 3D and multiview content. The Bino 3D player, a free stereoscopic video player, supports playback of 3D video in formats such as side-by-side and frame-packed streams on Linux, Windows, and macOS. VLC handles HEVC playback through its FFmpeg backend, enabling playback of two-view MV-HEVC files without additional plugins, provided recent versions are used for optimal HEVC handling. Despite these advancements, open-source MVC implementations remain limited, particularly for scenarios exceeding two views, where support is incomplete and often requires custom builds or experimental patches in tools like FFmpeg. Development is largely community-driven, resulting in slower integration of multiview features compared to single-view codecs, with reliance on software decoding that can affect performance on resource-constrained systems.

Intellectual Property

Patent Pools and Holders

The primary patent pool for Multiview Video Coding (MVC), an extension of the H.264/AVC standard, was administered by MPEG LA, which announced the MVC Patent Portfolio License on February 23, 2012, to provide one-stop licensing for essential patents. Following the 2023 acquisition of MPEG LA by Via Licensing Corp., the program continues under the Via Licensing Alliance (Via LA), which had expanded the AVC Patent Portfolio License in 2022 to encompass complete MVC coverage without altering royalty structures. This pool aggregates essential patents from more than 40 organizations, enabling streamlined access for implementers. Key initial contributors to the MVC pool included Panasonic Corporation, LG Electronics Inc., Dolby Laboratories Licensing Corporation, and Thomson Licensing, among roughly 15 total licensors whose patents covered core multiview functionalities. Subsequent expansions incorporated additional holders from the broader AVC ecosystem, such as Fujitsu Limited and Mitsubishi Electric Corporation, reflecting the integrated nature of MVC within H.264. As of 2025, a majority of patents in the H.264/MVC pool have lapsed through natural expiration (over 50% by 2023, with further expirations ongoing), particularly those filed in the early 2000s, lowering effective licensing burdens for legacy MVC deployments while remaining active patents continue to provide coverage. Royalties under the AVC/MVC license are set at $0.20 per end-product unit (e.g., decoders or encoders), applicable after a 100,000-unit annual threshold, with enterprise-wide caps ranging from $3.5 million to $25 million depending on volume. For MV-HEVC, the multiview extension to HEVC (H.265), essential patents are licensed through multiple pools, including Via LA's HEVC Patent Portfolio License, established in 2013, and the HEVC Advance pool, covering both base and extension profiles. These programs include contributions from major patent holders that overlap with the core HEVC declarations, facilitating multiview implementations in advanced video systems.

Licensing and Adoption Barriers

The licensing of Multiview Video Coding (MVC) is administered through Via Licensing Alliance's AVC/H.264 Patent Portfolio License, which was expanded in 2022 to fully cover MVC as an extension without altering the existing royalty structure or imposing additional fees. Per-unit charges for encoders and decoders are $0.20 after initial volume thresholds, with annual caps such as $25 million per legal entity for high-volume deployments. These licensing costs and the complexity of patent pools have posed significant barriers to MVC adoption, particularly in open-source development prior to 2020, where commercial distribution of encoders or decoders required navigating royalty obligations that discouraged broad experimentation and integration. Hardware fragmentation further exacerbated this, with limited native support for MVC decoding in mobile ecosystems such as Android, leading to inconsistent playback across devices and hindering consumer applications. MVC adoption peaked in the early 2010s with its integration into Blu-ray 3D discs, where it enabled efficient stereoscopic encoding for home theater systems, but declined sharply with the rise of streaming services that prioritized 2D content due to bandwidth constraints and reduced demand for 3D viewing. By 2025, a revival is underway in virtual reality (VR) applications through the MV-HEVC extension, supported by devices such as the Apple Vision Pro for immersive stereoscopic experiences. To circumvent these patent-related hurdles, some regions and developers have shifted to alternative standards such as AVS3, which incorporates depth-based formats and offers a more transparent, low-cost licensing model designed to avoid reliance on international patent pools.
