USB video device class
The USB Video Device Class (UVC) is a standardized USB device class specification developed by the USB Implementers Forum (USB-IF) that defines protocols for video streaming, control, and still image capture from USB-connected devices, ensuring interoperability between hosts and devices like webcams, digital camcorders, and video converters.[1] Introduced in version 1.0 in 2003, the specification has evolved through revisions including 1.1 and the current 1.5 (released August 9, 2012), which adds features such as enhanced encoding controls, latency optimizations, and support for modern video formats while maintaining backward compatibility.[1] UVC devices are identified by base class code 0x0E in USB interface descriptors, encompassing sub-classes and protocols detailed in the specification for video control (VC) and video streaming (VS) interfaces.[2]

Key aspects of UVC include its format-agnostic approach, supporting frame-based formats (e.g., uncompressed YUV or MJPEG), stream-based formats (e.g., MPEG-2 transport streams), and temporally compressed formats (e.g., H.264), alongside mechanisms for stream negotiation via probe and commit controls to manage bandwidth and device capabilities.[1] The class enables precise device controls such as pan, tilt, zoom, privacy settings, and region-of-interest selection, using standard USB control transfers and status interrupts for real-time feedback.[1] Supported device types range from traditional video cameras and TV tuners to analog-to-digital converters, displays with video input, and media transport devices, all compliant with USB 2.0 and later specifications.[1] By standardizing payload headers with framing information (e.g., frame ID, end-of-frame markers), UVC simplifies driver implementation, often allowing plug-and-play operation without custom software on major operating systems.[1]

Overview
Definition and purpose
The USB Video Device Class (UVC) is a standardized protocol within the Universal Serial Bus (USB) framework, designated with the base class code 0x0E, for devices capable of streaming video and capturing still images over USB connections.[2] It encompasses two primary subclasses: Video Control (0x01), which manages device settings such as camera adjustments, and Video Streaming (0x02), which handles the transmission of video payloads.[3] This class defines the necessary descriptors, requests, and controls to enable consistent video functionality without requiring device-specific implementations.[4]

The primary purpose of UVC is to standardize communication protocols between host systems and video peripherals, thereby minimizing the development of custom drivers and promoting broad interoperability.[5] By establishing a common interface for video data exchange, UVC allows devices like cameras and capture cards to operate seamlessly across various platforms and operating systems, leveraging built-in host drivers for plug-and-play compatibility.[5]

UVC specifically addresses input video devices that capture and stream content or still images, excluding output displays or rendering functions.[4] Developed by the USB Implementers Forum (USB-IF), it aims to simplify the integration of video capabilities into consumer electronics ecosystems.[4] The specification's initial release in 2003 was motivated by the need to resolve driver fragmentation that arose after the widespread adoption of USB 2.0, providing a unified approach to video device support.[3]

Key features and benefits
The USB Video Device Class (UVC) emphasizes plug-and-play functionality through automatic device enumeration and configuration, eliminating the requirement for proprietary drivers and enabling seamless integration with host systems.[1] This approach leverages the standard USB class structure to ensure devices are recognized and operational immediately upon connection, reducing setup complexity for end users.[1] At its core, UVC defines two primary interfaces: the Video Control (VC) interface, which provides standardized controls for camera settings such as brightness, contrast, and zoom, and the Video Streaming (VS) interface, responsible for transmitting video data payloads efficiently.[1]

Bandwidth efficiency is a hallmark of UVC, achieved via support for isochronous transfers over USB 2.0 and USB 3.0, which deliver real-time video streaming with minimal latency suitable for applications like video conferencing and surveillance.[1] These transfers prioritize guaranteed bandwidth allocation, ensuring consistent performance even in shared bus environments without the overhead of retransmissions common in other USB transfer modes.[1] This capability scales effectively from low-resolution webcams delivering basic 640x480 streams to high-definition devices supporting 1080p or higher resolutions at frame rates up to 60 fps, adapting to diverse hardware constraints.[1]

Key benefits of UVC include significantly reduced development costs for manufacturers, as adherence to the class specification minimizes the need for custom software and testing.[1] It fosters broad interoperability across operating systems and devices, promoting widespread adoption in consumer and professional markets.[1] Additionally, UVC supports vendor-specific extensions, allowing proprietary enhancements while preserving baseline compliance to maintain compatibility.[1]

Technical architecture
Device class structure
The USB Video Device Class (UVC) is defined within the USB framework using specific class codes to identify its functionality. The base class code is 0x0E, designated as CC_VIDEO for video devices. This class employs two primary subclasses: 0x01 for SC_VIDEOCONTROL, which handles device controls, and 0x02 for SC_VIDEOSTREAMING, which manages video data transfer. The protocol code is 0x00 (PC_PROTOCOL_UNDEFINED) in UVC 1.0 and 1.1 and 0x01 (PC_PROTOCOL_15) for UVC 1.5 devices, with vendor-specific extensions allowed under 0xFF where applicable.[1]

UVC devices organize their interfaces hierarchically to separate control and data functions. The VideoControl (VC) interface is mandatory and operates with a single alternate setting (0), incorporating a required control endpoint for management and an optional interrupt endpoint for status updates, particularly if hardware triggers or automatic control adjustments are implemented. The VideoStreaming (VS) interface is optional and supports multiple instances for handling concurrent streams, utilizing isochronous endpoints for real-time video data or bulk endpoints for non-real-time transfers, with an additional optional bulk endpoint for still image capture in certain configurations. This structure ensures efficient resource allocation within the USB bus, allowing devices to expose controls independently from streaming operations.[1]

At the core of UVC's organization is a unit hierarchy that models the video pipeline through interconnected descriptors. Input Terminals (ITs) serve as data sources with a single output pin, such as Camera Terminals (CTs) that interface with sensors and support features like zoom or focus. Output Terminals (OTs) act as data sinks with a single input pin. Processing Units (PUs) connect between terminals or other units, applying image adjustments like brightness or contrast via a single input and output. Selector Units (SUs) route from multiple inputs to one output, while Extension Units (XUs) enable vendor-specific processing with flexible input/output configurations. Encoding Units (EUs) manage compression attributes for outputs. These elements link via unique identifiers (bTerminalID for terminals and bUnitID for units), with connections specified by bSourceID fields (baSourceID arrays on units with several inputs). The result is a directed acyclic graph: loops are prohibited, and each input pin connects to exactly one upstream output, so several sources merge only through units that have multiple input pins, which keeps data flow predictable.[1]

| Component Type | Identifier Field | Connection Field | Key Characteristics |
|---|---|---|---|
| Input Terminal (IT) | bTerminalID | N/A (source) | 0 inputs, 1 output; e.g., Camera Terminal for sensor interface |
| Output Terminal (OT) | bTerminalID | bSourceID | 1 input, 0 outputs; data sink |
| Processing Unit (PU) | bUnitID | bSourceID | 1 input, 1 output; controls like contrast |
| Extension Unit (XU) | bUnitID | baSourceID | ≥1 inputs, 1 output; custom vendor features |
| Selector Unit (SU) | bUnitID | baSourceID | ≥1 inputs, 1 output; input routing |
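
The terminal and unit graph described above can be sketched as a small in-memory model. The following is a minimal illustration, not a descriptor parser: the `Entity` class, the `validate_topology` function, and the example IDs are hypothetical names chosen for this sketch, and the input-count rules are taken from the table above (with Encoding Units assumed to have a single input).

```python
from dataclasses import dataclass

SINGLE_INPUT = {"OT", "PU", "EU"}  # assumed: exactly one source ID
MULTI_INPUT = {"SU", "XU"}         # one or more source IDs


@dataclass(frozen=True)
class Entity:
    """One terminal or unit in a VideoControl topology (hypothetical model)."""
    entity_id: int        # bTerminalID or bUnitID
    kind: str             # "IT", "OT", "PU", "SU", "XU" or "EU"
    sources: tuple = ()   # IDs of upstream entities; empty for an IT


def validate_topology(entities):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    by_id = {}
    for e in entities:
        if e.entity_id in by_id:
            problems.append(f"duplicate ID {e.entity_id}")
        by_id[e.entity_id] = e

    for e in by_id.values():
        n = len(e.sources)
        if e.kind == "IT" and n != 0:
            problems.append(f"IT {e.entity_id} must not have a source")
        elif e.kind in SINGLE_INPUT and n != 1:
            problems.append(f"{e.kind} {e.entity_id} needs exactly one source")
        elif e.kind in MULTI_INPUT and n < 1:
            problems.append(f"{e.kind} {e.entity_id} needs at least one source")
        for s in e.sources:
            if s not in by_id:
                problems.append(
                    f"entity {e.entity_id} references unknown source {s}")

    # Depth-first search along source links to detect loops.
    state = {}

    def has_cycle(eid):
        if state.get(eid) == "active":
            return True
        if state.get(eid) == "done":
            return False
        state[eid] = "active"
        found = any(s in by_id and has_cycle(s) for s in by_id[eid].sources)
        state[eid] = "done"
        return found

    if any(has_cycle(eid) for eid in by_id):
        problems.append("cycle detected")
    return problems


# A typical webcam chain: camera terminal -> processing unit -> output terminal.
webcam = [Entity(1, "IT"), Entity(2, "PU", (1,)), Entity(3, "OT", (2,))]
print(validate_topology(webcam))
```

Walking the source links backwards from an Output Terminal in this way is also how a host can discover which units, and therefore which controls, sit on a given stream's path.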
Standard requests and controls
The USB Video Device Class (UVC) defines a standard set of class-specific requests, carried over USB control transfers, that let hosts configure and query device parameters such as exposure, focus, and frame rate through the VideoControl (VC) and VideoStreaming (VS) interfaces.[1] These requests build on the class's interface structure to facilitate runtime control without requiring custom drivers.[1] The core requests are GET_CUR and SET_CUR, which retrieve or set the current value of a control, with the value exchanged in the data stage.[1] Complementary requests such as GET_MIN, GET_MAX, GET_RES, and GET_DEF report the range, resolution, and default value of a control, supporting parameters like exposure time (via CT_EXPOSURE_TIME_ABSOLUTE_CONTROL, selector 0x04) and absolute focus (via CT_FOCUS_ABSOLUTE_CONTROL, selector 0x06).[1] For instance, auto-focus is toggled with GET_CUR/SET_CUR on the camera terminal's CT_FOCUS_AUTO_CONTROL selector (0x08), and the device stalls the request if manual adjustment is attempted while auto-focus is enabled.[1]

VC interface requests handle configuration and event notification, with status interrupts via an optional endpoint to report device events like control changes or stream errors.[1]

VS interface requests focus on stream initialization, primarily through Video Probe and Video Commit controls (VS_PROBE_CONTROL and VS_COMMIT_CONTROL).[1] Both carry the same data structure, which is 48 bytes long in UVC 1.5 (26 bytes in UVC 1.0 and 34 bytes in UVC 1.1), and negotiate parameters like resolution, frame interval, and bitrate, with fields including bmHint for optimization hints, bFormatIndex for format selection, and dwFrameInterval for timing.[1] The probe request proposes settings, while commit finalizes them, ensuring compatibility before streaming begins.
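
To make the probe and commit payload more concrete, the sketch below packs and unpacks the leading 26 bytes of the structure, which is the portion shared by all specification versions as far as this editor understands the layout. The helper names `pack_probe` and `unpack_probe` are hypothetical, and the sketch assumes that dwFrameInterval is expressed in 100 ns units and that bit D0 of bmHint asks the device to hold the frame interval fixed; it does not talk to a device.

```python
import struct
from collections import namedtuple

# Little-endian layout of the first 26 bytes of the probe/commit structure.
PROBE_PREFIX = struct.Struct("<HBBIHHHHHII")

ProbeControl = namedtuple(
    "ProbeControl",
    "bmHint bFormatIndex bFrameIndex dwFrameInterval wKeyFrameRate "
    "wPFrameRate wCompQuality wCompWindowSize wDelay "
    "dwMaxVideoFrameSize dwMaxPayloadTransferSize",
)


def pack_probe(format_index, frame_index, fps, total_len=26):
    """Build a probe payload proposing a format, a frame size and a frame rate.

    Fields the host does not care about are left at zero for the device
    to fill in. total_len pads the payload for later spec versions
    (34 bytes for UVC 1.1, 48 bytes for UVC 1.5).
    """
    fields = ProbeControl(
        bmHint=0x0001,                           # assumed: keep interval fixed
        bFormatIndex=format_index,
        bFrameIndex=frame_index,
        dwFrameInterval=round(10_000_000 / fps),  # assumed 100 ns units
        wKeyFrameRate=0,
        wPFrameRate=0,
        wCompQuality=0,
        wCompWindowSize=0,
        wDelay=0,
        dwMaxVideoFrameSize=0,
        dwMaxPayloadTransferSize=0,
    )
    return PROBE_PREFIX.pack(*fields).ljust(total_len, b"\x00")


def unpack_probe(data):
    """Decode the common 26-byte prefix of a probe/commit payload."""
    return ProbeControl._make(PROBE_PREFIX.unpack_from(data))


proposal = pack_probe(format_index=1, frame_index=2, fps=30)
print(len(proposal), unpack_probe(proposal).dwFrameInterval)
```

In a real negotiation the host would send such a payload with SET_CUR on VS_PROBE_CONTROL, read back the device's adjusted values with GET_CUR, and only then send the agreed structure with SET_CUR on VS_COMMIT_CONTROL.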
The VS interface also includes the Stream Error Code Control (VS_STREAM_ERROR_CODE_CONTROL, selector 0x06), accessible via GET_CUR to retrieve stream-specific error status, such as format changes or transmission issues.[1]

Control selectors specify the parameters targeted by these requests, categorized under processing units (PU), camera terminals (CT), and others.[1] Each selector is a single-byte value that is unique only within its entity type. Key examples include:

| Selector | Value (Hex) | Description | Supported Requests |
|---|---|---|---|
| PU_BRIGHTNESS_CONTROL | 0x02 | Adjusts image brightness | GET/SET_CUR, GET_MIN/MAX/RES/DEF |
| CT_FOCUS_AUTO_CONTROL | 0x08 | Enables/disables automatic focus adjustment | GET/SET_CUR, GET_INFO |
| CT_EXPOSURE_TIME_ABSOLUTE_CONTROL | 0x04 | Sets exposure time in units of 100 µs | GET/SET_CUR, GET_MIN/MAX/RES/DEF |
| CT_FOCUS_ABSOLUTE_CONTROL | 0x06 | Manually sets focus distance in millimeters | GET/SET_CUR, GET_MIN/MAX/RES/DEF |
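
As an illustration of how a request and a selector combine on the wire, the following sketch assembles the 8-byte USB setup packet for a control request addressed to a unit or terminal. The function `build_setup` and the two lookup tables are hypothetical helpers for this sketch. The request codes, single-byte selector values, and payload lengths are the constants from the specification's appendix as this editor understands them, and the entity ID and interface number in the usage line are made-up values that would normally come from the device's descriptors.

```python
import struct

# Class-specific request codes (bRequest).
REQUEST_CODES = {
    "SET_CUR": 0x01, "GET_CUR": 0x81, "GET_MIN": 0x82, "GET_MAX": 0x83,
    "GET_RES": 0x84, "GET_LEN": 0x85, "GET_INFO": 0x86, "GET_DEF": 0x87,
}

# Selector name -> (selector value, payload length in bytes).
SELECTORS = {
    "PU_BRIGHTNESS_CONTROL": (0x02, 2),
    "CT_EXPOSURE_TIME_ABSOLUTE_CONTROL": (0x04, 4),
    "CT_FOCUS_ABSOLUTE_CONTROL": (0x06, 2),
    "CT_FOCUS_AUTO_CONTROL": (0x08, 1),
}


def build_setup(request, selector, entity_id, interface):
    """Return the 8-byte setup packet for a VideoControl unit/terminal request."""
    code = REQUEST_CODES[request]
    value, length = SELECTORS[selector]
    if request == "GET_INFO":
        length = 1  # GET_INFO always returns a one-byte capability bitmap
    elif request == "GET_LEN":
        length = 2  # GET_LEN returns the control's length as a 16-bit value
    # Class request to an interface; bit 7 set means device-to-host (a GET).
    bm_request_type = 0xA1 if code & 0x80 else 0x21
    w_value = value << 8                    # selector in the high byte
    w_index = (entity_id << 8) | interface  # entity ID high, interface low
    return struct.pack("<BBHHH", bm_request_type, code, w_value, w_index, length)


# Read the current brightness from a processing unit with hypothetical ID 2
# on VideoControl interface 0.
print(build_setup("GET_CUR", "PU_BRIGHTNESS_CONTROL", entity_id=2, interface=0).hex())
```

Because selector values overlap between entity types (0x04 means something different on a camera terminal than on a processing unit), the entity ID in wIndex is what tells the device which control is meant.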