Azure Kinect
The Azure Kinect Developer Kit (DK) is a compact spatial computing device developed by Microsoft. It integrates multiple AI-enabled sensors: a 1-megapixel time-of-flight depth camera with wide and narrow field-of-view options, a 12-megapixel RGB color camera, a seven-microphone circular array for spatial audio, and an inertial measurement unit (IMU) comprising an accelerometer and a gyroscope.[1] The device enables real-time capture of depth data, high-resolution video, directional audio, and motion information, supporting applications in computer vision, body tracking, speech recognition, and mixed reality through integration with Azure AI services and open-source software development kits (SDKs).[1]
Announced on February 24, 2019, at Mobile World Congress in Barcelona, the Azure Kinect DK succeeded the Kinect for Windows v2 sensor, whose production was discontinued in 2015 and whose sales ended in 2018; it was positioned as a professional-grade tool for developers and enterprises rather than for consumer gaming.[2] Priced at $399, it became available for preorder immediately after the announcement, reached general availability in the United States and China in July 2019, and subsequently rolled out to markets including the United Kingdom, Germany, and Japan.[3]
The device is less than half the size of its predecessor while offering improved depth accuracy and temporal synchronization, making it suitable for advanced research in human-computer interaction and robotics.[1] It supports cross-platform development via the Azure Kinect Sensor SDK, which is compatible with Windows 10/11 and Ubuntu 18.04 LTS, alongside a specialized SDK for body tracking (detecting up to 32 joints per person for multiple individuals) and integration with Azure Cognitive Services for AI processing.[4] Because it lacks onboard processing power, it requires connection to a host PC for computation, and it includes versatile mounting options for deployment in diverse environments such as offices or labs.[1]
In August 2023, Microsoft announced the end of production for the Azure Kinect DK hardware, citing a shift in focus toward broader AI ecosystem partnerships. The device remained available through third-party suppliers until stocks were depleted, and support for the Azure Kinect SDK ended on August 16, 2024, though the GitHub repositories and related Azure cloud services remain accessible as of 2025.[5][6]
Overview
Description
The Azure Kinect Developer Kit (DK) is a spatial computing developer kit that combines depth sensing, artificial intelligence capabilities, and integration with Azure cloud services, enabling the development of applications in computer vision and human-computer interaction.[1] It provides developers with tools to create AI-driven solutions that leverage spatial data for enhanced perception and interaction in real-world environments.[1] Priced at $399 at launch, the kit targets developers and commercial businesses building sophisticated AI models, particularly in fields such as robotics and healthcare, where it supports applications like patient monitoring and automated navigation systems.[2][1][7] Evolving from the original Kinect sensor, the Azure Kinect shifts focus to professional, non-gaming use cases, offering a compact form factor for enterprise-level AI development while integrating with Azure services for scalable cloud-based processing.[8][1]
Key Features
The Azure Kinect Developer Kit (DK) features multi-modal sensing, combining depth, color, infrared, and audio capture in a single device to enable precise 3D spatial mapping and environmental understanding. This integration lets developers acquire synchronized data from time-of-flight depth sensing for spatial reconstruction, RGB video for visual detail, infrared for low-light operation, and a microphone array for audio localization, supporting applications in robotics, human-computer interaction, and augmented reality.[1]
AI acceleration is provided through specialized software development kits (SDKs) that leverage the device's sensor data for advanced processing, including real-time body tracking of multiple individuals with 3D joint estimation, object detection via custom models, and speech recognition using the integrated audio streams. These capabilities rely on machine learning models run on the host system, drawing on the high-quality, low-latency inputs captured by the device to enable edge AI applications that do not depend solely on cloud resources.
High-fidelity data streams are a core strength: synchronized capture across sensors at up to 30 frames per second (FPS) for depth imaging and 4K resolution for RGB video ensures the temporal alignment essential for dynamic scene analysis and multi-view fusion. This performance enables robust handling of complex environments, such as crowded spaces or fast-moving subjects, while minimizing latency in data acquisition.[9]
The modular design supports custom AI pipelines, with configurable sensor modes, open-source SDKs, and support for external synchronization, allowing developers to tailor workflows for specialized tasks such as volumetric capture or hybrid edge-cloud processing. It also integrates with Azure services for scalable AI deployment.
Development and Release
Announcement and Development
The development of the Azure Kinect originated as an evolution of the Kinect v2 sensor technology, shifting Microsoft's focus from consumer gaming peripherals to enterprise-grade AI and computer vision tools.[10] This shift emphasized integration with cloud-based AI services, achieved through close collaboration between Microsoft's hardware engineering teams and the Azure cloud platform group to enable seamless data processing and analytics at the edge.[11]
An early milestone came in February 2018, when Microsoft researchers presented a prototype time-of-flight (ToF) depth sensor at the IEEE International Solid-State Circuits Conference (ISSCC), highlighting its high-resolution, low-noise capabilities for real-time 3D mapping.[12] This technology, developed in partnership with semiconductor experts, laid the groundwork for the device's advanced spatial sensing. Building on it, Microsoft announced Project Kinect for Azure in May 2018 during its Build developer conference, unveiling a developer-oriented sensor package that combined depth imaging with onboard compute and connectivity for AI prototyping.[11]
The Azure Kinect was formally announced on February 24, 2019, at Mobile World Congress (MWC) in Barcelona, Spain, where Microsoft showcased it alongside the HoloLens 2 as part of a broader push into mixed reality and intelligent devices.[13] Microsoft's strategic vision framed the device as a foundational tool for spatial computing, designed to help developers create AI-driven applications for industries such as healthcare, retail, and robotics, extending beyond the original Kinect's entertainment roots.[1]
Launch and Availability
The Azure Kinect Developer Kit (DK) reached general availability in the United States and China on July 15, 2019, following preorders that began in February 2019 to give developers early access.[2][14] Availability expanded to the United Kingdom, Germany, and Japan in April 2020, driven by strong initial interest from developers building AI-driven computer vision and speech models.[15][1] The kit was positioned exclusively for developers, with no consumer version produced, and included the Azure Kinect SDK for integration and rapid prototyping of applications.[1] It was offered for direct purchase at $399 through the Microsoft Store and the Azure portal, with global distribution later supported by authorized partners in additional regions.[1][16] Early reception highlighted its appeal for enterprise use cases in areas such as healthcare and manufacturing, contributing to quick adoption among AI researchers and systems integrators.[15]
Discontinuation
In August 2023, Microsoft announced the end of production for the Azure Kinect Developer Kit in an official statement on its Mixed Reality blog, concluding direct hardware manufacturing by the company.[17] The hardware discontinuation took effect in October 2023, after which the device was available for purchase only until existing stocks were depleted.[17] The decision reflected Microsoft's strategic shift toward a partner ecosystem for hardware production, enabling third-party manufacturers to license and build on the Kinect's time-of-flight depth-sensing technology for broader customization and availability.[17] By licensing the software intellectual property rather than selling hardware, Microsoft aimed to sustain the technology's ecosystem without maintaining low-volume direct production.[18]
For ongoing support, the Azure Kinect SDK followed Microsoft's Modern Lifecycle Policy, receiving security and reliability updates until its retirement on August 16, 2024, with no new features developed thereafter.[19] Microsoft recommended sourcing spare parts from third-party suppliers and kept the SDK and related tools accessible via GitHub so existing users could maintain their deployments.[17]
Hardware
Sensors and Cameras
The Azure Kinect DK incorporates several advanced sensors designed to capture multimodal data for spatial computing and AI applications: a depth camera, an RGB camera, an infrared camera, a microphone array, and an inertial measurement unit (IMU), each contributing to comprehensive environmental perception without relying on external lighting for core functions.[9]
The depth camera employs time-of-flight (ToF) technology to generate 3D depth maps, illuminating the scene with modulated near-infrared light from an integrated emitter and measuring the phase shift of the reflected light to determine distances up to 5.46 meters. This active sensing approach enables robust 3D mapping in low-light conditions by computing depth through indirect ToF principles, such as amplitude-modulated continuous-wave (AMCW) detection, supporting applications like object reconstruction and gesture recognition.[20][9]
Complementing the depth data, the RGB camera provides high-resolution color imaging that can be aligned and overlaid onto the depth maps, adding visual texture and detail for enhanced scene understanding in computer vision tasks. This integration allows developers to fuse color information with 3D geometry, facilitating applications such as augmented reality overlays and facial analysis.[9]
The infrared (IR) camera, integrated within the depth sensing system, operates in narrow and wide field-of-view modes to capture IR imagery, providing raw IR data that can be used passively (without the ToF emitter) or actively for depth computation, which aids perception in environments with varying lighting.
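The amplitude-modulated continuous-wave relationship between measured phase shift and distance can be illustrated with a short sketch. This shows only the general indirect-ToF principle, not Microsoft's actual depth pipeline, which combines multiple modulation frequencies to unwrap phase over the full operating range; the 100 MHz figure below is an illustrative value, not a published device parameter.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def phase_to_depth(phase_rad: float, mod_freq_hz: float) -> float:
    """Indirect ToF: a round-trip phase shift phi at modulation frequency f
    corresponds to a distance d = c * phi / (4 * pi * f)."""
    return SPEED_OF_LIGHT * phase_rad / (4 * math.pi * mod_freq_hz)

def unambiguous_range(mod_freq_hz: float) -> float:
    """Distance at which the measured phase wraps past 2*pi: c / (2 * f)."""
    return SPEED_OF_LIGHT / (2 * mod_freq_hz)

# At an illustrative 100 MHz modulation frequency the phase wraps after
# roughly 1.5 m, which is why indirect-ToF cameras typically mix several
# modulation frequencies to reach longer unambiguous ranges.
```

A half-cycle phase shift (π radians) thus corresponds to half the unambiguous range, and raising the modulation frequency improves precision at the cost of a shorter wrap distance.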
These IR modes allow the field of view to be traded off between detail and coverage for tasks like motion tracking in cluttered spaces.[9]
The microphone array consists of a seven-microphone circular configuration that supports 360-degree audio capture, using beamforming techniques to isolate and enhance voice signals from specific directions while suppressing noise. This enables spatial audio processing for applications such as voice command recognition and acoustic source localization.[9]
The IMU includes an accelerometer and a gyroscope that track device orientation and motion, providing real-time linear acceleration and angular velocity data for stabilizing sensor outputs and compensating for device movement in dynamic scenarios.[9]
Technical Specifications
The Azure Kinect DK features a time-of-flight depth camera with 1-megapixel resolution, supporting frame rates up to 30 FPS across its modes.[21] It operates in narrow field-of-view (NFOV) and wide field-of-view (WFOV) configurations, with operating ranges from 0.25 m to 5.46 m depending on the mode; for instance, the NFOV 2x2 binned mode covers 0.5–5.46 m, while the WFOV 2x2 binned mode spans 0.25–2.88 m.[21] Depth accuracy is specified as a systematic error of less than 11 mm + 0.1% of distance and a random error of ≤17 mm.[21]
The RGB camera provides up to 12 MP resolution at 4096×3072 pixels, with support for 3840×2160 at 30 FPS, and a field of view of 90° horizontal by 59° vertical in the 16:9 aspect ratio.[21] It supports HDR and multiple formats including MJPEG, YUY2, and NV12, enabling high-quality color imaging aligned with the depth data.[21]
The infrared (IR) camera, integrated with the depth sensor, delivers 1 MP resolution at 1024×1024 pixels and up to 30 FPS in passive IR mode, with FOV options matching the depth camera: 75°×65° (narrow) and 120°×120° (wide).[21]
The built-in microphone array has the following audio specifications:
| Component | Specification |
|---|---|
| Microphones | 7-channel circular array, USB Audio Class 2.0 compliant |
| Sensitivity | -22 dBFS at 94 dB SPL, 1 kHz |
| Signal-to-Noise Ratio (SNR) | >65 dB |
| Acoustic Overload Point | 116 dB SPL |
| Sampling Rate | 48 kHz (16-bit) |
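The beamforming mentioned in the sensor description can be illustrated with a minimal delay-and-sum sketch. It assumes an idealized uniform circular array and a far-field source; the array radius used here is an arbitrary illustrative value, not the device's published geometry.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def steering_delays(num_mics: int = 7, radius_m: float = 0.04,
                    source_angle_rad: float = 0.0) -> list[float]:
    """Per-microphone compensation delays that align a far-field plane wave
    arriving from source_angle_rad across an idealized uniform circular array.

    The microphone whose position projects furthest toward the source hears
    the wavefront first, so it receives the largest compensating delay."""
    delays = []
    for i in range(num_mics):
        mic_angle = 2 * math.pi * i / num_mics
        projection = radius_m * math.cos(mic_angle - source_angle_rad)
        delays.append(projection / SPEED_OF_SOUND)
    return delays
```

Summing the channels after applying these delays reinforces sound from the look direction while partially cancelling sound from elsewhere, which is the basis of the voice isolation described above.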
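The depth camera's per-mode operating ranges quoted earlier in this section can be gathered into a small lookup for choosing a mode that covers a required working distance. This is an illustrative sketch: the mode names are modeled on the Sensor SDK's depth-mode enum spellings, and the range figures are taken from the published hardware specification.

```python
# (min_range_m, max_range_m, max_fps) per depth mode, per the published spec
DEPTH_MODES = {
    "NFOV_UNBINNED":  (0.50, 3.86, 30),
    "NFOV_2X2BINNED": (0.50, 5.46, 30),
    "WFOV_UNBINNED":  (0.25, 2.21, 15),
    "WFOV_2X2BINNED": (0.25, 2.88, 30),
}

def modes_covering(near_m: float, far_m: float) -> list[str]:
    """Return the depth modes whose operating range covers [near_m, far_m]."""
    return [name for name, (lo, hi, _fps) in DEPTH_MODES.items()
            if lo <= near_m and far_m <= hi]
```

For example, a scene spanning 0.6–5 m rules out every mode except NFOV 2x2 binned, reflecting the binning and field-of-view trade-offs described above.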