
Open Neural Network Exchange

The Open Neural Network Exchange (ONNX) is an open-source format designed to represent machine learning models in a standardized way, enabling seamless interoperability across diverse frameworks, tools, runtimes, and hardware platforms. Developed initially in 2017 as a collaborative effort between Microsoft and Facebook (now Meta) to address the challenges of model portability in AI development, ONNX defines an extensible computation graph model with built-in operators, standard data types, and a protobuf-based serialization format that supports both deep learning and traditional machine learning workflows. In December 2017, version 1.0 was released with additional support from AWS and other partners, marking its transition to a production-ready standard. Now governed as a graduate project under the LF AI & Data Foundation, ONNX fosters a community-driven ecosystem that includes special interest groups for areas like quantization and runtime optimization, allowing developers to train models in one framework (such as PyTorch or TensorFlow) and deploy them in another without proprietary lock-in. Key benefits include enhanced performance through compatible runtimes like ONNX Runtime, support for over 200 operators in its latest versions, and broad adoption by industry leaders for efficient inference on edge devices and in cloud environments. This standardization reduces development friction, promotes innovation by decoupling model representation from specific tools, and continues to evolve through ongoing contributions, with recent focuses on advanced features like generative model support.

Overview

Definition and Purpose

The Open Neural Network Exchange (ONNX) is an open-source format designed to represent machine learning models in a framework-agnostic manner, capturing both their structure and learned parameter values to ensure portability across diverse ecosystems. Developed as a standardized intermediate representation, ONNX supports a wide range of models, including those from deep learning and traditional machine learning paradigms, by defining an extensible computation graph that encapsulates the model's logic independently of the originating framework. The primary purpose of ONNX is to enable seamless model exchange between training frameworks, such as PyTorch and TensorFlow, and inference engines, allowing developers to transfer models without loss of fidelity. This interoperability addresses a key challenge in AI development by standardizing how models are serialized and shared, facilitating deployment on various hardware accelerators and runtimes while preserving the model's intended behavior. ONNX achieves this through Protocol Buffers (protobuf)-based serialization, which provides a compact and extensible binary format for encoding models. At its core, an ONNX model is represented as a directed acyclic graph (DAG), consisting of nodes that denote operations, along with inputs, outputs, and attributes that define tensor shapes, data types, and other metadata essential for execution. This graph structure ensures that the model's computational flow and parameters are fully described in a vendor-neutral way, supporting efficient parsing and optimization by downstream tools. By decoupling model creation from deployment, ONNX promotes innovation in the AI ecosystem, empowering researchers to experiment with preferred environments while enabling production teams to optimize for specific platforms without tooling constraints. This separation fosters broader adoption of models across industries, as evidenced by its role in streamlining workflows from research to real-world applications.
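
To make the graph-and-protobuf structure concrete, the following is a minimal sketch using the official onnx Python helper API; the graph, tensor, and file names are illustrative, and the model consists of a single MatMul node whose weight matrix is stored as an initializer.

```python
import numpy as np
import onnx
from onnx import helper, TensorProto

# Typed graph inputs and outputs; None leaves the batch dimension symbolic.
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [None, 3])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [None, 2])

# An initializer holds the constant weight tensor (the "learned" parameters).
W = helper.make_tensor(
    "W", TensorProto.FLOAT, [3, 2],
    np.random.rand(3, 2).astype(np.float32).flatten().tolist(),
)

# One node of the DAG: Y = MatMul(X, W).
matmul = helper.make_node("MatMul", inputs=["X", "W"], outputs=["Y"])

graph = helper.make_graph([matmul], "tiny_graph", [X], [Y], initializer=[W])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])

onnx.checker.check_model(model)   # structural validation of the protobuf
onnx.save(model, "tiny.onnx")     # serialize to the .onnx protobuf file
```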

Key Benefits

The Open Neural Network Exchange (ONNX) provides significant interoperability benefits by allowing models trained in one framework, such as PyTorch or TensorFlow, to be seamlessly exported and deployed in another without requiring retraining or reimplementation, thereby reducing duplicated engineering effort and facilitating hybrid workflows across development and production environments. This standardization enables data scientists and engineers to collaborate more effectively, as models can be shared and utilized regardless of the originating toolset. ONNX also offers optimization advantages through its support for shared runtime libraries and hardware accelerators from multiple vendors, which can lead to faster inference times and reduced resource consumption compared to framework-specific implementations. For instance, ONNX Runtime, a key execution engine, applies cross-platform graph optimizations that enhance performance on diverse hardware like GPUs and CPUs. Additionally, its portability ensures models can transition smoothly between cloud-based services, edge devices, and on-device applications, maintaining consistency in behavior and efficiency across deployment scenarios. The adoption of ONNX has fostered ecosystem growth by promoting collaboration among AI developers, tool providers, and hardware manufacturers under an open governance model, resulting in a rich collection of tools, pre-trained models, and extensions that accelerate development. This collaborative ecosystem has contributed to quantifiable impacts, such as reduced costs and time-to-market in cross-framework projects by eliminating the need for redundant model adaptations. Overall, these benefits have made ONNX a cornerstone for scalable AI deployment, enhancing flexibility and efficiency in the broader machine learning landscape.

History

Origins

The Open Neural Network Exchange (ONNX) was founded in 2017 as a joint initiative by Facebook (now Meta) and Microsoft to establish an open standard for representing machine learning models. Announced on September 7, 2017, the project aimed to create a shared format that would facilitate seamless model interchange across different AI frameworks, addressing the challenges posed by the rapid proliferation of diverse tools in the ecosystem. The primary motivations for ONNX stemmed from the growing fragmentation in deep learning frameworks, such as Caffe2, PyTorch, and Microsoft's Cognitive Toolkit (CNTK), which made it difficult for developers to move models between training environments and production runtimes. By providing a unified export format, ONNX sought to streamline AI pipelines, enabling researchers and engineers to select the best tool for each stage without barriers, ultimately accelerating research and deployment in AI applications. ONNX 1.0 was released in December 2017 as the initial production-ready version, concentrating on a core set of operators primarily for vision-based models while laying the groundwork for broader applicability. Early involvement from hardware and software partners bolstered the specification's development; for instance, several hardware vendors announced support for ONNX in October 2017, joining other contributors to promote interoperability across ecosystems.

Milestones and Governance

The Open Neural Network Exchange (ONNX) project marked a significant milestone in November 2019 when it was accepted as a graduate project under the LF AI Foundation (now the LF AI & Data Foundation), transitioning to a vendor-neutral governance model that fosters broader participation. This move built on its initial launch in 2017 and emphasized collaboration among industry leaders. Earlier, the release of ONNX version 1.2 in 2018 introduced essential control-flow operators, such as Loop and If, enabling support for more complex model structures beyond static computations. ONNX employs semantic versioning for its releases, with the Intermediate Representation (IR) version and operator sets (opsets) evolving independently to maintain backward compatibility; for instance, opset 18, released in December 2022 as part of ONNX 1.13, enhanced quantization capabilities through improved support for low-precision data types like INT8. In March 2024, ONNX 1.16.0 was released, introducing enhanced machine learning-specific operators, including support for UINT4 and INT4 data types to facilitate efficient quantization, alongside refinements to function and node prototypes for better overload handling. This version also bolstered compatibility with diverse hardware via associated runtimes. More recently, as of October 2025, ONNX 1.19.0 was released, adding support for advanced features such as new operators for generative models and further quantization improvements. Governance of ONNX has been overseen by the Technical Steering Committee (TSC) within the LF AI & Data Foundation since 2019, comprising elected representatives from contributing organizations to guide technical direction, release planning, and community standards. The TSC draws on contributors from more than 20 companies, including Microsoft, IBM, NVIDIA, and Amazon Web Services (AWS), ensuring diverse input on specifications and extensions. Elections for TSC seats occur annually, with the current term (September 2025–May 2026) featuring experts like Alexandre Eichenberger from IBM and Mayank Kaushik from NVIDIA. ONNX has long included native support for traditional machine learning models via ONNX-ML extensions, allowing representation of non-deep learning algorithms like decision trees alongside neural networks. ONNX Runtime provides extensions for on-device training, including support for federated learning scenarios that enable privacy-preserving model training across distributed edge devices without central data aggregation. These developments underscore ONNX's evolution toward comprehensive interoperability.

Technical Specifications

Model Representation

The Open Neural Network Exchange (ONNX) represents models as a directed acyclic graph (DAG), which captures the computational topology of the model through a sequence of operations without cycles. This graph-based structure allows for a portable and framework-agnostic description of model inference, where each node in the DAG corresponds to a specific operation, and edges represent data flow between them. The entire model is serialized into a compact binary file with the .onnx extension using Protocol Buffers, enabling efficient storage, transmission, and parsing across diverse environments. At the core of an ONNX model is the ModelProto structure, which encapsulates the computation graph along with essential metadata. The graph itself consists of nodes that define operations, input and output tensors that specify shapes and types, initializers for constant tensors (such as weights and biases), and attributes that provide fixed parameters to operations. Metadata elements include the producer name and version (indicating the tool or framework that generated the model), the operator set domain (e.g., 'ai.onnx' for the standard operators), the model version for tracking iterations, and a doc_string for human-readable descriptions. Tensors are multidimensional arrays supporting various element types, such as FLOAT and INT32, though detailed type specifications are covered elsewhere. ONNX supports modular extensions through subgraphs and functions, enhancing reusability and complexity handling. Subgraphs appear within certain control-flow nodes, allowing nested computations for conditional or iterative logic, while functions define reusable compositions of operations that can be invoked like built-in nodes, promoting efficiency in large-scale models. To maintain compatibility, every ONNX model specifies an intermediate representation (IR) version, which dictates the supported features and ensures consistent interpretation across tools and runtimes. Graph execution in ONNX can be conceptually expressed as \text{Output tensors} = f(\text{Input tensors}, \text{Weights}), where f represents the composition of the graph's operations applied in topological order, transforming inputs and constants into outputs.
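
As a hedged illustration of the ModelProto fields described above, the snippet below loads a serialized model and prints its metadata, graph inputs and outputs, initializers, and nodes; the file name refers to the illustrative model built in the earlier sketch.

```python
import onnx

model = onnx.load("tiny.onnx")   # deserialize the protobuf into a ModelProto

print("IR version:   ", model.ir_version)
print("Producer:     ", model.producer_name, model.producer_version)
print("Opset imports:", [(o.domain or "ai.onnx", o.version) for o in model.opset_import])

graph = model.graph
print("Inputs:      ", [i.name for i in graph.input])
print("Outputs:     ", [o.name for o in graph.output])
print("Initializers:", [t.name for t in graph.initializer])
for node in graph.node:  # nodes are stored in topological order
    print(f"  {node.op_type}: {list(node.input)} -> {list(node.output)}")
```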

Operators and Data Types

ONNX defines a comprehensive set of standardized operators that form the building blocks for representing models as computational graphs. These operators are organized into versioned operator sets, known as opsets, which ensure consistent semantics across frameworks and runtimes. Each opset represents a collection of immutable operator specifications within a specific domain, with the default domain being "ai.onnx" for core operators. As of 2025, the latest opset version for the ai.onnx domain is 25, encompassing over 170 operators, including foundational ones introduced in earlier versions and newer additions like DeformConv in opset 19. Operators are categorized based on their functionality to support diverse machine learning tasks. Core machine learning operators handle fundamental computations, such as MatMul for matrix multiplication, Conv for convolution, and Relu for rectified linear unit activation. Vision-oriented operators include Resize for image scaling and MaxPool for pooling operations. Sequence processing operators support recurrent structures, exemplified by GRU for gated recurrent units and LSTM for long short-term memory cells. These categories enable the expression of complex models, from convolutional neural networks to recurrent architectures, while maintaining a unified vocabulary. Data types in ONNX are primarily tensor-based, allowing models to specify the shapes and element types of inputs, outputs, and intermediate values within the graph. Supported tensor element types include float (32-bit), float16, int8, int16, int32, int64, uint8, uint16, uint32, uint64, bool, and string. Recent versions have added support for bfloat16 to reduce memory use in deep learning workloads, complex types for advanced applications, and 8-bit floating-point formats such as FLOAT8E4M3FN and FLOAT8E5M2 for quantization. Tensors can also incorporate sparse representations and optional types that may hold null values, enhancing flexibility for optional inputs or dynamic graphs. To accommodate specialized models beyond core deep learning, ONNX includes domain-specific operators, such as those in the "ai.onnx.ml" domain for traditional algorithms from libraries like scikit-learn. Examples include LinearRegressor for linear regression and TreeEnsemble for tree-based ensembles. Backward compatibility is maintained through opset versioning: models specify the required opset version in their metadata, allowing runtimes to select appropriate implementations without breaking existing models, as non-breaking changes (e.g., documentation updates) are handled within the same version, while breaking changes increment the version number. A representative example is the Add operator, which performs element-wise addition of two input tensors, supporting broadcasting to align shapes. For inputs A and B (tensors of compatible shapes), the output C is computed as C_{i,j} = A_{i,j} + B_{i,j}, where broadcasting rules apply if dimensions differ (e.g., a scalar added to a tensor expands the scalar across all elements). This operator has evolved across opset versions, with later revisions extending the set of supported numeric types, ensuring robust arithmetic operations in models.
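
A small sketch of the Add example with broadcasting follows, assuming the pure-Python ReferenceEvaluator available in recent onnx releases (an ONNX Runtime session would work equally well); the graph and tensor names are illustrative.

```python
import numpy as np
import onnx
from onnx import helper, TensorProto
from onnx.reference import ReferenceEvaluator  # reference backend in recent onnx releases

# A single Add node; B is a scalar, so it is broadcast across every element of A.
A = helper.make_tensor_value_info("A", TensorProto.FLOAT, [2, 3])
B = helper.make_tensor_value_info("B", TensorProto.FLOAT, [])
C = helper.make_tensor_value_info("C", TensorProto.FLOAT, [2, 3])
node = helper.make_node("Add", ["A", "B"], ["C"])
model = helper.make_model(
    helper.make_graph([node], "add_broadcast", [A, B], [C]),
    opset_imports=[helper.make_opsetid("", 18)],
)
onnx.checker.check_model(model)

sess = ReferenceEvaluator(model)
a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.array(10.0, dtype=np.float32)
(c,) = sess.run(None, {"A": a, "B": b})
print(c)  # every element of A incremented by 10 via broadcasting
```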

Interoperability

Framework Support

ONNX provides native export capabilities in several major machine learning frameworks, enabling seamless conversion of trained models to the ONNX format for interoperability. PyTorch includes built-in support through the torch.onnx.export function, available since its early releases, allowing users to export computational graphs directly from torch.nn.Module instances. TensorFlow relies on the tf2onnx tool for exporting models, including those built with Keras or TensorFlow Lite, supporting opsets up to 18 for compatibility with various inference engines. Similarly, scikit-learn models can be exported using the skl2onnx library, which converts pipelines and estimators into ONNX graphs while preserving scikit-learn's feature engineering components, as shown in the sketch below. Import support for ONNX models is also widespread, facilitating the loading and execution of ONNX files within diverse frameworks. Keras integration is handled through tf2onnx, as the older keras-onnx converter is deprecated; loading for execution typically occurs through ONNX Runtime, which handles exported models efficiently. Apache MXNet, retired as a project in 2023, previously provided native APIs in its contrib.onnx module for converting ONNX models to MXNet symbols and parameters. PaddlePaddle supports ONNX through its high-performance inference plugins, allowing deployment of ONNX models alongside native Paddle formats in production pipelines. As of 2025, over 15 frameworks offer official ONNX export functionality, reflecting the format's broad adoption for cross-ecosystem workflows; notable examples include Hugging Face Transformers via the Optimum library, which streamlines export of transformer models for optimized inference. This extensive compatibility supports bidirectional operations, such as training in one framework and deploying or fine-tuning in another through ONNX round-trip conversion, minimizing fidelity loss and enabling hybrid development pipelines. Conversion tools, detailed separately, underpin these integrations by handling graph translations between frameworks.
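
For instance, a minimal sketch of scikit-learn export with skl2onnx might look like the following; the estimator and file name are illustrative, and the symbolic first dimension lets the exported model accept any batch size.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Train a simple scikit-learn estimator on the Iris dataset.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=200).fit(X, y)

# Convert to ONNX, declaring a float input with a symbolic batch dimension.
onnx_model = convert_sklearn(
    clf,
    initial_types=[("float_input", FloatTensorType([None, 4]))],
)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```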

Conversion Mechanisms

Conversion to the Open Neural Network Exchange (ONNX) format involves framework-specific exporters that capture a model's computation graph and serialize it into a standardized Protocol Buffers (protobuf) representation. This process ensures the model can be imported into various runtimes while maintaining compatibility with specified operator sets (opsets). For instance, in PyTorch, the torch.onnx.export function converts a torch.nn.Module to ONNX by providing example inputs and specifying parameters like input and output names and the target opset version. Exporters handle variable input shapes through mechanisms such as dynamic axes in PyTorch, where a dictionary maps input names to dynamic dimensions (e.g., dynamic_axes={"input": {0: "batch_size"}}), allowing flexibility for batch sizes or sequence lengths without fixed dimensions. Similar exporters exist for other frameworks, such as TensorFlow via tensorflow-onnx (tf2onnx), which rewrites model components using ONNX operators during conversion. Once exported, the model is serialized to a .onnx file in protobuf format, with opset compatibility verified to match the target runtime's supported versions. Post-conversion, importers load the ONNX protobuf into memory for further processing or execution, often accompanied by validation to ensure integrity. The ONNX checker tool, via onnx.checker.check_model, verifies the model's legality, checking for issues such as malformed graph structure or inconsistent opset imports, and can optionally perform full checks including shape inference. This validation is crucial after import to detect inconsistencies arising from framework differences, such as type mismatches (e.g., float32 vs. float64). Discrepancies in dynamic dimensions are addressed using shape inference, implemented through onnx.shape_inference.infer_shapes(), which propagates known shapes across the graph and adds inferred dimensions to the model's value_info field. This resolves partial or symbolic shapes post-export, ensuring the model is executable without runtime errors from undefined tensors. For custom operators not in the ONNX domain, extensions map them via custom operator domains, where a custom domain (e.g., "com.example") is defined in the model's opset_import to isolate proprietary ops from core ones. In PyTorch, custom ops encountered during export can be handled by providing a custom_translation_table to decompose them into supported ONNX operators. Recent releases, such as ONNX v1.17.0 (October 2024), enhance this with support for bfloat16 data types in additional operators and other improvements for handling complex graphs. The overall workflow follows: start with the source model in a supported framework, export via the appropriate converter to generate the ONNX graph, serialize to protobuf while specifying the opset, and validate using the checker with shape inference to confirm fidelity, as sketched below. This structured approach minimizes fidelity loss, though challenges like unsupported custom ops may require manual decomposition or domain extensions.
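
The workflow can be sketched end to end as follows, assuming the TorchScript-based torch.onnx.export path; the module, file name, and opset are illustrative.

```python
import torch
import onnx
from onnx import shape_inference

class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet().eval()
example = torch.randn(1, 16)

# Export with a symbolic batch dimension declared via dynamic_axes.
torch.onnx.export(
    model,
    example,
    "tinynet.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
    opset_version=17,
)

# Validate the exported protobuf, then propagate shapes into value_info.
exported = onnx.load("tinynet.onnx")
onnx.checker.check_model(exported, full_check=True)
inferred = shape_inference.infer_shapes(exported)
print([vi.name for vi in inferred.graph.value_info])
```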

Optimization and Deployment

Runtimes and Execution

ONNX Runtime (ORT), developed by Microsoft, serves as the primary cross-platform engine for executing ONNX models, supporting a wide range of hardware including CPUs, GPUs, and mobile devices on Android and iOS. ORT enables portable deployment by abstracting hardware-specific optimizations through its Execution Provider (EP) interface, allowing seamless integration with backends like DirectML for Windows, CUDA for NVIDIA GPUs, and CoreML for Apple devices. This architecture ensures that ONNX models can run efficiently across diverse environments without requiring framework-specific modifications. ORT applies compilation-style transformations when an inference session is created: the ONNX computation graph is parsed, optimized, and partitioned into platform-specific code paths for efficient execution. Graph optimizations, including constant folding, node fusion and elimination, and layout transformations, are applied to reduce overhead and improve throughput; for instance, operator fusion combines multiple nodes (e.g., Conv + ReLU) into a single kernel to minimize memory accesses and kernel launch costs. Inference latency can be modeled as \text{Latency} = \sum (\text{node execution time}), where optimizations like graph partitioning and fusion reduce the number of nodes and inter-node transfers, leading to measurable gains. In version 1.17, released in 2024, ORT introduced WebGPU support, enabling accelerated browser-based inference for web applications while maintaining compatibility with WebAssembly (Wasm), SIMD, and multi-threading extensions for better performance. On ARM-based devices, such as mobile processors, operator fusion contributes to significant speedups, with benchmarks showing up to 2x improvement in inference time for fused convolutional layers compared to unfused execution. Beyond ORT, other runtimes provide ONNX compatibility for specialized use cases. Apache TVM, an open-source deep learning compiler, supports direct import and compilation of ONNX models into optimized code for various targets, emphasizing auto-tuning for high-performance kernels. NVIDIA's TensorRT, while specific to NVIDIA GPUs, offers robust ONNX parsing and engine building, partitioning the graph to leverage layer fusion and precision calibration for up to 5x faster inference on compatible hardware. These runtimes collectively enhance ONNX's portability by allowing model execution on diverse software stacks.
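
A minimal sketch of session creation with ONNX Runtime follows, assuming the Python package and the illustrative model exported earlier; the provider list requests CUDA when available and falls back to the CPU provider otherwise.

```python
import numpy as np
import onnxruntime as ort

# Enable the full set of graph optimizations before the session loads the model.
opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

session = ort.InferenceSession(
    "tinynet.onnx",
    sess_options=opts,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

x = np.random.randn(8, 16).astype(np.float32)  # batch of 8, thanks to the dynamic axis
outputs = session.run(None, {"input": x})
print(outputs[0].shape, session.get_providers())
```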

Hardware Acceleration

ONNX leverages hardware acceleration primarily through ONNX Runtime (ORT), which integrates with vendor-specific execution providers (EPs) to optimize model inference on diverse accelerators without requiring modifications to the ONNX model itself. These EPs map ONNX operators to hardware-optimized kernels, enabling efficient execution on GPUs, NPUs, and other specialized hardware. Key backends include the CUDA EP for NVIDIA GPUs, which accelerates computations using CUDA libraries and supports features like CUDA graphs for reduced latency in repeated inference. The DirectML EP targets Windows-based GPUs from various vendors, utilizing Microsoft's DirectML for cross-vendor compatibility and hardware-agnostic acceleration. For Apple ecosystems, the CoreML EP delegates sub-graphs to CoreML's runtime, exploiting the CPU, GPU, and Neural Engine for low-power, high-performance inference on iOS and macOS devices. The OpenVINO EP optimizes for Intel CPUs, integrated GPUs, and VPUs, applying techniques like low-precision inference to enhance throughput on edge and server hardware. Optimization techniques in ONNX for hardware acceleration emphasize quantization and operator fusion, as illustrated in the sketch below. Quantization converts floating-point models to lower-precision formats, such as INT8, using operators like QuantizeLinear, which maps high-precision tensors to quantized representations via scale and zero-point parameters, reducing memory footprint and boosting speed on supported hardware. Operator fusion merges compatible ONNX operators into single hardware-specific kernels, minimizing data transfers and overhead; for instance, fusing convolution and activation operations in GPU or NPU backends can yield significant performance gains by leveraging vendor-optimized implementations. In 2025, ORT introduced a dedicated AMD GPU EP, enabling acceleration of generative AI workloads on AMD GPUs through ROCm libraries, with seamless integration into the existing EP framework. Benchmarks on edge devices, including those using mobile GPUs and NPUs via compatible EPs like NNAPI, demonstrate speedups of 2-3x in inference compared to unoptimized CPU execution, establishing key context for deployment efficiency. Vendor contributions via EPs in ORT facilitate plug-and-play hardware integration, allowing developers to switch accelerators, such as from one GPU vendor to another, while maintaining model portability and performance across ecosystems. This modular design, as detailed in the runtimes section, underscores ONNX's role in hardware-agnostic deployment.
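
As a hedged example of post-training quantization with ONNX Runtime's Python tooling, the following applies dynamic INT8 quantization to an existing model; the file names are illustrative, and static quantization with a calibration dataset would follow a similar pattern.

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Weights are converted to INT8 offline; activations are quantized on the fly
# at inference time, so no calibration dataset is required.
quantize_dynamic(
    model_input="tinynet.onnx",
    model_output="tinynet_int8.onnx",
    weight_type=QuantType.QInt8,
)
```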

Adoption and Ecosystem

Community Contributions

The ONNX project has attracted contributions from over 500 individuals on its primary repository, reflecting broad community involvement in enhancing model interoperability and ecosystem tools. Leading contributions have come from major organizations such as Microsoft, Meta, and IBM, which have driven core development through code submissions, specification updates, and integration efforts since the project's inception. To foster collaboration, ONNX maintains community working groups focused on key domains, such as Generative AI for advancing support for generative models, Preprocessing for pre/post-processing and featurization, Multi-device for multi-device execution, and Safety-Related Profile for safety-critical considerations. These groups facilitate cross-organizational discussions on operator extensions and domain-specific optimizations, ensuring ONNX evolves with diverse workloads. Additionally, annual ONNX Community Meetups have been held since 2020, providing forums for technical presentations, networking, and roadmap planning, with the 2025 event emphasizing advances in generative AI. Growth metrics underscore the project's momentum: the main repository has surpassed 25,000 stars on GitHub by 2025, indicating widespread adoption among developers and researchers. As a graduate project under the LF AI & Data Foundation, ONNX's ecosystem has expanded to include over 30 member organizations, ranging from tech giants to AI startups, supporting governance and resource allocation.

Real-World Applications

ONNX has found significant application in the automotive sector, particularly in autonomous vehicles. NVIDIA's DRIVE platform supports ONNX models for perception tasks through integration with TensorRT, enabling efficient deployment of models on in-vehicle hardware for real-time processing. In cloud-based machine learning services, Azure Machine Learning facilitates the export of models from various frameworks to ONNX format, supporting scalable inference across cloud and edge environments. This allows developers to train models in tools like PyTorch or TensorFlow and deploy them seamlessly using ONNX Runtime for optimized performance. Within research communities, ONNX integration in Hugging Face's Transformers library enables the export and deployment of large language models and other transformer architectures. Researchers can convert pretrained models to ONNX for faster inference with ONNX Runtime, reducing latency in natural language processing tasks without altering the underlying model, as in the sketch below. For AI in IoT devices, ONNX Runtime provides deployment capabilities on platforms like the Raspberry Pi, supporting applications such as real-time image classification from the device's camera. This enables lightweight, privacy-focused inference on resource-constrained hardware, suitable for smart home and industrial monitoring systems. Meta employs ONNX in its large-scale recommendation systems through compatibility with Deep Learning Recommendation Models (DLRM), allowing model portability and optimization for production-scale serving. Emerging applications of ONNX include privacy-preserving machine learning scenarios, where tools like HE-MAN enable secure inference on ONNX models using homomorphic encryption across distributed devices. This approach maintains data confidentiality while enabling collaborative machine learning in sensitive domains like healthcare.
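
As a sketch of the Hugging Face Optimum path, the snippet below exports a checkpoint to ONNX on the fly and runs it with ONNX Runtime; the checkpoint name is illustrative and the optimum onnxruntime extra is assumed to be installed.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# export=True converts the PyTorch checkpoint to ONNX and loads it in ONNX Runtime.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

inputs = tokenizer("ONNX makes deployment portable.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1))
```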
