
High-level synthesis

High-level synthesis (HLS) is an electronic design automation (EDA) process that automates the transformation of untimed behavioral descriptions of digital hardware—typically specified in high-level programming languages such as C, C++, SystemC, or MATLAB—into optimized register-transfer level (RTL) implementations in hardware description languages like VHDL, Verilog, or SystemVerilog. This approach raises the abstraction level in digital circuit design, enabling engineers to focus on algorithmic functionality rather than low-level gate or register details, thereby accelerating the development of complex systems such as embedded processors, video decoders, and encryption engines. The core steps of HLS involve compiling the high-level input to generate a data flow graph (DFG) representing operations and dependencies, followed by scheduling (assigning operations to clock cycles), allocation (mapping operations to resources like adders or multipliers), and binding (connecting resources while minimizing wiring). Optimizations during this process exploit parallelism through techniques such as loop unrolling, pipelining, and array partitioning, while inferring interfaces, memories, and registers to balance latency, area, power consumption, and throughput. These capabilities make HLS particularly valuable for field-programmable gate array (FPGA) prototyping and application-specific integrated circuit (ASIC) design in domains requiring rapid iteration, such as signal processing and machine learning accelerators. Introduced in the 1980s but gaining widespread adoption in the 2000s with advancements in tool maturity, HLS addresses the escalating complexity of hardware designs by reducing manual RTL coding efforts, shortening verification cycles through high-level testbenches, and facilitating design space exploration across multiple architectures. Leading commercial tools from vendors like Siemens (Catapult), AMD (Vitis HLS), and Synopsys automate these flows while supporting industry standards from Accellera for synthesis subsets of C/C++/SystemC, ensuring compatibility with downstream physical design tools.
Integrated into electronic system-level (ESL) methodologies, HLS improves productivity, quality of results, and time-to-market for hardware engineers tackling ever-larger systems-on-chip (SoCs).

Overview and Fundamentals

Definition and Purpose

High-level synthesis (HLS) is an automated process within electronic design automation (EDA) that translates high-level behavioral descriptions, typically written in programming languages such as C, C++, or SystemC, into register-transfer level (RTL) hardware descriptions in formats like VHDL, Verilog, or SystemVerilog. This transformation enables the generation of synthesizable hardware implementations from abstract algorithmic specifications, handling aspects such as micro-architecture, timing, and resource allocation automatically. The primary purpose of HLS is to bridge the abstraction gap between software-oriented algorithmic modeling and hardware realization, allowing designers to focus on functionality rather than low-level details like register structures or signal timings. By automating much of the RTL development, HLS reduces design complexity, enabling software engineers and domain experts without deep hardware knowledge to create custom circuits. Key benefits include enhanced designer productivity through higher abstraction levels, faster time-to-market by shortening the design cycle, and improved support for complex systems such as application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs). In a basic workflow, HLS begins with an input specification of the desired behavior and progresses through automated optimization and mapping to produce hardware-ready RTL code that can be further processed by downstream EDA tools for physical implementation. This approach emerged in the 1980s as a response to the growing complexity of very-large-scale integration (VLSI) designs, aiming to streamline the transition from conceptual algorithms to efficient hardware.
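To make the workflow concrete, here is a minimal sketch of an HLS input specification: an untimed C++ function annotated with a Vitis HLS-style pipeline pragma. The function itself is illustrative, not drawn from any particular design:

```cpp
#include <cstdint>

// Untimed behavioral description of a 4-tap moving-average filter.
// An HLS tool would schedule these operations onto clock cycles,
// allocate an adder, and emit RTL; the pragma (Vitis HLS syntax)
// requests a pipelined loop starting one new iteration per cycle.
int32_t moving_average4(const int32_t window[4]) {
    int32_t acc = 0;
    for (int i = 0; i < 4; ++i) {
#pragma HLS PIPELINE II=1
        acc += window[i];
    }
    return acc / 4;
}
```

Because a conventional compiler simply ignores the unknown pragma, the same source doubles as a software model for functional verification before synthesis.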

Comparison to Low-Level Synthesis

Low-level synthesis, often referred to as register-transfer level (RTL) synthesis, involves manually describing hardware behavior using hardware description languages (HDLs) such as Verilog or VHDL, which are then transformed into gate-level netlists for implementation on FPGAs or ASICs. This approach requires designers to explicitly manage hardware-specific details like clocking, pipelining, and resource allocation, making it highly hardware-oriented and labor-intensive. In contrast, high-level synthesis (HLS) operates at a higher abstraction level, starting from algorithmic descriptions in languages like C, C++, or SystemC, and automatically generates RTL code, thereby bridging software-like programming with hardware implementation. HLS offers significant advantages in productivity for complex designs, as it allows developers to focus on algorithmic behavior rather than implementation minutiae, reducing design entry time and enabling faster iterations through C-level simulations that execute orders of magnitude quicker than RTL simulations. Additionally, HLS automatically infers parallelism from sequential code, such as loop or function pipelining, which can accelerate compute-intensive applications without manual intervention. However, these benefits come with disadvantages, including a potential loss of fine-grained control over implementation details, such as precise bit widths or custom state machines, which may lead to suboptimal resource utilization or timing if not addressed through directives. The trade-offs between HLS and low-level synthesis are evident in design cycles: HLS facilitates rapid prototyping and exploration of multiple implementations from a single source description, shortening development from months to weeks, but often requires iterative refinements to match RTL's performance. Conversely, RTL methods provide exact control for meeting stringent timing and area constraints at the expense of longer, error-prone manual coding efforts.
For instance, in implementing a Sobel filter for image processing, HLS reduced programming effort to 90 lines of code and 121 hours compared to RTL's 493 lines and 384 hours, though RTL achieved lower resource usage (0.8% LUTs vs. 4.4%) and faster execution (5.9 ms vs. 39.3 ms on a test image). Similarly, for algorithms like FIR filters, HLS enables quick algorithmic validation, while manual RTL is preferred for custom datapaths demanding optimized throughput. Over time, HLS has evolved to complement rather than replace low-level synthesis in production flows, where HLS-generated modules are imported as IP blocks into larger RTL hierarchies for final optimization and verification. This hybrid approach leverages HLS for high-level algorithm acceleration and manual RTL for performance-critical refinements, enhancing overall efficiency in FPGA and ASIC development.

Historical Development

Early Innovations

The origins of high-level synthesis (HLS) trace back to the 1970s, when researchers began exploring the translation of algorithmic descriptions into hardware structures, drawing inspiration from software compilation techniques. At Carnegie Mellon University, pioneers such as Mario Barbacci, Daniel Siewiorek, and Donald Thomas developed hardware description languages like ISPS (Instruction Set Processor Specification), which enabled the simulation and synthesis of processor architectures from behavioral specifications. Barbacci's 1974 work specifically proposed compiling ISPS into gate-level implementations, marking an early attempt at "algorithms to gates" synthesis and establishing foundational concepts for hardware synthesis influenced by compiler theory. The 1980s saw significant advancements through academic and government-funded initiatives, focusing on practical synthesis tools for specific domains like digital signal processing (DSP). A key milestone was the Cathedral project, launched in 1984 at IMEC in Belgium under Hugo De Man, which developed a silicon compiler for synchronous multiprocessor DSP architectures using the Silage language to automate datapath and controller generation from signal flow graphs. This effort addressed real-time signal processing challenges and influenced subsequent commercial tools. In the U.S., DARPA-funded projects supported behavioral synthesis research, emphasizing design specification, verification, and intermediate representations like control-data flow graphs (CDFGs) at institutions such as UC Berkeley and Stanford. Researchers like Giovanni De Micheli contributed pivotal methods, including the introduction of HardwareC in 1988 for hardware-software co-design and synthesis under timing constraints, while Pierre Paulin and John Knight advanced scheduling algorithms such as force-directed scheduling. Early HLS tools faced substantial challenges, remaining largely confined to simple designs due to immature optimization algorithms for allocation, scheduling, and timing analysis.
Issues such as poor quality of results compared to manual register-transfer level (RTL) design, domain-specific limitations (e.g., DSP focus), and the need for obscure input languages hindered broader adoption. By the early 1990s, however, a shift occurred toward commercial viability, exemplified by Synopsys' announcement of the Behavioral Compiler in 1994, which integrated behavioral synthesis with existing RTL flows to enable more efficient IC design specification and reduce development time. This transition bridged academic prototypes to industry tools, laying the groundwork for HLS's evolution beyond experimental stages.

Modern Advancements and Adoption

During the 2000s, HLS tools matured significantly, leading to broader commercial adoption as mentioned in the overview. Tools like Forte's Cynthesizer gained traction among Japanese companies around 2000, leveraging the maturing SystemC community for system-level design. Mentor Graphics introduced Catapult C in 2004, a C++-based synthesis tool that supported complex, multi-block subsystems and accelerated adoption in ASIC design by improving productivity for control logic and low-power applications. This decade saw HLS transition from niche uses to more general-purpose applications, with vendors enhancing their offerings for better quality of results and integration with EDA flows, setting the stage for widespread industry use. In the 2010s, high-level synthesis (HLS) saw significant milestones that broadened its accessibility and application. Xilinx's Vivado HLS, introduced in 2011, played a pivotal role in popularizing C-to-HDL synthesis by enabling designers to generate register-transfer level (RTL) code directly from C/C++ specifications, thus streamlining FPGA and ASIC development workflows. Concurrently, open-source initiatives like Bambu, developed at Politecnico di Milano and first released in 2012, provided a flexible research framework for HLS, supporting C constructs and integrating with GCC for parsing, which fostered experimentation in academic and custom tool development. Key advancements in the 2010s and 2020s integrated machine learning (ML) and artificial intelligence (AI) techniques to enhance HLS optimizations, particularly in scheduling and resource allocation. For instance, reinforcement learning (RL) methods emerged for automated scheduling, where RL agents learn optimal operation orders by exploring the design space, outperforming traditional approaches in complex graphs. Graph neural networks combined with RL have further improved dependency-aware scheduling, achieving better latency and throughput in HLS-generated designs.
These AI-driven tools, starting from prototypes in the mid-2010s, have enabled predictive modeling of resource usage and bit-width optimization, reducing manual tuning efforts. HLS tools have increasingly supported heterogeneous computing environments, facilitating seamless integration across CPU, GPU, and FPGA platforms. OpenCL-based HLS frameworks allow multi-kernel pipelines on CPU-FPGA systems, optimizing data transfer and task partitioning for improved overall system performance. This capability extends to FPGA-accelerated systems with GPU offloading, where HLS generates hardware IPs tailored for specific accelerators while maintaining software-like programmability. Such advancements address the challenges of unified programming models in diverse hardware ecosystems, enabling efficient deployment in data-intensive applications. Adoption of HLS surged in the 2020s, particularly in automotive advanced driver-assistance systems (ADAS) and hardware for AI accelerators, driven by the need for customized, power-efficient designs. In automotive contexts, HLS has been used to migrate inference functions to hardware accelerators, optimizing compute efficiency for real-time workloads in ADAS. For AI hardware, HLS facilitates the creation of bespoke accelerators, delivering higher performance per watt compared to general-purpose processors. The slowdown in Moore's law, with transistor scaling decelerating since the mid-2010s, has further propelled HLS adoption by emphasizing higher abstraction levels to maintain design productivity amid rising design complexity and costs. As of 2025, recent innovations in HLS include quantum-inspired approaches for synthesis in error-prone environments and enhanced verification flows. Quantum-inspired algorithms, adapted via HLS frameworks like QHLS, enable the generation of resilient circuits for noisy intermediate-scale quantum (NISQ) devices by optimizing gate decompositions and error mitigation. Enhanced verification integrates formal methods, such as equivalence checking for source-to-source transformations, into HLS pipelines, ensuring functional correctness from high-level C code to RTL with reduced simulation overhead.
Studies from 2015 to 2025 demonstrate substantial productivity gains with HLS in FPGA designs, often achieving up to 10x faster development cycles compared to traditional hand-coding, primarily through automated design space exploration and iterative optimization. These gains are evidenced in benchmarks across image processing and machine learning workloads, where HLS reduced design time from months to weeks while maintaining competitive quality-of-results.

Input Specifications

Supported Languages and Formats

High-level synthesis (HLS) tools predominantly accept inputs in C and C++ as primary programming languages, often extended with tool-specific pragmas to guide hardware mapping. Some tools, such as AMD Vitis HLS, also support OpenCL C for defining kernels. These extensions include directives such as #pragma HLS pipeline for loop optimization and #pragma HLS array_partition for memory partitioning, enabling designers to influence scheduling and resource allocation without altering core algorithm logic. For instance, in tools like Vitis HLS, C/C++ specifications must adhere to synthesizable subsets that prohibit unbounded recursion, dynamic memory allocation within loops, and certain pointer manipulations to ensure predictable hardware generation. SystemC serves as another key input language, particularly for system-level modeling in HLS, allowing description of hardware-software interfaces and transaction-level behaviors. Tools such as Catapult and Stratus HLS support synthesizable SystemC subsets defined by standards like the Accellera SystemC Synthesis Subset, which restricts features to finite-state, single-threaded constructs for reliable output. This makes SystemC suitable for modeling complex modules like bus interfaces or co-processors, where C/C++ might require additional abstractions. Beyond general-purpose languages, domain-specific formats like MATLAB and Simulink are supported for digital signal processing (DSP) applications through integrated workflows. MATLAB Coder from MathWorks generates HLS-compatible C/C++ code from MATLAB algorithms, supporting fixed- and floating-point operations while handling block-based modeling for filters and transforms. Python-based domain-specific languages (DSLs), such as MyHDL, are less commonly used for direct HLS synthesis, as they primarily target HDL generation rather than algorithmic transformation to hardware. Over time, HLS inputs have evolved from pure C specifications toward domain-specific languages like Halide, introduced in the 2010s for image and array processing.
Halide separates algorithm description from optimization schedules, allowing HLS tools to compile its embedded DSL into synthesizable C/C++ for FPGA accelerators, improving portability across hardware targets. A notable limitation in HLS inputs involves non-deterministic behaviors, such as floating-point arithmetic, which can lead to inconsistent hardware results due to varying precision and rounding modes across tools. To mitigate this, inputs often require conversion to fixed-point representations, ensuring bit-accurate behavior while trading off dynamic range for predictability.
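As a concrete illustration of that fixed-point conversion, the sketch below replaces a floating-point scaling operation with a bit-accurate Q1.15 equivalent; the helper names and the choice of format are assumptions for illustration:

```cpp
#include <cstdint>

// Quantize a value in [-1, 1) to Q1.15 fixed point (illustrative helper).
constexpr int16_t to_q15(double x) {
    return static_cast<int16_t>(x * 32768.0);
}

// Fixed-point multiply: 16x16 -> 32-bit product, shifted back to Q1.15.
// Deterministic across tools, unlike floating-point rounding behavior.
int16_t scale_q15(int16_t sample, int16_t coeff_q15) {
    int32_t product = static_cast<int32_t>(sample) * coeff_q15;
    return static_cast<int16_t>(product >> 15);
}
```

The bit pattern of the result is fully determined by integer arithmetic, so software simulation and generated hardware match exactly.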

Behavioral Descriptions and Models

Behavioral descriptions in high-level synthesis (HLS) serve as the input specifications that capture the intended functionality of a hardware design at a high level of abstraction, typically expressed through algorithmic or transaction-level models (TLM). Algorithmic descriptions focus on sequential code incorporating loops, conditionals, and computational operations to define the core behavior, while TLM emphasizes abstract communication protocols between modules without detailing cycle-by-cycle interactions. These models enable designers to specify complex algorithms without immediate concern for hardware implementation details, facilitating rapid prototyping and verification. Key representational elements include control-flow graphs (CFG) and data-flow graphs (DFG), which model the behavioral structure for synthesis tools. A CFG represents the program's execution paths as nodes for basic blocks (sequences without branches) connected by edges for control dependencies, such as conditionals or loops. In contrast, a DFG captures data dependencies, with nodes denoting operations (e.g., additions or multiplications) and directed edges indicating data flow between them, often combined into a control-data flow graph (CDFG) for comprehensive analysis. Hierarchical behaviors extend these graphs to modular designs, allowing nested structures where sub-modules are represented as higher-level nodes within a parent graph, supporting scalable representation of complex systems. Abstraction levels in behavioral models range from untimed functional specifications, which describe pure functionality without timing constraints, to partially timed models that incorporate latency or throughput requirements for certain operations. Untimed specs assume instantaneous execution of functions, consuming all inputs simultaneously and producing outputs without delays, ideal for initial algorithmic exploration.
Partially timed models add constraints like minimum execution times for loops or functions, guiding the synthesis tool toward specific performance targets while retaining behavioral focus. A representative example is a matrix multiplication implemented in C, where nested loops perform element-wise multiplications and accumulations; this sequential code implies potential parallelism in hardware, such as unrolling loops to create multiple multipliers operating concurrently. HLS tools analyze such descriptions to infer parallelization opportunities from the DFG, transforming the sequential specification into concurrent operations. Behavioral inputs often include non-synthesizable elements, such as I/O simulations or dynamic memory allocations, which are automatically pruned or replaced during the synthesis process to ensure compatibility. For successful synthesis, behavioral models must eventually support cycle-accurate semantics after refinement, where the untimed or partially timed description is transformed into a timed schedule assigning operations to specific clock cycles. This prerequisite ensures the generated hardware meets precise timing and resource requirements, bridging the gap between high-level intent and low-level output.
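The matrix-multiplication example above can be written out as follows; the unroll pragma uses Vitis HLS syntax, and the fixed 4x4 size is an illustrative assumption:

```cpp
constexpr int N = 4;

// Untimed C++ matrix-multiply kernel: the inner loop carries a reduction.
// With the unroll directive, an HLS tool can instantiate N multipliers
// feeding an adder tree instead of one multiplier reused N times.
void matmul(const int a[N][N], const int b[N][N], int c[N][N]) {
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            int acc = 0;
            for (int k = 0; k < N; ++k) {
#pragma HLS UNROLL
                acc += a[i][k] * b[k][j];  // one DFG multiply-add per k
            }
            c[i][j] = acc;
        }
    }
}
```

The same source runs unchanged as software, which is what allows high-level testbenches to validate the algorithm before synthesis.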

Synthesis Process

Core Stages of Transformation

High-level synthesis (HLS) employs a structured pipeline to transform behavioral descriptions into hardware implementations, typically progressing through sequential stages that convert abstract code into register-transfer level (RTL) representations. This process begins with input parsing and culminates in initial RTL generation, enabling automation of design space exploration while preserving functional equivalence. The pipeline's modularity allows for iterative refinements, with verification mechanisms integrated to detect inconsistencies early. The first stage, parsing and elaboration, involves analyzing the input source code—such as C/C++ or SystemC—and converting it into a structured intermediate representation (IR). Lexical analysis breaks down the code into tokens and parsing builds an abstract syntax tree (AST), while elaboration resolves ambiguities, incorporates libraries, and generates a control data flow graph (CDFG) that captures data dependencies, control flow, and sequential operations. Traditional IRs like custom ASTs or LLVM IR have been standard, but modern approaches leverage multi-level IRs such as MLIR to support hierarchical optimizations and scalability in complex designs, with frameworks like ScaleHLS emerging around 2021. Following elaboration, high-level transformations prepare the IR for hardware mapping by applying behavioral optimizations. Techniques such as loop unrolling, which replicates loop bodies to enable parallelism, and function inlining, which eliminates procedure call overheads, reduce abstraction levels and expose opportunities for resource sharing without altering semantics. These transformations operate on the CDFG to simplify control structures and balance computation, often guided by user directives for targeted improvements. The next stage is scheduling, which assigns operations from the CDFG to specific clock cycles while respecting dependencies and resource constraints (detailed in the following subsection).
This step determines the timing of operations, balancing latency, throughput, and resource utilization, and produces a scheduled CDFG for subsequent stages. The allocation and binding stage assigns computational operations and storage elements to physical resources based on the scheduled CDFG. Allocation determines the number and type of functional units (e.g., mapping multiple additions to shared adders) and storage elements based on timing and area constraints, while binding connects these operations to specific units and variables to registers or memories, minimizing interconnections. This phase relies on the transformed and scheduled IR to ensure efficient hardware mapping. Initial RTL generation translates the bound and allocated model into synthesizable hardware descriptions, such as Verilog or VHDL. This involves generating datapaths from bound functional units, control logic from the CDFG's sequencing, and interconnections, yielding a cycle-accurate RTL description that can be further refined by downstream tools. Throughout the pipeline, interdependencies foster feedback loops for quality control, where simulation or formal checks at each stage—such as post-elaboration functional validation or post-binding timing analysis—allow corrections before proceeding. For small designs, like kernels with thousands of lines of code, the entire process typically completes in minutes to hours on modern workstations, depending on complexity and tool implementation. As of 2025, emerging techniques like large language models (LLMs) are being explored to assist in optimizing directives and automating parts of the process.
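The loop-unrolling transformation applied during the high-level transformation stage can be shown in miniature as a hand-worked source-to-source rewrite (illustrative code, not any tool's actual output); both forms are semantically identical, but the second exposes four independent multiplies to the scheduler:

```cpp
// Rolled form, as a designer would write it.
int dot4_rolled(const int a[4], const int b[4]) {
    int acc = 0;
    for (int i = 0; i < 4; ++i)
        acc += a[i] * b[i];
    return acc;
}

// Fully unrolled equivalent: no loop-control operations remain, and the
// four multiplies become parallel DFG nodes feeding a balanced adder tree.
int dot4_unrolled(const int a[4], const int b[4]) {
    return (a[0] * b[0] + a[1] * b[1]) + (a[2] * b[2] + a[3] * b[3]);
}
```

Preserving semantics while reshaping the DFG in this way is exactly what "without altering semantics" means for this stage.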

Key Algorithms and Scheduling

High-level synthesis (HLS) relies on scheduling algorithms to map operations from a behavioral description, typically represented as a data flow graph (DFG), onto a sequence of clock cycles while respecting timing and resource constraints. Scheduling determines the time steps at which each operation executes, balancing latency, throughput, and resource utilization. Key concepts include the mobility of operations—the range of time steps an operation can be assigned without violating data dependencies—and critical path analysis, which identifies the longest path in the DFG to establish the minimum achievable latency. Latency minimization involves scheduling operations to minimize the length of the longest path (critical path) in the DFG, typically formulated as \min \max_p \sum_{e \in p} \text{delay}_e over all paths p, where delays account for operation execution and interconnects. List scheduling is a foundational heuristic in HLS, prioritizing operations based on a priority function such as urgency or data arrival times. It includes as-soon-as-possible (ASAP) scheduling, which assigns each operation to the earliest feasible time step to minimize latency, and as-late-as-possible (ALAP) scheduling, which delays operations to the latest possible step, often used to expose mobility or guide resource binding. These methods are computationally efficient but may not yield globally optimal results due to their greedy nature. For resource-constrained scenarios, list scheduling extends to assign operations while ensuring the number of concurrent uses does not exceed available units. For optimal solutions, integer linear programming (ILP) formulates scheduling as a constrained optimization problem, minimizing latency or area subject to precedence and resource limits. The assignment of operations to time steps and units is modeled via binary variables in a matrix A, where A_{op,unit,t} indicates if operation op is bound to unit unit at time t, solved such that \sum_{op} A_{op,unit,t} \leq N_{unit}, the number of available units of each type, at every time step t, alongside constraints for data dependencies.
ILP excels in latency-area trade-offs for small to medium DFGs but scales poorly due to exponential complexity, often requiring branch-and-bound techniques. Resource allocation in HLS involves binding operations to functional units and variables to storage elements to minimize hardware cost, such as multiplexers and registers. Graph-based techniques, like the left-edge algorithm for register allocation or clique partitioning for functional unit sharing, model compatibility graphs where nodes represent operations and edges indicate shareability, partitioning to reduce interconnections. For instance, clique partitioning minimizes the number of multiplexers by grouping mutually exclusive operations onto shared units. These steps follow scheduling to ensure feasible mappings. Heuristic methods address ILP's scalability issues; force-directed scheduling, for example, iteratively balances resource usage across time steps by simulating forces proportional to operation densities, aiming for uniform distribution to avoid bottlenecks. This approach trades optimality for speed, achieving near-optimal results in polynomial time for many designs. More advanced techniques integrate Boolean satisfiability (SAT) solvers to handle complex constraints like multi-cycle operations or pipelining, encoding the scheduling problem as a Boolean formula and using modern solvers for efficient search. Recent advancements in the 2020s incorporate machine learning (ML) for scheduling, particularly reinforcement learning agents trained on DFG ensembles to predict optimal assignments under non-deterministic optimizations like dynamic resource scaling. These ML-based methods, such as graph neural networks for mobility prediction, outperform traditional heuristics in latency reduction for irregular applications, enabling adaptive scheduling in heterogeneous accelerators.
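The ASAP and ALAP concepts above can be sketched in a few lines; this toy scheduler assumes unit-latency operations and a topologically ordered DFG (assumptions made for brevity, not features of production tools):

```cpp
#include <algorithm>
#include <vector>

// DFG as predecessor lists: preds[v] holds the operations whose
// results operation v consumes. Nodes are topologically ordered.
struct Dfg {
    std::vector<std::vector<int>> preds;
};

// ASAP: earliest feasible time step for each operation (sources at 0).
std::vector<int> asap(const Dfg& g) {
    int n = static_cast<int>(g.preds.size());
    std::vector<int> t(n, 0);
    for (int v = 0; v < n; ++v)
        for (int p : g.preds[v])
            t[v] = std::max(t[v], t[p] + 1);
    return t;
}

// ALAP: latest feasible step under an overall latency bound L.
std::vector<int> alap(const Dfg& g, int L) {
    int n = static_cast<int>(g.preds.size());
    std::vector<int> t(n, L);
    for (int v = n - 1; v >= 0; --v)
        for (int p : g.preds[v])
            t[p] = std::min(t[p], t[v] - 1);
    return t;
}
```

For y = (a+b)*(c+d), two adds feeding a multiply, ASAP places the adds at step 0 and the multiply at step 1; with L = 1 the ALAP schedule coincides, so every operation has zero mobility (alap minus asap) and lies on the critical path.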

Optimization Techniques

Architectural Constraints and Exploration

In high-level synthesis (HLS), architectural constraints define the boundaries within which the synthesis tool generates implementations from behavioral descriptions, ensuring compatibility with target platforms such as FPGAs or ASICs. Common constraints include clock period, which specifies the maximum cycle time to meet timing requirements, often derived from performance targets like achieving a certain operating frequency. Area constraints limit resource utilization, such as the number of logic elements or DSP blocks, to fit within available hardware capacity. Throughput requirements dictate the minimum data processing rate, influencing scheduling decisions to balance parallelism and resource sharing. Additionally, bit-width constraints enforce bit-accurate modeling, where fixed-point or arbitrary-precision data types are specified to optimize precision and resource efficiency without overflow or unnecessary bits. Design space exploration (DSE) in HLS involves systematically evaluating architectural alternatives to identify optimal trade-offs under these constraints, typically framed as multi-objective optimization problems. Techniques such as genetic algorithms and simulated annealing navigate the vast parameter space, considering factors like loop unrolling factors, pipeline stages, and resource bindings. The resulting Pareto fronts represent non-dominated solutions, plotting metrics like latency against area to guide designers in selecting implementations that best satisfy conflicting goals, such as minimizing latency while adhering to an area budget. These fronts enable visualization of trade-offs, where increasing parallelism might reduce latency but exceed area limits. Recent advancements as of 2025 have integrated large language models (LLMs) and machine learning for automated HLS directive optimization, enhancing DSE by predicting optimal configurations from syntax-aware abstract syntax tree (AST) guidance, enabling faster exploration of complex parameter spaces.
Methods for efficient exploration include using template architectures, which provide pre-defined structures like systolic arrays to accelerate exploration for compute-intensive kernels, allowing quick assessment of candidate designs through parameterized HLS directives. Iterative refinement further enhances this by incorporating feedback from synthesis runs or early estimates, adjusting parameters like binding strategies to converge on feasible designs while respecting clock and throughput constraints. Such approaches reduce the need for exhaustive searches by prioritizing promising configurations based on predictive models. A representative example is exploring pipeline depths in fast Fourier transform (FFT) implementations, where varying the initiation interval and stage parallelism trades off latency for resource usage; deeper pipelines can achieve higher throughput under fixed clock periods but may increase area due to additional registers. Challenges in architectural constraint handling arise from balancing user-specified limits, such as explicit area budgets, against automated decisions by the HLS tool, which may over-allocate resources if not guided properly. Target-specific variability exacerbates this, as FPGA implementations tolerate reconfiguration for timing more readily than fixed ASIC flows, where post-fabrication changes are impossible, requiring conservative constraints to account for process variations. Emerging in the 2020s, machine learning-driven DSE tools have addressed exploration challenges by using predictive models, such as graph neural networks, to approximate synthesis outcomes and achieve faster convergence to Pareto-optimal designs, often evaluating thousands of configurations in hours rather than days.
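Extracting the Pareto front from a set of evaluated design points is itself straightforward; the sketch below assumes each candidate is summarized by two minimized metrics (the field names are illustrative):

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// One evaluated DSE candidate: latency in cycles, area in LUTs.
struct Design {
    int latency;
    int area;
};

// Return the non-dominated designs (both metrics minimized),
// sorted by increasing latency.
std::vector<Design> pareto_front(std::vector<Design> pts) {
    std::sort(pts.begin(), pts.end(), [](const Design& x, const Design& y) {
        return x.latency < y.latency ||
               (x.latency == y.latency && x.area < y.area);
    });
    std::vector<Design> front;
    int best_area = INT_MAX;
    for (const Design& d : pts)
        if (d.area < best_area) {  // strictly better area than every faster point
            front.push_back(d);
            best_area = d.area;
        }
    return front;
}
```

A dominated point (one that is both slower and larger than some other candidate) never survives the scan, which is precisely the non-domination criterion the text describes.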

Performance and Resource Optimizations

High-level synthesis (HLS) employs pipelining as a core technique to enhance performance by overlapping the execution of successive loop iterations, increasing throughput while inserting registers to support higher clock frequencies. This process analyzes data dependencies to determine the initiation interval (II), the minimum number of clock cycles between starting consecutive iterations, enabling steady-state operation where new iterations begin every II cycles after an initial ramp-up. For instance, polyhedral models facilitate dynamic pipelining by extracting parallelism from affine loop nests, achieving up to 4.3× improvement in cycles per iteration on FPGA platforms. Parallelization through loop tiling further boosts performance by partitioning iteration spaces into smaller blocks that can be processed concurrently, often in conjunction with unrolling to instantiate multiple processing units. Resource optimizations in HLS focus on minimizing hardware usage without severely impacting functionality. Resource sharing binds multiple operations to a single functional unit via multiplexers, reducing the total number of logic gates and interconnects; for example, sharing a multiplier across N operations typically requires an N-to-1 multiplexer, lowering area by up to 50% in some designs. Memory optimizations, such as loop fusion, merge adjacent loops to reuse intermediate data in on-chip buffers, minimizing off-chip accesses and bandwidth pressure; this technique has been shown to reduce data movement by fusing computations, improving overall performance in iterative algorithms. Power optimizations automate techniques like clock gating, which disables clocks to idle registers and reduces dynamic power in non-active cycles, integrated directly into the HLS flow to achieve up to 30% savings in FPGA implementations. Voltage scaling hints in the input code guide dynamic adjustment of supply voltages for less critical paths, further lowering energy consumption; when combined with approximate computing designs, this yields 24.5% additional savings over switching activity reduction alone.
Key metrics evaluate these optimizations: throughput is quantified as iterations per cycle, given by \frac{1}{\text{II}}, where II represents the pipelining initiation interval, while latency measures the cycles required for a single iteration to complete. Area is estimated using gate equivalents, approximated as \text{Area} = \sum (\text{unit\_cost} \times \text{binding\_factor}), where the binding factor accounts for multiplicity across operators. These metrics highlight trade-offs, as increasing parallelism via pipelining or unrolling enhances throughput but escalates area and power due to replicated units and higher switching demands. Post-2015 advancements incorporate approximate computing in HLS for energy-efficient AI accelerators, trading minor accuracy loss for substantial resource reductions in deep neural networks; configurable approximate arithmetic units, synthesized from high-level descriptions, cut area and power by 20-40% in DNN accelerators while maintaining acceptable error rates.
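These metrics can be worked through numerically; the cycle-count identity below (pipeline depth plus II per additional iteration) is the standard first-order model behind HLS latency reports, with function names chosen here for illustration:

```cpp
// Total cycles for n iterations of a pipelined loop with the given
// pipeline depth (cycles for one iteration) and initiation interval II:
// the first result appears after `depth` cycles, then one every II cycles.
long pipeline_cycles(long n, long depth, long ii) {
    return depth + (n - 1) * ii;
}

// Steady-state throughput in iterations per clock cycle.
double throughput_iters_per_cycle(long ii) {
    return 1.0 / static_cast<double>(ii);
}
```

For a 100-iteration loop of depth 5, II = 1 gives 104 cycles while II = 2 gives 203: roughly half the throughput, but the relaxed schedule lets the tool share functional units and save area.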

Outputs and Integration

Generated Hardware Descriptions

High-level synthesis (HLS) tools primarily generate synthesizable hardware description language (HDL) code in formats such as Verilog, VHDL, or SystemVerilog, which can be used directly for further implementation in field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs). These outputs are a register-transfer-level (RTL) description that captures the scheduled and allocated hardware architecture derived from the input behavioral model. The generated RTL typically consists of modular structures with defined ports that mirror the interfaces specified in the high-level input, ensuring compatibility with surrounding components. Internally, the design separates into a datapath for computational operations (operators such as adders, multipliers, and multiplexers) and a control unit implemented as a finite-state machine (FSM) that sequences operations according to the synthesis schedule. This separation facilitates targeted optimizations, where the datapath handles data processing and the FSM manages timing and control flow.

Beyond core RTL, HLS tools often produce additional artifacts to support validation and verification, including automatically generated testbenches derived from the original input stimuli to enable co-simulation between the high-level model and the RTL output. Performance reports provide estimates of latency, initiation interval, and achievable clock frequency to assess design quality. Detailed timing analysis, including metrics like worst negative slack (WNS) and total negative slack (TNS), is performed in downstream tools. Post-processing of HLS outputs involves integration with backend electronic design automation (EDA) flows, where the RTL undergoes logic synthesis to produce a gate-level netlist optimized for specific technology libraries. This step maps RTL constructs to standard cells, applying technology-specific optimizations for area, power, and timing before physical design stages like placement and routing. The resulting gate-level netlist serves as the foundation for fabrication or FPGA bitstream generation.
Quality assurance for generated hardware descriptions relies on metrics such as functional coverage, which measures the extent to which input behaviors are exercised in the generated RTL, and simulation-based checking to confirm that the output matches the high-level model's semantics. Formal equivalence verification tools compare the RTL against the behavioral source, detecting discrepancies in state transitions or data transformations. These checks ensure correctness, with coverage goals often targeting over 90% to validate completeness. In modern system-on-chip (SoC) design, HLS supports generation of parameterized blocks that facilitate reuse by allowing configurable modules to be integrated across diverse architectures without full RTL regeneration. This approach enhances modularity, as seen in HLS-generated IP cores that support rapid reconfiguration of accelerators in multi-die SoCs.

Interface and Protocol Synthesis

Interface and protocol synthesis in high-level synthesis (HLS) involves the automatic generation of ports, communication channels, and protocol-compliant wrappers from high-level behavioral descriptions, enabling seamless integration of synthesized modules into larger systems such as SoCs or FPGA platforms. This process transforms abstract I/O specifications, such as function arguments in C/C++, into concrete interfaces that adhere to established standards, ensuring interoperability with processors, memories, and peripherals. By inferring interface types based on access patterns (e.g., scalar arguments as signals, arrays as memory ports), HLS tools add the necessary logic for address decoding, handshaking, and buffering without manual RTL intervention.

Key interface types supported in HLS include memory-mapped bus protocols such as AXI, which is prevalent in ARM-based SoC and FPGA ecosystems for its support of high-bandwidth, low-latency transactions, and Wishbone, an open-standard bus favored in custom or open-source designs for its simplicity and flexibility. For dataflow-oriented architectures, streaming interfaces, often implemented via libraries like hls::stream in Vitis HLS, facilitate point-to-point data movement with inherent buffering, mapping to AXI-Stream protocols that enable pipelined, burst-capable transfers between accelerator functions. During synthesis, ports are inferred from the top-level function's I/O: for instance, pointer arguments to large arrays are typically synthesized as AXI master interfaces for memory access, while scalar inputs become AXI-Lite slave ports for configuration; protocol wrapping then encapsulates these with control signals (e.g., valid/ready handshakes) to enforce standard compliance. Synthesis challenges arise in optimizing for efficiency and correctness, particularly with burst transfers in AXI, where consecutive memory accesses must be grouped to minimize overhead; HLS tools report burst opportunities and misses to guide optimizations, but unaligned accesses or irregular patterns can limit coalescence.
Endianness mismatches between software and hardware domains require explicit handling, often via pragmas or post-synthesis verification, while ensuring AMBA specification compliance (e.g., for AXI4 subsets) demands rigorous protocol validation to avoid deadlocks or data corruption in multi-master systems. Techniques for customization include directive-based approaches, such as the #pragma HLS interface directive in Vitis HLS, which specifies modes like s_axilite for lightweight control registers or m_axi for full memory ports with burst support; bundle pragmas further group related arguments into shared interfaces. Co-synthesis with software is facilitated by automated generation of host drivers (e.g., C APIs for register access), allowing joint verification of hardware-software interactions. A representative example is converting a C function for image filtering into a hardware accelerator: input/output streams are declared as hls::stream<ap_axiu<8,0,0,0>> to infer AXI-Stream interfaces with side-channel signals for pixel data and control (e.g., TUSER for frame boundaries), while parameters like filter coefficients use AXI-Lite for runtime configuration, resulting in an RTL module ready for IP integrator tools. As of 2025, HLS-generated accelerators integrate with high-speed interfaces like PCIe and Ethernet via standard protocols such as AXI, often using dedicated IP cores for the protocol handling to enable offloading compute-intensive kernels or real-time sensor fusion in distributed edge computing.

Applications and Tools

Industrial Use Cases

High-level synthesis (HLS) has been widely adopted in wireless communications for accelerating baseband processing on field-programmable gate arrays (FPGAs), where it enables the rapid implementation of high-throughput components such as channel coding and signal detection modules. For instance, HLS tools have been used to design quasi-cyclic low-density parity-check (LDPC) decoders and primary/secondary synchronization signal detectors, achieving real-time performance with fixed-latency pipelines that meet 5G new radio (NR) standards. These implementations demonstrate HLS's ability to handle complex signal processing while optimizing resource utilization on FPGA fabrics.

In the automotive industry, HLS supports the development of advanced driver-assistance systems (ADAS), particularly for vision pipelines that process camera feeds for tasks such as object detection and lane tracking in autonomous vehicles. By synthesizing C/C++ algorithms into hardware accelerators, HLS reduces the development cycle for in-vehicle compute units, allowing integration with neural processing elements for inference. Case studies show HLS enabling optimized implementations for driver-state detection and environmental perception, balancing latency and power in embedded systems.

For artificial intelligence and machine learning (AI/ML) applications, HLS is integral to creating specialized accelerators for convolutional neural networks (CNNs), as seen in AMD's Vitis AI platform, which compiles high-level models into FPGA-optimized inference engines. This approach facilitates deployment of CNNs for tasks like image classification and edge analytics, with HLS directives enabling quantization and pipelining to achieve low-latency processing on resource-constrained devices. Vitis AI's HLS flow supports end-to-end development from trained models to hardware, improving throughput for real-world AI workloads.

In aerospace, SystemC-based HLS aids the creation of fault-tolerant designs for space systems, where radiation-hardened processors must withstand single-event upsets and ensure reliable operation in harsh environments.
SystemC models allow high-level simulation and synthesis of multi-processor systems-on-chip (MPSoCs) for on-board computing, incorporating fault injection for tolerance analysis. This methodology supports scalable architectures for space payloads, verifying mitigation mechanisms before ASIC or FPGA deployment.

Notable case studies illustrate HLS's practical impact. NASA-related efforts include applying HLS to image processing for space-borne instruments, such as in high-energy physics detectors, where HLS accelerates preprocessing and filtering on Zynq SoCs for real-time analysis of particle collision imagery. More recent work explores HLS for front-end image-processing algorithms in space telescopes, optimizing operations such as Sobel filtering and format conversion to handle orbital data streams efficiently. In telecommunications, HLS has been pivotal in 5G chip development, with behavioral synthesis flows enabling baseband processors for NR protocols, as demonstrated in FPGA prototypes for LDPC decoding and OFDM transceivers that scale to production designs.

In practice, HLS delivers 5-10x productivity gains in prototype development by reducing code complexity and enabling faster simulation-to-hardware iteration, allowing teams to explore design spaces rapidly without manual register-transfer-level (RTL) coding. This abstraction also supports scalability to billion-gate designs through hierarchical synthesis and modular verification, facilitating integration into large-scale systems like multi-core SoCs. Despite these advantages, challenges persist in safety-critical applications, where verification overhead can extend project timelines due to the need for formal equivalence checking between high-level models and RTL outputs, as well as qualification under standards like DO-254 for avionics. HLS-generated designs require additional testing and fault coverage analysis to ensure reliability in domains like aerospace and automotive, often necessitating hybrid flows to mitigate timing discrepancies.
In the 2020s, HLS has extended to emerging paradigms, including quantum hardware, where synthesis flows convert high-level quantum algorithms into gate-level descriptions for superconducting or ion-trap processors, addressing compilation challenges in noisy intermediate-scale quantum (NISQ) devices. For neuromorphic hardware, HLS optimizes spiking neural network accelerators on FPGAs, with reliability studies showing improved robustness for brain-inspired computing in edge AI applications.

Commercial Vendors and Open-Source Options

Several major commercial vendors provide high-level synthesis (HLS) tools tailored for both ASIC and FPGA targets, emphasizing integration with broader electronic design automation (EDA) flows to streamline hardware development. Cadence's Stratus HLS synthesizes C/C++ and SystemC to RTL with a focus on power, performance, and area (PPA) optimization, supporting multi-clock domains and integration with Cadence's implementation flow for early congestion feedback. AMD's Vitis HLS, part of the Vitis unified software platform, targets FPGAs and allows C and C++ descriptions to generate optimized RTL, with features for design space exploration and co-simulation. Siemens EDA's Catapult HLS platform supports C++ and SystemC for both ASIC and FPGA, offering physically aware synthesis and verification to achieve production-quality RTL with reduced iterations. Intel's High Level Synthesis Compiler integrates directly with the Quartus Prime design suite, converting C++ code to RTL for FPGAs while providing area and timing optimization directives.

Open-source HLS options provide accessible alternatives for research and prototyping, often focused on FPGA acceleration. Bambu, within the PandA framework developed by Politecnico di Milano, is an academic-grade HLS tool that synthesizes C code to Verilog, supporting custom optimizations and integration with both commercial and open-source backends like Yosys. CIRCT (Circuit IR Compilers and Tools), an LLVM project extension, facilitates high-level to low-level transformations using MLIR dialects, enabling modular HLS flows for hardware accelerators. LegUp, originally from the University of Toronto, was an open-source FPGA-focused HLS tool for C/C++ but was acquired by Microchip in 2021 and rebranded as SmartHLS, a commercial tool suite targeting PolarFire FPGAs with C/C++-to-RTL synthesis, co-simulation, and IP core generation, while its core concepts continue to influence open-source research.
These tools differ in target support and optimization capabilities: commercial offerings like Stratus and Catapult provide deep ASIC integration and advanced PPA tuning for enterprise-scale designs, whereas Vitis HLS and Intel's compiler excel in FPGA-specific directives for rapid prototyping; open-source tools such as Bambu emphasize extensibility for algorithmic research but may require more manual tuning for performance parity. Pricing for commercial tools typically follows enterprise subscription models, often bundled with full EDA suites starting at tens of thousands of dollars annually, while open-source options are free but demand expertise in the underlying frameworks.

Market trends in HLS reflect industry consolidation and cloud adoption, exemplified by AMD's 2022 acquisition of Xilinx, which unified HLS with AMD's CPU/GPU ecosystem to accelerate AI and data center workloads. Cloud-based HLS has gained traction through platforms like AWS F1 instances, allowing remote FPGA prototyping without local hardware. When selecting an HLS tool, key criteria include ecosystem integration, such as Stratus with Cadence's implementation flow for seamless RTL handoff or Intel HLS with Quartus for FPGA place-and-route, and support for specific targets like ASIC versus FPGA, alongside ease of verification and scalability for team workflows. Emerging options include libraries like hlslib, which extend Vitis and Intel HLS with reusable primitives to simplify accelerator design. Post-2023 advancements incorporate AI/ML for enhanced design space exploration, such as machine learning-based predictors in tools like Bambu extensions to reduce runtime by modeling latency and resource usage, along with large language models for automated HLS directive optimization and a 2025.1 HLS release featuring MATLAB-to-C++ generation for easier high-level design entry, as of November 2025.

  38. [38]
    HeteroHalide: From Image Processing DSL to Efficient FPGA ...
    Feb 24, 2020 · We propose HeteroHalide, an end-to-end system for compiling Halide programs to FPGA accelerators. This system makes use of both algorithm and scheduling ...
  39. [39]
    Fixed point vs floating point arithmetic in FPGA - imperix
    Aug 13, 2021 · Floating-point-based algorithms are more complex to handle than fixed-point, especially when using HDL languages (VHDL, Verilog). Fortunately, ...
  40. [40]
    Compile-Time Generation of Custom-Precision Floating-Point IP ...
    Two weaknesses of this approach are that it limits the number of floating-point formats - typically to half, single, and double - and that it requires ...
  41. [41]
    [PDF] An Introduction to High-Level Synthesis - Columbia CS
    Interface synthesis makes it possible to map the transfer of data that is implied by passing of C++ func- tion arguments to various hardware interfaces such as.<|control11|><|separator|>
  42. [42]
    High Level Synthesis - an overview | ScienceDirect Topics
    One of the main benefits of the HLST approach is that a set of algorithms and HLS constraints can be directly evaluated. This enables new design possibilities ...
  43. [43]
    [PDF] Introduction to high-level synthesis - IEEE Design & Test of Computers
    Since these three tasks are re- peated in each state, they can be pipelined into three stages. ... Gajski served as technical chair of the High-Level. Synthesis ...
  44. [44]
    [PDF] High Level Synthesis
    Architectural Synthesis. • Deals with “computational” behavioral descriptions. – Behavior as sequencing graph. (called dependency graph, or data flow graph DFG).Missing: abstraction levels
  45. [45]
    ScaleHLS: A New Scalable High-Level Synthesis Framework ... - arXiv
    Jul 24, 2021 · This paper proposes ScaleHLS, a new scalable and customizable HLS framework, on top of a multi-level compiler infrastructure called MLIR.
  46. [46]
    [PDF] Vitis High-Level Synthesis User Guide - AMD
    Oct 19, 2022 · With HLS, the testbench is also generated or created at a high level, meaning the original design intent can be verified very quickly. The ...Missing: EDA | Show results with:EDA<|control11|><|separator|>
  47. [47]
    Timing driven power gating in high-level synthesis - IEEE Xplore
    ... high-level synthesis. Given a target clock period and design constraints, our goal is to derive the minimum-standby-leakage-current resource binding solution.
  48. [48]
    Multi-objective Design Space Exploration for High-Level Synthesis ...
    In this paper, we model the design space exploration (DSE) as a multi-objective black-box optimization problem via Bayesian optimization with float encoding ...
  49. [49]
    GPU-Accelerated High-Level Synthesis for Bitwidth Optimization of ...
    We show how to parallelize the key steps of bitwidth optimization on the GPU by performing a fast brute-force search over a carefully constrained search space.
  50. [50]
    A Multi-objective Genetic Algorithm for Design Space Exploration in ...
    This paper presents a methodology for design space exploration (DSE) in high-level synthesis (HLS), based on a multi-objective genetic algorithm.
  51. [51]
    FPGA HLS Today: Successes, Challenges, and Opportunities
    Aug 8, 2022 · In this article, we assess the progress of the deployment of HLS technology and highlight the successes in several application domains.<|control11|><|separator|>
  52. [52]
    Fast and Inexpensive High-Level Synthesis Design Space Exploration
    Mar 16, 2023 · Fast and Inexpensive High-Level Synthesis Design Space Exploration: Machine Learning to the Rescue. Publisher: IEEE. Cite This.
  53. [53]
    Enabling adaptive loop pipelining in high-level synthesis - IEEE Xplore
    Loop pipelining is an important optimization in high-level synthesis (HLS) because it allows successive loop iterations to be overlapped during execution.
  54. [54]
    Polyhedral-Based Dynamic Loop Pipelining for High-Level Synthesis
    Dec 14, 2017 · Loop pipelining is one of the most important optimization methods in high-level synthesis (HLS) for increasing loop parallelism.
  55. [55]
    Loop Splitting for Efficient Pipelining in High-Level Synthesis
    Our parametric loop splitting improves pipeline performance by 4.3× in terms of clock cycles per iteration.
  56. [56]
    Low power methodology for an ASIC design flow based on high ...
    Clock gating and power gating are two well-known techniques for dynamic and leakage power reduction respectively. They can even be integrated to get maximum ...
  57. [57]
    High-level synthesis of approximate hardware under joint precision ...
    Results show that when considering voltage scaling, up to 24.5% higher energy savings can be achieved compared to approaches that only consider switching ...
  58. [58]
    Area Optimization of Multi-Cycle Operators in High-Level Synthesis
    In this paper a new design technique to overcome the restricted reusability of multi-cycle operators is presented.
  59. [59]
    Configurable High-Level Synthesis Approximate Arithmetic Units for ...
    The approximate computing paradigm reports promising techniques for the design of Deep Neural Network (DNN) accelerators to reduce resource consumption in both ...
  60. [60]
    High Level Synthesis - an overview | ScienceDirect Topics
    The backend synthesis phase involves three critical steps: allocation, scheduling, and binding. 2. Allocation determines the hardware resources to be used, ...
  61. [61]
    Bluespec Updates ESL Synthesis Toolset; Offers Improved Verilog ...
    This latest release of Bluespec ESL Synthesis offers IP vendors a viable delivery vehicle of RTL code generated from high-level models. Remarks Pattanam: " ...
  62. [62]
    [PDF] High-Level Synthesis Blue Book
    This book presents the recommended coding style for C++ synthesis that results in good quality. RTL. Most of the C++ examples are accompanied with hardware and ...
  63. [63]
    [PDF] High-Level Synthesis: from theory to practice - ARCHI
    High-Level Synthesis (HLS). ○ Starting from a functional description, automatically generate an RTL architecture. ○ Constraints. ◊ Timing constraints ...
  64. [64]
    LLM-Based Timing-Aware and Architecture-Specific FPGA HLS ...
    Jul 23, 2025 · Vivado's post-synthesis reports are then used to evaluate timing closure (e.g., Worst Negative Slack (WNS), Total Negative Slack (TNS)), ...
  65. [65]
    What is Synthesis? – How it Works - Synopsys
    Sep 8, 2025 · Synthesis is the process of transforming a high-level hardware description (such as RTL code) into a gate-level representation suitable for ...Missing: post- | Show results with:post-
  66. [66]
    Logic Synthesis in Digital Electronics - GeeksforGeeks
    Jul 23, 2025 · RTL block Synthesis: Translate RTL code into gate-level netlist by logical synthesis under the constraints. Partitioning of chip: The chip ...Asic Design · Logic Design · Logic Synthesis Flow
  67. [67]
    [PDF] Introduction to High-Level Synthesis ECE 699: Lecture 12
    Generation 1 (1980s-early 1990s): research period. Generation 2 (mid 1990s-early 2000s):. • Commercial tools from Synopsys, Cadence, Mentor Graphics, etc.
  68. [68]
    Functional Equivalence Verification Tools in High-Level Synthesis ...
    Aug 6, 2025 · High-level synthesis facilitates the use of formal verification methodologies that check the equivalence of the generated RTL model against ...
  69. [69]
    Functional Equivalence Verification Tools in High-Level Synthesis ...
    The article provides an overview of sequential equivalence checking techniques, its challenges, and successes in real-world designs.
  70. [70]
    [PDF] closing-functional-and-structural-coverage-on-rtl-generated-by-high ...
    The most common goal for using High Level Synthesis (HLS) is to reduce the effort needed to verify ... verification team to separate the testing of functionality ...<|control11|><|separator|>
  71. [71]
    How the Productivity Advantages of High-Level Synthesis Can ...
    This paper discusses how HLS can be used to improve the design, verification, and reuse of intellectual property (IP).
  72. [72]
    How The Productivity Advantages Of High-Level Synthesis Can ...
    Dec 6, 2023 · This paper discusses how HLS can be used to improve the design, verification, and reuse of intellectual property (IP) and an HLS tool. Click ...Missing: netlists 2020s
  73. [73]
    Introduction to Interface Synthesis - 2025.1 English - UG1399
    Introduction to Interface Synthesis - 2025.1 English - UG1399. Vitis High-Level Synthesis User Guide (UG1399) ... The default channels for Vitis kernels are AXI ...
  74. [74]
    AXI Adapter Interface Protocols - 2025.1 English - UG1399
    Tip: The AXI protocol requires an active-Low reset. If your design uses AXI interfaces the tool will define this reset level with a warning if the syn.rtl.
  75. [75]
    AXI Burst Transfers - 2025.1 English - UG1399
    The burst optimizations are reported in the Synthesis Summary report, and missed burst opportunities are also reported to help you improve burst optimization.
  76. [76]
    pragma HLS interface - 2025.1 English - UG1399
    AXI Interface Protocols: s_axilite : Implements the port as an AXI4-Lite interface. The tool produces an associated set of C driver files when exporting the ...
  77. [77]
    AXI4-Lite Interface - 2025.1 English - UG1399
    An HLS IP or kernel can be controlled by a host application, or embedded processor using the Slave AXI4-Lite interface ( s_axilite ) which acts as a system bus.
  78. [78]
    FPGA-Based Channel Coding Architectures for 5G Wireless Using ...
    Jun 7, 2017 · High-level synthesis compilation is used to design and develop the architecture on the FPGA hardware platform. To validate this architecture, an ...
  79. [79]
    (PDF) FPGA Implementation of 5G NR Primary and Secondary ...
    Aug 9, 2025 · FPGA are reconfigurable devices and easy to design complex circuits at high frequencies. The proposed architecture employs Primary ...
  80. [80]
    High-Level Synthesis for autonomous drive | Siemens Software
    This whitepaper describes how to speed the design flow and tame the verification challenge using the High-Level Synthesis (HLS) methodology.
  81. [81]
    Case Study: Optimizing In-Vehicle Compute with High-Level Synthesis
    This presentation introduces the use of High-Level Synthesis (HLS) to migrate AI functions from software into bespoke hardware accelerators. HLS simplifies and ...
  82. [82]
    [PDF] HW/SW Co-design and Prototyping Approach for Embedded Smart ...
    The main goal of this paper is to build a prototype of vision based ADAS (Advanced Driver Assistant System) as a smart camera capable to detect a fatigue state ...
  83. [83]
    Vitis AI Developer Hub - AMD
    Overview. AMD Vitis™ AI software is an AI inference development platform for AMD devices, boards, and Alveo™ data center acceleration cards.
  84. [84]
    Vitis HLS : floating point vs fixed point - Adaptive Support - AMD
    Feb 19, 2021 · I've designed a CNN accelerator design in Vitis HLS - two projects, each using floating and fixed point data type.
  85. [85]
    [PDF] Fault-Tolerant Satellite Computing with Modern Semiconductors
    no problem for emulation-based fault injection, where only the high-level behavior of a system is emulated, but challenging for more close-to-hardware SystemC- ...
  86. [86]
    [PDF] Fault-Tolerant Satellite Computing with Modern Semiconductors
    no problem for emulation-based fault injection, where only the high-level behavior of a system is emulated, but challenging for more close-to-hardware SystemC- ...
  87. [87]
    HLS Taking Flight: Toward Using High-Level Synthesis Techniques ...
    Jul 2, 2024 · HLS Taking Flight: Toward Using High-Level Synthesis Techniques in a Space-Borne Instrument. Authors: Marion Sudvarg.
  88. [88]
  89. [89]
    Accelerating FPGA-Based Wi-Fi Transceiver Design and Prototyping ...
    May 23, 2023 · This work shows that it is feasible to design modern Orthogonal Frequency Division Multiplex (OFDM) baseband processing modules like channel ...
  90. [90]
    The Evolution Of High-Level Synthesis - Semiconductor Engineering
    Aug 27, 2020 · HLS is beginning to solve some problems that were not originally anticipated.
  91. [91]
    ZeBu EP: Scalable Emulation & Prototyping Platform | Synopsys
    ZeBu EP offers the most scalable unified hardware platform for emulation and prototyping, supporting up to 5.8 billion gates for complex SoC designs.
  92. [92]
    10x productivity boost with HLS: Myth, legend or fact? | EDA stuff
    Oct 18, 2014 · The verdict: tenfold increase in productivity is a FACT, and you should be able to experience it, if you do it the RIGHT way!
  93. [93]
    [PDF] Paper Title (use style: paper title) - DVCon Proceedings
    Abstract—The adoption of tools into safety-critical workflows is often challenging as these new technologies must demonstrate sufficient safeness to use ...
  94. [94]
    Challenges In Using HLS For FPGA Design
    Apr 29, 2019 · The simple answer is that adopting an HLS design methodology in the real world does present unique challenges that must be considered and overcome during the ...
  95. [95]
    [PDF] Towards High-Level Synthesis of Quantum Circuits
    To this end, we propose an approach based on high-level synthesis concepts for quantum computers. High-Level Synthesis (HLS) is widely applied in CMOS- based ...
  96. [96]
    [PDF] Impact of High-Level-Synthesis on Reliability of Artificial Neural ...
    Mar 21, 2024 · For instance, a neuromorphic computer architecture is analyzed in [12], a commercial-off-the-shelf EdgeAI device in [13], and Google Tensor ...
  97. [97]
    A Quarter of a Century of Neuromorphic Architectures on FPGAs
    Mar 7, 2025 · This paper presents an overview of digital NMAs implemented on FPGAs, with a goal of providing useful references to various architectural design choices.
  98. [98]
    Stratus High-Level Synthesis - Cadence
    Stratus HLS starts with transaction-level SystemC, C, or C++ descriptions. Because the micro-architecture details are defined during HLS, the source ...
  99. [99]
    High-Level Synthesis C-Based Design - 2025.1 English - UG892
    The C-based High-Level Synthesis (HLS) tools within the Vivado Design Suite enable you to describe various DSP functions in the design using C, C++, and SystemC ...
  100. [100]
    Catapult High-Level Synthesis & Verification - Siemens EDA
    Catapult has the broadest portfolio of hardware design solutions for C++ and SystemC-based High-Level Synthesis (HLS). Catapult's physically-aware, multi-VT ...
  101. [101]
    2. High Level Synthesis (HLS) Design Examples and Tutorials - Intel
    The Intel® High Level Synthesis (HLS) Compiler Pro Edition includes design examples and tutorials to provide you with example components and demonstrate ways to ...
  102. [102]
    Bambu: An Open-Source Research Framework for the High-Level ...
    This paper presents the open-source high-level synthesis (HLS) research framework Bambu. Bambu provides a research environment to experiment with new ideas ...
  103. [103]
    LegUp: An open-source high-level synthesis tool for FPGA-based ...
    In this article, we introduce a new high-level synthesis tool called LegUp that allows software techniques to be used for hardware design. LegUp accepts a ...
  104. [104]
    AMD Acquires Xilinx
    AMD acquired Xilinx to create a high-performance computing leader, combining products, markets, and technology, and to accelerate emerging workloads.
  105. [105]
    Democratizing Domain-Specific Computing
    Jan 1, 2023 · Moreover, FPGAs have become available in the public cloud, such as Amazon AWS F1 and Nimbix. Designers can create their own DSAs on the FPGA and ...
  106. [106]
    Stratus High-Level Synthesis Datasheet - Cadence
    Integration with Genus physical synthesis allows early visibility and feedback into likely congestion problems, allowing the front-end designer to avoid ...
  107. [107]
    HLS LIBS - High-Level Synthesis Libraries' Homepage
    Welcome to hlslibs! HLSLibs is a free and open set of libraries implemented in standard C++ for bit-accurate hardware and software design.
  108. [108]
    Machine learning based fast and accurate High Level Synthesis ...
    In this paper, we present a machine learning based High-Level Synthesis (HLS) design space explorer (DSE) that significantly reduces the exploration runtime.