Biren Technology
Shanghai Biren Intelligent Technology Co., Ltd., commonly known as Biren Technology, is a Chinese fabless semiconductor company founded in 2019 that designs graphics processing units (GPUs) and domain-specific processors for artificial intelligence, high-performance computing, and cloud applications.[1][2][3] Headquartered in Shanghai, the firm has developed chips such as the BR100, which targets performance competitive with international benchmarks in AI training and inference.[4][5] Biren has secured over 900 million USD in funding from investors including state-backed entities and venture firms such as Qiming Venture Partners, reaching valuations above 2 billion USD, and is preparing for a Hong Kong initial public offering.[6][7][8] United States export controls introduced in October 2022 restricted Biren's access to advanced manufacturing technologies over concerns that its chips could contribute to military supercomputing capabilities, prompting production halts by foundry partners such as TSMC and layoffs of approximately one-third of its workforce; the company was added to the US Entity List in October 2023.[9][10][11] Despite these setbacks, Biren has continued operations by modifying chip designs to comply with export controls and securing domestic funding pledges, including 280 million USD from government-supported investors in 2023.[12][13][11]

History
Founding and Early Development
Biren Technology was established in 2019 in Shanghai, China, by Zhang Wen and Xu Lingjie, with the objective of designing high-performance graphics processing units (GPUs) tailored for artificial intelligence workloads and general-purpose computing.[14][15] The founders, drawing on prior experience from firms including Nvidia, AMD, and Alibaba, sought to create domestically developed semiconductors capable of supporting AI training, inference, and scientific simulations, thereby addressing China's strategic need to lessen dependence on imported GPU technologies dominated by companies like Nvidia.[16][17] In its initial phase, Biren prioritized the recruitment of talent from global semiconductor enterprises to build a core team focused on original chip architecture development rather than reliance on foreign intellectual property.[16] Early research and development efforts centered on chiplet-based designs, which enable modular scalability for high-throughput parallel processing essential to AI applications.[15] This approach aligned with broader national initiatives to foster indigenous innovation in high-performance computing amid geopolitical constraints on technology access.[17] By 2022, these foundational investments culminated in the unveiling of Biren's first GPU prototypes, marking progress toward self-reliant computing hardware ecosystems.[15]

Product Launches and Milestones
Biren Technology publicly unveiled its BR100 general-purpose GPU in August 2022 at the Hot Chips 34 conference, marking the company's entry into high-performance computing with a dual-die design containing 77 billion transistors on TSMC's 7nm process node.[15][16] The BR100 was marketed as delivering 2048 TOPS of INT8 performance for AI workloads, positioning it against Nvidia's A100 in compute-intensive applications such as machine learning training.[18][19] In September 2024, the company introduced its HGCT heterogeneous GPU collaborative training solution, enabling enhanced scalability for large-scale AI model development through integrated multi-GPU orchestration.[20] This was followed in November 2024 by a partnership with Tencent-backed Infinigence AI, which integrated Biren GPU clusters into cloud infrastructure and roughly doubled training throughput for large language models compared with prior configurations.[21][22] Biren's GPUs also saw deployments in domestic intelligent computing centers via collaborations with entities such as China Mobile, supporting broader cloud-based AI inference and training ecosystems.[23] By mid-2025, Biren had secured approximately 1.5 billion yuan ($207 million) in new funding from state-backed investors, bolstering production and R&D amid scaling efforts.[7] The company also initiated preparations for a Hong Kong IPO, including hiring investment banks for valuation assessments targeting over $2 billion, signaling maturation toward commercial expansion.[20][24]

Expansion Amid Challenges
Following the imposition of restrictions limiting access to advanced foreign manufacturing processes in late 2022, Biren Technology pivoted to domestic fabrication partners, notably Semiconductor Manufacturing International Corporation (SMIC), to produce its BR100 GPUs on 7-nanometer nodes previously handled by Taiwan Semiconductor Manufacturing Company (TSMC).[25] This transition supported continued scaling of production capacity, aligning with broader efforts to localize the semiconductor supply chain amid external constraints.[26] To bolster operational resilience, Biren expanded into cloud computing infrastructure tailored for AI workloads within China, integrating its GPUs into server systems optimized for general-purpose computing tasks such as large language model training and inference.[27] These integrations emphasized chiplet-based architectures to enhance scalability in domestic data centers, fostering compatibility with local software stacks and reducing latency for enterprise applications in sectors like finance and autonomous systems.[28] In 2024, Biren formed a strategic collaboration with Infinigence AI, backed by Tencent, which resulted in a near-100% increase in GPU training capacity for large language models through optimized software-hardware co-design.[22] This partnership demonstrated tangible performance gains from ecosystem adaptations, enabling Biren to deploy enhanced configurations in Chinese cloud environments. Additional alliances with entities such as China Mobile, China Telecom, and Shanghai AI Laboratory further strengthened domestic interoperability, contributing to a self-reliant AI computing framework by mid-2025.[28][29]

Technology and Products
Core Architecture and Innovations
Biren Technology's GPUs employ a chiplet-based modular architecture, in which multiple smaller dies are interconnected to form a cohesive processing unit, enabling scalable performance while mitigating the manufacturing challenges associated with large monolithic chips. This design divides the GPU into tiles—such as the dual-die configuration in early models—allowing standardized compute and memory tiles to be reused across product variants, thereby reducing development cost and time-to-market.[30][19] On advanced nodes such as 7nm, where defect densities raise yield risks for dies beyond roughly 600-800 mm², chiplets improve economic viability by confining the impact of each defect to a smaller die, yielding more usable silicon per wafer and lower per-unit costs than equivalent monolithic designs, whose yields fall steeply with die area.[27]

To support multi-GPU scaling in datacenter environments, Biren integrates a proprietary high-bandwidth interconnect, BLink, which enables up to 8-way clustering with low-latency data transfer optimized for AI workloads. BLink prioritizes bandwidth density and protocol efficiency, allowing GPU resources to be aggregated into larger computational fabrics without routing traffic through the comparatively slow host PCIe interface, serving a role analogous to Nvidia's NVLink and facilitating elastic scaling for distributed training and inference.[31]

The architecture emphasizes lower-precision formats such as FP16 and INT8, which dominate AI inference and training efficiency by reducing computational overhead and memory bandwidth demands while preserving model accuracy in quantized neural networks. This focus reflects the observation that most deep learning operations retain sufficient fidelity at half precision, allowing 2-4x throughput gains over FP32 without proportional power increases, so hardware resources are weighted toward tensor operations in these formats. Relative to a monolithic die, the chiplet approach also aids power efficiency by placing high-activity compute tiles close to memory stacks, minimizing inter-die signaling losses and enabling fine-grained power gating.[32][15]

Custom machine learning accelerators are embedded within the streaming multiprocessors, featuring dedicated tensor cores tailored for the matrix multiply-accumulate operations central to convolutional and transformer models, with transistor allocation favoring dense compute arrays over general-purpose logic. Biren's designs achieve transistor densities of roughly 70-80 million per mm² on 7nm processes, reflecting a layout weighted toward AI-specific primitives, so that performance scales closely with the fraction of transistors kept busy by machine-learning kernels.[31][33]
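The yield advantage attributed to chiplets earlier in this section can be illustrated with a simple defect-density calculation. The sketch below uses the classic Poisson yield approximation and an assumed defect density; the figures are illustrative only and are not Biren's actual manufacturing data.

```python
import math

def poisson_yield(die_area_mm2: float, defect_density_per_cm2: float) -> float:
    """Classic Poisson die-yield approximation: Y = exp(-A * D0)."""
    return math.exp(-(die_area_mm2 / 100.0) * defect_density_per_cm2)

D0 = 0.1                      # assumed defects per cm^2 (illustrative only)
full_area = 1074.0            # BR100 aggregate silicon area, mm^2
tile_area = full_area / 2.0   # one of two identical compute tiles

y_full = poisson_yield(full_area, D0)   # fraction of good monolithic dies
y_tile = poisson_yield(tile_area, D0)   # fraction of good chiplets

# With known-good-die testing, any two good chiplets can be paired in one
# package, so the usable-silicon fraction tracks the chiplet yield rather
# than the much lower monolithic yield.
print(f"monolithic-die yield: {y_full:.1%}")   # ~34%
print(f"chiplet yield:        {y_tile:.1%}")   # ~58%
```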
BR100 GPU Specifications and Capabilities
The BR100 GPU, Biren Technology's flagship accelerator, incorporates 77 billion transistors across a multi-chip module (MCM) comprising two dies, fabricated on a TSMC 7 nm process node with an aggregate die area of 1074 mm².[34][15][35] It supports up to 64 GB of HBM2E memory in four stacks, delivering bandwidth exceeding 2 TB/s, alongside 300 MB of on-chip cache for data-intensive computations.[15][36][37] Host connectivity is provided via a PCIe Gen 5 x16 interface, enabling integration into datacenter servers.[34][38] Key performance attributes target AI and high-performance computing (HPC) workloads, with peak theoretical throughput of 1 PFLOPS in BF16 precision for floating-point operations and 2 POPS (peta operations per second) in INT8 for integer-based AI inference tasks.[39][37] FP32 performance reaches 256 TFLOPS, supporting general-purpose parallel processing, while the architecture includes dedicated tensor cores optimized for the matrix multiplications central to deep learning models.[39][31] The design accommodates a thermal design power (TDP) of up to 550 W, balancing compute density with power efficiency in scaled deployments.[32] For multi-GPU scalability in machine learning training, the BR100 features eight BLink ports implementing an all-to-all interconnect, with an aggregate bidirectional bandwidth of roughly 512 GB/s across the ports, to minimize latency in distributed training topologies.[40][16] This enables cluster configurations for large-scale model training, though hardware trade-offs such as reliance on high-bandwidth memory and custom interconnects prioritize throughput over flexibility in non-AI workloads compared with more mature ecosystems.[35] While capable of general-purpose GPU (GPGPU) tasks such as scientific simulations, its core optimizations favor cloud-based AI inference and training, and it integrates video encoding/decoding support for up to 64 channels of HEVC/H.264 at FHD@30fps.[41][40] Key specifications are summarized in the table below, followed by a few figures derived from them.

| Specification | Details |
|---|---|
| Transistor Count | 77 billion |
| Process Node | 7 nm (TSMC) |
| Die Area | 1074 mm² |
| Memory | 64 GB HBM2E (4 stacks) |
| On-Chip Cache | 300 MB |
| Peak BF16 Performance | 1 PFLOPS |
| Peak INT8 Performance | 2 POPS |
| Peak FP32 Performance | 256 TFLOPS |
| Interconnect | 8 × BLink ports (~512 GB/s aggregate) |
| Host Interface | PCIe Gen 5 x16 |
| TDP | 550 W |
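A few figures implied by these specifications can be derived directly. The short calculation below uses only the published peak numbers from the table above; real workloads deliver well below theoretical peaks.

```python
# Derived figures from the published BR100 peak specifications (illustrative).
transistors = 77e9          # total transistor count
die_area_mm2 = 1074         # aggregate die area, mm^2
bf16_pflops = 1.0           # peak BF16 throughput, PFLOPS
fp32_tflops = 256           # peak FP32 throughput, TFLOPS
tdp_w = 550                 # thermal design power, W

density_m_per_mm2 = transistors / die_area_mm2 / 1e6
bf16_tflops_per_watt = (bf16_pflops * 1000) / tdp_w
bf16_to_fp32_ratio = (bf16_pflops * 1000) / fp32_tflops

print(f"transistor density:     {density_m_per_mm2:.1f} M/mm^2")     # ~71.7, within the 70-80 M/mm^2 range cited above
print(f"peak BF16 per watt:     {bf16_tflops_per_watt:.2f} TFLOPS/W") # ~1.82
print(f"BF16 : FP32 peak ratio: {bf16_to_fp32_ratio:.1f}x")           # ~3.9
```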
Subsequent Developments and Iterations
In response to US export controls imposed in October 2022, Biren Technology modified the BR100 GPU design by reducing the number of BLink interconnect links from 8 to 7, lowering the combined BLink and CXL 2.0 bandwidth from 640 GB/s to 576 GB/s to comply with the 600 GB/s threshold for restricted advanced computing items (a worked illustration appears at the end of this subsection).[12] These changes, applied to variants such as the BR104 single-die module, preserved core compute performance density while disabling high-bandwidth features through packaging and pin-out adjustments, enabling limited domestic production amid foundry access constraints.[12]

Biren developed the Birensupa software platform to optimize GPU workloads, providing tools for porting applications, tuning performance, and integrating with domestic AI frameworks such as those for large language models.[42] This ecosystem supports heterogeneous computing environments by abstracting hardware differences, facilitating deployment in sanction-limited settings where uniform high-end accelerators are unavailable.[43]

The company iterated on hardware with the 106 series GPUs, launched in March 2025, featuring models such as the 106B, a dual-width PCIe card with 300 W peak power for tasks including multimodal AIGC generation, image recognition, and recommendation systems.[44][20] These iterations emphasize inference efficiency, supporting models such as Alibaba's Tongyi QwQ-32B without relying on restricted foreign memory hierarchies. In September 2024, Biren unveiled the Heterogeneous GPU Collaborative Training (HGCT) solution for scaling across mixed GPU clusters, followed in April 2025 by what the company described as China's first hybrid training deployment spanning four or more heterogeneous chip types.[20][45] This approach boosts training throughput by dynamically allocating compute and memory resources, compensating for interconnect limitations in constrained supply chains through software-orchestrated load balancing.[45]
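The interconnect reduction described above implies a straightforward compliance calculation. The sketch below infers a per-link figure from the reported totals; the roughly 64 GB/s per BLink link and 128 GB/s CXL share are derived from those totals rather than taken from an official breakdown.

```python
THRESHOLD_GBPS = 600   # aggregate off-chip interconnect limit under the October 2022 rule
original_total = 640   # reported BLink + CXL 2.0 bandwidth with 8 BLink links, GB/s
reduced_total = 576    # reported bandwidth after disabling one BLink link, GB/s

per_link = original_total - reduced_total     # 64 GB/s attributable to each BLink link
cxl_share = original_total - 8 * per_link     # ~128 GB/s remaining for CXL 2.0

print(f"inferred per BLink link: {per_link} GB/s, inferred CXL share: {cxl_share} GB/s")
print(f"8 links: {8 * per_link + cxl_share} GB/s -> above  the {THRESHOLD_GBPS} GB/s threshold")
print(f"7 links: {7 * per_link + cxl_share} GB/s -> below the {THRESHOLD_GBPS} GB/s threshold")
```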
Funding and Financial Performance
Investment Rounds and Backers
Biren Technology has secured substantial venture capital since its 2019 founding, with total funding reported at approximately $991 million across multiple rounds as of 2025.[46] Investors have included both private firms and state-affiliated entities, reflecting strategic emphasis on bolstering China's domestic GPU capabilities for AI and high-performance computing.[47] Early-stage backing came from semiconductor-focused investors such as Walden International and Semiconductor Manufacturing International Corporation (SMIC), which participated in rounds supporting initial R&D for Biren's cloud-oriented processors.[1] In 2023, additional capital flowed from Guangzhou-based funds, aiding expansion amid national pushes for chip self-sufficiency.[48] A March 2025 round featured investment from a private equity arm of Shanghai State-owned Capital Investment (SSCI), elevating the company's valuation to $2.2 billion and signaling municipal government alignment with AI hardware priorities.[49] This was followed by a June 2025 Series C-II extension raising 1.5 billion yuan ($209 million), primarily led by state-linked participants including a Guangdong provincial fund and Shanghai-backed vehicles like the State-owned Pioneer Private Equity Fund Management.[7][50] These infusions, totaling over $200 million in 2025 alone, have prioritized scaling production and ecosystem integration to reduce reliance on foreign technology suppliers.[5]

| Round Date | Type | Amount (USD) | Key Backers |
|---|---|---|---|
| Pre-2025 (multiple) | Various early/series | ~$700M | Walden International, SMIC, Guangzhou funds[48][1] |
| March 2025 | Undisclosed | Undisclosed (valuation: $2.2B) | Shanghai State-owned Capital Investment entity[49] |
| June 27, 2025 | Series C-II | $209M | Guangdong provincial fund, Shanghai state-linked investors[7][50] |
Revenue, Losses, and IPO Preparations
Biren Technology generated 400 million yuan in revenue during 2024, derived mainly from sales of its BR100 GPUs to domestic Chinese cloud providers and data centers seeking alternatives to restricted foreign chips.[51][24] Despite this uptick, the company sustained operating losses, driven by annual research and development expenditures exceeding 1 billion yuan to improve chip yields, expand manufacturing partnerships with firms such as SMIC, and iterate on architectures resilient to supply constraints.[7] These deficits reflect Biren's emphasis on capital-intensive scaling amid geopolitical barriers rather than short-term cost control.

In September 2024, Biren hired Guotai Junan Securities, one of China's largest brokerages, to conduct IPO tutoring—a mandatory preparatory phase for listings on mainland exchanges—and targeted an initial valuation of $2.19 billion.[52][53] By June 2025, following a 1.5 billion yuan funding round that valued the firm at around 14 billion yuan pre-money, Biren advanced preparations for a Hong Kong IPO, with a potential filing as early as August 2025, adapting to heightened regulatory scrutiny on A-share listings for tech firms under U.S. sanctions.[7] This pivot reflects strategic flexibility, as Hong Kong's exchange offers swifter approvals and access to international capital while aligning with Beijing's push for domestic semiconductor champions.[5]

Financial projections hinge on China's AI infrastructure boom, with Biren positioning itself to capture demand from hyperscalers such as Alibaba and Tencent, though sustained losses are expected until production volumes achieve economies of scale projected for 2026 onward.[54] The firm's approach prioritizes technological autonomy and market share in high-performance computing over profit margins, betting on policy-backed ecosystem growth to offset R&D burdens.[55]

Geopolitical Context and US Sanctions
Imposition of Export Controls
In October 2022, the U.S. Bureau of Industry and Security (BIS) implemented export controls under the Export Administration Regulations (EAR) targeting advanced computing chips, components, and semiconductor manufacturing equipment destined for China, with the explicit aim of restricting the People's Republic of China's (PRC) capacity to produce or acquire integrated circuits enabling supercomputing and artificial intelligence systems potentially usable for military applications. These rules, effective October 7, 2022, prohibited exports of chips exceeding specified performance thresholds—such as total processing performance (TPP) above 4800 or performance density metrics—and restricted access to tools for fabricating nodes at or below 16nm without licenses, which are subject to a presumption of denial for entities involved in military end-use. The controls were justified by U.S. assessments of PRC military modernization efforts, including the integration of high-performance computing into weapons systems and surveillance, based on intelligence indicating state-directed acquisition of U.S.-origin technologies for such purposes.

On October 19, 2023, BIS added Beijing Biren Technology Development Co., Ltd. and seven subsidiaries—including Guangzhou Biren Integrated Circuit Co., Ltd., Shanghai Biren Information Technology Co., Ltd., and Hangzhou Biren Technology Co., Ltd.—to the Entity List under the destination of China.[56] This designation requires U.S. exporters, reexporters, and in-country transferors to obtain licenses for any items subject to the EAR, with a policy of denial for those supporting military end-uses or end-users in the PRC. The rationale cited Biren's development of general-purpose computing GPUs, such as the BR100, as posing risks of enabling PRC military advancements in AI training and high-performance computing, consistent with broader U.S. evaluations of dual-use technologies in state-backed firms.[56] These additions built on the 2022 framework by heightening scrutiny on specific actors procuring restricted nodes (e.g., below 14nm) and tools from foreign partners.

U.S. officials framed these measures as empirically grounded responses to documented PRC efforts to militarize advanced semiconductors, drawing on declassified assessments of programs like the Military-Civil Fusion strategy that channel commercial innovations toward defense applications. In contrast, PRC authorities and state media have characterized the controls as protectionist tactics to suppress China's indigenous innovation and maintain U.S. dominance in semiconductors, arguing that they arbitrarily target competitive firms without evidence of direct military involvement. Subsequent BIS updates in 2023 refined performance parameters and expanded tool restrictions, further limiting Biren's access to global supply chains for advanced fabrication.
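The total processing performance (TPP) metric referenced above is commonly summarized as peak throughput in TOPS multiplied by the bit length of the operation. The sketch below applies that simplified reading to published accelerator figures; it is illustrative only, and the official rule text governs how controlled status is actually determined.

```python
def tpp(peak_tops: float, bit_length: int) -> float:
    """Simplified Total Processing Performance: peak TOPS x operand bit length.

    This follows the common reading of the BIS 3A090 metric
    (2 x MacTOPS x bit length, where TOPS = 2 x MacTOPS).
    """
    return peak_tops * bit_length

# Nvidia A100: ~312 dense BF16 tensor TFLOPS -> TPP ~4992, above the 4800 threshold.
print(tpp(312, 16))
# Biren BR100: 1 PFLOPS (1000 TFLOPS) BF16 -> TPP 16000, well above the threshold.
print(tpp(1000, 16))
```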
Impacts on Operations and Supply Chain
The October 2022 US export controls on advanced computing chips prompted Taiwan Semiconductor Manufacturing Company (TSMC) to suspend production of Biren's BR100 GPU, which was slated for fabrication on TSMC's 7nm process, severing access to one of the world's leading foundries and creating immediate supply chain disruptions; Biren's subsequent addition to the Entity List in October 2023 tightened these restrictions further.[57][58] The halt forced Biren to pivot to domestic alternatives, primarily Semiconductor Manufacturing International Corporation (SMIC), which must fabricate 7nm-class chips without extreme ultraviolet (EUV) lithography tools because of the same export controls, resulting in lower yields and higher defect rates than TSMC's equivalents.[59][60] These shifts contributed to delays in scaling production, as Biren encountered challenges adapting designs to SMIC's process variations and ramping output volumes, which extended delivery timelines for BR100 units to enterprise customers and hampered operational momentum in 2023.[61][10]

Yield penalties at SMIC, estimated to be significantly below global leaders on comparable nodes owing to restricted access to advanced equipment and materials, increased per-unit costs and reduced throughput, exacerbating Biren's difficulties in meeting initial market commitments.[62][60] In response to foundry constraints, Biren accelerated the substitution of foreign-dependent electronic design automation (EDA) tools with domestic equivalents, enabling partial continuity in chip verification and tape-out despite broader ecosystem gaps, though this did not resolve the underlying manufacturing bottlenecks.[63][64] Overall, these impacts underscored the sanctions' effectiveness in limiting Biren's access to high-volume, high-yield production, compelling a costly transition to less efficient domestic supply chains.[65]

Adaptations and Sanction Evasion Efforts
In response to U.S. export controls imposed in October 2022, which restricted high-performance AI chips exceeding certain total processing performance and interconnect thresholds, Biren Technology adjusted the BR100's specifications to fall within the regulatory limits. The original design's combined BLink and CXL interconnect bandwidth of 640 GB/s, delivered through eight BLink ports, exceeded the rule's 600 GB/s interconnect threshold; Biren disabled one port, reducing the total to 576 GB/s and thereby preserving potential access to fabrication services.[12][66][67] These modifications, implemented by late 2022, involved disabling components to lower clock speeds and overall throughput without fundamentally altering the core architecture.[66] Biren has emphasized these changes as measures of legal compliance, asserting that the revised BR100 adheres to export licensing requirements while preserving competitive viability for domestic applications.[12] The company also leveraged pre-sanction stockpiles of chips and intellectual property produced via Taiwan Semiconductor Manufacturing Company (TSMC) before production halted in October 2022 due to compliance reviews.[57][68]

To address supply chain vulnerabilities arising from restricted foreign foundry access, Biren pivoted toward domestic alternatives, aligning with China's broader push for semiconductor self-sufficiency. In 2025, the firm advanced its IPO preparations, with proceeds earmarked for enhancing indigenous fabrication and reducing reliance on overseas nodes such as TSMC's 7nm process.[28] This includes investments in local equipment and materials, though domestic foundries such as SMIC operate at less advanced nodes, limiting scalability.[10] Biren has publicly framed such efforts as necessary responses to what it characterizes as U.S. extraterritorial restrictions that undermine global technology norms, and maintains that it has not violated international trade laws.[12]

Achievements and Market Impact
Performance Benchmarks and Comparisons
The BR100 GPU from Biren Technology exhibits peak theoretical performance that surpasses Nvidia's A100 at several precision levels, according to specifications released by the company. Biren reports that the BR100 achieves up to 256 TFLOPS in FP32 operations, 1 PFLOPS in BF16, and 2 POPS in INT8, compared with the A100's 19.5 TFLOPS FP32, approximately 312 TFLOPS FP16/BF16 (with tensor cores), and 1.25 POPS INT8.[31][15] These figures stem from the BR100's design, which incorporates 77 billion transistors on a 7 nm TSMC process with 64 GB of HBM2E memory and a clustered architecture optimized for multi-GPU scaling, enabling purportedly near-linear performance increases in large clusters.[69] The table below lists the headline figures; a short calculation of the implied ratios follows it.

| Metric | Biren BR100 | Nvidia A100 (40GB) |
|---|---|---|
| FP32 TFLOPS | 256 | 19.5 |
| BF16 PFLOPS | 1 | ~0.312 |
| INT8 POPS | 2 | 1.25 |
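Taken at face value, the peak figures in the table imply the ratios computed below. This is an illustrative calculation only; theoretical peaks do not reflect delivered application performance, and the two chips differ in memory, interconnect, and software maturity.

```python
# Ratio of peak theoretical throughput, BR100 vs. A100, per the table above.
specs = {
    "FP32 TFLOPS": (256.0, 19.5),
    "BF16 PFLOPS": (1.0, 0.312),
    "INT8 POPS": (2.0, 1.25),
}

for metric, (br100, a100) in specs.items():
    print(f"{metric}: BR100 / A100 = {br100 / a100:.1f}x")
```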