Nvidia Launches Rubin CPX AI Accelerator, Promising 30 PetaFLOPs for Data Centers
September 19, 2025
Nvidia has unveiled the Rubin CPX, a new AI inference accelerator designed to boost high-value inference tasks in data centers, complementing its existing Rubin AI GPU.
The accelerator features 128GB of GDDR7 memory and hardware encode/decode engines for video processing, and it delivers 30 petaFLOPs of performance using the NVFP4 data format, with notable improvements in attention acceleration and token processing.
Optimized for complex AI tasks such as large-scale software development, video generation, and deep research, the Rubin CPX works alongside the Vera CPU and Rubin AI GPU within Nvidia's data center ecosystem.
The development of the Rubin CPX underscores Nvidia’s focus on creating specialized AI accelerators tailored for different AI workloads, highlighting the importance of hardware optimization for evolving AI models.
Nvidia plans to integrate Rubin CPX into a single rack alongside the Vera CPU and Rubin AI GPU, or to offer it in standalone racks, with configurations capable of delivering up to 8 exaFLOPs of performance, promising significant ROI for data center investments.
The company anticipates the need for annual updates to AI GPU architectures to keep pace with rapid AI innovation and to optimize performance across various AI workloads.
In addition to hardware, Nvidia introduced infrastructure innovations such as the NVL144 rack design, KV Cache, the Dynamo inference framework, and enhancements to NVLink, Spectrum-X, and Quantum-X networking to support AI data centers and 'AI factories'.
Source

Forbes • Sep 19, 2025
Nvidia’s AI Factory Vision Comes Into Focus With Rubin CPX