Nvidia Launches Rubin CPX AI Accelerator, Promising 30 PetaFLOPs for Data Centers

September 19, 2025
  • Nvidia has unveiled the Rubin CPX, a new AI inference accelerator designed to speed up high-value inference tasks in data centers, complementing its Rubin AI GPU.

  • The accelerator carries 128GB of GDDR7 memory, hardware video encode/decode engines, and delivers up to 30 petaFLOPs of compute in the NVFP4 data format, with notable gains in attention acceleration and token processing.

  • Optimized for complex AI tasks such as large-scale software development, video generation, and deep research, the Rubin CPX works alongside the Vera CPU and Rubin AI GPU within Nvidia's data center ecosystem.

  • The development of the Rubin CPX underscores Nvidia’s focus on creating specialized AI accelerators tailored for different AI workloads, highlighting the importance of hardware optimization for evolving AI models.

  • Nvidia plans to integrate the Rubin CPX into a single rack alongside the Vera CPU and Rubin AI GPU, or to offer it in a standalone rack, with configurations capable of delivering up to 8 exaFLOPs of performance, promising significant ROI for data center investments.
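As a back-of-the-envelope check on the headline figures, the sketch below relates the 30-petaFLOP per-chip number to the 8-exaFLOP rack claim. The unit count it produces is purely illustrative: the actual rack mixes Rubin CPX accelerators with Rubin GPUs and Vera CPUs, whose contributions are not broken out in the source.

```python
import math

# Headline figures from the announcement (NVFP4 precision).
CPX_PETAFLOPS = 30        # per Rubin CPX accelerator
RACK_EXAFLOPS = 8         # claimed peak for a full rack configuration
RACK_PETAFLOPS = RACK_EXAFLOPS * 1000

# Illustrative only: how many CPX-class parts it would take to reach
# the rack figure if CPX alone supplied all the compute (it does not;
# Rubin GPUs in the same rack contribute as well).
units_if_cpx_only = math.ceil(RACK_PETAFLOPS / CPX_PETAFLOPS)
print(units_if_cpx_only)  # 267
```

The point of the arithmetic is simply that the 8-exaFLOP rack claim implies hundreds of accelerator-class devices' worth of NVFP4 compute in a single rack.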

  • The company anticipates the need for annual updates to AI GPU architectures to keep pace with rapid AI innovation and to optimize performance across various AI workloads.

  • In addition to hardware, Nvidia introduced infrastructure innovations such as the NVL144 rack design, KV-cache management, the Dynamo inference framework, and enhancements to NVLink, Spectrum-X, and Quantum-X networking to support AI data centers and 'AI factories'.

Summary based on 1 source

