NVIDIA's Rubin CPX GPU Revealed: Featuring 192 SMs and an Estimated 512-bit Memory Bus

NVIDIA has officially announced the Rubin CPX GPU, a new processor engineered for large-context inference and AI-driven video generation. Unveiled on September 9th (US time), the GPU is built on the "Rubin" architecture but has a distinct feature set compared with the previously detailed Rubin Tensor Core GPU.

The Rubin CPX GPU uses a cost-effective monolithic design. Its architecture is heavily optimized for computation in the NVFP4 data format, and it integrates dedicated NVENC and NVDEC video encode/decode units, reflecting its focus on video-centric tasks. To support these workloads, the GPU is equipped with 128GB of GDDR7 memory.

An analysis of NVIDIA's official render of the Rubin CPX reveals a structure composed of 192 repeating units, arranged in a 4x4x3x4 configuration (4 × 4 × 3 × 4 = 192). It is widely speculated that these units correspond directly to 192 Streaming Multiprocessors (SMs). That SM count is on par with the GB202.

At the rack level, adding these GPUs delivers a significant performance uplift. The Vera Rubin NVL144 CPX system, which incorporates 144 Rubin CPX GPUs, gains 4.4 EFLOPS of NVFP4 compute performance. The system's memory bandwidth also rises by 0.3 PB/s, and its fast memory capacity grows by 25TB.

Turning to the memory specifications, dividing the 0.3 PB/s rack-level bandwidth uplift across 144 GPUs gives each Rubin CPX an approximate memory bandwidth of 2083 GB/s. The GeForce RTX 5090, by comparison, achieves 1792 GB/s with a 512-bit bus, which works out to 28 Gbps per pin. Taken together with the Rubin CPX's 128GB memory capacity, this strongly suggests that the Rubin CPX also uses a 512-bit memory bus. That configuration would imply an effective memory speed of approximately 32.55 Gbps per pin.
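
The bandwidth estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not an official calculation: it assumes decimal units (1 PB/s = 1,000,000 GB/s), assumes the 0.3 PB/s uplift is split evenly across the 144 Rubin CPX GPUs, and treats the 512-bit bus width as the speculated value rather than a confirmed one. The function names are hypothetical and exist only for illustration.

```python
def per_gpu_bandwidth_gbs(rack_uplift_pbs: float, gpu_count: int) -> float:
    """Split a rack-level bandwidth uplift (PB/s) evenly across GPUs, in GB/s.

    Assumes decimal units: 1 PB/s = 1,000,000 GB/s.
    """
    return rack_uplift_pbs * 1_000_000 / gpu_count


def per_pin_rate_gbps(bandwidth_gbs: float, bus_width_bits: int) -> float:
    """Convert total bandwidth (GB/s) to a per-pin data rate (Gbps).

    bandwidth [GB/s] * 8 [bits/byte] / bus width [pins]
    """
    return bandwidth_gbs * 8 / bus_width_bits


# Rubin CPX: 0.3 PB/s spread over 144 GPUs (figures from the article).
cpx_bandwidth = per_gpu_bandwidth_gbs(0.3, 144)
# Speculated 512-bit bus; this width is an assumption, not a confirmed spec.
cpx_pin_rate = per_pin_rate_gbps(cpx_bandwidth, 512)
# GeForce RTX 5090 reference point: 1792 GB/s on a 512-bit bus.
rtx5090_pin_rate = per_pin_rate_gbps(1792, 512)

print(f"Rubin CPX bandwidth per GPU: {cpx_bandwidth:.0f} GB/s")  # 2083 GB/s
print(f"Rubin CPX per-pin rate:      {cpx_pin_rate:.2f} Gbps")   # 32.55 Gbps
print(f"RTX 5090 per-pin rate:       {rtx5090_pin_rate:.2f} Gbps")  # 28.00 Gbps
```

Changing the `bus_width_bits` argument shows how the implied per-pin speed would shift under a different bus-width assumption.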
