September 8, 2025 · NVIDIA

NVIDIA Rubin CPX Unveiled: AI Inference Enters the “Million Token Era” with 30 Petaflops Per Card

NVIDIA has officially unveiled its Rubin CPX chip system, which is set to launch by the end of 2026. The product is designed specifically for AI-driven video generation, software development, and other tasks that require processing very large contexts. Rubin CPX will be offered in a card form factor, either for integration into existing server infrastructure or as a standalone device operating within data centers.

Rubin CPX: Entering the “Million Token Era” of AI Inference

Rubin CPX is part of the upcoming Rubin product line. Its architecture splits AI inference into two distinct stages, "input understanding" and "output generation", and runs each stage on a separate GPU chip. Because the two stages stress hardware differently, each chip can be tuned for its own stage, which improves overall efficiency.
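The two-stage split described above can be illustrated with a toy sketch. This is not NVIDIA's software or API: the names `ContextResult`, `context_stage`, `generation_stage`, and `run_split_inference` are hypothetical, and the "model" is a trivial arithmetic rule. The point is only that the first stage reads the whole input once and hands off its state, while the second stage produces output one token at a time from that state.

```python
from dataclasses import dataclass, field


@dataclass
class ContextResult:
    """Hypothetical hand-off object: the state built while reading the input."""
    cache: list[int] = field(default_factory=list)


def context_stage(prompt_tokens: list[int]) -> ContextResult:
    """Stage 1 ("input understanding"): process the whole prompt in one pass."""
    # A real model would compute attention state here; this toy only stores
    # the tokens so that the second stage has something to read.
    return ContextResult(cache=list(prompt_tokens))


def generation_stage(state: ContextResult, max_new_tokens: int) -> list[int]:
    """Stage 2 ("output generation"): emit tokens one at a time."""
    output = []
    for _ in range(max_new_tokens):
        # Toy next-token rule (an assumption for illustration only):
        # the sum of everything seen so far, modulo 100.
        next_token = sum(state.cache) % 100
        state.cache.append(next_token)
        output.append(next_token)
    return output


def run_split_inference(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    """Run the two stages back to back; each could live on a different device."""
    state = context_stage(prompt_tokens)
    return generation_stage(state, max_new_tokens)
```

In a real deployment the hand-off object would be moved between devices, so each stage can run on the hardware that suits it.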

Each Rubin CPX GPU delivers 30 petaflops of compute at NVFP4 precision and is equipped with 128GB of GDDR7 memory and built-in video encoding and decoding hardware. Compared with current systems, Rubin CPX triples attention acceleration performance.

On large-scale platforms, a complete Vera Rubin NVL144 CPX system integrates 144 Rubin CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs in a single rack. This configuration delivers a total of 8 exaflops of AI performance, 7.5 times that of the current GB300 NVL72 system. NVIDIA claims a return on investment (ROI) of 30 to 50 times, meaning that a $100 million investment could generate up to $5 billion in returns.
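The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in this article. The derived values are back-of-the-envelope estimates rather than official specifications, and the 8 exaflops rack total is treated as a rounded figure.

```python
# Figures quoted in the article.
CPX_PETAFLOPS = 30            # per Rubin CPX GPU, NVFP4
CPX_PER_RACK = 144            # Rubin CPX GPUs in one Vera Rubin NVL144 CPX rack
RACK_EXAFLOPS = 8             # quoted rack total (rounded)
SPEEDUP_VS_GB300_NVL72 = 7.5  # quoted ratio against GB300 NVL72
INVESTMENT_USD = 100_000_000
ROI_RANGE = (30, 50)

# Share of the rack total attributable to the Rubin CPX GPUs alone.
cpx_exaflops = CPX_PETAFLOPS * CPX_PER_RACK / 1000

# What is left for the 144 Rubin GPUs (approximate, since 8 EF is rounded).
remaining_exaflops = RACK_EXAFLOPS - cpx_exaflops

# GB300 NVL72 performance implied by the 7.5x claim.
implied_gb300_exaflops = RACK_EXAFLOPS / SPEEDUP_VS_GB300_NVL72

# Returns implied by the 30x to 50x ROI claim.
returns_usd = tuple(INVESTMENT_USD * multiple for multiple in ROI_RANGE)

print(f"Rubin CPX GPUs: {cpx_exaflops:.2f} EF of the {RACK_EXAFLOPS} EF total")
print(f"Remainder for Rubin GPUs: about {remaining_exaflops:.2f} EF")
print(f"Implied GB300 NVL72: about {implied_gb300_exaflops:.2f} EF")
print(f"Returns: ${returns_usd[0]:,} to ${returns_usd[1]:,}")
```

The $5 billion figure corresponds to the upper end of the range (50 times $100 million); the lower end works out to $3 billion.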

AI Inference and Real-World Applications

Positioned as a "million-token inference GPU", Rubin CPX targets large-scale software development and video generation. In software development, it helps developers manage large codebases by letting AI models take in full project libraries across multiple files. For video generation, it supports AI models that can process up to one hour of video content at once, which helps keep the generated results continuous and consistent.
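To get a feel for what a million-token context means for a codebase, here is a rough sizing sketch. It assumes about four characters per token, a common rule of thumb that varies by tokenizer and by language, so the result is an estimate only. The function names and the sample project are hypothetical.

```python
import math

CONTEXT_LIMIT_TOKENS = 1_000_000  # the "million-token" budget
ASSUMED_CHARS_PER_TOKEN = 4.0     # rough heuristic, not a tokenizer measurement


def estimate_tokens(text: str, chars_per_token: float = ASSUMED_CHARS_PER_TOKEN) -> int:
    """Estimate the token count of a string from its length in characters."""
    return math.ceil(len(text) / chars_per_token)


def codebase_fits(files: dict[str, str], limit: int = CONTEXT_LIMIT_TOKENS) -> tuple[int, bool]:
    """Return (estimated total tokens, whether the total fits within the limit)."""
    total = sum(estimate_tokens(source) for source in files.values())
    return total, total <= limit


# Hypothetical project: file name -> file contents.
project = {
    "app.py": "x" * 4000,
    "utils.py": "y" * 8000,
}
total_tokens, fits = codebase_fits(project)
print(f"Estimated {total_tokens} tokens; fits in context: {fits}")
```

For an exact count, the estimate would be replaced by the tokenizer of the model actually in use.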

Several companies, including Cursor (a code generation platform), Runway (a video creation platform), and Magic (an AI research company), have already expressed interest in collaborating with NVIDIA on this technology.

Strengthening NVIDIA’s AI Infrastructure Lead

Industry analysts predict that Rubin CPX will strengthen NVIDIA’s dominance in the AI infrastructure space. In 2025, NVIDIA's data center revenue is expected to surpass $184 billion, more than the combined total revenue of some of its competitors. The release of Rubin CPX represents a hardware breakthrough and also signals a shift in AI computation, from general-purpose architectures to highly optimized, dedicated solutions.

Do you think rapid advances in AI hardware such as Rubin CPX will reshape how software development and video generation work gets done? How would this hardware shift affect your own workflow?
