Research Notes

Is Radical Transparency the Only Way to Win the AI Cloud Wars?


Nebius challenges the GPU cloud status quo with Aether 3.1, betting that visibility into capacity is just as critical as the Blackwell Ultra chips powering it.

24/12/2025

Key Highlights

  • Nebius becomes the first European cloud provider to deploy NVIDIA GB300 NVL72 and HGX B300 systems in production.

  • The new Aether 3.1 platform introduces "Capacity Blocks" to solve the industry-wide issue of phantom GPU availability.

  • Native support for NVIDIA NIM and Blueprints signals a strategic shift from pure infrastructure to application-layer enablement.

  • Enhanced governance and security features aim to attract highly regulated enterprise sectors like healthcare and finance.

  • The integration of 800 Gbps NVIDIA Quantum-X800 InfiniBand networking targets the bandwidth bottlenecks inherent in distributed training.

The News

Nebius has released Nebius AI Cloud 3.1, an update to its "Aether" platform that introduces next-generation NVIDIA Blackwell Ultra compute clusters and a suite of operational transparency tools. The update makes Nebius the first provider in Europe to offer live NVIDIA GB300 NVL72 clusters, while adding features such as real-time capacity dashboards and native support for NVIDIA NIM microservices. The release focuses on enterprise readiness, offering granular control over GPU reservations and improved security compliance.

Analyst Take

The narrative in the AI infrastructure market has historically been dominated by a single metric: who has the most GPUs. The launch of Nebius AI Cloud 3.1, however, suggests a maturation in the sector, with the conversation shifting from simple scarcity to operational reliability and transparency. My perspective is that while the hardware headlines around the NVIDIA Blackwell Ultra deployment are significant, the software orchestration updates in the Aether platform are the true differentiators here. The platform is architected to address the "black box" frustration that many engineering teams face when dealing with hyperscalers and GPU clouds, where reserved capacity does not always equal available capacity.

What was Announced

The core of the announcements focuses on two distinct pillars: next-generation compute hardware and the software layer designed to manage it. On the hardware front, Nebius is deploying NVIDIA GB300 NVL72 and HGX B300 systems. These are not standard commodity clusters; the GB300 NVL72 acts as a rack-scale system where 72 GPUs function as a single massive accelerator, interconnected via NVLink. Nebius aims to deliver this with NVIDIA Quantum-X800 InfiniBand networking, providing 800 Gbps of throughput to minimize latency in distributed training workloads.

On the software side, the Aether 3.1 platform introduces "Capacity Blocks" and a real-time Capacity Dashboard. This feature is designed to give users a window into their specific GPU reservations, allowing them to see exactly what compute is available in which region at any given moment. This is paired with project-level quotas and object-storage lifecycle rules, which are engineered to give finance and ops teams granular control over spend and resource allocation. Additionally, the platform now natively supports NVIDIA NIM (inference microservices) and Blueprints. This allows developers to deploy pre-packaged workflows, such as a virtual screening pipeline for drug discovery, without needing to manually configure the underlying container orchestration or manage license keys for specific bio-models like Boltz2 or GenMol.
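The core idea behind Capacity Blocks, separating what a contract promises from what is actually schedulable right now, can be sketched in a few lines. Everything below is an illustrative model of my own: the class, field names, and functions are assumptions for the sake of the example, not the Nebius API.

```python
from dataclasses import dataclass

@dataclass
class CapacityBlock:
    """Illustrative stand-in for a reserved GPU block (not the Nebius API)."""
    region: str
    gpu_type: str
    reserved: int    # GPUs the contract promises
    available: int   # GPUs actually schedulable at this moment

def schedulable_gpus(blocks, region, gpu_type):
    """Sum GPUs that are genuinely free in a region, ignoring paper reservations."""
    return sum(b.available for b in blocks
               if b.region == region and b.gpu_type == gpu_type)

def can_schedule(blocks, region, gpu_type, needed):
    """The 'phantom availability' gap: reserved capacity alone proves nothing."""
    return schedulable_gpus(blocks, region, gpu_type) >= needed

# Two blocks totalling 216 reserved GPUs, but only 72 are actually free
blocks = [
    CapacityBlock("eu-west", "GB300", reserved=144, available=72),
    CapacityBlock("eu-west", "GB300", reserved=72, available=0),
]
```

With this toy data, `can_schedule(blocks, "eu-west", "GB300", 100)` returns `False` despite 216 GPUs being on paper, which is exactly the gap a real-time dashboard makes visible before a training run fails to schedule.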

I find the move to introduce Capacity Blocks a particularly astute response to current market friction. We are seeing a trend where enterprises sign contracts for thousands of GPUs but struggle to get their workloads scheduled due to fragmentation or oversubscription by the cloud provider. By visualizing this capacity, Nebius is effectively "de-risking" the procurement process for CIOs. It is a bold move because it forces the provider to be honest; you cannot hide behind an opaque dashboard if you promise real-time visibility.

The deployment of the GB300 NVL72 in Europe is also a strategic geographic play. Data sovereignty and latency are becoming critical issues for European AI labs. By aiming to deliver this caliber of compute within the European economic zone, Nebius is positioning itself to capture a segment of the market that cannot, or will not, ship their data to US-based GPU clusters. The inclusion of the Quantum-X800 InfiniBand fabric is crucial here: as model sizes grow, the network, not the GPU, becomes the bottleneck. Nebius seems to have over-indexed on networking performance, which is the correct architectural decision for the Blackwell generation.

Furthermore, moving up the stack with NVIDIA NIM and Blueprint integration shows that Nebius understands it cannot win on hardware rental alone. The "time-to-science" metric is becoming the new KPI for research organizations. By allowing a bio-researcher to spin up a virtual screening workflow in a few clicks, rather than hiring a DevOps engineer to build a Kubernetes cluster first, Nebius is increasing the stickiness of its platform. It transitions the relationship from a transactional hardware rental to a platform dependency.

Looking Ahead

The announcements from Nebius signal the beginning of the "Day 2" era of AI cloud computing. "Day 1" was the frantic land grab for H100s; "Day 2" is about the manageability, observability, and efficiency of those resources. Going forward, I will be closely monitoring how the hyperscalers respond to this level of transparency. The large players have traditionally thrived on opacity regarding resource contention, and a smaller, agile player like Nebius forcing the issue could trigger a shift in what customers demand from their Service Level Agreements.

The key trend that I am going to be looking out for is the adoption rate of the GB300 NVL72 architecture for inference versus training. While marketed heavily for training massive foundation models, the memory bandwidth of the GB300 makes it exceptionally potent for high-throughput inference of reasoning models. If Nebius can demonstrate that its "Capacity Blocks" model allows the elastic scaling needed for inference spikes, it could carve out a defensible niche against CoreWeave and Lambda. The industry is pivoting toward "AI Factories": purpose-built facilities that differ fundamentally from general-purpose clouds. Nebius is clearly aligning its Aether platform to be the operating system for these factories, and its success will likely depend on whether its software execution can match the velocity of its hardware deployment.

Author Information

Steven Dickens | CEO HyperFRAME Research

Regarded as a luminary at the intersection of technology and business transformation, Steven Dickens is the CEO and Principal Analyst at HyperFRAME Research.
Ranked consistently among the Top 10 Analysts by AR Insights and a contributor to Forbes, Steven's expert perspectives are sought after by tier-one media outlets such as The Wall Street Journal and CNBC, and he is a regular on TV networks including the Schwab Network and Bloomberg.