NVIDIA and Google Cloud Unveil Game-Changing AI Innovations Around Gemini, Blackwell, and More

NVIDIA and Google Cloud are expanding their partnership with five major advancements that combine cutting-edge hardware and AI models. These features, announced in May 2025, center on Google’s Gemini and Gemma model families, NVIDIA’s Blackwell GPUs, and secure cloud infrastructure. The result is a robust framework to help you build, run, and scale AI workloads more efficiently, securely, and flexibly.

The partnership is designed to give you more control, better performance, and a faster path to next-generation AI solutions, both in the cloud and on-premises. From confidential computing to on-premises model deployment, here’s what customers and developers need to know.

Gemini models can now run on-premises with NVIDIA Blackwell

One of the biggest announcements is the ability to deploy Google Gemini models on-premises on NVIDIA Blackwell GPUs through Google Distributed Cloud. This means you can keep full control over your data and models while still getting the power of agentic AI.

Google Distributed Cloud is a managed service that supports air-gapped environments and edge deployments. Now that Gemini models can run locally, you can meet compliance requirements in regulated sectors such as healthcare, defense, and finance without sacrificing AI capabilities, bringing advanced AI into your secure internal environments.

Optimizations for Gemini and Gemma on NVIDIA GPUs

The Gemini and Gemma models have been optimized for NVIDIA’s GPU infrastructure to further boost performance. Gemini inference workloads will run faster and more efficiently across Vertex AI and the Google Distributed Cloud using NVIDIA’s hardware acceleration.

Meanwhile, Gemma models have been tuned using the TensorRT-LLM library. They will be available as NVIDIA NIM microservices, so you can deploy AI quickly with optimized speed and resource usage.
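NIM microservices expose an OpenAI-compatible HTTP API, so a deployed Gemma NIM can be queried with a few lines of standard-library Python. This is a minimal sketch: the base URL and model name below are illustrative placeholders, not values from the announcement, so check your own deployment for the actual endpoint and model identifier.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload, the schema NIM endpoints accept."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def query_nim(base_url: str, model: str, prompt: str) -> str:
    """POST to the NIM's /v1/chat/completions route and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (assumes a Gemma NIM is serving locally; model name is illustrative):
# query_nim("http://localhost:8000", "google/gemma-2-9b-it", "Summarize NVLink in one sentence.")
```

Because the API mirrors the OpenAI schema, existing client code can usually be pointed at a NIM endpoint by changing only the base URL.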

Confidential computing with NVIDIA H100 GPUs

Google Cloud’s focus on security has extended into confidential computing. Customers can now preview Confidential VMs and Confidential GKE nodes on the A3 series with NVIDIA H100 GPUs. These VMs use hardware-level protections to encrypt data while it’s being processed, commonly referred to as “data in use” protection.

NVIDIA’s Daniel Rohrer said this approach puts data owners in full control of how their data is accessed and managed. With hardware-backed security from NVIDIA, you can now run sensitive AI workloads like scientific research or proprietary model training with more trust and safety.

Google’s A4 VMs with NVIDIA Blackwell are now available

And finally, the A4 VMs with NVIDIA Blackwell are now generally available. These VMs pack eight NVIDIA Blackwell GPUs connected by NVLink and are built on NVIDIA’s HGX B200 platform. Google first introduced them in February, and they deliver significant performance gains over the A3 instances.

These powerful VMs are for AI hypercomputing and are now available through services like Vertex AI and GKE. You get ultra-fast model training, improved throughput, and lower latency, perfect for large language models, agentic AI systems, and real-time applications.
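To give a sense of what provisioning A4 capacity on GKE looks like, here is a hedged sketch using the real `gcloud container node-pools create` command. The machine type (`a4-highgpu-8g`) and accelerator name (`nvidia-b200`) are assumptions based on Google’s naming conventions for earlier generations and are not confirmed by the announcement; verify them against the current gcloud documentation before use.

```shell
# Illustrative only: the machine type and accelerator names below are assumptions;
# confirm the exact values with `gcloud compute machine-types list` and
# `gcloud compute accelerator-types list` for your zone.
gcloud container node-pools create blackwell-pool \
  --cluster=my-ai-cluster \
  --zone=us-central1-a \
  --machine-type=a4-highgpu-8g \
  --accelerator=type=nvidia-b200,count=8 \
  --num-nodes=1
```

The same machine type can be targeted from Vertex AI custom training jobs once it is available in your region.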

Powering the next generation of enterprise AI

These five advancements are a big deal for businesses looking for secure, high-performance AI infrastructure. Whether on-premises or in the cloud, the combination of NVIDIA’s hardware and Google Cloud’s software provides a flexible, scalable, and secure foundation for your next AI project.

As businesses demand more control, performance, and trust in their AI systems, NVIDIA and Google Cloud are the ones delivering integrated solutions that meet enterprise-grade requirements, without sacrificing innovation or agility.
