Google Cloud has announced a widespread expansion in its cloud infrastructure, promising a varied and powerful approach to AI training, inference and data processing.
At Google Cloud Next 2024, its annual conference held at Mandalay Bay in Las Vegas, Google Cloud unveiled a series of new advancements for its 'AI Hypercomputer' architecture, aimed at helping customers unlock the full potential of AI models.
To meet growing customer demand, AI Hypercomputer brings together Google Cloud TPUs and GPUs as well as its AI software to offer a broad portfolio of generative AI training options.
One of the central pillars of the AI Hypercomputer architecture is Google Cloud's Tensor Processing Units (TPUs), circuits designed specifically for neural networks and AI acceleration, and Google Cloud has announced that the latest iteration, TPU v5p, is now generally available.
First announced in December 2023, TPU v5p can, Google Cloud claims, train large language models three times faster than the previous generation. Each TPU v5p pod contains 8,960 chips and delivers a 300% improvement in memory bandwidth per chip.
Google Cloud's A3 virtual machines (VMs), announced in May 2023, will be joined by a new 'A3 Mega' VM. With an array of Nvidia H100 GPUs in each VM, A3 Mega will offer twice the GPU-to-GPU network bandwidth of A3, making it well suited to running and training the largest AI workloads on the market.
A new service called Hyperdisk ML will help companies leverage block storage to improve data access for artificial intelligence and machine learning (ML) purposes. It follows the 2023 announcement of Google Cloud Hyperdisk, a block storage service that helps enterprises connect durable storage devices to individual VM instances.
Hyperdisk ML was built with AI in mind, with the ability to cache data on servers so that thousands of instances can run inference against the same data if needed. In a briefing, Google Cloud claimed that Hyperdisk ML can load models up to 12 times faster than alternative solutions.
In its race to achieve greater power to support workloads, Google Cloud will soon also host Nvidia's Blackwell family of chips. This brings Google Cloud virtual machines to the frontier of performance, with the Nvidia HGX B200 capable of up to tenfold improvements over Nvidia's Hopper chips for AI use cases.
The GB200 NVL72 brings even more power to the table, combining 36 Grace CPUs and 72 Blackwell GPUs, enough to handle near-future LLMs at trillion-parameter scale. With liquid cooling and its high-density, low-latency design, the data center solution can deliver 25 times the performance of an Nvidia H100 while using the same power and less water.
This puts it more in line with green data centers, a key selling point for Google as it balances AI's energy demands with its own sustainability goals.
AWS also recently announced that it is bringing Blackwell to its platform, pitting the two hyperscalers against each other in terms of performance potential. The key differentiator here is that the partnership between Google Cloud and Nvidia sits within the former's broader investment in competitive hardware for enterprise use cases.
“Our strategy here is really focused on system-level design and optimization, across compute, storage, networking, hardware and software to meet the unique needs of each and every workload,” said Mark Lohmeyer, vice president and general manager of compute and machine learning infrastructure at Google Cloud.
Leaning toward open, general-purpose computing
Alongside its AI-specific offerings, Google Cloud has also made the decision to expand its line of general-purpose cloud computing offerings. With this in mind, the firm has announced a general-purpose CPU for data centers called Google Axion.
Axion is Google Cloud's first Arm-based CPU and is already used to power Google services like Bigtable, BigQuery, and Google Earth Engine. The firm said Axion is capable of performance improvements of up to 50% over comparable current-generation x86 instances and a 30% improvement over Arm-based instances.
The chip is also expected to offer up to 60% more power efficiency than comparable x86 instances, which businesses could take advantage of to reduce operating costs and more easily achieve environmental goals.
Axion is built on an open foundation, with the goal of being as interoperable as possible so that customers don't need to rewrite any code to get the most out of the CPU.
Google Cloud has announced two new virtual machines, C4 and N4, which will be among the first in the industry to feature fifth-generation Intel Xeon processors. Google Cloud stated that N4 excels at workloads that do not need to run at full speed for long periods of time, giving the examples of microservices, virtual desktops and data analysis.
In contrast, C4 is intended to accelerate mission-critical workloads, with a 25% performance improvement over the C3 VM and “zero-impact” upgrades to ensure critical services can run without interruption.
Across all of its new infrastructure offerings, Google Cloud has emphasized the importance of meeting customers' performance demands without compromising on price or energy efficiency. By casting a wide net, the hyperscaler has shown concrete improvements across all computing demands rather than simply doubling down on higher-end workloads.