Powering generative AI with cloud storage innovations at Next ’24

Generative AI is changing the way we create, innovate, and interact with the world. From generating realistic images and videos to composing music and writing code, gen AI models are pushing the boundaries of what’s possible. But achieving the heights of AI’s promises hinges on a scalable storage foundation.

At Google Cloud, we’re committed to providing the infrastructure for businesses to harness the possibilities of gen AI. At Google Cloud Next ’24, we’re excited to announce a series of advancements in our storage portfolio. 

Accelerating AI training and inference with Google Cloud Storage

Gen AI models train on datasets in a computationally intensive and time-consuming process, gradually refining their ability to generate new content that resembles the training data. Similarly, AI inference (serving) in production requires low-latency access to models. At Next ’24, we introduced new storage solutions that address the challenge of decreasing model load, training, and inference times while maximizing accelerator utilization.

Cloud Storage FUSE with file caching: Faster training and inference through local data access

Cloud Storage FUSE allows you to mount Cloud Storage buckets as filesystems — a game-changer for AI/ML workloads that rely on frameworks that often require file-based data access. Training and inference can leverage the benefits of Cloud Storage, including lower cost, through filesystem APIs. And with the addition of file caching, Cloud Storage FUSE can increase training throughput by 2.9X. By keeping frequently accessed data closer to your compute instances, Cloud Storage FUSE file caching delivers faster training compared to native ML framework data loaders, so you can rapidly iterate and bring your gen AI models to market quicker.
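As a sketch of what this looks like in practice, the following mounts a bucket with the file cache enabled. The bucket name, cache directory, and configuration keys below are illustrative; consult the Cloud Storage FUSE documentation for the current configuration schema.

```shell
# Write a gcsfuse config that enables the file cache on fast local storage.
# (Option names assumed from the gcsfuse config-file format; verify against
# the current docs.)
cat > /tmp/gcsfuse-config.yaml <<'EOF'
cache-dir: /mnt/local-ssd/gcsfuse-cache   # local SSD path used for cached files
file-cache:
  max-size-mb: -1                         # -1 = use all available space in cache-dir
EOF

# Mount the training-data bucket as a local filesystem; ML frameworks can
# then read it through ordinary file APIs, with repeat reads served from cache.
gcsfuse --config-file /tmp/gcsfuse-config.yaml my-training-bucket /mnt/training-data
```

Repeated epochs over the same dataset are where the cache pays off: the first epoch populates the local cache, and subsequent reads avoid round trips to the bucket.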

Parallelstore: Ultra-low latency and caching for demanding training workloads

Parallelstore, Google Cloud’s parallel file system for high-performance computing and AI/ML workloads, now also includes caching in preview. Its high performance makes it ideal for training complex gen AI models. With caching enabled, Parallelstore delivers up to 3.9X faster training times and up to 3.7X higher training throughput compared to native ML framework data loaders. Parallelstore also features optimized data import and export from Cloud Storage to further accelerate training.

Hyperdisk ML: Purpose-built for high-performance training and inference

Training and serving inference in production require fast and reliable access to data. Hyperdisk ML is a new block storage offering that’s purpose-built for AI workloads. Currently in preview, it delivers exceptional performance, not only accelerating training times but also improving model load times by up to 11.9X compared to common alternatives. Hyperdisk ML allows you to attach up to 2,500 instances to the same volume, so a single volume can serve over 150X more compute instances than competitive block storage volumes, ensuring that storage access scales with your accelerator needs.
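A typical serving pattern is to preload model weights onto one Hyperdisk ML volume and attach it read-only to a fleet of inference VMs. The commands below are a hedged sketch: the disk type value, access-mode flag, and resource names are assumptions based on the gcloud compute surface, so verify them against the current Hyperdisk ML documentation.

```shell
# Create a Hyperdisk ML volume to hold model weights (type and size values
# are placeholders; check the docs for supported configurations).
gcloud compute disks create model-weights \
    --type=hyperdisk-ml \
    --size=1TB \
    --zone=us-central1-a

# Attach the same volume read-only to each inference VM, so every instance
# loads the model from one shared, high-throughput disk.
gcloud compute instances attach-disk inference-vm-1 \
    --disk=model-weights --mode=ro --zone=us-central1-a
gcloud compute instances attach-disk inference-vm-2 \
    --disk=model-weights --mode=ro --zone=us-central1-a
```

Because the volume is attached read-only to every VM, scaling out inference is a matter of attaching more instances rather than copying weights to each one.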

Manage storage at scale: Generate insights with Gemini

Google Cloud is innovating to use large language models (LLMs) to help you manage cloud storage at scale. Generate insights with Gemini is built upon Insights Datasets, a Google-managed, BigQuery-based storage metadata warehouse. Using simple, natural language, you can easily and quickly analyze your storage footprint, optimize costs, and enhance security — even when managing billions of objects.

Leveraging Google Cloud’s history of thoughtfully designed user experiences, we’ve tailored Generate insights with Gemini to meet the demanding requirements of modern organizations, including:

  • Fully validated responses for top customer questions: Verified data responses for pre-canned prompts, ensuring rapid, precise answers to your team’s most critical questions. 

  • Accelerated understanding with visuals: Translate complex data into clear, visual representations, making it easy to understand, analyze, and share key findings across teams.

  • Dive deeper with multi-turn chat: Need more context or have follow-up questions? Generate insights with Gemini’s multi-turn chat feature allows you to engage in interactive analysis, and gain a granular understanding of your environment. 

Generate insights with Gemini is available now through the Google Cloud console as an allowlist experimental release. 

Other notable storage announcements

Beyond AI/ML, we also unveiled a range of storage innovations at Next ’24 that benefit a wide variety of use cases:

  • Google Cloud NetApp Volumes: NetApp Volumes is a fully managed SMB and NFS storage service that provides advanced data management capabilities and highly scalable performance for Windows and Linux workloads, at enhanced cost efficiency. NetApp Volumes now dynamically migrates files by policy to lower-cost storage based on access frequency (preview Q2’24). In addition, the Premium and Extreme service levels will support volumes of up to 1PB in size, with throughput increasing up to 2X and 3X, respectively (preview Q2’24). We are also introducing a new Flex service level that enables volumes as small as 1GiB, and expanding to 15 new Google Cloud regions in Q2’24 (GA).

  • Filestore: Google Cloud’s fully managed file storage service now supports single-share backup for Filestore Persistent Volumes on Google Kubernetes Engine (GKE) (generally available) and NFS v4.1 (preview), plus expanded Filestore Enterprise capacity of up to 100TiB.

  • Hyperdisk Storage Pools: With Hyperdisk Advanced Capacity (generally available) and Advanced Performance (preview), you can purchase and manage block storage capacity in a pool that’s shared across workloads. Individual volumes are thinly provisioned from these pools; they only consume capacity as data is actually written to disk, and they benefit from data reduction such as deduplication and compression. This lets you substantially increase storage utilization and can reduce storage TCO by over 50% in typical scenarios, compared to leading cloud providers. Google is the first and only cloud hyperscaler to offer storage capacity pooling.

  • Anywhere Cache: For multi-region buckets, Cloud Storage Anywhere Cache now uses zonal SSD read caches across multiple regions within a continent to speed up cacheable workloads such as analytics and AI/ML training and inference (allowlist GA).

  • Soft delete: With this feature, Cloud Storage protects against accidental or malicious deletion of data by preserving deleted items for a configurable period of time (generally available).

  • Managed Folders: This new Cloud Storage resource type allows granular IAM permissions to be applied to groups of objects (generally available).

  • Tag-based backup at scale: With this feature, users can leverage Google Cloud tags to manage data protection for Compute Engine VMs (generally available).

  • High-performance backup for SAP HANA: A new option for backups of SAP HANA databases running in Compute Engine VMs leverages persistent disk (PD) snapshot capabilities for database-aware backups (generally available).

  • Backup and DR Service Report Manager: Customers can now customize reports with data from Google Cloud Backup and DR using Cloud Monitoring, Cloud Logging, and BigQuery (generally available).
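The soft delete protection described above is configured per bucket. As a hedged sketch (the flag names follow the gcloud storage surface but should be verified against the current reference; the bucket name is a placeholder):

```shell
# Set a soft-delete retention window on a bucket: deleted objects remain
# restorable for 7 days. (Flag name assumed; verify in the gcloud storage
# buckets update reference.)
gcloud storage buckets update gs://my-bucket --soft-delete-duration=7d

# List objects currently in the soft-deleted state, which can still be restored.
gcloud storage ls gs://my-bucket --soft-deleted
```

Setting the duration to zero would disable soft delete for buckets where the extra retention is not wanted.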
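Managed Folders, also described above, let you scope IAM grants to a prefix rather than a whole bucket. A minimal sketch, assuming the `gcloud storage managed-folders` command group (bucket, folder, and group names are placeholders; verify the commands against the current documentation):

```shell
# Create a managed folder under an existing bucket.
gcloud storage managed-folders create gs://my-bucket/team-a/

# Grant one team read access to just that prefix, without touching
# bucket-level IAM.
gcloud storage managed-folders add-iam-policy-binding gs://my-bucket/team-a/ \
    --member=group:team-a@example.com \
    --role=roles/storage.objectViewer
```

This keeps bucket-level policy small while still giving each team least-privilege access to its own data.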

Accelerate your journey with Google Cloud Storage

At Google Cloud, we’re committed to empowering businesses to unlock the full potential of AI/ML, enterprise, and cloud-first workloads. Whether you’re training massive gen AI models, serving inference at scale, or running Windows or GKE workloads, Google Cloud storage provides the versatility and power you need to succeed. Get in touch with your account team to learn how we can help you unleash the potential of generative AI with Google Cloud storage. You can also watch the storage sessions from Next ’24 on demand.
