As the fuel for AI, data’s role in driving innovation is of the utmost importance. According to recent research, 96% of CxOs recognize that generative AI will become critical to the success of any organization, and 70% of an employee’s tasks have the potential to be automated by gen AI and other technology. However, a major obstacle that stands in the way of true enterprise gen AI adoption is data accessibility. According to the Google Cloud Data and AI Trends Report 2024, 66% of organizations report that at least half of their data is dark, and much of this data is unstructured and unmanaged today.
As a data leader, you are the agent of unprecedented change — to reach new customers, innovate products and processes, train teams, and so much more. The opportunity ahead of you is to connect all your business data to AI to empower your teams with data insights that are context-aware and constantly improving.
AI-ready data platforms are revolutionizing the way organizations handle information, leaving behind the legacy world of data warehouses and data lakes, and embracing all data regardless of format. These platforms consolidate data management into a single, unified environment, simplifying governance and helping ensure security. But that’s just the beginning. Imagine extracting meaningful insights from images, videos, and text effortlessly, while real-time interaction with large language models (LLMs) enhances the decision-making processes. AI-ready data platforms aren’t just a trend; they’re a strategic imperative for staying competitive in today’s data-driven landscape.
At Google Cloud, we are holding up our end of the bargain. In 2023, we delivered 350+ new data analytics product capabilities, up 60% from the previous year. This included new gen AI capabilities to accelerate the work of data teams, as well as AI infused into our data platform through integration of BigQuery with Vertex AI and Gemini models. In 2024, we moved to bring these capabilities and more into BigQuery — our unified platform from data to AI. In the coming weeks, we are releasing even more new innovations and maturing gen AI capabilities to general availability and preview to support your data to AI journey.
These include:
- Gemini in BigQuery delivers AI-powered experiences for data engineering, data exploration and analysis, and governance and security tasks. This includes features such as code generation, completion, and explanation (SQL, Python), data insights, data canvas, data preparation, and partitioning and clustering recommendations.
- Gemini in Looker capabilities for AI-powered formula assistance, which lets you explore data and create metrics from complex formulas, and for slide generation, which lets you send your analysis straight to slides and use Gemini to create presentations with compelling narratives, all integrated with Google Workspace. Users of all types can simply chat with your BigQuery data and Looker models to gain immediate and actionable insights.
- BigQuery’s unified data platform now empowers you to analyze structured, unstructured, and open-format data seamlessly using SQL, Spark, and Vertex AI integrations. Our support for the Iceberg open format extends to automatically optimizing your data for price-performance. In addition, we are announcing the general availability of the Delta format, expanding the list of open data formats natively supported.
- Support for open-source Apache Spark and Apache Kafka data streaming and processing, enabling you to use serverless Spark to process data in your platform with speed, versatility, and scalability, and to use Apache Kafka to extract real-time insights and act on them immediately.
- Real-time streaming from Analytics Hub, which lets you subscribe to real-time data feeds within Analytics Hub and manage access to shared topics at scale, helping to ensure data security and compliance. This is possible through enhanced data publisher capabilities and Pub/Sub streams via Analytics Hub. In addition, Analytics Hub integration with Google Marketplace means easy commercialization, with an AI and data marketplace where commercial assets like datasets, models, and streams can be purchased and sold to support faster decisions using up-to-date information.
- A data migration customer offer and enhanced data migration services tooling to help you de-risk migrations, providing upfront credits to cover system costs and egress charges, as well as funding to cover migration services costs. In addition, our migration services tooling accelerates time-to-value of your data workloads and helps you save on migration and egress costs.
“Google Cloud continues to strengthen its AI-ready data ecosystem. Gemini integration is an example of the gen AI augmentation we’re seeing that will drive innovation and enhance use cases for data teams and information workers. Platform unification, like the innovations we’re seeing with BigQuery, will make things simpler and easier for customers looking at data platform migrations.” – Doug Henschen, VP & Principal Analyst, Constellation Research, Inc.
Gemini data agents for enhanced productivity
Gemini in BigQuery delivers AI-powered experiences such as data preparation, exploration and analysis, governance and security throughout the data journey, as well as intelligent recommendations to enhance user productivity and optimize costs.
Building on our announcements at Next ’24, new features including data insights and data preparation are coming to preview, and previously announced Gemini in BigQuery features are moving to general availability, including code assistance for SQL and Python, data canvas, and partitioning and clustering recommendations.
SQL code generation using Gemini in BigQuery
Indonesian fintech provider Julo uses Gemini in BigQuery to help boost their efficiency when making SQL templates, delivering a better workflow.
“Gemini in BigQuery has transformed our query generation process. The integration into BigQuery makes it easy to generate SQL templates and has helped boost the efficiency of our label and feature engineering, including crucial machine-learning model-monitoring queries. Gemini’s ability to understand complex data structures and deliver accurate queries has made our workflow smoother and faster than ever.” – Martijn Wieriks, Chief Data Officer, Julo
Wunderkind, a global performance marketing solution, uses data canvas with Gemini in BigQuery for investigation and exploration to help simplify visibility into queries, saving time and capacity of the data team.
“For any sort of investigation or exploratory exercise you know will result in multiple queries, there really is no replacement. It’s saved us so much time and mental capacity.” – Scott Schaen, VP of Analytics, Wunderkind
Slide Generation using Gemini in Looker
With Gemini in Looker capabilities such as formula assistant and slide generation, now available in preview, information workers can chat with their data. Now you can create calculated fields on-the-fly without having to remember complicated formulas. Automatic slide generation helps create impactful presentations with insightful text summaries of your data.
Conversational Analytics is transforming the way organizations interact with their data. Imagine simply chatting with your data, getting immediate, actionable insights. This is a game-changer for analysts, liberating them from the endless cycle of report creation and empowering business users with true self-service.
This approach goes beyond basic question-and-chart interactions. Advanced language models are leveraged to guide users through their data, offer summaries, and even surface automated insights – ensuring crucial information doesn’t slip through the cracks.
We are committed to integrating Conversational Analytics across the entire portfolio. Looker customers will be able to start in a standalone chat client and, for more flexibility, transition to Looker Studio for fine-tuned dashboarding and reporting.
Simplicity with an AI-ready, unified data platform
BigQuery helps you get all your data ready for AI. At Google Cloud Next ’24 we announced that BigQuery would be the single integrated platform for your data-to-AI journey, designed to be multimodal, multi-engine, and multicloud.
To help, we’ve built a first-party connection from BigQuery to Vertex AI to provide direct access to AI models and fine-tune LLMs using your enterprise data, helping to ensure greater consistency and accuracy for your models. We also introduced new query capabilities using vector indexing directly in BigQuery to leverage AI with your data where it is stored. BigQuery vector indexes now support Google’s ScaNN algorithm for efficient batch vector search. And recently, we added support for the latest Gemini models to BigQuery, as well as safety enhancements and grounding support.
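To make the idea of vector search concrete, here is a minimal Python sketch of what such a query computes: given a query embedding, find the stored embeddings closest to it. This is an exact, brute-force version for illustration only. It is not ScaNN and not BigQuery's implementation; a vector index exists precisely to avoid scoring every row like this. The row ids and the three-dimensional embeddings are made up for the example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: near 0.0 for similar directions, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Exact search: score every row against the query, keep the k closest ids."""
    scored = [(cosine_distance(query, embedding), row_id) for row_id, embedding in rows]
    scored.sort()
    return [row_id for _, row_id in scored[:k]]


# Hypothetical embeddings keyed by a made-up document id (assumption, not real data).
ROWS = [
    ("invoice", [0.9, 0.1, 0.0]),
    ("receipt", [0.8, 0.2, 0.1]),
    ("holiday_photo", [0.0, 0.1, 0.9]),
]

nearest = top_k([1.0, 0.0, 0.0], ROWS, k=2)
print(nearest)
```

The brute-force scan costs one distance computation per stored row, which is why approximate indexes matter once a table holds millions of embeddings.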
“Switching to BigQuery has transformed our ability to access, understand, and use data at Veo. With its direct integration into other Google Cloud solutions, we get greater accessibility and faster insights, unlocking significant impact and enablement across our organization. Even non-technical team members feel comfortable using BigQuery to run ad-hoc analyses themselves, freeing up time for our analytics team to work on high-value projects.” – Max Schuman, Senior Data Scientist, Veo
Macquarie Bank is co-innovating with Google Cloud on the next wave of digital banking services for its customers in Australia. By unifying all of its data, the bank has made it easier to connect with the latest AI technologies, including gen AI, to enable scale and build new ways for customers to interact with its financial services.
More and more customers rely on BigQuery to manage unstructured data, including images, audio, and video, via object tables, usage of which has grown over 600% YoY. BigLake, the open storage engine for BigQuery, lets customers analyze open structured and multimodal data on one platform and, together with a fully managed Apache Iceberg experience, build a fully managed, streaming, and AI-optimized open lakehouse. At Next ’24 we announced foundational capabilities for BigLake to further enable an open multiformat and multimodal platform, including a single runtime metastore that interoperates across multiple engines and open-format table types such as Parquet and Apache Iceberg. In addition, we are adding support for more open formats like Delta Lake, which is now GA. Finally, workflow and scheduling enhancements that are now generally available in BigQuery give data teams more automation over their data pipelines.
“Deutsche Telekom built a horizontally scalable data platform in an innovative way that was designed to meet our current and future business needs. With BigQuery at the center of our enterprise’s One Data Ecosystem, we created a unified approach to maintain a single source of truth while fostering de-centralized usage of data across all of our data teams.” – Ashutosh Mishra, VP of Data Architecture, Deutsche Telekom
Data processing and streaming made easy
Apache Spark has become a popular data-processing runtime, especially for data engineering tasks. In fact, customers’ use of serverless Apache Spark in Google Cloud increased by over 500% in the past year. BigQuery’s newly integrated Spark engine lets you process data using PySpark as you do with SQL. Like the rest of BigQuery, the Spark engine is completely serverless — no need to manage compute infrastructure.
For many companies, running Apache Kafka meant managing many clusters across multiple cloud vendors and on-premises distributions. We’ve heard from many customers that they would like a simpler way to run an Apache Kafka cluster on Google Cloud. Today, you can turn one up in any of your projects with Google Cloud Managed Service for Apache Kafka directly in the Google Cloud console. This managed service helps automate Kafka operations and security while providing customers with the ability to run streaming analytics at scale and integrate them with user-facing operational systems.
Streaming data is also important across multiple industries that need to share real-time data with their partners and customers. For example, a retailer may want to share inventory levels in real time with Consumer Packaged Goods (CPG) enterprises to provide real-time fulfillment visibility. To help organizations easily share and monetize their data from BigQuery in real time, we are announcing the preview of Pub/Sub topics sharing in Analytics Hub. Pub/Sub is a globally used messaging service for reliable, large-scale data streaming. Analytics Hub, BigQuery’s data exchange platform, is used by thousands of companies to securely share hundreds of petabytes across organizational boundaries in a given week, with zero copying at scale. This new integration enables curated sharing of streaming data, centralized management of data access, and the discovery of valuable data from other organizations, all in real time.
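As a rough mental model of curated topic sharing, the retailer example can be sketched in a few lines of Python. This is a toy, in-memory stand-in, not the Pub/Sub or Analytics Hub API: the `SharedTopic` class, its methods, and the subscriber names are all hypothetical. It shows only the two ideas in the paragraph above: the publisher controls who may subscribe, and every approved subscriber receives its own copy of each message.

```python
class SharedTopic:
    """Toy model of a shared streaming topic (hypothetical, not a Google Cloud API)."""

    def __init__(self, name):
        self.name = name
        self._allowed = set()   # subscribers the publisher has approved
        self._inboxes = {}      # one message queue per active subscriber

    def grant(self, subscriber):
        """Publisher side: approve a subscriber for this topic."""
        self._allowed.add(subscriber)

    def subscribe(self, subscriber):
        """Subscriber side: only approved subscribers get an inbox."""
        if subscriber not in self._allowed:
            raise PermissionError(f"{subscriber} has no access to {self.name}")
        self._inboxes.setdefault(subscriber, [])

    def publish(self, message):
        """Fan out: append the message to every active subscriber's inbox."""
        for inbox in self._inboxes.values():
            inbox.append(message)

    def pull(self, subscriber):
        """Return and clear the messages waiting for one subscriber."""
        inbox = self._inboxes[subscriber]
        messages = list(inbox)
        inbox.clear()
        return messages


# A retailer shares inventory levels with one approved CPG partner.
topic = SharedTopic("inventory-levels")
topic.grant("cpg-partner-a")
topic.subscribe("cpg-partner-a")
topic.publish({"sku": "SKU-1", "on_hand": 42})
received = topic.pull("cpg-partner-a")
print(received)
```

A real deployment would add durable delivery, acknowledgements, and identity-based access control, all of which this sketch leaves out.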
Accelerate your data journey to the cloud for AI readiness
Moving your data to the cloud is perhaps the number one way to prepare for the AI era. To help organizations accelerate time-to-value with data and AI, we are excited to introduce a data platform migration incentive program targeted at data warehouses and data lakes, plus enhanced BigQuery migration services tooling. Now, it’s easier than ever to migrate all types of data and workloads — from multimodal data to SQL, Spark, and Python — to Google Cloud. Get started on your Data Cloud journey today — take the data strategy assessment and connect with our team to accelerate your path to innovation using your enterprise data for AI.