Google I/O 2025: From research to reality

Written by
Audrey Miles
Updated on: June 20, 2025
Recommendation

Google I/O 2025 offers a glimpse of AI's rapid progress and the application scenarios it is opening up.

Key content:
1. Google CEO Sundar Pichai on the rapid development and iteration of AI models
2. The performance of the TPU Ironwood infrastructure and its contribution to model progress
3. The explosive growth of AI adoption worldwide and new progress on Google Beam

Yang Fangxian, Founder of 53A and Most Valuable Expert of Tencent Cloud (TVP)


Editor’s Note: The following is an edited transcript of Google CEO Sundar Pichai’s remarks at Google I/O 2025, expanded to include more of what was announced on stage.


Normally, we don’t reveal much in the weeks leading up to I/O, because we save our biggest models for the conference. But in the Gemini era, we’re just as likely to ship our smartest model on a Tuesday in March, or to announce an exciting breakthrough like AlphaEvolve a week in advance.


We want to get our best models to you, and into our products, as quickly as possible, so we’re shipping faster than ever before.



Continuous model iteration


I am particularly excited by the pace of model progress. Elo scores, a key measure of model performance, have climbed more than 300 points since our first-generation Gemini Pro. Today, Gemini 2.5 Pro ranks first in every category of the LMArena leaderboard.
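For readers unfamiliar with the metric, a quick aside that is not from the keynote: in the Elo system used by leaderboards like LMArena, a rating gap translates directly into an expected head-to-head win rate, which is what makes a 300-point gain concrete. A minimal sketch of that conversion, using the standard Elo formula:

```python
# Standard Elo expected-score formula (illustrative background only;
# not code from Google or LMArena).
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B in a pairwise vote."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# A 300-point gap implies model A wins roughly 85% of head-to-head votes.
print(f"{elo_expected_score(1600, 1300):.3f}")  # ~0.849
```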


The model’s progress is made possible by our world-leading infrastructure. Ironwood, our seventh-generation TPU, is the first designed specifically to power thinking and reasoning AI workloads at scale. It delivers 10 times the performance of the previous generation, with each pod (a unit of compute) packing an incredible 42.5 exaflops of computing power. It’s simply amazing.


Our end-to-end infrastructure strength, reaching all the way down to the TPUs, is what has helped us deliver faster, better-performing models even as model prices have dropped significantly. Time and again, we have delivered the best models at the most cost-effective price points. Google is not only staying ahead on the Pareto frontier, but fundamentally expanding it.



The world is embracing AI


More intelligence is becoming accessible to everyone, everywhere, and the world is responding by embracing AI at an unprecedented pace. Here are some important signs of that progress:


  • Last year at this time, we were processing 9.7 trillion tokens per month across our various products and APIs. Now, we’re processing over 480 trillion tokens per month, nearly a 50-fold increase.

  • More than 7 million developers are building with Gemini, five times more than last year, and Vertex AI usage has grown 40 times.

  • The Gemini app now has over 400 million monthly active users. We are seeing strong growth and high engagement with the 2.5 series models in particular, with 2.5 Pro usage in the Gemini app growing by 45%.


From research to reality


All of these advances suggest that we are in a new phase of the AI platform shift. This means that decades of research are now becoming a reality for people, businesses, and communities around the world.


Project Starline → Google Beam + Voice Translation


A few years ago at I/O, we debuted Project Starline, a breakthrough 3D video technology designed to create the feeling of being in the same room even when people are thousands of miles apart.


We continue to make technological advances. Today, we’re introducing the next chapter: Google Beam, a new AI-first video communications platform. Beam uses state-of-the-art video models, an array of six cameras, and AI to transform 2D video streams into a realistic 3D experience, fusing the streams to present the user on a 3D light field display. It enables near-perfect head tracking, accurate to the millimeter, and renders in real time at 60 frames per second. The result is a more natural, immersive conversational experience. In partnership with HP, the first Google Beam devices will be available to early customers later this year.



Over the years, we’ve also been creating more immersive experiences in Google Meet, including speech translation that helps people break down language barriers. It matches the speaker’s voice, intonation, and even expressions in near real time, making cross-language communication more natural and fluid. Translation between English and Spanish is rolling out in beta to Google AI Pro and Ultra subscribers, with more languages coming in the weeks ahead. This year, the feature will also be available for early testing to Workspace enterprise customers.


Project Astra → Gemini Live


Another exciting research project that debuted at I/O is Project Astra, which explores a general AI assistant that can understand the world around it. Now, Gemini Live has integrated Project Astra's camera and screen sharing capabilities. People are using it in interesting ways, from interview preparation to marathon training. This feature is available to all Android users and will start rolling out to iOS users today.


We'll also bring these capabilities to products like Search.


Project Mariner → Agent Mode


We think of agents as systems that combine the intelligence of advanced AI models with the power of tool invocation so they can perform actions on your behalf, under your control.


Project Mariner is our early research prototype of computer-use capabilities: an agent that can interact with the web and complete tasks for you. We released it last December, and we’ve made a lot of progress since, adding new multitasking capabilities and a method called “teach and repeat,” where you show it a task once and it learns to plan similar tasks in the future. We’re bringing Project Mariner’s computer-use capabilities to developers through the Gemini API. Trusted testers like Automation Anywhere and UiPath have already started building with it, and it will be available more broadly this summer.


A thriving agent ecosystem will require a broader set of tools, and computer use is just one piece of it.


For example, our open Agent2Agent protocol enables agents to talk to one another, and the Model Context Protocol (MCP), introduced by Anthropic, lets agents access other services. Today, we are happy to announce that our Gemini API and SDK are now compatible with MCP tools.
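As a concrete illustration, here is a minimal sketch of what that compatibility looks like from the Python google-genai SDK, which can accept a live MCP client session as a tool. The stdio server command and model name below are placeholders of my choosing, not anything announced on stage:

```python
# Minimal sketch: calling Gemini with tools served by an MCP server.
# "example-mcp-server" and the model name are placeholders; substitute
# whatever MCP server and Gemini model you actually use.
import asyncio

from google import genai
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# Launch any stdio-based MCP server as a subprocess (hypothetical server).
server_params = StdioServerParameters(command="npx", args=["-y", "example-mcp-server"])

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Pass the live MCP session as a tool; the SDK forwards the
            # model's tool calls to the server and returns the final text.
            response = await client.aio.models.generate_content(
                model="gemini-2.5-flash",
                contents="List the tools you have available, then use one.",
                config=genai.types.GenerateContentConfig(tools=[session]),
            )
            print(response.text)

asyncio.run(main())
```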


We’re also starting to bring agentic capabilities to Chrome, Search, and the Gemini app. For example, the new Agent Mode in the Gemini app will help you get more done. If you’re looking for an apartment, it can find listings that match your criteria on sites like Zillow, adjust filters, and use MCP to access listings and even schedule tours for you. An experimental version of Agent Mode in the Gemini app will be available to subscribers soon. It’s good for companies like Zillow too, bringing in new customers and improving conversion rates.


This is an emerging field, and we’re excited to explore how best to bring the benefits of artificial intelligence more broadly to users and the ecosystem as a whole.


The power of personalization


The best way to bring research to life is to make it useful in your own life. That’s the power of personalization, and we’re delivering it through something we call “personal context.” With your permission, Gemini models can use relevant context from across your Google apps in a way that is private, transparent, and fully under your control.


One example is our new personalized Smart Replies in Gmail. If a friend emails you asking for advice about a past trip you took, Gemini can search your past emails and files in Google Drive, like an itinerary you created in Google Docs, to suggest a reply with specific details. It will match your typical greeting, capturing your tone, style, and even your favorite words to create a reply that is more relevant and more like you. Personalized Smart Replies will be available to subscribers later this year. It’s easy to imagine how personal context will be useful in Search, Gemini, and more.


AI Mode in Search


Our Gemini models are helping make Google Search smarter, more agentic, and more personalized.


Since launching last year, AI Overviews has reached more than 1.5 billion users and is now available in 200 countries and territories. As people use AI Overviews, they become more satisfied with their results and search more often. In our largest markets, such as the United States and India, AI Overviews is driving over 10% growth in the types of queries that show them, and that growth continues over time.


This is undoubtedly one of the most successful launches in the past decade of search.


For those who want an end-to-end AI search experience, we’re launching the new AI Mode, a complete reinvention of search. With more advanced reasoning, AI Mode lets you ask longer and more complex queries; in fact, early testers are asking queries two to three times the length of traditional searches, and you can go even further with follow-up questions. All of this will be available directly in Search as a new tab.


I’ve been using it a lot, and it has completely changed how I use Search. I’m excited to announce that AI Mode is rolling out to all users in the US starting today. With our latest Gemini models, our AI responses deliver the quality and accuracy you’ve come to expect from Search, at the fastest speeds in the industry. Gemini 2.5 is also coming to Search in the US starting this week.


Advancing our smartest model yet: Gemini 2.5


Our powerful and efficient workhorse model, Gemini 2.5 Flash, is popular with developers for its speed and low cost. The new 2.5 Flash is better in nearly every dimension, with gains across key benchmarks for reasoning, multimodality, code, and long context. It ranks second only to 2.5 Pro on the LMArena leaderboard.


We are making 2.5 Pro even more powerful by introducing an enhanced reasoning mode we call Deep Think. It applies our latest cutting-edge research in thinking and reasoning, including parallel thinking technology.


More personalized, more proactive, more powerful

The Gemini app


We are making Deep Research more personal, allowing you to upload your own files, and soon connect Google Drive and Gmail, to enhance its customized research reports. We are also integrating it with Canvas, so you can create dynamic infographics, quizzes, and even podcasts in multiple languages with a single click. Beyond that, we are excited to see the widespread adoption of Canvas for vibe coding, which lets more people easily create functional applications simply by chatting with Gemini.


For Gemini Live, a feature that users love, we’re making camera and screen sharing free to everyone, including iOS users. Soon it will connect to your favorite Google apps for even more seamless help.


Our progress in generative media models


We launched Veo 3, our newest and most advanced video model, now with native audio generation capabilities, and Imagen 4, our newest and most powerful image generation model. Both models are available in the Gemini app, opening up a whole new world of creativity.


We’re also bringing these possibilities to filmmakers with a new tool called Flow, which lets you create movie clips and expand short clips into longer scenes.


Opportunities to improve lives


The opportunities presented by AI are truly profound. It will be up to our generation of developers, technologists, and problem solvers to ensure that its benefits reach as many people as possible. It’s especially exciting to think that the research we’re doing today, from robotics to quantum computing to AlphaFold to Waymo, will become the foundation of tomorrow’s reality.


These opportunities to improve lives are not abstract to me; I experienced one recently. I was in San Francisco with my parents, and the first thing they wanted to do was ride in a Waymo, which I learned is becoming one of the area’s most popular tourist attractions. I had ridden in Waymos before, but my father, who is in his 80s, was completely blown away, and in that moment I gained a new appreciation for this progress.


It reminded me of the incredible power technology has to inspire, astound, and propel us forward. I can’t wait to see what we’ll create together next.