Google issues an invitation to fuel creativity
Source: Google blog post. Google has introduced its newest generative media models at Google I/O.
Google has introduced its latest video and image generation models, Veo 3 and Imagen 4.
Veo 3 introduces the ability to generate videos with audio. It improves on Veo 2 in text and image prompting, real-world physics and lip-syncing accuracy.
"It’s great at understanding; you can tell a short story in your prompt, and the model gives you back a clip that brings it to life," said Google DeepMind VP Eli Collins in a blog post.
Veo 3 is available today in the US, and also to enterprise users on Vertex AI.
Imagen 4 offers finer detailing for images of intricate fabrics, water droplets, and animal fur, and excels in both photorealistic and abstract styles, Collins said. "Imagen 4 can create images in a range of aspect ratios and up to 2K resolution - even better for printing or presentations. It is also significantly better at spelling and typography, making it easier to create your own greeting cards, posters and even comics," he shared.
Imagen 4 is already available in the Gemini app, Whisk, Vertex AI and across Google Slides, Vids, Docs and more in Workspace. A variant of Imagen 4 that is up to 10x faster than Imagen 3 is coming soon, Collins said.
The company is also expanding access to Lyria 2, giving musicians more tools to create music.
Visual storytellers are invited to try Flow, a new AI film-making tool. Flow allows users to describe envisioned shots while managing cast, locations, objects and styles in one place.
Flow is currently available in the US, with more countries coming soon.
Also new is Google Beam, an AI-first video communications platform. Beam transforms 2D video streams into a realistic 3D experience using six cameras and AI. "It has near perfect head tracking, down to the millimetre, and at 60 frames per second, all in real-time. The result is a much more natural and deeply immersive conversational experience," said Sundar Pichai, CEO of Alphabet and Google, in a May 20 blog post about Google I/O. The first Google Beam devices will be available to early customers later in 2025, in collaboration with HP.
Pichai also commented on the rate at which Google is rolling out new products with significant improvements.
"Model progress is enabled by our world-leading infrastructure. Our seventh-generation TPU, Ironwood, is the first designed specifically to power thinking and inferential AI workloads at scale. It delivers 10 times the performance over the previous generation, and packs an incredible 42.5 exaflops compute per pod — just amazing," he said.
"Our infrastructure strength, down to the TPU, is what helps us deliver dramatically faster models, even as model prices are coming down significantly. Over and over, we've been able to deliver the best models at the most effective price point. Not only is Google leading the Pareto Frontier, we’ve fundamentally shifted the frontier itself."
Pichai shared advances in usage:
In 2024, 9.7 trillion tokens were processed a month across Google's products and APIs.
In 2025, that figure has jumped 50x to over 480 trillion tokens a month.
Over 7 million developers are building with Gemini in 2025, 5x more than a year ago.
Gemini usage on Vertex AI is up 40 times.
The Gemini app now has over 400 million monthly active users.
"We are seeing strong growth and engagement particularly with the 2.5 series of models. For those using 2.5 Pro in the Gemini app, usage has gone up 45%," Pichai said in the blog post. He also disclosed that Gemini 2.5 Pro has swept the LMArena leaderboard in all categories.
"What all this progress means is that we’re in a new phase of the AI platform shift. Where decades of research are now becoming reality for people, businesses and communities all over the world," Pichai said.
The Pareto Frontier is about trade-offs. It describes the set of solutions for which no objective, such as price, can be improved without making another, such as performance, worse. A Pareto improvement is a change that makes a solution better on at least one objective without making it worse on any other. TPU stands for tensor processing unit.
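To make the idea concrete, here is a minimal Python sketch of a price-versus-quality frontier. The model names, prices and quality scores are hypothetical values chosen purely for illustration; they are not figures from Google or from Pichai's post. A model sits on the frontier if no other model is at least as cheap and at least as good while being strictly better on one of the two measures.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float    # hypothetical cost per unit of work; lower is better
    quality: float  # hypothetical benchmark score; higher is better


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.price <= b.price and a.quality >= b.quality
    strictly_better = a.price < b.price or a.quality > b.quality
    return no_worse and strictly_better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model dominates, cheapest first."""
    frontier = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(frontier, key=lambda m: m.price)


# Hypothetical data, for illustration only.
models = [
    Model("A", price=1.0, quality=60.0),
    Model("B", price=2.0, quality=80.0),
    Model("C", price=3.0, quality=75.0),  # dominated by B: pricier and worse
    Model("D", price=5.0, quality=90.0),
]

frontier_names = [m.name for m in pareto_frontier(models)]
print(frontier_names)  # ['A', 'B', 'D']

# A new model that is both cheaper and better than B and D moves the frontier.
shifted = models + [Model("E", price=1.5, quality=95.0)]
shifted_names = [m.name for m in pareto_frontier(shifted)]
print(shifted_names)  # ['A', 'E']
```

In this toy data, adding model E removes B and D from the frontier, which is roughly what it means to shift the frontier rather than simply sit on it.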
