Model Updates · via Google Blog

Google Releases Gemini Embedding 2: One Model for Text, Images, Video and Audio

Google has launched Gemini Embedding 2, its first model to embed text, images, video, audio, and PDFs into a single shared vector space -- simplifying AI-powered search and retrieval at scale.

20 March 2026

What happened

Google has released Gemini Embedding 2, a multimodal embedding model that unifies five different content types -- text, images, video, audio, and PDFs -- into a single shared vector space.

In plain English: previously, if you wanted to build a search system across different content types, you needed separate AI models to handle each one. Gemini Embedding 2 handles all of them with one model, making cross-media search far simpler to build.
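To make the "shared vector space" idea concrete, here is a minimal Python sketch of cross-media search. It does not call Gemini Embedding 2 or any Google API. The file names and three-number vectors are made up for illustration, standing in for the much longer vectors a real embedding model would return. The point is only that once every item lives in the same space, one similarity function can rank a photo, a PDF, and a video against a text query.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical library: made-up vectors standing in for real embeddings
# of an image, a PDF, and a video stored in the same vector space.
library = {
    "kitchen_photo.jpg": [0.9, 0.1, 0.0],
    "lease_agreement.pdf": [0.1, 0.9, 0.1],
    "viewing_tour.mp4": [0.7, 0.2, 0.3],
}

def search(query_vector, items, top_k=2):
    """Rank every item by similarity to the query, regardless of media type."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in items.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Made-up vector standing in for the embedding of a text query
# such as "modern kitchen".
query = [0.8, 0.1, 0.1]
results = search(query, library)
for name, score in results:
    print(f"{name}: {score:.2f}")
```

With these toy numbers, the photo and the video rank above the PDF. In a real system the vectors would come from the embedding model and would usually be stored in a vector database instead of a dictionary.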

Technical specs:

  • Handles up to 8,192 text tokens (roughly 6,000 words)
  • Supports up to six images per request
  • Handles 120-second video clips
  • Processes native audio without needing transcription first
  • Tops leading benchmarks on retrieval and search tasks, according to Google
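For teams planning an integration, the limits listed above can be checked before any content is sent. The sketch below is a hypothetical pre-flight check in Python. The function name and its parameters are illustrative assumptions and are not part of any Google SDK. Only the three numeric limits come from the specs above.

```python
# Limits as reported in the specs above.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES_PER_REQUEST = 6
MAX_VIDEO_SECONDS = 120

def check_request(text_tokens=0, image_count=0, video_seconds=0):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: the parameter names are assumptions made for
    this illustration, not fields of a real API request.
    """
    problems = []
    if text_tokens > MAX_TEXT_TOKENS:
        problems.append(
            f"text has {text_tokens} tokens; limit is {MAX_TEXT_TOKENS}"
        )
    if image_count > MAX_IMAGES_PER_REQUEST:
        problems.append(
            f"{image_count} images; limit is {MAX_IMAGES_PER_REQUEST}"
        )
    if video_seconds > MAX_VIDEO_SECONDS:
        problems.append(
            f"video is {video_seconds}s; limit is {MAX_VIDEO_SECONDS}s"
        )
    return problems

# Example: a request with too many images and an over-long video.
issues = check_request(text_tokens=5000, image_count=8, video_seconds=150)
for issue in issues:
    print(issue)
```

Content that exceeds a limit would need to be split up first, for example by cutting a long video into clips of 120 seconds or less and embedding each clip separately.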

Google also updated Gemini capabilities across Workspace -- Docs, Sheets, Slides, and Drive -- in a separate beta release, with agents now able to handle tasks within apps.

What this means for your business

If you run any kind of content-heavy operation -- a knowledge base, a product catalogue, a media library, or a document archive -- this matters.

Until now, building AI-powered search across mixed content types (say, a library of PDFs, images, and videos) required significant technical effort. Gemini Embedding 2 is designed to collapse that complexity into a single API call.

Practical applications:

  • Estate agents: Search a property library by description, photo, or floor plan
  • Retailers: Let customers search your catalogue by uploading a photo of what they want
  • Professional services: Build a knowledge base that searches across PDFs, recordings, and notes simultaneously
  • Marketers: Find relevant assets across a mixed media library without manual tagging

This is the kind of capability that, 18 months ago, would have required a significant AI engineering budget. In 2026, it is accessible via a standard API.
