AI For Everyone

Google Gemini Embeddings | Multimodal Processing & Insights

March 11, 2026

Google has introduced a new version of its Gemini Embedding model, now with multimodal embeddings!

The new model can natively process videos up to 2 minutes long, handle multiple PDF pages, and process audio alongside text. It is available both in the free tier and via the paid API. The embeddings are structured like a nesting doll: a shorter leading slice of the full vector can be used as an embedding on its own, although it is less precise.
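
The nesting-doll idea can be illustrated with a minimal sketch. It assumes the usual pattern for nested embeddings: keep the first few components of a vector and rescale the result to unit length. The vectors below are made-up toy values, not model output, and `truncate_embedding` and `cosine` are hypothetical helper names, not part of any Google API.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in prefix]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors (illustrative values only, not real embeddings).
doc = [0.9, 0.1, 0.3, -0.2, 0.05, 0.4, -0.1, 0.2]
query = [0.8, 0.2, 0.25, -0.1, 0.3, -0.2, 0.1, 0.0]

# Similarity with the full vectors versus with only the first 4 components.
full_score = cosine(doc, query)
short_score = cosine(truncate_embedding(doc, 4), truncate_embedding(query, 4))

print(f"full (8 dims):  {full_score:.3f}")
print(f"short (4 dims): {short_score:.3f}")
```

The shorter vectors take less storage and are cheaper to compare, but the two scores generally differ, which is the loss of precision mentioned above.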

Unfortunately, Google’s service prices have risen again. Text processing now costs about $0.20 per million tokens, while the price for multimodal data has increased significantly: video processing, for example, now costs $12 per million tokens (approximately 15,000 frames). Google is taking advantage of the fact that it has few competitors in this segment, since other major companies have yet to ship comparable updates. OpenAI, for instance, last updated its embedding models in January 2024, in the same release that improved GPT-3.5 Turbo and GPT-4 Turbo.
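
To get a feel for these numbers, here is a rough cost estimate in Python. It uses only the figures quoted above ($0.20 per million text tokens, $12 per million video tokens, roughly 15,000 frames per million video tokens). The function names are hypothetical, and the sampling rate of one frame per second in the example is an assumption, not something Google has confirmed here.

```python
# Prices as quoted in this post; check current pricing before relying on them.
TEXT_USD_PER_MILLION_TOKENS = 0.20
VIDEO_USD_PER_MILLION_TOKENS = 12.00
FRAMES_PER_MILLION_VIDEO_TOKENS = 15_000  # approximate figure from the post


def text_cost_usd(tokens):
    """Estimated cost of embedding `tokens` text tokens."""
    return tokens / 1_000_000 * TEXT_USD_PER_MILLION_TOKENS


def video_cost_usd(frames):
    """Estimated cost of embedding `frames` video frames."""
    tokens = frames * 1_000_000 / FRAMES_PER_MILLION_VIDEO_TOKENS
    return tokens / 1_000_000 * VIDEO_USD_PER_MILLION_TOKENS


# Assumption: a 2-minute clip sampled at 1 frame per second gives 120 frames.
clip_frames = 2 * 60 * 1

print(f"1M text tokens: ${text_cost_usd(1_000_000):.2f}")
print(f"2-minute clip:  ${video_cost_usd(clip_frames):.3f}")
```

Under these assumptions a single 2-minute clip costs about ten cents, so the video price mostly matters once a library runs to thousands of clips.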

All of this matters because there are few widely available alternatives on the market.
