Omni Mode launches with Google I/O · May 20

One studio for every Gemini modality.

Edit images with Nano Banana 2. Generate cinematic clips with Veo 3.1 — picture and native spatial audio in one model pass, character consistency across scenes. Talk live to Gemini. Drop in a 1,500-page PDF and ask anything.
One quiet, opinionated studio — no credit grind, no GCP account, no markup theatre.

Built by an ex-Hugging Face engineer in Berlin. Pay-per-use, server-side proxied, transparent about which Gemini model runs each tool.

What is GeminiOmni?

An independent, opinionated front-end for the parts of Google Gemini that are worth using. Four focused tools, each pinned to the cheapest model that does the job — and one waiting room for Gemini Omni itself.

Omni-modal, but honest

Gemini already covers image, video, audio, and 1M-token text. We surface those modalities cleanly instead of dressing them up as something they aren’t.

Pay per use, no credit games

Every generation has a published cents-level cost. No “50 credits / 65 credits” puzzles, no auto-renewing bundles you forget to cancel.

Server-side proxied keys

Your prompts hit our API. Our API hits Google. No client-side keys, so no surprise $80k bills like the ones that followed the Gemini API key leaks of early 2026.
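For the curious, the proxy pattern described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GeminiOmni's actual backend: the function names, the `GEMINI_API_KEY` variable name, and the choice of model in the URL are all hypothetical. The point it shows is that the browser sends only a prompt, and the key is attached on the server.

```python
import json
import os
import urllib.request

# Assumed upstream: the public Gemini REST endpoint. The model in the path
# is an illustrative choice, not a statement about what the studio calls.
UPSTREAM_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.5-flash:generateContent"
)


def build_upstream_request(client_body: dict, api_key: str) -> urllib.request.Request:
    """Turn a browser payload into an upstream request, adding the key server-side."""
    prompt = client_body.get("prompt", "")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    # Only the prompt is copied over; nothing else from the client is trusted.
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        UPSTREAM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def forward(client_body: dict) -> dict:
    """Send the request upstream; the key is read from the server environment."""
    api_key = os.environ["GEMINI_API_KEY"]  # hypothetical variable name
    request = build_upstream_request(client_body, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Because the key travels in a header set on the server, it never appears in page source, in the request body, or in anything the browser can inspect.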

Built by one person, in public

Made by Lena Hoffmann, ex-Hugging Face. Build notes, model choices, and price deltas are all in the blog, so you can audit how every euro is spent.

Why GeminiOmni vs. the other wrappers

The Gemini wrapper space is crowded but lazy. Here’s where we draw the line.

Each tool tells you exactly which Gemini model runs underneath (Gemini 3.1 Flash Image Preview, Veo 3.1 Fast, Gemini 3.1 Flash Live, Gemini 2.5 Flash). No ‘powered by AI’ fog.

Tools

Four tools today. A fifth one the moment Google ships the API for it.

01.

Banana Studio — Chat-based image editor

Upload a photo, describe the change, get a perfectly consistent edit. Powered by Gemini 3.1 Flash Image Preview (Nano Banana 2). Native 2K, optional 4K upscale.

02.

Omni Reel — Text & image to video

8-second cinematic clips with native synced audio. Default tier uses Veo 3.1 Fast at $0.15/sec. Templates for LinkedIn intros, TikTok hooks, product demos.

03.

Live Booth — Real-time voice with Gemini

60-second demo of Gemini 3.1 Flash Live. No sign-up. Speak, hear it answer in real time. The first consumer surface for Google's realtime voice model.

04.

Long Read — 1M-context PDF & doc QA

Drop a 1,500-page PDF, ask anything, get answers with page citations. Gemini 2.5 Flash — no chunking, no RAG plumbing, just answers.

05.

Omni Mode — waitlist (May 20)

Google is expected to unveil Gemini Omni at I/O on May 20. We integrate the same day. Drop your email and we'll ping you the moment the API opens.

06.

Build notes — model picks, price deltas, gotchas

Every release of the studio comes with a short post: which model we tried, what it cost per call, why we kept or swapped it. Skip the marketing, read the receipts.
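As a worked example of the per-call arithmetic, take the Omni Reel figures quoted above: an 8-second clip at $0.15 per second. The small sketch below assumes cost scales linearly with clip length at that quoted rate; the function name is illustrative, and any studio margin on top of the rate is not modeled.

```python
from decimal import Decimal

# Rate quoted in the Omni Reel description (Veo 3.1 Fast, default tier).
VEO_FAST_USD_PER_SECOND = Decimal("0.15")


def clip_cost_usd(seconds: int, rate: Decimal = VEO_FAST_USD_PER_SECOND) -> Decimal:
    """Cost of one clip, assuming a flat per-second rate (an assumption)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Decimal keeps the result exact to the cent, unlike binary floats.
    return rate * seconds


print(clip_cost_usd(8))  # prints 1.20
```

At that rate a single default-length clip comes to $1.20.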

Plain-English FAQ

Anything else? Email lena@geminiomni-ai.com.

Pick a tool. Make something.

No sign-up to start. Pay only for what you actually generate.
