Audio
Gemini 3.1 Flash TTS
Convert text to natural speech using Google's Gemini 3.1 Flash. Supports 30 voices, 24 languages, and inline audio tags for emotion control. Outputs mono MP3 at 24 kHz.
Generate expressive, natural-sounding audio from text using Google's Gemini 3.1 Flash TTS. The model accepts up to 5,000 characters of input and offers 30 built-in voices across 24 language locales. For dialogue scenes, configure two simultaneous speakers with multiSpeakerConfig. Control emotion and pacing with 200+ inline audio tags, such as [enthusiasm] or [whispers], placed directly in your text. Output is mono MP3 at 24 kHz. Ideal for game dialogue, voiceovers, audiobooks, e-learning narration, and multilingual content.
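As a rough illustration of the limits described above, here is a minimal Python sketch that checks a script against the 5,000-character limit and the two-speaker limit before assembling a request body. Only multiSpeakerConfig and the tags [enthusiasm] and [whispers] come from the description. The function name `build_tts_request`, the other field names (`text`, `outputFormat`, `speaker`, `voice`), the "Name: line" script layout, and the voice names are assumptions made for illustration, so check the provider's API reference for the real request schema.

```python
import json

MAX_CHARS = 5000     # input limit stated in the description
MAX_SPEAKERS = 2     # multiSpeakerConfig covers two simultaneous speakers


def build_tts_request(text, speakers=None):
    """Validate the input and return a request payload as a plain dict.

    Only `multiSpeakerConfig` is named in the description; every other
    field name below is a hypothetical placeholder.
    """
    if not text:
        raise ValueError("text must not be empty")
    if len(text) > MAX_CHARS:
        raise ValueError(f"text is {len(text)} characters; limit is {MAX_CHARS}")

    payload = {"text": text, "outputFormat": "MP3"}  # hypothetical fields

    if speakers:
        if len(speakers) > MAX_SPEAKERS:
            raise ValueError(f"at most {MAX_SPEAKERS} speakers are supported")
        # Hypothetical shape: one entry per speaker label used in the script.
        payload["multiSpeakerConfig"] = [
            {"speaker": name, "voice": voice} for name, voice in speakers.items()
        ]
    return payload


# Inline audio tags go directly in the text, ahead of the words they affect.
script = (
    "Ana: [enthusiasm] We finally shipped the update!\n"
    "Ben: [whispers] Keep it down, the demo is still running."
)

# Voice names are placeholders; pick real ones from the 30 built-in voices.
request = build_tts_request(script, {"Ana": "Kore", "Ben": "Puck"})
print(json.dumps(request, indent=2))
```

Validating locally before sending avoids spending a request on text that would be rejected for length. The sketch stops at building the payload because the endpoint and authentication details are not given here.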