Audio
Seed Audio 1.0
Generate expressive speech with a prompt-defined voice (age, accent, mood, character...) plus optional audio/image references and controls for speed, pitch, or volume.
Seed Audio 1.0 by ByteDance turns text into expressive speech that sounds recorded rather than synthesized, with the voice defined however you like: describe it in plain language (age, accent, mood, or character), clone it from up to three short audio clips, or derive it from a single reference image. Built-in emotional range, natural pauses, and emphasis keep long scripts consistent from the first word to the last. Fine-tune speech rate, pitch, loudness, and sample rate, then synthesize across multiple languages and accents.