
LTX-2.3 Pro Audio to Video


Generate video driven by an audio clip. Voice cadence controls pacing, and musical energy shapes motion. Output runs up to 20 seconds at 1080p with precise audio-visual sync.

Turn any audio clip into a synchronized video. Provide a 2- to 20-second audio file, optionally with a first-frame image or a text prompt. LTX-2.3 Pro uses a joint audio-video diffusion transformer that reads the audio's temporal structure to control motion timing, pacing, and emphasis. The result is a video whose visuals move with the sound, not just alongside it. Output runs up to 20 seconds at 1080p. The guidance scale controls how closely the result follows the prompt. Built for podcasts, voice-driven narratives, avatar animation, and audio-led creative production.
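The input constraints above can be checked before a clip is submitted. The sketch below is illustrative only: it assumes the audio is an uncompressed WAV file, and every function name and payload key (`build_request`, `audio`, `prompt`, `first_frame`, `guidance_scale`) is hypothetical, not the actual API schema. It also assumes the first-frame image and text prompt may be supplied together, which the description does not confirm.

```python
import io
import wave
from typing import Optional

MIN_AUDIO_SECONDS = 2.0   # lower bound stated in the description
MAX_AUDIO_SECONDS = 20.0  # upper bound stated in the description


def make_silent_wav(seconds: float, rate: int = 8000) -> bytes:
    """Build a mono 16-bit silent WAV clip in memory (test helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as clip:
        clip.setnchannels(1)
        clip.setsampwidth(2)
        clip.setframerate(rate)
        clip.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of an uncompressed WAV clip in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def build_request(
    wav_bytes: bytes,
    prompt: Optional[str] = None,
    first_frame: Optional[bytes] = None,
    guidance_scale: Optional[float] = None,
) -> dict:
    """Validate the clip length and assemble a hypothetical request payload."""
    duration = wav_duration_seconds(wav_bytes)
    if not MIN_AUDIO_SECONDS <= duration <= MAX_AUDIO_SECONDS:
        raise ValueError(
            f"audio must be {MIN_AUDIO_SECONDS:g} to {MAX_AUDIO_SECONDS:g} "
            f"seconds long, got {duration:.2f}"
        )

    # Hypothetical payload keys; optional inputs are included only when given.
    payload: dict = {"audio": wav_bytes}
    if prompt is not None:
        payload["prompt"] = prompt
    if first_frame is not None:
        payload["first_frame"] = first_frame
    if guidance_scale is not None:
        payload["guidance_scale"] = guidance_scale
    return payload
```

For example, `build_request(make_silent_wav(3.0), prompt="a singer on stage")` returns a payload with the audio and prompt only, while a 1-second clip raises `ValueError` before any upload is attempted.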
