MM Audio
MM Audio (Dec 2024) is a multimodal video-to-audio model that generates realistic "Foley" and ambient sound effects aligned with visual scenes or text prompts.
- Audio
- MMAudio
- V2V
- Video Editing
MM Audio creates synchronized audio that matches video content. It takes a video file, which provides visual and motion cues, or alternatively a text prompt to guide audio generation. The output is a video with an audio track (ambient sounds, effects, or music) that aligns with the video timeline. Best results come from clear, well-lit videos with visible actions or context, and from prompts that specify the intended atmosphere, instruments, or sound style.
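The input and prompt guidance above can be sketched in a few lines of Python. This is only an illustration, not MM Audio's actual interface: `AudioRequest`, `build_prompt`, and every field name are hypothetical, and accepting a video and a prompt together is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioRequest:
    """Hypothetical container for one generation request (not an MM Audio API)."""

    video_path: Optional[str] = None  # video supplying visual and motion cues
    prompt: Optional[str] = None      # text guiding the generated audio

    def __post_init__(self) -> None:
        # The description calls for a video file or, alternatively, a text prompt,
        # so a request with neither is rejected.
        if not self.video_path and not self.prompt:
            raise ValueError("provide a video file or a text prompt")


def build_prompt(atmosphere: str = "", instruments: str = "", style: str = "") -> str:
    """Join the non-empty descriptors into one comma-separated prompt."""
    parts = [p.strip() for p in (atmosphere, instruments, style)]
    return ", ".join(p for p in parts if p)
```

For example, `AudioRequest(video_path="clip.mp4", prompt=build_prompt("rainy street at night", style="soft ambient"))` pairs a video with a prompt that names the atmosphere and sound style, while `AudioRequest()` raises a `ValueError` because neither input was given.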