Scenario
video

MM Audio

MM Audio (Dec 2024) is a multimodal video-to-audio model that generates realistic "Foley" and ambient sound effects aligned with visual scenes or text prompts.

  • Audio
  • MMAudio
  • V2V
  • Video Editing

MM Audio creates synchronized audio that matches video content. Inputs are a video file, which provides visual and motion cues, or alternatively a text prompt that guides audio generation. The output is a video with a generated audio track (ambient sounds, effects, or music) aligned to the video timeline. Best results come from clear, well-lit videos with visible actions or context, and from prompts that specify the intended atmosphere, instruments, or sound style.
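As a rough illustration of the prompt advice above, the sketch below assembles a text prompt from an atmosphere, a list of sounds or instruments, and a sound style. The helper `build_audio_prompt` and its parameter names are hypothetical and are not part of MM Audio or Scenario; the comma-separated format is only an assumption about what a readable prompt might look like.

```python
def build_audio_prompt(atmosphere, sounds=(), style=None):
    """Join prompt ingredients into one comma-separated string.

    Hypothetical helper for illustration only; not an MM Audio or Scenario API.
    """
    parts = [atmosphere.strip()]
    # Keep only non-empty sound or instrument descriptions.
    parts.extend(s.strip() for s in sounds if s.strip())
    if style and style.strip():
        parts.append(style.strip())
    return ", ".join(p for p in parts if p)


prompt = build_audio_prompt(
    "rainy city street at night",
    sounds=["distant traffic", "footsteps on wet pavement"],
    style="realistic Foley, no music",
)
print(prompt)
```

Naming the atmosphere first and the style last keeps the prompt easy to scan and edit when a generated track needs adjusting.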
