Scenario
Video

SAM 3.1 Video

Tracks and segments moving objects across video frames, producing an isolated mask track for each object. A text prompt is required. The output is one video mask per object, with up to 16 objects tracked simultaneously.

Segment videos with Meta SAM 3.1. The text prompt initializes detection; optional per-frame points, a box, or a mask can then refine tracking.
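As a rough illustration of how these inputs relate, the sketch below models a request as plain Python data: a required text prompt plus optional per-frame refinements. Every name here (`SegmentationRequest`, `FrameRefinement`, `within_track_limit`, the field names, and the coordinate conventions) is hypothetical and is not the Scenario or Meta API. Only the required text prompt, the three refinement kinds, and the 16-track limit come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

MAX_TRACKS = 16  # simultaneous-track limit stated in the model description


@dataclass
class FrameRefinement:
    """Hypothetical per-frame hint: points, a box, or a mask."""
    frame_index: int
    points: Optional[list[tuple[int, int]]] = None   # assumed (x, y) pixel coordinates
    box: Optional[tuple[int, int, int, int]] = None  # assumed (x_min, y_min, x_max, y_max)
    mask_path: Optional[str] = None                  # assumed path to a mask image

    def validate(self) -> None:
        if self.frame_index < 0:
            raise ValueError("frame_index must be non-negative")
        # Assumption: a refinement must carry at least one kind of hint.
        if self.points is None and self.box is None and self.mask_path is None:
            raise ValueError("a refinement needs points, a box, or a mask")


@dataclass
class SegmentationRequest:
    """Hypothetical request: the text prompt is required, refinements are optional."""
    video_path: str
    text_prompt: str
    refinements: list[FrameRefinement] = field(default_factory=list)

    def validate(self) -> None:
        if not self.text_prompt.strip():
            raise ValueError("a text prompt is required")
        for refinement in self.refinements:
            refinement.validate()


def within_track_limit(n_tracks: int) -> bool:
    """Return True if n_tracks fits within the stated 16-track limit."""
    return 0 <= n_tracks <= MAX_TRACKS
```

A request with only a text prompt passes validation; adding a `FrameRefinement` with, say, a box on frame 0 mirrors the optional refinement step described above.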
