Released in November 2025, this closed-source model performs precise, text-guided edits on existing video. As part of the unified O1 architecture, it takes an existing video clip, a text instruction, and optional reference images or multi-angle "Elements" as input. This lets users add, remove, or replace subjects and backgrounds with high fidelity (e.g., Add [@Element] to [@Video]). It prioritizes surgical editing over large-scale generation, which makes it well suited to post-production refinement, and its multi-modal workflow is more intuitive than traditional frame-by-frame manipulation.
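To make the input structure concrete, here is a minimal Python sketch of how an edit request could be assembled and checked before submission. It is an illustration only: the `EditRequest` class, its field names, and the payload layout are hypothetical and are not the model's actual API. The only detail taken from the description above is the `[@Name]` placeholder style, as in "Add [@Element] to [@Video]".

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Matches placeholders written in the style "[@Element]" or "[@Video]".
PLACEHOLDER = re.compile(r"\[@(\w+)\]")


@dataclass
class EditRequest:
    """Hypothetical container for one video edit (not a real API type)."""
    video: str                      # path or ID of the existing clip
    instruction: str                # text command with [@Name] placeholders
    # Optional named references; each maps to one or more images,
    # e.g. several angles of the same subject.
    elements: Dict[str, List[str]] = field(default_factory=dict)

    def referenced_names(self) -> List[str]:
        """Return placeholder names in the order they appear."""
        return PLACEHOLDER.findall(self.instruction)

    def validate(self) -> None:
        """Raise ValueError if the instruction names an input that was not supplied."""
        known = {"Video", *self.elements}
        missing = [n for n in self.referenced_names() if n not in known]
        if missing:
            raise ValueError(f"instruction references unknown inputs: {missing}")

    def to_payload(self) -> dict:
        """Build a plain dict; the key names here are assumptions."""
        self.validate()
        return {
            "video": self.video,
            "prompt": self.instruction,
            "elements": self.elements,
        }


request = EditRequest(
    video="clip.mp4",
    instruction="Add [@Element] to [@Video]",
    elements={"Element": ["front.png", "side.png"]},
)
payload = request.to_payload()
```

Checking placeholders locally catches an instruction that refers to an image that was never attached, which is cheaper than discovering the mismatch after a render.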