A video-to-video tool to replace subjects or backgrounds with a reference image. Features Person and Background modes for precise control.
Models
Hunyuan 3D Pro 3.0 Sketch transforms hand-drawn line art and sketches into textured 3D meshes from 105 credits.
Hunyuan 3D 3.0 (Pro) by Tencent is a state-of-the-art 10B parameter image-to-3D model with 1536³ resolution and hierarchical DiT carving.
Hunyuan 3D Pro 3.0 Multiview reconciles up to 4 reference images to produce symmetric 3D assets from 120 credits.
Qwen Edit Multi-Angle by Alibaba provides camera-aware image editing for consistent perspective shifts. Generation in about 8s, pricing from 5 CU.
Gemini 2.5 Edit (aka NanoBanana) by Google enables rapid, text-based photo modifications and background adjustments from 7 credits.
Unified text-guided image editing with high character and style preservation. Pricing from 11 CU.
OpenAI's GPT Image 1 model for image editing and generation.
OpenAI's top model for quality image editing.
Seedream 4.0 by ByteDance generates and edits 4K images with a unified architecture and complex reasoning. Pricing from 5 CU.
Seedream 4.5 by ByteDance edits images with natural language instructions while preserving reference details. Generation around 6 seconds, pricing from 6 CU.
P-Image Edit by Pruna AI is a high-speed image editing model that applies precise transformations, using up to 10 reference images. From 2 credits.
Meshy Remesh converts existing 3D models into cleaner, quad-based geometry to optimize performance and topology.
Meshy Image-to-3D generates a textured 3D mesh from one or more images, creating geometry for unseen sides.
Meshy Text-to-3D (v4, v5, and v6) creates textured 3D assets from prompts, with various control options.
Meshy Retexture applies new PBR texture maps to existing 3D models based on text-guided instructions or style images.
FLUX.2 [klein] 4B is a fast, lightweight model built for real-time image generation.
FLUX.2 [klein] 4B Base is a flexible base model designed for control and customization.
FLUX.2 [turbo] from Black Forest Labs (Nov 2025) is a dual image generation and editing model, designed for speed and cost-efficiency.
FLUX.2 [klein] 9B Base is a high-capacity model focused on maximum detail and prompt understanding.
FLUX 2 (Max) Edit is the highest-fidelity editing model for maximum consistency and professional retouching. Pricing from 11 CU.
FLUX 2 (Flex) Edit is a precision editing model focused on typography, small details, and complex layout changes. Pricing from 18 CU.
FLUX 2 (Pro) Edit is a fast, reliable default for practical editing tasks like object removal and background cleanup. Pricing from 7 CU.
LongCat Image by Meituan provides natural-language-driven image editing with high semantic awareness.
Minimax Speech 2.6 (HD) delivers high-fidelity, studio-quality text-to-speech in over 40 languages with near-real-time generation, from 15 CUs.
Minimax Speech 2.6 (Turbo) provides low-latency text-to-speech in over 40 languages for real-time use, with fast generation starting from 9 CUs.
Sparc3D Portrait by Hitem3D is a high-fidelity 1536³ voxel reconstruction model optimized for human facial anatomy and expressions.
MiniMax Hailuo 2.3 (Fast) by MiniMax generates high-motion video with optimized latency. Pricing from 29 credits per generation.
MiniMax Hailuo 2.3 by MiniMax generates cinematic 1080p video with advanced motion consistency. Pricing from 43 credits per generation.
BiRefNet v2 provides high-resolution foreground extraction for complex objects like hair and transparent edges.
Seedance 1 (Pro Fast) by ByteDance generates 1080p cinematic video optimized for speed and cost efficiency. Pricing from 45 CU.
Abandoned Structures - Kontext transforms clean building images into worn, decayed versions while keeping the original structure and layout the same.
LTX-2 Fast by Lightricks generates high-fidelity 4K video previews in seconds for rapid brainstorming. Pricing from 32 credits.
LTX-2 Pro by Lightricks delivers 4K-capable video with audio for professional reviews and pitches. Pricing from 47 credits.
Kling 2.5 I2V (Standard) by Kuaishou is a cost-effective image-to-video model for 720p output, engineered for speed. Pricing from 35 CU.
Facial Expression Sheet - Kontext turns a character image into a grid of nine different emotions while keeping the character's appearance and the art style the same.
Beatoven Music Generation creates royalty-free background music from text, generating up to 4-minute tracks. Pricing starts from 2 CUs.
Beatoven Sound Effect generates custom, licensed sound effects (SFX) from text prompts, creating up to 35-second clips. Pricing starts from 2 CUs.
REVE Remix by Halfmoon AI is a context-aware model for image merging and object-level manipulation using text prompts. Pricing from 6 credits.
REVE Create by Halfmoon AI is a high-fidelity image generation model with 98% typography accuracy and photorealistic textures.
Crystal Upscaler by Clarity AI specializes in high-precision facial and portrait enhancement. Pricing from 10 credits.
Flux Kontext LoRA turns basic 3D blockouts into detailed scenes or objects while preserving the original shapes and layout.
Google's Veo 3.1 (Fast) is a high-speed variant of Veo 3.1, offering its advanced features and multiple input types for rapid, cost-effective prototyping.
Flux Kontext LoRA creates character turnaround sheets with four different views to show a design from every side on a clean background.
Isometric Tile Maker - Kontext turns photos of buildings into detailed, small-scale 3D models set on square tiles.