Magnific Video Upscaler Precision: A high-fidelity upscaling model focused on accuracy, strength blending, and detail preservation.
Models
All Models
Magnific Video Upscaler Creative: An AI tool for creative video enhancement with 4K support, "flavor" controls, and a creativity slider for adding detail.
Generate a textured 3D model from one or more images using ReconViaGen 0.5.
Professional-grade AI lipsync. Drop in a video and any audio, and Sync-3 aligns mouth movements to the sound. Built for dubbing, voice-over, ADR, and high-fidelity video editing.
JoyAI Image Edit: An all-in-one model for image editing through natural language and advanced visual understanding.
Pruna's P-Image Upscale: Fast, precise upscaling (1-8x, 1-8MP) with optional detail enhancement.
Edit video with natural language — swap backgrounds, shift lighting, apply style transfers, and restyle with a reference image while preserving original motion. 2–10s at 720p.
Alibaba Wan 2.7 text-to-video — 720p/1080p, 2–15s, optional synced audio, prompt expansion.
Wan 2.7 Image Pro: Alibaba’s advanced AI for 4K text-to-image generation and multi-image editing.
Alibaba's image generation & editing model (Apr 2026). Text-to-image, image editing, style transfer, and coherent image sets. Thinking mode, up to 9 reference images, improved text rendering, flexible aspect ratios up to 2K.
Animate images into cinematic video with first/last-frame or clip-continuation modes. Direct multi-shot sequences with temporal brackets. 2–15s at 720p or 1080p.
Generate full songs from lyrics and style prompts. Supports 14 structure tags for arrangement control, vocal and instrumental modes, and CD-quality output at 44.1kHz / 256kbps.
Fast text-to-speech with the same 17 voices, emotions, and 40+ languages as HD, optimized for speed and cost. For real-time apps, assistants, or interactive content.
Premium text-to-speech with 17 voices, 10 emotions, 40+ languages, and natural interjections. Fine-tune speed, pitch, and volume for broadcast-ready narration and voice-overs.
Image editing with HY-WU. Transfer outfits, swap faces, and blend textures instantly with reference images.
Google Lyria 3 Clip is the short-form variant of the Lyria 3 family, purpose-built for generating tight, expressive 30-second music clips in MP3 format.
Grok Edit Video is a video-to-video editing model powered by xAI's Grok Imagine technology, integrated directly into Scenario.
Extend an existing video with new footage using xAI Grok: up to 15s of source video is preprocessed at 720p, then 2–10s of continuation is generated from your prompt.
Produces high-quality 3D renders of character outfit sets and equipment displayed on invisible figures with detailed textures and studio lighting.
Tada 3B Text to Speech is a voice cloning model that synthesizes speech in any target voice using a short audio reference.
Tada 1B Text to Speech is a streamlined voice cloning model that converts any text into speech using a short reference audio as the voice template.
HeyGen Photo Avatar 4: Create realistic talking avatar videos from a photo and text prompt with built-in TTS voices.
HeyGen Translate, optimized for speed. Faster video translation at 5 credits per output second.
A high-fidelity video-to-video (V2V) model designed for extreme precision in translating spoken content across different languages.
AI-powered explainer video generation that creates digital presenters from natural language prompts.
Fast, ultra high-quality background removal from images. Perfect for e-commerce and image editing workflows.
High-fidelity 3D generation utilizing native 3D diffusion from multi-angle reference images.
Physics-aware image editing with realistic refraction, material changes, and deformations.
A versatile model for making 3D Low-Poly stylized environments, designed for platformers and adventure games, featuring geometric shapes, vibrant colors, and soft lighting.
A specialized asset generation model for high-quality Stylized Game Icons, focusing on gemstones, crystals, and enchanted loot with vibrant colors and polished digital painting.
Generate SVG glyphs from a text prompt while matching the style of reference glyph images.
Generate SVG glyphs directly from text prompts with controllable typography and styling.
Re-generate a section of an existing video. Replace audio, video, or both with LTX 2.3 Pro. 1080p only.
Add duration to the beginning or end of a video using LTX 2.3 Pro. Extend existing clips with high-fidelity continuation.
Speed-optimized LTX-2.3 video generation. Generates videos with synchronized audio faster than real-time — ideal for rapid prototyping, mobile workflows, and high-volume production.
Retexture 3D meshes with Trellis 2. Apply high-quality textures from a reference image, with control over resolution, scale, and output consistency.
A high-fidelity motion transfer model optimized for complex gestures and professional-grade character animation.
A cost-effective motion transfer model designed to animate character images using reference videos.
Generates high-fidelity 3D meshes from a set of multi-angle reference images.
Automated rigging and animation retargeting for quadruped 3D models. For biped models and humanoids, please continue using Tripo Rigging v1.0.
Vidu Q2 references-to-video generation. Supports video reference, video editing, and video replacement.
AI-powered UV unwrapping for 3D models. Generates clean UV maps for FBX, OBJ, and GLB models with up to 30,000 faces.
AI-powered texture editing for 3D models. Apply textures from text prompts or reference images to FBX models. Supports PBR (Physically Based Rendering) when using prompts.
Qwen Edit Plus by Alibaba delivers high-fidelity instruction-based editing on a single reference image with precise object and style control, and LoRA support. From 7 credits.
Qwen Edit 2511 by Alibaba edits images from natural language instructions using Qwen2.5-VL (November 2025). Accepts multiple reference images and up to 6 LoRA styles. From 8 credits.
Qwen Edit 2509 by Alibaba applies text-based edits using the Qwen2-VL model (September 2025 release). Accepts multiple reference images and up to 6 LoRA styles. From 11 credits.
FLUX.2 Klein 9b: Efficient text-to-image and image-to-image model.
Transform videos with text prompts. Input a reference video and describe the desired output. Supports IC-LoRA control (canny, depth, pose, detailer) and optional first-frame conditioning.
Interpolate between multiple keyframe images to generate smooth video transitions with synchronized audio.