Kling 3.0 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation, with custom element support.
Models
All Models
Convert text to natural-sounding speech using Google's Gemini 2.5 Flash model with multiple voice presets.
Convert text to natural-sounding speech using Google's Gemini 2.5 Pro model with multiple voice presets.
Hunyuan Polygen 1.5 by Tencent is an art-grade AI retopology model that converts high-poly 3D meshes into clean, production-ready low-poly assets (quads or triangles).
Hunyuan 3D 3.1 (Pro) by Tencent is a state-of-the-art 10B parameter image-to-3D model with 1536³ resolution, hierarchical DiT carving, and enhanced texture color accuracy.
Hunyuan 3D 3.1 (Multiview) reconciles up to 8 reference images to produce symmetric 3D assets from 120 credits.
Hunyuan 3D 3.1 (Sketch) transforms hand-drawn line art and sketches into textured 3D meshes from 105 credits.
High-precision video upscaler optimized for portraits, faces and products.
Converts a raster image to SVG format using the Recraft model.
Concatenates multiple videos into a single video with optional transitions between clips.
Automatically detects and removes uniform color borders/padding from images and videos.
Resize a video to a specified width and height or a maximum size while preserving aspect ratio.
Resize an image to a specified width and height or a maximum size while preserving aspect ratio.
Generates the same scene from different angles using Qwen Image Edit 2511 with the Multiple Angles LoRA.
UltraShape-1.0 is an open-source 3D-to-3D AI model that upgrades low-detail meshes into high-poly, high-fidelity 3D models.
Applies a parabolic distortion that curves the sides of an image.
Applies a 3D color lookup table to transform the overall color grading of an image.
Adds a soft glow and bloom effect to enhance bright areas.
Reduces color intensity to create a muted or monochrome look.
Transforms images into abstract, geometric compositions inspired by cubism.
Adjusts color and tonal balance for precise image correction.
Adds RGB color separation to create a distorted, lens-like effect.
Sparc3D 2.0 Portrait is optimized for human subjects, generating more stable heads and faces from 1–4 images with clean, watertight meshes and resolution control.
Sparc3D 2.0 turns 1–4 images into clean, watertight, textured 3D meshes up to 1536³, with resolution and polycount control for games, AR/VR, or 3D printing.
Qwen Image 2512 by Alibaba is a text-to-image model with photorealistic human rendering and advanced text rendering. Generation in about 6s, pricing from 3 CU.
Seedance 1.5 Pro by ByteDance generates 1080p video with native audio and multilingual lip-sync. Generation takes around 5-8 minutes, pricing from 40 CU.
Pika 2.2 T2V by Pika Labs creates 1080p video from text with cinematic realism. Generation takes around 1.5 minutes and starts from 30 credits.
Pika 2.2 Scenes by Pika Labs creates filmmaker-grade video from multiple images and text. Generation takes around 1.5 minutes and starts from 30 credits.
Pika 2.2 I2V by Pika Labs animates images with high character consistency. Generation takes around 1.5 minutes and starts from 30 credits.
Pika 2.2 Frames by Pika Labs generates video between two keyframes with high precision. Generation takes up to 25s and starts from 30 credits.
MM Audio 2 Text-To-Audio (SFX) generates realistic sound effects from text prompts. Generation times vary, pricing from 2 CU.
MMAudio 2 (Dec 2024) generates realistic soundtracks for silent videos. Upload a video and add an optional text prompt to get synchronized, high-quality audio.
Kling 2.6 Motion Control by Kuaishou transfers motion from a reference video to a character image with precise control over body, face, and lip-sync.
Qwen Image Layered by Alibaba decomposes images into editable RGBA layers for structured editing. Generation around 10s, pricing from 11 CU.
This Flux LoRA generates quaint, storybook-like fantasy isometric background illustrations. It excels at architectural scenes and 3D-style visuals.
Google's Veo 3.1 Extend Video is a standalone model that seamlessly extends any existing video, adding new length while maintaining visual consistency.
SAM3D Objects by Meta turns photos or visuals into simple 3D objects and scene layouts, helping quickly rebuild real or imagined spaces from a single image.
SAM3D Human Body by Meta turns photos into accurate 3D human figures for animation and design.
SAM3D Align integrates 3D body meshes from SAM3D Human Body with props generated by SAM3D Objects in a shared 3D space.
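
The image and video resize utilities listed above accept either an explicit width and height or a maximum size that preserves aspect ratio. A minimal sketch of how the maximum-size case could be computed follows. The function name `fit_within` is hypothetical, and it assumes that "maximum size" means the length of the longer side; the actual tools may interpret the parameter differently or refuse to upscale.

```python
def fit_within(width: int, height: int, max_size: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals max_size.

    Assumption: max_size bounds the longer side. This sketch scales
    both up and down; a real tool might only ever shrink.
    """
    if width <= 0 or height <= 0 or max_size <= 0:
        raise ValueError("width, height and max_size must be positive")

    # One scale factor for both axes keeps the aspect ratio unchanged.
    scale = max_size / max(width, height)

    # Round to whole pixels and never let a side collapse to zero.
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height


if __name__ == "__main__":
    print(fit_within(1920, 1080, 960))  # landscape input
    print(fit_within(1080, 1920, 960))  # portrait input
```

Using a single scale factor for both dimensions is what keeps the output proportional to the input; rounding each side independently can shift the ratio by a fraction of a pixel, which is usually acceptable for this kind of utility.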