Introducing Grok Imagine Image
Text-to-image generation (txt2img): creates images from written descriptions, turning concepts into visuals.

What is Grok Imagine Image?
Grok Imagine turns a text description into a finished image — no design software, no production queue. Photorealistic scenes, stylized artwork, campaign mockups: describe what you want and get a visual asset in seconds.
The model works both ways. Generate images from scratch, or bring in existing visuals and edit them with plain text commands. Swap backgrounds, adjust lighting, remove objects — changes that once required hours in post-production now take a single prompt. That means faster iteration, tighter feedback loops, and more time spent on ideas rather than execution.
Key Capabilities
- Generate original, high-quality images from text descriptions
- Edit existing images by describing the changes you want
- Work across visual styles — photorealistic, illustrated, artistic, and beyond
- Produce on-brand assets for marketing, design, and product mockups
Examples
See how Grok Imagine translates text prompts into finished visuals across a range of styles and use cases.

{
"meta": {
"style": "8k raw photo, hyper-detailed, photorealistic masterpiece, National Geographic aesthetic",
"creativity_temp": 1.8
},
"subject": {
"identity": "Realistic interpretation of Clash Royale Hog Rider and mount. Rider: muscular dark-skinned male, defined vascularity, signature black mohawk, gold nose ring. Hog: massive boar, pinkish-grey skin, prominent ivory tusks.",
"pose_action": "Mid-gallop across shallow water. Hog's front hooves smashing into the brine, generating a crown-splash of saline droplets. Rider leaning forward, gripping leather reins, golden warhammer raised.",
"material_detail": "Rider Skin: PBR subsurface scattering, visible sweat pores, glistening moisture, hyper-realistic melanin texture. Hog Fur: Coarse, stiff bristles, wet and matted near legs, distinct follicle density. Leather: Worn saddle texture, cracked edges. Metal: Hammer gold with micro-scratches and oxidation."
},
"environment": {
"location": "Salar de Uyuni, Bolivia. Infinite horizon where sky meets earth.",
"background_elements": "Seamless mirror reflection of the azure sky and cumulus clouds on the ground. Hexagonal salt crust patterns visible through translucent shallow water.",
"atmosphere": "High-altitude clarity, thin air, zero haze. Water surface tension breaking at impact points. Crystalline salt particles suspended in splash droplets."
},
"lighting": {
"source_angle": "High-noon zenith sun, hard directional light, 90-degree angle.",
"kelvin_quality": "5800K pure daylight, blinding white albedo from salt reflection.",
"visual_effects": "Ray-traced reflections, harsh contact shadows, specular highlights on wet skin and water ripples, slight chromatic aberration on water droplets."
},
"camera_specs": {
"gear_lens": "Phase One XF IQ4 150MP, 28mm wide-angle prime lens.",
"aperture_iso": "f/11 for deep depth of field, ISO 50, 1/4000s shutter speed to freeze water.",
"film_finish": "Kodak Ektar 100 simulation, high contrast, saturated blues and golds, ultra-sharp focus."
}
}
{
"meta": {
"style": "8k raw photo, hyper-realistic cyberpunk, cinematic atmosphere, unreal engine 5 render, ray-tracing enabled",
"creativity_temp": 1.8
},
"subject": {
"identity": "Translucent holographic Pop Idol avatar, semi-materialized voxel geometry, glitching ethereal form",
"pose_action": "Hovering inches above the water surface, head tilted back in digital anguish, flickering in and out of existence",
"material_detail": "Emissive photon-mesh skin texture with CRT scanline interference, chromatic aberration on edges, wearing a jagged crown of thorns composed of high-intensity hard-light laser filaments, digital haute couture dress dissolving into pixels at the hem"
},
"environment": {
"location": "Derelict brutalist subway station, abandoned underground infrastructure",
"background_elements": "Peeling emerald-green ceramic subway tiles covered in grime and mold, rusted iron tracks disappearing into infinite tunnel darkness, exposed rebar, floating debris",
"atmosphere": "Heavy volumetric humidity, rising steam and water vapor distorting the projection, stagnant knee-deep floodwater with a viscous iridescent oil slick surface"
},
"lighting": {
"source_angle": "Omnidirectional emission from the subject, casting harsh shadows upwards",
"kelvin_quality": "Cold 8500K Cyan and Magenta holographic shift contrasting with deep tunnel blacks",
"visual_effects": "Sharp ray-traced caustics reflecting off the rippling oily water onto the wet tiled walls, flickering strobe intensity, Tyndall effect through suspended dust motes"
},
"camera_specs": {
"gear_lens": "ARRI Alexa Mini LF, Panavision T Series Anamorphic 35mm lens",
"aperture_iso": "f/1.4 aperture for shallow depth of field, ISO 3200 for gritty texture",
"film_finish": "Cinestill 800T film stock emulation, pronounced red halation around the laser crown, wet-plate aesthetic, high contrast color grading"
}
}
{
"meta": {
"style": "Unreal Engine 5.2 render mixed with 8k raw photography, hyper-realistic background vs stylized character, cinematic composition",
"creativity_temp": 1.8
},
"subject": {
"identity": "Stylized Fortnite-aesthetic tactical aviator avatar, vibrant saturated color palette",
"pose_action": "Standing confident amidst wreckage, inspecting a rusted turbine engine",
"material_detail": "Smooth diffuse shading on character skin, clean ballistic nylon textures, matte polymer armor plates, emissive neon teal LED accents, sharp cel-shaded normal maps contrasting against realistic grime"
},
"environment": {
"location": "Mojave Desert Aircraft Boneyard, Davis-Monthan AFB inspiration",
"background_elements": "Decommissioned Boeing 747 and B-52 fuselages, oxidized bare aluminum, flaking chemically-weathered paint, sun-bleached nose art, cracked caliche clay ground, dry tumbleweeds",
"atmosphere": "Intense heat haze distortion (shimmering mirage effect), suspended silica dust particles, dry arid air density, atmospheric perspective fading into cyan horizon"
},
"lighting": {
"source_angle": "High-noon zenith sunlight, direct incidence",
"kelvin_quality": "6500K harsh daylight, blinding white specular highlights on metal",
"visual_effects": "Ray-traced hard shadows, Global Illumination (Lumen) bounce light from sand to metal, lens flare anamorphic streaks, ambient occlusion in mechanical crevices"
},
"camera_specs": {
"gear_lens": "Sony A7R IV, 85mm f/1.4 GM lens for subject separation",
"aperture_iso": "f/2.8, ISO 50, 1/8000s shutter speed",
"film_finish": "Kodak Ektar 100 simulation, high contrast, vibrant saturation, sharp digital clarity, slight chromatic aberration at edges"
}
}
{
"meta": {
"style": "8k raw photo, avant-garde fashion, hyper-detailed, volcanic aesthetic, cinematic composition",
"creativity_temp": 1.8
},
"subject": {
"identity": "High-fashion model, angular bone structure, skin texture detailed with thermal perspiration and microscopic volcanic soot particles.",
"pose_action": "Standing rigid against wind, arms extended to display billowing sleeves, dynamic fabric simulation.",
"material_detail": "Sheer organza blouse, oversized ruffled architecture. PBR properties: high transmission, subsurface scattering turning fabric glowing ember-orange from beneath. Micro-details: intricate silk weave, tiny singe marks where sparks land, translucent layering revealing silhouette."
},
"environment": {
"location": "Cooling basalt ridge, geometric hexagonal rock formations, jagged obsidian ground.",
"background_elements": "Fissures of molten lava, distant pyroclastic flow.",
"atmosphere": "Heavy volumetric ash clouds, grey particulate suspension, heat haze shimmering (refraction index 1.0003), choked sky, oppressive density."
},
"lighting": {
"source_angle": "Strong up-lighting from ground fissures (subterranean glow), soft diffuse top-down light from overcast sky.",
"kelvin_quality": "Contrast between 1200K (magma red/orange) and 7500K (ash grey).",
"visual_effects": "Lumen global illumination, ray-traced translucency through fabric, ember sparks acting as micro-point lights, bloom on lava veins."
},
"camera_specs": {
"gear_lens": "Phase One XF IQ4 150MP, Schneider Kreuznach 110mm LS f/2.8 Blue Ring.",
"aperture_iso": "f/1.8 for shallow depth of field, ISO 50, 1/4000s shutter to freeze fabric ripple.",
"film_finish": "Kodak Ektar 100 emulation, high dynamic range, crushed shadows, vibrant thermal highlights, chromatic aberration on edges."
}
}
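The four structured prompts above share one layout: five sections (meta, subject, environment, lighting, camera_specs) with a fixed set of fields in each. A minimal Python sketch of assembling such a prompt follows. The helper name `build_structured_prompt` and the `PROMPT_SCHEMA` table are hypothetical, introduced only for illustration. The result is plain JSON text to paste in as a prompt, and this page does not say whether the model treats fields such as `creativity_temp` as anything more than descriptive text.

```python
import json

# Section and field names copied from the example prompts on this page.
PROMPT_SCHEMA = {
    "meta": ["style", "creativity_temp"],
    "subject": ["identity", "pose_action", "material_detail"],
    "environment": ["location", "background_elements", "atmosphere"],
    "lighting": ["source_angle", "kelvin_quality", "visual_effects"],
    "camera_specs": ["gear_lens", "aperture_iso", "film_finish"],
}


def build_structured_prompt(sections):
    """Hypothetical helper: validate the sections and return JSON prompt text."""
    prompt = {}
    for section, fields in PROMPT_SCHEMA.items():
        if section not in sections:
            raise ValueError(f"missing section: {section}")
        given = sections[section]
        missing = [name for name in fields if name not in given]
        if missing:
            raise ValueError(f"{section} is missing fields: {missing}")
        # Keep only the known fields, in the same order as the examples.
        prompt[section] = {name: given[name] for name in fields}
    return json.dumps(prompt, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder values; replace them with your own descriptions.
    print(build_structured_prompt({
        "meta": {"style": "8k raw photo, photorealistic", "creativity_temp": 1.8},
        "subject": {
            "identity": "Elderly fisherman",
            "pose_action": "Standing on a harbor at dawn",
            "material_detail": "Weathered skin, worn jacket",
        },
        "environment": {
            "location": "Quiet harbor",
            "background_elements": "Wooden boats, calm water",
            "atmosphere": "Mist rising from the sea",
        },
        "lighting": {
            "source_angle": "Low side light",
            "kelvin_quality": "Warm golden morning light",
            "visual_effects": "Long shadows",
        },
        "camera_specs": {
            "gear_lens": "85mm lens look",
            "aperture_iso": "Shallow depth of field",
            "film_finish": "Natural color grading",
        },
    }))
```

Validating before serializing makes it easy to spot a forgotten section when you adapt one of the examples to a new subject.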
{
"input_image": "User-provided product photo",
"resolution": "8K UHD",
"image_style": "hyper-realistic commercial product photography",
"global_settings": {
"quality": "Ultra-high detail, sharp focus, premium advertising quality",
"lighting": "Controlled studio lighting emphasizing internal textures and contents",
"camera": "High-speed photography look, shallow to medium depth of field",
"motion": "Frozen mid-air action, cinematic energy",
"post_processing": "Balanced contrast, natural saturation, clean finish"
},
"scene": {
"main_subject": {
"description": "User-provided product with realistically opened packaging",
"position": "Centered, hero shot",
"state": "Packaging opening according to its real-world mechanism, revealing contents dynamically",
"integrity": "Packaging structure and branding remain intact and readable"
},
"opening_effects": {
"style": "Realistic commercial opening action",
"mechanism": "Opening method strictly matches the product type (pull tab, tear seal, wrapper peel, lid removal, cap twist, seal break)",
"elements": "Contents, ingredients, fragments, liquids or particles emerging naturally from inside",
"motion": "Physically accurate movement, frozen mid-action"
}
},
"background": {
"style": "Clean studio background or smooth gradient",
"color": "High contrast with both packaging and contents",
"depth": "Subtle separation to highlight internal elements"
},
"rules": {
"content_focus": "Internal product is visually dominant",
"realism": "No unrealistic cuts, breaks or openings; no destructive packaging behavior",
"branding": "Logos, labels and package details remain sharp and legible",
"no_artifacts": "No AI distortions, warping or unnatural shapes"
},
"goal": {
"mood": "Energetic, appetizing, premium",
"usage": "Advertising, packaging reveal, product launch",
"priority": "Contents first, packaging enhances and frames the reveal"
}
}
Dark Background
A highly detailed digital illustration with a hyper-realistic, scientific visualization style, combining research-grade anatomical accuracy with advanced military targeting system aesthetics. Clean, high-contrast rendering with precise line work, layered data overlays, glowing vector paths, and sensor-style annotations. Dark, minimal background to enhance readability and focus. Balanced lighting with controlled glow intensity, no artistic distortion, no exaggeration. The composition follows an analytical, documentary tone, presenting the subject as a system under observation rather than a narrative scene, maintaining clinical respect and technical clarity. Subject: a futuristic motorcycle.
A beautiful Black woman dressed up in urban clothes and makeup. Full-body shot.
Generate a cinematic image from the scene in the Bottom-left. The movie genre is Action, Thriller, and the visual style is Futuristic, Tron movie style, Realistic. The scene reflects the mood, tension, and atmosphere typical of the MOVIE GENRE. Visual language inspired by the VISUAL STYLE, with detailed environments, realistic materials, and strong depth. High detail, sharp focus, cinematic framing, film still quality.
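The prompt above reads like a filled-in template: a scene position, a list of movie genres, and a list of visual styles are dropped into fixed wording. A minimal Python sketch of that idea follows. The names `PROMPT_TEMPLATE` and `fill_cinematic_prompt` are hypothetical, and the fixed wording is taken from the prompt itself. Joining the lists in code keeps the punctuation consistent however many genres or styles are supplied.

```python
# Fixed wording taken from the example prompt; the placeholders are assumptions.
PROMPT_TEMPLATE = (
    "Generate a cinematic image from the scene in the {position}. "
    "The movie genre is {genres}, and the visual style is {styles}. "
    "The scene reflects the mood, tension, and atmosphere typical of the MOVIE GENRE. "
    "Visual language inspired by the VISUAL STYLE, with detailed environments, "
    "realistic materials, and strong depth. "
    "High detail, sharp focus, cinematic framing, film still quality."
)


def fill_cinematic_prompt(position, genres, styles):
    """Hypothetical helper: fill the template with a position, genres, and styles."""
    if not genres or not styles:
        raise ValueError("at least one genre and one style are required")
    return PROMPT_TEMPLATE.format(
        position=position,
        genres=", ".join(genres),
        styles=", ".join(styles),
    )


if __name__ == "__main__":
    print(fill_cinematic_prompt(
        "Bottom-left",
        ["Action", "Thriller"],
        ["Futuristic", "Tron movie style", "Realistic"],
    ))
```

Swapping the genre or style lists gives a new variant of the same shot without retyping the rest of the prompt.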
{
"meta": {
"style": "8k raw photo, hyper-realistic, cinematic sci-fi, unreal engine 5 render, masterpiece",
"creativity_temp": 1.8
},
"subject": {
"identity": "Prehistoric primate, Australopithecus physique, fur matted with jagged rime ice and hoarfrost crystals",
"pose_action": "Finger making contact with the cold silicon surface, muscles tensed, breath visible as vapor",
"material_detail": "Monolith: Translucent sapphire-blue silicon wafer, nanometer-scale gold-etched circuitry visible deep within, refractive index 1.77, smooth glass-like surface with sub-surface scattering at touch point"
},
"environment": {
"location": "Frozen glacial tundra, jagged permafrost terrain, pool of melted water at monolith base",
"background_elements": "Holographic galaxy projection expanding from contact point, sky fracturing into geometric Voronoi shards, digital glitch artifacts blending with clouds",
"atmosphere": "Dense freezing fog, suspended ice particles, steam rising from the thermal reaction, volumetric density"
},
"lighting": {
"source_angle": "Low angle upward cast from the monolith's internal glow, ambient glacial twilight from above",
"kelvin_quality": "Internal Pulse: Electric Cyan (9000K), Ambient: Deep Arctic Blue (7500K), Hologram: Spectrum White",
"visual_effects": "Ray-traced caustics on wet ice, anamorphic lens flares, chromatic aberration on sky fractures, global illumination"
},
"camera_specs": {
"gear_lens": "Phase One XF IQ4 150MP, Rodenstock 28mm wide-angle lens",
"aperture_iso": "f/8, ISO 50, 1/250s shutter",
"film_finish": "Kodak Ektar 100 simulation, bleach bypass, high contrast, sharp texture detail, 8k resolution"
}
}
Make a full-body turnaround sheet of this exact character. Four full-body poses on a pure white background: front view, left profile, back view, and right profile. Evenly spaced in a horizontal row, all in a consistent style.
A cinematic portrait of an elderly Japanese fisherman standing on a quiet harbor at dawn, his weathered face marked by deep wrinkles and salt-stained skin. Soft golden morning light illuminates his features from the side, casting long shadows and highlighting the texture of his beard and worn jacket. The background shows calm water, wooden boats gently floating, and mist rising from the sea. Shot with shallow depth of field, ultra-realistic, 85mm lens look, natural color grading, calm and contemplative mood.
A dramatic close-up portrait of a young woman with freckles and emerald-green eyes, standing in heavy rain at night. Neon city lights reflect off the wet pavement behind her, creating vibrant bokeh in shades of cyan and magenta. Raindrops cling to her hair and eyelashes, illuminated by a strong backlight. High contrast lighting, cinematic cyberpunk atmosphere, ultra-detailed skin texture, emotional and intense expression.
An ultra-detailed macro photograph of a jumping spider’s face, capturing its multiple reflective eyes and fine hair texture. Natural daylight softly illuminates the subject, revealing vivid colors and microscopic details. Shallow depth of field with a smooth green background blur. Hyper-realistic macro photography, scientific yet visually striking.
An extreme macro photograph of a precision mechanical watch movement, showing interlocking gears, micro-screws, and polished metal components. Every surface reveals fine machining marks and subtle reflections. Soft diffused studio lighting highlights the metallic textures and depth of field is razor-thin, with only a few gears in sharp focus. Ultra-high detail, photorealistic macro photography, technical and elegant mood.
A futuristic armored soldier sprinting across a high-tech command center while holographic interfaces flicker around him. The camera is positioned low and slightly tilted, creating a dynamic diagonal composition. Motion blur trails behind his moving limbs, while his helmet and weapon remain sharply in focus. Blue and teal lights streak across the scene, with sparks and floating particles enhancing the sense of speed. Cinematic action shot, AAA video game key art, intense and tactical atmosphere.
A woman in a metallic fashion dress captured mid-spin in a minimalist studio. The camera is angled slightly above and off-center, emphasizing the flowing movement of the fabric. The dress creates sweeping motion lines as it catches the light, while her body remains sharply defined. Strong directional lighting sculpts the form, with soft shadows trailing behind the movement. High-fashion editorial photography, energetic and expressive mood.
A dark fantasy warrior mid-swing with a massive sword inside a ruined cathedral. The camera captures the scene from a low-angle perspective, close to the ground, emphasizing power and momentum. The warrior’s cape and hair flow through the air, with debris and dust kicked up around his feet. Moonlight cuts through broken arches, creating dramatic highlights and deep shadows. Dynamic action pose, epic fantasy game art, violent and heroic energy.
A high-bitrate sports broadcast still from an ultra-wide F1 on-board T-Cam, capturing a car violently attacking the Sainte-Dévote corner at the Monaco Grand Prix. The scene is dominated by extreme motion blur on the track surface, yellow-and-red curbing, and Armco barriers due to a slow shutter speed (1/60s look). Heat haze ripples visibly from the exhaust, carbon-fiber mirrors vibrate, and the front brake discs glow bright red. The steering wheel display clearly reads 'GEAR 4' and 'RPM 12000'. Harsh midday Mediterranean sunlight creates high-contrast shadows, glinting off metallic paint and showing realistic reflections on the driver's helmet visor. The background is a blur of the harbor and luxury yachts. Tire marbles and skid marks cover the track.
A sleek cybernetic panther stalks across a rain-soaked neon alley at midnight. Its matte black, segmented metal plating is accented by glowing blue circuit lines that pulse rhythmically along the muscles.
Droplets of rain glisten on the panther’s body, catching the shifting hues from overhead holographic billboards. Its eyes shine with a sharp, intelligent turquoise glow and low, electrical vapor drifts from its nostrils as it exhales.
The alley is lined with wet concrete, reflecting vibrant signs in pink, cyan, and green. Tense, cinematic composition with dramatic, angled lighting casts elongated shadows and intricate reflections beneath padded, clawed paws. No text appears anywhere in the image.
A voxel-style mech charging forward with heavy mechanical steps, captured from a low-angle perspective. Each movement displaces voxel debris and dust. Directional lighting creates dramatic contrast across the blocky forms. Modern voxel game aesthetic, powerful and kinetic.
A lone astronaut walking decisively across a vast desert landscape toward a towering alien structure. The camera is placed low and behind the subject, creating a sense of forward momentum and scale. Sand is kicked up by each step, frozen mid-air by harsh sunlight. The astronaut’s cape and equipment straps move with the wind. Cinematic sci-fi exploration shot, epic scale, sense of journey and motion.
Generate a polished advertising creative with lifestyle integration, proper lighting, and marketing-optimized composition.
Generate a polished advertising creative from the provided product image.
Place the product in a lifestyle context and add clear ad elements such as a bold headline, short benefit text, and a call-to-action label.
Use professional lighting, strong visual hierarchy, and modern layout design to highlight the product as the hero.
Create a final render of this sketch following these instructions:
Style: Recreate the character as a high-end, polished splash art illustration for a flagship game. Use a vibrant, painterly digital art style with soft, volumetric lighting and rich color depth. Shapes should be bold and well-defined with a smooth 'hand-painted' look. Ensure all surfaces are clean with no visible line art, balancing 3D depth with a professional illustrative finish. Isolated against a solid white background.
Character description: A humanoid horse character holding a fire sword.
Wooden pier extending into misty lake at dawn, calm water reflecting golden light, fog over water, mountain and forest silhouette in background, serene landscape photography
Traditional Alpine chalet on green mountain slope, wooden structure with stone foundation, scattered pine trees on rolling hillside, pastoral landscape, natural lighting
Northern lights dancing across the night sky above calm mountain lake, green and purple aurora reflected perfectly in still water, snow-capped peaks silhouetted against a starry sky, a single log cabin with warm window glow on the shoreline, long exposure photography
3D robot couple in American Gothic style, male robot in black suit with pitchfork, female robot in brown polka dot dress, farmhouse background with warm orange tones, retro-futuristic aesthetic
A dynamic anime cyberpunk girl with short black hair and green eyes in a black and cyan futuristic bodysuit, striking a powerful pose with motion blur speed lines and holographic sci-fi effects.
Colorful coral reef teeming with tropical fish, shafts of sunlight penetrating clear blue water, sea turtle swimming in mid-distance, school of yellow fish in the foreground, underwater photography, wide-angle perspective, National Geographic style
Ethereal figure with long platinum blonde hair wearing elegant black robes with ornate embroidery, holding medieval longsword with both hands, dramatic side lighting against textured stone wall, fantasy warrior aesthetic, cinematic depth, moody atmosphere
Neon-lit alleyway in a futuristic Tokyo, holographic advertisements floating in mid-air, a lone figure in a reflective raincoat walking away from camera, wet pavement reflecting purple and cyan lights, Blade Runner atmosphere, cinematic wide shot
Young chef preparing food in retrofuturistic kitchen, wearing white shirt and dark apron with patterned tie, oversized geometric eyeglasses, hands working on cutting board, neon pink lighting strips, synthwave color palette, vintage kitchen equipment, cinematic composition
A dramatic silhouette of a guitarist holding a sunburst electric guitar outlined by an orange backlight, set against a black circular backdrop with a sunset gradient in bold 80s synthwave style.
An ethereal woman in profile with flowing blonde hair merging into a starry galaxy, golden light particles scattered throughout, set against a deep blue circular background, in fine art photography style.
Close-up portrait with expressive eyes, soft skin tones, shallow depth of field, cinematic color grading, using reference photo
About the Provider
Grok Imagine is built by xAI, an AI company focused on advancing tools that expand human creativity and understanding.
Related models
Grok Imagine Video
Generate and edit videos using xAI's Grok model. Supports text-to-video, image-to-video (up to 15 s), and video editing (up to 8.7 s) at 480p or 720p resolution with flexible aspect ratios.
Grok Imagine Image Pro
Generate and edit high-resolution images using xAI's Grok Pro model, powered by Aurora. Supports text-to-image generation and image editing at up to 2K resolution, with multiple aspect ratios and batch output of up to 10 images.

