ERNIE Image
ERNIE Image is Baidu's text-to-image model built for accurate text rendering inside images. Great for posters, signage, infographics, and UI mockups.
ERNIE Image is a text-to-image model from Baidu designed for accurate text rendering inside images. It includes an optional lightweight Prompt Enhancer that expands short prompts into richer, more structured descriptions. You can adjust the number of inference steps (default 50) and the guidance scale to trade quality against speed, and choose from multiple resolution presets up to 2048×2048. It works especially well for posters, infographics, UI mockups, signage, editorial layouts, and any other output where readable text inside the image matters. It also follows complex instructions in both Chinese and English, performs well on structured image generation, and supports a wide range of styles, from realistic photography to design-focused and more stylized outputs.
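The tunable settings described above (inference steps, guidance scale, resolution preset, optional Prompt Enhancer) can be gathered into a single request object. The following is a minimal sketch, not the service's actual API: the class name, field names, payload keys, and preset names are all hypothetical, chosen only for illustration. Only the default of 50 steps and the 2048×2048 upper bound come from the description; the guidance scale default is not stated, so the sketch leaves it unset unless the caller supplies one.

```python
from dataclasses import dataclass
from typing import Optional

# Largest side length mentioned in the description (2048x2048).
MAX_SIDE = 2048

# Hypothetical preset names and sizes, for illustration only.
# The description confirms only that presets go up to 2048x2048.
PRESETS = {
    "square_1024": (1024, 1024),
    "square_2048": (2048, 2048),
}


@dataclass
class ImageRequest:
    """Hypothetical container for ERNIE Image generation settings."""

    prompt: str
    steps: int = 50                         # default stated in the description
    guidance_scale: Optional[float] = None  # default not stated; None = leave unset
    preset: str = "square_1024"             # hypothetical preset name
    enhance_prompt: bool = False            # toggles the optional Prompt Enhancer

    def to_payload(self) -> dict:
        """Validate the settings and return a plain dict (assumed payload shape)."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        if self.preset not in PRESETS:
            raise ValueError(f"unknown preset: {self.preset!r}")

        width, height = PRESETS[self.preset]
        if max(width, height) > MAX_SIDE:
            raise ValueError("resolution exceeds 2048x2048")

        payload = {
            "prompt": self.prompt,
            "steps": self.steps,
            "width": width,
            "height": height,
            "enhance_prompt": self.enhance_prompt,
        }
        # Only include guidance_scale when the caller chose a value.
        if self.guidance_scale is not None:
            payload["guidance_scale"] = self.guidance_scale
        return payload


if __name__ == "__main__":
    # Fewer steps for a quick draft, the stated default of 50 for a final render.
    draft = ImageRequest(prompt='Poster with the headline "GRAND OPENING"', steps=20)
    final = ImageRequest(
        prompt='Poster with the headline "GRAND OPENING"',
        preset="square_2048",
        enhance_prompt=True,
    )
    print(draft.to_payload())
    print(final.to_payload())
```

Keeping validation in one place makes the quality/speed trade-off explicit: lowering `steps` is the speed lever, while the preset bounds the output size. The actual parameter names and accepted ranges should be taken from the provider's documentation.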