The company has announced the release of a next-generation model capable of handling complex visual tasks and producing more precise images. Beyond improvements in image quality, text generation has also significantly advanced.
Where AI-generated images were once easy to tell apart from real ones, the new model narrows that gap considerably thanks to an improved sense of composition and visual style. According to OpenAI, the model works accurately across multiple languages and draws on expanded world and stylistic knowledge, producing better results with fewer prompts and greater precision.
Images 2.0 is the first image model with reasoning capabilities. When handling complex tasks with OpenAI’s thinking and pro models, it can search for relevant information online, generate multiple images from a single prompt, and verify its own outputs.
“With both the intelligence of OpenAI’s reasoning models and a vast understanding of the visual world, this model moves image generation from rendering to strategic design, from a tool to a visual system, helping people turn ideas into outputs they can understand, share, teach with, and build from.” — the company notes.
What’s new in the model?
More accuracy and control
The new Images 2.0 model delivers significantly improved accuracy and detail in image generation. Previously, generated visuals could contain text errors, misrepresented elements, or fail to fully match user intent. Now, the model adheres closely to instructions, preserves details, and accurately renders subtle stylistic requirements.
Additionally, Images 2.0 incorporates knowledge updated through December 2025. This improves accuracy for educational materials and visual explainers, where relevance and correctness are especially critical.
Example provided by OpenAI. Infographic created with Images 2.0
Expanded multilingual capabilities
Previously, models performed better in English and other Latin-based languages, with reduced accuracy in others. OpenAI reports that this limitation has been addressed in Images 2.0, with significantly improved understanding and rendering of non-Latin languages.
Performance in Japanese, Korean, Chinese, Hindi, and Bengali has become notably more accurate. The model can now generate not only short labels but also complex, text-rich visuals where language is an integral part of the design. Examples shared by OpenAI include comics, manga, posters, and more.
Example provided by OpenAI. Japanese shonen adventure manga. Demonstrates how Images 2.0 works with non-Latin languages.
Visual styles and realism
The new model produces images with a higher level of realism by replicating photographic characteristics, including subtle imperfections that enhance authenticity. It can also generate cinematic stills, pixel art, and other styles with strong consistency in detail, lighting, and overall composition.
OpenAI demonstrated outputs for prompts such as “surreal portrait,” “disposable camera-style images,” “street photography,” and more.
Example provided by OpenAI. A page of a comic book in the style of Mid-Century Pastel Comic Art
OpenAI comments:
“As a result, the model can produce outputs that more faithfully reflect the style requested, rather than approximating it. This is especially useful for game prototyping, storyboarding, marketing creative, and creating assets in a particular medium or genre.”
The model also supports a wide range of aspect ratios, from ultra-wide (3:1) to vertical (1:3), depending on user needs. This makes it suitable for banners, presentations, posters, mobile screens, and social media content.
Example provided by OpenAI. Traditional long-format Chinese shan shui (山水画) landscape painting. Aspect ratio: landscape 3:1
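To make the aspect-ratio range concrete, here is a small illustrative helper (not part of any OpenAI API; the function name, the default long edge of 1536 px, and the output sizes are assumptions for illustration) that converts a target ratio into pixel dimensions at a fixed long edge:

```python
# Illustrative helper, not an OpenAI API: map an aspect ratio to pixel
# dimensions by fixing the length of the longer side.
def dimensions(aspect_w: int, aspect_h: int, long_edge: int = 1536) -> tuple[int, int]:
    """Scale the ratio aspect_w:aspect_h so its longer side equals long_edge."""
    if aspect_w >= aspect_h:
        # Landscape (or square): width is the long edge.
        return long_edge, round(long_edge * aspect_h / aspect_w)
    # Portrait: height is the long edge.
    return round(long_edge * aspect_w / aspect_h), long_edge

# Ultra-wide 3:1 banner vs. vertical 1:3 mobile screen:
print(dimensions(3, 1))  # (1536, 512)
print(dimensions(1, 3))  # (512, 1536)
```

The same ratio string a user would request (3:1, 16:9, 1:3) thus pins down concrete canvas sizes for banners, presentations, or mobile screens.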
Image creation as a collaborative process
As mentioned, the model can be used in thinking or pro modes. In thinking mode, it operates in a more agent-like manner: analyzing tasks, retrieving relevant information, transforming materials into clear visualizations, and planning image structure in advance.
It can also generate multiple related images at once, which is particularly useful for manga sequences, interior design variations, or a series of posters.
Integration with Codex
Images 2.0 is also available in Codex. This means users can generate and iterate on visual ideas within a single workspace. Designers and creators can explore multiple design directions, UI concepts, or prototypes, compare them, and select the strongest option.
The chosen concept can then be directly turned into a finished product, such as a website or presentation, without leaving Codex.
OpenAI notes that no separate API key is required – access is included with a standard ChatGPT subscription.
API (gpt-image-2)
The new capabilities can also be integrated into products via the gpt-image-2 API.
Developers and businesses can use the model to add image generation and editing features to their applications. This simplifies workflows for localized advertising, infographics, educational content, design tools, creative platforms, and website builders.
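As a hedged sketch of what integration might look like, the snippet below assumes gpt-image-2 slots into the existing `client.images.generate()` call of the official OpenAI Python SDK; the exact parameter set for this model is not confirmed by the announcement, and the prompt, file name, and `RUN_LIVE_CALL` flag are illustrative:

```python
# Hypothetical sketch: calling gpt-image-2 via the OpenAI Images API.
# Assumes the model uses the existing images.generate() interface.
import base64

def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Assemble request parameters for an image-generation call."""
    return {"model": "gpt-image-2", "prompt": prompt, "size": size}

RUN_LIVE_CALL = False  # set True (with OPENAI_API_KEY exported) to actually call the API

if RUN_LIVE_CALL:
    from openai import OpenAI  # pip install openai

    client = OpenAI()
    params = build_image_request("A labeled infographic of the water cycle")
    result = client.images.generate(**params)
    # The API returns base64-encoded image data.
    with open("infographic.png", "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))
```

Wrapping request assembly in a helper like this keeps application code decoupled from the model name, so switching between image models is a one-line change.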
OpenAI reports that companies are already using gpt-image-2 in production and have shared feedback:
“GPT Image 2 helps teams in Figma generate everything from text-rich visuals to photorealistic scenes, with major improvements in editing and aesthetics that give designers far more ways to shape and evolve the final result.” — Loredana Crisan, Chief Design Officer, Figma
“What surprised us most was the detail GPT Image 2 added. It introduced elements we hadn’t considered, like a “viral on TikTok” sticker—a smart creative choice designed to build hype. The model wasn’t just rendering images. It was interpreting briefs, understanding audiences, and making creative decisions behind the scenes. We’ve been measuring AI on technical outputs. The real shift is creative reasoning and design taste—and that shift just happened.” — Dwayne Koh, Creative Strategist, Canva
Limitations
OpenAI notes that Images 2.0 still faces challenges in accurately modeling physical objects, solving tasks like Rubik’s Cubes, handling dense or repetitive patterns such as sand, and correctly rendering hidden or reversed surfaces. Additionally, diagrams and labels may still require manual verification for accuracy.
How to get started with Images 2.0
Images 2.0 is now available to ChatGPT and Codex users. The thinking mode is accessible on Plus, Pro, and Business plans.
The gpt-image-2 model is available via API, with pricing depending on image quality and resolution.