AI image generators have long struggled to render accurate text, producing distorted characters that readily give away their artificial origin. The latest image generation feature in ChatGPT, however, renders legible text better than any I've encountered.
Using the Images 2.0 feature, now available to every ChatGPT user regardless of subscription level, I asked it to turn a passage from one of my latest articles into a pencil sketch on a yellow notepad. The result looked remarkably authentic to me:
Next, I asked it to produce an infographic explaining AI tokens, telling it to search the web for reliable data first, use a serif typeface, and lay the image out in a 3:2 landscape format. The result was as follows:
I then asked Images 2.0 to create another infographic outlining the different Raspberry Pi variants, including technical specs and additional information:
In another test, I provided a photo of myself by the pool and asked the model to design a summer fashion catalog featuring various outfits with me as the model:
According to OpenAI, Images 2.0 is its first image-generation system with reasoning capabilities, allowing it to pause and analyze a prompt before it begins generating.
For text, the tool supports many languages written in non-Latin scripts, including Japanese, Korean, Chinese, Hindi, and Bengali, among others.
It can also search the web for current information before producing an image, and it can generate several images at once, which is useful for tasks like product catalogs, sequential comic panels, or storyboards.
OpenAI claims that Images 2.0 achieves an exceptional degree of precision and detail, which should improve its ability to closely match user descriptions in the resulting visuals.
This improved reliability speaks to a persistent question about AI image generators, which have mostly been used for novelties like humorous graphics or unsettling fake portraits: what are their genuine, everyday uses?
Possible real-world uses include rapid layout of printed matter, quick production of data visuals, and efficient generation of product displays, though correcting even a minor error would require regenerating the entire image.
Also, judging from my brief initial trials, extended use of Images 2.0 might yield images that start to look repetitive, which underscores the importance of an experienced prompt engineer with design expertise to guide the process.
Ben has covered consumer tech for over two decades and currently focuses on AI's impact on daily life. His reporting examines recent large language models and their applications in professional and personal settings, with the aim of helping readers navigate the upcoming AI transformations. As he notes, 'AI will reshape our world more rapidly than anticipated, and regular engagement is key to adjustment.' Since joining PCWorld in 2014, Ben has reported on products ranging from portable computers to surveillance devices, and he later established the publication's AI focus. His work has appeared in outlets including PC Magazine, TIME, Wired, CNET, Men's Fitness, Mobile Magazine, and others. He holds a master's degree in English literature.