Khazen

By Jose Antonio Lanz, decrypt.co

OpenAI has been privately testing a new iteration of its generative artificial intelligence (AI) imaging model over the past several months, and early samples leaked by YouTuber MattVidPro show it outperforming previous image generators. “Extremely exciting, this blows anything we’ve seen before out of the water, it’s insane,” Matt said in a preview he posted to YouTube. “Midjourney cannot compete at this level. I don’t even think Midjourney version six would be able to compete at this level.” Don’t expect to try it out anytime soon, however: access is extremely limited. The unpublished model is likely an upgrade of DALL-E 2 and is being tested through an invite-only preview inside ChatGPT running GPT-4. Matt said only around 400 people worldwide have access to this new OpenAI image generator.

Though few in number, the image samples demonstrate the AI’s advanced skills. It produced sharp images with lighting and reflections that mimic real photos. The model recreated detailed paintings down to visible brush strokes. It also reproduced brand names like “Snickers” and logos of well-known brands like Subway flawlessly on generated products, and achieved reasonably good spelling in rendered text. While current image generators struggle with coherent hands, the examples showed realistic, properly proportioned hands. Backgrounds also appeared more convincing than those produced by competing AI systems. OpenAI apparently removed its safety filters to test the model’s full potential. Users said it can generate violent content and nudity without hesitation. However, given OpenAI’s stance on NSFW content, it is highly unlikely that an official public version will be released without such filters.

Some experts have criticized OpenAI for “dumbing down” its models to avoid potential controversy. Some studies even suggest that ChatGPT shows a strong political bias in its outputs. Nonetheless, the consistent quality shown in the samples is a leap forward, and it highlights OpenAI’s ongoing efforts to improve generative AI capabilities. The company may reveal more on its progress later this year, especially if advances in image recognition and generation help improve the robustness of its star product: a multimodal GPT-4 capable of understanding text, images, and drafts in one prompt. For now, the technology remains confined to closed testing with a minuscule number of users. As models continue to improve, the line between artificial and real will blur even further. While this excites many, concerns around misuse will persist. Building this technology responsibly remains an urgent challenge.