Generated: 2026-04-26 · API: Gemini 2.5 Flash · Modes: Summary


Craig Does AI: JSON Prompts for Advanced ChatGPT Image 2.0 Control

Clip title: I Tested ChatGPT’s New Image 2.0 and Accidentally Stumbled Upon an Awesome Workflow
Author / channel: Craig Does AI
URL: https://www.youtube.com/watch?v=qXUww5tnLHs

Summary

The video presents a workflow the presenter found while testing GPT Image 2.0, which he suggests could make it a leading image-generation tool. He had previously built a custom “JSON Image Creator V.3” Gem for image prompting, and during recent testing with GPT Image 2.0 he stumbled upon an unexpected “hack.” The hack gives finer control over image generation and enables capabilities that are not widely known. He demonstrates the technique and shares the resources needed to replicate it: the Gem itself, its source files, and a detailed Notion document.

The core of the presenter’s method is to use JSON-structured prompts rather than simple text commands. He explains that JSON prompts are more structured than free text, which leads to more precise and consistent image generation. In an initial demonstration, he uses his JSON Image Creator V.3 to generate a detailed prompt for “a group of NASA astronauts on the moon playing kickball, but the ball floats away due to no gravity.” This JSON code, when submitted to ChatGPT (with GPT Image 2.0 handling image generation), produces a high-quality, realistic image. ChatGPT’s interface also allows aspect ratio adjustments (e.g., from square to 16:9 landscape or 9:16 portrait) and in-image editing, such as changing the color of the ball or, in another example, adding a fish to an eagle’s talons, all while keeping the rest of the image consistent.
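To make the idea of a JSON-structured prompt concrete, here is a minimal Python sketch that assembles one for the kickball example. The field names (`subject`, `setting`, `style`, `aspect_ratio`) and the helper `build_image_prompt` are assumptions chosen for illustration; the summary does not describe the schema the presenter's Gem actually produces.

```python
import json


def build_image_prompt(subject, setting, style="photorealistic", aspect_ratio="1:1"):
    """Return an image prompt as a JSON string.

    The keys used here are a hypothetical schema, not the presenter's.
    """
    prompt = {
        "subject": subject,
        "setting": setting,
        "style": style,
        "aspect_ratio": aspect_ratio,
    }
    return json.dumps(prompt, indent=2)


# Build the prompt text that would be pasted into the chat.
prompt_text = build_image_prompt(
    subject="a group of NASA astronauts playing kickball; the ball floats away",
    setting="the surface of the moon",
    aspect_ratio="16:9",
)
print(prompt_text)
```

Keeping each attribute in its own field is what makes later edits easy: changing the aspect ratio or the style means changing one value rather than rewording a sentence.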

The most notable finding, however, is a trick for generating consistent, sequential images, which is well suited to storyboarding. This feature is available only to paid ChatGPT users and requires activating a “Thinking” mode. Starting from the initial JSON prompt, users can instruct ChatGPT to create a series of images (ideally around five or six) that tell a continuous story, with each new image drawing its consistency from the one generated before it. The presenter showcases this with a humorous sequence of a chimpanzee and a miniature donkey-giraffe playing football, which moves from a yard to a street scene and ends with the chimp diving and being consoled by the donkey-giraffe.
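The storyboarding request described above can be sketched as a small Python helper that combines a base JSON prompt with an ordered list of story beats. The instruction wording, the function name `build_storyboard_request`, and the prompt fields are all assumptions for illustration; the summary does not quote the presenter's exact instruction.

```python
import json


def build_storyboard_request(base_prompt, beats):
    """Combine a base JSON prompt with ordered story beats into one instruction.

    The phrasing below is a hypothetical example, not the presenter's text.
    """
    if not beats:
        raise ValueError("at least one story beat is required")
    lines = [
        f"Using the JSON prompt below as the visual reference, "
        f"generate {len(beats)} images that tell one continuous story.",
        "Keep characters, style, and lighting consistent with the previous image.",
        "",
        json.dumps(base_prompt, indent=2),
        "",
    ]
    # Number the beats so each image maps to one step of the story.
    lines += [f"Image {i}: {beat}" for i, beat in enumerate(beats, start=1)]
    return "\n".join(lines)


base_prompt = {
    "subject": "a chimpanzee and a miniature donkey-giraffe playing football",
    "style": "photorealistic",
}
beats = [
    "they start a game in a yard",
    "the game moves out to the street",
    "the chimp dives for the ball",
    "the donkey-giraffe consoles the chimp",
]
request_text = build_storyboard_request(base_prompt, beats)
print(request_text)
```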

While this storyboarding hack makes multi-image narratives practical, the presenter notes a limitation: image quality can begin to degrade, with visible “artifacts,” after about the fifth or sixth image in a sequence. He suggests a workaround of copying the most consistent image into a new chat to maintain quality for longer narratives. Beyond photographic styles, the system also supports generating illustrations, which adds creative flexibility. Overall, the video presents a structured approach to AI image creation that improves control and enables multi-image narratives, which the presenter argues sets GPT Image 2.0 apart as a strong tool for creators.
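The new-chat workaround for longer narratives can be planned ahead of time. The Python sketch below splits a long list of story beats into batches, one batch per chat, so that no single chat is asked for more images than the point where degradation was reported. The default of five images per chat follows the presenter's "fifth or sixth image" observation; the function name `plan_chats` is hypothetical.

```python
def plan_chats(beats, max_per_chat=5):
    """Split story beats into per-chat batches of at most max_per_chat images."""
    if max_per_chat < 1:
        raise ValueError("max_per_chat must be at least 1")
    return [beats[i:i + max_per_chat] for i in range(0, len(beats), max_per_chat)]


# A 12-beat story is spread across three chats.
beats = [f"beat {n}" for n in range(1, 13)]
for chat_number, batch in enumerate(plan_chats(beats), start=1):
    print(f"Chat {chat_number}: {len(batch)} images")
```

Each new chat would start from the most consistent image of the previous batch, as the presenter suggests, so the batches stay visually linked.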