OpenAI's image generation API is now open! Developers can "generate images with one click" too

An AI image generation revolution: developers can now easily build "one-click image generation" too!
Core content:
1. OpenAI opens its image generation API, powered by the new-generation multimodal model gpt-image-1
2. A do-it-all drawing tool: users worldwide generated more than 700 million images in one week
3. Flexible customization of style, size, color, and more, plus support for multiple images per request and very long prompts
Today, OpenAI officially opened its image generation API, powered by its new-generation multimodal model gpt-image-1, the "drawing brain" behind GPT-4o image generation in ChatGPT.
The drawing tool that works in every scenario is finally available!
Since GPT-4o launched image generation, in just one week 130 million users around the world have produced more than 700 million images, in styles ranging from anime, realism, and fairy tale to cyberpunk and flat design. The images swept social media and put the servers under heavy load.
Ghibli style aside, OpenAI's model wins not on nostalgia but on accurate prompt following, where it is ahead of the industry average. By comparison, although Midjourney V7 has added Draft Mode, it still lags in image consistency and context understanding.
However, these "magical experiences" were available only to consumer users, while enterprises and developers have long been waiting for access. Today, the API is finally here.
From now on, it is not just you who can play with it: your products can draw too.
Developers love it: flexible, sophisticated, and customizable
How good is this API? Here are a few key points to understand:
✅ Rich styles: whether it is Ghibli, cyberpunk, low-poly, or realistic, you can dial in the look you want in one sentence; resolutions up to 1536×1024 pixels are supported.
✅ Accurate text rendering: English is very stable; Chinese may occasionally fail, but it is already much better than the previous generation of models.
✅ Good context memory: you can continue the conversation and iterate on a creation without restating everything from the beginning.
✅ Freely adjustable parameters: size, color, and transparency can all be fine-tuned. Want a transparent background? Supported directly!
✅ Full range of formats: PNG, JPEG, and WebP are all supported, with a maximum single image size of 20MB and flexible output.
In addition, the Image API provides two core capabilities:
Generate images: draw from scratch based on text prompts.
✂️ Edit images: upload existing images and modify them partially or completely with new prompts.
It even supports very long prompts (up to 32,000 characters), an order of magnitude more than the DALL·E generation, so it really can listen to you ramble without getting annoyed.
To sum it up in one sentence: it doesn't matter if you can't draw; with this API you can be the “next-generation visual designer”.
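The options described above (size, format, transparency, prompt length) map onto keyword arguments of the generation call. The following is a minimal sketch rather than an official helper: `build_image_request` is a hypothetical function introduced here, and the accepted values it checks (the size list, the `quality` and `output_format` options, and the rule that a transparent background needs PNG or WebP) are assumptions based on the public API documentation, so verify them against the current reference before relying on them.

```python
# Hypothetical helper: assembles keyword arguments for client.images.generate().
# The allowed values are assumptions taken from the public API documentation.
MAX_PROMPT_CHARS = 32_000  # prompt limit quoted in this article
ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}
ALLOWED_QUALITIES = {"low", "medium", "high", "auto"}
ALLOWED_FORMATS = {"png", "jpeg", "webp"}


def build_image_request(prompt, size="1024x1024", quality="auto",
                        output_format="png", transparent=False, n=1):
    """Validate the options and return a dict of generate() keyword arguments."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if transparent and output_format == "jpeg":
        raise ValueError("a transparent background needs png or webp output")

    params = {
        "model": "gpt-image-1",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
        "n": n,
    }
    if transparent:
        params["background"] = "transparent"
    return params


params = build_image_request(
    "A low-poly fox sticker", size="1536x1024", transparent=True
)
```

With the OpenAI SDK installed and an API key configured, the resulting dict can be passed straight through as `client.images.generate(**params)`. Validating locally first means an unsupported combination fails before any request is billed.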
Developer Getting Started Example (Python + OpenAI SDK)
The gpt-image-1 API supports not only custom size, style, and transparent output but also generating multiple images at once (the n parameter supports up to 10 images). The following is a complete Python example that generates an image and saves it:
from openai import OpenAI
import base64

# The client reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

prompt = """A children's book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter."""

result = client.images.generate(
    model="gpt-image-1",
    prompt=prompt
)

# The image comes back as a base64-encoded string
image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)

# Save the image to a file
with open("otter.png", "wb") as f:
    f.write(image_bytes)
The script saves the generated image as otter.png.
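When n is greater than 1, the response carries several base64 strings instead of one. The sketch below shows one way to write them all to disk; `save_images` is a hypothetical helper, and the payloads here are placeholder bytes standing in for real API output so that the snippet runs without a network call.

```python
import base64
import tempfile
from pathlib import Path


def save_images(b64_images, prefix="otter", out_dir="."):
    """Decode base64 image strings and write them as numbered PNG files."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, b64_data in enumerate(b64_images, start=1):
        target = out_path / f"{prefix}-{index}.png"
        target.write_bytes(base64.b64decode(b64_data))
        saved.append(target)
    return saved


# Placeholder payloads (not real images) so the helper can be tried offline.
fake_payloads = [
    base64.b64encode(b"first").decode("ascii"),
    base64.b64encode(b"second").decode("ascii"),
]
demo_dir = tempfile.mkdtemp()
saved = save_images(fake_payloads, prefix="demo", out_dir=demo_dir)
```

Against the real API, the intended call would look like `save_images([item.b64_json for item in result.data])` after requesting, for example, `n=3` in `client.images.generate`.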
Image editing & reference images: not only drawing, but also modifying and learning
In addition to generating images from scratch, gpt-image-1 supports a full set of image editing and reference capabilities, combining "creation + editing + imitation":
✏️ Edit existing images: upload an image and add a new prompt to redraw the original or adjust its details.
Inpainting: upload an image plus a mask to precisely replace a specific area, for example to repair faces, fill in backgrounds, or remove watermarks.
Image reference generation: upload one or more reference images, and the AI extracts style, structure, or object features from them to generate new images that incorporate the reference content.
For example, you upload four pictures and tell the AI "generate a gift basket for me with these things in it", and it automatically combines them into one image, evolving from "knowing how to draw" to "knowing how to combine".
import base64
from openai import OpenAI

client = OpenAI()

prompt = """Generate a photorealistic image of a gift basket on a white background labeled 'Relax & Unwind' with a ribbon and handwriting-like font, containing all the items in the reference pictures."""

# Pass the reference images as a list of open binary file handles
result = client.images.edit(
    model="gpt-image-1",
    image=[
        open("body-lotion.png", "rb"),
        open("bath-bomb.png", "rb"),
        open("incense-kit.png", "rb"),
        open("soap.png", "rb"),
    ],
    prompt=prompt
)

image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)

# Save the image to a file
with open("gift-basket.png", "wb") as f:
    f.write(image_bytes)
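The inpainting mode uses the same edit endpoint with an extra `mask` argument. The sketch below is an assumption-laden illustration, not the official recipe: `inpaint` and `StubImages` are hypothetical names, the stub stands in for a real `OpenAI()` client so the flow can be dry-run offline, and the expectation that the mask is a PNG whose transparent areas mark the region to repaint comes from the public API documentation and should be checked there.

```python
import base64
import tempfile
from pathlib import Path
from types import SimpleNamespace


def inpaint(client, image_path, mask_path, prompt, model="gpt-image-1"):
    """Send an image plus mask to images.edit and return the decoded bytes."""
    with open(image_path, "rb") as image_file, open(mask_path, "rb") as mask_file:
        result = client.images.edit(
            model=model,
            image=image_file,
            mask=mask_file,  # assumed: transparent pixels mark the area to edit
            prompt=prompt,
        )
    return base64.b64decode(result.data[0].b64_json)


class StubImages:
    """Offline stand-in for client.images; records the argument names it saw."""

    def edit(self, **kwargs):
        self.seen = sorted(kwargs)
        payload = base64.b64encode(b"edited").decode("ascii")
        return SimpleNamespace(data=[SimpleNamespace(b64_json=payload)])


# Dry run with placeholder files; swap `stub` for OpenAI() to call the real API.
stub = SimpleNamespace(images=StubImages())
with tempfile.TemporaryDirectory() as tmp:
    image_path = Path(tmp) / "room.png"
    mask_path = Path(tmp) / "mask.png"
    image_path.write_bytes(b"placeholder image bytes")
    mask_path.write_bytes(b"placeholder mask bytes")
    edited = inpaint(stub, image_path, mask_path, "Replace the sofa with a bookshelf")
```

Taking the client as a parameter keeps the helper easy to exercise in tests without spending tokens, and the `with` block closes both file handles once the request returns.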
What about the price? Not the cheapest, but the price/performance ratio is really good
gpt-image-1 API billing is fine-grained and calculated by the number of tokens, as follows:
Text input (the prompt you write): 1 million tokens = $5
Image input (if you upload an image for reference): 1 million tokens = $10
Image output (the generated image): 1 million tokens = $40
In other words, depending on image size and quality, the price of a single image is roughly as follows:
Low-quality image: $0.02 (suitable for quick iterations and sketches)
⚖️ Medium-quality image: $0.07 (good enough for most daily use)
High-quality image: $0.19 (for commercial publication and print quality)
Although the unit price seems slightly higher than that of some competing products, considering its text rendering accuracy, context retention, and image consistency, it is worth the price.
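The token rates above turn into a per-request estimate with simple arithmetic. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and the token counts are made up for the example. The 4,750 output tokens are simply the count at which the $40 rate yields the $0.19 high-quality figure, not an official number, since real counts vary with image size and quality.

```python
# Per-million-token rates quoted in this article, in US dollars.
RATES_PER_MILLION = {
    "text_input": 5.0,
    "image_input": 10.0,
    "image_output": 40.0,
}


def estimate_cost(text_input=0, image_input=0, image_output=0):
    """Return the estimated charge in dollars for the given token counts."""
    usage = {
        "text_input": text_input,
        "image_input": image_input,
        "image_output": image_output,
    }
    return sum(
        tokens * RATES_PER_MILLION[kind] / 1_000_000
        for kind, tokens in usage.items()
    )


# Illustrative token counts only: a short prompt plus one generated image.
cost = estimate_cost(text_input=200, image_output=4750)
print(f"Estimated cost: ${cost:.4f}")
```

In this example the output tokens account for almost all of the total, which is why the per-image prices track image size and quality rather than prompt length.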