Overview
The NanoGPT API provides image generation through an OpenAI-compatible endpoint. This guide covers text-to-image and image-to-image flows, response formats, and best practices.

OpenAI-Compatible Endpoint (v1/images/generations)
This endpoint follows the OpenAI specification for image generation and returns a list of base64-encoded images by default (or hosted URLs when `response_format: "url"` is set).
Basic Image Generation
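A minimal text-to-image call might look like the sketch below. The base URL is an assumption here (confirm the exact endpoint path against the NanoGPT docs); `bagel` is used as the model because the model list later in this guide describes it as supporting text-to-image.

```python
import base64
import json
import urllib.request

# Assumed base URL -- verify the exact path in the NanoGPT documentation.
API_URL = "https://nano-gpt.com/api/v1/images/generations"
API_KEY = "your-api-key"

def build_request(prompt: str, model: str = "bagel") -> dict:
    """Build the JSON body for a basic text-to-image call."""
    return {"model": model, "prompt": prompt, "n": 1}

def generate(prompt: str) -> bytes:
    """Send the request and decode the first base64 image (the default format)."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # Default response format: data[i].b64_json holds the encoded image.
    return base64.b64decode(payload["data"][0]["b64_json"])

if __name__ == "__main__":
    png = generate("a watercolor lighthouse at dusk")
    with open("lighthouse.png", "wb") as f:
        f.write(png)
```

The same request body works with any OpenAI-compatible client; only the base URL and API key change.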
Return Hosted URLs (`response_format: "url"`)
By default, responses include base64 strings. Ask for hosted URLs instead:

- Output: `data` will usually contain `{ url: "https://..." }` entries instead of `b64_json` (each `data[i]` is one or the other, never both).
- Fallback: if URL generation (upload/presign) fails, the API may return `b64_json` even when you requested `response_format: "url"`.
- Inputs: `imageDataUrl` accepts data URLs; download and convert remote images before sending.
- Retention: the returned `data[i].url` is a signed, expiring URL (~1 hour). Generated files are kept for 24 hours, then permanently deleted. Download and store them elsewhere if you need longer retention.
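Because each `data[i]` entry holds either `url` or `b64_json`, client code should handle both so the documented fallback does not break it. A sketch (the base URL is an assumption; check the NanoGPT docs):

```python
import base64
import json
import urllib.request

API_URL = "https://nano-gpt.com/api/v1/images/generations"  # assumed base URL

def extract_image(entry: dict) -> bytes:
    """Return raw image bytes from one data[i] entry.

    Handles both shapes: a hosted `url` (the requested format) and the
    `b64_json` fallback the API may return if URL generation fails."""
    if "url" in entry:
        # The signed URL expires in about an hour -- fetch it promptly.
        with urllib.request.urlopen(entry["url"]) as resp:
            return resp.read()
    return base64.b64decode(entry["b64_json"])

def generate_urls(prompt: str, api_key: str) -> list[bytes]:
    body = json.dumps({
        "model": "bagel",           # any text-to-image model
        "prompt": prompt,
        "response_format": "url",   # ask for hosted URLs
    }).encode()
    req = urllib.request.Request(API_URL, data=body, headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return [extract_image(e) for e in payload["data"]]
```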
Downloading the image
If you requested `response_format: "url"`, download the image with a normal HTTP GET to `data[i].url`. Because the URL is signed and expires, download it soon after generation.

Image-to-Image with OpenAI Endpoint
The OpenAI-compatible endpoint also supports img2img generation via the `imageDataUrl` parameter. There are several ways to provide input images:
Uploads sent via the API must be 4 MB or smaller. Compress or resize larger files before converting to a data URL.
Method 1: Upload from Local File
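A helper for the local-file case might look like this sketch: read the file, enforce the 4 MB limit described above, and encode it as a data URL suitable for `imageDataUrl`.

```python
import base64
import mimetypes

MAX_BYTES = 4 * 1024 * 1024  # API upload limit: 4 MB

def file_to_data_url(path: str) -> str:
    """Read a local image and encode it as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime not in ("image/jpeg", "image/png", "image/webp", "image/gif"):
        raise ValueError(f"unsupported image type: {mime}")
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) > MAX_BYTES:
        raise ValueError("image exceeds the 4 MB upload limit; resize or compress first")
    return f"data:{mime};base64,{base64.b64encode(raw).decode()}"
```

The result is passed in the request body, e.g. `"imageDataUrl": file_to_data_url("photo.png")`.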
Method 2: Upload from Image URL
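Since `imageDataUrl` accepts only data URLs, remote images must be downloaded client-side and converted before sending. A minimal sketch:

```python
import base64
import urllib.request

def url_to_data_url(image_url: str) -> str:
    """Download a remote image and convert it to a base64 data URL.

    The API does not fetch remote URLs itself, so the conversion has to
    happen on the client before the generation request is sent."""
    with urllib.request.urlopen(image_url) as resp:
        mime = resp.headers.get_content_type()  # e.g. "image/jpeg"
        raw = resp.read()
    return f"data:{mime};base64,{base64.b64encode(raw).decode()}"
```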
Method 3: Multiple Images (for supported models)
Some models, such as `gpt-4o-image`, `flux-kontext`, and `gpt-image-1`, support multiple input images.
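A multi-image request body might look like the following sketch. Note that the plural parameter name `imageDataUrls` used here is an assumption, not confirmed by this guide; verify the exact parameter name for your model in the NanoGPT docs.

```python
def build_multi_image_request(prompt: str, data_urls: list[str],
                              model: str = "gpt-image-1") -> dict:
    """Sketch of a request body with several input images.

    ASSUMPTION: the parameter name `imageDataUrls` is hypothetical --
    check the documentation for the model you are calling."""
    return {
        "model": model,
        "prompt": prompt,
        "imageDataUrls": list(data_urls),  # hypothetical plural parameter
    }
```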
Supported Image-to-Image Models
The following models support image input via the OpenAI endpoint:

Single Image Input:
- `flux-dev-image-to-image` - Image-to-image only
- `ghiblify` - Transform images to Studio Ghibli style
- `gemini-flash-edit` - Edit images with prompts
- `hidream-edit` - Advanced image editing
- `bagel` - Both text-to-image and image-to-image
- `SDXL-ArliMix-v1` - Artistic transformations
- `Upscaler` - Upscale images to higher resolution

Multiple Image Input:
- `flux-kontext` - Advanced context-aware generation
- `flux-kontext/dev` - Development version (image-to-image only)
- `gpt-4o-image` - GPT-4 powered image generation
- `gpt-image-1` - Advanced multi-image processing

Inpainting:
- `flux-lora/inpainting` - Requires both `imageDataUrl` (base image) and `maskDataUrl` (mask)
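Inpainting requests supply both parameters named above. A sketch of the request body (the convention for which mask regions are repainted varies by model, so check the model's docs):

```python
def build_inpainting_request(prompt: str, image_data_url: str,
                             mask_data_url: str) -> dict:
    """Request body for flux-lora/inpainting: both the base image and the
    mask are required, each as a base64 data URL."""
    return {
        "model": "flux-lora/inpainting",
        "prompt": prompt,
        "imageDataUrl": image_data_url,  # base image
        "maskDataUrl": mask_data_url,    # mask selecting the region to edit
    }
```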
Image Data URL Format
All images must be provided as base64-encoded data URLs. Supported MIME types:

- `image/jpeg` or `image/jpg`
- `image/png`
- `image/webp`
- `image/gif` (first frame only)
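Validating a data URL before sending catches malformed inputs client-side instead of burning a request on a 400. A small checker for the format above:

```python
ALLOWED_MIME_TYPES = {"image/jpeg", "image/jpg", "image/png", "image/webp", "image/gif"}

def validate_data_url(data_url: str) -> str:
    """Check that a string is a well-formed 'data:<mime>;base64,<payload>'
    URL with a supported MIME type; return the MIME type or raise."""
    if not data_url.startswith("data:"):
        raise ValueError("not a data URL")
    header, sep, _payload = data_url.partition(",")
    if not sep or not header.endswith(";base64"):
        raise ValueError("expected 'data:<mime>;base64,<payload>'")
    mime = header[len("data:"):-len(";base64")]
    if mime not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime}")
    return mime
```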
Additional Parameters for Image-to-Image
When using image-to-image models, you can also include additional parameters, for example a negative prompt, step count, and guidance scale (see Best Practices below).

Best Practices
1. Prompt Engineering
   - Be specific and detailed in your prompts
   - Include style references when needed
   - Use the negative prompt to avoid unwanted elements
   - For img2img, describe the changes you want relative to the input image
2. Image Quality
   - Higher resolution settings produce better quality but take longer
   - More steps generally mean better quality but slower generation
   - Adjust the guidance scale based on how closely you want to follow the prompt
   - For img2img, ensure your input image is high quality for best results
3. Cost Optimization
   - Start with lower resolution for testing
   - Use fewer steps during development
   - Generate one image at a time unless multiple variations are needed
   - Compress input images appropriately to reduce upload size
Error Handling
The API may return various error codes:

- 400: Bad Request (invalid parameters)
- 401: Unauthorized (invalid API key)
- 429: Too Many Requests (rate limit exceeded)
- 500: Internal Server Error
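Of these, 429 (and often transient 500s) are worth retrying with backoff, while 400 and 401 indicate caller errors that retries cannot fix. A sketch of that policy:

```python
import time
import urllib.error
import urllib.request

RETRYABLE = {429, 500}  # rate limit and transient server errors

def should_retry(status: int) -> bool:
    """400/401 are caller errors (bad parameters, bad key): never retry them."""
    return status in RETRYABLE

def post_with_retry(req: urllib.request.Request, attempts: int = 3) -> bytes:
    """POST with simple exponential backoff on retryable status codes."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if not should_retry(err.code) or attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # back off: 1s, 2s, ...
    raise RuntimeError("unreachable")
```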