Documentation Index
Fetch the complete documentation index at: https://docs.nano-gpt.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
The NanoGPT Batch API is an OpenAI-compatible way to run large numbers of chat completion requests asynchronously. It is best for offline workloads where latency is not important, such as classification, summarization, evals, synthetic data generation, document processing, and GPT image analysis. Batch jobs upload a JSONL file, create a batch, poll the batch, and download the output file after completion.Base URL
Use the dedicated NanoGPT API host for Batch API requests:Supported Endpoints
Files
POST /filesGET /files/{file_id}GET /files/{file_id}/content
Batches
POST /batchesGET /batches/{batch_id}GET /batchesPOST /batches/{batch_id}/cancel
Supported Batch Request Endpoint
Only this endpoint is supported inside batch JSONL rows:/v1/responses/v1/completions/v1/embeddings- Image generation, audio, video, transcription, TTS, moderation, and other non-chat endpoints
Supported Models
Batch jobs currently support supported OpenAI/GPT chat model IDs and the Claude model IDs listed below.GPT Models
Users can use supported direct GPT/OpenAI chat model IDs such as:gpt-5.4gpt-5.4-minigpt-5.4-nanogpt-5.4-progpt-5.3-chatgpt-5.3-chat-latestgpt-5.3-codexgpt-5.2gpt-5.2-chat-latestgpt-5.2-progpt-5.1gpt-5.1-chatgpt-5.1-chat-latestgpt-5.1-codexgpt-5.1-codex-maxgpt-5gpt-5-chat-latestgpt-5-minigpt-5-nanogpt-4.1gpt-4.1-minigpt-4.1-nanogpt-4ogpt-4o-minichatgpt-4o-latesto1o1-minio1-previewo3o3-minio4-mini
openai/ prefix is also accepted for supported OpenAI models. For example, openai/gpt-4.1 is normalized to gpt-4.1.
Claude Models
Supported Claude batch models:anthropic/claude-sonnet-4.5anthropic/claude-sonnet-4anthropic/claude-haiku-4.5anthropic/claude-opus-4.5anthropic/claude-opus-4.1anthropic/claude-opus-4
anthropic/ prefix may also work when they map to the same supported model. The anthropic/ prefixed form is recommended for documentation examples.
Claude thinking aliases are supported for compatible Claude models by adding a thinking suffix such as:
anthropic/claude-sonnet-4.5:thinking:lowanthropic/claude-sonnet-4.5:thinking:mediumanthropic/claude-sonnet-4.5:thinking:highanthropic/claude-sonnet-4.5:thinking:16000
max_tokens.
Request Rules
Each batch file must follow these rules:- File format must be JSONL: one JSON object per line.
- File upload must use
purpose=batch. - Every row must include a unique non-empty
custom_id. - Every row must use
method: "POST". - Every row must use
url: "/v1/chat/completions". - Every row must include a
bodyobject. - Every row must include
body.messages. - Every row must include
body.model. - Every row must include
body.max_tokensorbody.max_completion_tokens. - All rows in one file must use the same model.
- All rows in one file must use the same model family. Do not mix GPT and Claude rows in one batch.
- Streaming is not supported.
stream: trueis rejected. - GPT batch jobs support text and image-input chat messages.
- Claude batch jobs currently support text messages only.
- GPT image inputs must use OpenAI-compatible
image_urlcontent parts. - Image inputs must be placed in
usermessages. - Image URLs may be
http://,https://, or base64data:image/...;base64,...URLs. - Supported image media types are PNG, JPEG, GIF, and WebP.
- Tools, functions,
response_format, audio, and multimodal request bodies are not supported in this first version.
ngreater than1frequency_penaltypresence_penaltylogit_biaslogprobstop_logprobsseeduserservice_tierstore
Billing
Batch jobs use NanoGPT account balance. Subscription included-token logic is not applied to batch jobs. At batch creation time, NanoGPT checks the account balance against a conservative maximum liability estimate. This is based on the uploaded requests and each row’smax_tokens or max_completion_tokens.
The final charge is created after the batch reaches a terminal status and actual usage is available. Completed usage is billed once, idempotently. If a job produces no billable usage, no usage charge is created.
Batch jobs are discounted versus normal synchronous API usage. Use the model pricing page or API pricing endpoint as the source of truth for current prices.
Example Input File
Createbatch.jsonl:
Upload The File
Create The Batch
Poll The Batch
validatingin_progressfinalizingcompletedfailedexpiredcancellingcancelled
status is completed, output_file_id should be present. If individual rows failed, error_file_id may also be present.
Download Output
custom_id.
Example output line:
List Batches
limitafter
Cancel A Batch
Common Errors
Unsupported endpoint
The first version only supports/v1/chat/completions.
Missing max output cap
Every row must includemax_tokens or max_completion_tokens. This is required so NanoGPT can estimate the maximum possible liability before submitting the job.
Mixed model
All rows in one batch file must use the same model.Mixed model family
Do not mix GPT and Claude requests in one batch file.Streaming unsupported
Batch jobs are asynchronous and do not stream. Removestream: true.
Invalid image input
GPT image inputs must useimage_url parts with an http://, https://, or supported base64 data:image/...;base64,... URL. Local file paths, file:// URLs, unsupported media types, and malformed data URLs are rejected. Claude batch image inputs are not supported yet.