POST /v1/responses
cURL
curl --request POST \
  --url https://nano-gpt.com/api/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "input": "<string>",
  "instructions": "<string>",
  "max_output_tokens": 17,
  "temperature": 1,
  "top_p": 0.5,
  "tools": [
    {}
  ],
  "tool_choice": "<string>",
  "parallel_tool_calls": true,
  "stream": false,
  "store": true,
  "previous_response_id": "<string>",
  "reasoning": {},
  "text": {},
  "metadata": {},
  "truncation": "auto",
  "user": "<string>",
  "seed": 123,
  "background": true,
  "service_tier": "auto"
}
'

Response

{
  "id": "<string>",
  "object": "<string>",
  "created_at": 123,
  "model": "<string>",
  "status": "queued",
  "output": [
    {}
  ],
  "output_text": "<string>",
  "usage": {},
  "error": {},
  "incomplete_details": {},
  "metadata": {},
  "service_tier": "<string>"
}

Overview

The /v1/responses API is an OpenAI Responses API-compatible endpoint for creating AI model responses. It supports:
  • Stateless and stateful (conversation threading) chat completions
  • Streaming responses via Server-Sent Events (SSE)
  • Background (async) processing for long-running requests
  • Response storage and retrieval
  • Function/tool calling support
  • Multimodal inputs (images, files) for supported models

Authentication

All requests require authentication via API key:
Authorization: Bearer YOUR_API_KEY
Or alternatively:
x-api-key: YOUR_API_KEY

Endpoints

  • POST /v1/responses - Create a new response from the model
  • GET /v1/responses - Returns endpoint information
  • GET /v1/responses/{id} - Retrieve a stored response by ID
  • DELETE /v1/responses/{id} - Delete a stored response (soft delete)

Create Response

Request

POST /v1/responses
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

Request Body

Parameter | Type | Required | Description
model | string | Yes | The model to use (e.g., gpt-4o, claude-sonnet-4-20250514)
input | string or array | Yes | The input prompt or array of input items
instructions | string | No | System instructions for the model
max_output_tokens | integer | No | Maximum tokens in the response (minimum: 16)
temperature | number | No | Sampling temperature (0-2). Not supported by reasoning models (o1, o3, etc.)
top_p | number | No | Nucleus sampling parameter. Not supported by reasoning models
tools | array | No | Array of function tools available to the model
tool_choice | string or object | No | Tool use: auto, none, required, or { type: "function", name: "..." }
parallel_tool_calls | boolean | No | Allow multiple tool calls in parallel
stream | boolean | No | Enable streaming responses (default: false)
store | boolean | No | Store response for later retrieval (default: true)
previous_response_id | string | No | Link to previous response for conversation threading
reasoning | object | No | Reasoning configuration for o1/o3 models
text | object | No | Text/format configuration
metadata | object | No | Custom metadata (max 16 keys, 64 char keys, 512 char values)
truncation | string | No | Truncation strategy: auto or disabled
user | string | No | Unique user identifier
seed | integer | No | Random seed for reproducibility
background | boolean | No | Enable background/async processing
service_tier | string | No | Service tier: auto, default, or flex

Input Types

The input parameter accepts either a simple string or an array of input items.

Simple String Input

{
  "model": "gpt-4o",
  "input": "What is the capital of France?"
}

Array Input

{
  "model": "gpt-4o",
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}

Input Item Types

Type | Description
message | A message with role and content
function_call | A tool/function call made by the model
function_call_output | The result of a tool/function call

Message Item

{
  "type": "message",
  "role": "user",
  "content": "Hello, how are you?"
}
Supported roles: user, assistant, system, developer.
Content can be a string or an array of content parts:
{
  "type": "message",
  "role": "user",
  "content": [
    { "type": "input_text", "text": "What's in this image?" },
    { "type": "input_image", "image_url": "https://example.com/image.jpg" }
  ]
}

Content Part Types

Type | Description
input_text | Text input
input_image | Image input (via URL or file_id)
input_file | File input
output_text | Text output (for assistant messages)
refusal | Model refusal

Image Input

{
  "type": "input_image",
  "image_url": "https://example.com/image.jpg",
  "detail": "auto"
}
The detail parameter can be: auto, low, or high.

Function Call Item

{
  "type": "function_call",
  "id": "fc_123",
  "call_id": "call_abc123",
  "name": "get_weather",
  "arguments": "{\"location\": \"Paris\"}"
}

Function Call Output Item

{
  "type": "function_call_output",
  "call_id": "call_abc123",
  "output": "{\"temperature\": 22, \"condition\": \"sunny\"}"
}

Tools (Function Calling)

Define functions that the model can call:
{
  "model": "gpt-4o",
  "input": "What's the weather in Paris?",
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City name"
          }
        },
        "required": ["location"]
      },
      "strict": false
    }
  ],
  "tool_choice": "auto"
}
Note: Only function type tools are currently supported. Built-in tools (web_search_preview, file_search, code_interpreter, mcp, image_generation) are not yet available.
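When the model decides to call a function, its output contains function_call items; your client runs the matching local code and sends back function_call_output items (as shown in the input-item examples above). A minimal Python sketch of that round trip, assuming a local handler registry of your own (the get_weather stub here is hypothetical):

```python
import json

def build_tool_outputs(output_items, handlers):
    """For each function_call item in a response's output, run the matching
    local handler and build the function_call_output item to send back."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue
        handler = handlers[item["name"]]
        args = json.loads(item["arguments"])  # arguments arrive as a JSON string
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],             # must echo the model's call_id
            "output": json.dumps(handler(**args)),  # output is also a JSON string
        })
    return results

# Stub handler for illustration only (a real one would query a weather API).
outputs = build_tool_outputs(
    [{"type": "function_call", "id": "fc_1", "call_id": "call_123",
      "name": "get_weather", "arguments": "{\"location\": \"Paris\"}"}],
    {"get_weather": lambda location: {"temperature": 22, "condition": "sunny"}},
)
```

Append these items to the original input array and POST again, as in the "Submitting Tool Results" example later in this document.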

Reasoning Configuration

For reasoning models (o1, o3, etc.):
{
  "model": "o3-mini",
  "input": "Solve this complex problem...",
  "reasoning": {
    "effort": "high",
    "summary": "auto"
  }
}
Parameter | Values | Description
effort | low, medium, high | How much effort the model puts into reasoning
summary | none, auto, detailed, concise | Reasoning summary format

Text/Format Configuration

Control response format:
{
  "model": "gpt-4o",
  "input": "List 3 colors",
  "text": {
    "format": { "type": "json_object" }
  }
}

Format Types

  • { "type": "text" } - Plain text (default)
  • { "type": "json_object" } - JSON object output
  • { "type": "json_schema", "json_schema": { ... } } - Structured JSON with schema

JSON Schema Format

{
  "text": {
    "format": {
      "type": "json_schema",
      "json_schema": {
        "name": "color_list",
        "schema": {
          "type": "object",
          "properties": {
            "colors": {
              "type": "array",
              "items": { "type": "string" }
            }
          }
        },
        "strict": true
      }
    }
  }
}

Response Format

Successful Response

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1699000000,
  "model": "gpt-4o",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "msg_xyz789",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris."
        }
      ]
    }
  ],
  "output_text": "The capital of France is Paris.",
  "usage": {
    "input_tokens": 15,
    "output_tokens": 10,
    "total_tokens": 25
  }
}

Response Fields

Field | Type | Description
id | string | Unique response identifier (format: resp_*)
object | string | Always "response"
created_at | integer | Unix timestamp of creation
model | string | Model used for the response
status | string | Response status
output | array | Array of output items
output_text | string | Convenience field with concatenated text output
usage | object | Token usage statistics
error | object | Error details (if status is failed)
incomplete_details | object | Details if status is incomplete
metadata | object | Custom metadata (if provided)
service_tier | string | Service tier used
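The output_text field is a convenience: it is the concatenation of every output_text part across the message items in output. If you are working from the raw output array (for example, after reassembling a streamed response), the same string can be rebuilt client-side. A minimal sketch:

```python
def collect_output_text(response):
    """Concatenate all output_text parts from a response's output array.
    Mirrors the convenience output_text field described above."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip function_call / reasoning items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)

resp = {
    "output": [{
        "type": "message", "role": "assistant", "status": "completed",
        "content": [{"type": "output_text",
                     "text": "The capital of France is Paris."}],
    }],
}
text = collect_output_text(resp)
```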

Response Status Values

Status | Description
queued | Background request is queued
in_progress | Request is being processed
completed | Request completed successfully
incomplete | Response was truncated
failed | Request failed with error
cancelled | Request was cancelled

Output Item Types

Message Output

{
  "type": "message",
  "id": "msg_123",
  "role": "assistant",
  "status": "completed",
  "content": [
    {
      "type": "output_text",
      "text": "Response text here"
    }
  ]
}

Function Call Output

{
  "type": "function_call",
  "id": "fc_123",
  "call_id": "call_abc",
  "name": "get_weather",
  "arguments": "{\"location\": \"Paris\"}",
  "status": "completed"
}

Reasoning Output (o1/o3 models)

{
  "type": "reasoning",
  "id": "reasoning_123",
  "content": [],
  "summary": [
    {
      "type": "summary_text",
      "text": "I analyzed the problem by..."
    }
  ]
}

Streaming

Enable streaming to receive incremental response updates:
{
  "model": "gpt-4o",
  "input": "Write a short story",
  "stream": true
}

Streaming Response

The response is delivered as Server-Sent Events (SSE):
data: {"type":"response.created","response":{...},"sequence_number":0}

data: {"type":"response.in_progress","response":{...},"sequence_number":1}

data: {"type":"response.output_item.added","output_index":0,"item":{...},"sequence_number":2}

data: {"type":"response.output_text.delta","output_index":0,"content_index":0,"delta":"The ","sequence_number":3}

data: {"type":"response.output_text.delta","output_index":0,"content_index":0,"delta":"capital ","sequence_number":4}

data: {"type":"response.output_text.done","output_index":0,"content_index":0,"text":"The capital of France is Paris.","sequence_number":10}

data: {"type":"response.completed","response":{...},"sequence_number":11}

data: [DONE]

Streaming Event Types

Event | Description
response.created | Response object created
response.in_progress | Processing started
response.output_item.added | New output item started
response.output_item.done | Output item completed
response.content_part.added | Content part started
response.content_part.done | Content part completed
response.output_text.delta | Incremental text chunk
response.output_text.done | Text content completed
response.function_call_arguments.delta | Incremental function arguments
response.function_call_arguments.done | Function call completed
response.completed | Response completed successfully
response.incomplete | Response truncated
response.failed | Response failed
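A client reconstructs the full text by concatenating the response.output_text.delta events in order. A minimal Python sketch of that parsing step, run here against a sample of the SSE format shown above (a real client would read line-by-line from the HTTP response and also handle the error and incomplete events):

```python
import json

def accumulate_sse_text(raw_stream):
    """Rebuild the output text from response.output_text.delta SSE events."""
    chunks = []
    for line in raw_stream.splitlines():
        if not line.startswith("data: "):
            continue                       # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break                          # end-of-stream sentinel
        event = json.loads(payload)
        if event.get("type") == "response.output_text.delta":
            chunks.append(event["delta"])
    return "".join(chunks)

sample = (
    'data: {"type":"response.created","sequence_number":0}\n\n'
    'data: {"type":"response.output_text.delta","delta":"The ","sequence_number":1}\n\n'
    'data: {"type":"response.output_text.delta","delta":"capital","sequence_number":2}\n\n'
    'data: [DONE]\n'
)
text = accumulate_sse_text(sample)
```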

Conversation Threading

Chain responses together for multi-turn conversations.

First Request

{
  "model": "gpt-4o",
  "input": "My name is Alice."
}
Response includes id: "resp_abc123"

Follow-up Request

{
  "model": "gpt-4o",
  "input": "What is my name?",
  "previous_response_id": "resp_abc123"
}
The model has access to the conversation history and responds: "Your name is Alice."
Note: previous_response_id requires authentication, and the previous response must have been stored (store: true, the default).
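The threading mechanics above can be sketched as a small Python loop that carries the last response id into each new request. The send function is injected so the sketch stays network-free; a real implementation would POST the body to /v1/responses and parse the JSON:

```python
def run_conversation(send, model, turns):
    """Drive a multi-turn conversation by threading previous_response_id.
    send(body) posts to /v1/responses and returns the parsed response."""
    last_id = None
    replies = []
    for turn in turns:
        body = {"model": model, "input": turn}
        if last_id is not None:
            body["previous_response_id"] = last_id  # thread onto the last turn
        resp = send(body)
        last_id = resp["id"]
        replies.append(resp.get("output_text", ""))
    return replies

# Fake transport for illustration; records each request body it receives.
sent = []
def fake_send(body):
    sent.append(body)
    return {"id": f"resp_{len(sent)}", "output_text": "ok"}

replies = run_conversation(fake_send, "gpt-4o",
                           ["My name is Alice.", "What is my name?"])
```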

Background Mode

For long-running requests, use background mode to receive an immediate response and poll for results.

Initiate Background Request

{
  "model": "gpt-4o",
  "input": "Write a detailed analysis...",
  "background": true
}

Immediate Response (202 Accepted)

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1699000000,
  "model": "gpt-4o",
  "status": "queued",
  "output": []
}

Poll for Completion

GET /v1/responses/resp_abc123
Authorization: Bearer YOUR_API_KEY
Keep polling until status is completed, failed, or incomplete.
Constraints:
  • Cannot be combined with stream: true
  • Requires authentication
  • Maximum processing time: approximately 800 seconds
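The polling step can be sketched as a small Python loop. The fetch function is injected (e.g. a wrapper around a GET to /v1/responses/{id}) so the sketch stays transport-agnostic; the 900-second default comfortably covers the ~800-second processing cap noted above:

```python
import time

TERMINAL = {"completed", "failed", "incomplete", "cancelled"}

def poll_response(fetch, response_id, interval=2.0, timeout=900.0):
    """Poll GET /v1/responses/{id} until the response reaches a terminal status.
    fetch(response_id) returns the parsed response object."""
    deadline = time.monotonic() + timeout
    while True:
        resp = fetch(response_id)
        if resp["status"] in TERMINAL:
            return resp
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{response_id} still {resp['status']} after {timeout}s")
        time.sleep(interval)

# Fake fetch for illustration: walks through the documented status values.
states = iter(["queued", "in_progress", "completed"])
result = poll_response(lambda rid: {"id": rid, "status": next(states)},
                       "resp_abc123", interval=0.0)
```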

Retrieve Response

GET /v1/responses/{id}
Authorization: Bearer YOUR_API_KEY

Response

Returns the full response object (same format as POST response).

Errors

  • 404 - Response not found or belongs to different account
  • 401 - Authentication required/invalid

Delete Response

DELETE /v1/responses/{id}
Authorization: Bearer YOUR_API_KEY

Response

{
  "id": "resp_abc123",
  "object": "response.deleted",
  "deleted": true
}

Error Handling

Error Response Format

{
  "error": {
    "type": "invalid_request_error",
    "message": "model is required",
    "param": "model",
    "code": "missing_required_parameter"
  }
}

Error Types

Type | HTTP Status | Description
invalid_request_error | 400 | Invalid request parameters
authentication_error | 401 | Missing or invalid API key
permission_denied_error | 403 | Insufficient permissions
not_found_error | 404 | Resource not found
rate_limit_error | 429 | Rate limit exceeded
api_error | 500 | Internal server error
server_error | 503 | Service unavailable

Common Error Codes

Code | Description
missing_required_parameter | Required parameter not provided
model_not_found | Specified model does not exist
response_not_found | Response ID not found
invalid_response_id | Invalid response ID format
authentication_required | No API key provided
invalid_api_key | API key is invalid or inactive
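A client can turn the error envelope above into a structured exception so callers can branch on type and code. A minimal sketch (the class and helper names here are illustrative, not part of any SDK):

```python
class ResponsesAPIError(Exception):
    """Structured wrapper for the error envelope shown above."""
    def __init__(self, status_code, error):
        self.status_code = status_code
        self.type = error.get("type")
        self.code = error.get("code")
        self.param = error.get("param")
        super().__init__(error.get("message", "unknown error"))

# Statuses worth retrying with backoff, per the table above.
RETRYABLE = {429, 500, 503}

def raise_for_error(status_code, body):
    """Raise ResponsesAPIError for non-2xx responses; no-op otherwise."""
    if 200 <= status_code < 300:
        return
    raise ResponsesAPIError(status_code, body.get("error", {}))

try:
    raise_for_error(400, {"error": {"type": "invalid_request_error",
                                    "message": "model is required",
                                    "param": "model",
                                    "code": "missing_required_parameter"}})
except ResponsesAPIError as exc:
    caught = exc
```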

Complete Examples

Simple Text Completion

curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "Explain quantum computing in one sentence."
  }'

Multi-turn Conversation

# First turn
curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "I want to learn Python programming."
  }'

# Second turn (using response ID from first request)
curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "Where should I start?",
    "previous_response_id": "resp_abc123"
  }'

Streaming Response

curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "Write a haiku about programming",
    "stream": true
  }'

Function Calling

curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "What is the weather in Tokyo?",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    ]
  }'

Submitting Tool Results

curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": "What is the weather in Tokyo?"
      },
      {
        "type": "function_call",
        "id": "fc_1",
        "call_id": "call_123",
        "name": "get_weather",
        "arguments": "{\"location\": \"Tokyo\"}"
      },
      {
        "type": "function_call_output",
        "call_id": "call_123",
        "output": "{\"temperature\": 18, \"condition\": \"cloudy\"}"
      }
    ]
  }'

Image Input (Vision)

curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [
          { "type": "input_text", "text": "What is in this image?" },
          { "type": "input_image", "image_url": "https://example.com/photo.jpg", "detail": "auto" }
        ]
      }
    ]
  }'

JSON Output

curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "List the planets in our solar system",
    "text": {
      "format": { "type": "json_object" }
    }
  }'

Background Processing

# Start background request
curl -X POST https://nano-gpt.com/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "Generate a comprehensive report...",
    "background": true
  }'

# Poll for results
curl https://nano-gpt.com/api/v1/responses/resp_abc123 \
  -H "Authorization: Bearer YOUR_API_KEY"

Limitations

  1. Built-in tools not supported: web_search_preview, file_search, code_interpreter, mcp, and image_generation tool types are not yet available. Use function type tools instead.
  2. Deep research models: Models like o3-deep-research and o4-mini-deep-research are not supported.
  3. GPU-TEE streaming: Streaming is not supported for GPU-TEE models. Use /v1/chat/completions for these models.
  4. Background mode: Maximum duration is approximately 800 seconds.
  5. Metadata limits: Maximum 16 keys, 64 character key names, 512 character values.
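The metadata limits in item 5 are easy to check client-side before sending a request. A minimal sketch:

```python
def validate_metadata(metadata):
    """Enforce the documented metadata limits client-side:
    at most 16 keys, key names <= 64 chars, values <= 512 chars."""
    if len(metadata) > 16:
        raise ValueError("metadata may have at most 16 keys")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long for key {key!r}")

validate_metadata({"session": "alpha", "source": "docs-example"})  # passes
```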

Response Headers

All responses include:
Header | Description
X-Request-ID | Unique request/response identifier
Content-Type | application/json or text/event-stream

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Parameters for the response request

model
string
required

Model ID to use for the response

input
required

Prompt string or array of input items

instructions
string

System instructions for the model

max_output_tokens
integer

Maximum tokens in the response

Required range: x >= 16
temperature
number

Sampling temperature (not supported by reasoning models)

Required range: 0 <= x <= 2
top_p
number

Nucleus sampling parameter

Required range: 0 <= x <= 1
tools
object[]

Function tools available to the model

tool_choice

How the model should use tools

parallel_tool_calls
boolean

Allow multiple tool calls in parallel

stream
boolean
default:false

Enable streaming responses

store
boolean
default:true

Store response for later retrieval

previous_response_id
string

Link to previous response for conversation threading

reasoning
object

Reasoning configuration for o1/o3 models

text
object

Text/format configuration

metadata
object

Custom metadata

truncation
enum<string>

Truncation strategy

Available options:
auto,
disabled
user
string

Unique user identifier

seed
integer

Random seed for reproducibility

background
boolean

Enable background/async processing

service_tier
enum<string>

Service tier

Available options:
auto,
default,
flex

Response

Response created

Response object returned by the Responses API

id
string
object
string
created_at
integer
model
string
status
enum<string>
Available options:
queued,
in_progress,
completed,
incomplete,
failed,
cancelled
output
object[]
output_text
string
usage
object
error
object
incomplete_details
object
metadata
object
service_tier
string