Create a response with the OpenAI-compatible Responses API
The `/v1/responses` API is an OpenAI Responses API-compatible endpoint for creating AI model responses. It supports:
- `POST /v1/responses` - Create a new response from the model
- `GET /v1/responses` - Returns endpoint information
- `GET /v1/responses/{id}` - Retrieve a stored response by ID
- `DELETE /v1/responses/{id}` - Delete a stored response (soft delete)

Request body parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | The model to use (e.g., `gpt-4o`, `claude-sonnet-4-20250514`) |
| `input` | string or array | Yes | The input prompt or array of input items |
| `instructions` | string | No | System instructions for the model |
| `max_output_tokens` | integer | No | Maximum tokens in the response (minimum: 16) |
| `temperature` | number | No | Sampling temperature (0-2). Not supported by reasoning models (o1, o3, etc.) |
| `top_p` | number | No | Nucleus sampling parameter (0-1). Not supported by reasoning models |
| `tools` | array | No | Array of function tools available to the model |
| `tool_choice` | string or object | No | Tool use: `auto`, `none`, `required`, or `{ "type": "function", "name": "..." }` |
| `parallel_tool_calls` | boolean | No | Allow multiple tool calls in parallel |
| `stream` | boolean | No | Enable streaming responses (default: `false`) |
| `store` | boolean | No | Store response for later retrieval (default: `true`) |
| `previous_response_id` | string | No | Link to a previous response for conversation threading |
| `reasoning` | object | No | Reasoning configuration for o1/o3 models |
| `text` | object | No | Text/format configuration |
| `metadata` | object | No | Custom metadata (max 16 keys; keys up to 64 chars, values up to 512 chars) |
| `truncation` | string | No | Truncation strategy: `auto` or `disabled` |
| `user` | string | No | Unique user identifier |
| `seed` | integer | No | Random seed for reproducibility |
| `background` | boolean | No | Enable background/async processing |
| `service_tier` | string | No | Service tier: `auto`, `default`, or `flex` |
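The parameters above can be combined into a minimal create request. The sketch below only builds the JSON body and headers; the base URL and API key are placeholders for your own deployment, and the (commented-out) send step assumes an HTTP client such as `requests`.

```python
import json

# Placeholder values -- substitute your deployment's base URL and key.
BASE_URL = "https://api.example.com"
API_KEY = "sk-..."

# Minimal body for POST /v1/responses: the two required parameters
# plus a few common optional ones.
payload = {
    "model": "gpt-4o",
    "input": "Write a one-sentence summary of the Responses API.",
    "instructions": "You are a concise technical writer.",
    "max_output_tokens": 256,   # must be >= 16
    "temperature": 0.7,         # omit for reasoning models (o1, o3, ...)
    "store": True,              # default; required for later retrieval
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
# To send: requests.post(f"{BASE_URL}/v1/responses", headers=headers, data=body)
```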
The `input` parameter accepts either a simple string or an array of input items.
| Type | Description |
|---|---|
| `message` | A message with a role and content |
| `function_call` | A tool/function call made by the model |
| `function_call_output` | The result of a tool/function call |
Supported message `role` values: `user`, `assistant`, `system`, `developer`.
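The three item types can be combined to replay a tool-call exchange in a single `input` array. The field names below (`call_id`, `name`, `arguments`, `output`) follow the usual Responses API item shapes, but treat this as an illustrative sketch rather than an exhaustive schema:

```python
# A multi-turn `input` array replaying a tool-call exchange.
input_items = [
    # The user's original message.
    {"type": "message", "role": "user",
     "content": "What's the weather in Paris?"},
    # The model's earlier tool call, echoed back as a function_call item.
    {"type": "function_call", "call_id": "call_123",
     "name": "get_weather", "arguments": '{"city": "Paris"}'},
    # Your tool's result, supplied as a function_call_output item
    # with the matching call_id.
    {"type": "function_call_output", "call_id": "call_123",
     "output": '{"temp_c": 18, "conditions": "cloudy"}'},
]
```

The `call_id` ties a `function_call_output` back to the `function_call` it answers.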
Content can be a string or an array of content parts:
| Type | Description |
|---|---|
| `input_text` | Text input |
| `input_image` | Image input (via URL or `file_id`) |
| `input_file` | File input |
| `output_text` | Text output (for assistant messages) |
| `refusal` | Model refusal |
The `detail` parameter for `input_image` parts can be: `auto`, `low`, or `high`.
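A message's `content` array can mix part types. A sketch of a user message combining `input_text` with an `input_image` (the URL is a placeholder):

```python
# A user message whose content is an array of parts: text plus an image.
message = {
    "type": "message",
    "role": "user",
    "content": [
        {"type": "input_text", "text": "Describe this image."},
        {"type": "input_image",
         "image_url": "https://example.com/photo.png",
         "detail": "auto"},  # auto, low, or high
    ],
}

part_types = [part["type"] for part in message["content"]]
```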
Only `function` type tools are currently supported. Built-in tools (`web_search_preview`, `file_search`, `code_interpreter`, `mcp`, `image_generation`) are not yet available.
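A `function` tool entry pairs a JSON Schema for its arguments with an optional `tool_choice` that forces a specific call. The tool name and schema below are illustrative:

```python
# A single function tool the model may call.
tools = [
    {
        "type": "function",            # only supported tool type today
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {                # JSON Schema for the arguments
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

# Force the model to call this specific function
# (alternatives: "auto", "none", "required").
tool_choice = {"type": "function", "name": "get_weather"}
```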
The `reasoning` object accepts:

| Parameter | Values | Description |
|---|---|---|
| `effort` | `low`, `medium`, `high` | How much effort the model puts into reasoning |
| `summary` | `none`, `auto`, `detailed`, `concise` | Reasoning summary format |
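A request to a reasoning model might look like the following sketch (the model name is illustrative; note that `temperature` and `top_p` are deliberately omitted, since reasoning models reject them):

```python
# Request body for a reasoning model with explicit reasoning config.
payload = {
    "model": "o3",   # illustrative reasoning-model name
    "input": "Prove that the sum of two even numbers is even.",
    "reasoning": {
        "effort": "medium",   # low, medium, or high
        "summary": "auto",    # none, auto, detailed, or concise
    },
    # temperature / top_p omitted: not supported by reasoning models
}
```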
The `text` parameter controls the output format:

- `{ "type": "text" }` - Plain text (default)
- `{ "type": "json_object" }` - JSON object output
- `{ "type": "json_schema", "json_schema": { ... } }` - Structured JSON with schema

Response object fields:

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique response identifier (format: `resp_*`) |
| `object` | string | Always `"response"` |
| `created_at` | integer | Unix timestamp of creation |
| `model` | string | Model used for the response |
| `status` | string | Response status |
| `output` | array | Array of output items |
| `output_text` | string | Convenience field with concatenated text output |
| `usage` | object | Token usage statistics |
| `error` | object | Error details (if `status` is `failed`) |
| `incomplete_details` | object | Details if `status` is `incomplete` |
| `metadata` | object | Custom metadata (if provided) |
| `service_tier` | string | Service tier used |
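Parsing a response body is straightforward; the example below fabricates an illustrative body (not captured from a real server) with the documented fields and reads back the `output_text` convenience field:

```python
import json

# Illustrative response body using the documented fields.
raw = json.dumps({
    "id": "resp_abc123",
    "object": "response",
    "created_at": 1700000000,
    "model": "gpt-4o",
    "status": "completed",
    "output": [
        {"type": "message", "role": "assistant",
         "content": [{"type": "output_text", "text": "Hello!"}]}
    ],
    "output_text": "Hello!",
    "usage": {"input_tokens": 12, "output_tokens": 3, "total_tokens": 15},
})

resp = json.loads(raw)
# output_text mirrors the concatenated text in output[].content[].
text = resp["output_text"]
```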
Possible `status` values:

| Status | Description |
|---|---|
| `queued` | Background request is queued |
| `in_progress` | Request is being processed |
| `completed` | Request completed successfully |
| `incomplete` | Response was truncated |
| `failed` | Request failed with error |
| `cancelled` | Request was cancelled |
Streaming responses (`stream: true`) emit server-sent events:

| Event | Description |
|---|---|
| `response.created` | Response object created |
| `response.in_progress` | Processing started |
| `response.output_item.added` | New output item started |
| `response.output_item.done` | Output item completed |
| `response.content_part.added` | Content part started |
| `response.content_part.done` | Content part completed |
| `response.output_text.delta` | Incremental text chunk |
| `response.output_text.done` | Text content completed |
| `response.function_call_arguments.delta` | Incremental function arguments |
| `response.function_call_arguments.done` | Function call completed |
| `response.completed` | Response completed successfully |
| `response.incomplete` | Response truncated |
| `response.failed` | Response failed |
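A client typically accumulates `response.output_text.delta` events into the final text. The sketch below parses a truncated sample of SSE lines (illustrative payloads, not captured from a real server):

```python
import json

# Truncated sample of the SSE lines a streaming response emits.
sse_lines = [
    'event: response.created',
    'data: {"type": "response.created"}',
    'event: response.output_text.delta',
    'data: {"type": "response.output_text.delta", "delta": "Hel"}',
    'event: response.output_text.delta',
    'data: {"type": "response.output_text.delta", "delta": "lo"}',
    'event: response.output_text.done',
    'data: {"type": "response.output_text.done", "text": "Hello"}',
    'event: response.completed',
    'data: {"type": "response.completed"}',
]

# Accumulate text from output_text.delta events only.
chunks = []
for line in sse_lines:
    if not line.startswith("data: "):
        continue
    event = json.loads(line[len("data: "):])
    if event["type"] == "response.output_text.delta":
        chunks.append(event["delta"])

full_text = "".join(chunks)
```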
Example response ID: `id: "resp_abc123"`. Pass it to `GET /v1/responses/{id}` to retrieve the stored response.
`previous_response_id` requires authentication and `store: true` (the default) on previous responses.
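Threading is a matter of passing the prior response's `id` into the next request. A sketch of a second conversation turn (the ID is illustrative):

```python
# Second turn of a conversation, linked via previous_response_id.
followup = {
    "model": "gpt-4o",
    "input": "Now shorten that answer to one sentence.",
    # ID returned by the first request; requires store: true (the
    # default) on that response and the same authenticated account.
    "previous_response_id": "resp_abc123",
}
```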
The referenced previous response must have a `status` of `completed`, `failed`, or `incomplete`.
Constraints:
- `stream: true`
- `404` - Response not found or belongs to a different account
- `401` - Authentication required/invalid

Error types:

| Type | HTTP Status | Description |
|---|---|---|
| `invalid_request_error` | 400 | Invalid request parameters |
| `authentication_error` | 401 | Missing or invalid API key |
| `permission_denied_error` | 403 | Insufficient permissions |
| `not_found_error` | 404 | Resource not found |
| `rate_limit_error` | 429 | Rate limit exceeded |
| `api_error` | 500 | Internal server error |
| `server_error` | 503 | Service unavailable |
Error code values:

| Code | Description |
|---|---|
| `missing_required_parameter` | Required parameter not provided |
| `model_not_found` | Specified model does not exist |
| `response_not_found` | Response ID not found |
| `invalid_response_id` | Invalid response ID format |
| `authentication_required` | No API key provided |
| `invalid_api_key` | API key is invalid or inactive |
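A client can use the error `type` to decide whether a retry makes sense. The sketch below assumes the common OpenAI-style error envelope `{"error": {"type": ..., "code": ..., "message": ...}}`; the retry policy itself is a local convention, not part of the API:

```python
import json

# Error types worth retrying with backoff -- a local convention.
RETRYABLE = {"rate_limit_error", "api_error", "server_error"}

def should_retry(error_body: str) -> bool:
    """Return True if the error response warrants a retry with backoff."""
    err = json.loads(error_body).get("error", {})
    return err.get("type") in RETRYABLE

# Illustrative 429 error body.
sample = json.dumps({"error": {
    "type": "rate_limit_error",
    "code": None,
    "message": "Rate limit exceeded",
}})
```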
The `web_search_preview`, `file_search`, `code_interpreter`, `mcp`, and `image_generation` tool types are not yet available. Use `function` type tools instead.

The `o3-deep-research` and `o4-mini-deep-research` models are not supported. Use `/v1/chat/completions` for these models.

Response headers:

| Header | Description |
|---|---|
| `X-Request-ID` | Unique request/response identifier |
| `Content-Type` | `application/json` or `text/event-stream` |
Authentication uses a Bearer header of the form `Bearer <token>`, where `<token>` is your auth token.