
Overview

The /v1/messages endpoint provides full Anthropic API compatibility. Clients using the Anthropic SDK can use NanoGPT by simply changing the base URL — no code changes required. This endpoint supports:
  • Text generation (streaming and non-streaming)
  • Multi-turn conversations
  • Tool use (function calling)
  • Vision (images) and document/PDF processing
  • Extended thinking (reasoning models)
  • Prompt caching

Endpoint

POST https://nano-gpt.com/api/v1/messages

Authentication

Use either header:
  • Authorization: Bearer YOUR_API_KEY
  • x-api-key: YOUR_API_KEY
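Both headers work with plain HTTP clients as well as the SDKs. A minimal sketch using Node 18+'s built-in fetch and the x-api-key header (model and message body are illustrative):

// Call /v1/messages directly with fetch, authenticating via x-api-key.
const response = await fetch("https://nano-gpt.com/api/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": process.env.NANOGPT_API_KEY
  },
  body: JSON.stringify({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 256,
    messages: [{ role: "user", content: "Hello!" }]
  })
});

const message = await response.json();
console.log(message.content[0].text);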

Request Format

Required Fields

Field       Type    Description
model       string  Model identifier (e.g., claude-3-5-sonnet-20241022)
max_tokens  number  Maximum tokens to generate (must be a finite number)
messages    array   Array of conversation messages

Optional Fields

Field                      Type              Default  Description
system                     string or array   -        System prompt (string or array of text blocks)
stream                     boolean           false    Enable streaming responses
temperature                number            -        Sampling temperature
top_p                      number            -        Nucleus sampling parameter
top_k                      number            -        Top-k sampling parameter
stop_sequences             string[]          -        Custom stop sequences
tools                      array             -        Tool definitions for function calling
tool_choice                string or object  -        Control tool selection behavior
disable_parallel_tool_use  boolean           -        Disable parallel tool calls
thinking                   object            -        Enable extended thinking for supported models
metadata                   object            -        Request metadata (user or user_id)

Message Format

Messages must have a role (user or assistant) and content:
{
  "role": "user",
  "content": "Hello!"
}
Or with structured content blocks:
{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/jpeg",
        "data": "<base64-encoded-image>"
      }
    }
  ]
}

Content Block Types

Text Block

{ "type": "text", "text": "Your message here" }

Image Block (for vision-capable models)

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "<base64-data>"
  }
}
Or with URL:
{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://example.com/image.jpg"
  }
}
Supported media types: image/jpeg, image/png, image/gif, image/webp

Document Block (for PDF-capable models)

{
  "type": "document",
  "source": {
    "type": "base64",
    "media_type": "application/pdf",
    "data": "<base64-data>"
  }
}
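As a sketch, a PDF can be sent the same way as an image via the Anthropic SDK; the file name and prompt below are illustrative:

import Anthropic from "@anthropic-ai/sdk";
import fs from "fs";

const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api"
});

// Read a local PDF (illustrative file name) and encode it as base64.
const pdfData = fs.readFileSync("report.pdf").toString("base64");

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this document." },
        {
          type: "document",
          source: {
            type: "base64",
            media_type: "application/pdf",
            data: pdfData
          }
        }
      ]
    }
  ]
});

console.log(message.content[0].text);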

Tool Use Block (in assistant messages)

{
  "type": "tool_use",
  "id": "tool_abc123",
  "name": "get_weather",
  "input": { "city": "Paris" }
}

Tool Result Block (in user messages)

{
  "type": "tool_result",
  "tool_use_id": "tool_abc123",
  "content": "The weather in Paris is sunny, 22 C"
}

Tool Definitions

{
  "tools": [
    {
      "name": "get_weather",
      "description": "Get current weather for a city",
      "input_schema": {
        "type": "object",
        "properties": {
          "city": { "type": "string", "description": "City name" }
        },
        "required": ["city"]
      }
    }
  ]
}

Tool Choice Options

Value                                  Description
"auto"                                 Model may use tools if appropriate
"none"                                 Disable tool use for this request
"any"                                  Model must use at least one tool
"required"                             Model must use at least one tool
{"type": "tool", "name": "tool_name"}  Force use of a specific tool

Extended Thinking

For models that support extended thinking (reasoning):
{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 8192
  }
}
Requirements:
  • budget_tokens must be >= 1024
  • budget_tokens must be < max_tokens
  • Model must support thinking (e.g., models with the -thinking suffix); see the SDK sketch below
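A minimal SDK sketch enabling thinking, assuming a model that exposes a -thinking variant (the model name below is illustrative):

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api"
});

// budget_tokens must be >= 1024 and strictly less than max_tokens.
const message = await anthropic.messages.create({
  model: "claude-sonnet-4-5-20250929-thinking", // illustrative: a model with the -thinking suffix
  max_tokens: 16000,
  thinking: { type: "enabled", budget_tokens: 8192 },
  messages: [{ role: "user", content: "How many prime numbers are below 100?" }]
});

// The response may contain thinking blocks followed by the final text block.
for (const block of message.content) {
  if (block.type === "text") console.log(block.text);
}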

Response Format

Non-Streaming Response

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-3-5-sonnet-20241022",
  "content": [
    { "type": "text", "text": "Hello! How can I help you today?" }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 10,
    "output_tokens": 12,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 0,
      "ephemeral_1h_input_tokens": 0
    },
    "service_tier": "standard"
  }
}

Stop Reasons

Stop Reason     Description
end_turn        Natural end of response
max_tokens      Hit the token limit
stop_sequence   Hit a custom stop sequence
tool_use        Model wants to use a tool
content_filter  Content was filtered

Streaming Response (SSE)

When stream: true, the response is delivered as Server-Sent Events with named event types:

Event: message_start

event: message_start
data: {"type": "message_start", "message": {"id": "msg_abc", "type": "message", "role": "assistant", "model": "claude-3-5-sonnet-20241022", "content": [], "stop_reason": null, "usage": {"input_tokens": 10, "output_tokens": 0}}}

Event: content_block_start

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

Event: content_block_delta

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

Event: content_block_stop

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

Event: message_delta

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn"}, "usage": {"output_tokens": 12}}

Event: message_stop

event: message_stop
data: {"type": "message_stop"}

Streaming Tool Use

When the model uses tools during streaming:
event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "tool_use", "id": "tool_abc", "name": "get_weather", "input": {}}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "input_json_delta", "partial_json": "{\"city\":"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "input_json_delta", "partial_json": " \"Paris\"}"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

Supported Models

Claude Models (Full Support)

Model                       Streaming  Tools  Vision  Thinking
claude-sonnet-4-5-20250929  Yes        Yes    Yes     Yes*
claude-3-5-sonnet-20241022  Yes        Yes    Yes     -
claude-3-5-haiku-20241022   Yes        Yes    Yes     -
*Use the model with the -thinking suffix for extended reasoning.

Other Models (Via Compatibility Layer)

The /v1/messages endpoint also works with non-Anthropic models:
Model             Streaming  Tools  Vision
gpt-4o            Yes        Yes    Yes
gpt-4o-mini       Yes        Yes    Yes
gemini-2.0-flash  Yes        Yes    Yes
llama-3.3-70b     Yes        Yes    -
deepseek-chat     Yes        Yes    -
See the Models documentation for the full list.

Prompt Caching

Enable prompt caching to reduce costs on repeated prompts.

Enable via Header

anthropic-beta: prompt-caching-2024-07-31

TTL Options

  • Default: 5-minute cache TTL
  • Extended: add extended-cache-ttl to request a 1-hour TTL

Cache Control in Content

Add cache_control to content blocks:
{
  "type": "text",
  "text": "This is a long system prompt...",
  "cache_control": { "type": "ephemeral" }
}

Cache Usage in Response

{
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50,
    "cache_creation_input_tokens": 80,
    "cache_read_input_tokens": 0
  }
}
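A quick way to confirm caching is working is to inspect these fields on the returned message (sketch; message is the result of a prior messages.create call):

// cache_creation_input_tokens > 0 means the prompt prefix was written to the cache;
// cache_read_input_tokens > 0 on a later request means it was served from the cache.
const { usage } = message;
if (usage.cache_read_input_tokens > 0) {
  console.log(`Cache hit: ${usage.cache_read_input_tokens} input tokens read from cache`);
} else if (usage.cache_creation_input_tokens > 0) {
  console.log(`Cache created: ${usage.cache_creation_input_tokens} input tokens written`);
}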

Error Handling

Error Response Format

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required",
    "param": "max_tokens"
  }
}

Error Types

HTTP Status  Error Type             Description
400          invalid_request_error  Invalid request (missing fields, bad format)
401          authentication_error   Invalid or missing API key
403          permission_error       Insufficient permissions
404          not_found_error        Unknown model
429          rate_limit_error       Rate limit exceeded
500+         api_error              Server error
All error responses include an X-Request-ID header for support requests.
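A minimal sketch of handling an error response with plain fetch, including capturing X-Request-ID for support (max_tokens is deliberately omitted to trigger an invalid_request_error):

// Check for a non-2xx status, read the error body, and log the request ID.
const response = await fetch("https://nano-gpt.com/api/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.NANOGPT_API_KEY}`
  },
  body: JSON.stringify({
    model: "claude-3-5-sonnet-20241022",
    // max_tokens intentionally omitted to provoke an invalid_request_error
    messages: [{ role: "user", content: "Hello!" }]
  })
});

if (!response.ok) {
  const body = await response.json();
  const requestId = response.headers.get("x-request-id");
  console.error(`${response.status} ${body.error.type}: ${body.error.message} (request ${requestId})`);
}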

Headers

Request Headers

Header             Required  Description
Authorization      Yes*      Bearer token authentication
x-api-key          Yes*      Alternative API key header
Content-Type       Yes       Must be application/json
anthropic-beta     No        Enable beta features (e.g., prompt caching)
anthropic-version  No        API version (accepted but not required)
*One of Authorization or x-api-key is required.

BYOK Headers

For Bring Your Own Key:
Header           Description
x-use-byok       Set to true to use your own API key
x-byok-provider  Provider name for your key
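With the Anthropic SDK, these can be sent on every request via the client's defaultHeaders option; the provider value below is illustrative:

import Anthropic from "@anthropic-ai/sdk";

// Attach the BYOK headers to every request made by this client.
// "anthropic" is an illustrative provider name; use the provider your key belongs to.
const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api",
  defaultHeaders: {
    "x-use-byok": "true",
    "x-byok-provider": "anthropic"
  }
});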

Examples

Basic Request (cURL)

curl -X POST https://nano-gpt.com/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'

Streaming Request (cURL)

curl -N -X POST https://nano-gpt.com/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "stream": true,
    "max_tokens": 256,
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

With Tools (cURL)

curl -X POST https://nano-gpt.com/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get current weather",
        "input_schema": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    ],
    "messages": [
      { "role": "user", "content": "What is the weather in Tokyo?" }
    ]
  }'

Anthropic SDK (Node.js)

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api"
});

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 256,
  messages: [
    { role: "user", content: "Hello!" }
  ]
});

console.log(message.content[0].text);

Anthropic SDK with Streaming (Node.js)

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api"
});

const stream = await anthropic.messages.stream({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 256,
  messages: [
    { role: "user", content: "Tell me a story" }
  ]
});

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}

Anthropic SDK with Prompt Caching (Node.js)

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api"
});

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 256,
  system: [
    {
      type: "text",
      text: "You are a helpful assistant with expertise in...",
      cache_control: { type: "ephemeral" }
    }
  ],
  messages: [
    { role: "user", content: "Hello!" }
  ]
}, {
  headers: {
    "anthropic-beta": "prompt-caching-2024-07-31"
  }
});

Vision Example (Node.js)

import Anthropic from "@anthropic-ai/sdk";
import fs from "fs";

const anthropic = new Anthropic({
  apiKey: process.env.NANOGPT_API_KEY,
  baseURL: "https://nano-gpt.com/api"
});

const imageData = fs.readFileSync("image.jpg").toString("base64");

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/jpeg",
            data: imageData
          }
        }
      ]
    }
  ]
});

Python SDK

import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_NANOGPT_API_KEY",
    base_url="https://nano-gpt.com/api"
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(message.content[0].text)

Limits

Limit               Value
Request timeout     800 seconds
Tool argument size  ~100 KB per tool call
Image types         JPEG, PNG, GIF, WebP

Migration from Anthropic

To migrate from Anthropic’s API to NanoGPT:
  1. Change the base URL:
    • From: https://api.anthropic.com
    • To: https://nano-gpt.com/api
    The full endpoint will be: https://nano-gpt.com/api/v1/messages
  2. Use your NanoGPT API key instead of your Anthropic key
  3. No other code changes required — the API is fully compatible

Notes

  • The anthropic-version header is accepted but not required
  • Token usage numbers use NanoGPT’s token accounting (may differ slightly from Anthropic’s exact counts)
  • All Anthropic SDK features are supported, including streaming, tools, and caching