POST /v1/completions
cURL
curl --request POST \
  --url https://nano-gpt.com/api/v1/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "chatgpt-4o-latest",
  "prompt": "<string>",
  "max_tokens": 4000,
  "temperature": 0.7,
  "top_p": 1,
  "stream": false,
  "stop": "<string>",
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 0,
  "top_k": 123,
  "top_a": 123,
  "min_p": 0.5,
  "tfs": 0.5,
  "eta_cutoff": 123,
  "epsilon_cutoff": 123,
  "typical_p": 0.5,
  "mirostat_mode": 0,
  "mirostat_tau": 123,
  "mirostat_eta": 123,
  "min_tokens": 0,
  "stop_token_ids": [
    123
  ],
  "include_stop_str_in_output": false,
  "ignore_eos": false,
  "no_repeat_ngram_size": 1,
  "custom_token_bans": [
    123
  ],
  "logit_bias": {},
  "logprobs": true,
  "prompt_logprobs": true,
  "seed": 123
}
'

Example response:
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "text": "<string>",
      "index": 123,
      "logprobs": {},
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}
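The cURL call above can be reproduced from any HTTP client. A minimal Python sketch using only the standard library (the payload keys mirror the body parameters documented below; `NANO_GPT_API_KEY` is an assumed environment variable name, not part of the API):

```python
import json
import os  # used in the commented usage line below
import urllib.request

# Only "model" and "prompt" are required; everything else
# falls back to the documented defaults.
payload = {
    "model": "chatgpt-4o-latest",
    "prompt": "Write a haiku about the sea.",
    "max_tokens": 256,
    "temperature": 0.7,
}

def complete(payload: dict, api_key: str) -> dict:
    """POST the payload to the completions endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        "https://nano-gpt.com/api/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a valid key):
# body = complete(payload, os.environ["NANO_GPT_API_KEY"])
# print(body["choices"][0]["text"])
```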

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Parameters for text completion

model
string
default:chatgpt-4o-latest
required

The model to use for completion. Append ':online' for web search ($0.005/request) or ':online/linkup-deep' for deep web search ($0.05/request)

Examples:

"chatgpt-4o-latest"

"chatgpt-4o-latest:online"

"chatgpt-4o-latest:online/linkup-deep"

"claude-3-5-sonnet-20241022:online"

prompt
string
required

The text prompt to complete

max_tokens
integer
default:4000

Upper bound on generated tokens

Required range: x >= 1
temperature
number
default:0.7

Classic randomness control. Accepts any decimal between 0 and 2. Lower values bias toward deterministic responses; higher values explore more aggressively

Required range: 0 <= x <= 2
top_p
number
default:1

Nucleus sampling. When set below 1.0, trims candidate tokens to the smallest set whose cumulative probability exceeds top_p. Works well as an alternative to tweaking temperature

Required range: 0 <= x <= 1
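Conceptually, nucleus sampling is a filtering step over the next-token distribution. An illustrative sketch (not the server's actual implementation):

```python
def nucleus_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize the survivors."""
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

probs = {"the": 0.5, "a": 0.3, "xyzzy": 0.2}
print(nucleus_filter(probs, 0.8))  # drops "xyzzy"; remaining mass is renormalized
```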
stream
boolean
default:false

Whether to stream the response
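Streaming responses are typically delivered as Server-Sent Events. Assuming the common `data: {...}` / `data: [DONE]` framing used by OpenAI-compatible APIs (not confirmed by this page), each line can be parsed like this:

```python
import json

def parse_sse_line(line: str):
    """Parse one Server-Sent Events line into a payload dict.

    Returns None for blank/keep-alive lines and for the final
    "[DONE]" sentinel (framing assumed from OpenAI-compatible APIs).
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    return json.loads(data)

chunk = parse_sse_line('data: {"choices": [{"text": "Hello"}]}')
print(chunk["choices"][0]["text"])  # Hello
```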

stop

Stop sequences. Accepts string or array of strings. Values are passed directly to upstream providers
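Client-side, the effect of a stop sequence is equivalent to truncating the output at the earliest occurrence of any sequence (a sketch; the actual truncation happens upstream):

```python
def apply_stop(text: str, stop) -> str:
    """Truncate text at the earliest occurrence of any stop sequence.
    Accepts a single string or a list of strings, like the API field."""
    sequences = [stop] if isinstance(stop, str) else list(stop or [])
    cut = len(text)
    for seq in sequences:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(apply_stop("Hello world\nGoodbye", ["\n", "Goodbye"]))  # Hello world
```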

frequency_penalty
number
default:0

Penalizes tokens proportionally to how often they appeared previously. Negative values encourage repetition; positive values discourage it

Required range: -2 <= x <= 2
presence_penalty
number
default:0

Penalizes tokens based on whether they appeared at all. Good for keeping the model on topic without outright banning words

Required range: -2 <= x <= 2
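The two OpenAI-style penalties can be sketched as adjustments to raw logits, following the published OpenAI formula (illustrative only; the provider applies its own implementation):

```python
from collections import Counter

def penalize(logits: dict[str, float], generated: list[str],
             frequency_penalty: float = 0.0,
             presence_penalty: float = 0.0) -> dict[str, float]:
    """Apply OpenAI-style penalties: frequency scales with how often a
    token has already appeared; presence applies once if it appeared at all."""
    counts = Counter(generated)
    return {
        tok: logit
        - counts[tok] * frequency_penalty
        - (1.0 if counts[tok] else 0.0) * presence_penalty
        for tok, logit in logits.items()
    }

logits = {"the": 2.0, "cat": 1.0}
print(penalize(logits, ["the", "the"], frequency_penalty=0.5, presence_penalty=0.2))
# "the" drops by 2*0.5 + 0.2; "cat" never appeared, so it is untouched
```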
repetition_penalty
number

Provider-agnostic repetition modifier (distinct from the OpenAI penalties). Values above 1 discourage repetition; values below 1 encourage it

Required range: -2 <= x <= 2
top_k
integer

Caps sampling to the top-k highest probability tokens per step
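As a conceptual sketch, top-k keeps only the k most likely candidates before sampling:

```python
def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep only the k highest-probability tokens, renormalized."""
    kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

print(top_k_filter({"a": 0.6, "b": 0.3, "c": 0.1}, 2))  # keeps "a" and "b"
```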

top_a
number

Combines top-p and temperature behavior; leave unset unless a model description explicitly calls for it

min_p
number

Ensures each candidate token's probability exceeds a floor between 0 and 1. Helpful for stopping models from collapsing into low-entropy loops

Required range: 0 <= x <= 1
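A sketch of the floor described above (note: many implementations scale the floor by the top token's probability; here it is applied directly, as the text states):

```python
def min_p_filter(probs: dict[str, float], min_p: float) -> dict[str, float]:
    """Drop candidates whose probability falls below the floor,
    then renormalize the survivors."""
    kept = {t: p for t, p in probs.items() if p >= min_p}
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

print(min_p_filter({"a": 0.7, "b": 0.25, "c": 0.05}, 0.1))  # drops "c"
```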
tfs
number

Tail-free sampling. Values between 0 and 1 shave the long tail of the distribution; 1.0 disables the feature

Required range: 0 <= x <= 1
eta_cutoff
number

Eta sampling: cuts tokens whose probability falls below an entropy-dependent threshold derived from this value

epsilon_cutoff
number

Epsilon sampling: cuts tokens whose probability falls below this fixed threshold

typical_p
number

Typical sampling (aka entropy-based nucleus). Works like top_p but preserves tokens whose surprise matches the expected entropy

Required range: 0 <= x <= 1
mirostat_mode
enum<integer>

Enables Mirostat sampling for models that support it. Set to 1 or 2 to activate

Available options: 0, 1, 2
mirostat_tau
number

Mirostat target entropy parameter. Used when mirostat_mode is enabled

mirostat_eta
number

Mirostat learning rate parameter. Used when mirostat_mode is enabled

min_tokens
integer
default:0

For providers that support it, enforces a minimum completion length before stop conditions fire

Required range: x >= 0
stop_token_ids
integer[]

Numeric array that lets callers stop generation on specific token IDs. Not supported by many providers

include_stop_str_in_output
boolean
default:false

When true, keeps the stop sequence in the final text. Not supported by many providers

ignore_eos
boolean
default:false

Allows completions to continue even if the model predicts EOS internally. Useful for long creative writing runs

no_repeat_ngram_size
integer

Extension that forbids repeating n-grams of the given size. Not supported by many providers

Required range: x >= 0
custom_token_bans
integer[]

List of token IDs to fully block

logit_bias
object

Object mapping token IDs to additive logits. Works just like OpenAI's version
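For example, to effectively ban one token and nudge sampling toward another (the token IDs below are hypothetical; look real ones up with the model's tokenizer):

```python
# Hypothetical token IDs for illustration only.
logit_bias = {
    "50256": -100,  # large negative bias effectively bans this token
    "464": 5,       # positive bias nudges sampling toward this token
}

payload = {
    "model": "chatgpt-4o-latest",
    "prompt": "Once upon a time",
    "logit_bias": logit_bias,
}
```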

logprobs

When set to true or a number, requests token-level log probabilities from providers that support returning them

prompt_logprobs
boolean

Requests logprobs on the prompt itself when the upstream API allows it

seed
integer

Numeric seed. Wherever supported, passes the value to make completions repeatable

Response

Text completion response

id
string

Unique identifier for the completion

object
string

Object type, always 'text_completion'

created
integer

Unix timestamp of when the completion was created

model
string

Model used for completion

choices
object[]

Array of completion choices

usage
object

Token usage for the request (prompt_tokens, completion_tokens, total_tokens)