Creates a completion for the provided prompt
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Parameters for text completion
The model to use for completion. Append ':online' for web search ($0.005/request) or ':online/linkup-deep' for deep web search ($0.05/request)
"chatgpt-4o-latest"
"chatgpt-4o-latest:online"
"chatgpt-4o-latest:online/linkup-deep"
"claude-3-5-sonnet-20241022:online"
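As an illustration, the search suffix can be appended programmatically. `with_search` is a hypothetical client-side helper, not part of the API:

```python
def with_search(model: str, deep: bool = False) -> str:
    # Hypothetical helper: append the web-search suffix described above.
    # ':online' enables web search; ':online/linkup-deep' enables deep search.
    suffix = ":online/linkup-deep" if deep else ":online"
    return model + suffix

print(with_search("chatgpt-4o-latest"))                      # chatgpt-4o-latest:online
print(with_search("claude-3-5-sonnet-20241022", deep=True))  # claude-3-5-sonnet-20241022:online/linkup-deep
```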
The text prompt to complete
Upper bound on generated tokens (x >= 1)
Classic randomness control (0 <= x <= 2). Lower values bias toward deterministic responses; higher values explore more aggressively
Nucleus sampling (0 <= x <= 1). When set below 1.0, trims candidate tokens to the smallest set whose cumulative probability exceeds top_p. Works well as an alternative to tweaking temperature
Whether to stream the response
Stop sequences. Accepts string or array of strings. Values are passed directly to upstream providers
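A minimal request body using the core parameters above might look like the following sketch; the JSON field names are assumed to match the parameter list and should be checked against the live API:

```python
import json

# Sketch of a completion request body; field names are assumed from the
# parameter list above, and the values are illustrative.
payload = {
    "model": "chatgpt-4o-latest",
    "prompt": "Write a haiku about the sea.",  # the text prompt to complete
    "max_tokens": 64,     # upper bound on generated tokens (x >= 1)
    "temperature": 0.7,   # randomness control, 0-2
    "top_p": 0.9,         # nucleus sampling, 0-1
    "stream": False,      # set True to stream the response
    "stop": ["\n\n"],     # string or array of strings
}
body = json.dumps(payload)
print(body)
```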
Penalizes tokens proportionally to how often they appeared previously (-2 <= x <= 2). Negative values encourage repetition; positive values discourage it
Penalizes tokens based on whether they appeared at all (-2 <= x <= 2). Good for keeping the model on topic without outright banning words
Provider-agnostic repetition modifier, distinct from the OpenAI penalties (-2 <= x <= 2). Values >1 discourage repetition
Caps sampling to the top-k highest-probability tokens per step
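The documented bounds on the sampling knobs above can be checked client-side before sending a request. This validator is only a sketch, and the parameter names are assumed to match the API's JSON fields:

```python
# Documented ranges from the parameter list above; the names are assumed
# to match the API's JSON field names.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "repetition_penalty": (-2.0, 2.0),
}

def out_of_range(params: dict) -> list:
    """Return the names of sampling parameters outside their documented range."""
    return [
        name for name, value in params.items()
        if name in RANGES and not (RANGES[name][0] <= value <= RANGES[name][1])
    ]

print(out_of_range({"temperature": 0.7, "presence_penalty": 2.5}))  # ['presence_penalty']
```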
Combines top-p and temperature behavior; leave unset unless a model description explicitly calls for it
Ensures each candidate token probability exceeds a floor (0 <= x <= 1). Helpful for stopping models from collapsing into low-entropy loops
Tail free sampling (0 <= x <= 1). Values between 0 and 1 shave the long tail of the distribution; 1.0 disables the feature
Cut probabilities as soon as they fall below the specified tail threshold
Typical sampling, aka entropy-based nucleus (0 <= x <= 1). Works like top_p but preserves tokens whose surprise matches the expected entropy
Enables Mirostat sampling for models that support it (allowed values: 0, 1, 2). Set to 1 or 2 to activate
Mirostat target entropy parameter. Used when mirostat_mode is enabled
Mirostat learning rate parameter. Used when mirostat_mode is enabled
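Putting the three Mirostat fields together, a request fragment might look like this. The exact field names (`mirostat_mode`, `mirostat_tau`, `mirostat_eta`) are assumptions based on common conventions, not confirmed by this reference:

```python
# Hypothetical Mirostat configuration; field names and values are
# illustrative, not confirmed by the API reference.
mirostat = {
    "mirostat_mode": 2,   # 0 disables; 1 or 2 activates Mirostat
    "mirostat_tau": 5.0,  # target entropy (used when mode is enabled)
    "mirostat_eta": 0.1,  # learning rate (used when mode is enabled)
}
print(mirostat["mirostat_mode"] in (1, 2))  # True when Mirostat is active
```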
For providers that support it, enforces a minimum completion length before stop conditions fire (x >= 0)
Numeric array that lets callers stop generation on specific token IDs. Not supported by many providers
When true, keeps the stop sequence in the final text. Not supported by many providers
Allows completions to continue even if the model predicts EOS internally. Useful for long creative writing runs
Extension that forbids repeating n-grams of the given size (x >= 0). Not supported by many providers
List of token IDs to fully block
Object mapping token IDs to additive logits. Works just like OpenAI's version
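Token-level controls can be combined in one request. The token IDs below are placeholders, the `banned_tokens` field name is an assumption, and `logit_bias` follows the OpenAI convention of mapping token IDs to additive logit shifts:

```python
# Illustrative token-level controls; token IDs and the "banned_tokens"
# field name are placeholders, not confirmed by the API reference.
token_controls = {
    "banned_tokens": [1234],        # token IDs to fully block
    "logit_bias": {
        "5678": -50,                # strongly discourage this token
        "9012": 10,                 # mildly encourage this token
    },
}
print(sorted(token_controls))
```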
When true or a number, forwards the request to providers that support returning token-level log probabilities
Requests logprobs on the prompt itself when the upstream API allows it
Numeric seed. Wherever supported, passes the value to make completions repeatable
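For repeatable runs, the seed can be fixed and logprobs requested in the same payload; whether either is honored depends on the upstream provider. A sketch, with assumed field names:

```python
# Sketch: two otherwise-identical requests sharing a seed should, where the
# provider supports seeding, produce the same completion.
base = {"model": "chatgpt-4o-latest", "prompt": "Hello", "max_tokens": 16}
request_a = dict(base, seed=42, logprobs=5)  # top-5 token logprobs, if supported
request_b = dict(base, seed=42, logprobs=5)
print(request_a == request_b)  # True: identical payloads
```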