Documentation Index
Fetch the complete documentation index at: https://docs.nano-gpt.com/llms.txt
Use this file to discover all available pages before exploring further.
Model Suffixes
Append supported suffixes to the model value to request optional behavior for a single request. Most suffixes are stripped before the base model is routed. Model identity suffixes, such as supported :thinking variants, are preserved because they identify a distinct model or alias.
Suffix parsing is case-insensitive for provider routing and provider suffixes. Avoid combining multiple suffixes that make conflicting provider-selection requests.
Provider Routing Preference Suffixes
These apply only to provider-selection-capable models.
| Suffix | Meaning |
|---|
:speed | Pick the provider with the best estimated completion time, using TTFT plus TPS. |
:fast | Alias for :speed. |
:throughput | Pick the provider with the highest tokens per second. |
:latency | Pick the provider with the lowest time to first token. |
:price | Pick the cheapest provider by base input-plus-output token price. |
:cheap | Alias for :price. |
:floor | Alias for :price. |
:tools | Route to a tools-capable provider path for models that support tools provider selection. |
{ "model": "zai-org/glm-5:fast", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "zai-org/glm-5:cheap", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "moonshotai/kimi-k2.6:tools", "messages": [{ "role": "user", "content": "Hello" }], "tools": [{ "type": "function", "function": { "name": "lookup", "parameters": { "type": "object", "properties": {} } } }] }
Rules:
- Routing preference suffixes only consider user-selectable providers. Internal routing-only providers are excluded.
- Routing preference suffixes are stripped before model mapping/routing. Non-routing identity suffixes such as
:thinking are preserved.
- Do not combine
:speed, :fast, :throughput, :latency, :price, :cheap, or :floor with X-Provider, body provider, or a provider model suffix.
- Do not combine
:tools with a routing preference suffix, X-Provider, body provider, a provider model suffix, or caching=true.
- Routing preference requests are billed like explicit provider-selection requests.
- Conflict error codes use the existing
speed_suffix_* family for API compatibility, even when the suffix is not literally :speed.
Provider Suffixes
The API accepts a trailing provider suffix for public user-selectable provider IDs:
{ "model": "zai-org/glm-5:cerebras", "messages": [{ "role": "user", "content": "Hello" }] }
Recommended: use X-Provider for explicit provider overrides. The API also accepts a trailing provider suffix for user-selectable providers, such as model-id:cerebras, but X-Provider is clearer and easier to validate against provider-discovery responses.
Provider suffixes must match a public user-selectable provider ID. They are case-insensitive, cannot be combined with routing preference suffixes or :tools, and are billed like explicit provider selection.
Web Search Suffixes
| Suffix | Meaning |
|---|
:online | Default web search, standard depth. For GPT-5+ / o-series models this may use OpenAI native web search unless a provider is explicit. |
:online/linkup | Linkup standard. |
:online/linkup-deep | Linkup deep. |
:online/tavily | Tavily standard. |
:online/tavily-deep | Tavily deep. |
:online/brave | Brave standard. |
:online/brave-deep | Brave deep. |
:online/exa-fast | Exa fast. |
:online/exa-auto | Exa auto. |
:online/exa-neural | Exa neural. |
:online/exa-deep | Exa deep. |
:online/exa-deep-reasoning | Exa deep-reasoning. |
:online/exa-instant | Exa instant. |
:online/kagi | Kagi standard search. |
:online/kagi-web | Kagi standard web. |
:online/kagi-news | Kagi standard news. |
:online/kagi-search | Kagi deep search. |
:online/perplexity | Perplexity standard. |
:online/perplexity-deep | Perplexity deep. |
:online/valyu | Valyu standard, all sources. |
:online/valyu-deep | Valyu deep, all sources. |
:online/valyu-web | Valyu standard, web only. |
:online/valyu-web-deep | Valyu deep, web only. |
Web search suffixes can compose with memory and reasoning-exclude suffixes. If request body webSearch.enabled or legacy linkup.enabled is true, body configuration takes precedence over model suffix configuration.
Memory Suffixes
| Suffix | Meaning |
|---|
:memory | Enable Context Memory with default 30-day retention. |
:memory-<days> | Enable Context Memory with retention clamped to 1..365 days. |
Header memory_expiration_days takes precedence over :memory-<days>. memory: false in the request body explicitly disables memory even if the model has a memory suffix. Memory can compose with web search, for example openai/gpt-5.2:online:memory-90.
Reasoning and Thinking Suffixes
| Suffix | Meaning |
|---|
:thinking | Model-specific thinking/reasoning variant when that exact ID or documented alias exists. |
-thinking | Legacy alias pattern for some model families only, not universal. |
:<number> on Anthropic thinking IDs | Thinking budget suffix for mapped Anthropic thinking aliases, for example claude-sonnet-4-thinking:8192. |
:reasoning-exclude | Equivalent to reasoning: { "exclude": true }; hides reasoning fields/blocks and strips the suffix before routing. |
:reasoning-exclude works on Chat Completions and Text Completions. It composes with other suffixes such as :thinking, :online, and :memory. :thinking is model identity, not provider routing, and is not universal.
Official and Original Route Suffixes
Some model families expose :official / :original aliases to force the official-provider route. These are model-specific; check the model page or provider-selection docs for supported IDs.
Conflict Rules
Do not combine conflicting routing directives:
{ "model": "zai-org/glm-5:fast:cerebras" }
{ "model": "zai-org/glm-5:cheap", "provider": "cerebras" }
{ "model": "zai-org/glm-5:tools:fast" }
Examples
{ "model": "zai-org/glm-5:fast", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "zai-org/glm-5:cheap", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "zai-org/glm-5:thinking:fast", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "openai/gpt-5.2:online/exa-instant:memory-30", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "anthropic/claude-opus-4.6:memory-30:online/linkup-deep:reasoning-exclude", "messages": [{ "role": "user", "content": "Hello" }] }