Model Suffixes

Append supported suffixes to the model value to request optional behavior for a single request. Most suffixes are stripped before the base model is routed. Model identity suffixes, such as supported :thinking variants, are preserved because they identify a distinct model or alias. Routing preference suffixes and provider suffixes are parsed case-insensitively. Avoid combining suffixes that make conflicting provider-selection requests.

Provider Routing Preference Suffixes

These apply only to provider-selection-capable models.

Suffix	Meaning
`:speed`	Pick the provider with the best estimated completion time, using TTFT plus TPS.
`:fast`	Alias for `:speed`.
`:throughput`	Pick the provider with the highest tokens per second.
`:latency`	Pick the provider with the lowest time to first token.
`:price`	Pick the cheapest provider by base input-plus-output token price.
`:cheap`	Alias for `:price`.
`:floor`	Alias for `:price`.
`:tools`	Route to a tools-capable provider path for models that support tools provider selection.

{ "model": "zai-org/glm-5:fast", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "zai-org/glm-5:cheap", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "moonshotai/kimi-k2.6:tools", "messages": [{ "role": "user", "content": "Hello" }], "tools": [{ "type": "function", "function": { "name": "lookup", "parameters": { "type": "object", "properties": {} } } }] }
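To make the ranking criteria in the table concrete, here is a minimal Python sketch of how each preference could order a set of providers. It is an illustration only: the `ProviderStats` fields, the `expected_output_tokens` parameter, and the exact speed estimate (TTFT plus output tokens divided by TPS) are assumptions, since the text above only says the estimate uses TTFT plus TPS.

```python
from dataclasses import dataclass

# Preference aliases as listed in the table above.
ALIASES = {"fast": "speed", "cheap": "price", "floor": "price"}


@dataclass(frozen=True)
class ProviderStats:
    """Hypothetical per-provider metrics; field names are illustrative."""
    name: str
    ttft_s: float        # time to first token, in seconds
    tps: float           # tokens per second
    input_price: float   # base input token price
    output_price: float  # base output token price


def pick_provider(providers, suffix, expected_output_tokens=500):
    """Return the provider that a given preference suffix would favor."""
    pref = ALIASES.get(suffix.lower(), suffix.lower())
    if pref == "speed":
        # Assumed estimate: wait for the first token, then stream the rest.
        return min(providers, key=lambda p: p.ttft_s + expected_output_tokens / p.tps)
    if pref == "throughput":
        return max(providers, key=lambda p: p.tps)
    if pref == "latency":
        return min(providers, key=lambda p: p.ttft_s)
    if pref == "price":
        return min(providers, key=lambda p: p.input_price + p.output_price)
    raise ValueError(f"not a ranking preference suffix: {suffix!r}")
```

With made-up numbers, a provider with low TTFT but low TPS would win `:latency` yet lose `:speed` to a provider that streams faster once it starts.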

Rules:

Routing preference suffixes only consider user-selectable providers. Internal routing-only providers are excluded.
Routing preference suffixes are stripped before model mapping/routing. Non-routing identity suffixes such as :thinking are preserved.
Do not combine :speed, :fast, :throughput, :latency, :price, :cheap, or :floor with X-Provider, body provider, or a provider model suffix.
Do not combine :tools with a routing preference suffix, X-Provider, body provider, a provider model suffix, or caching=true.
Routing preference requests are billed like explicit provider-selection requests.
Conflict error codes use the existing speed_suffix_* family for API compatibility, even when the suffix is not literally :speed.

Provider Suffixes

The API accepts a trailing provider suffix for public user-selectable provider IDs:

{ "model": "zai-org/glm-5:cerebras", "messages": [{ "role": "user", "content": "Hello" }] }

Recommended: use the X-Provider header for explicit provider overrides. It is clearer than a trailing suffix such as model-id:cerebras and easier to validate against provider-discovery responses. Provider suffixes must match a public user-selectable provider ID. They are case-insensitive, cannot be combined with routing preference suffixes or :tools, and are billed like explicit provider selection.
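As a rough client-side illustration of that recommendation, the Python sketch below moves a trailing provider suffix into an X-Provider header. The function name `to_header_override` and the `provider_ids` argument are hypothetical; in practice the list of valid IDs would come from a provider-discovery response.

```python
def to_header_override(model, provider_ids):
    """Split a trailing provider suffix off `model` and return it as a header.

    Returns (model_without_suffix, extra_headers). If the last suffix is not
    a known provider ID, the model string is returned unchanged.
    """
    # Map lowercase IDs to their canonical spelling for case-insensitive lookup.
    canonical = {p.lower(): p for p in provider_ids}
    base, sep, last = model.rpartition(":")
    if sep and last.lower() in canonical:
        return base, {"X-Provider": canonical[last.lower()]}
    return model, {}
```

For example, `to_header_override("zai-org/glm-5:cerebras", ["cerebras"])` yields the base model ID plus an `X-Provider` header, which can then be merged into the request headers.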

Web Search Suffixes

Suffix	Meaning
`:online`	Default web search, standard depth. For GPT-5+ / o-series models this may use OpenAI native web search unless a provider is explicit.
`:online/linkup`	Linkup standard.
`:online/linkup-deep`	Linkup deep.
`:online/tavily`	Tavily standard.
`:online/tavily-deep`	Tavily deep.
`:online/brave`	Brave standard.
`:online/brave-deep`	Brave deep.
`:online/exa-fast`	Exa fast.
`:online/exa-auto`	Exa auto.
`:online/exa-neural`	Exa neural.
`:online/exa-deep`	Exa deep.
`:online/exa-deep-reasoning`	Exa deep-reasoning.
`:online/exa-instant`	Exa instant.
`:online/kagi`	Kagi standard search.
`:online/kagi-web`	Kagi standard web.
`:online/kagi-news`	Kagi standard news.
`:online/kagi-search`	Kagi deep search.
`:online/perplexity`	Perplexity standard.
`:online/perplexity-deep`	Perplexity deep.
`:online/valyu`	Valyu standard, all sources.
`:online/valyu-deep`	Valyu deep, all sources.
`:online/valyu-web`	Valyu standard, web only.
`:online/valyu-web-deep`	Valyu deep, web only.

Web search suffixes can compose with memory and reasoning-exclude suffixes. If webSearch.enabled or the legacy linkup.enabled is true in the request body, the body configuration takes precedence over the model suffix configuration.
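The precedence rule can be sketched in Python as follows. This is a simplified model of the documented behavior, not the service's implementation: the function name and the returned tuple shape are assumptions made for illustration.

```python
def web_search_source(model, body):
    """Report where the web search configuration would come from.

    Returns ("body", None) when the request body enables search,
    ("suffix", variant) when only a model suffix does, else ("none", None).
    """
    body_enabled = (
        (body.get("webSearch") or {}).get("enabled") is True
        or (body.get("linkup") or {}).get("enabled") is True
    )
    if body_enabled:
        # Body configuration wins over any :online suffix.
        return ("body", None)
    for suffix in model.split(":")[1:]:
        if suffix == "online":
            return ("suffix", "default")
        if suffix.startswith("online/"):
            return ("suffix", suffix[len("online/"):])
    return ("none", None)
```

A request for `openai/gpt-5.2:online/exa-instant` with an empty body resolves to the `exa-instant` variant, while the same model with `webSearch.enabled` set to true in the body resolves to the body configuration.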

Memory Suffixes

Suffix	Meaning
`:memory`	Enable Context Memory with default 30-day retention.
`:memory-<days>`	Enable Context Memory with retention clamped to 1..365 days.

The memory_expiration_days header takes precedence over :memory-<days>. Setting memory: false in the request body explicitly disables memory even if the model has a memory suffix. Memory can compose with web search, for example openai/gpt-5.2:online:memory-90.
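The retention rules above can be worked through with a small Python sketch. It is illustrative only and rests on two assumptions not stated in the text: memory is treated as enabled only when a memory suffix is present, and the header value is clamped to the same 1..365 range as the suffix.

```python
import re

DEFAULT_DAYS = 30
MIN_DAYS, MAX_DAYS = 1, 365


def clamp_days(days):
    """Clamp a retention value to the documented 1..365 range."""
    return max(MIN_DAYS, min(MAX_DAYS, days))


def resolve_memory_days(model, headers=None, body=None):
    """Return the effective retention in days, or None if memory is off."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    body = body or {}
    if body.get("memory") is False:
        # An explicit false in the body overrides any memory suffix.
        return None
    suffix_days = None
    for suffix in model.split(":")[1:]:
        if suffix == "memory":
            suffix_days = DEFAULT_DAYS
        else:
            match = re.fullmatch(r"memory-(\d+)", suffix)
            if match:
                suffix_days = clamp_days(int(match.group(1)))
    if suffix_days is None:
        return None  # assumption: no suffix means memory was not requested
    header_value = headers.get("memory_expiration_days")
    if header_value is not None:
        # The header takes precedence; clamping it is an assumption.
        return clamp_days(int(header_value))
    return suffix_days
```

Under these assumptions, `:memory-1000` resolves to 365 days, and a `memory_expiration_days` header of 7 overrides a `:memory-90` suffix.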

Reasoning and Thinking Suffixes

Suffix	Meaning
`:thinking`	Model-specific thinking/reasoning variant when that exact ID or documented alias exists.
`-thinking`	Legacy alias pattern for some model families only, not universal.
`:<number>` on Anthropic thinking IDs	Thinking budget suffix for mapped Anthropic thinking aliases, for example `claude-sonnet-4-thinking:8192`.
`:reasoning-exclude`	Equivalent to `reasoning: { "exclude": true }`; hides reasoning fields/blocks and strips the suffix before routing.

:reasoning-exclude works on Chat Completions and Text Completions. It composes with other suffixes such as :thinking, :online, and :memory. :thinking is model identity, not provider routing, and is not universal.
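The stated equivalence between the suffix and the body field can be sketched as a client-side rewrite. The helper name `apply_reasoning_exclude` is hypothetical, and the sketch only mirrors the documented equivalence; it makes no claim about how the server performs the translation.

```python
def apply_reasoning_exclude(body):
    """Rewrite a request body so :reasoning-exclude becomes a body field.

    Returns a new dict; the input is left unmodified. Other suffixes,
    including :thinking, are kept in their original order.
    """
    base, *suffixes = body["model"].split(":")
    kept = [s for s in suffixes if s != "reasoning-exclude"]
    if len(kept) == len(suffixes):
        return dict(body)  # suffix not present, nothing to rewrite
    out = dict(body)
    out["model"] = ":".join([base, *kept])
    # Preserve any other reasoning options already set in the body.
    out["reasoning"] = {**(body.get("reasoning") or {}), "exclude": True}
    return out
```

Applied to a body whose model is `zai-org/glm-5:thinking:reasoning-exclude`, this produces the model `zai-org/glm-5:thinking` together with `reasoning: { "exclude": true }`.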

Official and Original Route Suffixes

Some model families expose :official / :original aliases to force the official-provider route. These are model-specific; check the model page or provider-selection docs for supported IDs.

Conflict Rules

Do not combine conflicting routing directives:

{ "model": "zai-org/glm-5:fast:cerebras" }
{ "model": "zai-org/glm-5:cheap", "provider": "cerebras" }
{ "model": "zai-org/glm-5:tools:fast" }
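A client can catch these combinations before sending a request. The Python sketch below is one possible pre-flight check based on the rules in this document; the function name, the `provider_ids` argument, and the treatment of caching=true as a boolean body field named `caching` are assumptions, and the returned messages are not the API's actual error codes.

```python
ROUTING_PREFS = {"speed", "fast", "throughput", "latency", "price", "cheap", "floor"}


def find_conflicts(model, headers=None, body=None, provider_ids=frozenset()):
    """Return a list of human-readable conflicts; empty means none found."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    body = body or {}
    suffixes = [s.lower() for s in model.split(":")[1:]]
    known = {p.lower() for p in provider_ids}

    has_pref = any(s in ROUTING_PREFS for s in suffixes)
    has_tools = "tools" in suffixes

    # Collect every explicit provider selection present on the request.
    explicit = []
    if "x-provider" in headers:
        explicit.append("X-Provider header")
    if body.get("provider"):
        explicit.append("body provider")
    if any(s in known for s in suffixes):
        explicit.append("provider suffix")

    problems = []
    if has_pref and has_tools:
        problems.append(":tools combined with a routing preference suffix")
    for source in explicit:
        if has_pref:
            problems.append(f"routing preference suffix combined with {source}")
        if has_tools:
            problems.append(f":tools combined with {source}")
    if has_tools and body.get("caching") is True:
        problems.append(":tools combined with caching=true")
    return problems
```

Each of the three requests listed above produces at least one entry, whereas `zai-org/glm-5:thinking:fast` produces none because :thinking is a model identity suffix rather than a routing directive.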

Examples

{ "model": "zai-org/glm-5:fast", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "zai-org/glm-5:cheap", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "zai-org/glm-5:thinking:fast", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "openai/gpt-5.2:online/exa-instant:memory-30", "messages": [{ "role": "user", "content": "Hello" }] }
{ "model": "anthropic/claude-opus-4.6:memory-30:online/linkup-deep:reasoning-exclude", "messages": [{ "role": "user", "content": "Hello" }] }
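Composed model strings like the ones above can be assembled programmatically. The following Python helper is a hypothetical convenience, not part of the API; it emits suffixes in one fixed order (web search, memory, reasoning-exclude), which matches one of the orders shown in the examples.

```python
def compose_model(base, search=None, memory_days=None, reasoning_exclude=False):
    """Build a model string from a base ID and optional suffix settings.

    search: None for no web search, "" for the default :online suffix,
            or a variant name such as "exa-instant" or "linkup-deep".
    memory_days: None for no memory suffix, otherwise a retention in days.
    """
    parts = [base]
    if search is not None:
        parts.append(f"online/{search}" if search else "online")
    if memory_days is not None:
        parts.append(f"memory-{memory_days}")
    if reasoning_exclude:
        parts.append("reasoning-exclude")
    return ":".join(parts)
```

For instance, `compose_model("openai/gpt-5.2", search="exa-instant", memory_days=30)` reproduces the fourth example's model value.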
