Model Suffixes
Append supported suffixes to the model value to request optional behavior for a single request. Most suffixes are stripped before the base model is routed. Model identity suffixes, such as supported :thinking variants, are preserved because they identify a distinct model or alias.
Suffix parsing is case-insensitive for provider routing and provider suffixes. Avoid combining multiple suffixes that make conflicting provider-selection requests.
Provider Routing Preference Suffixes
These apply only to provider-selection-capable models.

| Suffix | Meaning |
|---|---|
| :speed | Pick the provider with the best estimated completion time, using TTFT plus TPS. |
| :fast | Alias for :speed. |
| :throughput | Pick the provider with the highest tokens per second. |
| :latency | Pick the provider with the lowest time to first token. |
| :price | Pick the cheapest provider by base input-plus-output token price. |
| :cheap | Alias for :price. |
| :floor | Alias for :price. |
| :tools | Route to a tools-capable provider path for models that support tools provider selection. |
| :caching | Require routing to a cache-capable provider, equivalent to top-level caching: true. |
| :cache | Alias for :caching. |
| :cached | Alias for :caching. |
- Routing preference suffixes only consider user-selectable providers. Internal routing-only providers are excluded.
- Routing preference suffixes are stripped before model mapping/routing. Non-routing identity suffixes such as :thinking are preserved.
- Do not combine :speed, :fast, :throughput, :latency, :price, :cheap, or :floor with X-Provider, body provider, or a provider model suffix.
- Do not combine :tools with a routing preference suffix, X-Provider, body provider, a provider model suffix, or caching: true.
- Do not combine :caching, :cache, or :cached with :tools, X-Provider, body provider, a provider model suffix, or another routing preference suffix.
- Routing preference requests are billed like explicit provider-selection requests.
- Conflict error codes use the existing speed_suffix_* family for API compatibility, even when the suffix is not literally :speed.
Caching suffixes do not add cache_control markers, configure cache TTLs, or force a cache write. For details, see Prompt Caching.
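The combination rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name, its parameters, and the conflict messages are all hypothetical, and it covers only the rules listed in this section.

```python
# Hypothetical client-side pre-flight check; names and messages are illustrative.
ROUTING_PREFERENCE = {"speed", "fast", "throughput", "latency", "price", "cheap", "floor"}
CACHING = {"caching", "cache", "cached"}
TOOLS = {"tools"}


def find_routing_conflicts(suffixes, x_provider=None, body_provider=None,
                           provider_suffix=None, caching=False):
    """Return a list of conflict descriptions; an empty list means none found."""
    # Routing suffixes are matched case-insensitively.
    found = {s.lower().lstrip(":") for s in suffixes}
    explicit_provider = any([x_provider, body_provider, provider_suffix])
    pref = found & ROUTING_PREFERENCE
    cache = found & CACHING
    tools = found & TOOLS

    conflicts = []
    if (pref or cache or tools) and explicit_provider:
        conflicts.append("routing suffix combined with an explicit provider selection")
    if tools and (pref or cache):
        conflicts.append(":tools combined with another routing suffix")
    if tools and caching:
        conflicts.append(":tools combined with caching: true")
    if cache and pref:
        conflicts.append("caching suffix combined with a routing preference suffix")
    return conflicts
```

For example, `find_routing_conflicts([":speed"], x_provider="cerebras")` reports one conflict, while `find_routing_conflicts([":speed"])` reports none.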
Provider Suffixes
Use X-Provider for explicit provider overrides. The API also accepts a trailing provider suffix for public user-selectable provider IDs, such as model-id:cerebras, but X-Provider is clearer and easier to validate against provider-discovery responses.
Provider suffixes must match a public user-selectable provider ID. They are case-insensitive, cannot be combined with routing preference suffixes or :tools, and are billed like explicit provider selection.
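As a rough illustration of the matching rule, the sketch below splits a trailing provider suffix from a model value. The helper name is hypothetical, and the set of provider IDs is assumed to come from a provider-discovery response in lowercase form.

```python
def split_provider_suffix(model, provider_ids):
    """Split 'model-id:provider' into (base_model, provider_id or None).

    provider_ids is an assumed set of lowercase public provider IDs.
    """
    base, sep, tail = model.rpartition(":")
    # Provider suffixes are case-insensitive, so compare in lowercase.
    if sep and tail.lower() in provider_ids:
        return base, tail.lower()
    # No known provider suffix: leave the model value untouched.
    return model, None
```

A client could use the result to move the provider into X-Provider, for example `headers = {"X-Provider": provider}`, assuming X-Provider is sent as an HTTP request header.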
Web Search Suffixes
| Suffix | Meaning |
|---|---|
| :online | Default web search, standard depth. For GPT-5+ / o-series models this may use OpenAI native web search unless a provider is explicit. |
| :online/linkup | Linkup standard. |
| :online/linkup-deep | Linkup deep. |
| :online/tavily | Tavily standard. |
| :online/tavily-deep | Tavily deep. |
| :online/brave | Brave standard. |
| :online/brave-deep | Brave deep. |
| :online/exa-fast | Exa fast. |
| :online/exa-auto | Exa auto. |
| :online/exa-neural | Exa neural. |
| :online/exa-deep | Exa deep. |
| :online/exa-deep-reasoning | Exa deep-reasoning. |
| :online/exa-instant | Exa instant. |
| :online/kagi | Kagi standard search. |
| :online/kagi-web | Kagi standard web. |
| :online/kagi-news | Kagi standard news. |
| :online/kagi-search | Kagi deep search. |
| :online/perplexity | Perplexity standard. |
| :online/perplexity-deep | Perplexity deep. |
| :online/valyu | Valyu standard, all sources. |
| :online/valyu-deep | Valyu deep, all sources. |
| :online/valyu-web | Valyu standard, web only. |
| :online/valyu-web-deep | Valyu deep, web only. |
If webSearch.enabled or legacy linkup.enabled is true in the request body, body configuration takes precedence over model suffix configuration.
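A small sketch of how a client might build a web search model value and reason about the precedence rule. Both helper names are hypothetical, the variant set is copied from the table above, and the nested body shape (webSearch and linkup as objects with an enabled key) is an assumption based on the dotted field names.

```python
# Variants taken from the suffix table; None means plain ":online".
ONLINE_VARIANTS = {
    "linkup", "linkup-deep", "tavily", "tavily-deep", "brave", "brave-deep",
    "exa-fast", "exa-auto", "exa-neural", "exa-deep", "exa-deep-reasoning",
    "exa-instant", "kagi", "kagi-web", "kagi-news", "kagi-search",
    "perplexity", "perplexity-deep", "valyu", "valyu-deep", "valyu-web",
    "valyu-web-deep",
}


def with_web_search(model, variant=None):
    """Append ':online' or ':online/<variant>' to a model value."""
    if variant is None:
        return f"{model}:online"
    if variant not in ONLINE_VARIANTS:
        raise ValueError(f"unknown web search variant: {variant}")
    return f"{model}:online/{variant}"


def search_config_source(body):
    """Return 'body', 'suffix', or None to show which configuration applies."""
    web = body.get("webSearch") or {}
    legacy = body.get("linkup") or {}
    # Body configuration wins when either flag is explicitly true.
    if web.get("enabled") is True or legacy.get("enabled") is True:
        return "body"
    return "suffix" if ":online" in body.get("model", "") else None
```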
Memory Suffixes
| Suffix | Meaning |
|---|---|
| :memory | Enable Context Memory with default 30-day retention. |
| :memory-<days> | Enable Context Memory with retention clamped to 1..365 days. |
memory_expiration_days takes precedence over :memory-<days>. memory: false in the request body explicitly disables memory even if the model has a memory suffix. Memory can compose with web search, for example openai/gpt-5.2:online:memory-90.
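The precedence and clamping rules can be sketched as follows. This is an illustration under stated assumptions rather than the server's implementation: the function name is hypothetical, it only models memory enabled through a suffix, and it returns a body memory_expiration_days value unchanged because the text does not say whether that field is clamped.

```python
import re

DEFAULT_RETENTION_DAYS = 30
# Matches ':memory' or ':memory-<days>' when followed by another suffix or the end.
_MEMORY_SUFFIX = re.compile(r":memory(?:-(\d+))?(?=:|$)")


def resolve_memory_days(model, body):
    """Return the effective retention in days, or None when memory is off."""
    if body.get("memory") is False:
        return None  # an explicit body opt-out beats any memory suffix
    match = _MEMORY_SUFFIX.search(model)
    if match is None:
        return None  # this sketch only models suffix-enabled memory
    body_days = body.get("memory_expiration_days")
    if body_days is not None:
        return body_days  # the body value takes precedence over :memory-<days>
    if match.group(1) is None:
        return DEFAULT_RETENTION_DAYS
    # Suffix retention is clamped to the 1..365 range.
    return max(1, min(365, int(match.group(1))))
```

With these rules, `resolve_memory_days("openai/gpt-5.2:online:memory-90", {})` gives 90, and adding `"memory": False` to the body gives None.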
PII Redaction Suffixes
| Suffix | Meaning |
|---|---|
| :redaction | Enable PII redaction for this request. |
| :redacted | Alias for :redaction. |
| :piiredaction | Alias for :redaction. |
| :piiredacted | Alias for :redaction. |
Redaction suffixes compose with other suffixes such as :online. If a request explicitly enables redaction with a model suffix, redaction remains enabled even if an account-level or API-key default would otherwise leave it disabled for that request.
For details, pricing, and limitations, see PII Redaction.
Reasoning and Thinking Suffixes
| Suffix | Meaning |
|---|---|
:thinking | Model-specific thinking/reasoning variant when that exact ID or documented alias exists. |
-thinking | Legacy alias pattern for some model families only, not universal. |
:<number> on Anthropic thinking IDs | Thinking budget suffix for mapped Anthropic thinking aliases, for example claude-sonnet-4-thinking:8192. |
:reasoning-exclude | Equivalent to reasoning: { "exclude": true }; hides reasoning fields/blocks and strips the suffix before routing. |
:reasoning-exclude works on Chat Completions and Text Completions. It composes with other suffixes such as :thinking, :online, and :memory. :thinking is model identity, not provider routing, and is not universal.
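To make the equivalence concrete, the sketch below rewrites a request body that uses :reasoning-exclude into the body-field form. The function name is hypothetical and the model ID in the usage example is a placeholder; other suffixes, including :thinking, are left in place.

```python
def expand_reasoning_exclude(body):
    """Return a copy of body with ':reasoning-exclude' moved into reasoning.exclude."""
    parts = body["model"].split(":")
    if "reasoning-exclude" not in parts[1:]:
        return dict(body)  # nothing to rewrite
    # Drop only the reasoning-exclude suffix; keep the base ID and other suffixes.
    model = ":".join(
        p for i, p in enumerate(parts) if i == 0 or p != "reasoning-exclude"
    )
    reasoning = {**body.get("reasoning", {}), "exclude": True}
    return {**body, "model": model, "reasoning": reasoning}
```

For example, a body with model `some-model:thinking:reasoning-exclude` becomes one with model `some-model:thinking` and `"reasoning": {"exclude": True}`.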
Official and Original Route Suffixes
Some model families expose :official / :original aliases to force the official-provider route. These are model-specific; check the model page or provider-selection docs for supported IDs.