Provider Selection
Provider selection chooses the upstream provider for a supported model. It does not change the model ID. For example, to use Kimi K2.6 through Novita, keep the model as moonshotai/kimi-k2.6 (or one of its supported aliases), and send the provider separately with X-Provider.
Provider selection is optional; if you do nothing, the platform picks the provider. X-Provider explicitly selects a provider for that request. Explicit provider selection is always pay-as-you-go and is charged at the selected provider’s price, including provider-selection markup. For subscription users, sending X-Provider bypasses subscription coverage for that request.
X-Billing-Mode: paygo is only needed when forcing pay-as-you-go billing without an explicit provider, or when saved provider preferences should apply to subscription-included traffic. See Pay-As-You-Go Billing Override.
This page intentionally documents only public provider IDs returned by the provider-discovery endpoints. Internal routing/provider names are never part of the public API contract.
When Provider Selection Applies
- Provider selection only applies when a model reports supportsProviderSelection: true.
- Use GET /api/models/:canonicalId/providers to discover eligible models and provider IDs.
- supportsProviderSelection is exposed on the model-specific provider endpoint. Do not rely on /api/v1/models or /api/v1/models?detailed=true to determine provider-selection support.
- If a model does not support provider selection, the request ignores provider preferences.
- You cannot currently force a provider and have that same request count as subscription-included usage.
Discover Providers and Pricing
Use this endpoint to list available providers and the pricing you will pay when selecting one:
GET /api/models/:canonicalId/providers
This endpoint is not under /api/v1. If your base URL is https://nano-gpt.com/api/v1, do not append this path to that base URL. Use https://nano-gpt.com/api/models/:canonicalId/providers instead.
If the model ID contains a /, URL-encode the slash when placing it in the path, for example moonshotai%2Fkimi-k2.6.
:canonicalId means the model ID or one of its supported aliases, such as moonshotai/kimi-k2.6 for Kimi K2.6.
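As a minimal sketch, the discovery URL can be built as shown here. The helper name providers_url is hypothetical, and the site origin is assumed to be the base URL without its /api/v1 suffix.

```python
from urllib.parse import quote

# Assumption: the site origin is the base URL without its /api/v1 suffix.
SITE_ORIGIN = "https://nano-gpt.com"


def providers_url(model_id: str, origin: str = SITE_ORIGIN) -> str:
    """Return the provider-discovery URL for a model ID or alias."""
    # safe="" makes quote() encode "/" as %2F instead of leaving it as-is.
    encoded = quote(model_id, safe="")
    return f"{origin}/api/models/{encoded}/providers"


print(providers_url("moonshotai/kimi-k2.6"))
```

Encoding the whole model ID with an empty safe set keeps the slash from being read as a path separator.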
In the response:
- defaultPrice is used when you do not select a provider.
- If providers[].pricing is present, it is what you pay when you select that provider (includes the markup).
- Use the exact value from providers[].provider in the X-Provider header.
- If a model is unsupported, the response includes supportsProviderSelection: false.
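The field notes above could be applied on the client roughly as follows. This is a sketch only: the SAMPLE object is hypothetical, its pricing contents are placeholders rather than real prices, and price_for is an illustrative helper, not part of the API.

```python
from typing import Any, Optional

# Hypothetical sample shaped after the fields named above; the contents of
# the pricing objects are placeholders, not real prices.
SAMPLE = {
    "supportsProviderSelection": True,
    "defaultPrice": {"placeholder": "default"},
    "providers": [
        {"provider": "novita", "pricing": {"placeholder": "novita"}},
        {"provider": "baseten"},  # no pricing field present
    ],
}


def price_for(response: dict, provider_id: Optional[str] = None) -> Any:
    """Return the price object that applies to a request."""
    if provider_id is None:
        # No provider selected: the default price applies.
        return response["defaultPrice"]
    if not response.get("supportsProviderSelection"):
        raise ValueError("model does not support provider selection")
    for entry in response.get("providers", []):
        # Provider IDs are matched exactly, as they must be in X-Provider.
        if entry["provider"] == provider_id:
            # pricing may be absent; return None so the caller can decide.
            return entry.get("pricing")
    raise ValueError(f"unknown provider ID: {provider_id}")
```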
Per-Request Provider Override
For a single request, send the provider ID in the X-Provider header.
- X-Provider is case-insensitive (x-provider also works).
- The provider ID must be one of the values returned by the providers endpoint.
- Recommended: use X-Provider for explicit provider overrides. The API also accepts a trailing provider suffix for user-selectable providers, such as model-id:cerebras, but X-Provider is clearer and easier to validate against provider-discovery responses.
- To avoid Moonshot for Kimi K2.6, select any available non-Moonshot provider ID returned by the provider-discovery endpoint, such as novita, cloudflare, baseten, or inceptron.
- If the request would otherwise be subscription-covered, X-Provider still bypasses subscription coverage and bills the request as pay-as-you-go. You do not need to also send billing_mode: "paygo" or X-Billing-Mode: paygo.
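A per-request override might be assembled like this. The sketch assumes bearer-token authentication and an OpenAI-style POST body for /api/v1/chat/completions, neither of which is stated on this page; build_chat_request and known_providers are hypothetical names.

```python
from typing import Iterable, Optional


def build_chat_request(
    api_key: str,
    model: str,
    messages: list,
    provider: Optional[str] = None,
    known_providers: Iterable[str] = (),
) -> tuple:
    """Return (headers, body) for a chat completion request."""
    headers = {
        # Assumption: bearer-token auth; check the authentication docs.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if provider is not None:
        # Validate against the IDs returned by provider discovery.
        if provider not in set(known_providers):
            raise ValueError(f"provider not returned by discovery: {provider}")
        headers["X-Provider"] = provider
    # The model ID stays unchanged; the provider travels in the header only.
    body = {"model": model, "messages": messages}
    return headers, body
```

Validating the provider ID before sending keeps typos from reaching the API, since the header value must match a discovered ID.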
Consecutive Prompts and Prompt Caching
When you explicitly select a provider with X-Provider, NanoGPT routes that request to the selected provider. For consecutive prompts, using the same X-Provider helps keep routing stable, which is the setup you want for provider-side prompt-cache reuse when that provider and model support caching.
For capability-based routing, use top-level caching: true instead of choosing a specific provider. NanoGPT will route to any available provider that supports prompt/input caching for the requested provider-selection model, and it defaults to sticky provider routing so later matching requests prefer the same provider.
Do not rely on general automatic routing if provider-side cache affinity is important. Automatic routing may choose different upstream providers across requests. For cache-sensitive workflows, either consistently send the same X-Provider value or use caching: true when any cache-capable provider is acceptable.
Cache-Capable Provider Routing
Set caching: true when you want NanoGPT to route a chat completion request to any available provider that supports prompt/input caching. This is capability-based provider selection, not explicit provider selection and not prompt-cache annotation. It does not add Anthropic-style cache_control markers or configure cache TTLs.
caching: true is sticky. The first successful matching request records the selected provider by API key or session. Later matching requests prefer that same provider when it is still usable, improving the chance of provider-side cache hits. NanoGPT does not guarantee that the request will be served from cache.
To require a cache-capable provider without stickiness, disable sticky routing with the stickyprovider request field. stickyProvider is accepted as a camelCase alias for stickyprovider.
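A request body for cache-capable routing could look like the sketch below. It assumes stickiness is turned off by sending stickyprovider as a boolean false at the top level of the body, which this page does not spell out; caching_body is a hypothetical helper.

```python
def caching_body(model: str, messages: list, sticky: bool = True) -> dict:
    """Build a chat completion body that asks for cache-capable routing."""
    # caching is a top-level field, next to model and messages.
    body = {"model": model, "messages": messages, "caching": True}
    if not sticky:
        # Assumption: stickiness is turned off with a boolean false value.
        body["stickyprovider"] = False
    return body
```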
Routing order for caching: true:
- Filter to providers that are available, not excluded by preferences, and marked as prompt-caching capable.
- If stickiness is enabled, prefer the previously recorded provider for the same cache-relevant request shape when still usable.
- Otherwise choose the cheapest cache-capable provider by base input + output price.
- Use cache write/read pricing only as tie-breakers.
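The routing order above can be sketched as a small selection function. This illustrates the described steps and is not NanoGPT's implementation: the dictionary keys are hypothetical, and comparing the cache write price before the cache read price is an assumption.

```python
from typing import Optional


def pick_cache_provider(
    providers: list,
    excluded: set,
    sticky_id: Optional[str] = None,
) -> Optional[str]:
    """Pick a provider ID following the routing order described above."""
    # Step 1: keep available, non-excluded, cache-capable providers.
    usable = [
        p for p in providers
        if p["available"] and p["caching"] and p["id"] not in excluded
    ]
    if not usable:
        return None
    # Step 2: prefer the previously recorded provider if still usable.
    if sticky_id is not None and any(p["id"] == sticky_id for p in usable):
        return sticky_id
    # Steps 3-4: cheapest by input + output; cache prices only break ties.
    # Assumption: write price is compared before read price.
    best = min(
        usable,
        key=lambda p: (
            p["input"] + p["output"],
            p["cache_write"],
            p["cache_read"],
        ),
    )
    return best["id"]
```

Sorting on a tuple means the cache prices are consulted only when the base input-plus-output prices are equal.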
Per-Request Routing Preference
For provider-selection-capable models, you can append a routing preference suffix to the model ID:
- :speed / :fast: best estimated completion time using TTFT plus TPS.
- :throughput: highest tokens per second.
- :latency: lowest time to first token.
- :price / :cheap: lowest base input-plus-output token price.
- :floor: alias for :price.
- :tools: route to a tools-capable provider path when the model supports tools provider selection.
The following rules apply to these suffixes:
- Routing preference suffixes are billed like explicit provider-selection requests.
- They cannot be combined with X-Provider, body provider, or provider model suffixes.
- :tools cannot be combined with routing preference suffixes or caching: true.
- If the model does not support provider selection, the request returns an invalid request error for these suffixes.
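A client could check these combinations before sending a request. The sketch below is a hypothetical client-side helper, not the server's parser; it matches suffixes exactly as written above and leaves anything else (such as a provider suffix) untouched.

```python
# Suffix aliases taken from the list above, mapped to one canonical name.
ROUTING_SUFFIXES = {
    "speed": "speed", "fast": "speed",
    "throughput": "throughput",
    "latency": "latency",
    "price": "price", "cheap": "price", "floor": "price",
    "tools": "tools",
}


def split_routing_suffix(model: str) -> tuple:
    """Split 'model-id:suffix' into (model_id, canonical preference or None)."""
    base, sep, suffix = model.rpartition(":")
    if sep and suffix in ROUTING_SUFFIXES:
        return base, ROUTING_SUFFIXES[suffix]
    # Anything else (for example a provider suffix) is left untouched.
    return model, None


def check_combination(model: str, headers: dict, body: dict) -> None:
    """Raise if a routing suffix is mixed with explicit provider selection."""
    _, preference = split_routing_suffix(model)
    if preference is None:
        return
    # Header names are compared case-insensitively.
    header_names = {name.lower() for name in headers}
    if "x-provider" in header_names or "provider" in body:
        raise ValueError("routing suffix cannot be combined with a provider")
    if preference == "tools" and body.get("caching"):
        raise ValueError(":tools cannot be combined with caching: true")
```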
Persistent Provider Preferences
These endpoints let a user save provider preferences in their session metadata. [...] X-Provider, unless preferences were already saved on the associated user session.
Saved provider preferences apply to pay-as-you-go traffic. For subscription-included traffic, send X-Billing-Mode: paygo or billing_mode: "paygo" when you want saved provider preferences to apply.
Fields in the GET response and the PATCH payload:
- preferredProviders: ordered list of allowed providers; the system tries each in order.
- excludedProviders: providers that should never be used.
- enableFallback: when true (default), fall back to the platform default if no preferred provider is available.
- modelOverrides: optional per-model overrides for the fields above.
- availableProviders: full set of provider IDs available to your account for the model.
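A PATCH body using these field names might be built as in the sketch below. The overlap check is a client-side sanity check of my own rather than documented server behavior, keying modelOverrides by model ID is an assumption, and build_preferences_payload is a hypothetical helper.

```python
import json
from typing import Optional


def build_preferences_payload(
    preferred: list,
    excluded: list,
    enable_fallback: bool = True,
    model_overrides: Optional[dict] = None,
) -> str:
    """Serialize a PATCH body using the field names listed above."""
    # Client-side sanity check (an assumption, not documented server logic).
    overlap = set(preferred) & set(excluded)
    if overlap:
        raise ValueError(f"both preferred and excluded: {sorted(overlap)}")
    payload = {
        "preferredProviders": preferred,  # tried in this order
        "excludedProviders": excluded,    # never used
        "enableFallback": enable_fallback,
    }
    if model_overrides:
        # Assumption: overrides are keyed by model ID.
        payload["modelOverrides"] = model_overrides
    return json.dumps(payload)
```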
Resolution Order
When caching: true is set, use the routing order in Cache-Capable Provider Routing. Otherwise, when multiple selections exist, the system resolves providers in this order:
- Per-request explicit provider selection (X-Provider, body provider, or provider suffix)
- Per-request routing preference suffix (:fast, :cheap, etc.)
- Per-model preferredProviders
- Global preferredProviders
- Platform default (only if enableFallback is true)
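The resolution order above can be read as a chain of fallbacks. The following sketch models it with hypothetical function and parameter names; how the platform checks provider availability internally is not documented here, so the available set is an assumption.

```python
from typing import Optional, Sequence


def first_available(candidates: Sequence[str], available: set) -> Optional[str]:
    """Return the first candidate that is currently available, if any."""
    return next((c for c in candidates if c in available), None)


def resolve_provider(
    available: set,
    explicit: Optional[str] = None,
    routing_preference: Optional[str] = None,
    model_preferred: Sequence[str] = (),
    global_preferred: Sequence[str] = (),
    enable_fallback: bool = True,
) -> tuple:
    """Return (source, value) for the first rule that yields a choice."""
    if explicit is not None:
        return "explicit", explicit
    if routing_preference is not None:
        return "routing_preference", routing_preference
    # Per-model preferences are tried before global ones, each in order.
    choice = first_available(model_preferred, available)
    if choice is not None:
        return "model_preferred", choice
    choice = first_available(global_preferred, available)
    if choice is not None:
        return "global_preferred", choice
    if enable_fallback:
        return "platform_default", None
    # Mirrors the documented error code for disabled fallback.
    raise LookupError("no_fallback_available")
```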
Billing and Markup
- If you explicitly select a provider with X-Provider, billing uses the provider-specific price plus a 5% markup and the request is treated as pay-as-you-go.
- If saved preferences select a provider on pay-as-you-go traffic, billing uses the provider-specific price plus a 5% markup.
- If you do not select a provider, billing uses the model’s default price.
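The markup arithmetic is simple enough to check by hand; a small sketch follows. The base price is a made-up illustration value. Note that providers[].pricing from the discovery endpoint already includes the markup, so a helper like this would apply only to a raw provider price.

```python
from decimal import Decimal

MARKUP = Decimal("1.05")  # the 5% provider-selection markup stated above


def billed_price(provider_price: Decimal) -> Decimal:
    """Apply the markup to a provider's base price."""
    return provider_price * MARKUP


# Hypothetical base price per unit, for illustration only.
print(billed_price(Decimal("2.00")))  # 2.1000
```

Decimal avoids binary floating-point rounding, which matters when prices are compared or summed.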
Error Behavior
- If enableFallback is false and no preferred provider is available, /api/v1/chat/completions returns 400 with error.code: "no_fallback_available".
- PATCH /api/user/provider-preferences validates provider IDs and returns:
  - 422 INVALID_INPUT for malformed payloads.
  - 400 INVALID_EXCLUSIONS if exclusions would leave a model with no usable provider.
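Client code could map these documented status and code pairs to readable messages, as in the sketch below. How the code is extracted from each response body is left to the caller, because the exact error body shape for the PATCH endpoint is not shown on this page; describe_error is a hypothetical helper.

```python
def describe_error(status: int, code: str) -> str:
    """Map a documented status/code pair to a short explanation."""
    known = {
        (400, "no_fallback_available"):
            "No preferred provider is available and fallback is disabled.",
        (422, "INVALID_INPUT"):
            "The preferences payload is malformed.",
        (400, "INVALID_EXCLUSIONS"):
            "The exclusions leave a model with no usable provider.",
    }
    # Unknown pairs fall through to a generic message.
    return known.get((status, code), f"Unhandled error {status} {code}")
```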