Provider Selection

Provider selection chooses the upstream provider for a supported model. It does not change the model ID. For example, to use Kimi K2.6 through Novita, keep the model as moonshotai/kimi-k2.6 and send the provider separately with X-Provider.
Provider selection is optional; if you do nothing, the platform picks the provider. X-Provider explicitly selects a provider for that request. Explicit provider selection is always pay-as-you-go and is charged at the selected provider’s price, including provider-selection markup. For subscription users, sending X-Provider bypasses subscription coverage for that request.
X-Billing-Mode: paygo is only needed when forcing pay-as-you-go billing without an explicit provider, or when saved provider preferences should apply to subscription-included traffic. See Pay-As-You-Go Billing Override.
This page intentionally documents only public provider IDs returned by the provider-discovery endpoints. Internal routing/provider names are never part of the public API contract.

When Provider Selection Applies

  • Provider selection only applies when a model reports supportsProviderSelection: true.
  • Use GET /api/models/:canonicalId/providers to discover eligible models and provider IDs.
  • supportsProviderSelection is exposed on the model-specific provider endpoint. Do not rely on /api/v1/models or /api/v1/models?detailed=true to determine provider-selection support.
  • If a model does not support provider selection, the request ignores provider preferences.
  • You cannot currently force a provider and have that same request count as subscription-included usage.

Discover Providers and Pricing

Use this endpoint to list available providers and the pricing you will pay when selecting one.
GET /api/models/:canonicalId/providers
This endpoint is not under /api/v1. If your base URL is https://nano-gpt.com/api/v1, do not append this path to that base URL; call it from the API root instead. If the model ID contains /, URL-encode the slash as %2F when placing it in the path:
GET https://nano-gpt.com/api/models/moonshotai%2Fkimi-k2.6/providers
Do not use the unencoded form, because the slash is treated as a path separator:
GET https://nano-gpt.com/api/models/moonshotai/kimi-k2.6/providers
Here :canonicalId means the model ID or one of its supported aliases; for Kimi K2.6, moonshotai/kimi-k2.6 works.
Example response:
{
  "canonicalId": "moonshotai/kimi-k2.6",
  "displayName": "Kimi K2.6",
  "supportsProviderSelection": true,
  "defaultPrice": {
    "inputPer1kTokens": 0.0005,
    "outputPer1kTokens": 0.0026
  },
  "providers": [
    {
      "provider": "moonshot",
      "available": true
    },
    {
      "provider": "novita",
      "available": true
    },
    {
      "provider": "cloudflare",
      "available": true
    },
    {
      "provider": "baseten",
      "available": true
    }
  ]
}
Notes:
  • defaultPrice is used when you do not select a provider.
  • If providers[].pricing is present, it is what you pay when you select that provider (includes the markup).
  • Use the exact value from providers[].provider in the X-Provider header.
  • If a model is unsupported, the response includes supportsProviderSelection: false.
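As a minimal sketch of the discovery step, the Python below builds the providers URL with the slash percent-encoded and pulls the selectable provider IDs out of a response shaped like the example above. The helper names (providers_url, selectable_providers) are hypothetical, and no network request is made; pass the URL to whatever HTTP client you already use.

```python
from urllib.parse import quote

# Root taken from the example above; note that it is NOT the /api/v1 base URL.
API_ROOT = "https://nano-gpt.com/api"


def providers_url(model_id: str) -> str:
    """Build the provider-discovery URL, percent-encoding '/' in the model ID."""
    # safe="" makes quote() encode '/' as %2F instead of keeping it as a separator.
    return f"{API_ROOT}/models/{quote(model_id, safe='')}/providers"


def selectable_providers(response: dict) -> list[str]:
    """Return the provider IDs that can be sent in X-Provider for this model."""
    if not response.get("supportsProviderSelection"):
        return []
    return [p["provider"] for p in response.get("providers", []) if p.get("available")]


# Trimmed copy of the example response shown above.
example = {
    "canonicalId": "moonshotai/kimi-k2.6",
    "supportsProviderSelection": True,
    "providers": [
        {"provider": "moonshot", "available": True},
        {"provider": "novita", "available": True},
    ],
}

print(providers_url("moonshotai/kimi-k2.6"))
print(selectable_providers(example))
```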

Per-Request Provider Override

For a single request, send the provider ID in the X-Provider header.
curl https://nano-gpt.com/api/v1/chat/completions \
  -H "Authorization: Bearer $NANOGPT_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Provider: novita" \
  -d '{
    "model": "kimi-k2.6",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
Notes:
  • X-Provider is case-insensitive (x-provider also works).
  • The provider ID must be one of the values returned by the providers endpoint.
  • Recommended: use X-Provider for explicit provider overrides. The API also accepts a trailing provider suffix for user-selectable providers, such as model-id:cerebras, but X-Provider is clearer and easier to validate against provider-discovery responses.
  • To avoid Moonshot for Kimi K2.6, select any available non-Moonshot provider ID returned by the provider-discovery endpoint, such as novita, cloudflare, baseten, or inceptron.
  • If the request would otherwise be subscription-covered, X-Provider still bypasses subscription coverage and bills the request as pay-as-you-go. You do not need to also send billing_mode: "paygo" or X-Billing-Mode: paygo.
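The same request can be assembled in Python. The sketch below only builds the headers and JSON body and checks the provider ID against a list obtained from the providers endpoint; the function name build_chat_request and the local validation step are illustrative assumptions, not part of the API.

```python
import json


def build_chat_request(api_key: str, model: str, prompt: str,
                       provider: str, discovered: list[str]) -> tuple[dict, str]:
    """Return (headers, body) for a chat completion pinned to one provider."""
    # Validate locally against the IDs returned by the providers endpoint.
    if provider not in discovered:
        raise ValueError(f"unknown provider {provider!r}; expected one of {discovered}")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "X-Provider": provider,  # explicit selection, billed as pay-as-you-go
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body
```

POST the returned body with these headers to https://nano-gpt.com/api/v1/chat/completions. No billing_mode field is added, because X-Provider alone already makes the request pay-as-you-go.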

Consecutive Prompts and Prompt Caching

When you explicitly select a provider with X-Provider, NanoGPT routes that request to the selected provider. For consecutive prompts, sending the same X-Provider value keeps routing stable, which is what provider-side prompt-cache reuse needs when that provider and model support caching.
For capability-based routing, use top-level caching: true instead of choosing a specific provider. NanoGPT will route to any available provider that supports prompt/input caching for the requested provider-selection model, and it defaults to sticky provider routing so later matching requests prefer the same provider.
Do not rely on general automatic routing if provider-side cache affinity is important, because automatic routing may choose different upstream providers across requests. For cache-sensitive workflows, either consistently send the same X-Provider value or use caching: true when any cache-capable provider is acceptable.

Cache-Capable Provider Routing

Set caching: true when you want NanoGPT to route a chat completion request to any available provider that supports prompt/input caching. This is capability-based provider selection, not explicit provider selection and not prompt-cache annotation. It does not add Anthropic-style cache_control markers or configure cache TTLs.
{
  "model": "model-id",
  "caching": true,
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}
If no usable cache-capable provider exists for the model, the request fails instead of falling back to a non-cache-capable provider.
By default, caching: true is sticky. The first successful matching request records the selected provider by API key or session. Later matching requests prefer that same provider when it is still usable, improving the chance of provider-side cache hits. NanoGPT does not guarantee that the request will be served from cache.
To require a cache-capable provider without stickiness:
{
  "model": "model-id",
  "caching": true,
  "stickyprovider": false,
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}
Top-level stickyProvider is accepted as a camelCase alias for stickyprovider.
Routing order for caching: true:
  1. Filter to providers that are available, not excluded by preferences, and marked as prompt-caching capable.
  2. If stickiness is enabled, prefer the previously recorded provider for the same cache-relevant request shape when still usable.
  3. Otherwise choose the cheapest cache-capable provider by base input + output price.
  4. Use cache write/read pricing only as tie-breakers.
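The four steps above can be read as a small selection function. The sketch below is an illustration only, not NanoGPT's implementation: the Provider fields are hypothetical, and the order of the two tie-breakers (write price before read price) is an assumption, since the text only says both are tie-breakers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provider:
    # Hypothetical fields modelling the inputs the routing steps mention.
    id: str
    available: bool
    supports_caching: bool
    input_price: float
    output_price: float
    cache_write_price: float = 0.0
    cache_read_price: float = 0.0


def pick_cache_provider(providers, excluded=(), sticky: Optional[str] = None,
                        sticky_enabled: bool = True) -> str:
    # Step 1: keep providers that are available, not excluded, and cache-capable.
    usable = [p for p in providers
              if p.available and p.supports_caching and p.id not in excluded]
    if not usable:
        # Mirrors the documented behavior: fail rather than fall back.
        raise LookupError("no usable cache-capable provider")
    # Step 2: prefer the previously recorded provider while it is still usable.
    if sticky_enabled and sticky is not None:
        for p in usable:
            if p.id == sticky:
                return p.id
    # Steps 3 and 4: cheapest base input + output price, cache prices break ties.
    best = min(usable, key=lambda p: (p.input_price + p.output_price,
                                      p.cache_write_price, p.cache_read_price))
    return best.id


candidates = [
    Provider("a", True, True, 1.0, 2.0, cache_write_price=0.5),
    Provider("b", True, True, 1.5, 1.5, cache_write_price=0.25),
    Provider("c", True, False, 0.1, 0.1),   # cheap but not cache-capable
    Provider("d", False, True, 0.1, 0.1),   # cache-capable but unavailable
]
print(pick_cache_provider(candidates))
```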

Per-Request Routing Preference

For provider-selection-capable models, you can append a routing preference suffix to the model ID:
  • :speed / :fast: best estimated completion time, combining time to first token (TTFT) and tokens per second (TPS).
  • :throughput: highest tokens per second.
  • :latency: lowest time to first token.
  • :price / :cheap: lowest base input-plus-output token price.
  • :floor: alias for :price.
  • :tools: route to a tools-capable provider path when the model supports tools provider selection.
Example:
{
  "model": "zai-org/glm-5:fast",
  "messages": [{ "role": "user", "content": "Hello" }]
}
Notes:
  • Routing preference suffixes are billed like explicit provider-selection requests.
  • They cannot be combined with X-Provider, body provider, or provider model suffixes.
  • :tools cannot be combined with routing preference suffixes or caching: true.
  • If the model does not support provider selection, the request returns an invalid request error for these suffixes.
See Model Suffixes for the complete suffix reference and composition rules.
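A client can catch the documented X-Provider conflict before sending a request. The following sketch splits a routing preference suffix off the model ID and rejects the combination with X-Provider; the function names and the alias table are hypothetical, :tools and provider suffixes are deliberately left out, and treating :fast as a synonym of :speed (and :cheap of :price) is inferred from how the list above pairs them.

```python
# Suffixes from the list above, mapped to one canonical name per preference.
ROUTING_SUFFIXES = {
    "speed": "speed", "fast": "speed",
    "throughput": "throughput",
    "latency": "latency",
    "price": "price", "cheap": "price", "floor": "price",
}


def split_routing_suffix(model: str):
    """Return (base_model_id, preference), with preference None if absent."""
    base, sep, suffix = model.rpartition(":")
    if sep and suffix in ROUTING_SUFFIXES:
        return base, ROUTING_SUFFIXES[suffix]
    return model, None


def check_compatible(model: str, headers: dict) -> None:
    """Raise if a routing preference suffix is combined with X-Provider."""
    _, preference = split_routing_suffix(model)
    # Header names are case-insensitive, so compare in lowercase.
    has_explicit = any(name.lower() == "x-provider" for name in headers)
    if preference is not None and has_explicit:
        raise ValueError("routing preference suffix cannot be combined with X-Provider")


print(split_routing_suffix("zai-org/glm-5:fast"))
```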

Persistent Provider Preferences

These endpoints let a user save provider preferences in their session metadata.
GET /api/user/provider-preferences
PATCH /api/user/provider-preferences
DELETE /api/user/provider-preferences
These endpoints require a logged-in web session. API-key-only requests should use per-request provider selection with X-Provider, unless preferences were already saved on the associated user session.
Saved provider preferences apply to pay-as-you-go traffic. For subscription-included traffic, send X-Billing-Mode: paygo or billing_mode: "paygo" when you want saved provider preferences to apply.
Example GET response (placeholders):
{
  "preferredProviders": ["provider-a", "provider-b"],
  "excludedProviders": ["provider-c"],
  "enableFallback": true,
  "modelOverrides": {
    "model-id": {
      "preferredProviders": ["provider-b"],
      "enableFallback": false
    }
  },
  "availableProviders": ["provider-a", "provider-b", "provider-c"]
}
Example PATCH payload:
{
  "preferredProviders": ["provider-a", "provider-b"],
  "excludedProviders": ["provider-c"],
  "enableFallback": false,
  "modelOverrides": {
    "model-id": {
      "preferredProviders": ["provider-b"],
      "enableFallback": true
    }
  }
}
Field details:
  • preferredProviders: ordered list of allowed providers; the system tries each in order.
  • excludedProviders: providers that should never be used.
  • enableFallback: when true (default), fall back to the platform default if no preferred provider is available.
  • modelOverrides: optional per-model overrides for the fields above.
  • availableProviders: full set of provider IDs available to your account for the model.

Resolution Order

When caching: true is set, use the routing order in Cache-Capable Provider Routing. Otherwise, when multiple selections exist, the system resolves providers in this order:
  1. Per-request explicit provider selection (X-Provider, body provider, or provider suffix)
  2. Per-request routing preference suffix (:fast, :cheap, etc.)
  3. Per-model preferredProviders
  4. Global preferredProviders
  5. Platform default (only if enableFallback is true)
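The order above can be sketched as a first-match function. This is an illustration under assumptions, not the platform's code: the parameter names are hypothetical, routing_pick stands for a provider already chosen by a routing preference suffix, and the LookupError stands in for the no_fallback_available error described under Error Behavior.

```python
def resolve_provider(available, explicit=None, routing_pick=None,
                     model_preferred=(), global_preferred=(),
                     enable_fallback=True, platform_default="platform-default"):
    """Return the provider chosen by the first rule that yields one."""
    # 1. Explicit selection (X-Provider, body provider, or provider suffix).
    if explicit is not None:
        return explicit
    # 2. Routing preference suffix, already resolved to a provider by the caller.
    if routing_pick is not None:
        return routing_pick
    # 3 and 4. Per-model list first, then the global list, each tried in order.
    for candidate in (*model_preferred, *global_preferred):
        if candidate in available:
            return candidate
    # 5. Platform default, only when fallback is enabled.
    if enable_fallback:
        return platform_default
    raise LookupError("no_fallback_available")


print(resolve_provider({"provider-a", "provider-b"},
                       model_preferred=["provider-c", "provider-b"],
                       global_preferred=["provider-a"]))
```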

Billing and Markup

  • If you explicitly select a provider with X-Provider, billing uses the provider-specific price plus a 5% markup and the request is treated as pay-as-you-go.
  • If saved preferences select a provider on pay-as-you-go traffic, billing uses the provider-specific price plus a 5% markup.
  • If you do not select a provider, billing uses the model’s default price.
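To make the arithmetic concrete, the sketch below estimates the cost of one request from per-1k-token prices. The default prices come from the example response earlier on this page; the provider base prices (0.0004 and 0.0020) are invented for illustration, and the function name is hypothetical. If providers[].pricing is present in the discovery response, it already includes the markup, so pass markup=0.0 with those values.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_1k: float, output_per_1k: float,
                 markup: float = 0.0) -> float:
    """Estimate the cost of one request from per-1k-token prices."""
    base = (input_tokens / 1000) * input_per_1k + (output_tokens / 1000) * output_per_1k
    # The markup applies on top of the provider-specific price.
    return base * (1 + markup)


# No provider selected: the model's default price applies.
default_cost = request_cost(2000, 500, 0.0005, 0.0026)

# Provider selected: hypothetical provider base prices plus the 5% markup.
selected_cost = request_cost(2000, 500, 0.0004, 0.0020, markup=0.05)

print(f"{default_cost:.5f} {selected_cost:.5f}")
```

For 2,000 input and 500 output tokens, the default price works out to 2 × 0.0005 + 0.5 × 0.0026 = 0.0023, and the hypothetical provider price to (2 × 0.0004 + 0.5 × 0.0020) × 1.05 = 0.00189.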

Error Behavior

  • If enableFallback is false and no preferred provider is available, /api/v1/chat/completions returns 400 with error.code: "no_fallback_available".
  • PATCH /api/user/provider-preferences validates provider IDs and returns:
    • 422 INVALID_INPUT for malformed payloads.
    • 400 INVALID_EXCLUSIONS if exclusions would leave a model with no usable provider.
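A client might detect the chat-completions case like this. The sketch assumes the error body is JSON with a nested error.code field, as the first bullet implies; the rest of the body shape and the function name are assumptions.

```python
def is_no_fallback_error(status: int, body: dict) -> bool:
    """True when a chat completion failed because no preferred provider was usable."""
    error = body.get("error") or {}
    return (
        status == 400
        and isinstance(error, dict)
        and error.get("code") == "no_fallback_available"
    )


print(is_no_fallback_error(400, {"error": {"code": "no_fallback_available"}}))
```

When this returns True, the caller can retry with a different X-Provider value or with enableFallback turned back on in the saved preferences.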