Overview
Some models generate a separate reasoning stream (sometimes called thinking) in addition to the final answer content. NanoGPT exposes this in an OpenAI-compatible way for Chat Completions:- Streaming (SSE):
choices[0].delta.reasoning(or legacychoices[0].delta.reasoning_content) - Non-streaming:
choices[0].message.reasoning(or legacychoices[0].message.reasoning_content)
Endpoint Variants (Chat Completions)
NanoGPT provides three base paths for chat completions. All accept the same request format and model names, but differ in how reasoning content is delivered:| Base URL | Behavior | Use When |
|---|---|---|
/api/v1/chat/completions | Reasoning and answer are separate fields (reasoning + content). | Most OpenAI-compatible clients |
/api/v1legacy/chat/completions | Same as /api/v1/, but uses the legacy field name reasoning_content. | Clients that only parse reasoning_content |
/api/v1thinking/chat/completions | Reasoning and answer are merged into the normal content stream. | Clients that ignore reasoning fields but should still display thoughts |
Controlling Reasoning Output
Hide Reasoning
To strip reasoning from the response (both streaming and non-streaming), send::reasoning-exclude
Reasoning Effort
For reasoning-capable models, you can ask for lighter/heavier reasoning:none, minimal, low, medium, high.
Legacy Field Name Compatibility (reasoning_content)
If a client expects reasoning_content instead of reasoning, you can:
- Use
/api/v1legacy/chat/completions, or - Set
reasoning.delta_field: "reasoning_content", or - Use the shorthands
reasoning_delta_field/reasoning_content_compat.
How Reasoning Appears in Responses
Streaming (SSE)
Reasoning deltas appear before or alongside content deltas:Non-Streaming
The final message may include a separate reasoning field:Cost Notes
Reasoning tokens are billed as output tokens. If you enable higher reasoning effort (or use thinking variants), expect highercompletion_tokens and higher cost.
See Also
- Chat Completions: reasoning controls and endpoint variants (Chat Completion)
- Streaming protocol details across endpoints (Streaming Protocol)