Reasoning
Some Mixlayer models support an extended thinking mode where the model produces an internal chain of thought before its visible answer. The reasoning is returned in a separate reasoning_content field on the assistant message — you can show it to users, log it for debugging, or just ignore it.
Enabling thinking
There are two equivalent ways to enable thinking on a request:
{ "thinking": true }
or, for OpenAI compatibility:
{ "reasoning_effort": "low" | "medium" | "high" }
Both toggle the same underlying behavior. reasoning_effort is accepted as an alias and currently maps to a boolean enable/disable; the specific effort level is reserved for future use.
To explicitly disable thinking on a model that defaults to it, send thinking: false.
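As a rough sketch, the two options above could be wrapped in a small helper that builds the request body. The helper name build_chat_request is hypothetical and not part of any Mixlayer SDK; only the thinking and reasoning_effort fields come from the text above. Rejecting requests that set both fields is a choice made in this sketch, not documented gateway behavior.

```python
from typing import Any, Optional

VALID_EFFORTS = {"low", "medium", "high"}


def build_chat_request(
    model: str,
    prompt: str,
    thinking: Optional[bool] = None,
    reasoning_effort: Optional[str] = None,
) -> dict[str, Any]:
    """Build a chat completions body (hypothetical helper, not a Mixlayer API)."""
    if thinking is not None and reasoning_effort is not None:
        # Both fields toggle the same behavior; this sketch insists on one of them.
        raise ValueError("set either thinking or reasoning_effort, not both")

    body: dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking is not None:
        # thinking=False explicitly disables thinking on models that default to it.
        body["thinking"] = thinking
    elif reasoning_effort is not None:
        if reasoning_effort not in VALID_EFFORTS:
            raise ValueError(f"reasoning_effort must be one of {sorted(VALID_EFFORTS)}")
        body["reasoning_effort"] = reasoning_effort
    return body
```

Leaving both arguments unset produces a body with neither field, so the model's default applies.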
Reading reasoning_content
A non-streaming response with thinking enabled includes both fields on the assistant message:
{
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "reasoning_content": "Let me work through this. The user is asking about...",
      "content": "The answer is 42."
    },
    "finish_reason": "stop"
  }]
}
content is the visible answer you’d typically show to the user. reasoning_content is the model’s chain of thought, useful for debugging, evaluation, or building “show your work” UI.
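A minimal sketch of reading both fields from an already-parsed response follows. The function name split_message is hypothetical, and treating a missing reasoning_content as an empty string is an assumption of this sketch about responses where thinking is disabled.

```python
import json


def split_message(response: dict) -> tuple[str, str]:
    """Return (reasoning, answer) from a non-streaming chat completion response."""
    message = response["choices"][0]["message"]
    # Assumption: reasoning_content may be missing or null when thinking is off.
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer


# Sample body shaped like the response shown above.
sample = json.loads("""
{
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "reasoning_content": "Let me work through this.",
      "content": "The answer is 42."
    },
    "finish_reason": "stop"
  }]
}
""")

reasoning, answer = split_message(sample)
print(answer)
```

Keeping the two strings separate makes it straightforward to log the reasoning while showing only the answer.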
Mixlayer extracts reasoning from <think>...</think> tags in the model’s
raw output and routes it to reasoning_content automatically. You will
never see the tags in either field.
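To make the idea of tag extraction concrete, here is an illustrative sketch of splitting <think>...</think> spans out of raw model text. It is not Mixlayer's implementation; the function name split_think_tags and the regex-based approach are assumptions made for illustration only.

```python
import re

# Non-greedy match so multiple <think> blocks are captured separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think_tags(raw: str) -> tuple[str, str]:
    """Split raw model text into (reasoning, content), dropping the tags."""
    reasoning_parts = [part.strip() for part in THINK_RE.findall(raw)]
    reasoning = "\n".join(reasoning_parts)
    # Whatever remains after removing the tagged spans is the visible answer.
    content = THINK_RE.sub("", raw).strip()
    return reasoning, content
```

Since the gateway already does this routing, client code should not need a function like this when calling the API.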
Examples
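For Python callers, the curl request in this section might look roughly like the sketch below, which uses only the standard library. The function names make_request and send_request are hypothetical; the URL, model name, headers, and thinking field are taken from the curl example.

```python
import json
import urllib.request

API_URL = "https://models.mixlayer.ai/v1/chat/completions"


def make_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with thinking enabled."""
    body = {
        "model": "qwen/qwen3.5-27b",
        "thinking": True,
        "messages": [
            {
                "role": "user",
                "content": "If a train leaves at 3pm going 60mph and another "
                "leaves at 4pm going 80mph, when do they meet?",
            }
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_request(req: urllib.request.Request) -> dict:
    """Send the request and return the assistant message as a dict."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]


# Usage (performs a network call, so it is left commented out):
# import os
# message = send_request(make_request(os.environ["MIXLAYER_API_KEY"]))
# print(message.get("reasoning_content"))
# print(message["content"])
```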
curl https://models.mixlayer.ai/v1/chat/completions \
  -H "Authorization: Bearer $MIXLAYER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3.5-27b",
    "thinking": true,
    "messages": [
      {"role": "user", "content": "If a train leaves at 3pm going 60mph and another leaves at 4pm going 80mph, when do they meet?"}
    ]
  }'
Streaming reasoning
When stream: true, reasoning arrives in delta.reasoning_content chunks alongside delta.content chunks. They interleave in the order the model produces them — typically reasoning first, then visible content.
data: {"choices":[{"delta":{"role":"assistant"}}]}
data: {"choices":[{"delta":{"reasoning_content":"Let me think. "}}]}
data: {"choices":[{"delta":{"reasoning_content":"17 * 23 = 17 * 20 + 17 * 3 = 340 + 51."}}]}
data: {"choices":[{"delta":{"content":"17 * 23 = 391."},"finish_reason":"stop"}]}
To render reasoning and content in separate UI areas, route each delta based on which field is set:
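# `stream` is assumed to be the chunk iterator from an OpenAI-compatible client called with stream=True.
for chunk in stream:
    if not chunk.choices:  # skip chunks that carry no choices
        continue
    delta = chunk.choices[0].delta
    # reasoning_content is a non-standard field, so it is read from model_extra.
    extra = delta.model_extra or {}
    if extra.get("reasoning_content"):
        update_reasoning_pane(extra["reasoning_content"])
    if delta.content:
        update_answer_pane(delta.content)
Constraints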
Thinking mode is incompatible with response_format: json_schema. The
gateway returns an error if both are set on the same request. If you need
structured output from a reasoning model, use response_format: json_object
with explicit instructions in the prompt instead.
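A client-side guard for this constraint could look like the sketch below. Both function names are hypothetical, and the {"type": ...} shape of response_format follows the OpenAI convention, which is assumed here rather than confirmed by the text above. The check also assumes that setting reasoning_effort counts as enabling thinking, and it cannot detect models that think by default.

```python
import copy


def check_thinking_constraints(body: dict) -> None:
    """Raise ValueError if thinking is requested together with json_schema output."""
    # Assumption: either field being set means thinking was requested.
    thinking_on = body.get("thinking") is True or body.get("reasoning_effort") is not None
    fmt = body.get("response_format") or {}
    if thinking_on and fmt.get("type") == "json_schema":
        raise ValueError("thinking cannot be combined with response_format json_schema")


def use_json_object(body: dict, instructions: str) -> dict:
    """Return a copy of body that requests json_object output via prompt instructions."""
    patched = copy.deepcopy(body)  # leave the caller's request untouched
    patched["response_format"] = {"type": "json_object"}
    # Describe the desired JSON shape in a system message, since no schema is enforced.
    system_message = {"role": "system", "content": instructions}
    patched["messages"] = [system_message] + patched.get("messages", [])
    return patched
```

Because json_object does not enforce a schema, the instructions string should describe the expected keys explicitly, and the caller should still validate the parsed result.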
Thinking is supported on the Qwen 3.5 family. See Models for the up-to-date list of supported models and their recommended sampling settings for thinking vs. non-thinking modes.