feat: add thinking mode toggle to suppress reasoning-only response loops

Adds `llm.thinking` config option (default: true) that, when disabled:
- Injects /no_think into the last user message for Qwen 3.x compatibility
- Sends chat_template_kwargs in API payload for backends that support it
- Silently and immediately nudges on reasoning-only responses instead of
  showing warnings and wasting retry iterations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
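
The `/no_think` injection is not part of the hunk shown on this page, so the following is only a minimal sketch of how that step might look. The helper name `inject_no_think`, the constant `NO_THINK_TAG`, and the choice to append the tag at the end of the message are all assumptions for illustration, not the commit's actual code.

```python
from copy import deepcopy

# Qwen 3 soft switch; appending it at the end of the text is an assumption.
NO_THINK_TAG = "/no_think"


def inject_no_think(messages, thinking):
    """Hypothetical helper: tag the last user message when thinking is off.

    Returns the original list untouched when thinking is enabled, otherwise
    a deep copy with the tag appended to the most recent user message.
    """
    if thinking:
        return messages

    patched = deepcopy(messages)  # avoid mutating the caller's history
    for message in reversed(patched):
        if message.get("role") != "user":
            continue
        content = message.get("content")
        # Only plain-string content is handled in this sketch.
        if isinstance(content, str) and NO_THINK_TAG not in content:
            message["content"] = f"{content} {NO_THINK_TAG}"
        break  # stop at the last user message either way
    return patched
```

Working on a copy keeps the stored conversation free of the tag, so re-enabling `llm.thinking` later would not leave stale `/no_think` markers in earlier turns.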
@@ -151,7 +151,11 @@ class LLMClient:
         if tools:
             payload["tools"] = tools
 
-        # Merge model-specific extra parameters (e.g., enable_thinking, reasoning_effort)
+        # When thinking is disabled, inject chat_template_kwargs for backends that support it
+        if not self._config.thinking:
+            payload.setdefault("chat_template_kwargs", {})["enable_thinking"] = False
+
+        # Merge model-specific extra parameters (e.g., reasoning_effort)
         if self._config.extra_body:
             payload.update(self._config.extra_body)
 
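
The ordering in this hunk matters: `extra_body` is merged after the thinking flag is set, and `dict.update` replaces top-level keys wholesale. The standalone sketch below reproduces that order outside the class to show the effect. The function `build_payload` and its parameters are hypothetical stand-ins for the method body, not code from the repository.

```python
from copy import deepcopy


def build_payload(base, thinking, extra_body=None):
    """Hypothetical stand-in mirroring the merge order in the hunk above."""
    payload = deepcopy(base)

    # Same statement as in the diff: create the kwargs dict if needed,
    # then switch thinking off.
    if not thinking:
        payload.setdefault("chat_template_kwargs", {})["enable_thinking"] = False

    # Same merge as in the diff: top-level keys from extra_body win.
    if extra_body:
        payload.update(extra_body)
    return payload


plain = build_payload({"model": "m"}, thinking=False)
overridden = build_payload(
    {"model": "m"},
    thinking=False,
    extra_body={"chat_template_kwargs": {"custom_flag": True}},
)
```

Here `plain["chat_template_kwargs"]` is `{"enable_thinking": False}`, while `overridden["chat_template_kwargs"]` is `{"custom_flag": True}`: a user-supplied `chat_template_kwargs` in `extra_body` replaces the injected dict rather than merging with it. Whether that precedence is intended is not stated in the commit message.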