V3.1不是混合模式嘛,如何控制thinking和 no thinking模式?
#7
by
jakyer
- opened
V3.1不是混合模式嘛,如何控制thinking和 no thinking模式?
如果是用vllms,glang或者llama.cpp推理的话可以使用下面的方法
https://docs.sglang.ai/basic_usage/openai_api_completions.html#Model-Thinking/Reasoning-Support
https://docs.vllm.ai/en/latest/features/reasoning_outputs.html#supported-models
https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md#:~:text=chat_template_kwargs%3A%20Allows%20sending%20additional%20parameters%20to%20the%20json%20templating%20system
no thinking模式
"chat_template_kwargs": {"thinking": false}
thinking模式
"chat_template_kwargs": {"thinking": true}