Only a closing </think> tag, but no opening <think> tag.
I used vLLM to deploy the Qwen/Qwen3-4B-Thinking-2507 model and the LangChain framework to develop an agent with a workflow. However, whether I use a simple chat flow or the workflow, the returned responses always contain the closing </think> tag but never the opening <think> tag.
Why is that?
If the response message is too long and exceeds the maximum context length, the beginning of the message may be truncated before it is output.
Or maybe this:
NOTE: This model supports only thinking mode. Meanwhile, specifying enable_thinking=True is no longer required.
Additionally, to enforce model thinking, the default chat template automatically includes <think>. Therefore, it is normal for the model's output to contain only </think> without an explicit opening <think> tag.
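You can check this by rendering the prompt yourself. A minimal sketch, assuming the Hugging Face tokenizer ships the same default chat template that vLLM picks up:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Thinking-2507")
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "hello"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
# The rendered prompt ends with the assistant header followed by an opening
# <think> tag, so the completion starts inside the thinking block and the
# model only needs to emit the closing </think> tag.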
How can I prevent this part from printing? I need only the output.
I created a simple qwen_chat_template.jinja file and deployed the LLM with the --chat-template /model/qwen_chat_template.jinja parameter. That solved the issue: the responses now contain both the opening and closing <think> tags.
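For reference, the serve command looked roughly like this (paths adjusted to my setup):

vllm serve Qwen/Qwen3-4B-Thinking-2507 --chat-template /model/qwen_chat_template.jinja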
How can I prevent this part from printing? I need only the output.
I think you can write code to filter the thinking part.
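Something like this minimal sketch should work (the function name is just illustrative):

def strip_thinking(text: str) -> str:
    # Keep only what comes after the last closing think tag;
    # works whether or not the opening <think> tag is present.
    if "</think>" in text:
        return text.rsplit("</think>", 1)[-1].strip()
    return text.strip()

print(strip_thinking("some reasoning</think>\nThe final answer."))  # -> "The final answer."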
@zhangziji1021 Can you please share the chat template that solved this problem?
This is the simple template I use.
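{# Unlike the default Qwen3-Thinking template, this one does not pre-fill an
   opening <think> tag after the assistant header, so the model generates both
   <think> and </think> itself. #}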
{% for message in messages %}
    {{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}
{% endfor %}
{% if add_generation_prompt %}
    {{ '<|im_start|>assistant\n' }}
{% endif %}
Also, I found that sometimes the response has no think tags at all. When I added a strict rule to the system prompt like "wrap your thinking content in <think></think> tags", the problem seemed to be solved. I need to run more test rounds on it.
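A minimal sketch of what I mean by that system prompt rule (the exact wording is my own):

messages = [
    {"role": "system", "content": "Always wrap your thinking content in <think></think> tags before giving the final answer."},
    {"role": "user", "content": "hello"},
]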
Curious, why'd y'all decide to leave out the initial <think> tag?
I get that it's an implicit token because this model is a thinking model, but for model providers, this is a problem because all other models are following the convention of including it.
Which means providers have to one-off patch it.
It also breaks LM Studio's and Cherry Studio's thinking functionality because the opening tag isn't present.
Any way you guys could revert this change and have it return that first token?
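In case it helps anyone else in the meantime, the kind of one-off patch I mean is just normalizing the output before handing it to the UI; a minimal sketch:

def normalize_thinking_tags(text: str) -> str:
    # If the model omitted the opening tag (because the chat template
    # pre-filled it), add it back so thinking-aware UIs can parse the block.
    if "</think>" in text and "<think>" not in text:
        return "<think>" + text
    return text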