This Instruct model responds like a Thinking model - the prompt template even includes <think>

#1
by CED6688 - opened

I tried replacing just the prompt template with the one from the base model's tokenizer_config and it made no difference, so I suspect this is actually just a second FP8 quantization of the Thinking model. I didn't compare the weights against that repo, but it's definitely not the Instruct.

Even though replies start with reasoning content followed by the final text, enabling a reasoning parser in vLLM (--reasoning-parser qwen3) still returns everything as delta->content rather than as reasoning content, so this quant doesn't even work as a reasoning model. I suspect it's just broken and needs to be redone properly.
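For reference, this is how the parser mentioned above was enabled; a minimal sketch, assuming this repo's id is Qwen/Qwen3-VL-32B-Instruct-FP8 (the flag and parser name are standard vLLM options):

```shell
# Serve the model with vLLM's Qwen3 reasoning parser enabled.
# With a correctly working Thinking model, streamed deltas should then
# carry the chain-of-thought as reasoning content, separate from content.
vllm serve Qwen/Qwen3-VL-32B-Instruct-FP8 --reasoning-parser qwen3
```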

The safetensors files have the same SHA256 as Qwen/Qwen3-VL-32B-Thinking-FP8, so it looks like an accidental upload
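The comparison above boils down to hashing each downloaded shard and checking it against the other repo's published values. A minimal sketch of a streaming SHA-256 helper (plain hashlib; the shard paths you feed it are placeholders for whatever files you actually downloaded):

```python
import hashlib


def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks so large
    safetensors shards never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

The Hub also shows each LFS file's SHA256 on its file page, so the hashes of the Thinking repo can be compared without downloading it.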

I also verified this.

Fixed.

littlebird13 changed discussion status to closed