Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf output garbled

#8
by CalvinZero - opened

On an Apple M2 with Metal, Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf produces garbled output.

llama-server --host :: --port 8080 -a a -hf unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF -hff Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.02 --presence-penalty 1.0 -c 32768 -np 3

Qwen3-30B-A3B-Instruct-2507-UD-IQ2_M.gguf has no such problem with the same settings:

llama-server --host :: --port 8080 -a a -hf unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF -hff Qwen3-30B-A3B-Instruct-2507-UD-IQ2_M.gguf --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.02 --presence-penalty 1.0 -c 32768 -np 3

You have to pass --jinja to llama-server.
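For reference, here is a sketch of what that suggestion could look like when applied to the Q2_K_XL command from the first post. All flags are copied from that command; --jinja is the only addition. The guard around the launch is just an illustrative convenience, not part of the original command. Whether this resolves the garbled output on Metal has not been confirmed in this thread.

```shell
# Flags copied from the Q2_K_XL command in the first post; --jinja is the only addition.
set -- --host :: --port 8080 -a a \
  -hf unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF \
  -hff Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf \
  --jinja \
  --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.02 --presence-penalty 1.0 \
  -c 32768 -np 3

if command -v llama-server >/dev/null 2>&1; then
  llama-server "$@"
else
  # llama-server is not on PATH: print the command instead of running it
  echo "llama-server $*"
fi
```

With --jinja, llama-server uses the Jinja chat template for formatting the conversation, so it is worth retesting the Q2_K_XL file with the flag set before concluding the quant itself is broken.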
