Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf output garbled

by CalvinZero - opened Sep 13

Sep 13

On an Apple M2 with Metal, Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf produces garbled output.

llama-server --host :: --port 8080 -a a -hf unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF -hff Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.02 --presence-penalty 1.0 -c 32768 -np 3

Qwen3-30B-A3B-Instruct-2507-UD-IQ2_M.gguf has no such problem.

llama-server --host :: --port 8080 -a a -hf unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF -hff Qwen3-30B-A3B-Instruct-2507-UD-IQ2_M.gguf --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.02 --presence-penalty 1.0 -c 32768 -np 3

koifish12

18 days ago

You have to pass --jinja to llama-server.
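For reference, this is roughly what that suggestion looks like applied to the original Q2_K_XL command. It is only a sketch: every flag except --jinja is copied unchanged from the first post, and the thread does not confirm whether this resolves the garbled output.

```shell
# Original Q2_K_XL command from the first post, with --jinja appended.
# Assumption: no other flag needs to change; only --jinja is new.
llama-server --host :: --port 8080 -a a \
  -hf unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF \
  -hff Qwen3-30B-A3B-Instruct-2507-UD-Q2_K_XL.gguf \
  --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.02 --presence-penalty 1.0 \
  -c 32768 -np 3 \
  --jinja
```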
