Granite Docling not working using vllm

#20
by SadiaSid - opened

Hi, I am trying to run Granite Docling on vLLM, but I keep getting this error:
"AttributeError: 'LlamaModel' object has no attribute 'wte'"

Question:
Is this a known issue with vLLM and this model? Is there a fix or workaround?

Thanks.

IBM Granite org

Hey @SadiaSid , this is a known issue with vLLM and tied word embeddings. We uploaded an untied version under the "untied" branch of this repo, which works right away with vLLM, and we will update the README with this note. I will close the issue, but feel free to reopen it if something else comes up!
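For anyone who prefers vLLM's offline Python API over the server, a minimal sketch of pointing it at the untied branch might look like the following. The image URL and prompt text are placeholders, not the model's official prompt.

```python
# Minimal offline-inference sketch with vLLM's Python API, assuming the
# "untied" revision of this repo. Image URL and prompt are illustrative only.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-docling-258M", revision="untied")

messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/page.png"}},
        {"type": "text", "text": "Convert this page to docling."},
    ],
}]

# llm.chat() applies the model's chat template and passes the image through.
outputs = llm.chat(messages, SamplingParams(temperature=0.0, max_tokens=1024))
print(outputs[0].outputs[0].text)
```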

asnassar changed discussion status to closed
IBM Granite org

For simplicity, you can serve it like this:

vllm serve ibm-granite/granite-docling-258M --revision untied
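Once the server is up, you can query the OpenAI-compatible endpoint it exposes (http://localhost:8000/v1 by default). A rough client-side sketch, with a placeholder image URL and prompt:

```python
# Client-side sketch against the OpenAI-compatible endpoint started by
# `vllm serve`. Assumes the default port 8000; prompt text is illustrative.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ibm-granite/granite-docling-258M",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/page.png"}},
            {"type": "text", "text": "Convert this page to docling."},
        ],
    }],
    temperature=0.0,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```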

Does this impact performance?

I got a very bad result with vLLM serving; it doesn't even allow me to convert with the docling package.
