Granite Docling not working using vllm

#20
by SadiaSid - opened

Hi, I am trying to run Granite Docling on vLLM, but I keep getting this error:
"AttributeError: 'LlamaModel' object has no attribute 'wte'"

Question:
Is this a known issue with vLLM and this model? Is there a fix or workaround?

Thanks.

IBM Granite org

Hey @SadiaSid , this is a known issue with vLLM and tied word embeddings. We uploaded an untied version under the "untied" branch of this repo, which works right away with vLLM, and we will update the README with this note. I will close the issue, but feel free to reopen it if something else comes up!
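For anyone who prefers vLLM's offline Python API over the server, a minimal sketch of pointing it at the untied branch might look like the following. The image URL and prompt text are placeholders, not the model's official prompt.

```python
# Minimal offline-inference sketch with vLLM's Python API, assuming the
# "untied" revision of this repo. Image URL and prompt are illustrative only.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-docling-258M", revision="untied")

messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/page.png"}},
        {"type": "text", "text": "Convert this page to docling."},
    ],
}]

# llm.chat() applies the model's chat template and passes the image through.
outputs = llm.chat(messages, SamplingParams(temperature=0.0, max_tokens=1024))
print(outputs[0].outputs[0].text)
```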

asnassar changed discussion status to closed
IBM Granite org

For simplicity, you can serve it like this:

vllm serve ibm-granite/granite-docling-258M --revision untied
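Once the server is up, you can query the OpenAI-compatible endpoint it exposes (http://localhost:8000/v1 by default). A rough client-side sketch, with a placeholder image URL and prompt:

```python
# Client-side sketch against the OpenAI-compatible endpoint started by
# `vllm serve`. Assumes the default port 8000; prompt text is illustrative.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ibm-granite/granite-docling-258M",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/page.png"}},
            {"type": "text", "text": "Convert this page to docling."},
        ],
    }],
    temperature=0.0,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```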

Does this impact performance?

I got a very bad result with vLLM serving; it doesn't even allow me to convert with the docling package.
