FP16 Weights
Thanks for this wonderful model! Would it be possible for someone to please upload an FP16 version of the weights? The hardware I am using cannot load the FP8 version, and I have not managed to load the weights and convert them to FP16 myself.
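For anyone who wants to try the offline conversion, here is a minimal sketch. It assumes the checkpoint follows the compressed-tensors convention, where each quantized `<name>.weight` tensor is stored as `float8_e4m3fn` with a matching `<name>.weight_scale`; scale layouts vary by quantization recipe (per-tensor, per-channel, block-wise), so the broadcast below may need adjusting for this particular model.

```python
# Sketch: dequantize one FP8 safetensors shard to BF16 (FP16 works the same way).
# Assumption: compressed-tensors layout with "<name>.weight" (float8_e4m3fn)
# and a matching "<name>.weight_scale"; adapt if the model uses another scheme.
import torch
from safetensors.torch import load_file, save_file

def dequantize_shard(in_path: str, out_path: str) -> None:
    tensors = load_file(in_path)
    out = {}
    for name, t in tensors.items():
        if name.endswith("_scale"):
            continue  # consumed together with its weight below
        if t.dtype == torch.float8_e4m3fn:
            w = t.to(torch.float32)
            scale = tensors.get(f"{name}_scale")
            if scale is not None:
                if scale.ndim == 1:  # per-channel scales sometimes stored as 1-D
                    scale = scale[:, None]
                w = w * scale.to(torch.float32)
            out[name] = w.to(torch.bfloat16)
        else:
            out[name] = t
    save_file(out, out_path)
```

Run this over every shard, then remove the `quantization_config` block from `config.json` so loaders treat the result as a plain BF16 checkpoint.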
The vLLM Docker image somehow loads this on Ampere, even though FP8 isn't natively supported there; it handles the conversion automatically. If I remember correctly, it keeps the weights in FP8 but does the math in FP16.
What config options did you use for that? I can't get it to load.
I don't have the CLI command any more, since I stopped using this model after testing, but I suspect the most useful part was using the nightly image.
Ah yes, that was the issue. vLLM will load any FP8 checkpoint and dynamically convert it to BF16 on load; I just had an out-of-date version that was missing some Devstral support. The latest works. Thanks!
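For reference, a minimal sketch of the working setup with a recent vLLM; the model id is a placeholder, and `dtype="bfloat16"` just makes the upconversion target explicit.

```python
# Sketch: loading the FP8 checkpoint with a recent vLLM on Ampere.
# On GPUs without native FP8 support, vLLM falls back to converting the
# weights on load, as described above. "<fp8-model-id>" is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="<fp8-model-id>", dtype="bfloat16")
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

The same should hold for the OpenAI-compatible server (`vllm serve <fp8-model-id> --dtype bfloat16`) in a recent `vllm/vllm-openai` Docker image.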