why the int8 and fp16 model size both are 31GB?
#1 opened by snomile
The original model is 61 GB in BF16, so I am confused...
Unable to make it work with either file; both show the following error:
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen3-omni'
llama_model_load_from_file_impl: failed to load model
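If it helps, here is a minimal sketch (not from this repo) that reads the architecture string straight out of a GGUF header. It assumes `general.architecture` is the first metadata key, which files written by llama.cpp's converters normally have; the file name is a placeholder. If that string isn't one your llama.cpp build knows, you get exactly the error above until support lands.

```python
# Minimal sketch: read the architecture string from a GGUF header.
# Assumes 'general.architecture' is the first metadata key (typical for
# files written by llama.cpp's converters); the file name is a placeholder.
import struct

def gguf_architecture(path: str) -> str:
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, = struct.unpack("<I", f.read(4))
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
        key_len, = struct.unpack("<Q", f.read(8))
        key = f.read(key_len).decode("utf-8")
        value_type, = struct.unpack("<I", f.read(4))
        if key != "general.architecture" or value_type != 8:  # 8 = GGUF string type
            raise ValueError(f"unexpected first metadata key: {key}")
        val_len, = struct.unpack("<Q", f.read(8))
        return f.read(val_len).decode("utf-8")

print(gguf_architecture("model.gguf"))  # e.g. 'qwen3-omni'
```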
F16 uses 2 bytes per weight and Q8 uses 1 byte per weight, so the F16 file should be roughly twice the size of the Q8 one.
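To put rough numbers on that (a back-of-the-envelope sketch only, assuming about 30.5B weights from the 61 GB BF16 original, and ignoring metadata and the small per-block scales Q8_0 adds):

```python
# Back-of-the-envelope size check: 61 GB BF16 original at 2 bytes per weight
# implies ~30.5B weights; ignores metadata and Q8_0's per-block scale overhead.
params = 61e9 / 2               # bytes in the BF16 file / 2 bytes per weight
size_f16 = params * 2 / 1e9     # F16: 2 bytes per weight -> ~61 GB
size_q8 = params * 1 / 1e9      # Q8:  1 byte per weight  -> ~31 GB
print(f"F16 ~{size_f16:.1f} GB, Q8 ~{size_q8:.1f} GB")
```

So a real F16 export should come out near the original 61 GB, and only the Q8 export should be around 31 GB.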
That's because there is no actual F16 file here; the two uploaded files are completely identical.
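Anyone who wants to double-check can hash both downloads; the file names below are placeholders for whatever the repo files are called locally:

```python
# Quick check that the two uploads are byte-identical.
import hashlib

def sha256(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256("model-int8.gguf") == sha256("model-fp16.gguf"))  # True => same file twice
```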
Yes, I provided the wrong file, I'm very sorry
Are you uploading the correct quantized one soon? Thanks.