Quantized version?
#2, opened by muratowski
It would be great if we had a quantized version of it, like GGUF or even FP8.
x2
AWQ would be awesome!
Can this be done? Or compressed-tensors in W4A16_ASYM?