turboderp/Phi-4-mini-instruct-exl3

turboderp committed 7 days ago

Commit d80df0a · verified · 1 parent: f7887d4

Update README.md

Files changed (1)

README.md +9 -1

README.md CHANGED

@@ -1,3 +1,11 @@
+---
+license: mit
+base_model: microsoft/Phi-4-mini-instruct
+base_model_relation: quantized
+quantized_by: turboderp
+tags:
+- exl3
+---
 EXL3 quants of [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct)
 At the moment these are all converted with 8-bpw output layers. Currently investigating why there's a small-but-noticeable drop in accuracy at 6-bpw. Likely it has to do with the tied embeddings.
@@ -10,4 +18,4 @@ At the moment these are all converted with 8-bpw output layers. Currently invest
 [4.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/4.0bpw)
 [5.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/5.0bpw)
 [6.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/6.0bpw)
-[8.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/8.0bpw)
+[8.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/8.0bpw)
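
The links in this diff each point at a branch of the repo named after its bitrate (`4.0bpw`, `5.0bpw`, and so on). As a minimal sketch of that naming pattern, the helper below maps a bits-per-weight value to its branch name and tree URL. The function names `branch_for` and `tree_url` are hypothetical, and `AVAILABLE_BPW` lists only the branches visible in this diff; the repo may contain others.

```python
# Hypothetical helpers illustrating the branch naming seen in the README links.
REPO_ID = "turboderp/Phi-4-mini-instruct-exl3"

# Assumption: only the bitrates visible in this diff; other branches may exist.
AVAILABLE_BPW = (4.0, 5.0, 6.0, 8.0)


def branch_for(bpw: float) -> str:
    """Return the branch name for a bitrate, e.g. 4.0 -> '4.0bpw'."""
    if bpw not in AVAILABLE_BPW:
        raise ValueError(f"no branch for {bpw} bpw is listed in this diff")
    return f"{bpw:.1f}bpw"


def tree_url(bpw: float) -> str:
    """Return the Hugging Face tree URL for the branch of a given bitrate."""
    return f"https://huggingface.co/{REPO_ID}/tree/{branch_for(bpw)}"


if __name__ == "__main__":
    for bpw in AVAILABLE_BPW:
        print(branch_for(bpw), tree_url(bpw))
```

If the `huggingface_hub` package is installed, the branch name can presumably be passed as the `revision` argument of `snapshot_download(repo_id=REPO_ID, revision=branch_for(4.0))` to fetch one quant rather than the whole repo.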