turboderp committed
Commit d80df0a · verified · 1 Parent(s): f7887d4

Update README.md

Files changed (1): README.md +9 -1
README.md CHANGED
@@ -1,3 +1,11 @@
+---
+license: mit
+base_model: microsoft/Phi-4-mini-instruct
+base_model_relation: quantized
+quantized_by: turboderp
+tags:
+- exl3
+---
 EXL3 quants of [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct)
 
 At the moment these are all converted with 8-bpw output layers. Currently investigating why there's a small-but-noticeable drop in accuracy at 6-bpw. Likely it has to do with the tied embeddings.
@@ -10,4 +18,4 @@ At the moment these are all converted with 8-bpw output layers. Currently invest
 [4.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/4.0bpw)
 [5.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/5.0bpw)
 [6.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/6.0bpw)
-[8.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/8.0bpw)
+[8.00 bits per weight / H8](https://huggingface.co/turboderp/Phi-4-mini-instruct-exl3/tree/8.0bpw)
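Each quant level listed in the README sits on its own branch of the repo (the `/tree/4.0bpw` links above), so a specific bits-per-weight build can be fetched by passing the branch name as the revision. A minimal sketch, assuming the `huggingface_hub` Python package (not part of this commit):

```python
# Minimal sketch: download one bpw variant of the EXL3 repo by branch name.
# Assumes the huggingface_hub package; the branch names ("4.0bpw", "5.0bpw",
# "6.0bpw", "8.0bpw") come from the links in the README above.
from huggingface_hub import snapshot_download

path = snapshot_download(
    repo_id="turboderp/Phi-4-mini-instruct-exl3",
    revision="6.0bpw",  # one branch per bits-per-weight build
)
print(path)  # local directory holding the EXL3 tensors for that branch
```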