Update README.md
README.md CHANGED

@@ -5,7 +5,7 @@ license: llama3.1
 - ## Introduction
   This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from Pile dataset.
 - ## Quantization Stragegy
-  - ***Quantized Layers
+  - ***Quantized Layers***: All linear layers excluding "lm_head"
   - ***Weight***: FP8 symmetric per-tensor
   - ***Activation***: FP8 symmetric per-tensor
   - ***KV Cache***: FP8 symmetric  per-tensor
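
For readers unfamiliar with the terms in the strategy list, here is a minimal pure-Python sketch of what "FP8 symmetric per-tensor" can mean: one scale per tensor, no zero point, values rounded onto an 8-bit float grid. It is not Quark's implementation. The README only says "FP8", so the E4M3 variant (largest finite value 448), the absmax rule for choosing the scale, and the helper names `round_to_e4m3`, `quantize_per_tensor`, `dequantize`, and `should_quantize` are all assumptions made for illustration, as are the layer names in the usage note.

```python
import math

# Assumption: "FP8" is taken to be the E4M3 variant; the README does not say which.
FP8_E4M3_MAX = 448.0   # largest finite E4M3 value
MIN_NORMAL_EXP = -6    # smallest normal exponent in E4M3
MANTISSA_BITS = 3


def round_to_e4m3(v):
    """Round a float to the nearest E4M3 grid value, saturating at +/-448."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    a = min(abs(v), FP8_E4M3_MAX)
    _, exp = math.frexp(a)                 # a = m * 2**exp with 0.5 <= m < 1
    e = max(exp - 1, MIN_NORMAL_EXP)       # below this exponent, spacing is fixed (subnormals)
    step = 2.0 ** (e - MANTISSA_BITS)      # distance between neighbouring grid values
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)


def quantize_per_tensor(values):
    """Symmetric per-tensor quantization: a single scale, no zero point.

    Assumption: the scale comes from the absolute maximum of the tensor.
    """
    amax = max(abs(v) for v in values)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(v / scale) for v in values], scale


def dequantize(quantized, scale):
    """Map quantized values back to the original range."""
    return [q * scale for q in quantized]


def should_quantize(layer_name):
    """Hypothetical filter mirroring 'all linear layers excluding lm_head'."""
    return "lm_head" not in layer_name.split(".")
```

A short usage example with made-up weights and layer names:

```python
weights = [0.5, -1.0, 0.25, 2.0]
q, scale = quantize_per_tensor(weights)
restored = dequantize(q, scale)

for name in ["model.layers.0.self_attn.q_proj", "lm_head"]:
    print(name, "quantize" if should_quantize(name) else "keep original precision")
```

Because the scale is shared by the whole tensor, a single outlier stretches the grid for every other value. This is one reason a calibration set is used to observe realistic activation ranges.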