Update README.md
README.md CHANGED
@@ -158,11 +158,7 @@ print(tokenizer.decode(outputs[0]))
 
 ## Direct Use and Downstream Use
 
-
-
-> The primary use is research on language models, including: research on zero-shot NLP tasks and in-context few-shot learning NLP tasks, such as reasoning, and question answering; advancing fairness and safety research, and understanding limitations of current large language models
-
-See the [research paper](https://arxiv.org/pdf/2210.11416.pdf) for further details.
+See the [research paper](https://arxiv.org/pdf/2101.03961.pdf) for further details.
 
 ## Out-of-Scope Use
 
@@ -193,11 +189,7 @@ The model was trained on a Masked Language Modeling task, on Colossal Clean Craw
 
 ## Training Procedure
 
-According to the model card from the [original paper](https://arxiv.org/pdf/
-
-> These models are based on pretrained SwitchTransformers and are not fine-tuned. It is normal if they perform well on zero-shot tasks.
-
-The model has been trained on TPU v3 or TPU v4 pods, using [`t5x`](https://github.com/google-research/t5x) codebase together with [`jax`](https://github.com/google/jax).
+According to the model card from the [original paper](https://arxiv.org/pdf/2101.03961.pdf) the model has been trained on TPU v3 or TPU v4 pods, using [`t5x`](https://github.com/google-research/t5x) codebase together with [`jax`](https://github.com/google/jax).
 
 
 # Evaluation
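For context, the `print(tokenizer.decode(outputs[0]))` line in the first hunk header comes from the card's own usage snippet just above this section. A minimal sketch of the zero-shot masked-language-modeling use the card describes might look like the following; it assumes the `transformers` library and the `google/switch-base-8` checkpoint, since the diff itself does not name the exact model repository:

```python
# Sketch of zero-shot masked-LM inference with a SwitchTransformers model.
# Assumptions: transformers >= 4.25 and the google/switch-base-8 checkpoint;
# the diff does not name the exact repo, so substitute your own as needed.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/switch-base-8")
model = SwitchTransformersForConditionalGeneration.from_pretrained("google/switch-base-8")

# T5-style sentinel tokens (<extra_id_N>) mark the spans the model should fill in.
input_text = "A <extra_id_0> walks into a bar and orders a <extra_id_1> with a pinch of salt."
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```

This matches the removed quote's point that the checkpoints are pretrained but not fine-tuned: they are used directly on span-corruption-style inputs rather than on instruction-formatted prompts.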