Update config.json for flan-t5-small
I believe the num_heads and num_layers values are swapped for google/flan-t5-small. See the comparison with t5-small (link below), on which flan-t5-small is based. With the current values, the hidden size of the model isn't divisible by the number of attention heads (512 % 6 = 2).
https://huggingface.co/t5-small/blob/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/config.json#L16
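As a quick illustration of the divisibility problem: multi-head attention splits the hidden size evenly across heads, so d_model must be a multiple of num_heads. A minimal sketch (plain Python, no transformers dependency; the dicts below just mirror the relevant config fields):

```python
# d_model (hidden size) of flan-t5-small, per its config.json
d_model = 512

swapped = {"num_heads": 6, "num_layers": 8}    # current (swapped) values
corrected = {"num_heads": 8, "num_layers": 6}  # proposed fix

# Each head would need d_model / num_heads dimensions, so the
# remainder must be zero for the attention shapes to work out.
print(d_model % swapped["num_heads"])    # 2 -> invalid head count
print(d_model % corrected["num_heads"])  # 0 -> valid head count
```

With the swapped values, frameworks that assert `d_model % num_heads == 0` (as the T5 attention implementation effectively does when reshaping to per-head dimensions) cannot construct the model correctly.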
- config.json +2 -2
config.json (CHANGED)

@@ -15,8 +15,8 @@
   "model_type": "t5",
   "n_positions": 512,
   "num_decoder_layers": 8,
-  "num_heads": 6,
-  "num_layers": 8,
+  "num_heads": 8,
+  "num_layers": 6,
   "output_past": true,
   "pad_token_id": 0,
   "relative_attention_max_distance": 128,