Update config.json for flan-t5-small
I believe the num_heads and num_layers values are swapped for google/flan-t5-small. See the comparison with t5-small (link below), on which flan-t5-small is based. With the current values, the hidden size of the model isn't divisible by the number of attention heads (512 % 6 = 2).
https://huggingface.co/t5-small/blob/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/config.json#L16
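As a quick illustration of the divisibility problem: multi-head attention splits the hidden size evenly across heads, so d_model must be a multiple of num_heads. A minimal sketch (plain Python, no transformers dependency; the dicts below just mirror the relevant config fields):

```python
# d_model (hidden size) of flan-t5-small, per its config.json
d_model = 512

swapped = {"num_heads": 6, "num_layers": 8}    # current (swapped) values
corrected = {"num_heads": 8, "num_layers": 6}  # proposed fix

# Each head would need d_model / num_heads dimensions, so the
# remainder must be zero for the attention shapes to work out.
print(d_model % swapped["num_heads"])    # 2 -> invalid head count
print(d_model % corrected["num_heads"])  # 0 -> valid head count
```

With the swapped values, frameworks that assert `d_model % num_heads == 0` (as the T5 attention implementation effectively does when reshaping to per-head dimensions) cannot construct the model correctly.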
- config.json +2 -2
config.json (CHANGED)

@@ -15,8 +15,8 @@
   "model_type": "t5",
   "n_positions": 512,
   "num_decoder_layers": 8,
-  "num_heads": 6,
-  "num_layers": 8,
+  "num_heads": 8,
+  "num_layers": 6,
   "output_past": true,
   "pad_token_id": 0,
   "relative_attention_max_distance": 128,