head_dim * num_attention_heads != hidden_size

#4
by zhangchuanhu - opened

[Attached image: image.png]
