Update README.md
README.md CHANGED
@@ -11,8 +11,8 @@ tags:
 
 This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every 5th layer,
 a new layer is added, with the `o_proj` and `down_proj` parameters of these added layers initialized to zero, mirroring the approach used in LLaMA Pro.
-
-while all other layers remain frozen.
+
+### It's important to note that this configuration has not undergone fine-tuning, so it will not work as-is. When fine-tuning, ensure that only every 5th (newly added) layer is trainable, while all other layers remain frozen.
 
 ## 🧩 Configuration
 
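Zeroing `o_proj` and `down_proj` makes each inserted block an identity mapping at initialization: both its attention output and its MLP output contribute nothing, so only the residual stream passes through, which is the LLaMA Pro trick. Below is a minimal, hypothetical sketch of the freezing step using Hugging Face Transformers. The repo id is a placeholder, and it assumes the merge interleaves one zero-initialized copy after every 5th original layer (indices 5, 11, 17, … in the expanded stack); adjust the index rule to match the actual configuration.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder repo id -- substitute the actual merged model.
model = AutoModelForCausalLM.from_pretrained(
    "your-username/Mistral-7B-Instruct-v0.2-expanded",
    torch_dtype=torch.bfloat16,
)

# Assumption: a zero-initialized copy follows every 5th original layer,
# so the inserted blocks sit at indices 5, 11, 17, ... of the new stack.
inserted = {i for i in range(model.config.num_hidden_layers) if i % 6 == 5}

# Freeze everything, then unfreeze only the inserted blocks
# (the LLaMA Pro recipe: train the expanded layers, keep the rest frozen).
for param in model.parameters():
    param.requires_grad = False
for idx, layer in enumerate(model.model.layers):
    if idx in inserted:
        for param in layer.parameters():
            param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / {total:,} parameters")
```

Under this scheme the embeddings, final norm, and LM head stay frozen as well; any standard trainer can then be run on the model as-is.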