Update README.md
README.md
CHANGED
@@ -214,7 +214,7 @@ This method proposes a novel method for generating datasets for DPO (Self-superv
 * If the data set cannot be found, it is internal company data and cannot be made public.
 
 ## dpo dataset info : datasets_encomp_151k
-Randomly selecting data from each category within the training dataset, we constructed a DPO (
+Randomly selecting data from each category within the training dataset, we constructed a DPO (Direct Preference Optimization) dataset using sentences with logits lower than the mean within the model-generated sentences.
 * I'm sorry I can't reveal it.
 
 ## Evaluation
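
The selection rule described in the updated line (keep model-generated sentences whose logits fall below the mean) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the field names `prompt`, `reference`, `generated`, and `score` are hypothetical, `score` is assumed to be one aggregate logit value per generated sentence, and pairing the low-scoring generation as `rejected` against a reference answer as `chosen` is a guess, since the README does not say how the pairs are formed.

```python
from statistics import mean


def build_dpo_pairs(samples):
    """Select below-mean generations and turn them into DPO-style pairs.

    `samples` is a list of dicts with hypothetical keys:
      prompt    - the input text
      reference - a reference answer (assumed to exist in the training data)
      generated - the sentence produced by the model
      score     - one aggregate logit value for `generated` (assumption)
    """
    if not samples:
        return []

    # Mean score across all model-generated sentences.
    threshold = mean(s["score"] for s in samples)

    pairs = []
    for s in samples:
        # Keep only generations scoring strictly below the mean.
        if s["score"] < threshold:
            pairs.append(
                {
                    "prompt": s["prompt"],
                    "chosen": s["reference"],    # assumption: reference is preferred
                    "rejected": s["generated"],  # assumption: low-score output is dispreferred
                }
            )
    return pairs
```

Calling `build_dpo_pairs` on a list of scored generations returns only the entries whose score is strictly below the mean, in a `prompt` / `chosen` / `rejected` layout that common DPO training setups accept.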