Update README.md
README.md CHANGED
@@ -2608,7 +2608,7 @@ model-index:
 
 # gte-base-en-v1.5
 
-We introduce `gte-v1.5` series, upgraded `gte` embeddings that support the context length of up to **8192
+We introduce the `gte-v1.5` series, upgraded `gte` embeddings that support a context length of up to **8192**, while further enhancing model performance.
 The models are built upon the `transformer++` encoder [backbone](https://huggingface.co/Alibaba-NLP/new-impl) (BERT + RoPE + GLU).
 
 The `gte-v1.5` series achieves state-of-the-art scores on the MTEB benchmark within the same model size category and provides competitive performance on the LoCo long-context retrieval tests (refer to [Evaluation](#evaluation)).
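For readers of this diff, the second hunk's context line (`print(cos_sim(embeddings[0], embeddings[1]))`) shows the card already carries a usage snippet. Below is a minimal sketch of that kind of usage via `sentence-transformers`; the example sentences are illustrative, the `Alibaba-NLP/gte-base-en-v1.5` repo id is assumed from the heading, and `trust_remote_code=True` is assumed to be needed because the `transformer++` backbone ships custom modeling code.

```python
# Minimal usage sketch (assumes sentence-transformers >= 2.3 and the
# Alibaba-NLP/gte-base-en-v1.5 checkpoint named in the heading above).
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# trust_remote_code loads the custom transformer++ backbone code.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)

# Illustrative inputs; inputs up to 8192 tokens are also supported.
sentences = [
    "what is the capital of China?",
    "how to implement quick sort in python?",
]
embeddings = model.encode(sentences)
print(cos_sim(embeddings[0], embeddings[1]))
```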
@@ -2689,8 +2689,8 @@ print(cos_sim(embeddings[0], embeddings[1]))
 ### Training Data
 
 - Masked language modeling (MLM): `c4-en`
-- Weak-supervised contrastive (WSC) pre-training: GTE pre-training data
-- Supervised contrastive fine-tuning: GTE fine-tuning data
+- Weak-supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+- Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
 
 ### Training Procedure
 
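The training-data hunk above names three stages (MLM, weak-supervised contrastive pre-training, supervised contrastive fine-tuning). As a rough illustration of what the two contrastive stages optimize, here is a generic in-batch InfoNCE loss over cosine similarities in PyTorch; the temperature and the in-batch-negatives scheme are placeholders, not the exact recipe from the GTE paper.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_emb: torch.Tensor,
                              doc_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """Generic InfoNCE: row i of query_emb is paired with row i of doc_emb;
    every other document in the batch serves as a negative."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                     # (batch, batch) cosine similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives sit on the diagonal
    return F.cross_entropy(logits, labels)
```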
@@ -2734,14 +2734,16 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=8192`
 
 
 
-## Citation
-
-
-
-
-
-
-
-**APA:**
-
-[More Information Needed]
+## Citation
+
+If you find our paper or models helpful, please consider citing them as follows:
+
+```
+@article{li2023towards,
+  title={Towards general text embeddings with multi-stage contrastive learning},
+  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
+  journal={arXiv preprint arXiv:2308.03281},
+  year={2023}
+}
+```
+
+
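The third hunk's context line names the evaluation setting (`mteb==1.2.0`, fp16 auto mixed precision, `max_length=8192`). Below is a sketch of how such a run could be reproduced with the `mteb` package; the task choice and output folder are illustrative, and the fp16 setup is left to the reader's hardware configuration.

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load the model and match the long-context setting from the evaluation note.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)
model.max_seq_length = 8192

# Task selection here is illustrative; the card reports full MTEB and LoCo results.
evaluation = MTEB(tasks=["STSBenchmark"])
evaluation.run(model, output_folder="results/gte-base-en-v1.5")
```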