Update README.md
README.md CHANGED
@@ -2608,7 +2608,7 @@ model-index:
 
 # gte-base-en-v1.5
 
-We introduce `gte-v1.5` series, upgraded `gte` embeddings that support the context length of up to **8192
+We introduce the `gte-v1.5` series, upgraded `gte` embeddings that support a context length of up to **8192**, while further enhancing model performance.
 The models are built upon the `transformer++` encoder [backbone](https://huggingface.co/Alibaba-NLP/new-impl) (BERT + RoPE + GLU).
 
 The `gte-v1.5` series achieves state-of-the-art scores on the MTEB benchmark within the same model size category and provides competitive performance on the LoCo long-context retrieval tests (refer to [Evaluation](#evaluation)).
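For readers of this diff, the second hunk's context line (`print(cos_sim(embeddings[0], embeddings[1]))`) shows the card already carries a usage snippet. Below is a minimal sketch of that kind of usage via `sentence-transformers`; the example sentences are illustrative, the `Alibaba-NLP/gte-base-en-v1.5` repo id is assumed from the heading, and `trust_remote_code=True` is assumed to be needed because the `transformer++` backbone ships custom modeling code.

```python
# Minimal usage sketch (assumes sentence-transformers >= 2.3 and the
# Alibaba-NLP/gte-base-en-v1.5 checkpoint named in the heading above).
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# trust_remote_code loads the custom transformer++ backbone code.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)

# Illustrative inputs; inputs up to 8192 tokens are also supported.
sentences = [
    "what is the capital of China?",
    "how to implement quick sort in python?",
]
embeddings = model.encode(sentences)
print(cos_sim(embeddings[0], embeddings[1]))
```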
@@ -2689,8 +2689,8 @@ print(cos_sim(embeddings[0], embeddings[1]))
 ### Training Data
 
 - Masked language modeling (MLM): `c4-en`
-- Weak-supervised contrastive (WSC) pre-training: GTE pre-training data
-- Supervised contrastive fine-tuning: GTE fine-tuning data
+- Weak-supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+- Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
 
 ### Training Procedure
 
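The training-data hunk above names three stages (MLM, weak-supervised contrastive pre-training, supervised contrastive fine-tuning). As a rough illustration of what the two contrastive stages optimize, here is a generic in-batch InfoNCE loss over cosine similarities in PyTorch; the temperature and the in-batch-negatives scheme are placeholders, not the exact recipe from the GTE paper.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_emb: torch.Tensor,
                              doc_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """Generic InfoNCE: row i of query_emb is paired with row i of doc_emb;
    every other document in the batch serves as a negative."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                     # (batch, batch) cosine similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives sit on the diagonal
    return F.cross_entropy(logits, labels)
```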
@@ -2734,14 +2734,16 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=8192`
 
 
 
-## Citation
-
-
-
-
-
-
-
-**APA:**
-
-[More Information Needed]
+## Citation
+
+If you find our paper or models helpful, please consider citing them as follows:
+
+```
+@article{li2023towards,
+  title={Towards general text embeddings with multi-stage contrastive learning},
+  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
+  journal={arXiv preprint arXiv:2308.03281},
+  year={2023}
+}
+```
+
+
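The third hunk's context line names the evaluation setting (`mteb==1.2.0`, fp16 auto mixed precision, `max_length=8192`). Below is a sketch of how such a run could be reproduced with the `mteb` package; the task choice and output folder are illustrative, and the fp16 setup is left to the reader's hardware configuration.

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load the model and match the long-context setting from the evaluation note.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)
model.max_seq_length = 8192

# Task selection here is illustrative; the card reports full MTEB and LoCo results.
evaluation = MTEB(tasks=["STSBenchmark"])
evaluation.run(model, output_folder="results/gte-base-en-v1.5")
```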