Update README.md
README.md CHANGED
@@ -1,5 +1,6 @@
---
-language:
+language:
+- ja
library_name: sentence-transformers
tags:
- sentence-transformers
@@ -18,46 +19,11 @@ metrics:
- spearman_max
widget: []
pipeline_tag: sentence-similarity
-model-index:
-- name: SentenceTransformer
-  results:
-  - task:
-      type: semantic-similarity
-      name: Semantic Similarity
-    dataset:
-      name: Unknown
-      type: unknown
-    metrics:
-    - type: pearson_cosine
-      value: 0.841929698952355
-      name: Pearson Cosine
-    - type: spearman_cosine
-      value: 0.7942182059969294
-      name: Spearman Cosine
-    - type: pearson_manhattan
-      value: 0.8295844701949633
-      name: Pearson Manhattan
-    - type: spearman_manhattan
-      value: 0.7967029159438351
-      name: Spearman Manhattan
-    - type: pearson_euclidean
-      value: 0.8302175995746677
-      name: Pearson Euclidean
-    - type: spearman_euclidean
-      value: 0.7974109108557925
-      name: Spearman Euclidean
-    - type: pearson_dot
-      value: 0.8266168802012493
-      name: Pearson Dot
-    - type: spearman_dot
-      value: 0.7757964222446627
-      name: Spearman Dot
-    - type: pearson_max
-      value: 0.841929698952355
-      name: Pearson Max
-    - type: spearman_max
-      value: 0.7974109108557925
-      name: Spearman Max
+datasets:
+- hpprc/emb
+- hpprc/mqa-ja
+- google-research-datasets/paws-x
+base_model: pkshatech/GLuCoSE-base-ja
---

# SentenceTransformer
@@ -76,12 +42,6 @@ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

-### Model Sources
-
-- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
-- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
-- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-
### Full Model Architecture

```
@@ -147,26 +107,6 @@ You can finetune this model on your own dataset.
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

-## Evaluation
-
-### Metrics
-
-#### Semantic Similarity
-
-* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
-
-| Metric              | Value      |
-|:--------------------|:-----------|
-| pearson_cosine      | 0.8419     |
-| **spearman_cosine** | **0.7942** |
-| pearson_manhattan   | 0.8296     |
-| spearman_manhattan  | 0.7967     |
-| pearson_euclidean   | 0.8302     |
-| spearman_euclidean  | 0.7974     |
-| pearson_dot         | 0.8266     |
-| spearman_dot        | 0.7758     |
-| pearson_max         | 0.8419     |
-| spearman_max        | 0.7974     |

<!--
## Bias, Risks and Limitations
@@ -182,12 +122,6 @@ You can finetune this model on your own dataset.

## Training Details

-### Training Logs
-| Epoch | Step | spearman_cosine |
-|:-----:|:----:|:---------------:|
-| 0     | 0    | 0.7942          |
-
-
### Framework Versions
- Python: 3.10.13
- Sentence Transformers: 3.0.0
@@ -196,6 +130,28 @@ You can finetune this model on your own dataset.
- Accelerate: 0.30.1
- Datasets: 2.19.2
- Tokenizers: 0.19.1
+## Benchmarks
+
+### Zero-shot Search
+Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA) and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).
+
+| model | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
+|--------|--------|---------------------|-------------------|-------------------|
+| me5-base | 0.3B | 84.2 | 47.2 | 25.4 |
+| GLuCoSE | 0.1B | 53.3 | 30.8 | 25.2 |
+| GLuCoSE v2 | 0.1B | 85.5 | 60.6 | 33.8 |
+
+### JMTEB
+Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).
+* The time-consuming tasks ['amazon_review_classification', 'mrtydi', 'jaqket', 'esci'] were excluded from the evaluation.
+* The average is a macro-average over tasks.
+
+| model | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
+|--------|--------|--------|------|------|-------|-------|------|
+| me5-base | 0.3B | 75.1 | 80.6 | 80.5 | 52.6 | 62.4 | 70.2 |
+| GLuCoSE | 0.1B | 82.6 | 69.8 | 78.2 | 51.5 | 66.2 | 69.7 |
+| GLuCoSE v2 | 0.1B | 80.5 | 82.8 | 83.0 | 49.8 | 62.4 | 71.7 |
+

## Citation

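The card's `pipeline_tag: sentence-similarity` and the benchmarks added above all come down to the same operation: encode text into dense vectors and compare them by cosine similarity. A minimal sketch of that usage follows; the model ID is a stand-in borrowed from the `base_model` field in the new metadata, so substitute this repository's actual ID.

```python
# Minimal usage sketch (sentence-transformers >= 3.0). The model ID below is a
# stand-in taken from the base_model field; substitute the repository's own ID.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("pkshatech/GLuCoSE-base-ja")  # placeholder ID

sentences = [
    "今日は天気が良い。",      # "The weather is nice today."
    "本日は快晴です。",        # "It is sunny today."
    "猫がソファで寝ている。",  # "A cat is sleeping on the sofa."
]

# Encode sentences into dense vectors.
embeddings = model.encode(sentences)

# Pairwise cosine similarities; semantically close sentences score higher.
scores = util.cos_sim(embeddings, embeddings)
print(scores)
```

The zero-shot search setting works the same way: encode the query and the candidate documents, then rank documents by their cosine similarity to the query.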
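The `pearson_cosine` and `spearman_cosine` values that this commit strips from the metadata are Pearson and Spearman correlations between the model's cosine similarities and gold similarity ratings, which is what `EmbeddingSimilarityEvaluator` reports. A rough illustration of that computation, using hypothetical sentence pairs and ratings and the same placeholder model ID:

```python
# Sketch of how pearson_cosine / spearman_cosine style metrics are computed:
# correlate cosine similarities of sentence pairs with gold similarity ratings.
# The pairs, ratings, and model ID below are hypothetical placeholders.
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("pkshatech/GLuCoSE-base-ja")  # placeholder ID

pairs = [
    ("犬が公園を走っている。", "犬が屋外で駆けている。"),
    ("犬が公園を走っている。", "株価が大きく下落した。"),
    ("彼はピアノを弾く。", "彼は楽器を演奏する。"),
]
gold_scores = [4.5, 0.5, 3.8]  # hypothetical human similarity ratings

emb_a = model.encode([a for a, _ in pairs])
emb_b = model.encode([b for _, b in pairs])

# Cosine similarity of each aligned pair (diagonal of the full similarity matrix).
cosine_scores = util.cos_sim(emb_a, emb_b).diagonal().tolist()

pearson_r, _ = pearsonr(cosine_scores, gold_scores)
spearman_r, _ = spearmanr(cosine_scores, gold_scores)
print("pearson_cosine: ", pearson_r)
print("spearman_cosine:", spearman_r)
```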