Update README.md
        README.md
    CHANGED
    
    | @@ -22,7 +22,7 @@ base_model: cointegrated/LaBSE-en-ru | |
| 22 |  | 
| 23 | 
             
            ---
         | 
| 24 |  | 
| 25 | 
            -
            BERT model for computing  | 
| 26 |  | 
| 27 |  | 
| 28 | 
             
            ## Usage:
         | 
| @@ -60,3 +60,38 @@ print(util.dot_score(embeddings, embeddings)) | |
| 60 | 
             
            | cointegrated/LaBSE-en-ru           |  0.794   |  0.659   |  0.431   |  0.761   |  0.946   |  0.766   |  0.789   |  0.769   |  0.340   |  0.414   |
         | 
| 61 |  | 
| 62 |  | 
| 22 |  | 
| 23 | 
             
            ---
         | 
| 24 |  | 
| 25 | 
            +
            BERT model for computing sentence embeddings in Russian. The model is based on [cointegrated/LaBSE-en-ru](https://huggingface.co/cointegrated/LaBSE-en-ru) and has the same context length (512), embedding size (768), and inference speed.
         | 
| 26 |  | 
| 27 |  | 
| 28 | 
             
            ## Usage:
         | 
|  | |
| 60 | 
             
            | cointegrated/LaBSE-en-ru           |  0.794   |  0.659   |  0.431   |  0.761   |  0.946   |  0.766   |  0.789   |  0.769   |  0.340   |  0.414   |
         | 
| 61 |  | 
| 62 |  | 
| 63 | 
            +
            Model scores on the [ruMTEB](https://habr.com/ru/companies/sberdevices/articles/831150/) benchmark:
         | 
| 64 | 
            +
             | 
| 65 | 
            +
            |Model Name                         | Metric              | sbert_large_mt_nlu_ru  | sbert_large_nlu_ru  | [LaBSE-ru-sts](https://huggingface.co/sergeyzh/LaBSE-ru-sts)    | LaBSE-ru-turbo    | multilingual-e5-small | multilingual-e5-base | multilingual-e5-large |
         | 
| 66 | 
            +
            |:----------------------------------|:--------------------|-----------------------:|--------------------:|----------------:|------------------:|----------------------:|---------------------:|----------------------:|
         | 
| 67 | 
            +
            |CEDRClassification                 | Accuracy            |         0.368          |         0.358       |      0.418      |      **0.451**    |        0.401          |        0.423         |         0.448         |
         | 
| 68 | 
            +
            |GeoreviewClassification            | Accuracy            |         0.397          |         0.400       |      0.406      |        0.438      |        0.447          |        0.461         |       **0.497**       |
         | 
| 69 | 
            +
            |GeoreviewClusteringP2P             | V-measure           |         0.584          |         0.590       |      0.626      |      **0.644**    |        0.586          |        0.545         |         0.605         |
         | 
| 70 | 
            +
            |HeadlineClassification             | Accuracy            |         0.772          |       **0.793**     |      0.633      |        0.688      |        0.732          |        0.757         |         0.758         |
         | 
| 71 | 
            +
            |InappropriatenessClassification    | Accuracy            |       **0.646**        |         0.625       |      0.599      |        0.615      |        0.592          |        0.588         |         0.616         |
         | 
| 72 | 
            +
            |KinopoiskClassification            | Accuracy            |         0.503          |         0.495       |      0.496      |        0.521      |        0.500          |        0.509         |       **0.566**       |
         | 
| 73 | 
            +
            |RiaNewsRetrieval                   | NDCG@10             |         0.214          |         0.111       |      0.651      |        0.694      |        0.700          |        0.702         |       **0.807**       |
         | 
| 74 | 
            +
            |RuBQReranking                      | MAP@10              |         0.561          |         0.468       |      0.688      |        0.687      |        0.715          |        0.720         |       **0.756**       |
         | 
| 75 | 
            +
            |RuBQRetrieval                      | NDCG@10             |         0.298          |         0.124       |      0.622      |        0.657      |        0.685          |        0.696         |       **0.741**       |
         | 
| 76 | 
            +
            |RuReviewsClassification            | Accuracy            |         0.589          |         0.583       |      0.599      |        0.632      |        0.612          |        0.630         |       **0.653**       |
         | 
| 77 | 
            +
            |RuSTSBenchmarkSTS                  | Pearson correlation |         0.712          |         0.588       |      0.788      |        0.822      |        0.781          |        0.796         |       **0.831**       |
         | 
| 78 | 
            +
            |RuSciBenchGRNTIClassification      | Accuracy            |         0.542          |         0.539       |      0.529      |        0.569      |        0.550          |        0.563         |       **0.582**       |
         | 
| 79 | 
            +
            |RuSciBenchGRNTIClusteringP2P       | V-measure           |       **0.522**        |         0.504       |      0.486      |        0.517      |        0.511          |        0.516         |         0.520         |
         | 
| 80 | 
            +
            |RuSciBenchOECDClassification       | Accuracy            |         0.438          |         0.430       |      0.406      |        0.440      |        0.427          |        0.423         |       **0.445**       |
         | 
| 81 | 
            +
            |RuSciBenchOECDClusteringP2P        | V-measure           |       **0.473**        |         0.464       |      0.426      |        0.452      |        0.443          |        0.448         |         0.450         |
         | 
| 82 | 
            +
            |SensitiveTopicsClassification      | Accuracy            |       **0.285**        |         0.280       |      0.262      |        0.272      |        0.228          |        0.234         |         0.257         |
         | 
| 83 | 
            +
            |TERRaClassification                | Average Precision   |         0.520          |         0.502       |    **0.587**    |        0.585      |        0.551          |        0.550         |         0.584         |
         | 
| 84 | 
            +
             | 
| 85 | 
            +
            |Model Name                         | Metric              | sbert_large_mt_nlu_ru  | sbert_large_nlu_ru  | [LaBSE-ru-sts](https://huggingface.co/sergeyzh/LaBSE-ru-sts)    | LaBSE-ru-turbo    | multilingual-e5-small | multilingual-e5-base | multilingual-e5-large |
         | 
| 86 | 
            +
            |:----------------------------------|:--------------------|-----------------------:|--------------------:|----------------:|------------------:|----------------------:|----------------------:|---------------------:|
         | 
| 87 | 
            +
            |Classification                     | Accuracy            |         0.554          |        0.552        |      0.524      |        0.558      |        0.551          |        0.561          |      **0.588**       |
         | 
| 88 | 
            +
            |Clustering                         | V-measure           |         0.526          |        0.519        |      0.513      |      **0.538**    |        0.513          |        0.503          |        0.525         |
         | 
| 89 | 
            +
            |MultiLabelClassification           | Accuracy            |         0.326          |        0.319        |      0.340      |      **0.361**    |        0.314          |        0.329          |        0.353         |
         | 
| 90 | 
            +
            |PairClassification                 | Average Precision   |         0.520          |        0.502        |    **0.587**    |        0.585      |        0.551          |        0.550          |        0.584         |
         | 
| 91 | 
            +
            |Reranking                          | MAP@10              |         0.561          |        0.468        |      0.688      |        0.687      |        0.715          |        0.720          |      **0.756**       |
         | 
| 92 | 
            +
            |Retrieval                          | NDCG@10             |         0.256          |        0.118        |      0.637      |        0.675      |        0.697          |        0.699          |      **0.774**       |
         | 
| 93 | 
            +
            |STS                                | Pearson correlation |         0.712          |        0.588        |      0.788      |        0.822      |        0.781          |        0.796          |      **0.831**       |
         | 
| 94 | 
            +
            |Average                            | Average             |         0.494          |        0.438        |      0.582      |        0.604      |        0.588          |        0.594          |      **0.630**       |
         | 
| 95 | 
            +
             | 
| 96 | 
            +
             | 
| 97 | 
            +
             |
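
The README does not say how the summary table is derived from the per-task table. The published numbers look consistent with an unweighted mean: a category score averages its tasks, and the `Average` row averages the seven category scores. The sketch below checks this for the LaBSE-ru-turbo column only. The helper name `macro_average` is hypothetical, and because the inputs are already rounded to three decimals, some categories may differ from the table in the last digit.

```python
from statistics import mean

# Per-task clustering scores for LaBSE-ru-turbo, copied from the per-task table.
clustering_tasks = {
    "GeoreviewClusteringP2P": 0.644,
    "RuSciBenchGRNTIClusteringP2P": 0.517,
    "RuSciBenchOECDClusteringP2P": 0.452,
}

# Per-category scores for LaBSE-ru-turbo, copied from the summary table.
category_scores = {
    "Classification": 0.558,
    "Clustering": 0.538,
    "MultiLabelClassification": 0.361,
    "PairClassification": 0.585,
    "Reranking": 0.687,
    "Retrieval": 0.675,
    "STS": 0.822,
}


def macro_average(scores):
    """Unweighted mean of the score values, rounded to three decimals.

    Assumption: the README aggregates this way; the source does not confirm it.
    """
    return round(mean(scores.values()), 3)


# Compare with the Clustering and Average cells of the summary table.
print("Clustering:", macro_average(clustering_tasks))
print("Average:", macro_average(category_scores))
```

Both results match the corresponding cells in the summary table, which supports the unweighted-mean reading without proving it.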