SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
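Because the Pooling module mean-pools token embeddings and the final Normalize module rescales them to unit length, the cosine similarity used by this model is equivalent to a plain dot product over the output vectors. A minimal sketch (using the model ID from the Usage section below) to confirm the stated sequence length, dimensionality, and normalization:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Devy1/MiniLM-cosqa-64")

# The card's stated limits: 256-token input window, 384-dimensional output.
print(model.get_max_seq_length())                # 256
print(model.get_sentence_embedding_dimension())  # 384

# The Normalize() module makes every embedding unit length,
# so cosine similarity reduces to a dot product.
emb = model.encode(["bottom 5 rows in python"])
print(np.linalg.norm(emb[0]))                    # ~1.0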

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Devy1/MiniLM-cosqa-64")
# Run inference
sentences = [
    'bottom 5 rows in python',
    'def table_top_abs(self):\n        """Returns the absolute position of table top"""\n        table_height = np.array([0, 0, self.table_full_size[2]])\n        return string_to_array(self.floor.get("pos")) + table_height',
    'def refresh(self, document):\n\t\t""" Load a new copy of a document from the database.  does not\n\t\t\treplace the old one """\n\t\ttry:\n\t\t\told_cache_size = self.cache_size\n\t\t\tself.cache_size = 0\n\t\t\tobj = self.query(type(document)).filter_by(mongo_id=document.mongo_id).one()\n\t\tfinally:\n\t\t\tself.cache_size = old_cache_size\n\t\tself.cache_write(obj)\n\t\treturn obj',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.4847, -0.0572],
#         [ 0.4847,  1.0000, -0.0541],
#         [-0.0572, -0.0541,  1.0000]])
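The example above pairs a natural-language query with Python functions; a common workflow for this model is natural-language code search over a snippet corpus. A minimal sketch, where the code_snippets list is purely illustrative (not taken from the training data):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Devy1/MiniLM-cosqa-64")

# Hypothetical corpus of code snippets to search over.
code_snippets = [
    "def tail(df, n=5):\n    return df.iloc[-n:]",
    "def head(df, n=5):\n    return df.iloc[:n]",
]
corpus_embeddings = model.encode(code_snippets, convert_to_tensor=True)

# Encode a natural-language query and retrieve the closest snippets.
query_embedding = model.encode("bottom 5 rows in python", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(round(hit["score"], 4), code_snippets[hit["corpus_id"]])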

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,020 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 6 tokens, mean: 9.67 tokens, max: 21 tokens
    • positive: string; min: 40 tokens, mean: 86.17 tokens, max: 256 tokens
  • Samples:
    • anchor: 1d array in char datatype in python
      positive:
          def _convert_to_array(array_like, dtype):
              """
              Convert Matrix attributes which are array-like or buffer to array.
              """
              if isinstance(array_like, bytes):
                  return np.frombuffer(array_like, dtype=dtype)
              return np.asarray(array_like, dtype=dtype)
    • anchor: python condition non none
      positive:
          def _not(condition=None, **kwargs):
              """
              Return the opposite of input condition.

              :param condition: condition to process.

              :result: not condition.
              :rtype: bool
              """

              result = True

              if condition is not None:
                  result = not run(condition, **kwargs)

              return result
    • anchor: accessing a column from a matrix in python
      positive:
          def get_column(self, X, column):
              """Return a column of the given matrix.

              Args:
                  X: numpy.ndarray or pandas.DataFrame.
                  column: int or str.

              Returns:
                  np.ndarray: Selected column.
              """
              if isinstance(X, pd.DataFrame):
                  return X[column].values

              return X[:, column]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
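In code, this corresponds to constructing the loss as below; each anchor in a batch treats the positives of the other in-batch examples as negatives. A minimal sketch, assuming the base model is loaded first:

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# scale=20.0 and cos_sim mirror the parameters listed above; the other
# positives in each batch act as in-batch negatives.
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)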
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • fp16: True
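Combined with the defaults listed under "All Hyperparameters" (3 epochs, learning rate 5e-05, linear scheduler, seed 42), these settings suggest a training run along the following lines. This is a hedged reconstruction: the output directory and the dataset loading are placeholders rather than details taken from this card:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder dataset with the anchor/positive columns described above;
# the real run used 9,020 query/code pairs.
train_dataset = Dataset.from_dict({
    "anchor": ["bottom 5 rows in python"],
    "positive": ["def tail(df, n=5):\n    return df.iloc[-n:]"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="MiniLM-cosqa-64",        # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=64,
    learning_rate=5e-5,
    fp16=True,
    seed=42,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()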

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0071 1 0.4603
0.0142 2 0.3179
0.0213 3 0.1802
0.0284 4 0.2268
0.0355 5 0.2288
0.0426 6 0.1769
0.0496 7 0.1555
0.0567 8 0.2626
0.0638 9 0.3319
0.0709 10 0.28
0.0780 11 0.3356
0.0851 12 0.3241
0.0922 13 0.2933
0.0993 14 0.3929
0.1064 15 0.1861
0.1135 16 0.1983
0.1206 17 0.1605
0.1277 18 0.0918
0.1348 19 0.2831
0.1418 20 0.1709
0.1489 21 0.1984
0.1560 22 0.2657
0.1631 23 0.1619
0.1702 24 0.1728
0.1773 25 0.1791
0.1844 26 0.2429
0.1915 27 0.2743
0.1986 28 0.2813
0.2057 29 0.2192
0.2128 30 0.166
0.2199 31 0.2557
0.2270 32 0.3556
0.2340 33 0.2238
0.2411 34 0.2552
0.2482 35 0.2266
0.2553 36 0.4347
0.2624 37 0.2803
0.2695 38 0.1219
0.2766 39 0.1989
0.2837 40 0.2364
0.2908 41 0.2237
0.2979 42 0.1567
0.3050 43 0.2509
0.3121 44 0.16
0.3191 45 0.2148
0.3262 46 0.1953
0.3333 47 0.2447
0.3404 48 0.2001
0.3475 49 0.283
0.3546 50 0.1505
0.3617 51 0.2825
0.3688 52 0.2137
0.3759 53 0.1376
0.3830 54 0.3898
0.3901 55 0.1873
0.3972 56 0.2226
0.4043 57 0.3129
0.4113 58 0.2127
0.4184 59 0.3474
0.4255 60 0.0971
0.4326 61 0.1728
0.4397 62 0.2851
0.4468 63 0.2608
0.4539 64 0.3269
0.4610 65 0.4905
0.4681 66 0.1886
0.4752 67 0.1465
0.4823 68 0.2342
0.4894 69 0.1915
0.4965 70 0.2291
0.5035 71 0.3232
0.5106 72 0.1633
0.5177 73 0.2039
0.5248 74 0.2441
0.5319 75 0.2336
0.5390 76 0.139
0.5461 77 0.4471
0.5532 78 0.1989
0.5603 79 0.2112
0.5674 80 0.1862
0.5745 81 0.2353
0.5816 82 0.2326
0.5887 83 0.3223
0.5957 84 0.2055
0.6028 85 0.2968
0.6099 86 0.2531
0.6170 87 0.2401
0.6241 88 0.1632
0.6312 89 0.4203
0.6383 90 0.1959
0.6454 91 0.2309
0.6525 92 0.3729
0.6596 93 0.2488
0.6667 94 0.1698
0.6738 95 0.267
0.6809 96 0.1658
0.6879 97 0.2158
0.6950 98 0.1665
0.7021 99 0.1897
0.7092 100 0.2159
0.7163 101 0.1932
0.7234 102 0.2236
0.7305 103 0.1287
0.7376 104 0.1917
0.7447 105 0.4039
0.7518 106 0.388
0.7589 107 0.1267
0.7660 108 0.1851
0.7730 109 0.1916
0.7801 110 0.1893
0.7872 111 0.1702
0.7943 112 0.1552
0.8014 113 0.1529
0.8085 114 0.1634
0.8156 115 0.2136
0.8227 116 0.1719
0.8298 117 0.2529
0.8369 118 0.2329
0.8440 119 0.2483
0.8511 120 0.132
0.8582 121 0.182
0.8652 122 0.127
0.8723 123 0.3685
0.8794 124 0.4202
0.8865 125 0.2173
0.8936 126 0.0657
0.9007 127 0.0838
0.9078 128 0.1592
0.9149 129 0.2506
0.9220 130 0.1624
0.9291 131 0.1511
0.9362 132 0.138
0.9433 133 0.2187
0.9504 134 0.2891
0.9574 135 0.158
0.9645 136 0.2595
0.9716 137 0.2911
0.9787 138 0.2141
0.9858 139 0.1723
0.9929 140 0.1847
1.0 141 0.2606
1.0071 142 0.1283
1.0142 143 0.1626
1.0213 144 0.2121
1.0284 145 0.142
1.0355 146 0.1335
1.0426 147 0.1084
1.0496 148 0.15
1.0567 149 0.1459
1.0638 150 0.0674
1.0709 151 0.1393
1.0780 152 0.1582
1.0851 153 0.1295
1.0922 154 0.1402
1.0993 155 0.2266
1.1064 156 0.1025
1.1135 157 0.1616
1.1206 158 0.1795
1.1277 159 0.1583
1.1348 160 0.1624
1.1418 161 0.1068
1.1489 162 0.1301
1.1560 163 0.1792
1.1631 164 0.1656
1.1702 165 0.1666
1.1773 166 0.1031
1.1844 167 0.1092
1.1915 168 0.1668
1.1986 169 0.1218
1.2057 170 0.146
1.2128 171 0.1041
1.2199 172 0.2275
1.2270 173 0.1017
1.2340 174 0.1025
1.2411 175 0.1385
1.2482 176 0.1024
1.2553 177 0.1073
1.2624 178 0.0802
1.2695 179 0.1985
1.2766 180 0.1918
1.2837 181 0.092
1.2908 182 0.1591
1.2979 183 0.2512
1.3050 184 0.2213
1.3121 185 0.129
1.3191 186 0.0759
1.3262 187 0.243
1.3333 188 0.1759
1.3404 189 0.126
1.3475 190 0.1105
1.3546 191 0.1789
1.3617 192 0.1841
1.3688 193 0.1074
1.3759 194 0.1293
1.3830 195 0.1228
1.3901 196 0.1574
1.3972 197 0.1073
1.4043 198 0.1305
1.4113 199 0.1911
1.4184 200 0.1088
1.4255 201 0.111
1.4326 202 0.1639
1.4397 203 0.0944
1.4468 204 0.2008
1.4539 205 0.136
1.4610 206 0.1981
1.4681 207 0.0848
1.4752 208 0.0771
1.4823 209 0.0933
1.4894 210 0.1794
1.4965 211 0.1533
1.5035 212 0.1841
1.5106 213 0.1724
1.5177 214 0.1205
1.5248 215 0.1118
1.5319 216 0.16
1.5390 217 0.2911
1.5461 218 0.1678
1.5532 219 0.1032
1.5603 220 0.1438
1.5674 221 0.1581
1.5745 222 0.1143
1.5816 223 0.1782
1.5887 224 0.2768
1.5957 225 0.1127
1.6028 226 0.1719
1.6099 227 0.2252
1.6170 228 0.2182
1.6241 229 0.287
1.6312 230 0.1314
1.6383 231 0.1951
1.6454 232 0.13
1.6525 233 0.0677
1.6596 234 0.1188
1.6667 235 0.1214
1.6738 236 0.1219
1.6809 237 0.1646
1.6879 238 0.1079
1.6950 239 0.1624
1.7021 240 0.0994
1.7092 241 0.194
1.7163 242 0.1104
1.7234 243 0.1223
1.7305 244 0.0918
1.7376 245 0.0835
1.7447 246 0.0994
1.7518 247 0.1375
1.7589 248 0.1004
1.7660 249 0.1164
1.7730 250 0.1151
1.7801 251 0.0868
1.7872 252 0.2498
1.7943 253 0.0741
1.8014 254 0.1417
1.8085 255 0.0514
1.8156 256 0.2346
1.8227 257 0.2383
1.8298 258 0.1432
1.8369 259 0.1563
1.8440 260 0.1267
1.8511 261 0.1331
1.8582 262 0.1904
1.8652 263 0.0912
1.8723 264 0.214
1.8794 265 0.1846
1.8865 266 0.1378
1.8936 267 0.1012
1.9007 268 0.1468
1.9078 269 0.109
1.9149 270 0.1136
1.9220 271 0.1734
1.9291 272 0.0785
1.9362 273 0.0388
1.9433 274 0.1138
1.9504 275 0.0806
1.9574 276 0.2819
1.9645 277 0.1719
1.9716 278 0.0479
1.9787 279 0.1038
1.9858 280 0.1401
1.9929 281 0.1961
2.0 282 0.1072
2.0071 283 0.1005
2.0142 284 0.147
2.0213 285 0.1011
2.0284 286 0.1304
2.0355 287 0.073
2.0426 288 0.0952
2.0496 289 0.0956
2.0567 290 0.1083
2.0638 291 0.1101
2.0709 292 0.0534
2.0780 293 0.0837
2.0851 294 0.0966
2.0922 295 0.195
2.0993 296 0.0608
2.1064 297 0.0999
2.1135 298 0.1588
2.1206 299 0.1283
2.1277 300 0.0962
2.1348 301 0.0872
2.1418 302 0.0793
2.1489 303 0.1209
2.1560 304 0.1346
2.1631 305 0.131
2.1702 306 0.1081
2.1773 307 0.1109
2.1844 308 0.197
2.1915 309 0.108
2.1986 310 0.1715
2.2057 311 0.0654
2.2128 312 0.1374
2.2199 313 0.0929
2.2270 314 0.033
2.2340 315 0.0903
2.2411 316 0.1417
2.2482 317 0.134
2.2553 318 0.041
2.2624 319 0.0947
2.2695 320 0.0655
2.2766 321 0.0525
2.2837 322 0.0547
2.2908 323 0.1342
2.2979 324 0.1088
2.3050 325 0.162
2.3121 326 0.0962
2.3191 327 0.154
2.3262 328 0.0935
2.3333 329 0.1186
2.3404 330 0.1192
2.3475 331 0.1075
2.3546 332 0.12
2.3617 333 0.0679
2.3688 334 0.1087
2.3759 335 0.1493
2.3830 336 0.085
2.3901 337 0.1784
2.3972 338 0.0567
2.4043 339 0.1842
2.4113 340 0.183
2.4184 341 0.1108
2.4255 342 0.1405
2.4326 343 0.2477
2.4397 344 0.2376
2.4468 345 0.1469
2.4539 346 0.1048
2.4610 347 0.1153
2.4681 348 0.1167
2.4752 349 0.1605
2.4823 350 0.1479
2.4894 351 0.0684
2.4965 352 0.0515
2.5035 353 0.1035
2.5106 354 0.1488
2.5177 355 0.0274
2.5248 356 0.0706
2.5319 357 0.1541
2.5390 358 0.1331
2.5461 359 0.0911
2.5532 360 0.0606
2.5603 361 0.1612
2.5674 362 0.2752
2.5745 363 0.1436
2.5816 364 0.1257
2.5887 365 0.1174
2.5957 366 0.0415
2.6028 367 0.0918
2.6099 368 0.0899
2.6170 369 0.1136
2.6241 370 0.1337
2.6312 371 0.1948
2.6383 372 0.1482
2.6454 373 0.1209
2.6525 374 0.1082
2.6596 375 0.1948
2.6667 376 0.1029
2.6738 377 0.0783
2.6809 378 0.0844
2.6879 379 0.1045
2.6950 380 0.0982
2.7021 381 0.075
2.7092 382 0.15
2.7163 383 0.1155
2.7234 384 0.1334
2.7305 385 0.0767
2.7376 386 0.0476
2.7447 387 0.068
2.7518 388 0.0967
2.7589 389 0.0953
2.7660 390 0.1307
2.7730 391 0.0923
2.7801 392 0.1159
2.7872 393 0.0769
2.7943 394 0.0993
2.8014 395 0.1018
2.8085 396 0.0783
2.8156 397 0.0792
2.8227 398 0.0914
2.8298 399 0.0821
2.8369 400 0.0947
2.8440 401 0.0622
2.8511 402 0.1858
2.8582 403 0.1977
2.8652 404 0.0398
2.8723 405 0.0784
2.8794 406 0.1622
2.8865 407 0.1213
2.8936 408 0.1867
2.9007 409 0.1257
2.9078 410 0.1366
2.9149 411 0.0983
2.9220 412 0.0967
2.9291 413 0.0398
2.9362 414 0.1582
2.9433 415 0.123
2.9504 416 0.1768
2.9574 417 0.131
2.9645 418 0.0731
2.9716 419 0.074
2.9787 420 0.1176
2.9858 421 0.0984
2.9929 422 0.0834
3.0 423 0.1985

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 5.1.1
  • Transformers: 4.56.2
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.22.1
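
To reproduce this environment, the listed library versions can be pinned at install time (a sketch; the +cu128 PyTorch build may additionally require the matching CUDA wheel index):

pip install sentence-transformers==5.1.1 transformers==4.56.2 accelerate==1.10.1 datasets==4.1.1 tokenizers==0.22.1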

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}