SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
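Because the Pooling module mean-pools token embeddings and the final Normalize module rescales them to unit length, the cosine similarity used by this model is equivalent to a plain dot product over the output vectors. A minimal sketch (using the model ID from the Usage section below) to confirm the stated sequence length, dimensionality, and normalization:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Devy1/MiniLM-cosqa-64")

# The card's stated limits: 256-token input window, 384-dimensional output.
print(model.get_max_seq_length())                # 256
print(model.get_sentence_embedding_dimension())  # 384

# The Normalize() module makes every embedding unit length,
# so cosine similarity reduces to a dot product.
emb = model.encode(["bottom 5 rows in python"])
print(np.linalg.norm(emb[0]))                    # ~1.0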

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Devy1/MiniLM-cosqa-64")
# Run inference
sentences = [
    'bottom 5 rows in python',
    'def table_top_abs(self):\n        """Returns the absolute position of table top"""\n        table_height = np.array([0, 0, self.table_full_size[2]])\n        return string_to_array(self.floor.get("pos")) + table_height',
    'def refresh(self, document):\n\t\t""" Load a new copy of a document from the database.  does not\n\t\t\treplace the old one """\n\t\ttry:\n\t\t\told_cache_size = self.cache_size\n\t\t\tself.cache_size = 0\n\t\t\tobj = self.query(type(document)).filter_by(mongo_id=document.mongo_id).one()\n\t\tfinally:\n\t\t\tself.cache_size = old_cache_size\n\t\tself.cache_write(obj)\n\t\treturn obj',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.4847, -0.0572],
#         [ 0.4847,  1.0000, -0.0541],
#         [-0.0572, -0.0541,  1.0000]])
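The example above pairs a natural-language query with Python functions; a common workflow for this model is natural-language code search over a snippet corpus. A minimal sketch, where the code_snippets list is purely illustrative (not taken from the training data):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Devy1/MiniLM-cosqa-64")

# Hypothetical corpus of code snippets to search over.
code_snippets = [
    "def tail(df, n=5):\n    return df.iloc[-n:]",
    "def head(df, n=5):\n    return df.iloc[:n]",
]
corpus_embeddings = model.encode(code_snippets, convert_to_tensor=True)

# Encode a natural-language query and retrieve the closest snippets.
query_embedding = model.encode("bottom 5 rows in python", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(round(hit["score"], 4), code_snippets[hit["corpus_id"]])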

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,020 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 6 tokens, mean: 9.67 tokens, max: 21 tokens
    • positive: string; min: 40 tokens, mean: 86.17 tokens, max: 256 tokens
  • Samples:
    • anchor: 1d array in char datatype in python
      positive:
          def _convert_to_array(array_like, dtype):
              """
              Convert Matrix attributes which are array-like or buffer to array.
              """
              if isinstance(array_like, bytes):
                  return np.frombuffer(array_like, dtype=dtype)
              return np.asarray(array_like, dtype=dtype)
    • anchor: python condition non none
      positive:
          def _not(condition=None, **kwargs):
              """
              Return the opposite of input condition.

              :param condition: condition to process.

              :result: not condition.
              :rtype: bool
              """

              result = True

              if condition is not None:
                  result = not run(condition, **kwargs)

              return result
    • anchor: accessing a column from a matrix in python
      positive:
          def get_column(self, X, column):
              """Return a column of the given matrix.

              Args:
                  X: numpy.ndarray or pandas.DataFrame.
                  column: int or str.

              Returns:
                  np.ndarray: Selected column.
              """
              if isinstance(X, pd.DataFrame):
                  return X[column].values

              return X[:, column]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
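In code, this corresponds to constructing the loss as below; each anchor in a batch treats the positives of the other in-batch examples as negatives. A minimal sketch, assuming the base model is loaded first:

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# scale=20.0 and cos_sim mirror the parameters listed above; the other
# positives in each batch act as in-batch negatives.
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)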
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • fp16: True
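Combined with the defaults listed under "All Hyperparameters" (3 epochs, learning rate 5e-05, linear scheduler, seed 42), these settings suggest a training run along the following lines. This is a hedged reconstruction: the output directory and the dataset loading are placeholders rather than details taken from this card:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder dataset with the anchor/positive columns described above;
# the real run used 9,020 query/code pairs.
train_dataset = Dataset.from_dict({
    "anchor": ["bottom 5 rows in python"],
    "positive": ["def tail(df, n=5):\n    return df.iloc[-n:]"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="MiniLM-cosqa-64",        # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=64,
    learning_rate=5e-5,
    fp16=True,
    seed=42,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()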

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0071 1 0.4603
0.0142 2 0.3179
0.0213 3 0.1802
0.0284 4 0.2268
0.0355 5 0.2288
0.0426 6 0.1769
0.0496 7 0.1555
0.0567 8 0.2626
0.0638 9 0.3319
0.0709 10 0.28
0.0780 11 0.3356
0.0851 12 0.3241
0.0922 13 0.2933
0.0993 14 0.3929
0.1064 15 0.1861
0.1135 16 0.1983
0.1206 17 0.1605
0.1277 18 0.0918
0.1348 19 0.2831
0.1418 20 0.1709
0.1489 21 0.1984
0.1560 22 0.2657
0.1631 23 0.1619
0.1702 24 0.1728
0.1773 25 0.1791
0.1844 26 0.2429
0.1915 27 0.2743
0.1986 28 0.2813
0.2057 29 0.2192
0.2128 30 0.166
0.2199 31 0.2557
0.2270 32 0.3556
0.2340 33 0.2238
0.2411 34 0.2552
0.2482 35 0.2266
0.2553 36 0.4347
0.2624 37 0.2803
0.2695 38 0.1219
0.2766 39 0.1989
0.2837 40 0.2364
0.2908 41 0.2237
0.2979 42 0.1567
0.3050 43 0.2509
0.3121 44 0.16
0.3191 45 0.2148
0.3262 46 0.1953
0.3333 47 0.2447
0.3404 48 0.2001
0.3475 49 0.283
0.3546 50 0.1505
0.3617 51 0.2825
0.3688 52 0.2137
0.3759 53 0.1376
0.3830 54 0.3898
0.3901 55 0.1873
0.3972 56 0.2226
0.4043 57 0.3129
0.4113 58 0.2127
0.4184 59 0.3474
0.4255 60 0.0971
0.4326 61 0.1728
0.4397 62 0.2851
0.4468 63 0.2608
0.4539 64 0.3269
0.4610 65 0.4905
0.4681 66 0.1886
0.4752 67 0.1465
0.4823 68 0.2342
0.4894 69 0.1915
0.4965 70 0.2291
0.5035 71 0.3232
0.5106 72 0.1633
0.5177 73 0.2039
0.5248 74 0.2441
0.5319 75 0.2336
0.5390 76 0.139
0.5461 77 0.4471
0.5532 78 0.1989
0.5603 79 0.2112
0.5674 80 0.1862
0.5745 81 0.2353
0.5816 82 0.2326
0.5887 83 0.3223
0.5957 84 0.2055
0.6028 85 0.2968
0.6099 86 0.2531
0.6170 87 0.2401
0.6241 88 0.1632
0.6312 89 0.4203
0.6383 90 0.1959
0.6454 91 0.2309
0.6525 92 0.3729
0.6596 93 0.2488
0.6667 94 0.1698
0.6738 95 0.267
0.6809 96 0.1658
0.6879 97 0.2158
0.6950 98 0.1665
0.7021 99 0.1897
0.7092 100 0.2159
0.7163 101 0.1932
0.7234 102 0.2236
0.7305 103 0.1287
0.7376 104 0.1917
0.7447 105 0.4039
0.7518 106 0.388
0.7589 107 0.1267
0.7660 108 0.1851
0.7730 109 0.1916
0.7801 110 0.1893
0.7872 111 0.1702
0.7943 112 0.1552
0.8014 113 0.1529
0.8085 114 0.1634
0.8156 115 0.2136
0.8227 116 0.1719
0.8298 117 0.2529
0.8369 118 0.2329
0.8440 119 0.2483
0.8511 120 0.132
0.8582 121 0.182
0.8652 122 0.127
0.8723 123 0.3685
0.8794 124 0.4202
0.8865 125 0.2173
0.8936 126 0.0657
0.9007 127 0.0838
0.9078 128 0.1592
0.9149 129 0.2506
0.9220 130 0.1624
0.9291 131 0.1511
0.9362 132 0.138
0.9433 133 0.2187
0.9504 134 0.2891
0.9574 135 0.158
0.9645 136 0.2595
0.9716 137 0.2911
0.9787 138 0.2141
0.9858 139 0.1723
0.9929 140 0.1847
1.0 141 0.2606
1.0071 142 0.1283
1.0142 143 0.1626
1.0213 144 0.2121
1.0284 145 0.142
1.0355 146 0.1335
1.0426 147 0.1084
1.0496 148 0.15
1.0567 149 0.1459
1.0638 150 0.0674
1.0709 151 0.1393
1.0780 152 0.1582
1.0851 153 0.1295
1.0922 154 0.1402
1.0993 155 0.2266
1.1064 156 0.1025
1.1135 157 0.1616
1.1206 158 0.1795
1.1277 159 0.1583
1.1348 160 0.1624
1.1418 161 0.1068
1.1489 162 0.1301
1.1560 163 0.1792
1.1631 164 0.1656
1.1702 165 0.1666
1.1773 166 0.1031
1.1844 167 0.1092
1.1915 168 0.1668
1.1986 169 0.1218
1.2057 170 0.146
1.2128 171 0.1041
1.2199 172 0.2275
1.2270 173 0.1017
1.2340 174 0.1025
1.2411 175 0.1385
1.2482 176 0.1024
1.2553 177 0.1073
1.2624 178 0.0802
1.2695 179 0.1985
1.2766 180 0.1918
1.2837 181 0.092
1.2908 182 0.1591
1.2979 183 0.2512
1.3050 184 0.2213
1.3121 185 0.129
1.3191 186 0.0759
1.3262 187 0.243
1.3333 188 0.1759
1.3404 189 0.126
1.3475 190 0.1105
1.3546 191 0.1789
1.3617 192 0.1841
1.3688 193 0.1074
1.3759 194 0.1293
1.3830 195 0.1228
1.3901 196 0.1574
1.3972 197 0.1073
1.4043 198 0.1305
1.4113 199 0.1911
1.4184 200 0.1088
1.4255 201 0.111
1.4326 202 0.1639
1.4397 203 0.0944
1.4468 204 0.2008
1.4539 205 0.136
1.4610 206 0.1981
1.4681 207 0.0848
1.4752 208 0.0771
1.4823 209 0.0933
1.4894 210 0.1794
1.4965 211 0.1533
1.5035 212 0.1841
1.5106 213 0.1724
1.5177 214 0.1205
1.5248 215 0.1118
1.5319 216 0.16
1.5390 217 0.2911
1.5461 218 0.1678
1.5532 219 0.1032
1.5603 220 0.1438
1.5674 221 0.1581
1.5745 222 0.1143
1.5816 223 0.1782
1.5887 224 0.2768
1.5957 225 0.1127
1.6028 226 0.1719
1.6099 227 0.2252
1.6170 228 0.2182
1.6241 229 0.287
1.6312 230 0.1314
1.6383 231 0.1951
1.6454 232 0.13
1.6525 233 0.0677
1.6596 234 0.1188
1.6667 235 0.1214
1.6738 236 0.1219
1.6809 237 0.1646
1.6879 238 0.1079
1.6950 239 0.1624
1.7021 240 0.0994
1.7092 241 0.194
1.7163 242 0.1104
1.7234 243 0.1223
1.7305 244 0.0918
1.7376 245 0.0835
1.7447 246 0.0994
1.7518 247 0.1375
1.7589 248 0.1004
1.7660 249 0.1164
1.7730 250 0.1151
1.7801 251 0.0868
1.7872 252 0.2498
1.7943 253 0.0741
1.8014 254 0.1417
1.8085 255 0.0514
1.8156 256 0.2346
1.8227 257 0.2383
1.8298 258 0.1432
1.8369 259 0.1563
1.8440 260 0.1267
1.8511 261 0.1331
1.8582 262 0.1904
1.8652 263 0.0912
1.8723 264 0.214
1.8794 265 0.1846
1.8865 266 0.1378
1.8936 267 0.1012
1.9007 268 0.1468
1.9078 269 0.109
1.9149 270 0.1136
1.9220 271 0.1734
1.9291 272 0.0785
1.9362 273 0.0388
1.9433 274 0.1138
1.9504 275 0.0806
1.9574 276 0.2819
1.9645 277 0.1719
1.9716 278 0.0479
1.9787 279 0.1038
1.9858 280 0.1401
1.9929 281 0.1961
2.0 282 0.1072
2.0071 283 0.1005
2.0142 284 0.147
2.0213 285 0.1011
2.0284 286 0.1304
2.0355 287 0.073
2.0426 288 0.0952
2.0496 289 0.0956
2.0567 290 0.1083
2.0638 291 0.1101
2.0709 292 0.0534
2.0780 293 0.0837
2.0851 294 0.0966
2.0922 295 0.195
2.0993 296 0.0608
2.1064 297 0.0999
2.1135 298 0.1588
2.1206 299 0.1283
2.1277 300 0.0962
2.1348 301 0.0872
2.1418 302 0.0793
2.1489 303 0.1209
2.1560 304 0.1346
2.1631 305 0.131
2.1702 306 0.1081
2.1773 307 0.1109
2.1844 308 0.197
2.1915 309 0.108
2.1986 310 0.1715
2.2057 311 0.0654
2.2128 312 0.1374
2.2199 313 0.0929
2.2270 314 0.033
2.2340 315 0.0903
2.2411 316 0.1417
2.2482 317 0.134
2.2553 318 0.041
2.2624 319 0.0947
2.2695 320 0.0655
2.2766 321 0.0525
2.2837 322 0.0547
2.2908 323 0.1342
2.2979 324 0.1088
2.3050 325 0.162
2.3121 326 0.0962
2.3191 327 0.154
2.3262 328 0.0935
2.3333 329 0.1186
2.3404 330 0.1192
2.3475 331 0.1075
2.3546 332 0.12
2.3617 333 0.0679
2.3688 334 0.1087
2.3759 335 0.1493
2.3830 336 0.085
2.3901 337 0.1784
2.3972 338 0.0567
2.4043 339 0.1842
2.4113 340 0.183
2.4184 341 0.1108
2.4255 342 0.1405
2.4326 343 0.2477
2.4397 344 0.2376
2.4468 345 0.1469
2.4539 346 0.1048
2.4610 347 0.1153
2.4681 348 0.1167
2.4752 349 0.1605
2.4823 350 0.1479
2.4894 351 0.0684
2.4965 352 0.0515
2.5035 353 0.1035
2.5106 354 0.1488
2.5177 355 0.0274
2.5248 356 0.0706
2.5319 357 0.1541
2.5390 358 0.1331
2.5461 359 0.0911
2.5532 360 0.0606
2.5603 361 0.1612
2.5674 362 0.2752
2.5745 363 0.1436
2.5816 364 0.1257
2.5887 365 0.1174
2.5957 366 0.0415
2.6028 367 0.0918
2.6099 368 0.0899
2.6170 369 0.1136
2.6241 370 0.1337
2.6312 371 0.1948
2.6383 372 0.1482
2.6454 373 0.1209
2.6525 374 0.1082
2.6596 375 0.1948
2.6667 376 0.1029
2.6738 377 0.0783
2.6809 378 0.0844
2.6879 379 0.1045
2.6950 380 0.0982
2.7021 381 0.075
2.7092 382 0.15
2.7163 383 0.1155
2.7234 384 0.1334
2.7305 385 0.0767
2.7376 386 0.0476
2.7447 387 0.068
2.7518 388 0.0967
2.7589 389 0.0953
2.7660 390 0.1307
2.7730 391 0.0923
2.7801 392 0.1159
2.7872 393 0.0769
2.7943 394 0.0993
2.8014 395 0.1018
2.8085 396 0.0783
2.8156 397 0.0792
2.8227 398 0.0914
2.8298 399 0.0821
2.8369 400 0.0947
2.8440 401 0.0622
2.8511 402 0.1858
2.8582 403 0.1977
2.8652 404 0.0398
2.8723 405 0.0784
2.8794 406 0.1622
2.8865 407 0.1213
2.8936 408 0.1867
2.9007 409 0.1257
2.9078 410 0.1366
2.9149 411 0.0983
2.9220 412 0.0967
2.9291 413 0.0398
2.9362 414 0.1582
2.9433 415 0.123
2.9504 416 0.1768
2.9574 417 0.131
2.9645 418 0.0731
2.9716 419 0.074
2.9787 420 0.1176
2.9858 421 0.0984
2.9929 422 0.0834
3.0 423 0.1985

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 5.1.1
  • Transformers: 4.56.2
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.22.1
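
To reproduce this environment, the listed library versions can be pinned at install time (a sketch; the +cu128 PyTorch build may additionally require the matching CUDA wheel index):

pip install sentence-transformers==5.1.1 transformers==4.56.2 accelerate==1.10.1 datasets==4.1.1 tokenizers==0.22.1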

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}