redis/langcache-embed-experimental

@@ -81,28 +81,28 @@ model-index:
       type: test
     metrics:
     - type: cosine_accuracy@1
-      value: 0.44070346359110285
       name: Cosine Accuracy@1
     - type: cosine_precision@1
-      value: 0.44070346359110285
       name: Cosine Precision@1
     - type: cosine_recall@1
-      value: 0.42648577181064024
       name: Cosine Recall@1
     - type: cosine_ndcg@10
-      value: 0.627438499402098
       name: Cosine Ndcg@10
     - type: cosine_mrr@1
-      value: 0.44070346359110285
       name: Cosine Mrr@1
     - type: cosine_map@100
-      value: 0.5750186225138979
       name: Cosine Map@100
     - type: cosine_auc_precision_cache_hit_ratio
-      value: 0.27246772094744054
       name: Cosine Auc Precision Cache Hit Ratio
     - type: cosine_auc_similarity_distribution
-      value: 0.40850809564840007
       name: Cosine Auc Similarity Distribution
 ---
@@ -167,9 +167,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
-# tensor([[1.0000, 0.9844, 0.9844],
-#         [0.9844, 0.9961, 0.9922],
-#         [0.9844, 0.9922, 0.9961]], dtype=torch.bfloat16)
 ```
 <!--
@@ -207,14 +207,14 @@ You can finetune this model on your own dataset.
 | Metric                               | Value      |
 |:-------------------------------------|:-----------|
-| cosine_accuracy@1                    | 0.4407     |
-| cosine_precision@1                   | 0.4407     |
-| cosine_recall@1                      | 0.4265     |
-| **cosine_ndcg@10**                   | **0.6274** |
-| cosine_mrr@1                         | 0.4407     |
-| cosine_map@100                       | 0.575      |
-| cosine_auc_precision_cache_hit_ratio | 0.2725     |
-| cosine_auc_similarity_distribution   | 0.4085     |
 <!--
 ## Bias, Risks and Limitations
@@ -284,11 +284,307 @@ You can finetune this model on your own dataset.
   }
   ```
 ### Training Logs
-| Epoch | Step | test_cosine_ndcg@10 |
-|:-----:|:----:|:-------------------:|
-| -1    | -1   | 0.6274              |
 ### Framework Versions
 - Python: 3.12.3

       type: test
     metrics:
     - type: cosine_accuracy@1
+      value: 0.6070776173931731
       name: Cosine Accuracy@1
     - type: cosine_precision@1
+      value: 0.6070776173931731
       name: Cosine Precision@1
     - type: cosine_recall@1
+      value: 0.588632794022045
       name: Cosine Recall@1
     - type: cosine_ndcg@10
+      value: 0.7755359823507149
       name: Cosine Ndcg@10
     - type: cosine_mrr@1
+      value: 0.6070776173931731
       name: Cosine Mrr@1
     - type: cosine_map@100
+      value: 0.7291245351244533
       name: Cosine Map@100
     - type: cosine_auc_precision_cache_hit_ratio
+      value: 0.348058858138603
       name: Cosine Auc Precision Cache Hit Ratio
     - type: cosine_auc_similarity_distribution
+      value: 0.21125989323367672
       name: Cosine Auc Similarity Distribution
 ---
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
+# tensor([[1.0000, 0.9609, 0.4414],
+#         [0.9609, 1.0000, 0.4395],
+#         [0.4414, 0.4395, 1.0000]], dtype=torch.bfloat16)
 ```
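
The updated similarity matrix separates the first two inputs (0.9609) from the third (about 0.44) far more clearly than the previous output did. A minimal sketch of how such scores could drive a semantic-cache lookup follows. The matrix values are copied from the output above; the `THRESHOLD` value and the `best_cache_hit` helper are hypothetical and not part of this repository.

```python
# Pairwise cosine similarities copied from the example output above.
similarities = [
    [1.0000, 0.9609, 0.4414],
    [0.9609, 1.0000, 0.4395],
    [0.4414, 0.4395, 1.0000],
]

# Hypothetical cut-off for treating two inputs as the same query.
# The model card does not recommend a value; tune it on your own data.
THRESHOLD = 0.9


def best_cache_hit(row_index, matrix, threshold):
    """Return the index of the most similar *other* entry at or above
    the threshold, or None when no entry qualifies (a cache miss)."""
    best_index, best_score = None, threshold
    for j, score in enumerate(matrix[row_index]):
        if j != row_index and score >= best_score:
            best_index, best_score = j, score
    return best_index


hits = [best_cache_hit(i, similarities, THRESHOLD) for i in range(len(similarities))]
print(hits)  # [1, 0, None]
```

With this assumed threshold, inputs 0 and 1 would resolve to each other and input 2 would be a miss.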
 <!--
 | Metric                               | Value      |
 |:-------------------------------------|:-----------|
+| cosine_accuracy@1                    | 0.6071     |
+| cosine_precision@1                   | 0.6071     |
+| cosine_recall@1                      | 0.5886     |
+| **cosine_ndcg@10**                   | **0.7755** |
+| cosine_mrr@1                         | 0.6071     |
+| cosine_map@100                       | 0.7291     |
+| cosine_auc_precision_cache_hit_ratio | 0.3481     |
+| cosine_auc_similarity_distribution   | 0.2113     |
 <!--
 ## Bias, Risks and Limitations
   }
   ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 100
+- `per_device_eval_batch_size`: 100
+- `weight_decay`: 0.001
+- `adam_beta2`: 0.98
+- `adam_epsilon`: 1e-06
+- `max_steps`: 75000
+- `warmup_ratio`: 0.1
+- `load_best_model_at_end`: True
+- `optim`: stable_adamw
+- `ddp_find_unused_parameters`: False
+- `push_to_hub`: True
+- `hub_model_id`: redis/langcache-embed-experimental
+- `batch_sampler`: no_duplicates
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 100
+- `per_device_eval_batch_size`: 100
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 5e-05
+- `weight_decay`: 0.001
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.98
+- `adam_epsilon`: 1e-06
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 3.0
+- `max_steps`: 75000
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: True
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `parallelism_config`: None
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: stable_adamw
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: False
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: True
+- `resume_from_checkpoint`: None
+- `hub_model_id`: redis/langcache-embed-experimental
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `hub_revision`: None
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `liger_kernel_config`: None
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+- `router_mapping`: {}
+- `learning_rate_mapping`: {}
+</details>
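
The listed values imply a linear schedule with warmup: `lr_scheduler_type` is `linear`, `warmup_ratio` is 0.1, and `max_steps` is 75000, which takes precedence over `num_train_epochs` in the Hugging Face `Trainer`. The sketch below reproduces the shape of that schedule in plain Python under those assumptions. The `lr_at` helper is hypothetical, and the rounding of the warmup step count is an assumption rather than the library's exact code.

```python
# Values taken from the hyperparameter list above.
BASE_LR = 5e-05       # learning_rate
MAX_STEPS = 75000     # max_steps
WARMUP_RATIO = 0.1    # warmup_ratio

# Assumption: warmup length is the ratio applied to the total step count.
WARMUP_STEPS = round(MAX_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step for linear warmup
    followed by linear decay to zero (hypothetical helper)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = MAX_STEPS - step
    return BASE_LR * max(0.0, remaining / (MAX_STEPS - WARMUP_STEPS))


for step in (0, 3750, WARMUP_STEPS, 41250, MAX_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at step 7500 and reaches zero at step 75000.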
 ### Training Logs
+<details><summary>Click to expand</summary>
+| Epoch      | Step      | Training Loss | Validation Loss | test_cosine_ndcg@10 |
+|:----------:|:---------:|:-------------:|:---------------:|:-------------------:|
+| -1         | -1        | -             | -               | 0.6274              |
+| 0.0054     | 500       | 2.0433        | 0.5003          | 0.7156              |
+| 0.0108     | 1000      | 0.2913        | 0.3804          | 0.7423              |
+| 0.0162     | 1500      | 0.1876        | 0.3343          | 0.7526              |
+| 0.0217     | 2000      | 0.1484        | 0.3172          | 0.7528              |
+| 0.0271     | 2500      | 0.132         | 0.2945          | 0.7569              |
+| 0.0325     | 3000      | 0.1161        | 0.2822          | 0.7636              |
+| 0.0379     | 3500      | 0.1105        | 0.2918          | 0.7580              |
+| 0.0433     | 4000      | 0.1072        | 0.2820          | 0.7597              |
+| 0.0487     | 4500      | 0.1061        | 0.2483          | 0.7661              |
+| 0.0542     | 5000      | 0.0991        | 0.2671          | 0.7600              |
+| 0.0596     | 5500      | 0.0971        | 0.2843          | 0.7595              |
+| 0.0650     | 6000      | 0.0953        | 0.2448          | 0.7640              |
+| 0.0704     | 6500      | 0.1015        | 0.3021          | 0.7632              |
+| 0.0758     | 7000      | 0.0985        | 0.2744          | 0.7616              |
+| 0.0812     | 7500      | 0.1009        | 0.2764          | 0.7615              |
+| 0.0866     | 8000      | 0.0984        | 0.2865          | 0.7608              |
+| 0.0921     | 8500      | 0.0947        | 0.3062          | 0.7600              |
+| 0.0975     | 9000      | 0.0914        | 0.2997          | 0.7584              |
+| 0.1029     | 9500      | 0.0896        | 0.2484          | 0.7617              |
+| 0.1083     | 10000     | 0.0846        | 0.2850          | 0.7594              |
+| 0.1137     | 10500     | 0.0907        | 0.2896          | 0.7571              |
+| 0.1191     | 11000     | 0.0859        | 0.2657          | 0.7599              |
+| 0.1245     | 11500     | 0.0875        | 0.2509          | 0.7620              |
+| 0.1300     | 12000     | 0.0849        | 0.2728          | 0.7620              |
+| 0.1354     | 12500     | 0.0788        | 0.2707          | 0.7587              |
+| 0.1408     | 13000     | 0.0804        | 0.2985          | 0.7567              |
+| 0.1462     | 13500     | 0.0815        | 0.2526          | 0.7620              |
+| 0.1516     | 14000     | 0.0783        | 0.2441          | 0.7655              |
+| 0.1570     | 14500     | 0.0791        | 0.2707          | 0.7645              |
+| 0.1625     | 15000     | 0.0797        | 0.2781          | 0.7576              |
+| 0.1679     | 15500     | 0.077         | 0.2624          | 0.7595              |
+| 0.1733     | 16000     | 0.0742        | 0.2882          | 0.7620              |
+| 0.1787     | 16500     | 0.0739        | 0.2654          | 0.7630              |
+| 0.1841     | 17000     | 0.0695        | 0.2832          | 0.7607              |
+| 0.1895     | 17500     | 0.0726        | 0.2595          | 0.7627              |
+| 0.1949     | 18000     | 0.0739        | 0.2376          | 0.7653              |
+| 0.2004     | 18500     | 0.0751        | 0.2671          | 0.7652              |
+| 0.2058     | 19000     | 0.0717        | 0.3013          | 0.7595              |
+| 0.2112     | 19500     | 0.0696        | 0.2538          | 0.7671              |
+| 0.2166     | 20000     | 0.0659        | 0.2569          | 0.7612              |
+| 0.2220     | 20500     | 0.0669        | 0.2595          | 0.7648              |
+| 0.2274     | 21000     | 0.0679        | 0.2231          | 0.7664              |
+| 0.2328     | 21500     | 0.0657        | 0.2732          | 0.7636              |
+| 0.2383     | 22000     | 0.0703        | 0.2658          | 0.7674              |
+| 0.2437     | 22500     | 0.0636        | 0.2582          | 0.7676              |
+| 0.2491     | 23000     | 0.0688        | 0.2586          | 0.7682              |
+| 0.2545     | 23500     | 0.0598        | 0.2612          | 0.7675              |
+| 0.2599     | 24000     | 0.0664        | 0.2581          | 0.7655              |
+| 0.2653     | 24500     | 0.0621        | 0.2393          | 0.7642              |
+| 0.2708     | 25000     | 0.0641        | 0.2309          | 0.7673              |
+| 0.2762     | 25500     | 0.0624        | 0.2346          | 0.7700              |
+| 0.2816     | 26000     | 0.0595        | 0.2179          | 0.7671              |
+| 0.2870     | 26500     | 0.0605        | 0.2332          | 0.7664              |
+| 0.2924     | 27000     | 0.0609        | 0.2227          | 0.7678              |
+| 0.2978     | 27500     | 0.0621        | 0.2312          | 0.7688              |
+| 0.3032     | 28000     | 0.0626        | 0.2404          | 0.7680              |
+| 0.3087     | 28500     | 0.063         | 0.2429          | 0.7672              |
+| 0.3141     | 29000     | 0.0601        | 0.2275          | 0.7671              |
+| 0.3195     | 29500     | 0.0617        | 0.2235          | 0.7663              |
+| 0.3249     | 30000     | 0.0581        | 0.2370          | 0.7698              |
+| 0.3303     | 30500     | 0.06          | 0.2450          | 0.7652              |
+| 0.3357     | 31000     | 0.0591        | 0.2851          | 0.7638              |
+| 0.3411     | 31500     | 0.0585        | 0.2718          | 0.7664              |
+| 0.3466     | 32000     | 0.0563        | 0.2532          | 0.7664              |
+| 0.3520     | 32500     | 0.059         | 0.2330          | 0.7689              |
+| 0.3574     | 33000     | 0.0545        | 0.2158          | 0.7695              |
+| 0.3628     | 33500     | 0.0567        | 0.2263          | 0.7672              |
+| 0.3682     | 34000     | 0.0566        | 0.2338          | 0.7682              |
+| 0.3736     | 34500     | 0.0586        | 0.2244          | 0.7696              |
+| 0.3791     | 35000     | 0.0559        | 0.2474          | 0.7671              |
+| 0.3845     | 35500     | 0.053         | 0.2332          | 0.7687              |
+| 0.3899     | 36000     | 0.0507        | 0.2258          | 0.7679              |
+| 0.3953     | 36500     | 0.0527        | 0.2240          | 0.7712              |
+| 0.4007     | 37000     | 0.0545        | 0.2229          | 0.7700              |
+| 0.4061     | 37500     | 0.0558        | 0.2119          | 0.7704              |
+| 0.4115     | 38000     | 0.0538        | 0.2611          | 0.7693              |
+| 0.4170     | 38500     | 0.0549        | 0.2336          | 0.7686              |
+| 0.4224     | 39000     | 0.0501        | 0.2316          | 0.7687              |
+| 0.4278     | 39500     | 0.0497        | 0.2289          | 0.7697              |
+| 0.4332     | 40000     | 0.0512        | 0.2299          | 0.7683              |
+| 0.4386     | 40500     | 0.0511        | 0.2654          | 0.7704              |
+| 0.4440     | 41000     | 0.0498        | 0.2272          | 0.7731              |
+| 0.4495     | 41500     | 0.053         | 0.2327          | 0.7696              |
+| 0.4549     | 42000     | 0.0487        | 0.2380          | 0.7715              |
+| 0.4603     | 42500     | 0.0518        | 0.2230          | 0.7724              |
+| 0.4657     | 43000     | 0.0488        | 0.2249          | 0.7703              |
+| 0.4711     | 43500     | 0.0529        | 0.2452          | 0.7716              |
+| 0.4765     | 44000     | 0.0497        | 0.2341          | 0.7720              |
+| 0.4819     | 44500     | 0.0486        | 0.2480          | 0.7696              |
+| 0.4874     | 45000     | 0.0518        | 0.2349          | 0.7715              |
+| 0.4928     | 45500     | 0.0471        | 0.2237          | 0.7720              |
+| 0.4982     | 46000     | 0.0483        | 0.2299          | 0.7712              |
+| 0.5036     | 46500     | 0.0462        | 0.2184          | 0.7705              |
+| 0.5090     | 47000     | 0.0497        | 0.2335          | 0.7718              |
+| 0.5144     | 47500     | 0.05          | 0.2302          | 0.7697              |
+| 0.5198     | 48000     | 0.0488        | 0.2252          | 0.7701              |
+| 0.5253     | 48500     | 0.045         | 0.2291          | 0.7687              |
+| 0.5307     | 49000     | 0.048         | 0.2135          | 0.7698              |
+| 0.5361     | 49500     | 0.0442        | 0.2215          | 0.7704              |
+| 0.5415     | 50000     | 0.0479        | 0.2233          | 0.7707              |
+| 0.5469     | 50500     | 0.0464        | 0.2275          | 0.7713              |
+| 0.5523     | 51000     | 0.0454        | 0.2175          | 0.7717              |
+| 0.5578     | 51500     | 0.0477        | 0.2152          | 0.7719              |
+| 0.5632     | 52000     | 0.0463        | 0.2364          | 0.7701              |
+| 0.5686     | 52500     | 0.0433        | 0.2430          | 0.7736              |
+| 0.5740     | 53000     | 0.0454        | 0.2328          | 0.7708              |
+| 0.5794     | 53500     | 0.0472        | 0.2283          | 0.7722              |
+| 0.5848     | 54000     | 0.0447        | 0.2320          | 0.7720              |
+| 0.5902     | 54500     | 0.0445        | 0.2404          | 0.7689              |
+| 0.5957     | 55000     | 0.0429        | 0.2353          | 0.7693              |
+| 0.6011     | 55500     | 0.0422        | 0.2366          | 0.7722              |
+| 0.6065     | 56000     | 0.0436        | 0.2321          | 0.7720              |
+| 0.6119     | 56500     | 0.0453        | 0.2250          | 0.7723              |
+| 0.6173     | 57000     | 0.0431        | 0.2219          | 0.7733              |
+| 0.6227     | 57500     | 0.0421        | 0.2244          | 0.7723              |
+| 0.6281     | 58000     | 0.0434        | 0.2137          | 0.7728              |
+| 0.6336     | 58500     | 0.0416        | 0.2181          | 0.7743              |
+| 0.6390     | 59000     | 0.0412        | 0.2230          | 0.7717              |
+| 0.6444     | 59500     | 0.0436        | 0.2116          | 0.7737              |
+| 0.6498     | 60000     | 0.0404        | 0.2114          | 0.7736              |
+| 0.6552     | 60500     | 0.041         | 0.2095          | 0.7736              |
+| 0.6606     | 61000     | 0.0408        | 0.2079          | 0.7741              |
+| 0.6661     | 61500     | 0.0408        | 0.2040          | 0.7756              |
+| 0.6715     | 62000     | 0.0404        | 0.2098          | 0.7733              |
+| 0.6769     | 62500     | 0.0418        | 0.2105          | 0.7741              |
+| 0.6823     | 63000     | 0.0402        | 0.2081          | 0.7741              |
+| 0.6877     | 63500     | 0.0394        | 0.2120          | 0.7742              |
+| 0.6931     | 64000     | 0.0418        | 0.2129          | 0.7742              |
+| 0.6985     | 64500     | 0.0406        | 0.2145          | 0.7753              |
+| 0.7040     | 65000     | 0.0382        | 0.2257          | 0.7741              |
+| 0.7094     | 65500     | 0.0373        | 0.2250          | 0.7756              |
+| 0.7148     | 66000     | 0.0382        | 0.2269          | 0.7732              |
+| **0.7202** | **66500** | **0.0405**    | **0.2087**      | **0.7764**          |
+| 0.7256     | 67000     | 0.042         | 0.2114          | 0.7753              |
+| 0.7310     | 67500     | 0.0389        | 0.2138          | 0.7748              |
+| 0.7364     | 68000     | 0.0339        | 0.2084          | 0.7761              |
+| 0.7419     | 68500     | 0.0379        | 0.2090          | 0.7760              |
+| 0.7473     | 69000     | 0.0369        | 0.2161          | 0.7742              |
+| 0.7527     | 69500     | 0.0354        | 0.2226          | 0.7748              |
+| 0.7581     | 70000     | 0.0396        | 0.2191          | 0.7753              |
+| 0.7635     | 70500     | 0.0356        | 0.2195          | 0.7759              |
+| 0.7689     | 71000     | 0.0359        | 0.2182          | 0.7760              |
+| 0.7744     | 71500     | 0.0389        | 0.2187          | 0.7753              |
+| 0.7798     | 72000     | 0.0366        | 0.2194          | 0.7753              |
+| 0.7852     | 72500     | 0.0351        | 0.2198          | 0.7749              |
+| 0.7906     | 73000     | 0.038         | 0.2175          | 0.7754              |
+| 0.7960     | 73500     | 0.0378        | 0.2172          | 0.7756              |
+| 0.8014     | 74000     | 0.0376        | 0.2174          | 0.7754              |
+| 0.8068     | 74500     | 0.038         | 0.2176          | 0.7753              |
+| 0.8123     | 75000     | 0.0379        | 0.2174          | 0.7755              |
+* The bold row denotes the saved checkpoint.
+</details>
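
The log ends at step 75000 with the epoch column at 0.8123, so the run stopped before completing one pass over the training data. The arithmetic below estimates the steps per epoch and the NDCG@10 gain of the bold checkpoint over the pre-training baseline row. The numbers are copied from the table; because the epoch column is rounded to four decimals, the steps-per-epoch figure is only approximate, and the number of devices is not stated in the source.

```python
# Last row of the training log above.
final_step = 75000
final_epoch = 0.8123

# Approximate number of optimizer steps in one full epoch.
steps_per_epoch = final_step / final_epoch

# per_device_train_batch_size from the hyperparameter list. The device
# count is unknown, so this is a per-device figure only.
per_device_batch = 100
samples_per_epoch_per_device = steps_per_epoch * per_device_batch

# test_cosine_ndcg@10 before training (step -1) and at the bold row (step 66500).
baseline_ndcg = 0.6274
best_ndcg = 0.7764
gain = best_ndcg - baseline_ndcg

print(f"steps per epoch: about {steps_per_epoch:,.0f}")
print(f"samples per epoch per device: about {samples_per_epoch_per_device:,.0f}")
print(f"NDCG@10 gain at the saved checkpoint: {gain:.4f}")
```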
 ### Framework Versions
 - Python: 3.12.3

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:48e2a1a598b7d9d75eb9376d781d959044609905644916a262d6526ed13942e0
 size 789580328

 version https://git-lfs.github.com/spec/v1
+oid sha256:2304eb08c679af236a8a9179f0e08d478a4d1958873dd0990efbc0ca883decab
 size 789580328