hanhainebula committed
Commit ccc1c81 · verified · 1 Parent(s): 081f868

Upload folder using huggingface_hub

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "word_embedding_dimension": 4096,
+     "pooling_mode_cls_token": false,
+     "pooling_mode_mean_tokens": false,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false,
+     "pooling_mode_weightedmean_tokens": false,
+     "pooling_mode_lasttoken": true,
+     "include_prompt": true
+ }
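The config above enables only `pooling_mode_lasttoken`, i.e. the sentence embedding is the hidden state of the last non-padding token. A minimal framework-free sketch of that idea, using toy 2-dim vectors instead of the real 4096-dim hidden states:

```python
def last_token_pool(token_embeddings, attention_mask):
    """Pick the vector of the last non-padding token in each sequence.

    token_embeddings: per-sequence lists of token vectors.
    attention_mask:   per-sequence lists of 0/1 flags (1 = real token).
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(vectors[last])
    return pooled

# Toy batch: two sequences of 3 tokens each (the real model uses 4096-dim vectors)
embeddings = [
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    [[4.0, 4.0], [5.0, 5.0], [6.0, 6.0]],
]
mask = [[1, 1, 1], [1, 1, 0]]  # second sequence ends in one padding token
print(last_token_pool(embeddings, mask))  # [[3.0, 3.0], [5.0, 5.0]]
```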
README.md CHANGED
@@ -2,6 +2,7 @@
  tags:
  - feature-extraction
  - sentence-similarity
+ - sentence-transformers
  - transformers
  license: apache-2.0
  ---
@@ -54,6 +55,40 @@ print(similarity)
  ```
 
+ ### Using Sentence Transformers
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ import torch
+
+ # Load the model, optionally in float16 precision for faster inference
+ model = SentenceTransformer("BAAI/bge-reasoner-embed-qwen3-8b-0923", model_kwargs={"torch_dtype": torch.float16})
+
+ queries = [
+     # taken from BRIGHT TheoT dataset, qid: examples-TheoremQA_wenhuchen/eigen_value1.json
+     "Imagine you have a magical box that transforms any object you put inside it, where the object is represented by the column vector x = (x_1, x_2). The box's transformation can be represented by the matrix A = [[5, 4], [1, 2]], so when given an object x, the box outputs the new object Ax. On some special objects, this new object is just a constant multiple of the original object, λx = (λx_1, λx_2). Find both possible values of λ where this occurs — note that these are the box's eigenvalues.",
+     # taken from BRIGHT TheoT dataset, qid: examples-TheoremQA_maxku/ipnetwork13-hammingdist.json
+     "Imagine you're comparing three digital images that are extremely simplified down to a grid of 5 pixels each, represented by either black (0) or white (1) pixels. The images are as follows: Image A: 00000, Image B: 10101, and Image C: 01010. By counting the number of pixels that differ between each pair of images, find the smallest number of differing pixels."
+ ]
+ documents = [
+     # taken from BRIGHT TheoT dataset, docid: 2723
+     "\\begin{definition}[Definition:Eigenvector/Linear Operator]\nLet $K$ be a field.\nLet $V$ be a vector space over $K$. \nLet $A : V \\to V$ be a linear operator.\nLet $\\lambda \\in K$ be an eigenvalue of $A$.\nA non-zero vector $v \\in V$ is an '''eigenvector corresponding to $\\lambda$''' {{iff}}:\n:$v \\in \\map \\ker {A - \\lambda I}$\nwhere: \n:$I : V \\to V$ is the identity mapping on $V$\n:$\\map \\ker {A - \\lambda I}$ denotes the kernel of $A - \\lambda I$.\nThat is, {{iff}}: \n:$A v = \\lambda v$\n\\end{definition}",
+     # taken from BRIGHT TheoT dataset, docid: 14101
+     "\\section{Error Correction Capability of Linear Code}\nTags: Linear Codes\n\n\\begin{theorem}\nLet $C$ be a linear code.\nLet $C$ have a minimum distance $d$.\nThen $C$ corrects $e$ transmission errors for all $e$ such that $2 e + 1 \\le d$.\n\\end{theorem}\n\n\\begin{proof}\nLet $C$ be a linear code whose master code is $V$.\nLet $c \\in C$ be a transmitted codeword.\nLet $v$ be the received word from $c$.\nBy definition, $v$ is an element of $V$.\nLet $v$ have a distance $e$ from $c$, where $2 e + 1 \\le d$.\nThus there have been $e$ transmission errors.\n{{AimForCont}} $c_1$ is a codeword of $C$, distinct from $c$, such that $\\map d {v, c_1} \\le e$.\nThen:\n{{begin-eqn}}\n{{eqn | l = \\map d {c, c_1}\n | o = \\le\n | r = \\map d {c, v} + \\map d {v, c_1}\n | c = \n}}\n{{eqn | o = \\le\n | r = e + e\n | c = \n}}\n{{eqn | o = <\n | r = d\n | c = \n}}\n{{end-eqn}}\nSo $c_1$ has a distance from $c$ less than $d$.\nBut $C$ has a minimum distance $d$.\nThus $c_1$ cannot be a codeword of $C$.\nFrom this contradiction it follows that there is no codeword of $C$ closer to $v$ than $c$.\nHence there is a unique codeword of $C$ which has the smallest distance from $v$.\nHence it can be understood that $C$ has corrected the transmission errors of $v$.\n{{Qed}}\n\\end{proof}\n\n"
+ ]
+
+ query_embeddings = model.encode(queries, prompt="Instruct: Given a Math problem, retrieve relevant theorems that help answer the problem.\nQuery: ")
+ document_embeddings = model.encode(documents)
+
+ # Compute the (cosine) similarity between the query and document embeddings
+ similarity = model.similarity(query_embeddings, document_embeddings)
+ print(similarity)
+ # tensor([[0.8228, 0.5391],
+ #         [0.5752, 0.6543]], dtype=torch.float16)
+ ```
+
+
  ### Using HuggingFace Transformers
  ```python
  import torch
@@ -134,6 +169,25 @@ BGE-Reasoner-Embed-Qwen3-8B-0923 exhibits strong performance in reasoning-intens
 
  <img src="./imgs/bright-performance.png" alt="BRIGHT Performance" style="zoom:200%;" />
 
+ Note:
+ - "**Avg - ALL**" refers to the average performance across **all 12 datasets** in the BRIGHT benchmark.
+ - "**Avg - SE**" refers to the average performance across the **7 datasets in the StackExchange subset** of the BRIGHT benchmark.
+ - "**Avg - CD**" refers to the average performance across the **2 datasets in the Coding subset** of the BRIGHT benchmark.
+ - "**Avg - MT**" refers to the average performance across the **3 datasets in the Theorem-based subset** of the BRIGHT benchmark.
+
+ > Sources of Results:
+ >
+ > [1] https://arxiv.org/pdf/2407.12883
+ >
+ > [2] https://arxiv.org/pdf/2504.20595
+ >
+ > [3] https://github.com/Debrup-61/RaDeR
+ >
+ > [4] https://seed1-5-embedding.github.io
+ >
+ > [5] https://arxiv.org/pdf/2508.07995
+ >
+ > *: results evaluated with our script
 
  ## Citation
 
config_sentence_transformers.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "prompts": {
+         "query": "Instruct: Given a query, retrieve documents that can help answer the query.\nQuery: ",
+         "document": ""
+     },
+     "default_prompt_name": null,
+     "similarity_fn_name": "cosine"
+ }
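The config above registers named prompts that sentence-transformers prepends to the input text when `encode(..., prompt_name="query")` is called (the document prompt is empty, so documents are encoded as-is). A framework-free sketch of that prepending behavior; the `apply_prompt` helper is illustrative, not part of the library:

```python
# Prompts as declared in config_sentence_transformers.json
prompts = {
    "query": "Instruct: Given a query, retrieve documents that can help answer the query.\nQuery: ",
    "document": "",
}

def apply_prompt(texts, prompt_name=None):
    """Illustrative helper: prepend the configured prompt before tokenization."""
    prefix = prompts[prompt_name] if prompt_name else ""
    return [prefix + t for t in texts]

queries = apply_prompt(["What is an eigenvalue?"], prompt_name="query")
documents = apply_prompt(["An eigenvalue satisfies Av = λv."])  # empty document prompt
print(queries[0].startswith("Instruct:"))  # True
```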
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+     {
+         "idx": 0,
+         "name": "0",
+         "path": "",
+         "type": "sentence_transformers.models.Transformer"
+     },
+     {
+         "idx": 1,
+         "name": "1",
+         "path": "1_Pooling",
+         "type": "sentence_transformers.models.Pooling"
+     }
+ ]
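modules.json declares a two-stage pipeline: the Transformer backbone produces per-token embeddings, then the Pooling module (configured in 1_Pooling/config.json) collapses them to a single vector. A framework-free sketch of running such a module list in `idx` order; the stub `fake_*` functions are illustrative stand-ins, not the real modules:

```python
def fake_transformer(texts):
    # Stub for the backbone: one 1-dim "vector" per whitespace token
    return [[[float(len(tok))] for tok in text.split()] for text in texts]

def fake_pooling(token_embeddings):
    # Stub for last-token pooling, matching 1_Pooling/config.json
    return [tokens[-1] for tokens in token_embeddings]

# Deliberately out of order to show that "idx" determines execution order
modules = [
    {"idx": 1, "name": "1", "apply": fake_pooling},
    {"idx": 0, "name": "0", "apply": fake_transformer},
]

def run_pipeline(texts, modules):
    features = texts
    for module in sorted(modules, key=lambda m: m["idx"]):
        features = module["apply"](features)
    return features

print(run_pipeline(["hello world"], modules))  # [[5.0]]
```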