hanhainebula committed
Commit ccc1c81 · verified · 1 Parent(s): 081f868

Upload folder using huggingface_hub

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "word_embedding_dimension": 4096,
+     "pooling_mode_cls_token": false,
+     "pooling_mode_mean_tokens": false,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false,
+     "pooling_mode_weightedmean_tokens": false,
+     "pooling_mode_lasttoken": true,
+     "include_prompt": true
+ }
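The config above enables only `pooling_mode_lasttoken`, i.e. the sentence embedding is the hidden state of the last non-padding token. A minimal framework-free sketch of that idea, using toy 2-dim vectors instead of the real 4096-dim hidden states:

```python
def last_token_pool(token_embeddings, attention_mask):
    """Pick the vector of the last non-padding token in each sequence.

    token_embeddings: per-sequence lists of token vectors.
    attention_mask:   per-sequence lists of 0/1 flags (1 = real token).
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(vectors[last])
    return pooled

# Toy batch: two sequences of 3 tokens each (the real model uses 4096-dim vectors)
embeddings = [
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    [[4.0, 4.0], [5.0, 5.0], [6.0, 6.0]],
]
mask = [[1, 1, 1], [1, 1, 0]]  # second sequence ends in one padding token
print(last_token_pool(embeddings, mask))  # [[3.0, 3.0], [5.0, 5.0]]
```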
README.md CHANGED
@@ -2,6 +2,7 @@
  tags:
  - feature-extraction
  - sentence-similarity
+ - sentence-transformers
  - transformers
  license: apache-2.0
  ---
@@ -54,6 +55,40 @@ print(similarity)
  ```
 
+ ### Using Sentence Transformers
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ import torch
+
+ # Load the model, optionally in float16 precision for faster inference
+ model = SentenceTransformer("BAAI/bge-reasoner-embed-qwen3-8b-0923", model_kwargs={"torch_dtype": torch.float16})
+
+ queries = [
+     # taken from BRIGHT TheoT dataset, qid: examples-TheoremQA_wenhuchen/eigen_value1.json
+     "Imagine you have a magical box that transforms any object you put inside it, where the object is represented by the column vector x = (x_1, x_2). The box's transformation can be represented by the matrix A = [[5, 4], [1, 2]], so when given an object x, the box outputs the new object Ax. On some special objects, this new object is just a constant multiple of the original object, λx = (λx_1, λx_2). Find both possible values of λ where this occurs — note that these are the box's eigenvalues.",
+     # taken from BRIGHT TheoT dataset, qid: examples-TheoremQA_maxku/ipnetwork13-hammingdist.json
+     "Imagine you're comparing three digital images that are extremely simplified down to a grid of 5 pixels each, represented by either black (0) or white (1) pixels. The images are as follows: Image A: 00000, Image B: 10101, and Image C: 01010. By counting the number of pixels that differ between each pair of images, find the smallest number of differing pixels."
+ ]
+ documents = [
+     # taken from BRIGHT TheoT dataset, docid: 2723
+     "\\begin{definition}[Definition:Eigenvector/Linear Operator]\nLet $K$ be a field.\nLet $V$ be a vector space over $K$. \nLet $A : V \\to V$ be a linear operator.\nLet $\\lambda \\in K$ be an eigenvalue of $A$.\nA non-zero vector $v \\in V$ is an '''eigenvector corresponding to $\\lambda$''' {{iff}}:\n:$v \\in \\map \\ker {A - \\lambda I}$\nwhere: \n:$I : V \\to V$ is the identity mapping on $V$\n:$\\map \\ker {A - \\lambda I}$ denotes the kernel of $A - \\lambda I$.\nThat is, {{iff}}: \n:$A v = \\lambda v$\n\\end{definition}",
+     # taken from BRIGHT TheoT dataset, docid: 14101
+     "\\section{Error Correction Capability of Linear Code}\nTags: Linear Codes\n\n\\begin{theorem}\nLet $C$ be a linear code.\nLet $C$ have a minimum distance $d$.\nThen $C$ corrects $e$ transmission errors for all $e$ such that $2 e + 1 \\le d$.\n\\end{theorem}\n\n\\begin{proof}\nLet $C$ be a linear code whose master code is $V$.\nLet $c \\in C$ be a transmitted codeword.\nLet $v$ be the received word from $c$.\nBy definition, $v$ is an element of $V$.\nLet $v$ have a distance $e$ from $c$, where $2 e + 1 \\le d$.\nThus there have been $e$ transmission errors.\n{{AimForCont}} $c_1$ is a codeword of $C$, distinct from $c$, such that $\\map d {v, c_1} \\le e$.\nThen:\n{{begin-eqn}}\n{{eqn | l = \\map d {c, c_1}\n | o = \\le\n | r = \\map d {c, v} + \\map d {v, c_1}\n | c = \n}}\n{{eqn | o = \\le\n | r = e + e\n | c = \n}}\n{{eqn | o = <\n | r = d\n | c = \n}}\n{{end-eqn}}\nSo $c_1$ has a distance from $c$ less than $d$.\nBut $C$ has a minimum distance $d$.\nThus $c_1$ cannot be a codeword of $C$.\nFrom this contradiction it follows that there is no codeword of $C$ closer to $v$ than $c$.\nHence there is a unique codeword of $C$ which has the smallest distance from $v$.\nHence it can be understood that $C$ has corrected the transmission errors of $v$.\n{{Qed}}\n\\end{proof}\n\n"
+ ]
+
+ query_embeddings = model.encode(queries, prompt="Instruct: Given a Math problem, retrieve relevant theorems that help answer the problem.\nQuery: ")
+ document_embeddings = model.encode(documents)
+
+ # Compute the (cosine) similarity between the query and document embeddings
+ similarity = model.similarity(query_embeddings, document_embeddings)
+ print(similarity)
+ # tensor([[0.8228, 0.5391],
+ #         [0.5752, 0.6543]], dtype=torch.float16)
+ ```
+
+
  ### Using HuggingFace Transformers
  ```python
  import torch
@@ -134,6 +169,25 @@ BGE-Reasoner-Embed-Qwen3-8B-0923 exhibits strong performance in reasoning-intens
 
  <img src="./imgs/bright-performance.png" alt="BRIGHT Performance" style="zoom:200%;" />
 
+ Note:
+ - "**Avg - ALL**" refers to the average performance across **all 12 datasets** in the BRIGHT benchmark.
+ - "**Avg - SE**" refers to the average performance across the **7 datasets in the StackExchange subset** of the BRIGHT benchmark.
+ - "**Avg - CD**" refers to the average performance across the **2 datasets in the Coding subset** of the BRIGHT benchmark.
+ - "**Avg - MT**" refers to the average performance across the **3 datasets in the Theorem-based subset** of the BRIGHT benchmark.
+
+ > Sources of Results:
+ >
+ > [1] https://arxiv.org/pdf/2407.12883
+ >
+ > [2] https://arxiv.org/pdf/2504.20595
+ >
+ > [3] https://github.com/Debrup-61/RaDeR
+ >
+ > [4] https://seed1-5-embedding.github.io
+ >
+ > [5] https://arxiv.org/pdf/2508.07995
+ >
+ > *: results evaluated with our script
 
  ## Citation
 
config_sentence_transformers.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "prompts": {
+         "query": "Instruct: Given a query, retrieve documents that can help answer the query.\nQuery: ",
+         "document": ""
+     },
+     "default_prompt_name": null,
+     "similarity_fn_name": "cosine"
+ }
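The config above registers named prompts that sentence-transformers prepends to the input text when `encode(..., prompt_name="query")` is called (the document prompt is empty, so documents are encoded as-is). A framework-free sketch of that prepending behavior; the `apply_prompt` helper is illustrative, not part of the library:

```python
# Prompts as declared in config_sentence_transformers.json
prompts = {
    "query": "Instruct: Given a query, retrieve documents that can help answer the query.\nQuery: ",
    "document": "",
}

def apply_prompt(texts, prompt_name=None):
    """Illustrative helper: prepend the configured prompt before tokenization."""
    prefix = prompts[prompt_name] if prompt_name else ""
    return [prefix + t for t in texts]

queries = apply_prompt(["What is an eigenvalue?"], prompt_name="query")
documents = apply_prompt(["An eigenvalue satisfies Av = λv."])  # empty document prompt
print(queries[0].startswith("Instruct:"))  # True
```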
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+     {
+         "idx": 0,
+         "name": "0",
+         "path": "",
+         "type": "sentence_transformers.models.Transformer"
+     },
+     {
+         "idx": 1,
+         "name": "1",
+         "path": "1_Pooling",
+         "type": "sentence_transformers.models.Pooling"
+     }
+ ]
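modules.json declares a two-stage pipeline: the Transformer backbone produces per-token embeddings, then the Pooling module (configured in 1_Pooling/config.json) collapses them to a single vector. A framework-free sketch of running such a module list in `idx` order; the stub `fake_*` functions are illustrative stand-ins, not the real modules:

```python
def fake_transformer(texts):
    # Stub for the backbone: one 1-dim "vector" per whitespace token
    return [[[float(len(tok))] for tok in text.split()] for text in texts]

def fake_pooling(token_embeddings):
    # Stub for last-token pooling, matching 1_Pooling/config.json
    return [tokens[-1] for tokens in token_embeddings]

# Deliberately out of order to show that "idx" determines execution order
modules = [
    {"idx": 1, "name": "1", "apply": fake_pooling},
    {"idx": 0, "name": "0", "apply": fake_transformer},
]

def run_pipeline(texts, modules):
    features = texts
    for module in sorted(modules, key=lambda m: m["idx"]):
        features = module["apply"](features)
    return features

print(run_pipeline(["hello world"], modules))  # [[5.0]]
```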