Derek Thomas
committed on
Commit 52ba5a5 · 1 Parent(s): 87e30f1
Adding more clarification
notebooks/03_get_embeddings.ipynb
CHANGED
|
@@ -8,6 +8,24 @@
|
|
| 8 |
"# Pre-requisites"
|
| 9 |
]
|
| 10 |
},
|
| 11 |
{
|
| 12 |
"cell_type": "markdown",
|
| 13 |
"id": "3102abce-ea42-4da6-8c98-c6dd4edf7f0b",
|
|
@@ -297,6 +315,7 @@
|
|
| 297 |
}
|
| 298 |
],
|
| 299 |
"source": [
|
| 300 |
"# Create a list of async tasks\n",
|
| 301 |
"tasks = [main(documents[i:i+MAX_WORKERS]) for i in range(0, len(documents), MAX_WORKERS)]\n",
|
| 302 |
"\n",
|
|
|
|
| 8 |
"# Pre-requisites"
|
| 9 |
]
|
| 10 |
},
|
| 11 |
+
{
|
| 12 |
+
"cell_type": "markdown",
|
| 13 |
+
"id": "5f625807-0707-4e2f-a0e0-8fbcdf08c865",
|
| 14 |
+
"metadata": {},
|
| 15 |
+
"source": [
|
| 16 |
+
"## Why TEI\n",
|
| 17 |
+
"There are 2 **unsung** challenges with RAG at scale:\n",
|
| 18 |
+
"1. Getting the embeddings efficiently\n",
|
| 19 |
+
"2. Efficient ingestion into the vector DB\n",
|
| 20 |
+
"\n",
|
| 21 |
+
"The issue with `1.` is that there are techniques but they are not widely *applied*. TEI solves a number of aspects:\n",
|
| 22 |
+
"- Token Based Dynamic Batching\n",
|
| 23 |
+
"- Using latest optimizations (Flash Attention, Candle and cuBLASLt)\n",
|
| 24 |
+
"- Fast loading with safetensors\n",
|
| 25 |
+
"\n",
|
| 26 |
+
"The issue with `2.` is that it takes a bit of planning. We won't go much into that side of things here though."
|
| 27 |
+
]
|
| 28 |
+
},
|
| 29 |
{
|
| 30 |
"cell_type": "markdown",
|
| 31 |
"id": "3102abce-ea42-4da6-8c98-c6dd4edf7f0b",
|
|
|
|
| 315 |
}
|
| 316 |
],
|
| 317 |
"source": [
|
| 318 |
+
"%%time\n",
|
| 319 |
"# Create a list of async tasks\n",
|
| 320 |
"tasks = [main(documents[i:i+MAX_WORKERS]) for i in range(0, len(documents), MAX_WORKERS)]\n",
|
| 321 |
"\n",
|
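The cell touched by the second hunk builds one coroutine per slice of `documents`, each slice holding at most `MAX_WORKERS` items. The sketch below illustrates that chunking pattern in isolation. It makes several assumptions that the diff does not confirm: `fake_main` is a hypothetical stand-in for the notebook's `main` (it makes no call to TEI), the value of `MAX_WORKERS` is arbitrary, and the use of `asyncio.gather` to run the tasks is a guess, since the diff does not show what happens to `tasks` afterwards.

```python
import asyncio

MAX_WORKERS = 3  # assumed value for illustration; the notebook defines its own


async def fake_main(batch):
    """Hypothetical stand-in for the notebook's `main`: one fake embedding per document."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [[float(len(doc))] for doc in batch]


async def embed_all(documents):
    # Same slicing as the notebook: one coroutine per chunk of at most MAX_WORKERS documents
    tasks = [
        fake_main(documents[i:i + MAX_WORKERS])
        for i in range(0, len(documents), MAX_WORKERS)
    ]
    # gather runs the coroutines concurrently and returns results in chunk order
    chunked = await asyncio.gather(*tasks)
    # Flatten the per-chunk results into one list aligned with `documents`
    return [emb for chunk in chunked for emb in chunk]


documents = ["a", "bb", "ccc", "dddd", "eeeee", "ffffff", "ggggggg"]
embeddings = asyncio.run(embed_all(documents))
```

Inside a notebook an event loop is already running, so `asyncio.run` would be replaced by a top-level `await embed_all(documents)`; that is also why a `%%time` cell magic can wrap the awaited call directly.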