Feature(LLMLingua): update information
README.md
CHANGED
@@ -13,27 +13,47 @@ license: mit
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
-<div style="display: flex; align-items: center;
-<div style="width: 100px; margin-right: 10px; height:auto;" align="left">
-<img src="images/LLMLingua_logo.png" alt="LLMLingua" width="100" align="left">
-</div>
-<div style="flex-grow: 1;" align="center">
-<h2 align="center">LLMLingua
-</div>
+<div style="display: flex; align-items: center;">
+<div style="width: 100px; margin-right: 10px; height:auto;" align="left">
+<img src="images/LLMLingua_logo.png" alt="LLMLingua" width="100" align="left">
+</div>
+<div style="flex-grow: 1;" align="center">
+<h2 align="center">LLMLingua Series | Effectively Deliver Information to LLMs via Prompt Compression</h2>
+</div>
 </div>
 
 <p align="center">
-| <a href="https://
+| <a href="https://llmlingua.com/"><b>Project Page</b></a> |
+<a href="https://aclanthology.org/2023.emnlp-main.825/"><b>LLMLingua</b></a> |
+<a href="https://arxiv.org/abs/2310.06839"><b>LongLLMLingua</b></a> |
+<a href="https://arxiv.org/abs/2403."><b>LLMLingua-2</b></a> |
+<a href="https://huggingface.co/spaces/microsoft/LLMLingua"><b>LLMLingua Demo</b></a> |
+<a href="https://huggingface.co/spaces/microsoft/LLMLingua-2"><b>LLMLingua-2 Demo</b></a> |
 </p>
 
-##
-
-
-_Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu_
-
-[
-_Huiqiang Jiang, Qianhui Wu,
+## News
+
+- We're excited to announce the release of **LLMLingua-2**, boasting a 3x-6x speed improvement over LLMLingua! For more information, check out our [paper](https://arxiv.org/abs/2403.), visit the [project page](https://llmlingua.com/llmlingua-2.html), and explore our [demo](https://huggingface.co/spaces/microsoft/LLMLingua-2).
+- 🤳 Talk slides are available in [AI Time Jan, 24](https://drive.google.com/file/d/1fzK3wOvy2boF7XzaYuq2bQ3jFeP1WMk3/view?usp=sharing).
+- 🖥 EMNLP'23 slides are available in [Session 5](https://drive.google.com/file/d/1GxQLAEN8bBB2yiEdQdW4UKoJzZc0es9t/view) and [BoF-6](https://drive.google.com/file/d/1LJBUfJrKxbpdkwo13SgPOqugk-UjLVIF/view).
+- Check out our new [blog post](https://medium.com/@iofu728/longllmlingua-bye-bye-to-middle-loss-and-save-on-your-rag-costs-via-prompt-compression-54b559b9ddf7) discussing RAG benefits and cost savings through prompt compression. See the script example [here](https://github.com/microsoft/LLMLingua/blob/main/examples/Retrieval.ipynb).
+- Visit our [project page](https://llmlingua.com/) for real-world case studies in RAG, Online Meetings, CoT, and Code.
+- 👨‍🦯 Explore our ['./examples'](https://github.com/microsoft/LLMLingua/blob/main/examples) directory for practical applications, including [RAG](https://github.com/microsoft/LLMLingua/blob/main/examples/RAG.ipynb), [Online Meeting](https://github.com/microsoft/LLMLingua/blob/main/examples/OnlineMeeting.ipynb), [CoT](https://github.com/microsoft/LLMLingua/blob/main/examples/CoT.ipynb), [Code](https://github.com/microsoft/LLMLingua/blob/main/examples/Code.ipynb), and [RAG using LlamaIndex](https://github.com/microsoft/LLMLingua/blob/main/examples/RAGLlamaIndex.ipynb).
+- 👾 LongLLMLingua is now part of the [LlamaIndex pipeline](https://github.com/run-llama/llama_index/blob/main/llama_index/postprocessor/longllmlingua.py), a widely used RAG framework.
+
+## TL;DR
+
+LLMLingua utilizes a compact, well-trained language model (e.g., GPT2-small, LLaMA-7B) to identify and remove non-essential tokens in prompts. This approach enables efficient inference with large language models (LLMs), achieving up to 20x compression with minimal performance loss.
+
+- [LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models](https://aclanthology.org/2023.emnlp-main.825/) (EMNLP 2023)<br>
+_Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu_
+
+LongLLMLingua mitigates the "lost in the middle" issue in LLMs, enhancing long-context information processing. It reduces costs and boosts efficiency with prompt compression, improving RAG performance by up to 21.4% using only 1/4 of the tokens.
+
+- [LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression](https://arxiv.org/abs/2310.06839) (ICLR ME-FoMo 2024)<br>
+_Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang and Lili Qiu_
+
+LLMLingua-2, a small yet powerful prompt compression method trained via data distillation from GPT-4 for token classification with a BERT-level encoder, excels in task-agnostic compression. It surpasses LLMLingua in handling out-of-domain data, offering 3x-6x faster performance.
+
+- [LLMLingua-2: Context-Aware Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression](https://arxiv.org/abs/2403.) (Under Review)<br>
+_Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Ruhle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang_
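
The TL;DR above says LLMLingua uses a small language model to identify and drop non-essential prompt tokens. The core idea can be sketched in a few lines; here a unigram frequency table stands in for the small model, and `compress_prompt` and `keep_ratio` are illustrative names, not the library's actual API (the real package exposes a `PromptCompressor` class backed by model perplexity):

```python
# Toy sketch of perplexity-guided prompt pruning: score each token's
# information content, then keep only the most informative tokens up to
# a target budget. A unigram surprisal table replaces the small LM so the
# example stays self-contained and runnable.
import math
from collections import Counter


def compress_prompt(prompt: str, keep_ratio: float = 0.5) -> str:
    tokens = prompt.split()
    freq = Counter(t.lower() for t in tokens)
    total = sum(freq.values())
    # Surprisal -log p(t): frequent (low-information) tokens score low.
    surprisal = {t: -math.log(freq[t.lower()] / total) for t in tokens}
    budget = max(1, int(len(tokens) * keep_ratio))
    # Keep the `budget` most informative tokens, preserving original order
    # (Python's sort is stable, so ties keep their left-to-right position).
    keep = set(
        sorted(range(len(tokens)), key=lambda i: surprisal[tokens[i]], reverse=True)[:budget]
    )
    return " ".join(tokens[i] for i in sorted(keep))


# Repeated filler tokens carry little information and get pruned first.
print(compress_prompt("the the the the cat sat on the mat", keep_ratio=0.4))
# → cat sat on
```

The real method additionally budgets at the sentence and demonstration level and uses iterative token-level compression, which this sketch omits.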
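
LLMLingua-2, as described above, instead frames compression as binary token classification (keep vs. drop) with a BERT-level encoder trained on GPT-4-distilled labels. A minimal sketch of that formulation, with a hand-written scorer standing in for the trained encoder (`keep_probability`, `STOPWORDS`, and the threshold are illustrative assumptions):

```python
# Toy sketch of compression as token classification: a per-token scorer
# emits a keep-probability (in the paper, a sigmoid over encoder logits),
# and tokens clearing a threshold survive. The scorer below is a stand-in
# heuristic, not a trained model.
STOPWORDS = {"the", "a", "an", "of", "to", "is", "are", "and", "or", "in", "on"}


def keep_probability(token: str) -> float:
    # Stand-in for the classifier's per-token output.
    if token.lower() in STOPWORDS:
        return 0.1
    return min(1.0, 0.5 + 0.05 * len(token))


def compress(tokens: list[str], threshold: float = 0.5) -> list[str]:
    # Keep every token whose predicted keep-probability clears the threshold.
    return [t for t in tokens if keep_probability(t) >= threshold]


print(compress("the cat sat on the mat".split()))
# → ['cat', 'sat', 'mat']
```

Because each token is scored independently by an encoder rather than autoregressively, this formulation is what makes LLMLingua-2 task-agnostic and 3x-6x faster.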
|