:pencil: [Doc] New model, and prettify formats

README.md CHANGED

@@ -10,15 +10,17 @@ app_port: 23333
 ## HF-LLM-API
 Huggingface LLM Inference API in OpenAI message format.
 
+Project link: https://github.com/Hansimov/hf-llm-api
+
 ## Features
 
-- Available Models (2024/01/
-- `mixtral-8x7b`, `
-- Adaptive prompt templates for different models
+- Available Models (2024/01/22): [#5](https://github.com/Hansimov/hf-llm-api/issues/5)
+- `mistral-7b`, `mixtral-8x7b`, `nous-mixtral-8x7b`
+- Adaptive prompt templates for different models
 - Support OpenAI API format
 - Enable api endpoint via official `openai-python` package
 - Support both stream and no-stream response
-- Support API Key via both HTTP auth header and env variable (https://github.com/Hansimov/hf-llm-api/issues/4)
+- Support API Key via both HTTP auth header and env variable [#4](https://github.com/Hansimov/hf-llm-api/issues/4)
 - Docker deployment
 
 ## Run API service

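The API-key feature listed in the hunk above means the Hugging Face token can be supplied either as an HTTP auth header or through an environment variable (see [#4](https://github.com/Hansimov/hf-llm-api/issues/4)). As a minimal client-side sketch, assuming you export the token yourself under the name `HF_TOKEN` (the variable name is an illustrative choice, not something the project mandates):

```py
import os

from openai import OpenAI

# Read the token from the environment instead of hard-coding it; the client
# then sends it as an HTTP auth header. "HF_TOKEN" is an assumed variable name,
# and "sk-xxx" mirrors the non-auth placeholder used later in this README.
api_key = os.environ.get("HF_TOKEN", "sk-xxx")
client = OpenAI(base_url="http://127.0.0.1:23333", api_key=api_key)
```
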
@@ -60,7 +62,7 @@ sudo docker run -p 23333:23333 --env http_proxy="http://<server>:<port>" hf-llm-
 
 ### Using `openai-python`
 
-See: [examples/chat_with_openai.py](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)
+See: [`examples/chat_with_openai.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)
 
 ```py
 from openai import OpenAI

@@ -69,6 +71,8 @@ from openai import OpenAI
 base_url = "http://127.0.0.1:23333"
 # Your own HF_TOKEN
 api_key = "hf_xxxxxxxxxxxxxxxx"
+# use below as non-auth user
+# api_key = "sk-xxx"
 
 client = OpenAI(base_url=base_url, api_key=api_key)
 response = client.chat.completions.create(

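The hunk above ends mid-call; the next hunk's context line (`for chunk in response:`) indicates the example streams the reply. A minimal sketch of how the call might continue, with the model name, message, and stream handling as illustrative assumptions (the project's full version is [`examples/chat_with_openai.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)):

```py
from openai import OpenAI

# Setup repeated from the hunk above so the sketch runs on its own.
client = OpenAI(base_url="http://127.0.0.1:23333", api_key="hf_xxxxxxxxxxxxxxxx")

# Assumed arguments: the model name and messages are illustrative,
# not the project's exact example.
response = client.chat.completions.create(
    model="mixtral-8x7b",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
    stream=True,
)

# Print streamed tokens as they arrive.
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
print()
```
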
@@ -93,7 +97,7 @@ for chunk in response:
 
 ### Using post requests
 
-See: [examples/chat_with_post.py](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py)
+See: [`examples/chat_with_post.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py)
 
 
 ```py

@@ -104,7 +108,11 @@ import re
 
 # If running this service with a proxy, you might need to unset `http(s)_proxy`.
 chat_api = "http://127.0.0.1:23333"
-
+# Your own HF_TOKEN
+api_key = "hf_xxxxxxxxxxxxxxxx"
+# use below as non-auth user
+# api_key = "sk-xxx"
+
 requests_headers = {}
 requests_payload = {
     "model": "mixtral-8x7b",

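The payload above is truncated by the hunk boundary. Below is a minimal end-to-end sketch of the post-request flow, assuming the `requests` library, an OpenAI-style `/chat/completions` path, and SSE-style `data:` lines in the streamed body; all three are assumptions made for illustration, and the project's actual example is [`examples/chat_with_post.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py).

```py
import json

import requests  # assumed HTTP client; the project's example may use a different one

chat_api = "http://127.0.0.1:23333"
# Your own HF_TOKEN (or the non-auth placeholder "sk-xxx")
api_key = "hf_xxxxxxxxxxxxxxxx"

requests_headers = {"Authorization": f"Bearer {api_key}"}  # auth-header path from the Features list
requests_payload = {
    "model": "mixtral-8x7b",
    "messages": [{"role": "user", "content": "Hello, who are you?"}],
    "stream": True,
}

# Stream the response and print each content delta; the endpoint path and the
# "data: ..." framing are assumptions based on the OpenAI-compatible format.
with requests.post(
    f"{chat_api}/chat/completions",
    headers=requests_headers,
    json=requests_payload,
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        text = line.decode("utf-8")
        if text.startswith("data: ") and text != "data: [DONE]":
            chunk = json.loads(text[len("data: "):])
            delta = chunk["choices"][0]["delta"].get("content")
            if delta:
                print(delta, end="", flush=True)
print()
```

If the service exposes a different route (for example a `/v1/...` prefix), only the URL line needs to change.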