Commit f3ef13f (parent: 96a5535)
add vllm lastest image doc.

Files changed: README.md (+3 −7), README_CN.md (+4 −8)
README.md

````diff
@@ -227,9 +227,7 @@ We provide a pre-built Docker image containing vLLM 0.8.5 with full support for
 To get started:
 
 ```
-docker pull
-or
-docker pull hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm
+docker pull hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1
 ```
 
 Download Model file:
@@ -247,8 +245,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
@@ -265,8 +262,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
````
README_CN.md

The Chinese README receives the same image update; the added line translates as: "We provide a Docker image based on the official vLLM 0.8.5 release for quick deployment and testing. **Note: this image requires CUDA 12.4.**"

````diff
@@ -180,14 +180,12 @@ print(response)
 
 ### Docker 镜像
 
-
+我们提供了一个基于官方 vLLM 0.8.5 版本的 Docker 镜像方便快速部署和测试。**注意：该镜像要求使用 CUDA 12.4 版本。**
 
 快速开始方式如下：
 
 ```
-docker pull
-或
-docker pull hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm
+docker pull hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1
 ```
 
 下载模型文件：
@@ -203,8 +201,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
@@ -222,8 +219,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
````