Update README.md
--- a/README.md
+++ b/README.md
@@ -326,12 +326,12 @@ Run the benchmarks under `vllm` root folder:
 
 ### baseline
 ```Shell
-
+vllm bench latency --input-len 256 --output-len 256 --model microsoft/Phi-4-mini-instruct --batch-size 1
 ```
 
 ### INT4
 ```Shell
-VLLM_DISABLE_COMPILE_CACHE=1
+VLLM_DISABLE_COMPILE_CACHE=1 vllm bench latency --input-len 256 --output-len 256 --model pytorch/Phi-4-mini-instruct-INT4 --batch-size 1
 ```
 
 </details>