Update README.md

Here are some of the optimized configurations we have added:

1. ONNX model for int4 CPU and mobile: runs on CPU and mobile devices using int4 quantization via RTN (round-to-nearest; see the sketch after this list).
2. ONNX model for int4 CUDA and DML GPU devices: runs on CUDA and DirectML GPUs using int4 quantization via RTN.
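
RTN (round-to-nearest) is about the simplest weight-quantization scheme: each block of weights gets one scale and zero-point, and every weight is scaled into the int4 range and rounded. Below is a minimal NumPy illustration of blockwise int4 RTN with block size 32 (matching the `block-32` in the variant names); it is a hypothetical sketch for intuition, not the quantizer used to produce these models.

```python
# Hypothetical sketch of blockwise int4 RTN quantization -- for intuition only,
# not the tool that produced these ONNX models.
import numpy as np

def rtn_quantize_int4(weights: np.ndarray, block_size: int = 32):
    """Quantize a flat float array to uint4 codes with per-block scale/zero-point."""
    w = weights.reshape(-1, block_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = np.maximum((w_max - w_min) / 15.0, 1e-8)  # int4 codes span 0..15
    zero_point = np.round(-w_min / scale)
    codes = np.clip(np.round(w / scale) + zero_point, 0, 15).astype(np.uint8)
    return codes, scale, zero_point

def rtn_dequantize(codes, scale, zero_point):
    """Map uint4 codes back to approximate float weights."""
    return (codes.astype(np.float32) - zero_point) * scale

w = np.random.randn(128).astype(np.float32)
codes, scale, zp = rtn_quantize_int4(w)
w_hat = rtn_dequantize(codes, scale, zp).ravel()
print("max abs error:", np.abs(w - w_hat).max())  # bounded by scale/2 per block
```
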
## Model Run

You can see how to run examples with ORT GenAI [here](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi-3-tutorial.md).

For CPU:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-mini-instruct-onnx --include cpu_and_mobile/* --local-dir .

# Install the CPU package of ONNX Runtime GenAI
pip install onnxruntime-genai

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4 -e cpu
```
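
If you would rather call the runtime directly than go through phi3-qa.py, the loop below is a minimal sketch of the onnxruntime-genai Python API. The exact surface has shifted between releases (`Generator.append_tokens` assumes a recent version), so treat phi3-qa.py as the maintained reference.

```python
import onnxruntime_genai as og

# Point this at whichever variant you downloaded above.
model = og.Model("cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4")
tokenizer = og.Tokenizer(model)

# Phi-style chat template; check the model card for the exact format.
prompt = "<|user|>\nWhat is ONNX Runtime?<|end|>\n<|assistant|>\n"

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode(prompt))
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```
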

For CUDA:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-mini-instruct-onnx --include gpu/* --local-dir .

# Install the CUDA package of ONNX Runtime GenAI
pip install onnxruntime-genai-cuda

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m gpu/gpu-int4-rtn-block-32 -e cuda
```

For DirectML:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-mini-instruct-onnx --include gpu/* --local-dir .

# Install the DirectML package of ONNX Runtime GenAI
pip install onnxruntime-genai-directml

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m gpu/gpu-int4-rtn-block-32 -e dml
```
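
Since the same `gpu/gpu-int4-rtn-block-32` folder serves both CUDA and DirectML, a quick way to compare execution providers is to time the generation loop. This is a rough sketch under the same API assumptions as above; `tokens_per_second` is a hypothetical helper, not part of this repo or of onnxruntime-genai.

```python
import time
import onnxruntime_genai as og

def tokens_per_second(model_dir: str, prompt: str, max_length: int = 256) -> float:
    """Rough end-to-end throughput estimate (hypothetical helper).

    Which execution provider actually runs is determined by the installed
    onnxruntime-genai package and the model's genai_config, not by this code.
    """
    model = og.Model(model_dir)
    tokenizer = og.Tokenizer(model)
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)
    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(prompt))
    start, generated = time.perf_counter(), 0
    while not generator.is_done():
        generator.generate_next_token()
        generated += 1
    return generated / (time.perf_counter() - start)

print(tokens_per_second("gpu/gpu-int4-rtn-block-32",
                        "<|user|>\nHello!<|end|>\n<|assistant|>\n"))
```
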

## Model Description
- Developed by: Microsoft
- Model type: ONNX