Update README.md
README.md CHANGED
````diff
@@ -148,7 +148,7 @@ We will automatically find a batch size that fits in your GPU memory. The defaul
 
 ### Loading Huge Models
 
-Huge models such as LLaMA 65B or nllb-moe-54b can be
+Huge models such as LLaMA 65B or nllb-moe-54b can be loaded on a single GPU with 8-bit or 4-bit quantization with minimal performance degradation.
 See [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes). Set precision to 8 or 4 with the `--precision` flag.
 
 ```bash
````
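The ```bash example that follows this hunk is cut off in this view. Below is a purely illustrative sketch of how the `--precision` flag described above could be passed on the command line: only `--precision` itself, with values 8 or 4, is documented in the README text; the script name, model id, and file paths are placeholder assumptions, not taken from this diff.

```bash
# Hypothetical invocation: translate.py, the model id, and the file paths are
# placeholders, not confirmed by this diff. Only --precision (8 or 4) is documented
# above; it loads the model with BitsAndBytes 8-bit or 4-bit quantization.
python3 translate.py \
  --model_name facebook/nllb-moe-54b \
  --sentences_path input.txt \
  --output_path output.en2es.txt \
  --precision 8  # use 4 for 4-bit quantization
```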