Clearer message
app.py CHANGED
@@ -26,13 +26,17 @@ def report_results():
     "Reports the results of a memory calculation to the model's discussion page, and opens a new tab to it afterwards"
     global MODEL_NAME, LIBRARY, TOKEN, USER_TOKEN
     api = HfApi(token=TOKEN)
-    results = calculate_memory(MODEL_NAME, LIBRARY, ["fp32", "fp16", "int8", "int4"], access_token=USER_TOKEN, raw=True)
+    results, data = calculate_memory(MODEL_NAME, LIBRARY, ["fp32", "fp16", "int8", "int4"], access_token=USER_TOKEN, raw=True)
+    minimum = data[0]
+
     USER_TOKEN = None
     post = f"""# Model Memory Requirements\n
+
+You will need about {minimum[1]} VRAM to load this model for inference, and {minimum[3]} VRAM to train it using Adam.
 
 These calculations were measured from the [Model Memory Utility Space](https://hf.co/spaces/hf-accelerate/model-memory-utility) on the Hub.
 
-The minimum recommended vRAM needed for this model
+The minimum recommended vRAM needed for this model assumes using [Accelerate or `device_map="auto"`](https://huggingface.co/docs/accelerate/usage_guides/big_modeling) and is denoted by the size of the "largest layer".
 When performing inference, expect to add up to an additional 20% to this, as found by [EleutherAI](https://blog.eleuther.ai/transformer-math/). More tests will be performed in the future to get a more accurate benchmark for each model.
 
 When training with `Adam`, you can expect roughly 4x the reported results to be used. (1x for the model, 1x for the gradients, and 2x for the optimizer).
@@ -105,7 +109,7 @@ def calculate_memory(model_name:str, library:str, options:list, access_token:str
     LIBRARY = library
 
     if raw:
-        return pd.DataFrame(data).to_markdown(index=False)
+        return pd.DataFrame(data).to_markdown(index=False), data
 
     results = [
         f'## {title}',