```diff
  from accelerate import Accelerator
  accelerator = Accelerator(
+     gradient_accumulation_steps=2,
  )
  dataloader, model, optimizer, scheduler = accelerator.prepare(
      dataloader, model, optimizer, scheduler
  )

  for batch in dataloader:
+     with accelerator.accumulate(model):
          inputs, targets = batch
          outputs = model(inputs)
          loss = loss_function(outputs, targets)
          accelerator.backward(loss)
          optimizer.step()
          scheduler.step()
          optimizer.zero_grad()
```
When performing gradient accumulation in a distributed setup, there are many opportunities for efficiency mistakes to occur. `Accelerator` provides a context manager that takes care of the details for you and ensures that the model trains correctly. Simply wrap the body of your training loop in the `Accelerator.accumulate` context manager, passing in the model you are training, and the gradients will accumulate and synchronize automatically whenever needed.
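As a concrete illustration, here is a minimal, self-contained sketch of the loop above. The toy dataset, model, optimizer, scheduler, and loss function are placeholder assumptions added only so the snippet runs end to end; the `Accelerator` calls themselves mirror the example.

```python
# Minimal runnable sketch of gradient accumulation with Accelerate.
# The toy data, model, optimizer, scheduler, and loss are illustrative
# assumptions; only the Accelerator usage mirrors the snippet above.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Placeholder regression setup standing in for your real training objects.
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
dataloader = DataLoader(dataset, batch_size=8)
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
loss_function = torch.nn.functional.mse_loss

# Accumulate gradients over 2 batches before each optimizer update.
accelerator = Accelerator(gradient_accumulation_steps=2)
dataloader, model, optimizer, scheduler = accelerator.prepare(
    dataloader, model, optimizer, scheduler
)

for batch in dataloader:
    # Inside accumulate(), the prepared optimizer skips its update (and, in a
    # distributed run, gradient synchronization is skipped) until the final
    # micro-batch of each accumulation window.
    with accelerator.accumulate(model):
        inputs, targets = batch
        outputs = model(inputs)
        loss = loss_function(outputs, targets)
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```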
To learn more, check out the related documentation:
- <a href="https://huggingface.co/docs/accelerate/usage_guides/gradient_accumulation" target="_blank">Performing gradient accumulation</a>
- <a href="https://huggingface.co/docs/accelerate/package_reference/accelerator#accelerate.Accelerator.accumulate" target="_blank">API reference</a>
- <a href="https://github.com/huggingface/accelerate/blob/main/examples/by_feature/gradient_accumulation.py" target="_blank">Example script</a>
- <a href="https://github.com/huggingface/accelerate/blob/main/examples/by_feature/automatic_gradient_accumulation.py" target="_blank">Performing automatic gradient accumulation example script</a>