Transformers

Trainer

Trainer is a complete training and evaluation loop for Transformers models. You only need a model and dataset to get started.

Underneath, Trainer handles batching, shuffling, and padding your dataset into tensors. The training loop runs the forward pass, calculates loss, backpropagates gradients, and updates weights. Configure the training run with TrainingArguments to customize everything from batch size and training duration to distributed strategies, compilation, and more.

Next steps

Start with the fine-tuning tutorial for an introduction to training a large language model with Trainer.
Check the Subclassing Trainer methods guide for examples of how to subclass Trainer methods.
See the Data collators guide to learn how to create a data collator for custom batch assembly.
See the Callbacks guide to learn how to hook into training events.

Update on GitHub