Training
Loss Function
A mathematical measure of how wrong a model’s predictions are during training.
A loss function converts model error into a number. The higher the loss, the worse the model is performing on a given example or batch. Training aims to minimize this value over time.
Different tasks use different loss functions. Classification models often use cross-entropy loss, regression models may use mean squared error, and language models optimize next-token prediction loss across sequences of tokens.
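The two common losses mentioned above can be computed by hand. This is a minimal sketch in pure Python; the function names and example numbers are illustrative, not from any particular library:

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-probability the model assigned to the correct class."""
    return -math.log(probs[target_index])

def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

# A confident, correct classification yields a low loss...
low = cross_entropy([0.9, 0.05, 0.05], target_index=0)   # ~0.105
# ...while a confident, wrong one yields a much higher loss.
high = cross_entropy([0.05, 0.9, 0.05], target_index=0)  # ~3.0

mse = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```

Note how cross-entropy punishes confident mistakes sharply: the model that put 90% probability on the wrong class pays roughly 30 times the loss of the one that put 90% on the right class.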
Why it matters: The loss function defines what "good performance" means during training. It is the signal the model uses to learn.
What Loss Functions Do
- Quantify error — turn prediction quality into a numeric objective
- Guide optimization — provide a target for gradient descent
- Enable comparison — track whether training is improving
- Shape behavior — different losses encourage different outcomes
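The "guide optimization" point can be made concrete with a tiny example: gradient descent fitting a one-parameter model `y = w * x` by repeatedly stepping against the gradient of the MSE loss. The data and learning rate here are hypothetical, chosen so the true answer is `w = 2`:

```python
# Hypothetical training data generated by the true relationship y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def mse(w):
    """MSE loss of the model y = w * x on the data above."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

w = 0.0     # start from a deliberately wrong parameter
lr = 0.05   # learning rate

for _ in range(200):
    # Analytic gradient of MSE with respect to w: (2/n) * sum((w*x - y) * x)
    grad = 2 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # step downhill on the loss surface

# w converges toward 2.0, and mse(w) toward 0.
```

Every real training loop is a scaled-up version of this: compute the loss, compute its gradient with respect to the parameters, and step the parameters in the direction that reduces the loss.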
It is important to watch both training loss and validation loss. If training loss keeps falling while validation loss rises, the model is likely overfitting: memorizing the training data rather than learning patterns that generalize.
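That divergence pattern can be detected mechanically. Below is a minimal sketch of such a check; the function name, the `patience` parameter, and the loss curves are all made up for illustration:

```python
def overfitting_signal(train_losses, val_losses, patience=3):
    """Return True if validation loss has risen for `patience` consecutive
    epochs while training loss kept falling -- the classic overfitting sign."""
    streak = 0
    for i in range(1, len(val_losses)):
        if val_losses[i] > val_losses[i - 1] and train_losses[i] < train_losses[i - 1]:
            streak += 1
            if streak >= patience:
                return True
        else:
            streak = 0
    return False

# Hypothetical curves: training loss keeps improving,
# but validation loss turns around after epoch 2.
train = [1.0, 0.7, 0.5, 0.40, 0.30, 0.25]
val   = [1.1, 0.8, 0.7, 0.75, 0.80, 0.85]
diverging = overfitting_signal(train, val)  # True
```

Frameworks typically expose this idea as early stopping: halt training, or restore the best checkpoint, once validation loss stops improving.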