Optimization: The process of finding the best set of model parameters by minimizing a loss function. Gradient descent and its variants (Adam, SGD with momentum, AdaFactor) are the workhorses. Good optimization is crucial: the same architecture can work brilliantly or fail completely depending on the optimizer and its settings.
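As a toy illustration of that sensitivity, here is a minimal Python sketch in which the same one-dimensional problem converges under one learning rate and diverges under a slightly larger one. The loss f(w) = w², the step sizes, and the iteration count are all made up for illustration:

```python
# Minimal sketch: the same toy problem converges or diverges
# depending only on the learning rate (illustrative values).

def grad(w):
    # Gradient of the toy loss f(w) = w**2.
    return 2 * w

for lr in (0.1, 1.1):  # one stable step size, one unstable
    w = 5.0
    for _ in range(20):
        w -= lr * grad(w)  # gradient descent update
    print(f"lr={lr}: final w = {w:.3g}")
# lr=0.1 drives w toward the minimum at 0; lr=1.1 grows without bound.
```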
Gradient Descent: The fundamental optimization algorithm used to train neural networks. It repeatedly updates the parameters in the direction of the negative gradient of the loss, i.e. the direction of steepest decrease.
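A minimal sketch of gradient descent on a toy least-squares problem; the data, learning rate, and step count here are invented for illustration:

```python
import numpy as np

# Toy linear regression: fit y = X @ w by minimizing mean squared error.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)

w = np.zeros(3)
lr = 0.1
for _ in range(200):
    grad = 2 / len(y) * X.T @ (X @ w - y)  # gradient of the MSE loss
    w -= lr * grad                         # step along the negative gradient
print(w)  # approaches true_w
```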
Adam: An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp.
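A sketch of the Adam update rule as published by Kingma and Ba (2015), with the paper's default hyperparameters; the toy usage at the bottom (minimizing f(w) = w² with an illustrative learning rate) is invented for demonstration:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (default hyperparameters from the original paper)."""
    m = beta1 * m + (1 - beta1) * grad           # first moment (momentum-like)
    v = beta2 * v + (1 - beta2) * grad**2        # second moment (RMSProp-like)
    m_hat = m / (1 - beta1**t)                   # bias correction
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w, m, v

# Usage on the toy loss f(w) = w**2, whose gradient is 2 * w.
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 1001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.1)
print(w)  # close to the minimum at 0
```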
Loss Function: A mathematical function that measures how far the model's predictions are from the correct answers.
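Two common concrete examples, sketched in NumPy with made-up numbers: mean squared error for regression and cross-entropy for classification:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: average squared gap between prediction and target.
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, probs, eps=1e-12):
    # Cross-entropy for one-hot targets: penalizes confident wrong predictions.
    return -np.mean(np.sum(y_true * np.log(probs + eps), axis=1))

print(mse(np.array([1.0, 2.0]), np.array([1.1, 1.8])))  # 0.025
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
onehot = np.array([[1, 0], [0, 1]])
print(cross_entropy(onehot, probs))                     # ~0.164, low loss
```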
Activation Function: A mathematical function applied to a neuron's output that introduces non-linearity into the network.
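Two widely used examples, sketched in NumPy (the sample inputs are illustrative): ReLU and the sigmoid:

```python
import numpy as np

def relu(x):
    # ReLU: zero for negative inputs, identity for positive ones.
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid: squashes any real number into the interval (0, 1).
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))     # [0. 0. 3.]
print(sigmoid(x))  # approximately [0.119, 0.5, 0.953]
```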
AGI: Artificial General Intelligence; a hypothetical AI system with human-level or broader competence across a wide range of cognitive tasks, rather than skill in one narrow domain.
AI Alignment: The research field focused on making sure AI systems do what humans actually want them to do.