A technique where a smaller 'student' model learns to mimic a larger 'teacher' model. The student trains on the teacher's outputs (typically its predicted probabilities) rather than on raw labels alone, capturing the teacher's learned knowledge in a more compact form. It is widely used to create efficient models that run on phones and edge devices.
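As a rough illustration of what "training on the teacher's outputs" can mean, the sketch below computes one common form of distillation loss: the KL divergence between temperature-softened teacher and student probabilities. The function names, the example logits, and the temperature of 2.0 are assumptions chosen for illustration, and the sketch omits details some formulations include, such as scaling by the squared temperature or mixing in a loss on the true labels.

```python
import math

def softmax(logits, temperature=1.0):
    # Divide by the temperature before normalizing; higher values flatten the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over the temperature-softened distributions.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]   # close to the teacher
poor_student = [0.5, 1.0, 4.0]   # ranks the classes the other way around

loss_good = distillation_loss(teacher, good_student)
loss_poor = distillation_loss(teacher, poor_student)
```

A student whose outputs track the teacher's gets a loss near zero, while one that disagrees gets a larger loss, which is the signal the student is trained to reduce.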
Running a trained model to make predictions on new data.
Reducing the precision of a model's numerical values, for example from 32-bit to 4-bit numbers, to cut memory use and speed up inference at the cost of some accuracy.
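To make the idea concrete, here is a minimal sketch of one simple scheme, symmetric quantization to signed 4-bit integers with a single scale factor. The function names and the sample weights are hypothetical, and production systems generally use more elaborate variants (per-channel or per-group scales, calibration data, and so on).

```python
def quantize_4bit(values):
    # Symmetric scheme: map the largest magnitude to 7, the top of the signed 4-bit range.
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp into [-8, 7].
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    # Recover approximate floating-point values from the integer codes.
    return [c * scale for c in codes]

# Hypothetical weights stored as ordinary floats.
weights = [0.82, -0.41, 0.05, -0.77, 0.30]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
```

Each restored value differs from the original by at most half the scale factor, which is the rounding error traded for the smaller storage size.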
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
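Two widely used examples are ReLU and the sigmoid. The sketch below applies each to the weighted sum computed by a single neuron; the `neuron` helper and the sample inputs are illustrative assumptions rather than any particular library's API.

```python
import math

def relu(x):
    # Passes positive values through unchanged and zeroes out negatives.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, activation):
    # Weighted sum of the inputs plus a bias, then the non-linear activation.
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(z)

# Hypothetical inputs and weights; the weighted sum here is -1.25.
relu_out = neuron([1.0, 2.0], [0.5, -1.0], 0.25, relu)
sigmoid_out = neuron([1.0, 2.0], [0.5, -1.0], 0.25, sigmoid)
```

Without the activation, stacking such neurons would only ever produce a linear function of the inputs, no matter how many layers were added.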
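An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp, by keeping running averages of both the gradient and its square to adapt the step size for each parameter.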
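This description matches the Adam optimizer. Assuming that is the intended term, a single-parameter update can be sketched as follows. The default hyperparameters are the commonly cited ones, while the toy quadratic objective, the step count, and the learning rate of 0.05 in the example are assumptions chosen only to show the update in action.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Running average of the gradient (first moment).
    m = beta1 * m + (1 - beta1) * grad
    # Running average of the squared gradient (second moment).
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction: both averages start at zero, so early estimates are too small.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Step in the direction of the averaged gradient, scaled by its typical magnitude.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

def minimize_quadratic(steps=2000, lr=0.05):
    # Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); the minimum is at x = 3.
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2 * (x - 3)
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x

result = minimize_quadratic()
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of how large the gradient is.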
Artificial General Intelligence.
The research field focused on making sure AI systems do what humans actually want them to do.