Evaluation: The process of measuring how well an AI model performs on its intended task. Uses held-out test data, standardized benchmarks, or human judgment. Good evaluation is surprisingly hard: a model can ace benchmarks while failing at real-world tasks, or vice versa.
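As an illustration, here is a minimal sketch of held-out evaluation. The scikit-learn library, the iris dataset, and the logistic-regression model are illustrative choices, not part of the glossary entry; any model and metric could stand in.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a toy dataset and hold out 30% as a test set the model never sees during training.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Train on the training split only.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score on the held-out split; this estimates performance on data the model has not memorized.
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```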
Benchmark: A standardized test used to measure and compare the performance of AI models.
Activation function: A mathematical function applied to a neuron's weighted input to produce its output, introducing non-linearity into the network.
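For example, a short sketch of a single neuron with a few common activation functions. The particular functions shown (ReLU, sigmoid, tanh) and the `neuron` helper are illustrative assumptions, not something the glossary specifies.

```python
import math

# Three common activation functions (illustrative choices).
def relu(z: float) -> float:
    return max(0.0, z)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z: float) -> float:
    return math.tanh(z)

# A single neuron: weighted sum of inputs plus bias, then a non-linear activation.
def neuron(inputs, weights, bias, activation=relu):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # linear pre-activation
    return activation(z)                                     # non-linearity applied here

print(neuron([0.5, -1.2], [0.8, 0.3], 0.1))           # ReLU output
print(neuron([0.5, -1.2], [0.8, 0.3], 0.1, sigmoid))  # sigmoid output
```

Without the non-linear step, stacking many such neurons would still compute only a linear function of the inputs.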
Adam: An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp, by maintaining running estimates of both the mean and the uncentered variance of each parameter's gradient.
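Below is a minimal NumPy sketch of the standard Adam update rule (Kingma and Ba, 2015). The `adam_step` helper and the toy quadratic objective are illustrative assumptions; the hyperparameter defaults shown are the commonly used ones.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update step (standard update rule)."""
    m = beta1 * m + (1 - beta1) * grad       # running mean of gradients (momentum-like)
    v = beta2 * v + (1 - beta2) * grad**2    # running mean of squared gradients (RMSProp/AdaGrad-like scaling)
    m_hat = m / (1 - beta1**t)               # bias correction for the warm-up phase
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta = np.array([0.0])
m = v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
print(theta)  # approximately 3.0
```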
AGI (Artificial General Intelligence): A hypothetical AI system able to match or exceed human performance across a wide range of cognitive tasks, rather than excelling at only one narrow skill.
AI alignment: The research field focused on making sure AI systems do what humans actually want them to do.
AI safety: The broad field studying how to build AI systems that are safe, reliable, and beneficial.