Foundation model: A large AI model trained on broad data that can be adapted to many different downstream tasks. GPT-4, Claude, and LLaMA are all foundation models. The term, coined by Stanford researchers, emphasizes that these models serve as a common foundation for a wide range of downstream applications.
Large language model (LLM): An AI model with billions of parameters, trained on massive text datasets to understand and generate natural language.
Pre-training: The initial, computationally expensive phase of training, in which a model learns general patterns from a massive, broad dataset.
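To make this concrete, here is a toy sketch of the self-supervised next-token objective most large language models are pre-trained with. The objective choice, the throwaway model, and the tensor shapes are all illustrative assumptions; the definition above does not prescribe them.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32

# A deliberately tiny stand-in for a real network: embed each token,
# then project back to a distribution over the vocabulary.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)

tokens = torch.randint(0, vocab_size, (8, 16))  # stand-in for a batch of text
logits = model(tokens[:, :-1])                  # predict a token from the one before it
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size),             # flatten (batch, seq) for cross-entropy
    tokens[:, 1:].reshape(-1),                  # targets are the inputs shifted by one
)
loss.backward()                                 # gradients drive the (very long) training loop
```

Real pre-training runs repeat this step over trillions of tokens, which is what makes the phase so expensive.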
Fine-tuning: The process of taking a pre-trained model and continuing to train it on a smaller, task-specific dataset to adapt it to a particular task or domain.
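As a rough sketch of what this looks like in practice, here is a minimal fine-tuning run using the Hugging Face Transformers Trainer API. The base model (distilbert-base-uncased), the IMDB sentiment dataset, and every hyperparameter below are illustrative assumptions, not a prescription.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # assumed base model, any pre-trained checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# A small task-specific dataset (IMDB sentiment, chosen purely as an example domain).
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="finetuned-model",
    num_train_epochs=1,               # a short run: we adapt the model, we don't retrain it
    per_device_train_batch_size=16,
    learning_rate=2e-5,               # small learning rate to avoid erasing what pre-training learned
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```

The key contrast with pre-training is scale: a few thousand labeled examples and one epoch, rather than trillions of tokens.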
Activation function: A mathematical function applied to a neuron's output that introduces non-linearity into the network, allowing it to learn patterns a purely linear model cannot.
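A quick NumPy sketch of three common activation functions; the specific choices here (ReLU, sigmoid, and tanh) are illustrative.

```python
import numpy as np

def relu(x):
    # Zeroes out negative inputs, passes positive inputs through unchanged.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes any real input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # values between 0 and 1
print(np.tanh(x))  # values between -1 and 1
```

Without such a function between layers, stacked linear layers collapse into a single linear transformation.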
Adam: An optimization algorithm that combines the advantages of two earlier methods, AdaGrad and RMSProp, adapting the learning rate for each parameter individually.
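As a sketch of the idea, here is a NumPy implementation of a single Adam update following Kingma & Ba (2015). The toy objective f(theta) = theta^2 and all hyperparameter values are illustrative.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moving averages of the gradient and its square."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2    # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1**t)               # bias correction for the early steps
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([5.0])
m = v = np.zeros_like(theta)
for t in range(1, 1001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(theta)  # close to 0
```

The division by the square root of the second-moment estimate is what gives each parameter its own effective learning rate.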
AGI (Artificial General Intelligence): A hypothetical AI system able to match or exceed human ability across virtually any cognitive task, rather than excelling only in a narrow domain.