A French AI company that builds efficient, high-performance language models.
A French AI company that builds efficient, high-performance language models. Their Mixtral (mixture-of-experts) and Mistral 7B models showed that smaller, well-trained models can compete with much larger ones. Known for releasing open-weight models and competing effectively against Big Tech AI labs.
An architecture where multiple specialized sub-networks (experts) share a model, but only a few activate for each input.
AI models whose weights, code, and sometimes training data are publicly released for anyone to use, modify, and build upon.
A large AI model trained on broad data that can be adapted for many different tasks.
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
An optimization algorithm that combines the best parts of two other methods — AdaGrad and RMSProp.
Artificial General Intelligence.
Browse our complete glossary or subscribe to our newsletter for the latest AI news and insights.