Mistral AI
A French AI company known for efficient open-weight language models that often match or outperform larger models on standard benchmarks.
Key Models
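Mistral 7B: A 7-billion-parameter model reported to outperform LLaMA 2 13B on benchmarks despite being roughly half the size.
Mixtral 8x7B: A sparse mixture-of-experts model with benchmark results rivaling GPT-3.5.
Mistral Large: The company's proprietary frontier model.
Codestral: A model specialized for code generation.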
Innovations
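Mistral's models combine three techniques. Sliding window attention limits each token to attending over a fixed number of recent tokens, which keeps the cost of long contexts manageable. Grouped-query attention lets several query heads share one set of key and value heads, which shrinks the key-value cache and speeds up inference. The Mixture of Experts architecture, used in Mixtral, routes each token to only a subset of expert sub-networks, so only a fraction of the model's parameters is active per token.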
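The three techniques can be illustrated with a minimal sketch in plain Python. This is not Mistral's implementation. The function names, the window size, and the example router scores are all assumptions chosen for illustration. The head counts in the usage note (32 query heads, 8 key-value heads) and the choice of 2 experts out of 8 follow the published Mistral 7B and Mixtral 8x7B configurations.

```python
import math


def sliding_window_mask(seq_len, window):
    """Causal sliding-window mask (illustrative helper, not a library API).

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is not in the future and lies within the last `window` positions.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: map a query head to the KV head it shares."""
    group_size = n_q_heads // n_kv_heads  # query heads per KV head
    return q_head // group_size


def top_k_route(router_logits, k=2):
    """Sparse MoE routing sketch: keep the k highest-scoring experts and
    softmax-normalize their scores into mixing weights."""
    top = sorted(
        range(len(router_logits)),
        key=lambda e: router_logits[e],
        reverse=True,
    )[:k]
    peak = max(router_logits[e] for e in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[e] - peak) for e in top]
    total = sum(exps)
    return [(e, w / total) for e, w in zip(top, exps)]
```

With a window of 3, sliding_window_mask(5, 3) lets the last position attend only to positions 2, 3, and 4. With 32 query heads and 8 key-value heads, kv_head_for_query_head(5, 32, 8) shows that query head 5 reads from key-value head 1. Calling top_k_route on eight router scores returns two (expert index, weight) pairs whose weights sum to 1, so the other six experts are never evaluated for that token.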