On Monday, Mistral AI announced a new AI language model called Mixtral 8x7B, a “mixture of experts” (MoE) model with open weights that reportedly matches OpenAI’s GPT-3.5 in performance. Others have claimed that achievement in the past, but this time AI heavyweights such as OpenAI’s Andrej Karpathy and Nvidia’s Jim Fan are taking it seriously. That means we’re closer to having a GPT-3.5-level AI assistant that can run freely and locally on our devices, given the right implementation.
Mistral, based in Paris and founded by Arthur Mensch, Guillaume Lample, and Timothée Lacroix, has risen rapidly in the AI space. It has been quickly raising venture capital to become a sort of French anti-OpenAI, championing smaller models with eye-catching performance. Most notably, Mistral’s models run locally with open weights that can be downloaded and used with fewer restrictions than closed AI models from OpenAI, Anthropic, or Google. (In this context, “weights” are the computer files that represent a trained neural network.)
Mixtral 8x7B can process a 32K token context window and works in French, German, Spanish, Italian, and English. It works much like ChatGPT, in that it can assist with compositional tasks, analyze data, troubleshoot software, and write programs. Mistral claims that it outperforms Meta’s much larger LLaMA 2 70B (70 billion parameter) large language model and that it matches or exceeds OpenAI’s GPT-3.5 on certain benchmarks, as seen in the chart below.