Mistral announced a new model — Small 4, which is now available for use.
The Mistral Small 4 model is a multimodal system capable of performing three key tasks simultaneously: reasoning, agent coding, and image processing. Previously, separate models existed for each of these functions — Magistral, Devstral, and Pixtral. Now, all these capabilities are integrated into a single powerful checkpoint, significantly simplifying workflows and increasing efficiency.
Regarding technical features, Small 4 is built on a MoE architecture with 128 experts. Each token activates four experts simultaneously, ensuring high accuracy and processing speed. The total number of model parameters is 119 billion, with approximately six billion active parameters per token. The context window size is impressive — up to 256,000 tokens, allowing for modeling longer dialogues or complex scenarios.
Compared to the previous Small 3 version, improvements are noticeable: latency has decreased by about 40%, and throughput has tripled. One of the main innovations is the reasoning_effort setting. When set to none, the model operates in a fast mode for chats without deep reasoning (similar to Small 3.2). When set to high, it begins to generate chains of logical reasoning, nearly comparable to Magistral. Switching between modes can be done on the fly without changing the model.
What do tests show? In reasoning mode, Small 4 outperforms GPT-OSS 120B on the LiveCodeBench platform in quality while generating roughly 20% fewer tokens. On the AA LCR platform, the model scores 0.72 with responses about 1,600 characters long. In comparison: to achieve similar results with Qwen, input length needs to be between 5,800 and 6,100 characters.
To deploy the model on private servers, the minimum required setup is a four-GPU machine with NVIDIA HGX H100, or two HGX H200 units, or one DGX B200.
You can try the model for free via the official website build.nvidia.com, as well as through the Mistral API or AI Studio platform.
The model license is Apache 2.0.
Additional information can be found in the article and model set review.
This solution is ideal for those seeking a versatile multimodal system with powerful capabilities and flexible customization options.
Created with n8n:
https://cutt.ly/n8n
Created with syllaby:
https://cutt.ly/syllaby
