Mistral Released Small 4
TG AI News · March 17, 2026 at 8:36 AM
Mistral has released Small 4, and the results are disappointing. In the published benchmarks, the model loses to the September release of Qwen 3 Next, which has about two-thirds as many total parameters and half as many active ones. The model is multimodal, with a 256k-token context window. The architecture is the same DeepSeek V3 variant that was used in Large 3. It is available under Apache 2.0. The base model has not been released, but a head for speculative decoding has. Weights are provided in FP8 and NVFP4.