Mistral Small is a 'knowledge-dense' 24B multimodal (image input) local model that supports up to a 128k-token context length.
To run the smallest quantization of Mistral Small, you need at least 14 GB of RAM.
Mistral Small models support vision input and are available in GGUF and MLX formats.
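
Since the model is multimodal, a quick way to exercise vision input locally is through an OpenAI-compatible endpoint such as the one LM Studio or llama.cpp's server exposes. The sketch below assumes such a server is running on port 1234 with a Mistral Small build loaded; the port, model identifier, and image path are placeholders, not taken from the official documentation.

```python
import base64
from openai import OpenAI

# Point the client at the local OpenAI-compatible server (assumed port).
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# Encode a local image as a data URL for the multimodal chat API.
with open("chart.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="mistral-small-3.2-24b-instruct",  # placeholder identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What trend does this chart show?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```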

Building upon Mistral Small 3 (2501) and Mistral Small 3.1 (2503), this Mistral Small generation adds state-of-the-art vision understanding and extends long-context capabilities up to 128k tokens without compromising text performance.
With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks. It is an instruction-finetuned version of Mistral-Small-3.1-24B-Base-2503.
Small-3.2 improves in the following categories:
- **Vision:** Enables the model to analyze images and provide insights based on visual content in addition to text.
- **Multilingual:** Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, and Farsi.
- **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON output (see the sketch after this list).
- **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities.
- **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window:** A 128k context window.
- **System Prompt:** Maintains strong adherence and support for system prompts.
- **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size.
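
As a hedged illustration of the agentic features above, the following sketch sends a tool schema through the same OpenAI-compatible chat API; the `get_weather` tool, port, and model identifier are hypothetical and only meant to show the request/response shape of native function calling.

```python
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# A hypothetical tool definition; the model decides whether to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistral-small-3.2-24b-instruct",  # placeholder identifier
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# When the model opts to call the tool, its arguments arrive as a JSON string.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```
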
MistralAI compares Mistral-Small-3.2-24B to Mistral-Small-3.1-24B-Instruct-2503. For more comparisons against other models of similar size, please check Mistral-Small-3.1's benchmarks.
| Model | WildBench v2 | Arena Hard v2 | IF (internal instruction following; accuracy) |
|---|---|---|---|
| Small 3.1 24B Instruct | 55.6% | 19.56% | 82.75% | 
| Small 3.2 24B Instruct | 65.33% | 43.1% | 84.78% |