21K Downloads
The 20B variant of OpenAI's open source model. Apache 2.0 licensed.
Trained for tool use
Reasoning
Designed for lower latency and specialized or local deployment, the model has 21B total parameters with only 3.6B active at a time. Thanks to native MXFP4 quantization of the MoE layers, it runs efficiently and can operate within 16GB of memory.
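A back-of-envelope check on the memory claim: at roughly 4.25 bits per parameter (MXFP4 stores 4-bit values plus shared block scales; treating the whole 21B parameters at that rate is an assumption, since this page says only the MoE layers are natively MXFP4-quantized), the weights alone come to about 11 GB, leaving headroom inside 16GB for activations and the KV cache:

```python
def approx_weight_gb(total_params: float, bits_per_param: float) -> float:
    """Rough size in GB of the model weights alone."""
    return total_params * bits_per_param / 8 / 1e9

# 21B parameters at an assumed average of ~4.25 bits each.
print(round(approx_weight_gb(21e9, 4.25), 1))  # ~11.2 GB
```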
This model is released under a permissive Apache 2.0 license and features configurable reasoning effort (low, medium, or high), so users can balance output quality and latency based on their needs. The model offers full chain-of-thought visibility to support easier debugging and increased trust, though this output is not intended for end users. It is fully fine-tunable, enabling adaptation to specific tasks or domains, and includes native agentic capabilities such as function calling, web browsing, Python execution, and structured outputs.
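Reasoning effort can be selected per request when the model is served through an OpenAI-compatible endpoint such as LM Studio's local server. A minimal sketch of building such a request: the `reasoning_effort` field name, the `openai/gpt-oss-20b` model identifier, and the `http://localhost:1234/v1` base URL are assumptions based on common local-server conventions, not details stated on this page:

```python
import json
import urllib.request

def build_chat_request(prompt: str, effort: str = "low") -> dict:
    """Build an OpenAI-style chat completion payload.

    `effort` mirrors the model's configurable reasoning levels
    (low, medium, or high; the page lists low as the default).
    """
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unsupported reasoning effort: {effort}")
    return {
        "model": "openai/gpt-oss-20b",  # assumed identifier
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

def send(payload: dict, base_url: str = "http://localhost:1234/v1") -> dict:
    # POST to a local OpenAI-compatible server (e.g. LM Studio's).
    req = urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Raising effort trades latency for longer chains of thought, so "low" suits interactive use and "high" suits harder tasks.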
This model supports a context length of 131k tokens.
Special features defined by the model author
Reasoning Effort: select (default=low)
Controls how much reasoning the model should perform.
Based on
The underlying model files this model uses.
When you download this model, LM Studio picks the source that will best suit your machine (you can override this)