1 Download
Capabilities
Minimum system memory
Tags
Last updated
Updated 9 days agobyForked from nvidia/nemotron-3-nano-4b
README
Custom Fields
Special features defined by the model author
Enable Thinking
: boolean
(default=true)
Controls whether the model will think before replying
Truncate Thinking History
: boolean
(default=true)
Controls whether thinking history will be truncated to save context space
Sources
The underlying model files this model uses
Based on
General purpose reasoning and chat model trained by NVIDIA and compressed from NVIDIA Nemotron Nano 9B v2. Contains 4B total parameters in a compact hybrid Mamba2-Transformer architecture designed for edge-ready inference.
Features a reasoning toggle to enable or disable intermediate reasoning traces, with improved accuracy on complex queries when reasoning is enabled. Includes native agentic capabilities for tool use, making it suitable for AI agents, local voice assistants, gaming NPCs, IoT automation, chatbots, and other AI-powered applications. Supports English.
Supports a context length of 256K tokens.