README

Nemotron 3 Nano 4B

General purpose reasoning and chat model trained by NVIDIA and compressed from NVIDIA Nemotron Nano 9B v2. Contains 4B total parameters in a compact hybrid Mamba2-Transformer architecture designed for edge-ready inference.

Features a reasoning toggle to enable or disable intermediate reasoning traces, with improved accuracy on complex queries when reasoning is enabled. Includes native agentic capabilities for tool use, making it suitable for AI agents, local voice assistants, gaming NPCs, IoT automation, chatbots, and other AI-powered applications. Supports English.

Supports a context length of 256K tokens.

Custom Fields

Special features defined by the model author

Enable Thinking

: boolean

(default=true)

Controls whether the model will think before replying

Truncate Thinking History

: boolean

(default=true)

Controls whether thinking history will be truncated to save context space

Sources

The underlying model files this model uses

nemotron-3-nano-4b

Nemotron 3 Nano 4B