nemotron-3-nano

Public

General purpose reasoning and chat model trained from scratch by NVIDIA. Contains 30B total parameters with only 3.5B active at a time for low-latency MoE inference.

40.9K Downloads

16 stars

2 forks

Capabilities

Reasoning

Minimum system memory

25GB

Tags

30B
nemotron_h_moe

README

Nemotron 3 Nano by NVIDIA

General purpose reasoning and chat model trained from scratch by NVIDIA. Contains 30B total parameters with only 3.5B active at a time for low-latency MoE inference.

Features a reasoning toggle to enable or disable intermediate reasoning traces, with improved accuracy on complex queries when reasoning is enabled. Includes native agentic capabilities for tool use, making it suitable for AI agents, RAG systems, chatbots, and other AI-powered applications. Supports multiple languages including English, Spanish, French, German, Japanese, and Italian.

Supports a context length of 1M tokens.
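
As a minimal sketch of calling the model for the tool use described above, assuming it is served behind an OpenAI-compatible endpoint: the base URL, API key, model identifier, and the `get_weather` tool schema below are illustrative assumptions, not values from this card.

```python
# Minimal sketch: chat with tool use against an OpenAI-compatible local server.
# The base_url, api_key, model name, and tool definition are assumptions for
# illustration; substitute whatever your serving stack actually exposes.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for demonstration only
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="nemotron-3-nano",  # assumed identifier; check your local model list
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# If the model decides to call the tool, the call appears here.
print(response.choices[0].message.tool_calls)
```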

Custom Fields

Special features defined by the model author

Enable Thinking: boolean (default=true)

Controls whether the model will think before replying.

Truncate Thinking History: boolean (default=false)

Controls whether thinking history will be truncated to save context space.
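
How these fields are passed depends on the serving stack. The request keys below (`enable_thinking`, `truncate_thinking_history`) are assumed snake_case forms of the field names above, not confirmed keys; with the OpenAI Python client, server-specific fields can be forwarded via `extra_body`:

```python
# Sketch only: the exact request keys for these custom fields are not
# documented here; "enable_thinking" and "truncate_thinking_history" are
# assumed names. extra_body forwards extra fields to the server as-is.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="nemotron-3-nano",  # assumed identifier
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    extra_body={
        "enable_thinking": True,             # assumed key for "Enable Thinking"
        "truncate_thinking_history": False,  # assumed key for "Truncate Thinking History"
    },
)

print(response.choices[0].message.content)
```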

Parameters

Custom configuration options included with this model

Repeat Penalty: 1
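
A repeat penalty of 1 is the neutral setting: token logits are left unscaled, so no repetition penalty is applied. If your server accepts a `repeat_penalty` request field (a common llama.cpp-style sampler parameter; whether this deployment honors it is an assumption), the default can be overridden per request:

```python
# Sketch: overriding the default repeat penalty per request. "repeat_penalty"
# is a llama.cpp-style sampler field; support here is assumed, not confirmed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="nemotron-3-nano",  # assumed identifier
    messages=[{"role": "user", "content": "Write a short product description."}],
    extra_body={"repeat_penalty": 1.1},  # values above 1 discourage repetition
)

print(response.choices[0].message.content)
```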