Gemma 4 E2B QAT is the Quantization-Aware Training version of Gemma 4 E2B. It aims to keep quality close to that of the bfloat16 model while needing much less memory to load.
Gemma 4 is an open multimodal model family from Google DeepMind. It supports text and image input, text output, reasoning, long context, system prompts, and native tool use.
Gemma 4 E2B is one of the lightest Gemma 4 models. It is designed for efficient local use, and the QAT build makes it easier to run on tighter memory budgets.
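To give a rough sense of why a quantized build loads in less memory than bfloat16, here is a minimal back-of-the-envelope sketch. It counts weight storage only. The parameter count and the 4-bit width are hypothetical values chosen for illustration, not figures from this model card, and real memory use also includes activations, the KV cache, and runtime overhead.

```python
def weight_memory_gib(n_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, for illustration only (not from the model card).
N_PARAMS = 2_000_000_000

# bfloat16 stores each weight in 16 bits.
bf16_gib = weight_memory_gib(N_PARAMS, 16)

# Assumed 4-bit quantized weights; the actual bit width of the QAT build
# is not stated in the source.
quant_gib = weight_memory_gib(N_PARAMS, 4)

print(f"bfloat16 weights: {bf16_gib:.2f} GiB")
print(f"4-bit weights:    {quant_gib:.2f} GiB")
print(f"reduction:        {bf16_gib / quant_gib:.1f}x")
```

Under these assumptions the saving scales directly with the ratio of bit widths, so the same function can be reused with whatever parameter count and precision apply to the build you actually download.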
Custom Fields
Special features defined by the model author
Enable Thinking: boolean (default=true)
Controls whether the model will think before replying
Parameters
Custom configuration options included with this model