gemma-4-e2b-qat

Public

Gemma 4 E2B optimized with Quantization-Aware Training (QAT)

70 Downloads

Capabilities

Vision Input
Reasoning

Minimum system memory

4GB

Tags

5.1B
gemma4

README

Gemma 4 E2B QAT

Gemma 4 E2B QAT is the Quantization-Aware Training version of Gemma 4 E2B. It aims to keep quality close to that of the bfloat16 model while needing much less memory to load.

Gemma 4 is an open multimodal model family from Google DeepMind. It supports text and image input, text output, reasoning, long context, system prompts, and native tool use.

Gemma 4 E2B is one of the lightest Gemma 4 models. It is designed for efficient local use, and the QAT build makes it easier to run on tighter memory budgets.

Custom Fields

Special features defined by the model author

Enable Thinking: boolean (default=true)

Controls whether the model will think before replying.
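
When thinking is enabled, the raw output is expected to contain a reasoning section delimited by the start and end strings listed under Reasoning Section Parsing in the parameters. The following is a minimal sketch of how a client might separate that section from the final reply. The function name `split_reasoning` and the handling of unterminated sections are assumptions made for illustration, not the behavior of any particular runtime.

```python
# Delimiters copied from the "Reasoning Section Parsing" parameter on this page.
START = "<|channel>thought"
END = "<channel|>"


def split_reasoning(raw: str, start: str = START, end: str = END) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    Hypothetical helper: if no start marker is present, the whole text is
    treated as the answer. If the end marker is missing, everything after
    the start marker is treated as reasoning.
    """
    begin = raw.find(start)
    if begin == -1:
        return "", raw.strip()

    body_start = begin + len(start)
    finish = raw.find(end, body_start)
    if finish == -1:
        # Unterminated reasoning section (for example, a truncated generation).
        return raw[body_start:].strip(), raw[:begin].strip()

    reasoning = raw[body_start:finish].strip()
    answer = (raw[:begin] + raw[finish + len(end):]).strip()
    return reasoning, answer


if __name__ == "__main__":
    sample = "<|channel>thought\n2+2 is 4<channel|>The answer is 4."
    reasoning, answer = split_reasoning(sample)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Runtimes that already apply the parsing setting usually return the reasoning and the reply separately, in which case a helper like this is unnecessary.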

Parameters

Custom configuration options included with this model

Reasoning Section Parsing: { "enabled": true, "startString": "<|channel>thought", "endString": "<channel|>" }
Repeat Penalty: 1
Temperature: 1
Top K Sampling: 64
Top P Sampling: 0.95
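
To illustrate how the sampling defaults above interact, here is a rough sketch of temperature scaling followed by top-k and then top-p filtering over a list of logits. The function names `filter_candidates` and `sample_token` are hypothetical, and the order of the filters is an assumption: real inference engines may apply them in a different order or renormalize between steps.

```python
import math
import random


def filter_candidates(logits, temperature=1.0, top_k=64, top_p=0.95):
    """Return the token indices that survive top-k and top-p filtering.

    Illustrative only: probabilities are not renormalized after top-k,
    which some engines do before applying top-p.
    """
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept, probs


def sample_token(logits, temperature=1.0, top_k=64, top_p=0.95, rng=random):
    """Draw one token index from the filtered candidates."""
    kept, probs = filter_candidates(logits, temperature, top_k, top_p)
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


if __name__ == "__main__":
    toy_logits = [0.0, 5.0, 1.0, -2.0]
    print(sample_token(toy_logits))
```

With a temperature of 1 the logits are used unscaled, so in this sketch only the top-k and top-p limits narrow the distribution.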

Sources

The underlying model files this model uses