gemma-4-26b-a4b-qat

Public

Gemma 4 26B A4B optimized with Quantization-Aware Training (QAT)

2.3K Downloads

Capabilities

Vision Input
Reasoning

Minimum system memory

16GB

Tags

26B
gemma4

README

Gemma 4 26B A4B QAT

Gemma 4 26B A4B QAT is the Quantization-Aware Training (QAT) version of Gemma 4 26B A4B. It aims to keep quality close to that of the bfloat16 model while needing much less memory to load.
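
As a rough illustration of why a quantized build is lighter to load, the sketch below estimates weight-only memory for a 26B-parameter model at two precisions. The 4-bit figure is an assumption made for illustration only; this card does not state the bit width of the QAT build, and real usage also includes the KV cache, activations, and runtime overhead. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# 26B total parameters, taken from the model name and tags.
PARAMS = 26e9

# bfloat16 stores each weight in 16 bits.
bf16_gb = weight_memory_gb(PARAMS, 16)

# ASSUMPTION: 4 bits per weight, chosen only to illustrate the scale of the saving.
q4_gb = weight_memory_gb(PARAMS, 4)

print(f"bfloat16 weights: ~{bf16_gb:.0f} GB")
print(f"4-bit weights (assumed): ~{q4_gb:.0f} GB")
```

Under that assumption the weights shrink by a factor of four, which is in line with the 16GB minimum system memory listed for this model.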

Gemma 4 is an open multimodal model family from Google DeepMind. It supports text and image input, text output, reasoning, long context, system prompts, and native tool use.

Gemma 4 26B A4B uses an efficient architecture for scalable local deployment. The QAT build makes that setup lighter to load while behaving like the rest of the Gemma 4 family.

Custom Fields

Special features defined by the model author

Enable Thinking: boolean (default=true)

Controls whether the model will think before replying

Parameters

Custom configuration options included with this model

Reasoning Section Parsing: { "enabled": true, "startString": "<|channel>thought", "endString": "<channel|>" }
Repeat Penalty: 1
Temperature: 1
Top K Sampling: 64
Top P Sampling: 0.95
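
The Reasoning Section Parsing setting above gives a start and an end string that delimit the model's thinking in raw output. A minimal sketch of how a client could split raw text on those two strings follows; the function name `split_reasoning` and the sample text are hypothetical, and the hosting application may handle this differently.

```python
START = "<|channel>thought"  # startString from the parameters above
END = "<channel|>"           # endString from the parameters above


def split_reasoning(text: str, start: str = START, end: str = END) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    If no start marker is present, the whole text is treated as the answer.
    If the end marker is missing, everything after the start marker is
    treated as unfinished reasoning.
    """
    s = text.find(start)
    if s == -1:
        return "", text.strip()

    body_start = s + len(start)
    e = text.find(end, body_start)
    if e == -1:
        return text[body_start:].strip(), text[:s].strip()

    reasoning = text[body_start:e].strip()
    answer = (text[:s] + text[e + len(end):]).strip()
    return reasoning, answer


# Hypothetical raw output, for illustration only.
raw = "<|channel>thought\n2 + 2 is 4.<channel|>The answer is 4."
reasoning, answer = split_reasoning(raw)
print(reasoning)
print(answer)
```

Handling the missing end marker separately matters when output is streamed or cut off by a token limit, since the reasoning section may not be closed yet.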

Sources

The underlying model files this model uses