gemma-4-12b

Description

The new Gemma 4 12B Unified reasoning model with image support

Stats

526.9K Downloads

54 stars

1 fork

Capabilities

Vision Input

Trained for tool use

Supports reasoning

Minimum system memory

7GB

Gemma 4 12B Unified

Gemma 4 12B's "Unified" label refers to its encoder-free design. Unlike other Gemma 4 models that use separate encoders for multimodal inputs, this model skips them entirely, projecting image patches directly into the LLM's embedding space via lightweight linear layers. All modalities feed into a single decoder-only transformer, cutting multimodal latency and enabling end-to-end fine-tuning in one pass.
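The encoder-free idea described above can be illustrated with a toy sketch. It is not the model's actual implementation: the patch size, channel count, embedding width, random weights, and the helper names `patchify` and `linear` are all hypothetical, chosen only to show image patches being mapped by a single linear layer into the same space as text embeddings and then concatenated into one sequence for a decoder.

```python
import random

def patchify(image, patch):
    """Split an H x W x C image (nested lists) into flattened patch vectors."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            flat = []
            for row in image[top:top + patch]:
                for pixel in row[left:left + patch]:
                    flat.extend(pixel)  # append the C channel values of this pixel
            patches.append(flat)
    return patches

def linear(vec, weight, bias):
    """Compute y = W x + b, with weight stored as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]

# Toy sizes (hypothetical, not the real model's dimensions).
PATCH, CHANNELS, D_MODEL = 2, 3, 8
rng = random.Random(0)
in_dim = PATCH * PATCH * CHANNELS

# One lightweight projection layer: no separate vision encoder.
W = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(D_MODEL)]
b = [0.0] * D_MODEL

# A 4 x 4 RGB image yields four 2 x 2 patches.
image = [[[rng.random() for _ in range(CHANNELS)] for _ in range(4)] for _ in range(4)]
image_tokens = [linear(p, W, b) for p in patchify(image, PATCH)]

# Stand-in for five text-token embedding lookups.
text_tokens = [[rng.uniform(-0.1, 0.1) for _ in range(D_MODEL)] for _ in range(5)]

# Both modalities share one sequence that a single decoder would consume.
sequence = image_tokens + text_tokens
print(len(image_tokens), len(sequence), len(sequence[0]))
```

Because the projection is just another set of weights in front of the decoder, gradients can flow through it during fine-tuning, which is one plausible reading of the "end-to-end fine-tuning in one pass" claim.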

The model supports a context length of 256K tokens and has vision capabilities.

Custom Fields

Special features defined by the model author

Enable Thinking: boolean (default=true)

Controls whether the model will think before replying.

Parameters

Custom configuration options included with this model

Reasoning Section Parsing

{ "enabled": true, "startString": "<|channel>thought", "endString": "<channel|>" }
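As a rough illustration of what this configuration implies, the sketch below splits raw model output into a reasoning part and a final answer using the two delimiter strings. How the hosting application actually applies these strings is not stated on this page, so the function name `split_reasoning` and its handling of missing or unterminated sections are assumptions.

```python
def split_reasoning(text, start="<|channel>thought", end="<channel|>"):
    """Return (reasoning, answer) from raw model output.

    Assumed behavior: if the start string is absent, everything is the answer;
    if the end string is missing (for example, mid-stream), everything after
    the start string is treated as reasoning.
    """
    begin = text.find(start)
    if begin == -1:
        return "", text.strip()
    body_start = begin + len(start)
    stop = text.find(end, body_start)
    if stop == -1:
        return text[body_start:].strip(), text[:begin].strip()
    reasoning = text[body_start:stop].strip()
    answer = (text[:begin] + text[stop + len(end):]).strip()
    return reasoning, answer

raw = "<|channel>thought\n2 + 2 is 4.<channel|>The answer is 4."
reasoning, answer = split_reasoning(raw)
print(repr(reasoning))
print(repr(answer))
```

Note that the start and end strings are not mirror images of each other, so they need to be copied exactly as configured rather than derived from one another.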

Temperature

Top K Sampling

Top P Sampling

0.95
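For readers unfamiliar with top-p (nucleus) sampling, here is a minimal sketch of the filtering step, assuming the 0.95 listed above is the Top P value. The function name `top_p_filter` and the toy probabilities are hypothetical, and real inference engines operate on logits over the full vocabulary and may combine this step with temperature and top-k in an engine-specific order.

```python
def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of most likely tokens whose total probability
    reaches top_p, then renormalize the survivors so they sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break
    return {token: p / mass for token, p in kept}

# Toy next-token distribution (hypothetical values).
probs = {"cat": 0.6, "dog": 0.3, "eel": 0.06, "fox": 0.04}
filtered = top_p_filter(probs)
print(sorted(filtered))
```

With these toy numbers the three most likely tokens already cover at least 0.95 of the probability mass, so the least likely token is dropped before sampling.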

Sources

The underlying model files this model uses

Based on

Hugging Face: lmstudio-community/gemma-4-12B-it-GGUF

GGUF