Gemma 4 12B's "Unified" label refers to its encoder-free design. Unlike other Gemma 4 models that use separate encoders for multimodal inputs, this model skips them entirely, projecting image patches directly into the LLM's embedding space via lightweight linear layers. All modalities feed into a single decoder-only transformer, cutting multimodal latency and enabling end-to-end fine-tuning in one pass.
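To make the encoder-free idea concrete, here is a minimal sketch of projecting image patches into an embedding space with a single linear layer and then placing them in the same sequence as text embeddings. It is not the model's actual implementation: the function names (`patchify`, `linear`, `embed_image`), the toy sizes, and the random weights are all assumptions chosen for illustration.

```python
import random

def patchify(image, patch):
    """Split an H x W x C image (nested lists) into flattened patches, row-major."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            flat = []
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    flat.extend(image[y][x])  # append the C channel values
            patches.append(flat)
    return patches

def linear(vec, weight, bias):
    """Compute W @ vec + bias, with weight stored as a list of output rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weight, bias)]

def embed_image(image, patch, weight, bias):
    """Map each flattened patch to one embedding vector (no separate encoder)."""
    return [linear(p, weight, bias) for p in patchify(image, patch)]

# Toy sizes, chosen only to keep the example small (hypothetical values).
rng = random.Random(0)
H, W, C = 4, 4, 3
PATCH = 2
D_MODEL = 8

image = [[[rng.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]
in_dim = PATCH * PATCH * C
weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(D_MODEL)]
bias = [0.0] * D_MODEL

image_tokens = embed_image(image, PATCH, weight, bias)

# Stand-in for text token embeddings that a lookup table would normally produce.
text_tokens = [[rng.uniform(-0.1, 0.1) for _ in range(D_MODEL)] for _ in range(5)]

# Both modalities share one sequence that a decoder-only transformer would consume.
sequence = image_tokens + text_tokens
print(len(sequence), len(sequence[0]))
```

Because the only image-specific component in this sketch is the projection matrix, its weights could be updated by the same training pass as the decoder, which is the property the description above attributes to the unified design.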
Custom Fields
Special features defined by the model author
Enable Thinking: boolean (default=true)
Controls whether the model will think before replying
Parameters
Custom configuration options included with this model