gemma-3-12b

Public

State-of-the-art image + text input models from Google, built from the same research and tech used to create the Gemini models

379.7K Downloads

26 stars

1 fork

Capabilities

Vision Input

Minimum system memory

11GB

Tags

12B
gemma3

Last updated

Updated on July 3 by
lmmy

README

Gemma 3 12B IT by Google

Supports a context length of 128K tokens, with a maximum output of 8,192 tokens.

Multimodal: accepts image inputs, which are normalized to 896 × 896 resolution.
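The runtime performs this normalization automatically; for illustration only, a minimal sketch of the equivalent resize step using Pillow (the `normalize_image` helper and the use of Lanczos resampling are assumptions, not part of Gemma's actual preprocessing pipeline):

```python
from PIL import Image

# Gemma 3 image inputs are normalized to 896 x 896.
TARGET_RESOLUTION = (896, 896)

def normalize_image(img: Image.Image) -> Image.Image:
    """Resize an image to the model's 896 x 896 input resolution.

    Hypothetical helper: the real runtime (e.g. llama.cpp) does this
    internally; resampling filter choice here is an assumption.
    """
    return img.resize(TARGET_RESOLUTION, Image.LANCZOS)

# Usage with a synthetic 1920x1080 image
sample = Image.new("RGB", (1920, 1080), color=(40, 120, 200))
normalized = normalize_image(sample)
print(normalized.size)  # (896, 896)
```

Note that a plain resize does not preserve aspect ratio; non-square images are stretched to fit the square input.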

Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.

Requires the latest (currently beta) llama.cpp runtime.

Sources

The underlying model files this model uses.