gemma-3-4b

Public

State-of-the-art image + text input models from Google, built from the same research and tech used to create the Gemini models

91.1K Downloads

16 stars

Capabilities

Vision Input

Minimum system memory

2GB

Tags

4B
gemma3

Last updated

Updated on May 24 by lmmy

README

Gemma 3 4B IT by Google

Supports a context length of 128k tokens, with a maximum output of 8,192 tokens.

Multimodal: image inputs are normalized to 896 x 896 resolution.

Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.

Requires the latest (currently beta) llama.cpp runtime.
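As a sketch of how the model might be run under the llama.cpp runtime once downloaded: the GGUF file names and the projector file below are assumptions for illustration, not the exact artifacts shipped with this listing; adjust them to the files you actually have.

```shell
# Hypothetical paths; substitute the GGUF files you downloaded.
# Text-only chat, with the context window sized to the 8192-token max output:
llama-cli -m gemma-3-4b-it-Q4_K_M.gguf -c 8192 \
  -p "Summarize the key capabilities of Gemma 3."

# Vision input uses llama.cpp's multimodal CLI plus the mmproj projector file;
# the runtime normalizes images to 896 x 896 before encoding:
llama-mtmd-cli -m gemma-3-4b-it-Q4_K_M.gguf \
  --mmproj mmproj-gemma-3-4b-it.gguf \
  --image photo.jpg \
  -p "Describe this image."
```

These commands are a command-line sketch only; they require the model files to be present locally, so they are not runnable as-is.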

Sources

The underlying model files this model uses