gemma-3-4b

Public

State-of-the-art image + text input models from Google, built from the same research and tech used to create the Gemini models

97.9K Downloads

17 stars

1 fork

Capabilities

Vision Input

Minimum system memory

2GB

Tags

4B
gemma3

Last updated

Updated on May 24 by lmmy

README

gemma 3 4b it by google

Supports a context length of 128k tokens, with a maximum output of 8192 tokens.
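As a rough illustration of how these two limits interact, the sketch below estimates how many tokens can still be generated for a given prompt size. It assumes that "128k" means 128 * 1024 tokens and that the prompt and the generated output share the same context window; neither detail is stated in this README, and `output_budget` is a hypothetical helper, not part of any runtime.

```python
# Assumption: "128k" is read as 128 * 1024 = 131,072 tokens.
CONTEXT_LENGTH = 128 * 1024
# Maximum output length stated in the README.
MAX_OUTPUT = 8192


def output_budget(prompt_tokens: int,
                  context_length: int = CONTEXT_LENGTH,
                  max_output: int = MAX_OUTPUT) -> int:
    """Return how many tokens may still be generated for a prompt of this size.

    Assumes the prompt and the output share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = context_length - prompt_tokens
    # The budget is capped by the output limit and never goes below zero.
    return max(0, min(max_output, remaining))
```

For short prompts the output cap of 8192 is the binding limit; only prompts that nearly fill the context window reduce the budget further.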

Multimodal: accepts image input, with images normalized to 896 x 896 resolution.
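To get a feel for what this normalization implies for non-square inputs, the sketch below computes the per-axis scale factors needed to bring an image to 896 x 896. It assumes a plain resize to the target square with no cropping or padding, which this README does not specify; `resize_scale` is a hypothetical helper used only for illustration.

```python
# Target side length stated in the README.
TARGET = 896


def resize_scale(width: int, height: int, target: int = TARGET) -> tuple[float, float]:
    """Return (horizontal, vertical) scale factors for a resize to target x target.

    Assumes a direct resize without cropping or padding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return target / width, target / height
```

Unequal factors mean the aspect ratio changes under this assumption, so very wide or very tall images may lose detail along their longer axis.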

Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.

Requires the latest (currently beta) llama.cpp runtime.

Sources

The underlying model files this model uses