gemma-3-4b

Public

Description

State-of-the-art image + text input models from Google, built from the same research and technology used to create the Gemini models.

Stats

105.2K Downloads

21 stars

1 fork

Capabilities

Vision Input

Minimum system memory

2GB

Gemma 3 4B IT by Google

Supports a context length of 128k tokens, with a maximum output of 8192 tokens.
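As a rough illustration of how these two limits interact, the sketch below computes how many prompt tokens remain once room for the output is reserved. It rests on two assumptions that this page does not confirm: that "128k" means 128 * 1024 tokens, and that the prompt and the generated output share the same context window. The names `max_prompt_tokens` and `fits` are hypothetical helpers, not part of any runtime API.

```python
# Assumption: "128k" is read as 128 * 1024 tokens; the page does not say
# whether it means 128,000 or 131,072.
CONTEXT_LENGTH = 128 * 1024
MAX_OUTPUT_TOKENS = 8192  # maximum output stated on this page


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT_TOKENS) -> int:
    """Return the prompt budget when `reserved_output` tokens are kept free.

    Assumption: prompt and output tokens share one context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT_TOKENS")
    return CONTEXT_LENGTH - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT_TOKENS) -> bool:
    """Check whether a prompt of the given size leaves the reserved output room."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)


if __name__ == "__main__":
    print(max_prompt_tokens())
    print(fits(100_000))
```

Reserving the full 8192 output tokens is the conservative choice; passing a smaller `reserved_output` frees more of the window for the prompt when short replies are expected.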

Multimodal: accepts image input, with images normalized to 896 x 896 resolution.
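To give a feel for what normalizing to 896 x 896 implies for an input image, the sketch below computes the per-axis scale factors for a given source size. It assumes a plain stretch to the target square; this page does not say how the runtime handles aspect ratio (cropping, padding, or tiling are all possible), so treat it as an illustration only. `stretch_factors` is a hypothetical helper name.

```python
TARGET_SIDE = 896  # target resolution stated on this page


def stretch_factors(width: int, height: int) -> tuple[float, float]:
    """Return (x_scale, y_scale) for resizing to TARGET_SIDE x TARGET_SIDE.

    Assumption: a plain stretch that ignores aspect ratio; the actual
    preprocessing used by the runtime may differ.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return TARGET_SIDE / width, TARGET_SIDE / height


if __name__ == "__main__":
    # A 1792 x 896 image is halved horizontally and unchanged vertically.
    print(stretch_factors(1792, 896))
```

Unequal factors indicate that a non-square image would be distorted under this assumption, which is one reason to crop or pad images to a square before sending them.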

Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.

Requires the latest (currently beta) llama.cpp runtime.

Sources

The underlying model files this model uses

Based on

🤗 lmstudio-community/gemma-3-4b-it-GGUF

GGUF

🤗 lmstudio-community/gemma-3-4B-it-QAT-GGUF

GGUF

🤗 mlx-community/gemma-3-4b-it-qat-4bit