gemma-3n

116.1K Downloads

Gemma 3n is a generative AI model optimized for use in everyday devices, such as phones, laptops, and tablets.

Models
2.95 GB
4.24 GB

Memory Requirements

To run the smallest gemma-3n model, you need at least 3 GB of RAM. The largest may require up to 4 GB.
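
As a rough illustration of these figures, the sketch below lists which variants fit in a given amount of free RAM. The 3 GB and 4 GB thresholds come from this page; the mapping of those thresholds to gemma-3n-e2b and gemma-3n-e4b, and the helper name `variants_that_fit`, are assumptions made for the example.

```python
# Approximate RAM needs in GB, taken from the figures quoted on this page.
# Assumption: the smallest variant is e2b and the largest is e4b.
RAM_NEEDED_GB = {
    "gemma-3n-e2b": 3.0,
    "gemma-3n-e4b": 4.0,
}


def variants_that_fit(free_ram_gb: float) -> list[str]:
    """Return the variants whose stated RAM need is within free_ram_gb."""
    return [name for name, need in RAM_NEEDED_GB.items() if need <= free_ram_gb]


if __name__ == "__main__":
    for free in (2.0, 3.5, 8.0):
        print(free, variants_that_fit(free))
```

Actual usage depends on the runtime, quantization, and context size, so treat the thresholds as a starting point rather than a guarantee.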

Capabilities

gemma-3n models support vision input. They are available in the GGUF and MLX formats.

About gemma-3n

Gemma 3n includes innovations in parameter-efficient processing, including Per-Layer Embedding (PLE) parameter caching and a MatFormer model architecture that provides the flexibility to reduce compute and memory requirements. These models handle audio input as well as text and visual data.

The Gemma 3n family includes two models:

  • gemma-3n-e2b (2B effective parameters)
  • gemma-3n-e4b (4B effective parameters)

Both models support a context length of 32k tokens.
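
A minimal sketch of budgeting a request against this window follows. It assumes "32k" means 32 × 1024 = 32,768 tokens, and the function name `remaining_budget` is hypothetical; token counts would come from whatever tokenizer your runtime uses.

```python
# Assumption: "32k" means 32 * 1024 = 32,768 tokens.
CONTEXT_LENGTH = 32 * 1024


def remaining_budget(prompt_tokens: int, max_new_tokens: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left in the window after the prompt and the planned reply.

    A negative result means the request would overflow the window.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return context_length - prompt_tokens - max_new_tokens
```

For example, a 30,000-token prompt with a 2,000-token reply leaves 768 tokens of headroom under this assumption.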

Key Capabilities

  • Optimized On-Device Performance & Efficiency: Gemma 3n starts responding approximately 1.5x faster on mobile than Gemma 3 4B, with significantly better quality and a reduced memory footprint achieved through innovations such as Per-Layer Embeddings (PLE), KV cache (KVC) sharing, and advanced activation quantization.

  • Many-in-1 Flexibility: A model with a 4B active memory footprint that natively includes a nested state-of-the-art 2B active memory footprint submodel (thanks to MatFormer training). This provides the flexibility to trade off performance and quality on the fly without hosting separate models. We further introduce a mix’n’match capability in Gemma 3n to dynamically create submodels from the 4B model that best fit your specific use case and its associated quality/latency tradeoff. Stay tuned for more on this research in our upcoming technical report.

  • Privacy-First & Offline Ready: Local execution enables features that respect user privacy and function reliably, even without an internet connection.

  • Expanded Multimodal Understanding with Audio: Gemma 3n can understand and process audio, text, and images, and offers significantly enhanced video understanding. Its audio capabilities enable the model to perform high-quality Automatic Speech Recognition.

  • Improved Multilingual Capabilities: Improved multilingual performance, particularly in Japanese, German, Korean, Spanish, and French. This is reflected in multilingual benchmarks, for example a score of 50.1% on WMT24++ (ChrF).
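
The nested-submodel idea described under Many-in-1 Flexibility can be illustrated with a toy feed-forward block in which a smaller submodel reuses a prefix of the larger block's hidden neurons. This is only a sketch of the general nesting principle, not Gemma 3n's actual implementation; the function `nested_ffn`, the ReLU activation, and the weight values are all hypothetical.

```python
def relu(x):
    return x if x > 0 else 0.0


def nested_ffn(x, w_in, w_out, hidden_units):
    """Feed-forward block that uses only the first `hidden_units` neurons.

    w_in:  one row of input weights per hidden neuron
    w_out: one row of output weights per hidden neuron
    """
    out = [0.0] * len(w_out[0])
    for row_in, row_out in zip(w_in[:hidden_units], w_out[:hidden_units]):
        h = relu(sum(a * b for a, b in zip(row_in, x)))
        for j, w in enumerate(row_out):
            out[j] += h * w
    return out


# Toy weights: 4 hidden neurons, 2 inputs, 2 outputs (values are arbitrary).
W_IN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
W_OUT = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, 0.0]]

x = [1.0, 2.0]
small = nested_ffn(x, W_IN, W_OUT, hidden_units=2)  # smaller nested submodel
full = nested_ffn(x, W_IN, W_OUT, hidden_units=4)   # full block, same weights
```

Both calls share one set of weights, which is why a single stored model can serve more than one size; choosing `hidden_units` is the quality/latency knob in this toy version.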

License

Gemma-3n models are provided under the custom Gemma license.