LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.
To run the smallest LFM2-24B-A2B variant, you need at least 14 GB of RAM.
LFM2-24B-A2B models support tool use and are available in GGUF and MLX formats.
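
As a rough sanity check on the RAM figure, the sketch below estimates weight storage for a 24-billion-parameter model at a few bit widths. The bit widths and the helper name `weight_memory_gb` are illustrative assumptions, not a list of published variants, and the estimate ignores the KV cache and runtime overhead.

```python
def weight_memory_gb(total_params, bits_per_weight):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 24e9  # total parameter count stated above

# Bit widths are illustrative assumptions, not a list of published variants.
ESTIMATES = {bits: weight_memory_gb(TOTAL_PARAMS, bits) for bits in (4, 8, 16)}

for bits, gb in ESTIMATES.items():
    print(f"{bits}-bit weights: ~{gb:.0f} GB")
```

At about 4 bits per weight the weights alone come to roughly 12 GB, which is in the same range as the 14 GB figure once runtime overhead is added. Whether the smallest variant actually uses 4-bit weights is an assumption here.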

LFM2-24B-A2B is a general-purpose instruct model (without reasoning traces) with the following features:
| Property | LFM2-8B-A1B | LFM2-24B-A2B |
|---|---|---|
| Total parameters | 8.3B | 24B |
| Active parameters | 1.5B | 2.3B |
| Layers | 24 (18 conv + 6 attn) | 40 (30 conv + 10 attn) |
| Context length | 32,768 tokens | 32,768 tokens |
| Vocabulary size | 65,536 | 65,536 |
| Training precision | Mixed BF16/FP8 | Mixed BF16/FP8 |
| Training budget | 12 trillion tokens | 17 trillion tokens |
| License | LFM Open License v1.0 | LFM Open License v1.0 |
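
To make the table easier to read, the following sketch derives a few ratios from it: the share of parameters active per forward pass, the total layer count, and the number of convolution blocks per attention block. Only the numbers come from the table; the dictionary layout and function names are illustrative.

```python
# Figures copied from the comparison table above (parameters in billions).
MODELS = {
    "LFM2-8B-A1B": {"total_b": 8.3, "active_b": 1.5, "conv": 18, "attn": 6},
    "LFM2-24B-A2B": {"total_b": 24.0, "active_b": 2.3, "conv": 30, "attn": 10},
}


def summarize(models):
    """Derive simple ratios from the published table values."""
    rows = {}
    for name, m in models.items():
        rows[name] = {
            # Share of parameters used per forward pass, in percent.
            "active_pct": round(100 * m["active_b"] / m["total_b"], 1),
            "layers": m["conv"] + m["attn"],
            # Convolution blocks per attention block.
            "conv_per_attn": m["conv"] / m["attn"],
        }
    return rows


SUMMARY = summarize(MODELS)
for name, row in SUMMARY.items():
    print(name, row)
```

Both models keep three convolution blocks per attention block, while the active share drops from about 18% of parameters in LFM2-8B-A1B to under 10% in LFM2-24B-A2B.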

Supported languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish, and Portuguese.

Generation parameters:
- temperature: 0.1
- top_k: 50
- repetition_penalty: 1.05

Liquid recommends the following use cases:
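
To illustrate how the three generation parameters listed above interact, here is a minimal pure-Python sketch of one common ordering: repetition penalty first, then temperature scaling, then top-k filtering. It is a generic illustration and not the sampling code of any particular runtime; `next_token_probs` and `sample` are hypothetical helper names.

```python
import math
import random


def next_token_probs(logits, prev_tokens, temperature=0.1, top_k=50,
                     repetition_penalty=1.05):
    """Return {token_id: probability} after penalty, temperature, and top-k."""
    adjusted = list(logits)
    # Repetition penalty: make already-generated tokens slightly less likely.
    for t in set(prev_tokens):
        if adjusted[t] > 0:
            adjusted[t] /= repetition_penalty
        else:
            adjusted[t] *= repetition_penalty
    # Temperature: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in adjusted]
    # Top-k: keep only the k highest-scoring tokens.
    k = min(top_k, len(scaled))
    keep = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # Numerically stable softmax over the kept tokens.
    peak = max(scaled[i] for i in keep)
    weights = {i: math.exp(scaled[i] - peak) for i in keep}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def sample(probs, rng):
    """Draw one token id from the filtered distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # toy vocabulary of four tokens
probs = next_token_probs(toy_logits, prev_tokens=[0], top_k=2)
print(probs, sample(probs, random.Random(0)))
```

With a temperature as low as 0.1 the distribution is sharply peaked, so sampling behaves almost like greedy decoding, and the mild 1.05 penalty only nudges repeated tokens downward.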
LFM2 is a hybrid architecture that pairs efficient gated short convolution blocks with a small number of grouped query attention (GQA) blocks.

This design, developed through hardware-in-the-loop architecture search, gives LFM2 models fast prefill and decode at low memory cost. LFM2-24B-A2B applies this backbone in a Mixture of Experts (MoE) configuration: of its 24B total parameters, only 2.3B are active per forward pass, so per-token compute is close to that of a roughly 2B dense model while the model draws on far more capacity.
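
The gap between total and active parameters comes from routing each token to only a few experts. The sketch below shows generic top-k routing in pure Python. It is not LFM2's actual router: the expert count, the value of `k`, the scalar toy experts, and the function names are all illustrative assumptions.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda e: router_logits[e], reverse=True)[:k]
    peak = max(router_logits[e] for e in top)
    exp = {e: math.exp(router_logits[e] - peak) for e in top}
    total = sum(exp.values())
    return {e: v / total for e, v in exp.items()}


def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    weights = top_k_route(router_logits, k)
    out = sum(w * experts[e](x) for e, w in weights.items())
    return out, sorted(weights)


# Toy setup: four scalar "experts" that each scale their input (hypothetical).
EXPERTS = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
ROUTER_LOGITS = [0.1, 2.0, -1.0, 2.0]

y, used = moe_forward(1.0, EXPERTS, ROUTER_LOGITS, k=2)
print(y, used)
```

Only the experts in `used` are evaluated for this input; the others still have to be stored but contribute no compute, which is why active parameters rather than total parameters drive per-token cost.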
Across benchmarks including GPQA Diamond, MMLU-Pro, IFEval, IFBench, GSM8K, and MATH-500, quality improves log-linearly as the family scales from 350M to 24B total parameters. This range of nearly two orders of magnitude indicates that the LFM2 hybrid architecture follows predictable scaling behavior and does not hit a ceiling at small model sizes.
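
"Log-linear" here means that each multiplication of the parameter count adds a roughly constant amount of benchmark quality. The sketch below works out the size of the parameter range and what such a trend implies. The slope is a placeholder, not a measured value, and `predicted_gain` is a hypothetical helper.

```python
import math

SMALLEST_PARAMS = 350e6  # smallest model size named above
LARGEST_PARAMS = 24e9    # LFM2-24B-A2B total parameters


def predicted_gain(slope_per_decade, n_small, n_large):
    """Quality gain under score = a + slope * log10(params)."""
    return slope_per_decade * math.log10(n_large / n_small)


ratio = LARGEST_PARAMS / SMALLEST_PARAMS
decades = math.log10(ratio)

print(f"parameter ratio: {ratio:.1f}x, i.e. {decades:.2f} decades")
# Placeholder slope of 1.0 points per decade, for illustration only.
print(f"gain at 1.0 points/decade: {predicted_gain(1.0, SMALLEST_PARAMS, LARGEST_PARAMS):.2f}")
```

Under this model the gain between any two sizes depends only on their ratio, so a 10x increase from 350M would be expected to add the same amount as a 10x increase from 2.4B.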

LFM2 is provided under the custom LFM Open License v1.0.