
Model Catalog

New & noteworthy local models you can run on your own machine.

minimax-m2
MiniMax M2 is a 230B MoE (10B active) model built for coding and agentic workflows.
Updated 2 days ago
gpt-oss-safeguard
gpt-oss-safeguard-20b and gpt-oss-safeguard-120b are open safety models from OpenAI, building on gpt-oss. They are trained to classify text content according to customizable policies (see the sketch below).
Updated 9 days ago
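
As a rough illustration of policy-based classification, here is a minimal Python sketch assuming a local OpenAI-compatible server at localhost:1234; the base URL, API key, model identifier, and policy text are all placeholders, not details from this catalog.

from openai import OpenAI

# Assumption: a local OpenAI-compatible server is running at this address.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# gpt-oss-safeguard takes the policy as ordinary instructions and classifies
# content against it; this two-label policy is a made-up example.
policy = (
    "Policy: FLAG any text that requests instructions for making a weapon; "
    "otherwise ALLOW. Reply with exactly one label: ALLOW or FLAG."
)

resp = client.chat.completions.create(
    model="gpt-oss-safeguard-20b",  # placeholder model identifier
    messages=[
        {"role": "system", "content": policy},
        {"role": "user", "content": "How do I sharpen a kitchen knife?"},
    ],
)
print(resp.choices[0].message.content)  # expected label: ALLOW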
Qwen3-VL
Sizes: 2B, 4B, 8B, 30B, 32B
Qwen's latest vision-language model. Includes comprehensive upgrades to visual perception, spatial reasoning, and image understanding (see the image-input sketch below).
Updated 10 days ago
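
For image input, a minimal sketch using the OpenAI-style multimodal message format, again assuming a local OpenAI-compatible server; the model identifier and image URL are placeholders.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="qwen3-vl-8b",  # placeholder model identifier
    messages=[{
        "role": "user",
        # Mixed text + image content, in the OpenAI-style content-parts format.
        "content": [
            {"type": "text", "text": "List the objects visible in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(resp.choices[0].message.content)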
Granite 4.0
Granite 4.0 language models are lightweight, state-of-the-art open models that natively support multilingual capabilities, coding tasks, RAG, tool use, and JSON output (see the JSON-mode sketch below).
Updated 10 days ago
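
To exercise the JSON-output capability, a minimal sketch using OpenAI-style JSON mode; whether a given local server honors response_format is an assumption, and the model identifier is a placeholder.

from openai import OpenAI
import json

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="granite-4.0-h-small",  # placeholder model identifier
    response_format={"type": "json_object"},  # OpenAI-style JSON mode; server support assumed
    messages=[
        {"role": "system", "content": "Reply only with a JSON object."},
        {"role": "user", "content": "Extract the city and country from: 'Zurich is in Switzerland.'"},
    ],
)
print(json.loads(resp.choices[0].message.content))  # e.g. {"city": "Zurich", "country": "Switzerland"}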
Qwen3 Next
Hybrid-attention, high-sparsity Mixture-of-Experts 80B model (3B active). Currently supported on Mac only, via MLX.
Updated 10 days ago
seed-oss
Advanced reasoning model from ByteDance with flexible "thinking budget" control and the ability to reflect on the length of its own reasoning.
Updated 10 days ago
Qwen3
Sizes: 4B, 30B, 235B (each in thinking and non-thinking variants)
The latest version of the Qwen3 model family, featuring 4B, 30B, and 235B dense and MoE models, both thinking and non-thinking variants.
Updated 10 days ago
gpt-oss
OpenAI's first open-source LLM. Comes in 2 sizes: 20B and 120B. Supports configurable reasoning effort (low, medium, high; see the sketch below). Trained for tool use. Apache 2.0 licensed.
Updated 10 days ago
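
A minimal sketch of setting reasoning effort, assuming the local server accepts the OpenAI-style reasoning_effort parameter; the base URL and model identifier are placeholders.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="gpt-oss-20b",      # placeholder model identifier
    reasoning_effort="high",  # one of "low", "medium", "high"
    messages=[{"role": "user", "content": "Outline a 3-step refactor of a legacy module."}],
)
print(resp.choices[0].message.content)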
Qwen3-Coder
State-of-the-art Mixture-of-Experts local coding model with native support for a 256K context length. Available in 30B (3B active) and 480B (35B active) sizes.
Updated 10 days ago
Ernie-4.5
Medium-size Mixture-of-Experts model from Baidu's new Ernie 4.5 line of foundation models.
Updated 10 days ago
LFM2
LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard for quality, speed, and memory efficiency.
Updated 10 days ago
devstral
Sizes: 23.6B, 24B
Devstral is a coding model from Mistral AI. It excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
Updated 10 days ago
gemma-3n
Gemma 3n is a generative AI model optimized for use in everyday devices, such as phones, laptops, and tablets.
Updated 10 days ago
Mistral Small
Mistral Small is a 'knowledge-dense' 24B multimodal (image input) local model that supports up to a 128K-token context length.
Updated 10 days ago
Magistral
Mistral AI's open-weight reasoning model: a 24B dense transformer supporting up to a 128K-token context window. The model can produce long chains of reasoning traces before providing answers.
Updated 10 days ago
mistral-nemo
General-purpose dense transformer designed for multilingual use cases, built in collaboration between Mistral AI and NVIDIA.
Updated 10 days ago
qwen2.5-vl
Sizes: 3B, 7B, 32B, 72B
Qwen2.5-VL is a performant vision-language model capable of recognizing common objects and text. It supports a 128K-token context length and a variety of human languages.
Updated 10 days ago
gemma-3
State-of-the-art image + text input models from Google, built from the same research and tech used to create the Gemini models.
Updated 10 days ago
phi-4-reasoning
Phi-4-mini-reasoning is a lightweight open model built upon synthetic data, with a focus on high-quality, reasoning-dense data.
Updated 10 days ago
phi-4
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets.
Updated 10 days ago
Codestral
Size: 22B
Mistral AI's latest coding model, Codestral can handle both instructions and code completions with ease in over 80 programming languages.
Updated 10 days ago
Mistral
One of the most popular open-source LLMs, Mistral's 7B Instruct model balances speed, size, and performance, making it a great general-purpose daily driver.
Updated 10 days ago
Qwen3 (1st Generation)
Sizes: 4B, 8B, 14B, 30B, 32B, 235B
The first batch of Qwen3 models (Qwen3-2504), a collection of dense and MoE models ranging from 4B to 235B. These are general-purpose models that score highly on benchmarks.
Updated 10 days ago
deepseek-r1
Sizes: 7B, 8B, 14B, 32B, 70B
Distilled version of the DeepSeek-R1-0528 model, created by continuing the post-training process on the Qwen3 8B Base model using Chain-of-Thought (CoT) from DeepSeek-R1-0528.
Updated 10 days ago