• llama: A tiny and speedy Llama model from Meta, optimized for multilingual dialogue use cases.
• llama: A new, small Llama model from Meta, optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
• qwen: Code-specific LLMs for code generation, code reasoning, and code fixing, supporting a context length of up to 128K tokens.
• qwen: An LLM specializing in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and producing structured JSON outputs (see the JSON-output sketch after this list).
• llama: Yi Coder builds on the Llama architecture at an expanded size, trained for code. Supports up to 128K tokens and "52 major programming languages".
• llama: A fine-tune of Meta's Llama 3.1, Hermes is further trained on hand-curated as well as synthetic datasets. Excels in dialogue and code generation.
• internlm2: InternLM 2.5 offers strong reasoning across the board as well as tool use for developers, while sitting at the sweet spot of size for those with 24GB GPUs.
• llava: The original LLaVA vision-enabled model, supporting image input and textual instruction following.
• llama: The latest in Meta's long-running Llama series, Llama 3.1 is another jack of all trades and master of some, now in 8 languages and up to 128K tokens.
• mistral: A slightly larger 12B-parameter model from Mistral AI, NeMo offers a long 128K-token context length, advanced world knowledge, and function calling for developers (see the function-calling sketch after this list).
• gemma2: "Google's Llama", Gemma benefits from Google's experience training its flagship Gemini model, delivering excellent performance on low-power hardware or for autocompletion/drafting tasks.
• mistral: A scientific specialist fine-tune of Mistral AI's popular 7B model, Mathstral excels at STEM chats and tasks.
• gemma: The mid-sized option of the Gemma 2 model family. Built by Google, using the same research and technology behind the Gemini models.
• llama: A Hugging Face original, SmolLM lives up to its name in size and will fit on just about any device. A slightly larger option at 1.7B parameters is also available.
• gemma: The large option of the Gemma 2 model family. Built by Google, using the same research and technology behind the Gemini models.
• mistral: Mistral AI's latest coding model, Codestral can handle both instructions and code completions with ease in over 80 programming languages (see the fill-in-the-middle sketch after this list).
• phi3: Microsoft's latest Phi Mini model supports a whopping 128K-token context length in a small package, enabling extremely long chats at low cost.
• deepseek2: The younger sibling of the GPT-4-beating 236B DeepSeek Coder V2 model, this model also comes out strong, with support for 338 programming languages.
• qwen: A small model from Alibaba's Qwen2 family that punches above its weight in mathematical and multi-step logical reasoning.
• qwen2: Promising 27 languages and lightning-fast responses, this is the smallest entry in Alibaba's Qwen2 family, which scales up to 72B parameters.
• command-r: Aya 23 is an open-weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities.
• mistral: One of the most popular open-source LLMs, Mistral's 7B Instruct balances speed, size, and performance, making it a great general-purpose daily driver.
• stablelm: From the team behind Stable Diffusion, this small code model makes an excellent coding assistant for those with lighter hardware.
• cohere: Able to chat in more than 10 languages, Cohere's Command-R is optimized for RAG but can perform well in any task.
• starcoder2: Also available in 3B, 15B, and Chat versions, the StarCoder2 family offers a diverse portfolio of local coding assistants.
• deepseek: Promising performance comparable to GPT-4 on mathematical reasoning, DeepSeek Math can also write code to solve and prove mathematical problems.
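
To make the structured-output claim for the Qwen instruct model concrete, here is a minimal sketch of requesting JSON from it. It assumes the model is served behind an OpenAI-compatible chat endpoint on localhost; the URL and model id are placeholders rather than anything this catalog specifies, so adjust both for your runtime.

    import json
    import requests

    # Assumed OpenAI-compatible endpoint; URL and model id are placeholders.
    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "qwen2.5",  # placeholder model id
            "messages": [
                {"role": "system", "content": "Reply only with a JSON object."},
                {"role": "user", "content": "Describe this person as JSON: name=Ada, role=engineer."},
            ],
            # JSON mode, as defined by the OpenAI-compatible API
            "response_format": {"type": "json_object"},
        },
        timeout=60,
    )
    print(json.loads(resp.json()["choices"][0]["message"]["content"]))

With JSON mode on, the content field parses directly instead of needing to be scraped out of prose.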
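The function calling mentioned for Mistral NeMo works by handing the model a list of tool schemas; when the model decides a tool is needed, it replies with a structured call rather than prose. This sketch reuses the same assumed OpenAI-compatible endpoint, and get_weather is a hypothetical tool invented purely for illustration.

    import requests

    # Hypothetical tool schema; the model never runs it, it only names it.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",  # assumed endpoint
        json={
            "model": "mistral-nemo",  # placeholder model id
            "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
            "tools": tools,
        },
        timeout=60,
    )
    # If the model chose a tool, the call arrives as structured JSON arguments
    # that your own code is responsible for executing.
    message = resp.json()["choices"][0]["message"]
    for call in message.get("tool_calls") or []:
        print(call["function"]["name"], call["function"]["arguments"])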
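Codestral's completion mode is fill-in-the-middle: you give the model the code before and after the cursor and it fills the gap. Mistral's hosted API exposes a dedicated FIM endpoint for Codestral; the sketch below follows that shape, but treat the URL, field names, and response path as assumptions to verify against the current docs.

    import os
    import requests

    # Assumed fill-in-the-middle endpoint for Codestral; requires an API key.
    resp = requests.post(
        "https://api.mistral.ai/v1/fim/completions",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "codestral-latest",
            "prompt": "def fibonacci(n: int) -> int:\n",  # code before the cursor
            "suffix": "\nprint(fibonacci(10))",           # code after the cursor
            "max_tokens": 128,
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])

Editor plugins use this same prefix/suffix split to turn the model into an autocomplete engine.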