MLX • llama
A small Llama model from Meta, optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
Model info

Model: Llama 3.2 3B Instruct 4bit
Author: mlx-community
Repository:
Arch: llama
Parameters: 3B
Format: safetensors
Size on disk: about 1.82 GB
Download the model using lms, LM Studio's developer CLI:
lms get llama-3.2-3b
curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.2-3b",
"messages": [
{ "role": "system", "content": "Always answer in rhymes." },
{ "role": "user", "content": "Introduce yourself." }
],
"temperature": 0.7,
"max_tokens": -1,
"stream": true
}'
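With `"stream": true`, the server responds with Server-Sent Events: each `data:` line carries a JSON chunk whose `delta` holds a piece of the assistant's reply, and the stream ends with `data: [DONE]`. A minimal sketch of collecting the streamed text in Python (the helper name is an assumption; the chunk shape follows the OpenAI-compatible format the endpoint exposes):

```python
import json

def collect_streamed_text(raw: str) -> str:
    """Join the content deltas from an OpenAI-compatible SSE response body."""
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

In practice you would feed this the lines as they arrive over the HTTP connection rather than a completed string, appending each delta to the display as it comes in.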
Use lms log stream to see your prompts as they are sent to the LLM.

- lmstudio.js: LM Studio SDK documentation (TypeScript)
- lms log stream: Stream server logs
- lms: LM Studio's CLI documentation