MLX • llama
A small Llama model from Meta, optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
Model info

Model: Llama 3.2 3B Instruct 4bit
Author: mlx-community
Repository:
Arch: llama
Parameters: 3B
Format: safetensors
Size on disk: about 1.82 GB
Download the model using lms, LM Studio's developer CLI:
lms get llama-3.2-3b
curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.2-3b",
"messages": [
{ "role": "system", "content": "Always answer in rhymes." },
{ "role": "user", "content": "Introduce yourself." }
],
"temperature": 0.7,
"max_tokens": -1,
"stream": true
}'
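With `"stream": true`, the server responds with Server-Sent Events: each `data:` line carries a JSON chunk whose `delta` holds a piece of the assistant's reply, and the stream ends with `data: [DONE]`. A minimal sketch of collecting the streamed text in Python (the helper name is an assumption; the chunk shape follows the OpenAI-compatible format the endpoint exposes):

```python
import json

def collect_streamed_text(raw: str) -> str:
    """Join the content deltas from an OpenAI-compatible SSE response body."""
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

In practice you would feed this the lines as they arrive over the HTTP connection rather than a completed string, appending each delta to the display as it comes in.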
Use lms log stream to see your prompts as they are sent to the LLM.

- lmstudio.js: LM Studio SDK documentation (TypeScript)
- lms log stream: Stream server logs
- lms: LM Studio's CLI documentation