meta-llama-3.1-8b-instruct-4bit

MLX

Llama 3.1 8B Instruct 4bit

mlx-community

llama

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in, text out).

Model info

Model: Llama 3.1 8B Instruct 4bit
Author: mlx-community
Repository: 🤗 mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
Arch: llama
Parameters: 8B
Format: safetensors
Size on disk: about 4.53 GB

Download and run Llama 3.1 8B Instruct 4bit

Open in LM Studio to view download options

Download llama-3.1-8b from the terminal

Download the model using lms, LM Studio's developer CLI:

lms get llama-3.1-8b

Call llama-3.1-8b from your code

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": true
  }'
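
The same request can be sent from Python using only the standard library. The following is a minimal sketch, not an official client: it assumes the local server is running at the address used in the curl example and that, with "stream": true, it returns OpenAI-style server-sent events (lines prefixed with "data:" and a final "data: [DONE]" marker). The helper names build_payload, parse_sse_line, and stream_chat are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Address taken from the curl example; adjust if the server uses another port.
BASE_URL = "http://localhost:1234/v1"


def build_payload(user_message, system_message="Always answer in rhymes.",
                  model="llama-3.1-8b"):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "max_tokens": -1,
        "stream": True,
    }


def parse_sse_line(raw_line):
    """Return the text fragment carried by one streamed line, or None.

    Assumes OpenAI-style chunks: 'data: {...}' with the new text under
    choices[0].delta.content, and 'data: [DONE]' as the end marker.
    """
    line = raw_line.decode("utf-8").strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive lines and anything that is not data
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return (choices[0].get("delta") or {}).get("content")


def stream_chat(user_message):
    """POST the chat request and yield text fragments as they arrive."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            fragment = parse_sse_line(raw_line)
            if fragment:
                yield fragment
```

With the server running, iterating over stream_chat("Introduce yourself.") and printing each fragment without a trailing newline reproduces the streamed reply. Keeping the line parser separate from the network call makes it possible to check the parsing logic without a running server.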

Next Steps: Build! 🔨

Use the Developer tab to configure the server and see incoming requests.
Run lms log stream to see your prompts as they are sent to the LLM.
🐛 Report bugs in lmstudio-ai/lmstudio-bug-tracker.

Learn more

OpenAI-like Local Server documentation
lmstudio.js - LM Studio SDK documentation (TypeScript)
lms log stream - Stream server logs
lms - LM Studio's CLI documentation
