MLX

Llama 3.1 8B Instruct 4bit

The Meta Llama 3.1 collection consists of pretrained and instruction-tuned multilingual generative large language models (LLMs) in 8B, 70B, and 405B sizes (text in, text out).

Model info

Model: Llama 3.1 8B Instruct 4bit
Author: mlx-community
Arch: llama
Parameters: 8B
Format: safetensors
Size on disk: about 4.53 GB

Download and run Llama 3.1 8B Instruct 4bit

Download llama-3.1-8b from the terminal

Download the model using lms, LM Studio's developer CLI:

lms get llama-3.1-8b

Call llama-3.1-8b from your code

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": true
  }'
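
The same request can be sent from a program. The following is a minimal Python sketch using only the standard library. It assumes the LM Studio local server is running on the port shown in the curl example with the model loaded, and that the streamed response follows the OpenAI-style server-sent-event format ("data: {...}" lines ending with "data: [DONE]"). The helper names build_payload, parse_stream_line, and stream_chat are hypothetical and chosen for illustration only.

```python
import json
import urllib.request

# Assumption: the local server listens on the address used in the curl example.
BASE_URL = "http://localhost:1234/v1"


def build_payload(user_message, system_message="Always answer in rhymes."):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {
        "model": "llama-3.1-8b",
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "max_tokens": -1,
        "stream": True,
    }


def parse_stream_line(raw_line):
    """Return the text fragment in one streamed line, or None if there is none.

    Assumption: each event line looks like 'data: {json}' and the stream
    ends with 'data: [DONE]', as in OpenAI-compatible streaming responses.
    """
    line = raw_line.decode("utf-8").strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive lines or other fields
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # end-of-stream marker carries no text
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


def stream_chat(user_message):
    """POST the chat request and yield text fragments as they arrive."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:  # the response is iterated line by line
            fragment = parse_stream_line(raw_line)
            if fragment:
                yield fragment
```

To print the reply as it streams, iterate over stream_chat("Introduce yourself.") and print each fragment without a trailing newline. Setting "stream" to False in the payload would instead return a single JSON document, in which case the line-by-line parsing above does not apply.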
