Llama 3.2 3B

Meta

llama

A new, small Llama model from Meta, optimized for multilingual dialogue, including agentic retrieval and summarization tasks.

Model info

Model: Llama 3.2 3B
Author: Meta
Arch: llama
Parameters: 3B
Size on disk: about 3.42 GB
Format: gguf

Download and run Llama 3.2 3B

Open in LM Studio to view download options

Use Llama 3.2 3B in your code

💡 LM Studio needs to be installed and run at least once for this to work. Don't have it yet? Get it here.

CLI Bootstrap

npx lmstudio install-cli # (only needed once)

Model Load

lms load hugging-quants/llama-3.2-3b-instruct-q8_0-gguf
Alternatively, load the model in the LM Studio app.

Use Llama 3.2 3B via an OpenAI-like API

Reuse your existing OpenAI client code and point it to LM Studio instead.

Python example
# Example: reuse your existing OpenAI client code
from openai import OpenAI

# Point the client at the local LM Studio server
client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",  # placeholder; not used
)

completion = client.chat.completions.create(
    model="hugging-quants/llama-3.2-3b-instruct-q8_0-gguf",
    messages=[
        {"role": "system", "content": "Always answer in rhymes."},
        {"role": "user", "content": "Introduce yourself."},
    ],
    temperature=0.7,
)

# Print only the text of the assistant's reply
print(completion.choices[0].message.content)
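
If you would rather not depend on the openai package, the same request can be sketched with only the Python standard library. This is a minimal sketch, not an official client: it assumes the local server accepts the usual OpenAI-style POST to /chat/completions under the base URL shown above and returns the usual choices/message/content response shape. The helper names (build_chat_payload, build_request, send_chat) are hypothetical and exist only for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # local server address from the example above
CHAT_PATH = "/chat/completions"        # assumption: standard OpenAI-style route


def build_chat_payload(model, system_prompt, user_prompt, temperature=0.7):
    """Build the JSON body for a chat completion request (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }


def build_request(payload, base_url=BASE_URL):
    """Wrap the payload in an HTTP POST request object without sending it."""
    return urllib.request.Request(
        base_url + CHAT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(payload, base_url=BASE_URL, timeout=60):
    """Send the request and return the reply text.

    Assumes the response follows the OpenAI chat completion shape.
    Requires the local server to be running with the model loaded.
    """
    request = build_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running and the model loaded, calling send_chat(build_chat_payload("hugging-quants/llama-3.2-3b-instruct-q8_0-gguf", "Always answer in rhymes.", "Introduce yourself.")) should return the reply as a string. Keeping payload construction separate from sending makes the request body easy to inspect or log before anything goes over the wire.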
