Chat Completions

Send a chat history and get the assistant's response.

Method: POST /v1/chat/completions
The prompt template is applied automatically for chat-tuned models
Provide inference parameters (temperature, top_p, etc.) in the request payload
See OpenAI docs: https://platform.openai.com/docs/api-reference/chat
Tip: keep a terminal open running the command lms log stream to inspect the input sent to the model
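
The points above can also be sketched as a plain HTTP request, without the OpenAI client. This is a minimal sketch, assuming the server listens at http://localhost:1234 (the address used in the Python example on this page) and that the chat endpoint lives at /v1/chat/completions, the path the OpenAI client derives from that base URL. The helper names build_chat_request and send_chat_request are hypothetical, chosen for illustration only.

```python
import json
import urllib.request

# Assumed local server address; adjust to match your setup.
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(model, messages, **inference_params):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": messages}
    # Inference parameters such as temperature or top_p travel in the same JSON body.
    payload.update(inference_params)
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; only send_chat_request does.
request = build_chat_request(
    "model-identifier",
    [{"role": "user", "content": "Introduce yourself."}],
    temperature=0.7,
    top_p=0.9,
)
```

Calling send_chat_request(request) with the server running and a model loaded should return the decoded response; the build step is kept separate so the payload can be inspected first.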

Python example

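from openai import OpenAI

# Point the client at the local LM Studio server instead of the OpenAI service
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
  model="model-identifier",  # replace with the identifier of a model you have loaded
  messages=[
    {"role": "system", "content": "Always answer in rhymes."},
    {"role": "user", "content": "Introduce yourself."}
  ],
  temperature=0.7,  # inference parameters go in the same call
)

# Prints the whole message object (role, content, ...)
print(completion.choices[0].message)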
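
The example above prints the whole message object. To get only the reply text, a small helper along these lines may be useful. It is a sketch that assumes the response follows the OpenAI chat completion shape (a choices list whose entries carry a message with role and content); the function name reply_text is hypothetical and the sample response is hand-written for illustration, not captured server output.

```python
def reply_text(response):
    """Return the assistant's text from a decoded chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    # content may be None, e.g. when the model answers with tool calls instead of text
    return message.get("content") or ""


# Illustrative response, hand-written for this sketch (not real server output).
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello, I am a bard of code."},
            "finish_reason": "stop",
        }
    ]
}

text = reply_text(sample_response)
print(text)
```

With the OpenAI client object from the example above, the equivalent attribute access is completion.choices[0].message.content.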

Supported payload parameters

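See https://platform.openai.com/docs/api-reference/chat/create for parameter semantics. Note that top_k and repeat_penalty are not part of the OpenAI API, so that reference does not describe them.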

model
top_p
top_k
messages
temperature
max_tokens
stream
stop
presence_penalty
frequency_penalty
logit_bias
repeat_penalty
seed
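
Two of the parameters listed above, top_k and repeat_penalty, are not keyword arguments of the OpenAI Python client's create method, so passing them directly raises a TypeError. One way to send them is the client's extra_body argument, which merges additional fields into the JSON payload. The sketch below splits a parameter dict accordingly; the helper name split_params is hypothetical, and the check against the supported list is a client-side convenience, not a statement about how the server treats unknown fields.

```python
# Parameters from the list on this page.
SUPPORTED_PARAMS = {
    "model", "top_p", "top_k", "messages", "temperature", "max_tokens",
    "stream", "stop", "presence_penalty", "frequency_penalty",
    "logit_bias", "repeat_penalty", "seed",
}

# Listed here but not part of the OpenAI chat completions API.
NON_OPENAI_PARAMS = {"top_k", "repeat_penalty"}


def split_params(params):
    """Split params into (OpenAI keyword arguments, fields for extra_body)."""
    unknown = set(params) - SUPPORTED_PARAMS
    if unknown:
        raise ValueError(f"not in the supported list: {sorted(unknown)}")
    standard = {k: v for k, v in params.items() if k not in NON_OPENAI_PARAMS}
    extra = {k: v for k, v in params.items() if k in NON_OPENAI_PARAMS}
    return standard, extra


standard, extra = split_params({
    "model": "model-identifier",
    "messages": [{"role": "user", "content": "Introduce yourself."}],
    "temperature": 0.7,
    "top_k": 40,
    "repeat_penalty": 1.1,
})

# With an OpenAI client configured as in the example on this page:
# completion = client.chat.completions.create(**standard, extra_body=extra)
```

Keeping the two groups separate means the standard arguments still benefit from the client's own type checking, while the extra fields are passed through as-is.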
