Introducing lmstudio-python and lmstudio-js
2025-03-03
Today we are launching lmstudio-python (1.0.1) and lmstudio-js (1.0.0): LM Studio's software development kits for Python and TypeScript. Both libraries are MIT licensed and are developed in the open on GitHub.
Furthermore, we introduce LM Studio's first agent-oriented API: the .act() call. You give it a prompt and tools, and the model runs autonomously for multiple execution "rounds" until it accomplishes the task (or gives up).
The SDK lets you tap into the same AI system capabilities we've built for the LM Studio desktop app. Opening up these APIs was always our plan, and we've architected the software stack with this in mind: LM Studio uses the same public lmstudio-js APIs for its core functions. Your apps can now do this too.
Our goal is to enable you to build your own tools, while (hopefully) saving you from having to solve the problems we already solved as part of developing LM Studio: this includes automatic software dependency management (CUDA, Vulkan), multi-GPU support (NVIDIA, AMD, Apple), multi-operating-system support (Windows, macOS, Linux), default LLM parameter selection, and much more.
Apps using lmstudio-python or lmstudio-js will be able to run on any computer that has LM Studio running (either in the foreground or in headless mode).
The SDK is available for both Python and TypeScript. You can install it via pip or npm:

pip install lmstudio
npm install @lmstudio/sdk
In Python, use the lmstudio package in your scripts, tools, or even the Python REPL. See the lmstudio-python developer docs.
In TypeScript, import the @lmstudio/sdk package in your Node.js or browser apps. See the lmstudio-js developer docs.
The core APIs surfaced through the LM Studio SDK are:
• Chat responses (.respond())
• Autonomous tool calling (.act())
• Text completions (.complete())
• Embeddings (.embed())
• Model management (.load(), .unload())
In TypeScript, SDK APIs work both from Node environments and the browser (with CORS enabled). In Python, we support both a convenience sync API and a session-based async API for scoped resource management.
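To illustrate the two styles, here is a minimal Python sketch. The convenience API is shown as documented elsewhere in this post; the scoped client usage after it is an assumed pattern, so consult the API reference for the authoritative form:

import lmstudio as lms

# Convenience sync API: module-level functions, no explicit client.
model = lms.llm("qwen2.5-7b-instruct")
print(model.respond("Say hello!"))

# Scoped session usage (assumed pattern, for illustration): resources
# are tied to an explicit client and released when the session ends.
with lms.Client() as client:
    model = client.llm.model("qwen2.5-7b-instruct")
    print(model.respond("Say hello!"))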
For a full list of APIs, see the API References for Python and TypeScript.
Apps, tools, or scripts using lmstudio-python or lmstudio-js are able to run the latest llama.cpp or MLX models without having to configure hardware or software. The system automatically chooses the right inference engine for a given model, and picks parameters (such as GPU offload) based on the available resources.
For example, if you just want any model for a quick prompt, run this:
import lmstudio as lms

model = lms.llm()  # gets the current model if loaded
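From there, a response is one call away. A quick sketch using the same .respond() API shown later in this post (the prompt is just an example):

result = model.respond("What is the capital of France?")
print(result)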
But if you have more specific needs, you can load a new instance of a model and configure every parameter manually. See more: Python docs | TypeScript docs.
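As a rough sketch of what that can look like in Python (the model key matches the example later in this post; the "contextLength" config field is an illustrative assumption, so check the linked docs for the supported fields):

import lmstudio as lms

# Load an instance of a specific model with manual settings.
# NOTE: the config key below is an assumption, for illustration only.
model = lms.llm("qwen2.5-7b-instruct", config={"contextLength": 8192})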
Back in the old days of LLMs (circa mid-2023), the best way to ensure a specific output format from a model involved prompting with pleas such as:
USER: "Please respond with valid JSON and nothing else. PLEASE do not output ANYTHING after the last } bracket".
Nowadays, much better methods exist, such as grammar-constrained sampling [1] [2], which LM Studio supports for both llama.cpp and MLX models.
The SDK surfaces APIs to enforce the model's output format using Pydantic (for Python) or zod (for TypeScript). In both libraries, you can also use a plain JSON schema.
import lmstudio as lms
from pydantic import BaseModel

# A class based schema for a book
class BookSchema(BaseModel):
    title: str
    author: str
    year: int

model = lms.llm()  # added so the snippet runs standalone

result = model.respond("Tell me about The Hobbit",
                       response_format=BookSchema)

book = result.parsed
print(book)
#     ^
# Note that `book` is correctly typed as
# { title: string, author: string, year: number }
Read more in the Python docs.
import { LMStudioClient } from "@lmstudio/sdk";
import { z } from "zod";

// Added so the snippet runs standalone
const client = new LMStudioClient();
const model = await client.llm.model();

// A zod schema for a book
const bookSchema = z.object({
  title: z.string(),
  author: z.string(),
  year: z.number().int(),
});

const result = await model.respond("Tell me about The Hobbit.", {
  structured: bookSchema,
});

const book = result.parsed;
console.info(book);
//           ^
// Note that `book` is correctly typed as
// { title: string, author: string, year: number }
Read more in the TypeScript docs.
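If you'd rather not depend on Pydantic or zod, a plain JSON schema works too. Here is a minimal Python sketch, assuming (per the docs linked above) that response_format also accepts a schema dict:

import lmstudio as lms

model = lms.llm()

# A plain JSON schema equivalent to the BookSchema above
book_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "author", "year"],
}

result = model.respond("Tell me about The Hobbit",
                       response_format=book_schema)
print(result.parsed)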
The .act() API: running tools in a loop
In addition to "standard" LLM APIs like .respond(), the SDK introduces a new API: .act(). This API is designed for agent-oriented programming, where a model is given a task and a set of tools, and then works on its own in an attempt to accomplish the task.
LLMs are largely text-in, text-out programs. So you may ask: "how can an LLM use a tool?" The answer is that some LLMs are trained to ask the human to call the tool for them, and expect the tool output to be provided back in some format.
Imagine you're providing computer support to someone over the phone. You might say things like "run this command for me ... OK, what did it output? ... OK, now click there and tell me what it says ...". In this case, you're the LLM! And you're "calling tools" vicariously through the person on the other end of the line.
Execution "rounds"
We introduce the concept of execution "rounds" to describe the combined process of running a tool, providing its output to the LLM, and then waiting for the LLM to decide what to do next.
Execution Round
  • run a tool →
  • provide the result to the LLM →
  • wait for the LLM to generate a response
  └➔ loop back to the top, or (return)
A model might choose to run tools multiple times before returning a final result. For example, if the LLM is writing code, it might choose to compile or run the program, fix errors, and then run it again, rinse and repeat until it gets the desired result.
With this in mind, we say that the .act() API is an automatic "multi-round" tool calling API.
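To make a round concrete, here is a rough Python-flavored sketch of the loop that .act() automates. The helpers (llm_generate, parse_tool_call, run_tool) are made up for illustration; they are not SDK calls:

# Hypothetical sketch of a multi-round tool calling loop.
# llm_generate, parse_tool_call, and run_tool are invented helpers,
# not part of the lmstudio SDK.
chat = ["<task prompt>"]
while True:
    reply = llm_generate(chat)          # wait for the LLM to respond
    chat.append(reply)
    tool_call = parse_tool_call(reply)  # did the model request a tool?
    if tool_call is None:
        break                           # final answer: return it
    result = run_tool(tool_call)        # run the tool
    chat.append(result)                 # provide the result to the LLM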
In TypeScript, you define tools with a description and a function. In Python, you may even pass in a function directly!
import lmstudio as lms

def multiply(a: float, b: float) -> float:
    """Given two numbers a and b. Returns the product of them."""
    return a * b

model = lms.llm("qwen2.5-7b-instruct")
model.act(
    "What is the result of 12345 multiplied by 54321?",
    [multiply],
    on_message=print,
)
When the LLM chooses to use a tool, the SDK runs it and automatically (in the client process) provides the result back to the LLM. The model might then choose to run tools again, for example if a tool raised an error whose message tells the LLM how to proceed.
As the execution progresses, the SDK provides a stream of events to the developer via callbacks. This allows you to build interactive UIs that show the user what the LLM is doing, which tool it is using, and how it's progressing.
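Reusing the model and multiply tool from the example above, progress reporting might look like the sketch below. on_message is confirmed by that example; the round callbacks are assumed names, so verify them against the .act() API reference:

# on_round_start / on_round_end are assumed callback names; check the
# .act() API reference for the exact signatures.
model.act(
    "What is the result of 12345 multiplied by 54321?",
    [multiply],
    on_message=print,
    on_round_start=lambda round_index: print(f"Round {round_index} started"),
    on_round_end=lambda round_index: print(f"Round {round_index} ended"),
)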
To get started with the .act() API, see .act() (Python) and .act() (TypeScript).
Both libraries are MIT licensed and contain up-to-date contribution guides. If you want to get involved with open source development, you're welcome to jump in!
lmstudio-python on GitHub
lmstudio-js on GitHub
Additional LM Studio open source software includes:
lms: LM Studio's CLI (MIT) on GitHub
mlx-engine: LM Studio's Apple MLX engine (MIT) on GitHub
As you start using the SDK, we'd love to hear your feedback: what's working well, what's not, what features you'd like to see next, and of course, any bugs you run into.
Please open issues on the respective GitHub repositories. We'd appreciate a star too 🙏⭐️.