Feature In Beta

Plugins SDK is currently in private beta. Join the beta here.

Text-only Generators

Generators take in the generator controller and the current conversation state, start the generation, and then report the generated text using the ctl.fragmentGenerated method.

The following is an example of a simple generator that echoes back the last user message with a 200 ms delay between each word:

src/toolsProvider.ts

import { Chat, GeneratorController } from "@lmstudio/sdk";

export async function generate(ctl: GeneratorController, chat: Chat) {
  // Just echo back the last message
  const lastMessage = chat.at(-1).getText();
  // Split the last message into words
  const words = lastMessage.split(/(?= )/);
  for (const word of words) {
    ctl.fragmentGenerated(word); // Send each word as a fragment
    ctl.abortSignal.throwIfAborted(); // Allow for cancellation
    await new Promise((resolve) => setTimeout(resolve, 200)); // Simulate some processing time
  }
}

Custom Configurations

You can access custom configurations via ctl.getPluginConfig and ctl.getGlobalPluginConfig. See Custom Configurations for more details.

Handling Aborts

A prediction may be aborted by the user while your generator is still running. In such cases, your generator should stop promptly by checking ctl.abortSignal, for example by calling ctl.abortSignal.throwIfAborted() between fragments, as the example above does.

You can learn more about AbortSignal in the MDN documentation.
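
One limitation of checking the signal only between fragments is that a pending delay still runs to completion before the abort is noticed. The following is a minimal sketch of a delay that rejects as soon as the signal aborts. The MinimalController interface, abortableDelay, and echoWords are hypothetical names used for illustration and are not part of the SDK; the interface only mirrors the two controller members used in the example above (abortSignal and fragmentGenerated).

```typescript
// Hypothetical stand-in for the controller members used on this page.
// A real generator would receive a GeneratorController from the SDK instead.
interface MinimalController {
  abortSignal: AbortSignal;
  fragmentGenerated(text: string): void;
}

// Resolves after `ms` milliseconds, or rejects immediately if `signal` aborts first.
function abortableDelay(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal.aborted) {
      reject(signal.reason);
      return;
    }
    const onAbort = () => {
      clearTimeout(timer); // Do not leave the timer pending after an abort
      reject(signal.reason);
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Echoes `text` word by word, stopping as soon as the signal aborts.
export async function echoWords(
  ctl: MinimalController,
  text: string,
  delayMs = 200,
): Promise<void> {
  for (const word of text.split(/(?= )/)) {
    ctl.abortSignal.throwIfAborted(); // Stop before sending further fragments
    ctl.fragmentGenerated(word);
    await abortableDelay(delayMs, ctl.abortSignal); // Rejects if aborted mid-delay
  }
}
```

When the signal aborts, echoWords rejects with the signal's abort reason, which is the same behavior throwIfAborted produces in the original example. How the host reacts to that rejection is not covered on this page, so treat the sketch as one possible way to structure the loop.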
