Image Input
Some models, known as VLMs (Vision-Language Models), can accept images as input. You can pass images to the model using the .respond() method.
If you don't yet have a VLM, you can download a model like qwen2-vl-2b-instruct using the following command:
lms get qwen2-vl-2b-instruct
Connect to LM Studio and obtain a handle to the VLM (Vision-Language Model) you want to use.
import { LMStudioClient } from "@lmstudio/sdk";
const client = new LMStudioClient();
const model = await client.llm.model("qwen2-vl-2b-instruct");
Use the client.files.prepareImage() method to get a handle to the image that you can pass to the model.
const imagePath = "/path/to/image.jpg"; // Replace with the path to your image
const image = await client.files.prepareImage(imagePath);
If you only have the image in the form of a base64 string, you can use the client.files.prepareImageBase64() method instead.
const imageBase64 = "Your base64 string here";
const image = await client.files.prepareImageBase64(imageBase64);
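If you have the image as raw bytes rather than a file on disk, you could encode it to base64 yourself first. The sketch below uses Node's built-in fs/promises module and an illustrative file path:
import { readFile } from "fs/promises";

// Read raw image bytes (illustrative path) and encode them as a base64 string.
const imageBytes = await readFile("/path/to/image.jpg");
const imageBase64 = imageBytes.toString("base64");
const image = await client.files.prepareImageBase64(imageBase64);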
We support JPEG, PNG, and WebP image formats.
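If you accept arbitrary file paths, you may want to check the extension before preparing the image. The helper below is purely illustrative and not part of the SDK:
import path from "path";

const SUPPORTED_EXTENSIONS = new Set([".jpg", ".jpeg", ".png", ".webp"]);

// Hypothetical guard: throw if the file extension is not a supported image format.
function assertSupportedImage(filePath: string): void {
  const ext = path.extname(filePath).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    throw new Error(`Unsupported image format: ${ext || "(no extension)"}`);
  }
}

assertSupportedImage(imagePath); // throws for e.g. .gif or .bmp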
Generate a prediction by passing the image to the model in the .respond() method.
const prediction = model.respond([
  { role: "user", content: "Describe this image please", images: [image] },
]);
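Awaiting the returned prediction gives you the finished result. As a minimal sketch (field names assumed from the SDK's chat examples), you can print the generated text like this:
// Wait for the prediction to finish and print the model's description of the image.
const result = await prediction;
console.info(result.content);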