LM Studio 0.3.27: Find in Chat and Search All Chats
LM Studio 0.3.27 is now available as a stable release. Update in-app or download the latest version.
Find in Chat and Search across all chats
You can now search within the current conversation or across all conversations.
- Cmd/Ctrl+F: find within the current conversation. Matches plain text, Markdown, and code blocks, and also searches inside reasoning blocks. The same shortcut works in the large System Prompt editor (open it with Cmd/Ctrl+E).
- Cmd/Ctrl+Shift+F: search across all conversations.

Find in Chat searches plaintext, Markdown, code blocks, and reasoning blocks in the current chat. For search across chats, we build an in-memory index and search message contents only (reasoning and tool-use blocks are excluded).
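If you're curious what a message-contents-only index can look like, here is a minimal TypeScript sketch. The types and class below are hypothetical illustrations, not LM Studio's actual implementation:

```typescript
// Hypothetical sketch of an in-memory cross-chat search index.
// Names and types are illustrative; this is not LM Studio's code.

interface Message {
  role: "user" | "assistant";
  content: string;    // indexed for cross-chat search
  reasoning?: string; // excluded from cross-chat search
}

interface Conversation {
  id: string;
  title: string;
  messages: Message[];
}

class ChatSearchIndex {
  // conversation id -> lowercased message contents
  private index = new Map<string, string[]>();

  add(conversation: Conversation): void {
    // Index message contents only; skip reasoning/tool-use blocks.
    this.index.set(
      conversation.id,
      conversation.messages.map((m) => m.content.toLowerCase()),
    );
  }

  search(query: string): string[] {
    const q = query.toLowerCase();
    const hits: string[] = [];
    for (const [id, contents] of this.index) {
      if (contents.some((c) => c.includes(q))) hits.push(id);
    }
    return hits;
  }
}
```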
Please give us feedback when you try it out! You can also report bugs in the lmstudio-bug-tracker.
More accurate model memory estimates
Before you load a model, either in the app or with lms load, you have an opportunity to adjust the model load parameters such as the context length or the GPU offload %.
These parameters, along with the model size and other factors, impact the memory requirements for loading the model.
Until now, LM Studio estimated memory usage based on the GPU offload setting and surfaced a warning if you were likely to run out of memory.
Starting in 0.3.27, the estimate also takes into account the context length, whether flash attention is enabled, and whether the model is a vision model. This gives you a more accurate picture of the memory requirements.
Model load memory estimate taking context length and GPU offload into account
When you adjust the context length or GPU offload slider, you'll see an updated estimate of the memory requirements. If you're likely to run out of memory, you'll see a warning before loading the model. If you believe the estimate is too conservative, you can always override the guardrails and attempt to load the model anyway.
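For intuition about why these factors matter, here is a rough back-of-the-envelope estimator. The formula, constants, and names below are invented for illustration; LM Studio's actual estimator is more detailed:

```typescript
// Illustrative memory estimator; the formula and constants are
// invented for this sketch and are not LM Studio's actual model.

interface LoadParams {
  modelSizeGB: number;      // size of the weights
  gpuOffload: number;       // fraction of layers on GPU, 0..1
  contextLength: number;    // tokens
  kvBytesPerToken: number;  // depends on architecture & quantization
  flashAttention: boolean;  // reduces attention scratch memory
  isVisionModel: boolean;   // vision encoders add overhead
}

function estimateMemoryGB(p: LoadParams): { gpuGB: number; cpuGB: number } {
  // KV cache grows linearly with context length.
  let kvGB = (p.contextLength * p.kvBytesPerToken) / 1e9;
  // Without flash attention, budget extra for attention scratch buffers.
  if (!p.flashAttention) kvGB *= 1.5;
  // Vision models carry extra weights for the image encoder.
  const visionGB = p.isVisionModel ? 1.0 : 0;

  const total = p.modelSizeGB + kvGB + visionGB;
  // Split between GPU and CPU according to the offload fraction.
  return {
    gpuGB: total * p.gpuOffload,
    cpuGB: total * (1 - p.gpuOffload),
  };
}

// Example: 4 GB model, 8k context, half offloaded to GPU.
console.log(
  estimateMemoryGB({
    modelSizeGB: 4,
    gpuOffload: 0.5,
    contextLength: 8192,
    kvBytesPerToken: 131072, // ~128 KiB/token, illustrative
    flashAttention: true,
    isVisionModel: false,
  }),
);
```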
lms load --estimate-only
This functionality is also available in the CLI. You can now perform a "dry run" of loading a model with:
lms load --estimate-only <model-name>
This will not load the model, but will print an estimate of the memory requirements based on the parameters you provide. It takes --context-length and --gpu into account if you provide them, and uses defaults if you don't.
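For example, a dry run with explicit parameters could look like the following (the model name is a placeholder, and we're assuming --gpu accepts a fractional offload value):
lms load --estimate-only <model-name> --context-length 8192 --gpu 0.5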
You can now sort your chats in the sidebar by date updated, date created, or token count.
Sort chats by date updated, date created, or token count
We're about to kick off a beta for LM Studio 0.4.0. It's packed with features and we'd love for you to take it for an early spin + iterate with us. If you're interested, sign up here.
Build 4
Build 3
- Adjusted CLI (lms) output colors to have better contrast in light mode
Build 2
- Added lms load --estimate-only <model-name> to preview a model's estimated memory requirements before loading
- In lms chat, you can now use Ctrl+C to interrupt ongoing predictions
Build 1
- lms ps --json now reports model generation status and the number of queued prediction requests