LM Studio 0.3.7

2025-01-20

We’re excited to share LM Studio 0.3.7 with support for DeepSeek R1 Distilled models and KV cache quantization for llama.cpp models.

Upgrade via in-app update, or from https://lmstudio.ai.

DeepSeek R1, what's the big deal?

DeepSeek's highly anticipated R1 model is a SOTA open-source reasoning model, intended to achieve performance on par with OpenAI's o1.

A range of "distilled" models are available for download within LM Studio, in 1.5B, 7B, 8B, 14B, 32B, and 70B variants. Distilled models are made by fine-tuning smaller models on outputs from a larger and more capable model (in this case, DeepSeek's full R1 model). If you're curious, check out DeepSeek's technical report here.

If you use DeepSeek R1, you'll notice that it outputs its "thinking process" enclosed in <think> </think> tokens. These are currently printed into the chat window like a regular response, which makes the output harder to read and to process programmatically. We're working on a UI upgrade that will let you collapse and expand the thinking process. Stay tuned for the 0.3.8 update.
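Until then, if you're handling responses programmatically, here's a minimal sketch of separating the reasoning block from the final answer. The helper name, regex, and sample response text are our own illustrative assumptions, not an LM Studio API:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a response into its <think>...</think> block and the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()  # no reasoning block present
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

# Example with an illustrative response string:
thinking, answer = split_reasoning("<think>7 times 8 is 56.</think>The answer is 56.")
print("Reasoning:", thinking)
print("Answer:", answer)
```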

Full LM Studio 0.3.7 change log

Build 2

  • Support for DeepSeek R1.

Build 1

  • New: Hardware tab in Mission Control. Open with Cmd/Ctrl + Shift + H.
  • New: Added a server file logging mode option that gives you finer control over what gets written to the log files.
  • New: KV cache quantization for llama.cpp models (requires llama.cpp runtime v1.9.0 or newer).
  • Added support for null values in the OpenAI-compatible API server (see the sketch after this list).
  • Fixed prediction queueing not working (queued predictions would return empty results).
  • Show runtime update notifications only for currently used runtimes.
  • Added a descriptive error when LM Studio fails to start due to lack of file system access.
  • Fixed a bug where JIT model loading could sometimes cause an error.
  • Fixed a bug where an engine extension's output had an extraneous newline in the logs.
  • Fixed a bug where two chats would sometimes be created for new users.
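For readers new to the local server mentioned above: here's a minimal sketch of calling LM Studio's OpenAI-compatible endpoint with the official openai Python client. The model identifier is a hypothetical placeholder; localhost:1234 is LM Studio's default server port, and the API key is a dummy string since the local server does not require a real key.

```python
from openai import OpenAI

# Point the OpenAI client at LM Studio's local server (default port 1234).
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-7b",  # hypothetical identifier; use your loaded model's name
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
)

# With a DeepSeek R1 model, the content includes the <think>...</think> block.
print(completion.choices[0].message.content)
```

The returned content can then be passed to a helper like the split_reasoning sketch above to separate the reasoning from the answer.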

Even More