March 27, 2025

LM Studio 0.3.14

0.3.14 - Release Notes

  • New: Advanced GPU Controls 🎛️
    • On multi-GPU setups, customize how models are offloaded onto your GPUs
      • Enable/disable individual GPUs
      • CUDA-specific features:
        • "Priority order" mode: The system will try to allocate more on GPUs listed first
        • "Limit Model Offload to Dedicated GPU memory" mode: Improved stability and optimized for long context on single GPU setups
    • How to open GPU controls:
      • Windows: Ctrl+Shift+H
      • Mac: Cmd+Shift+H
    • How to open GPU controls in a pop-out window:
      • Windows: Ctrl+Alt+Shift+H
      • Mac: Cmd+Option+Shift+H
      • Benefit: Manage GPU settings while models are loading
  • LG AI EXAONE Deep reasoning model support
  • Improved model loader UI in small window sizes
  • Improve llama model family tool call reliability through LM Studio SDK and OpenAI compatible streaming API
  • [SDK] Added support for GBNF grammar when using structured generation
  • [SDK/RESTful API] Added support for specifying presets in API calls
  • Improved support for Nemotron models

Build 5

  • [Advanced GPU controls] Enlarge GPU controls pop-out window

Build 4

  • [Advanced GPU controls] Allow disabling all GPUs with any engine
  • [Advanced GPU controls] Fix bug where disabling a GPU would cause incorrect offloading when > 2 gpus
  • [Advanced GPU controls][CUDA] Improved stability of"Limit Model Offload to Dedicated GPU memory" mode
  • Added GPU controls logging to "Developer Logs"
  • Fixed a bug where sometimes editing model config inside the model loader popover does not take effect
  • Fixed a bug related to renaming state focus on chat cells

Build 3

  • [Advanced GPU controls] Fixed a bug where intermediate buffers were being allocated on disabled GPUs
  • Fixed "OpenSquareBracket !== CloseStatement" bug with Nemotron model
  • Fixed a bug where Nemotron GGUF model metadata was not being read properly
  • [Windows] Fixed: Make sure the LM Studio.exe executable is also signed. Should help with anti-virus false positives

Build 2

  • Optimized "Limit Model Offload to Dedicated GPU memory" mode in long context situations on single GPU setups
  • Speculative decoding draft model now respects GPU controls
  • [CUDA] Fixed a bug where model would crash with message "Invalid device index"
  • [Windows ARM] Fixed chat with document sometimes not working

Build 1

  • New: GPU Controls 🎛️
    • On multi-GPU setups, customize how models are offloaded onto your GPUs
      • Enable/disable individual GPUs
      • CUDA-specific features:
        • "Priority order" mode: The system will try to allocate more on GPUs listed first
        • "Limit Model Offload to Dedicated GPU memory" mode: The system will limit offload of model weights to dedicated GPU memory and RAM only. Context may still use shared memory
    • How to open GPU controls:
      • Windows: Ctrl+Shift+H
      • Mac: Cmd+Shift+H
    • How to open GPU controls in a pop-out window:
      • Windows: Ctrl+Alt+Shift+H
      • Mac: Cmd+Option+Shift+H
      • Benefit: Manage GPU settings while models are loading
  • LG AI EXAONE Deep reasoning model support
  • Improved model loader UI in small window sizes
  • Improve llama model family tool call reliability through LM Studio SDK and OpenAI compatible streaming API
  • [SDK] Added support for GBNF grammar when using structured generation
  • [SDK/RESTful API] Added support for specifying presets
  • Fixed a bug where sometimes the last couple fragments of a prediction are lost