Locally AI joins LM Studio
Adrien and the Locally AI apps are joining the LM Studio family to double down on Apple platforms
LM Studio now supports NVIDIA DGX Station - GB300 Blackwell in a form factor you can run outside of the data center
Run Claude Code with any local model using LM Studio's Anthropic-compatible API
Server deployment, parallel requests with continuous batching, new REST API endpoint, and refreshed application UI
Update to LM Studio 0.3.39 for Open Responses support
LFM2 tool call support and a generator stability fix
Mac M5 MLX fix, enable optimized MLX auto-upgrade
Step-by-step guide for fine-tuning FunctionGemma with Unsloth, and then running it in LM Studio
Support for Google's FunctionGemma (270M)
Devstral-2, GLM-4.6V, and system prompt fixes
EssentialAI rnj-1 support and a Jinja prompt formatting fix
LM Studio 0.3.33: Ministral 3 support, Olmo-3 tool calling, and release notes
GLM 4.5 tool calling, olmOCR-2, improved image input handling in `/v1/responses`, Flash Attention defaults for Vulkan/Metal, and bug fixes.
Image input improvements, MiniMax M2 tool calling, Flash Attention default for CUDA, new CLI runtime management, macOS 26 support, and bug fixes.
Open safety reasoning models (120B and 20B) with bring-your-own-policy moderation, now supported in LM Studio on launch day.
LM Studio now ships for Linux on ARM and launches with NVIDIA DGX Spark — a tiny but mighty Linux ARM box.
Bug fixes: Qwen tool-calling streaming, Vulkan iGPU loading, and `developer` role support in `/v1/responses`.
OpenAI-compatible `/v1/responses` endpoint (stateful chats, remote MCP, custom tools)
Default model‑variant selection in My Models, better RAM/VRAM estimates (mixed quantizations, mislabeled params, non‑transformers), and bug fixes.
Find/Search in chats, model resource estimation (GUI + CLI), CLI polish, and bug fixes.
Stream server and model logs (input and output) with `lms log stream`, native context menus, and bug fixes.
Select multiple chats for bulk actions, trash bin support, Google EmbeddingGemma, and NVIDIA Nemotron-Nano-v2 with tool calling capabilities.
Support for ByteDance/Seed-OSS, improved Markdown code blocks and tables, and bug fixes.
Improved in-app chat tool-calling reliability for gpt-oss, and the ability to place MoE expert weights on the CPU
We worked with OpenAI to ensure LM Studio supports running gpt-oss models locally on launch day 🎉
Bug fixes, UI improvements, and support for Qwen3-Coder-480B-A35B with tools.
ROCm / Linux support for AMD 9000 series GPUs, bug fixes for model loading, UI improvements, and auto-deletion of engine dependencies to save disk space.
MCP bug fixes and improvements, new streaming options and bug fixes for the OpenAI-compatible API, improved tool calling for Mistral models, and UI touch-ups.
Starting today, it's no longer necessary to get a commercial license for using LM Studio at work. No need to fill out a form or contact us. You and your team can just use the app!
New in LM Studio 0.3.17: Model Context Protocol (MCP) Host support. Connect MCP servers to the app and use them with local models.
Leveraging `mlx-lm` and `mlx-vlm` to achieve unified multi-modal LLM inference in LM Studio's `mlx-engine`.
Run the distilled DeepSeek R1 0528 model (8B) locally in LM Studio on Mac, Windows, or Linux with as little as 4GB of RAM. Supports tool use and reasoning.
Public Preview of community presets, automatic deletion of least recently used Runtime Extension Packs, and a way to use LLMs as text embedding models.
Support for CUDA 12, new system prompt editor UI, improved tool use API support, and preview of community presets.
Advanced controls for multi-GPU setups: enable/disable specific GPUs, choose allocation strategy, limit model weight to dedicated GPU memory, and more.
LM Studio 0.3.13 supports Google's latest multi-modal model, Gemma 3. Run it locally on your Mac, Windows, or Linux machine.
Bug fixes and document chunking speed improvements for RAG
Developer SDKs for Python and TypeScript are now available in a 1.0.0 release. A programmable toolkit for local AI software.
Support for LM Studio SDK (Python, TS/JS), advanced Speculative Decoding settings, and bug fixes
Inference speedup with Speculative Decoding for `llama.cpp` and `MLX`
Idle TTL, auto-update for runtimes, support for nested folders in HF repos, and separate `reasoning_content` in chat completion responses
Run DeepSeek R1 models locally and offline on your computer
Thinking UI for DeepSeek R1, LaTeX rendering improvements, and bug fixes
DeepSeek R1 support and KV Cache quantization for llama.cpp models
Tool Calling API in beta, new installer / updater system, and support for `Qwen2VL` and `QVQ` (both GGUF and MLX)
An open source utility for packaging Python applications and all their dependencies into a portable, deterministic format based on Python's `sitecustomize.py`.
Headless mode, on-demand model loading, server auto-start, CLI command to download models from the terminal, and support for Pixtral with Apple MLX.
Super fast and efficient on-device LLM inference using MLX for Apple Silicon Macs.
Config presets are back! So are live token counts for user input and system prompt. Many bug fixes. Also several new app languages thanks to community contributors.
LM Studio 0.3.2 Release Notes
LM Studio 0.3.1 Release Notes
LM Studio 0.3.0 is here! Built-in (naïve) RAG, light theme, internationalization, Structured Outputs API, Serve on the network, and more.
Run Llama 3.1 locally on your computer with LM Studio.
A command line tool for scripting and automating your local LLM workflows.