OpenAI's first open-weight LLM release since GPT-2. Comes in two sizes: 20B and 120B. Supports configurable reasoning effort (low, medium, high). Trained for tool use. Apache 2.0 licensed.
To run the smallest gpt-oss, you need at least 12 GB of RAM. The largest one may require up to 65 GB.
gpt-oss models support tool use and reasoning. They are available in GGUF and MLX formats.
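As an illustrative sketch of selecting a reasoning effort: one common convention is to set it in the system prompt, which the harmony format reserves for a `Reasoning: low|medium|high` line. The port, API key placeholder, and model key below are assumptions; adjust them for your setup.

```python
from openai import OpenAI

# Assumptions: LM Studio's local server on its default port (1234), and the
# model downloaded under the key "openai/gpt-oss-20b". The API key is a
# placeholder; the local server does not require a real one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[
        # Reasoning effort is conventionally set in the system prompt
        # ("Reasoning: low", "medium", or "high" per the harmony format).
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Explain MXFP4 quantization in one paragraph."},
    ],
)
print(response.choices[0].message.content)
```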

The gpt-oss series are OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. This release includes two models: gpt-oss-120b and gpt-oss-20b.
The gpt-oss-120b model achieves near-parity with OpenAI o4-mini on core reasoning benchmarks, while running efficiently on a single 80 GB GPU. The gpt-oss-20b model delivers similar results to OpenAI o3‑mini on common benchmarks and can run on edge devices with just 16 GB of memory, making it ideal for on-device use cases, local inference, or rapid iteration without costly infrastructure.
Both models were trained using OpenAI's harmony response format. LM Studio ships with Harmony to support gpt-oss.
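For illustration only, a rough sketch of what a harmony-formatted exchange looks like at the token level (the special tokens and the analysis/final channel names follow OpenAI's harmony spec; since LM Studio applies this formatting for you, you should not need to construct it by hand):

```
<|start|>system<|message|>You are a helpful assistant.
Reasoning: medium<|end|>
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>Simple arithmetic.<|end|>
<|start|>assistant<|channel|>final<|message|>2 + 2 = 4.<|return|>
```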
OpenAI released a paper alongside these models, available here: https://arxiv.org/abs/2508.10925.
LM Studio supports OpenAI's Responses API.
gpt-oss models are expected to work best with this API: they are designed for agentic workflows and trained for exceptional instruction following.
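A minimal sketch of calling the Responses API against a local LM Studio server, under the same assumptions as above (default port, placeholder API key, model key "openai/gpt-oss-20b"):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# The Responses API takes a single `input` rather than a messages list and
# exposes the aggregated reply text via `output_text`.
resp = client.responses.create(
    model="openai/gpt-oss-20b",
    input="List three use cases for an on-device 20B model.",
)
print(resp.output_text)
```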
The gpt-oss-120b model runs on a single 80 GB GPU (such as an NVIDIA H100 or AMD MI300X), and the gpt-oss-20b model runs within 16 GB of memory. All evals were performed with the same MXFP4 quantization.

Use the lms CLI, or download the model within LM Studio:
```bash
# gpt-oss-20b
lms get openai/gpt-oss-20b

# gpt-oss-120b
lms get openai/gpt-oss-120b
```
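Once downloaded, the model can be loaded and served locally, e.g. (assuming default settings):

```bash
lms load openai/gpt-oss-20b   # load the model into memory
lms server start              # start the local OpenAI-compatible server
```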

gpt-oss models are Apache 2.0 licensed.