61.5K Downloads
Description
The 120B variant of OpenAI's open source model. Apache 2.0 licensed.
Updated on August 19
README
Designed for production, general-purpose, high-reasoning use cases that fit on a single H100 GPU (117B parameters, 5.1B active parameters).
This model is released under the permissive Apache 2.0 license and features configurable reasoning effort (low, medium, or high), so users can balance output quality and latency based on their needs. The model offers full chain-of-thought visibility to support easier debugging and increase trust, though this output is not intended for end users. It is fully fine-tunable, enabling adaptation to specific tasks or domains, and includes native agentic capabilities such as function calling, web browsing, Python execution, and structured outputs.
This model supports a context length of 131K tokens.
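As a rough sketch of how the configurable reasoning effort might be used from an OpenAI-compatible chat endpoint: the payload below is an assumption for illustration; the exact model identifier and the `reasoning_effort` field name depend on the runtime serving the model.

```python
# Sketch of an OpenAI-compatible chat request payload for this model.
# "gpt-oss-120b" and "reasoning_effort" are assumed identifiers; check
# your serving runtime's documentation for the exact names it expects.

def build_chat_request(prompt, effort="low", model="gpt-oss-120b"):
    """Assemble a chat request; effort selects low, medium, or high reasoning."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # hypothetical field name
    }

req = build_chat_request("Summarize the Apache 2.0 license.", effort="high")
```

Higher effort generally trades latency for output quality; the default on this model card is low.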
Custom Fields
Special features defined by the model author
Reasoning Effort: select (default: low)
Controls how much reasoning the model should perform.
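A minimal sketch of how a client might validate this select field before sending a request. The helper name is hypothetical; the allowed values and the default are taken from the model card above.

```python
# Hypothetical validation helper for the Reasoning Effort custom field.
ALLOWED_EFFORTS = ("low", "medium", "high")  # values listed on the model card
DEFAULT_EFFORT = "low"                       # default per the model card

def resolve_reasoning_effort(value=None):
    """Return a valid reasoning-effort value, falling back to the default."""
    if value is None:
        return DEFAULT_EFFORT
    if value not in ALLOWED_EFFORTS:
        raise ValueError(f"reasoning effort must be one of {ALLOWED_EFFORTS}")
    return value
```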