Qwen3.6-27B (MLX 8-bit)

The official Qwen 3.6 27B dense model for Apple Silicon, converted to MLX 8-bit with full vision, thinking, and tool calling support. Near-lossless quality at 8.6 bits per weight. The best quantization you can run locally. Built for 36 GB+ Macs.

What you get

27.8B parameters, all active. Dense model with hybrid linear/full attention (3:1 ratio, 64 layers). Every token uses all weights, unlike MoE models.
Text, image, and video. Native multimodal support via mlx-vlm.
262K context window. Extendable to 1M+ with YaRN.
Thinking toggle and tool calling. Both work via the included chat template.

The chat template

The official Qwen 3.6 Jinja template has four problems that make it unusable in LM Studio:

Tool calls crash. The template uses Python's filter, which does not exist in LM Studio's C++ Jinja runtime.

This model ships with a rewritten template that fixes all four. It also adds a thinking toggle and only emits thinking blocks when they contain actual reasoning content.

Thinking toggle

Drop <|think_on|> or <|think_off|> anywhere in your system or user prompt. The template intercepts the tag, removes it from context so the model never sees it, and flips the thinking mode.

Fast answer, no internal reasoning.

The model thinks step by step, then answers.

Quick start

Download. Open LM Studio and search for froggeric/qwen3.6-27b-mlx-8bit.
System prompt. The first line must be:
The model underperforms without it. Add whatever you want after that line.
Hardware. Apple Silicon Mac with 36 GB or more of unified memory.

Recommended sampling

From the official Qwen authors. Reserve 128K+ context for thinking mode.

Mode	temp	top_p	top_k	repeat_penalty
Thinking (coding)	0.6	0.95	20	1.0
Thinking (general)	1.0	0.95	20	1.0
Non-thinking (general)	0.7	0.8	20	1.0

Specs

Spec	Value
Quantization	8-bit (8.6 bits/weight)
Size	28 GB, 6 shards
Total params	27.8B (dense)
Layers	64 (3x linear attn + 1x full attn)
Context	262K native, 1M+ with YaRN
Vocabulary	248K tokens
model_type	`qwen3_5`

Authorship

Role	Author
Original model	Alibaba Cloud (Qwen team)
MLX 8-bit conversion	froggeric

License

Apache-2.0, inherited from Qwen3.6.

qwen3.6-27b-mlx-8bit