
qwen/qwen3-next-80b · 80B · qwen3_next

An 80B-parameter (3B active) high-sparsity Mixture-of-Experts model with a hybrid attention architecture. Currently supported on Mac only, via MLX.

Tool use

Last Updated: 4 days ago
README

Qwen3 Next 80B

The first model in the Qwen3-Next series, featuring an innovative hybrid attention architecture and a high-efficiency Mixture-of-Experts design.

Key Features

  • Hybrid Attention: Combines Gated DeltaNet and Gated Attention for efficient ultra-long context modeling
  • High-Sparsity MoE: 80B total parameters with only 3B activated, providing excellent efficiency
  • Ultra-Long Context: Supports up to 262,144 tokens natively
  • Multi-Token Prediction: Enhanced pretraining performance and faster inference
  • Advanced Capabilities: Excels at reasoning, coding, creative writing, and agentic tasks
  • Multilingual Support: Over 100 languages and dialects
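The Gated DeltaNet layers in the hybrid stack build on the delta rule for linear attention: instead of a KV cache that grows with context length, each head keeps a fixed-size state matrix that is corrected in place as tokens arrive. A minimal, un-gated delta-rule sketch (illustrative only; the dimensions and the plain update form are assumptions, not Qwen's implementation):

```python
D = 4  # tiny head dimension, for illustration only

def delta_rule_step(S, k, v, beta):
    """One delta-rule update: nudge the state's prediction for key k
    toward value v at rate beta. S is a D x D state matrix (list of rows)."""
    # Prediction the state currently makes for key k: S @ k
    pred = [sum(S[i][j] * k[j] for j in range(D)) for i in range(D)]
    # Rank-1 correction: S += beta * (v - pred) * k^T
    for i in range(D):
        for j in range(D):
            S[i][j] += beta * (v[i] - pred[i]) * k[j]
    return S

def read(S, q):
    """Output for query q is S @ q."""
    return [sum(S[i][j] * q[j] for j in range(D)) for i in range(D)]

S = [[0.0] * D for _ in range(D)]
k = [1.0, 0.0, 0.0, 0.0]
v = [0.0, 2.0, 0.0, 0.0]
delta_rule_step(S, k, v, beta=1.0)
# After a full-rate update, querying the state with k recalls v exactly.
```

Because the state stays D x D regardless of sequence length, per-token cost is constant, which is what makes the 262K-token context practical to run.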

Architecture Highlights

  • 80B total parameters, 3B activated (A3B)
  • 48 layers with hybrid layout
  • 512 experts with only 10 activated per token
  • Context length: 262,144 tokens
  • No thinking mode support (instruct-only)
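The sparsity numbers above mean each token is routed to just 10 of the 512 experts. A generic top-k softmax router sketch of that routing step (function names and the renormalization detail are assumptions, not Qwen3-Next's actual routing code):

```python
import math
import random

NUM_EXPERTS = 512   # total routed experts (from the model card)
TOP_K = 10          # experts activated per token (from the model card)

def route(logits, k=TOP_K):
    """Pick the top-k experts for one token and softmax-renormalize
    their gate weights so the chosen experts' weights sum to 1."""
    topk = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in topk)                    # subtract max for stability
    exps = [math.exp(logits[i] - m) for i in topk]
    total = sum(exps)
    return topk, [e / total for e in exps]

random.seed(0)
router_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
experts, weights = route(router_logits)
# Only 10 of 512 experts process this token, which is why just ~3B of the
# 80B parameters are active per forward pass.
```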

Performance

Delivers performance comparable to much larger models while maintaining exceptional efficiency:

  • Outperforms Qwen3-32B while delivering roughly 10x higher inference throughput on long contexts
  • Matches Qwen3-235B-A22B on many benchmarks at significantly lower computational cost
  • Superior handling of ultra-long contexts up to 256K tokens
Sources

The underlying model files this model uses.

When you download this model, LM Studio picks the source best suited to your machine (you can override this).

Config

Custom configuration options included with this model

  • Temperature: 0.7
  • Top K Sampling: 20
  • Top P Sampling: 0.8
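Top-K and Top-P act as successive filters on the next-token distribution: temperature first scales the logits, then only the 20 most probable tokens are kept, then the smallest prefix of those reaching 80% cumulative probability survives. A minimal sketch of how these two defaults combine (illustrative only, not LM Studio's internal sampler):

```python
def filter_probs(probs, top_k=20, top_p=0.8):
    """Keep the top_k most probable tokens, then the smallest prefix of
    those whose cumulative probability reaches top_p; renormalize the
    survivors. Sketch of the config defaults above, not the real sampler."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break                       # nucleus reached: stop adding tokens
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

# Hypothetical next-token distribution for illustration:
probs = {"the": 0.5, "a": 0.25, "an": 0.15, "this": 0.06, "that": 0.04}
filtered = filter_probs(probs)
# "the" + "a" only reach 0.75 < 0.8, so "an" is included (cum 0.90 >= 0.8);
# the two rarest tokens are cut and the rest renormalized.
```

Lower Top-P makes sampling more conservative; raising Temperature flattens the distribution before these filters apply.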