Description
Qwen3.5 is a reasoning vision-language model that supports tool use. With 35B total parameters and 3B activated, it outperforms previous-generation models more than 6x its size.
README
Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility. It has 35B total parameters and 3B activated, supporting a native context length of 262,144 tokens.
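Since the model is served through chat-style APIs, a quick multimodal request illustrates how the vision-language interface is typically exercised. This is a minimal sketch against an OpenAI-compatible endpoint; the base URL, model identifier, and image URL are illustrative assumptions, not values confirmed by this page.

```python
# Minimal sketch: chat with a locally served Qwen3.5 through an
# OpenAI-compatible API. The base_url, model id, and image URL are
# illustrative assumptions, not values taken from this page.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # hypothetical local server address
    api_key="not-needed-for-local",       # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="qwen3.5",  # hypothetical model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```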
Unified Vision-Language Foundation. Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks.
Efficient Hybrid Architecture. Gated Delta Networks combined with sparse Mixture-of-Experts (256 total experts, 8 routed + 1 shared active) deliver high-throughput inference with minimal latency and cost overhead (see the routing sketch after this list).
Scalable RL Generalization. Reinforcement learning is scaled across million-agent environments with progressively complex task distributions, yielding robust real-world adaptability.
Global Linguistic Coverage. Support now spans 201 languages and dialects, enabling inclusive, worldwide deployment with nuanced cultural and regional understanding.
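The expert counts quoted above fit the standard sparse-MoE pattern: a router picks a few experts per token while a shared expert always runs. The PyTorch sketch below mirrors those counts (256 experts, top-8 routed + 1 shared active), but the softmax-over-top-k routing is a common-pattern assumption, not Qwen3.5's actual router internals.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    """Top-k expert routing with one always-active shared expert.

    Expert counts mirror the figures above; the routing scheme itself
    is an assumed common design, not Qwen3.5's documented internals.
    """

    def __init__(self, dim=64, n_experts=256, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.shared = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        # x: (tokens, dim). Select the top-k experts for each token.
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over selected experts only
        out = self.shared(x)  # the shared expert processes every token
        for k in range(self.top_k):
            for e in idx[:, k].unique().tolist():  # batch tokens routed to expert e
                mask = idx[:, k] == e
                out[mask] = out[mask] + weights[mask, k].unsqueeze(1) * self.experts[e](x[mask])
        return out


tokens = torch.randn(4, 64)
print(SparseMoE()(tokens).shape)  # torch.Size([4, 64])
```

Only the shared expert plus 8 of the 256 routed experts execute for any given token, which is how a model can hold 35B parameters while activating roughly 3B per forward pass.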
Custom Fields
Special features defined by the model author
Enable Thinking: boolean (default=true)
Controls whether the model will think before replying
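Custom fields like this one usually surface as extra request parameters. A minimal sketch of toggling it via the openai client's extra_body passthrough; the key name enable_thinking and whether a given server honors it are assumptions inferred from the label above, not a documented contract.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="local")

# Assumed mapping of the "Enable Thinking" custom field onto a request
# parameter; the exact key and server-side support are not confirmed here.
response = client.chat.completions.create(
    model="qwen3.5",  # hypothetical model id, as above
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    extra_body={"enable_thinking": False},  # skip the reasoning phase before replying
)
print(response.choices[0].message.content)
```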