Description
The 8B version of Qwen's latest vision-language model. Includes comprehensive upgrades to visual perception, spatial reasoning, and image understanding.
Stats
136.4K Downloads
14 stars
Capabilities
Minimum system memory
Tags
Last updated
Updated 28 days ago
README
The latest-generation vision-language model in the Qwen series, with comprehensive upgrades to visual perception, spatial reasoning, and video understanding.
Delivers strong vision-language performance across diverse tasks, including document analysis, visual question answering, video understanding, and agentic interactions.
Parameters
Custom configuration options included with this model
Sources
The underlying model files this model uses