GLM 4.6V Flash is a 9B vision-language model optimized for local deployment and low-latency applications. It supports a context length of 128k tokens and achieves strong performance in visual understanding among models of similar scale.
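The sketch below shows one way to query a locally served instance through an OpenAI-compatible chat endpoint, sending an image alongside a text prompt. The base URL, port, and model identifier are assumptions; substitute whatever your local runtime (for example an MLX or GGUF server) actually exposes.

```python
# Minimal sketch: querying a locally served GLM 4.6V Flash instance through an
# OpenAI-compatible endpoint. The base URL and model identifier are assumptions --
# use the values your local server actually exposes.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Encode a local screenshot as a base64 data URL for the image part of the prompt.
with open("screenshot.png", "rb") as f:
    image_url = "data:image/png;base64," + base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="glm-4.6v-flash",  # assumed model id; check your server's model list
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Summarize the content of this screenshot."},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```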
The model introduces native multimodal function calling, enabling vision-driven tool use where images, screenshots, and document pages can be passed directly as tool inputs without text conversion.
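The following is a hedged sketch of what vision-driven tool use could look like over the same OpenAI-compatible interface: a document page is attached to the user turn, and a tool is declared whose arguments the model fills in from what it sees in the image. The tool name `crop_and_ocr`, its argument schema, and the endpoint details are illustrative assumptions, not part of the model's documented API.

```python
# Hedged sketch of vision-driven tool use: an image is supplied directly in the
# user turn, and the model grounds its tool-call arguments in that image.
# The tool, its schema, the model id, and the server URL are illustrative assumptions.
import base64
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

with open("invoice_page.png", "rb") as f:
    image_url = "data:image/png;base64," + base64.b64encode(f.read()).decode()

tools = [
    {
        "type": "function",
        "function": {
            "name": "crop_and_ocr",  # hypothetical tool, for illustration only
            "description": "Run OCR on a rectangular region of the provided image.",
            "parameters": {
                "type": "object",
                "properties": {
                    "x": {"type": "integer"},
                    "y": {"type": "integer"},
                    "width": {"type": "integer"},
                    "height": {"type": "integer"},
                },
                "required": ["x", "y", "width", "height"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="glm-4.6v-flash",  # assumed model id
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Extract the invoice number from this page."},
            ],
        }
    ],
    tools=tools,
)

# If the model decides a tool is needed, it returns structured arguments
# derived from the visual content rather than from a text transcription.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```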