Forked from zai-org/glm-4.6v-flash
Updated on January 12
Sources: GGUF, MLX
GLM 4.6V Flash is a 9B vision-language model optimized for local deployment and low-latency applications. It supports a context length of 128k tokens and achieves strong performance in visual understanding among models of similar scale.
The model introduces native multimodal function calling, enabling vision-driven tool use where images, screenshots, and document pages can be passed directly as tool inputs without text conversion.
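As a sketch of what vision-driven tool use could look like, the snippet below builds an OpenAI-style chat request in which an image is placed directly in the message content alongside a function (tool) schema. The model name, tool name, and file path are illustrative assumptions, not part of any documented GLM 4.6V Flash API; the point is only that the image itself is a first-class input, with no OCR or text-conversion step beforehand.

```python
import json

# Hypothetical tool schema for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "extract_invoice_fields",
        "description": "Pull structured fields from an invoice image.",
        "parameters": {
            "type": "object",
            "properties": {
                "fields": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["fields"],
        },
    },
}]

# The image goes straight into the message content; no text conversion first.
messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "file:///tmp/invoice.png"}},
        {"type": "text", "text": "Extract the total and the due date."},
    ],
}]

# Assumed model identifier for a local OpenAI-compatible endpoint.
payload = {"model": "glm-4.6v-flash", "messages": messages, "tools": tools}
print(json.dumps(payload, indent=2))
```

In a real deployment this payload would be POSTed to whatever OpenAI-compatible chat endpoint the local runtime exposes, and the model could respond with a `tool_calls` entry grounded in the image content.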