Forked from zai-org/glm-4.6v-flash
Updated on January 12
Sources: GGUF, MLX
GLM 4.6V Flash is a 9B vision-language model optimized for local deployment and low-latency applications. It supports a context length of 128k tokens and achieves strong performance in visual understanding among models of similar scale.
The model introduces native multimodal function calling, enabling vision-driven tool use where images, screenshots, and document pages can be passed directly as tool inputs without text conversion.
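As a sketch of what vision-driven tool use could look like, the snippet below builds an OpenAI-style chat request in which an image is placed directly in the message content alongside a function (tool) schema. The model name, tool name, and file path are illustrative assumptions, not part of any documented GLM 4.6V Flash API; the point is only that the image itself is a first-class input, with no OCR or text-conversion step beforehand.

```python
import json

# Hypothetical tool schema for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "extract_invoice_fields",
        "description": "Pull structured fields from an invoice image.",
        "parameters": {
            "type": "object",
            "properties": {
                "fields": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["fields"],
        },
    },
}]

# The image goes straight into the message content; no text conversion first.
messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "file:///tmp/invoice.png"}},
        {"type": "text", "text": "Extract the total and the due date."},
    ],
}]

# Assumed model identifier for a local OpenAI-compatible endpoint.
payload = {"model": "glm-4.6v-flash", "messages": messages, "tools": tools}
print(json.dumps(payload, indent=2))
```

In a real deployment this payload would be POSTed to whatever OpenAI-compatible chat endpoint the local runtime exposes, and the model could respond with a `tool_calls` entry grounded in the image content.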