README
An LM Studio plugin that provides access to Google Gemini models, including native image generation, within the LM Studio chat interface.
Requires a Google AI Studio API key. No Google Cloud project, service account, or IAM configuration is required.
LM Studio currently gates image attachments on whether a vision-capable model is loaded on the client side. Since this plugin registers as a generator rather than a local model, LM Studio does not recognise its vision capabilities and blocks image uploads.

The workaround is called vision-capability-priming: download the helper model qwen/qwen3-vl-4b to your local machine. The plugin detects its presence and uses it to signal vision support to the LM Studio client. The model is not used for inference and does not require GPU resources.
This is tracked upstream as lmstudio-ai/lmstudio-js#459.
| Model | Capabilities | Typical use |
|---|---|---|
| gemini-3-pro-image-preview | Multimodal generation (text + images), up to 14 reference images, native editing | Image creation, iterative refinement, design exploration |
| gemini-3.1-pro-preview | Text reasoning, configurable thinking depth (Low / Medium / High), tool use | Analysis, coding, research, multi-step tasks |
Model names follow Google's preview naming convention. See the Gemini model documentation for current availability.
Most chat interfaces discard image context after a single turn. Vision Promotion addresses this by automatically re-injecting relevant images — both user-attached and model-generated — into subsequent requests. This enables the model to reason about, compare, or edit images across multiple turns.
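The re-injection step can be sketched roughly as follows. This is an illustrative sketch only; the type and function names (`ImageRef`, `ChatTurn`, `promoteImages`) are hypothetical, not the plugin's actual API.

```typescript
// Hypothetical types for the sketch; not the plugin's real data model.
interface ImageRef {
  id: string;
  source: "user" | "model";
}

interface ChatTurn {
  role: "user" | "assistant";
  text: string;
  images: ImageRef[];
}

// Gather images from earlier turns (user-attached and model-generated
// alike), dedupe them, and re-inject them into the outgoing turn so the
// model keeps its visual context across turns.
function promoteImages(history: ChatTurn[], outgoing: ChatTurn): ChatTurn {
  const seen = new Set(outgoing.images.map((img) => img.id));
  const promoted: ImageRef[] = [];
  for (const turn of history) {
    for (const img of turn.images) {
      if (!seen.has(img.id)) {
        seen.add(img.id);
        promoted.push(img);
      }
    }
  }
  return { ...outgoing, images: [...promoted, ...outgoing.images] };
}
```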
Two transport modes are available, toggled per chat via Use Files for Vision:
| Mode | Mechanism | Trade-off |
|---|---|---|
| Base64 | Images embedded inline as JPEG previews | Simpler; higher per-request payload |
| Files API | Images uploaded once to Google's temporary storage, referenced by URI | Lower bandwidth; better for longer sessions |
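The two modes map onto the two image part shapes in the Gemini REST API: `inlineData` carries base64 bytes with every request, while `fileData` references a previously uploaded file by URI. A minimal sketch of the toggle, with a hypothetical `toImagePart` helper:

```typescript
// Gemini-style part shapes: inlineData for Base64 mode, fileData for
// Files API mode. The helper below is illustrative, not plugin code.
type GeminiPart =
  | { inlineData: { mimeType: string; data: string } }
  | { fileData: { mimeType: string; fileUri: string } };

function toImagePart(
  image: { mimeType: string; base64?: string; fileUri?: string },
  useFilesApi: boolean
): GeminiPart {
  if (useFilesApi && image.fileUri) {
    // Files API: bytes were uploaded once; later requests carry only the URI.
    return { fileData: { mimeType: image.mimeType, fileUri: image.fileUri } };
  }
  // Base64: the full JPEG preview rides along with every request.
  return { inlineData: { mimeType: image.mimeType, data: image.base64 ?? "" } };
}
```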
| Setting | Description | Default |
|---|---|---|
| Model | Gemini model to use | gemini-3-pro-image-preview |
| Thinking Level | Reasoning depth for Gemini 3.1 Pro | Low |
| Show Only Last Image Variant | Suppress intermediate images, display only the final variant | On |
| Use Files for Vision | Route image promotion through Google Files API | On |
| Setting | Description | Default |
|---|---|---|
| Vision Promotion: Persistent | Re-inject images every turn rather than only when new | Off |
| Google AI Studio API Key | API key (AIza...) | — |
| Setting | Description |
|---|---|
| Log Chunks | Verbose streaming and tool-call event logging |
| Log Requests | Full request/response JSON (may expose sensitive data) |
The plugin uses a strategy pattern to route generation to the appropriate handler:
- gemini-3.1-pro-preview: streaming generation with configurable thinking levels and thought-signature persistence across turns.
- gemini-3-pro-image-preview: image generation pipeline with native multimodal output, variant tracking, and edit support.

Both strategies share a common base for message conversion, tool handling, and vision promotion.
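The routing can be pictured as below. The interface and class names are made up for illustration; only the model names come from this README.

```typescript
// Hypothetical strategy interface; real strategies would stream tokens
// or images rather than return a string.
interface GenerationStrategy {
  generate(prompt: string): string;
}

class TextStrategy implements GenerationStrategy {
  generate(prompt: string): string {
    return `text:${prompt}`; // streaming text generation would go here
  }
}

class ImageStrategy implements GenerationStrategy {
  generate(prompt: string): string {
    return `image:${prompt}`; // multimodal image pipeline would go here
  }
}

// Route by model name: image-capable models get the image pipeline.
function selectStrategy(model: string): GenerationStrategy {
  return model.includes("image") ? new ImageStrategy() : new TextStrategy();
}
```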
Thought signatures — opaque tokens required by Gemini for multi-turn reasoning context — are persisted and replayed automatically. When thinking and tool calls co-occur, the plugin restructures conversation history to satisfy API constraints.
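Persist-and-replay can be sketched as follows, assuming a Gemini-style `thoughtSignature` field on response parts; the store shape and function names are hypothetical.

```typescript
// Minimal part shape for the sketch; real parts carry more fields.
interface Part {
  text?: string;
  thoughtSignature?: string;
}

// Capture signatures from a model response, keyed by turn index.
function captureSignatures(parts: Part[], store: Map<number, string>, turn: number): void {
  for (const part of parts) {
    if (part.thoughtSignature) {
      store.set(turn, part.thoughtSignature);
    }
  }
}

// Re-attach the stored signature when that turn is replayed in history.
function replaySignature(parts: Part[], store: Map<number, string>, turn: number): Part[] {
  const sig = store.get(turn);
  if (!sig || parts.length === 0) return parts;
  return [{ ...parts[0], thoughtSignature: sig }, ...parts.slice(1)];
}
```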
Based on the LM Studio openai-compat-endpoint plugin template.
MIT
```shell
npm install      # Install dependencies
npm run build    # Build
npm run rebuild  # Clean build
npm run dev      # Dev server
```