All-in-one LM Studio plugin: 50+ local tools, persistent memory, sub-agent delegation, design systems, and media analysis. Modular architecture with smart output caps on all tools, near-duplicate memory detection (TF-IDF + Jaccard), HTML asset auditing, tiered memory injection (L0 identity → L1 essential → L2 contextual), 58 pre-cached design system references, pattern-matched delegation hints (EN/PT), and auto hints that guide local models to use efficient editing patterns. Optimized for models with limited context (4K-20K tokens).
An all-in-one plugin for LM Studio that merges powerful local tools, persistent memory, sub-agent delegation, and media analysis into a single unified experience.
Maestro combines and extends the best community plugins into one cohesive package, optimized for local models with limited context windows (4K-20K tokens).
Document parsing — read PDFs and DOCX files as text
Git Integration
git_status, git_diff, git_commit, git_log — all built in
Web & Research
Web search via DuckDuckGo API — lightweight, no browser dependencies
Web scraping — fetch_web_content with smart text extraction (strips scripts, nav, ads)
Wikipedia — quick article summaries in any language
Code Execution (opt-in, off by default)
Python — run scripts via system Python
JavaScript/TypeScript — run scripts via Deno
Shell commands — execute_command for any CLI tool
Terminal — open visible terminal windows
Test runner — run_test_command with 2-minute timeout
Persistent Memory (SQLite + TF-IDF)
Remember/Recall — store and retrieve facts, preferences, and notes across conversations
Tiered auto-injection — L0 (identity), L1 (essential), L2 (query-relevant) memories injected with character budgets
Identity onboarding — on first conversation, naturally asks the user's name, role, and language
AI fact extraction — automatically extracts durable facts from your conversations
Near-duplicate detection — TF-IDF + Jaccard similarity blocks >80% similar memories before saving
Conflict detection — detects contradictions between new and existing memories
Project scoping — memories can be scoped to specific projects
Category filtering — filter which memory categories get auto-injected
Decay system — configurable half-life so old, unused memories fade naturally
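The near-duplicate check listed above can be sketched roughly as follows. This is a minimal illustration, not Maestro's actual code: the function names, the tokenizer, the smoothed IDF formula, and the choice to block when either score exceeds the threshold are all assumptions. Only the TF-IDF + Jaccard pairing and the 80% threshold come from the feature list.

```typescript
// Hypothetical sketch: every name here is illustrative, not Maestro's API.

/** Lowercase the text and split it into alphanumeric word tokens. */
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

/** Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|. */
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  setA.forEach((t) => { if (setB.has(t)) shared++; });
  return shared / (setA.size + setB.size - shared);
}

/** TF-IDF weights for one document (smoothed IDF; the exact formula is an assumption). */
function tfidfVector(tokens: string[], df: Map<string, number>, docCount: number): Map<string, number> {
  const counts = new Map<string, number>();
  tokens.forEach((t) => counts.set(t, (counts.get(t) ?? 0) + 1));
  const vec = new Map<string, number>();
  counts.forEach((tf, t) => {
    const idf = Math.log((1 + docCount) / (1 + (df.get(t) ?? 0))) + 1;
    vec.set(t, tf * idf);
  });
  return vec;
}

/** Cosine similarity between two sparse weight vectors. */
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((w, t) => { dot += w * (b.get(t) ?? 0); normA += w * w; });
  b.forEach((w) => { normB += w * w; });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

/** Return the first stored memory that is too similar to `candidate`, or null if none is. */
function findNearDuplicate(candidate: string, existing: string[], threshold = 0.8): string | null {
  const docs = existing.concat([candidate]).map(tokenize);
  // Document frequency: in how many documents each token appears.
  const df = new Map<string, number>();
  docs.forEach((doc) => new Set(doc).forEach((t) => df.set(t, (df.get(t) ?? 0) + 1)));
  const candTokens = docs[docs.length - 1];
  const candVec = tfidfVector(candTokens, df, docs.length);
  for (let i = 0; i < existing.length; i++) {
    // Assumption: either measure exceeding the threshold is enough to block the save.
    const score = Math.max(
      jaccard(candTokens, docs[i]),
      cosine(candVec, tfidfVector(docs[i], df, docs.length)),
    );
    if (score > threshold) return existing[i];
  }
  return null;
}
```

A caller would run findNearDuplicate before writing to the store and skip the save when it returns a match. Jaccard catches re-worded copies that share most of their vocabulary, while the TF-IDF cosine gives more weight to rare, distinctive terms.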
Sub-Agent Delegation
Secondary model support — delegate auxiliary tasks (summarization, research, review) to a lighter model
Auto-detection — automatically discovers a second loaded model via LM Link
Configurable permissions — toggle file system, web, code, and memory access per sub-agent
Auto-save — code generated by sub-agents is automatically saved to files
Auto-debug — optional reviewer pass checks generated code for errors
Custom profiles — define agent personas (summarizer, coder, reviewer, etc.)
Design Systems
58 pre-built references — Tailwind, Shadcn, MUI, Chakra, Bootstrap, Ant Design, and more
Pre-download cache — predownload_design_systems fetches all DESIGN.md files to ~/.maestro-toolbox/design-systems/
3-tier lookup — memory → disk cache → GitHub raw fetch
Framework-aware — guides the model to use correct component patterns and class names
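The 3-tier lookup (memory, then disk cache, then GitHub raw fetch) can be sketched as a chain of tiers queried in order. This is a sketch under stated assumptions: CacheTier, mapTier, and lookupDesignSystem are hypothetical names, the tiers are plain Maps standing in for the real disk and network access, and back-filling faster tiers after a hit is an assumed behavior that the feature list does not confirm.

```typescript
// Sketch only: CacheTier, mapTier, and lookupDesignSystem are hypothetical names.
interface CacheTier {
  label: string;
  read: (name: string) => Promise<string | undefined>;
  write?: (name: string, content: string) => Promise<void>;
}

/** Wrap a Map as a tier; stands in here for the in-memory and on-disk caches. */
function mapTier(label: string, store: Map<string, string>): CacheTier {
  return {
    label,
    read: async (name) => store.get(name),
    write: async (name, content) => { store.set(name, content); },
  };
}

/** Query tiers from fastest to slowest and report which one answered. */
async function lookupDesignSystem(
  name: string,
  tiers: CacheTier[],
): Promise<{ content: string; source: string } | undefined> {
  for (let i = 0; i < tiers.length; i++) {
    const content = await tiers[i].read(name);
    if (content === undefined) continue;
    // Assumption: copy the hit into every faster tier so the next lookup is cheaper.
    for (let j = 0; j < i; j++) {
      const write = tiers[j].write;
      if (write) await write(name, content);
    }
    return { content, source: tiers[i].label };
  }
  return undefined;
}
```

In a real plugin, the second tier would read from ~/.maestro-toolbox/design-systems/ and the third would perform the network fetch. Because each tier is just a read function, those can be swapped in without changing the lookup loop.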
Media Analysis
Image analysis — analyze_image resizes and compresses local images for vision model analysis (JPEG, PNG, WebP, GIF, BMP, TIFF)
Video analysis — analyze_video extracts evenly-spaced frames from local videos (MP4, MOV, AVI, MKV, WebM). Requires ffmpeg
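To illustrate what "evenly-spaced frames" can mean in practice, here is a small sketch that computes sample timestamps and builds an ffmpeg argument list for one frame. The function names, the midpoint sampling strategy, and the output file naming are assumptions for illustration. The ffmpeg options used (-y, -ss, -i, -frames:v) are standard ones, but how analyze_video actually invokes ffmpeg is not documented here.

```typescript
// Hypothetical helpers; not taken from Maestro's source.

/**
 * Split the video into `frameCount` equal segments and sample the midpoint
 * of each one, which avoids the very first and very last frame.
 */
function frameTimestamps(durationSec: number, frameCount: number): number[] {
  if (durationSec <= 0 || frameCount <= 0) return [];
  const step = durationSec / frameCount;
  const out: number[] = [];
  for (let i = 0; i < frameCount; i++) {
    out.push((i + 0.5) * step);
  }
  return out;
}

/** Build ffmpeg arguments that seek to `timeSec` and write a single frame. */
function ffmpegFrameArgs(input: string, timeSec: number, output: string): string[] {
  return ["-y", "-ss", timeSec.toFixed(3), "-i", input, "-frames:v", "1", output];
}
```

For a 60-second clip and 4 frames, frameTimestamps(60, 4) yields 7.5, 22.5, 37.5, and 52.5 seconds. Each timestamp would then be passed to a process spawner together with ffmpegFrameArgs.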
System & Utilities
Clipboard read/write, system info, OS notifications
Smart HTML preview — preview_html opens in browser + returns structural analysis (sections, duplicate images, missing assets, overflow warnings)
Optimizations for Local Models
Maestro is designed for context-constrained local models:
Output caps — all tool outputs truncated at safe limits (fetch_web_content 6K, git_diff 8K, execute_command/run_python/run_javascript/run_test_command 4K, read_file 6K)
Dynamic tool docs — only documents enabled tools, saving 200-400 tokens per conversation
Delegation hints — pattern-matched suggestions injected per turn (EN/PT triggers for summarize, research, review, etc.)
Tiered memory injection — L0/L1/L2 layers with per-tier character budgets and configurable count (1-15 memories)
Video byte budget — 600KB total with auto-recompression
Smart caching — secondary model detection and state persistence cached in memory
Auto hints — one system hint per turn (token budget warning > replace_text suggestion > memory reminder), never stacked
Large file tracking — suggests replace_text_in_file after saving files >10KB to avoid full rewrites
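The tiered memory injection described in this list can be sketched as a selection pass with one character budget per tier and a global cap on the number of memories. This is only an illustration: the MemoryTier, StoredMemory, and InjectionConfig types, the score field, and the rule of skipping a memory that does not fit are assumptions. The L0/L1/L2 ordering and the 1-15 count range come from the description above.

```typescript
// Hypothetical types and names, for illustration only.
type MemoryTier = "L0" | "L1" | "L2";

interface StoredMemory {
  tier: MemoryTier;
  text: string;
  score: number; // relevance or importance; higher is injected first
}

interface InjectionConfig {
  budgets: Record<MemoryTier, number>; // character budget per tier
  maxCount: number; // total memories to inject (clamped to 1-15)
}

/** Pick memories tier by tier, respecting each tier's budget and the global count. */
function selectMemories(memories: StoredMemory[], config: InjectionConfig): StoredMemory[] {
  const maxCount = Math.min(15, Math.max(1, config.maxCount));
  const order: MemoryTier[] = ["L0", "L1", "L2"];
  const picked: StoredMemory[] = [];
  for (const tier of order) {
    let remaining = config.budgets[tier];
    const candidates = memories
      .filter((m) => m.tier === tier)
      .sort((a, b) => b.score - a.score);
    for (const m of candidates) {
      if (picked.length >= maxCount) return picked;
      // Assumption: a memory that does not fit is skipped rather than truncated.
      if (m.text.length > remaining) continue;
      picked.push(m);
      remaining -= m.text.length;
    }
  }
  return picked;
}
```

Processing L0 first means identity facts are never crowded out by query-relevant L2 memories, which matters most when the whole prompt has to fit in a 4K-20K token window.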
Architecture
Maestro uses a modular architecture: the tools provider is split into focused modules (fileTools, codeTools, gitTools, webTools, systemTools, secondaryAgent) that a thin orchestrator composes. All modules share one mutable ToolContext, so a state change made by one tool (such as change_directory) is immediately visible to every other module.
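The shared-context idea can be sketched as follows. The shapes of ToolContext, Tool, and ToolModule, the createToolsProvider function, and the placement of change_directory in systemTools are all assumptions made for illustration. The tool bodies are stubs that only show how a mutation in one module is observed by another.

```typescript
// Illustrative only: these interfaces are assumptions, not Maestro's real types.
interface ToolContext {
  cwd: string;
}

interface Tool {
  name: string;
  run: (args: Record<string, string>) => string;
}

/** A module is a factory that receives the shared context and returns its tools. */
type ToolModule = (ctx: ToolContext) => Tool[];

const systemTools: ToolModule = (ctx) => [
  {
    name: "change_directory",
    run: (args) => {
      ctx.cwd = args.path; // mutate the shared object in place
      return `cwd is now ${ctx.cwd}`;
    },
  },
];

const fileTools: ToolModule = (ctx) => [
  {
    name: "read_file",
    // Stub: reports which path would be read, resolved against the current cwd.
    run: (args) => `would read ${ctx.cwd}/${args.path}`,
  },
];

/** Thin orchestrator: hand the same context object to every module and index the tools by name. */
function createToolsProvider(modules: ToolModule[], ctx: ToolContext): Map<string, Tool> {
  const tools = new Map<string, Tool>();
  modules.forEach((mod) => mod(ctx).forEach((tool) => tools.set(tool.name, tool)));
  return tools;
}
```

Because every module closes over the same object rather than a copy, calling change_directory and then read_file resolves the file against the new directory with no extra synchronization step.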
Configuration
All settings are organized by section in the LM Studio plugin settings UI:
persistent-memory by @dirty-data (forked from @khtslv) — the persistent memory system with SQLite storage, TF-IDF retrieval, and AI fact extraction. 789+ downloads.
Media analysis tools (image and video) and all optimizations/new features are original additions.
Requirements
LM Studio 0.3.15+
Node.js 20+ (for plugin runtime)
ffmpeg (optional, for video/image analysis — brew install ffmpeg on macOS)