CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
LM Studio plugin that exposes four web-oriented tools to local LLMs (Web Search, Image Search, Visit Website, and Fetch Images), built on @lmstudio/sdk. Descended from Daniel Sig's original lms-plugin-duckduckgo and lms-plugin-visit-website plugins, merged and extended by Nigel Packer.
- npm run dev – run the plugin in LM Studio dev mode (lms dev)
- npm run push – publish to LM Studio Hub (lms push)
- npm run lint / npm run lint:fix – ESLint on src/**/*.ts
- npm run format / npm run format:check – Prettier
- npm run knip – dead-code / unused-export check

A local pre-commit hook at .git/hooks/pre-commit runs lint, format:check, and knip sequentially and aborts the commit on any failure. The hook is not committed to the repo, so fresh clones need to reinstall it. Bypass with git commit --no-verify when necessary.
No test suite is configured. TypeScript targets ES2023 / CommonJS. Requires Node >= 22 (fetch + AbortSignal.any).
Entry point src/index.ts registers a config schematic and a tools provider with the LM Studio SDK.
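That registration likely follows the SDK's conventional plugin entry shape. A minimal sketch, assuming the standard main(context) signature; the import paths are illustrative stand-ins for the repo's actual modules:

```ts
import { type PluginContext } from "@lmstudio/sdk";
// Illustrative import paths; the real modules are src/config/config-schematics.ts
// and src/tools-provider.ts.
import { configSchematics } from "./config/config-schematics";
import { toolsProvider } from "./tools-provider";

export async function main(context: PluginContext) {
  // Register the plugin UI config fields and the provider that constructs
  // the four tools (plus their shared caches and rate limiters) once.
  context.withConfigSchematics(configSchematics);
  context.withToolsProvider(toolsProvider);
}
```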
src/parsers/page/page-text.ts feeds raw HTML to @mozilla/readability to strip boilerplate (nav, sidebars, comments), then routes Readability's .content HTML through either:

- turndown (the markdown path): ATX headings, - bullets, fenced code, inline links, inline images; script/style/noscript/template elements are stripped before conversion.
- html-to-text (the plain-text path), with token-conservative options: word wrapping disabled, anchor URLs dropped (only inner text kept), <img>/<noscript>/<template> skipped, headings and table headers left in source case rather than uppercased, and list items prefixed with "- ".
Both paths share src/text/normalize-blank-lines.ts for trailing-whitespace and blank-line collapsing. Both contentFormat (plugin select field, default "markdown") and contentLimit are plugin-only: neither is exposed as a tool parameter, so the model cannot override the user-set or default values. The tool still returns contentLength, the pre-truncation character count, so the model can detect truncation and refine with findInPage.
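A hedged sketch of how the two conversion paths and the shared normalization step could compose. The function names and option objects are assumptions modeled on the behavior described above, not the repo's actual code (which lives in src/parsers/page/page-text.ts and src/text/normalize-blank-lines.ts):

```ts
import TurndownService from "turndown";
import { convert } from "html-to-text";

const turndown = new TurndownService({
  headingStyle: "atx", // ATX headings
  bulletListMarker: "-", // - bullets
  codeBlockStyle: "fenced", // fenced code
});
turndown.remove(["script", "style", "noscript", "template"]); // strip before conversion

function normalizeBlankLines(text: string): string {
  return text
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, "")) // trim trailing whitespace
    .join("\n")
    .replace(/\n{3,}/g, "\n\n"); // collapse runs of blank lines
}

function toContent(readableHtml: string, contentFormat: "markdown" | "text"): string {
  const raw =
    contentFormat === "markdown"
      ? turndown.turndown(readableHtml)
      : convert(readableHtml, {
          wordwrap: false, // word wrapping disabled
          selectors: [
            { selector: "a", options: { ignoreHref: true } }, // keep inner text, drop URLs
            { selector: "img", format: "skip" },
          ],
        });
  return normalizeBlankLines(raw);
}
```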
src/parsers/page/page-images.ts scrapes <img> tags in document order, resolves relative src against the page URL, deduplicates, and returns { src, alt, title } tuples (up to maxImages). Fetch Images then downloads each via downloadImages and returns { filename, alt, title, image } on success or { filename, alt, title, error } on failure. filename is derived from the URL's last path segment via src/fs/url-filename.ts.
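A minimal sketch of that scraping pass with jsdom. scrapePageImages is a hypothetical name; the real implementation is src/parsers/page/page-images.ts:

```ts
import { JSDOM } from "jsdom";

interface PageImage {
  src: string;
  alt: string;
  title: string;
}

function scrapePageImages(html: string, pageUrl: string, maxImages: number): PageImage[] {
  const { document } = new JSDOM(html).window;
  const seen = new Set<string>();
  const images: PageImage[] = [];
  for (const img of document.querySelectorAll("img")) {
    const rawSrc = img.getAttribute("src");
    if (rawSrc === null || rawSrc === "") continue;
    let resolved: string;
    try {
      resolved = new URL(rawSrc, pageUrl).href; // resolve relative src against the page URL
    } catch {
      continue; // skip unparsable URLs
    }
    if (seen.has(resolved)) continue; // deduplicate by resolved URL
    seen.add(resolved);
    images.push({
      src: resolved,
      alt: img.getAttribute("alt") ?? "",
      title: img.getAttribute("title") ?? "",
    });
    if (images.length >= maxImages) break; // document order, capped at maxImages
  }
  return images;
}
```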
Three disk-backed TTL caches (via cacache) are constructed once in toolsProvider and shared across tools. They persist across plugin reloads; clearing requires removing the cacache directory at ~/.lmstudio/plugin-data/lms-plugin-duckduckgo-cache:
- Web search (subdir search-enriched) – up to 100 entries, gated by searchCacheTtlSeconds (default 15 min). Stores the post-enrichment payload, so warm queries skip both the DuckDuckGo fetch and the per-result fan-out. The legacy search subdir from before enrichment landed is orphaned; it can be deleted by hand alongside the rest of the cacache directory.
- Image search (subdir image-search) – up to 100 entries, also gated by searchCacheTtlSeconds. Stores { results, count } keyed by query, safe-search, and page so repeat image searches skip the Bing fetch and the shared rate limiter. Kept on a separate cache instance from web search so the two flows do not compete for eviction slots and the payload shapes stay typed independently.
- Website – gated by websiteCacheTtlSeconds (default 10 min). Shared by Visit Website, Fetch Images, and the web-search enrichment pass; a result that lands here once is cheap to revisit by any of the three flows.

Cache sizes and subdirs are defined in src/tools-provider.ts; the TTLCache implementation is in src/cache/ttl-cache.ts. TTL defaults live in src/config/resolve-config.ts. Fetch Images downloads are not cached because they land in chat-scoped working directories.
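A sketch of the disk-backed TTL pattern over cacache, under the assumption that expiry is checked on read via stored metadata; the real implementation in src/cache/ttl-cache.ts may differ:

```ts
import cacache from "cacache";

class TTLCache<T> {
  constructor(
    private readonly cachePath: string, // cacache directory (one subdir per cache)
    private readonly ttlSeconds: number,
  ) {}

  async get(key: string): Promise<T | undefined> {
    try {
      const { data, metadata } = await cacache.get(this.cachePath, key);
      const storedAt = (metadata as { storedAt: number }).storedAt;
      if (Date.now() - storedAt > this.ttlSeconds * 1000) {
        await cacache.rm.entry(this.cachePath, key); // expired: evict and miss
        return undefined;
      }
      return JSON.parse(data.toString("utf8")) as T;
    } catch {
      return undefined; // cacache rejects with ENOENT on a cold key
    }
  }

  async set(key: string, value: T): Promise<void> {
    await cacache.put(this.cachePath, key, JSON.stringify(value), {
      metadata: { storedAt: Date.now() }, // timestamp rides along as entry metadata
    });
  }
}
```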
Bing renders each image tile as <a class="iusc" m="<JSON>">. The JSON blob carries murl (full image URL), purl (source page URL), and t (title), among others; only those fields are typed in src/bing/parse-results.ts. jsdom returns the m attribute already entity-decoded, so the parser feeds it straight to JSON.parse; malformed tiles are swallowed individually rather than aborting the result list, and tiles whose murl doesn't end in a supported image extension are filtered out. Bing returns roughly 35 tiles per HTML page; pagination advances by 35 via the first query parameter. imageMaxResults slices the parsed list; its slider tops out at 35 because that's Bing's natural page size.
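A simplified sketch of that tile parse (the real parser is src/bing/parse-results.ts; this version omits the extension filter and pagination):

```ts
import { JSDOM } from "jsdom";

interface BingTile {
  murl: string; // full-resolution image URL
  purl?: string; // source page URL
  t?: string; // title
}

function parseBingImageTiles(html: string): BingTile[] {
  const { document } = new JSDOM(html).window;
  const tiles: BingTile[] = [];
  for (const anchor of document.querySelectorAll("a.iusc")) {
    const m = anchor.getAttribute("m"); // jsdom hands this back already entity-decoded
    if (m === null) continue;
    try {
      const parsed = JSON.parse(m) as BingTile;
      if (typeof parsed.murl === "string") tiles.push(parsed);
    } catch {
      // malformed tiles are swallowed individually rather than aborting the list
    }
  }
  return tiles;
}
```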
DuckDuckGo uses non-obvious p param values: strict → "1", moderate → "" (empty), off → "-1". Centralized in src/duckduckgo/safe-search.ts. Bing accepts the literal mode strings (strict/moderate/off) on its adlt param, so src/bing/build-urls.ts just passes the SafeSearch value through.
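The mapping is small enough to show inline; a sketch consistent with the description above (the canonical version is src/duckduckgo/safe-search.ts):

```ts
type SafeSearch = "strict" | "moderate" | "off";

// DuckDuckGo's /html/ endpoint encodes safe search on its p param.
const DDG_SAFE_SEARCH_PARAM: Record<SafeSearch, string> = {
  strict: "1",
  moderate: "",
  off: "-1",
};

// Bing accepts the mode string verbatim on adlt, so no mapping is needed there.
const bingAdltParam = (mode: SafeSearch): string => mode;
```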
Two error hierarchies are load-bearing:
- FetchError (src/http/fetch-error.ts) – HTTP/network failures; carries url and an optional cause.
- NoResultsError base with NoWebResultsError / NoImageResultsError (src/errors/no-results-error.ts).

formatToolError in src/errors/tool-error.ts converts these into user-facing strings per tool kind (web-search, image-search, website, image-download), including abort detection via DOMException.name === "AbortError".
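A plausible shape for FetchError, assuming it extends Error and forwards the standard ES2022 cause option (the real class is src/http/fetch-error.ts):

```ts
class FetchError extends Error {
  constructor(
    message: string,
    public readonly url: string, // the URL whose fetch failed
    options?: { cause?: unknown }, // optional underlying error
  ) {
    super(message, options);
    this.name = "FetchError";
  }
}
```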
ESLint enforces two rules on src/tools/*-tool.ts: the file must contain exactly one exported create<Name>Tool factory returning Tool, and module-level function declarations other than that factory are banned. Per-tool helpers either live in a sibling module (e.g. src/fs/url-filename.ts, src/parsers/page/) or are inlined inside the implementation arrow. Interfaces at module scope are allowed.
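A sketch of the file shape that rule enforces, using @lmstudio/sdk's tool() helper and zod parameters; the tool shown is illustrative, not one of the repo's four:

```ts
import { tool, type Tool } from "@lmstudio/sdk";
import { z } from "zod";

// The single exported create<Name>Tool factory the lint rule requires.
// Helper logic must be inlined in the implementation arrow or live in a
// sibling module; module-level function declarations are banned.
export function createExampleTool(): Tool {
  return tool({
    name: "example",
    description: "Illustrative only.",
    parameters: { query: z.string() },
    implementation: async ({ query }) => {
      const normalized = query.trim(); // inline helper, not a module-level function
      return { echoed: normalized };
    },
  });
}
```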
src/enrichment/ wires metascraper into the web-search flow. create-metascraper.ts builds a single in-tree rule plugin that resolves date, type, and description against OpenGraph, microdata, JSON-LD, and standard meta tags. Rules use @metascraper/helpers for the heavy lifting: helpers.date (chrono-node-backed) for ISO normalization across many input formats, helpers.$jsonld for memoized JSON-LD lookups so multiple property accesses on the same page reuse one parse pass, and helpers.description for the 500-char-clamped description sanitizer. The local og:type rule keeps a thin trimmed() helper since helpers does not export a generic string sanitizer. The date rule chain prefers article:modified_time over article:published_time so the model sees the most recent change date; helpers' date() collapses both into a single ISO 8601 value rather than splitting them. Types for @metascraper/helpers (which ships pure JS) are declared inline in src/enrichment/metascraper-helpers.d.ts. The wrapper only emits keys whose extraction succeeded, so the per-result merge in enrich-search-results.ts cannot pollute records with undefined properties. The fan-out runs concurrently via Promise.all and gates each fetch on the PerHostRateLimiter, so distinct domains run in parallel while same-host calls still observe requestIntervalSeconds; the website cache is consulted first per result, so warm enrichment pays no rate-limit cost. Non-HTML pages (PDF, plain text, JSON) are returned without metadata since the rules only match parsed HTML.
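A hedged sketch of that fan-out shape with Bottleneck.Group; names are approximate and the caching step is elided (the real file is src/enrichment/enrich-search-results.ts):

```ts
import Bottleneck from "bottleneck";

interface SearchResult { title: string; url: string; snippet?: string }
interface EnrichedResult extends SearchResult { date?: string; type?: string; description?: string }

// One limiter per host: same-host calls queue behind the interval,
// distinct hosts proceed in parallel.
const perHost = new Bottleneck.Group({ minTime: 5000 }); // requestIntervalSeconds

async function enrichSearchResults(
  results: SearchResult[],
  enrichOne: (url: string) => Promise<Partial<EnrichedResult>>, // hypothetical per-URL extractor
): Promise<EnrichedResult[]> {
  return Promise.all(
    results.map(async (result) => {
      const host = new URL(result.url).host;
      try {
        const metadata = await perHost.key(host).schedule(() => enrichOne(result.url));
        return { ...result, ...metadata }; // only successfully extracted keys are present
      } catch {
        return result; // per-result failures demote to the unenriched shape
      }
    }),
  );
}
```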
The tools provider registers four factories (createWebSearchTool, createImageSearchTool, createVisitWebsiteTool, createFetchImagesTool). Per-call config resolves via resolveConfig in src/config/resolve-config.ts, merging plugin UI settings from src/config/config-schematics.ts with per-call overrides (runtime override > plugin config > default).

RateLimiter (src/timing/rate-limiter.ts, backed by bottleneck) enforces a requestIntervalSeconds gap (default 5s) between outbound requests for the global / single-target flows (DuckDuckGo web search, Bing image search, Visit Website, Fetch Images downloads). Web-search enrichment instead drives its fan-out through a PerHostRateLimiter (src/timing/per-host-rate-limiter.ts, backed by Bottleneck.Group) keyed by URL host: requests targeting the same host still observe requestIntervalSeconds, but results spanning distinct domains run in parallel, so a 10-result enrichment pass costs roughly one fetch's worth of wall time rather than ten.

Outbound HTTP goes through the impit client (src/http/impit-client.ts) wrapped by withRetry in src/http/retry.ts (maxRetries, retryInitialBackoffSeconds, retryMaxBackoffSeconds). Do not replace impit with fetch: it applies browser TLS fingerprints and headers that DuckDuckGo's and Bing's anti-bot layers require (a bare fetch against Bing image search returns a degraded mobile variant; DDG's web search blocks bare fetch outright; see commit 9e97d38).

Web search lives in src/duckduckgo/ (search-web.ts and build-urls.ts for the DDG /html/ web-search endpoint). Image search lives separately in src/bing/ (search-images.ts, parse-results.ts, build-urls.ts) and hits Bing's /images/search HTML page. Safe-search encoding for DDG lives in src/duckduckgo/safe-search.ts; the SafeSearch type itself is provider-neutral and is reused by the Bing builder.

DDG web-search results are parsed out of the /html/ page (result anchors at .result__a); src/bing/parse-results.ts parses Bing image-search HTML by reading the JSON-encoded m attribute on each <a class="iusc"> tile. Image-format reasoning (URL/extension sniffing, supported-format predicates) lives in src/parsers/image-extensions.ts and is shared by the Bing parser, the image downloader, and the page-image scraper.

Each web-search result's page is fetched through fetchWebsite (reusing the website cache + rate limiter) and run through a shared metascraper instance (src/enrichment/) that pulls date, type, and description from OpenGraph, microdata, JSON-LD, and standard HTML meta tags. The wrapper omits keys whose extraction yielded no value, so absent fields don't appear on the returned object rather than appearing as undefined. Cache hits skip the rate limiter; per-result failures are silently demoted to an unenriched shape rather than aborting the search. The full enriched payload is what gets cached under the search-enriched subdir, so warm queries skip both the DuckDuckGo fetch and the per-result fan-out.

Web Search returns { title, url, snippet?, date?, type?, description? } records (snippet omitted when includeSnippets is disabled; the three metadata keys are omitted when extraction yielded nothing for that result). Image Search is discovery-only: it returns { image, title?, sourcePage? } records where image is the remote full-resolution URL (Bing's murl), with title and sourcePage populated from Bing's tile metadata when present; no files are written to disk. Fetch Images is the only path to disk for images; it downloads via src/images/download-images.ts into the per-chat working directory obtained from ctl.getWorkingDirectory(). That call is made inside the tool's implementation (not at toolsProvider setup), since the SDK only attaches a working directory when a tool is actually invoked from a chat.
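A sketch of what a withRetry wrapper with those three knobs typically looks like, assuming capped exponential backoff (the real implementation is src/http/retry.ts and may differ in which errors it retries):

```ts
interface RetryOptions {
  maxRetries: number;
  retryInitialBackoffSeconds: number;
  retryMaxBackoffSeconds: number;
}

async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let backoffMs = opts.retryInitialBackoffSeconds * 1000;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= opts.maxRetries) throw error; // retries exhausted: surface the failure
      await new Promise((resolve) => setTimeout(resolve, backoffMs));
      backoffMs = Math.min(backoffMs * 2, opts.retryMaxBackoffSeconds * 1000); // double, capped
    }
  }
}
```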
Visit Website (src/website/fetch-website.ts) returns only the page title, first-level headings, and a readable-content excerpt; no image download or link extraction. Fetch Images accepts explicit HTTP(S) URLs and/or scrapes images from a given page, returning per-image records with filename, alt, title, and a markdown reference to the downloaded file. The intended workflow is Image Search → Fetch Images → embed the returned markdown in the reply.

Key dependencies:

- @lmstudio/sdk – plugin/tool registration
- impit – HTTP client with TLS + header fingerprinting (required for anti-bot)
- jsdom – HTML parsing
- @mozilla/readability – readable article extraction for Visit Website (boilerplate removal, not text extraction)
- turndown – HTML to Markdown conversion for Visit Website's content field
- html-to-text – HTML to plain-text conversion for Visit Website's content field when markdown is opted out
- metascraper + @metascraper/helpers – meta tag / OpenGraph / JSON-LD extraction backing the web-search enrichment pass; the helpers package is consumed directly (no per-field metascraper-* plugin packages) for date, description, and $jsonld, with types declared locally in src/enrichment/metascraper-helpers.d.ts
- zod – tool parameter schemas
- bottleneck – backs the shared RateLimiter
- cacache – disk-backed cache store for all three TTL caches
- file-type – MIME sniffing for downloaded images