CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project

LM Studio plugin that exposes four web-oriented tools to local LLMs — Web Search, Image Search, Visit Website, and Fetch Images — built on @lmstudio/sdk. Descended from Daniel Sig's original lms-plugin-duckduckgo and lms-plugin-visit-website plugins, merged and extended by Nigel Packer.

Commands

npm run dev — run plugin in LM Studio dev mode (lms dev)
npm run push — publish to LM Studio Hub (lms push)
npm run lint / npm run lint:fix — ESLint on src/**/*.ts
npm run format / npm run format:check — Prettier
npm run knip — dead-code / unused-export check

A local pre-commit hook at .git/hooks/pre-commit runs lint, format:check, and knip sequentially and aborts the commit on any failure. The hook is not committed to the repo — fresh clones need to reinstall it. Bypass with git commit --no-verify when necessary.

No test suite is configured. TypeScript targets ES2023 / CommonJS. Requires Node ^22.17.0 || >=24, which provides fetch, AbortSignal.any, and a stable util.MIMEType; the end-of-life 23.0–23.10 releases predate MIMEType's stabilisation and are excluded.

Architecture

Entry point src/index.ts registers a config schematic and a tools provider with the LM Studio SDK.

Request flow

Content extraction for Visit Website

src/parsers/page-text.ts feeds raw HTML to @mozilla/readability to strip boilerplate (nav, sidebars, comments), then routes Readability's .content HTML through either:

Both paths share src/text/normalize-blank-lines.ts for trailing-whitespace and blank-line collapsing. The parser returns the full extracted content untruncated — bounding it to a budget and biasing it toward search terms is the retrieval layer's job (see below). Both contentFormat (plugin select field, default "markdown") and contentLimit are plugin-only — neither is exposed as a tool parameter so the model cannot override the user-set or default values. The tool still returns contentLength, the pre-truncation character count, so the model can detect truncation and refine with findInPage.
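
A minimal sketch of what trailing-whitespace and blank-line collapsing could look like. The function name and the exact rules (collapse three or more newlines to two, trim the ends) are assumptions for illustration; the real src/text/normalize-blank-lines.ts may behave differently.

```typescript
// Hypothetical sketch; not the module's actual implementation.
function normalizeBlankLines(text: string): string {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(/[ \t]+$/, "")) // strip trailing spaces and tabs
    .join("\n")
    .replace(/\n{3,}/g, "\n\n") // collapse runs of blank lines to a single one
    .trim();
}

console.log(JSON.stringify(normalizeBlankLines("a  \n\n\n\nb\t\n")));
```

Stripping trailing whitespace first matters: a line holding only spaces becomes truly empty and can then be collapsed with its neighbours.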

Content retrieval for Visit Website

src/retrieval/ bounds extracted text to the contentLimit budget and, when findInPage terms are supplied, biases the returned excerpt toward the most relevant chunks rather than head-slicing. It is a chunk → rank → select → assemble pipeline split one module per stage so a future embedding-based retriever (RAG) can swap the lexical ranker without touching the rest:

The barrel exports only buildExcerpt/Excerpt; the pipeline stages stay internal to the module (re-exporting them would trip knip's unused-export check). renderPageResult calls buildExcerpt for both HTML (on the parser's full content) and non-HTML kinds (on the pre-extracted PDF/text/JSON body), unifying the budget policy across page kinds.
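
The chunk → rank → select → assemble stages can be sketched as below. Only buildExcerpt and Excerpt are names from this repository; the stage function names, the Excerpt fields, the paragraph-based chunking, and the term-count scoring are all assumptions chosen to keep the example short.

```typescript
// Hypothetical shapes and stage names; only buildExcerpt/Excerpt appear in the source.
interface Excerpt {
  text: string;
  contentLength: number; // pre-truncation character count
  truncated: boolean;
}

// chunk: split on blank lines into paragraph-sized pieces
function chunk(text: string): string[] {
  return text.split(/\n{2,}/).map((c) => c.trim()).filter((c) => c.length > 0);
}

// rank: score each chunk by case-insensitive occurrences of the search terms
function rank(chunks: string[], terms: string[]): number[] {
  const needles = terms.map((t) => t.toLowerCase()).filter((t) => t.length > 0);
  return chunks.map((c) => {
    const hay = c.toLowerCase();
    return needles.reduce((score, n) => score + (hay.split(n).length - 1), 0);
  });
}

// select: take chunks best-first while they fit the budget; ties keep document order
function select(chunks: string[], scores: number[], budget: number): number[] {
  const order = chunks.map((_, i) => i).sort((a, b) => scores[b] - scores[a] || a - b);
  const picked: number[] = [];
  let used = 0;
  for (const i of order) {
    const cost = chunks[i].length + (picked.length > 0 ? 2 : 0); // 2 = "\n\n" joiner
    if (used + cost > budget) continue;
    picked.push(i);
    used += cost;
  }
  return picked.sort((a, b) => a - b); // restore document order for assembly
}

// assemble: join the selected chunks and report the original length
function buildExcerpt(text: string, budget: number, terms: string[] = []): Excerpt {
  const chunks = chunk(text);
  const picked = select(chunks, rank(chunks, terms), budget);
  return {
    text: picked.map((i) => chunks[i]).join("\n\n"),
    contentLength: text.length,
    truncated: picked.length < chunks.length,
  };
}

const page = "alpha intro\n\nbeta about cats\n\ngamma outro";
console.log(buildExcerpt(page, 20, ["cats"]).text);
```

Because rank is the only stage that looks at the terms, a different scorer could replace it without touching chunk, select, or the assembly step, which is the property the one-module-per-stage split is meant to preserve.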

Image extraction for Fetch Images

src/parsers/page-images.ts scrapes <img> tags in document order, resolves relative src against the page URL, deduplicates, and returns { src, alt, title } tuples (up to maxImages). Fetch Images then downloads each via downloadImages and the result records are assembled by the renderers layer (see below).
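
The resolve, deduplicate, and cap steps can be sketched without a DOM. This example assumes the src, alt, and title attribute values have already been read from the <img> tags in document order; the function name collectImages and the decision to skip empty or unparseable src values are illustrative assumptions, not the parser's confirmed behaviour.

```typescript
interface PageImage {
  src: string;
  alt: string;
  title: string;
}

// Hypothetical helper: raw holds attribute values in document order.
function collectImages(raw: PageImage[], pageUrl: string, maxImages: number): PageImage[] {
  const seen = new Set<string>();
  const out: PageImage[] = [];
  for (const img of raw) {
    if (out.length >= maxImages) break;
    if (img.src === "") continue; // assumption: an empty src is not an image
    let resolved: string;
    try {
      resolved = new URL(img.src, pageUrl).href; // resolve relative src against the page
    } catch {
      continue; // assumption: unparseable URLs are skipped
    }
    if (seen.has(resolved)) continue; // deduplicate on the absolute URL
    seen.add(resolved);
    out.push({ ...img, src: resolved });
  }
  return out;
}

const images = collectImages(
  [
    { src: "/a.png", alt: "A", title: "" },
    { src: "https://example.com/a.png", alt: "duplicate", title: "" },
    { src: "b.jpg", alt: "", title: "B" },
  ],
  "https://example.com/blog/post",
  5,
);
console.log(images.map((i) => i.src));
```

Deduplicating after resolution means /a.png and its absolute form count as the same image, and the first occurrence in document order wins.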

Tool-response rendering

src/renderers/ is the response-assembly layer for the two page-consuming tools — the counterpart to parsers/, which extracts structured fragments from raw content. Where parsers/ turns HTML/PDF/bytes into intermediate fragments, renderers/ composes those fragments (and other inputs) into the LLM-facing payload, sitting one layer above fetchPage. The two renderers share a parallel render<Subject>Result(s) → <Subject>Result shape, with singular/plural reflecting cardinality (one page result; many image results). src/renderers/page-result.ts (renderPageResult → PageResult) builds the Visit Website result by narrowing on the fetched page's kind, orchestrating the parsers/ HTML/text extractors, and bounding the result through the retrieval/ excerpt builder. src/renderers/image-results.ts (renderImageResults → ImageResult[]) pairs each ImageSubject with its positional downloadImages outcome (DownloadedImage), emitting a markdown reference on success ( derived via , markdown escaped via ) or a message on failure. The two search tools have no renderer — their parse/enrich output is already response-ready, so they return it directly.
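
A rough sketch of positional pairing in the image renderer. The names renderImageResults, ImageSubject, DownloadedImage, and ImageResult come from this file, but every field shown here, the status discriminant, and the escaping rule are assumptions made for illustration; the real types are not reproduced.

```typescript
// All field names are hypothetical; the actual shapes live in the repository.
interface ImageSubject {
  src: string;
  alt: string;
}
type DownloadedImage =
  | { status: "ok"; path: string }
  | { status: "failed"; error: string };
interface ImageResult {
  src: string;
  markdown?: string;
  message?: string;
}

function renderImageResults(subjects: ImageSubject[], downloads: DownloadedImage[]): ImageResult[] {
  // Outcome i belongs to subject i, so the two arrays are walked by index.
  return subjects.map((subject, i) => {
    const outcome = downloads[i];
    if (outcome === undefined) {
      return { src: subject.src, message: "No download outcome recorded" };
    }
    if (outcome.status === "failed") {
      return { src: subject.src, message: outcome.error };
    }
    // Assumed escaping: backslash-escape brackets and backslashes in the alt text.
    const alt = subject.alt.replace(/[\[\]\\]/g, "\\$&");
    return { src: subject.src, markdown: `![${alt}](${outcome.path})` };
  });
}

console.log(
  renderImageResults(
    [{ src: "https://x.test/a.png", alt: "a [b]" }],
    [{ status: "ok", path: "images/a.png" }],
  ),
);
```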

Caches

Three disk-backed TTL caches (via cacache) are constructed once in toolsProvider and shared across tools. They persist across plugin reloads — clearing requires removing the cacache directory at ~/.lmstudio/plugin-data/lms-plugin-duckduckgo-cache:

Cache sizes and subdirs are defined in src/tools-provider.ts; the TTLCache implementation is in src/cache/ttl-cache.ts. TTL defaults live in src/config/config-defaults.ts. Fetch Images downloads are not cached because they land in chat-scoped working directories.
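
To illustrate the TTL semantics only, here is an in-memory stand-in. It is not the plugin's TTLCache, which is disk-backed via cacache; the class name, method names, and injectable clock are assumptions for the sake of a self-contained example.

```typescript
// In-memory stand-in for illustration; the real cache persists to disk via cacache.
class MemoryTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired entries are evicted lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const cache = new MemoryTtlCache<string>(60_000);
cache.set("https://example.com/", "<html>...</html>");
console.log(cache.get("https://example.com/") !== undefined);
```

Passing the clock in as a function keeps expiry testable without real timers.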

Image search specifics

Bing renders each image tile as <a class="iusc" m="<JSON>">. The JSON blob carries murl (full image URL), purl (source page URL), and t (title), among others; only those three fields are typed in src/bing/parse-results.ts. jsdom returns the m attribute already entity-decoded, so the parser feeds it straight to JSON.parse; malformed tiles are swallowed individually rather than aborting the result list, and tiles whose murl doesn't end in a supported image extension are filtered out. Bing returns ~35 tiles per HTML page; pagination advances by 35 via the first query parameter. imageMaxResults slices the parsed list; its slider tops out at 35 because that is Bing's natural page size.
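
The per-tile tolerance described above can be sketched as follows, taking the already-decoded m attribute strings as input. The function names, the ImageHit shape, and the extension list are assumptions; the actual list of supported extensions is not stated here.

```typescript
// Only murl, purl, and t are modelled, matching the fields named above.
interface BingTile {
  murl: string;
  purl: string;
  t: string;
}
interface ImageHit {
  imageUrl: string;
  pageUrl: string;
  title: string;
}

const IMAGE_EXT = /\.(jpe?g|png|gif|webp)$/i; // assumed extension list

function parseTile(m: string): ImageHit | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(m);
  } catch {
    return undefined; // a malformed tile is dropped on its own
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const { murl, purl, t } = parsed as Partial<BingTile>;
  if (typeof murl !== "string" || !IMAGE_EXT.test(murl)) return undefined;
  return {
    imageUrl: murl,
    pageUrl: typeof purl === "string" ? purl : "",
    title: typeof t === "string" ? t : "",
  };
}

function parseTiles(mAttributes: string[], maxResults: number): ImageHit[] {
  const hits: ImageHit[] = [];
  for (const m of mAttributes) {
    const hit = parseTile(m);
    if (hit !== undefined) hits.push(hit);
  }
  return hits.slice(0, maxResults); // analogous to the imageMaxResults slice
}

console.log(
  parseTiles(['{"murl":"https://x.test/a.jpg","purl":"https://x.test/p","t":"A"}', "{broken"], 35),
);
```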

Safe search encoding

DuckDuckGo uses non-obvious p param values: strict→"1", moderate→"", off→"-1". Encoding lives in src/duckduckgo/build-urls.ts. Bing accepts the literal mode strings (strict/moderate/off) on its adlt param, so src/bing/build-urls.ts just passes the SafeSearch value through. The SafeSearch type itself is provider-neutral and lives in src/search/safe-search.ts.
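
The two encodings side by side, as a small sketch. The SafeSearch union and the parameter values come from the description above; the function and constant names are hypothetical.

```typescript
type SafeSearch = "strict" | "moderate" | "off";

// DuckDuckGo's p values are not the mode names, so they need a lookup table.
const DUCKDUCKGO_P: Record<SafeSearch, string> = {
  strict: "1",
  moderate: "",
  off: "-1",
};

function duckDuckGoSafeParam(mode: SafeSearch): string {
  return DUCKDUCKGO_P[mode];
}

// Bing's adlt param takes the mode string as-is.
function bingSafeParam(mode: SafeSearch): string {
  return mode;
}

console.log(duckDuckGoSafeParam("off"), bingSafeParam("off"));
```

Typing the table as Record<SafeSearch, string> makes the compiler flag a missing entry if a new mode is ever added to the union.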

Errors

Two error hierarchies are load-bearing:

FetchError (src/http/fetch-error.ts) — HTTP/network failures, carries url and optional cause.
NoResultsError base with NoWebResultsError / NoImageResultsError (src/errors/no-results-error.ts).

formatToolError in src/errors/tool-error.ts converts these into user-facing strings per tool kind (web-search, image-search, website, image-download), including abort-detection via DOMException.name === "AbortError".
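
A sketch of how the two hierarchies and the abort check could fit together. The class names, the url and cause fields, the four tool kinds, and the AbortError test come from the description above; the constructor signature, the check order, and every message string are invented for illustration and will not match the real output.

```typescript
type ToolKind = "web-search" | "image-search" | "website" | "image-download";

class FetchError extends Error {
  readonly url: string;
  constructor(message: string, url: string, cause?: unknown) {
    super(message, { cause }); // Error's standard cause option
    this.url = url;
  }
}

class NoResultsError extends Error {}
class NoWebResultsError extends NoResultsError {}
class NoImageResultsError extends NoResultsError {}

// Message wording below is hypothetical.
function formatToolError(kind: ToolKind, error: unknown): string {
  if (error instanceof DOMException && error.name === "AbortError") {
    return `${kind}: cancelled`;
  }
  if (error instanceof NoResultsError) {
    return `${kind}: no results`;
  }
  if (error instanceof FetchError) {
    return `${kind}: request to ${error.url} failed (${error.message})`;
  }
  return `${kind}: ${error instanceof Error ? error.message : String(error)}`;
}

console.log(formatToolError("website", new FetchError("HTTP 500", "https://example.com/")));
```

Checking the abort case first means a cancelled request is reported as a cancellation even when it surfaces while a fetch is in flight.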

Tool-file conventions

ESLint enforces two rules on src/tools/*-tool.ts: the file must contain exactly one exported create<Name>Tool factory returning Tool, and module-level function declarations other than that factory are banned. Per-tool helpers either live in a sibling module (e.g. src/fs/url-filename.ts, src/parsers/, src/renderers/) or are inlined inside the implementation arrow. Interfaces at module scope are allowed.

Module naming

A file is named for its domain. Prefer a descriptive noun for the concept the module owns (src/timing/rate-limiter.ts → RateLimiter, src/http/ssrf.ts → assertPublicUrl, src/fs/markdown-path.ts → toMarkdownPath, src/enrichment/metascraper.ts → createMetascraper); a verb-phrase filename is acceptable only for a module that mirrors a single verb export (src/images/download-image.ts → downloadImage, src/page/fetch-page.ts → fetchPage). Never use vague or filler names — no retry, helpers, utils, or bare *-guard; name the actual concept (, , ). A module grouping several cohesive exports takes the domain-concept noun (, ). Export identifiers stay independent of the filename: functions/predicates are verbs (, ), classes/types/values are nouns (). Reserve the prefix for a function that converts a value a defined type ( returns a markdown path); do not use it for a function that returns or an ad-hoc shape. When a helper exists only to dispatch to another function, echo that function's verb rather than coining a new one (a wrapper delegating to is , not ). Never invent a name when the convention already fixes one; describing what a function does always beats describing where it is called.

Web-search result enrichment

src/enrichment/ wires metascraper into the web-search flow. metascraper.ts builds a single in-tree rule plugin that resolves date, type, and description against OpenGraph, microdata, JSON-LD, and standard meta tags. Rules use @metascraper/helpers for the heavy lifting: helpers.date (chrono-node-backed) for ISO normalization across many input formats, helpers.$jsonld for memoized JSON-LD lookups so multiple property accesses on the same page reuse one parse pass, and helpers.description for the 500-char-clamped description sanitizer. The local og:type rule keeps a thin trimmed() helper since helpers does not export a generic string sanitizer. The date rule chain prefers article:modified_time over article:published_time so the model sees the most recent change date; helpers' date() collapses both into a single ISO 8601 value rather than splitting them. Types for @metascraper/helpers (which ships pure JS) are declared inline in src/enrichment/metascraper-helpers.d.ts. The wrapper only emits keys whose extraction succeeded so the per-result merge in cannot pollute records with properties. The fan-out runs concurrently via and gates each fetch on the so distinct domains run in parallel while same-host calls still observe ; the website cache is consulted first per result so warm enrichment pays no rate-limit cost. Non-HTML pages (PDF, plain text, JSON) are returned without metadata since the rules only match parsed HTML.
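
Two of the behaviours above, preferring the modified date and emitting only keys that were actually extracted, can be sketched without metascraper. This example substitutes the built-in Date parser for helpers.date, so it handles far fewer input formats; the function names and the treatment of undefined or empty values as "failed extraction" are assumptions.

```typescript
interface Enrichment {
  date?: string;
  type?: string;
  description?: string;
}

// Prefer article:modified_time over article:published_time.
// Date is a stand-in here; the real rules use helpers.date.
function pickDate(meta: Record<string, string | undefined>): string | undefined {
  const raw = meta["article:modified_time"] ?? meta["article:published_time"];
  if (raw === undefined) return undefined;
  const parsed = new Date(raw);
  return Number.isNaN(parsed.getTime()) ? undefined : parsed.toISOString();
}

// Drop keys whose value is undefined or empty so a spread merge cannot overwrite
// an existing field with nothing.
function compact<T extends object>(record: T): Partial<T> {
  return Object.fromEntries(
    Object.entries(record).filter(([, v]) => v !== undefined && v !== ""),
  ) as Partial<T>;
}

const searchResult = { title: "Example", description: "snippet from the search page" };
const enrichment: Enrichment = {
  date: pickDate({
    "article:published_time": "2024-01-01T00:00:00Z",
    "article:modified_time": "2024-05-01T10:00:00Z",
  }),
  type: "article",
  description: undefined, // extraction failed for this key
};
const merged = { ...searchResult, ...compact(enrichment) };
console.log(merged);
```

Without the compact step, spreading the enrichment object would replace the search snippet's description with undefined.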
