A persistent local research analyst — the local equivalent of Perplexity Deep Research with a long-term memory. Searches the web, reads full pages, extracts and verifies claims across sources, scores source reliability, and generates structured reports that are saved to a local SQLite knowledge graph across sessions.
Standard search gives you links. This plugin gives you verified claims, scored sources, and structured reports that persist across sessions.
Load the built plugin folder in LM Studio.
| Field | Default | Description |
|---|---|---|
| Data Path | ~/research-data/ | Directory for the SQLite knowledge graph |
| SearXNG URL | (blank) | Self-hosted SearXNG base URL (e.g. http://localhost:8080). Falls back to DuckDuckGo → Bing if blank |
| Max Results Per Query | 8 | Search results returned per query |
| Search Recency Window | year | Limit results to: day, week, month, year, or any |
| Research Mode | auto | auto = LLM drives the full pipeline. guided = LLM suggests each step, you confirm |
| Wikipedia Grounding | true | Auto-fetch Wikipedia summaries for named entities found in sources |
| Google Fact Check API Key | (blank) | Optional. Enables richer fact-check coverage. Works without key (rate-limited) |
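As a sketch only, the settings above could map to a config object like this (the field names are illustrative assumptions, not the plugin's actual keys):

```typescript
// Illustrative shape of the plugin settings table above; key names are hypothetical.
interface ResearchConfig {
  dataPath: string;            // directory for the SQLite knowledge graph
  searxngUrl: string | null;   // falls back to DuckDuckGo, then Bing, when null
  maxResultsPerQuery: number;
  recencyWindow: "day" | "week" | "month" | "year" | "any";
  researchMode: "auto" | "guided";
  wikipediaGrounding: boolean;
  factCheckApiKey: string | null;
}

const defaults: ResearchConfig = {
  dataPath: "~/research-data/",
  searxngUrl: null,
  maxResultsPerQuery: 8,
  recencyWindow: "year",
  researchMode: "auto",
  wikipediaGrounding: true,
  factCheckApiKey: null,
};
```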
Auto mode (default): the LLM drives the entire pipeline from question to final report. Ask your question and it runs plan_research → search_sources → read_source → extract_claims → score_source → compare_evidence → save_entity → generate_report autonomously.
Guided mode: the LLM suggests the next step and waits for your confirmation before proceeding. Use when you want to steer the research direction.
Every fetched source is scored 0–100 across 6 signals:
| Signal | Weight | What it measures |
|---|---|---|
| Cross-source corroboration | 25% | Whether other sources in the session cover the same domain |
| Linguistic objectivity | 20% | Sensationalism markers vs. evidence-based language |
| Authority & accountability | 20% | TLD authority (.edu/.gov), named authors |
| Content depth | 15% | Word count, citation markers |
| Structural clarity | 10% | Schema markup, headings, clean URL structure |
| Recency | 10% | Age of content (applied only to time-sensitive domains) |
Score is a reliability proxy, not a truth indicator. A low-scored source may contain accurate facts. A high-scored source may be wrong. The score tells you how much to trust the signal before you verify.
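The weighted aggregation can be sketched as a small function. The weights mirror the table above; the signal names and the assumption that each signal is pre-normalized to 0–100 are mine, and the plugin's real heuristics are certainly more involved:

```typescript
// Sketch of the weighted reliability score; weights mirror the signal table.
// Assumes each signal has already been normalized to 0-100.
interface Signals {
  corroboration: number; // 25%
  objectivity: number;   // 20%
  authority: number;     // 20%
  depth: number;         // 15%
  structure: number;     // 10%
  recency: number;       // 10%
}

function reliabilityScore(s: Signals): number {
  const score =
    0.25 * s.corroboration +
    0.20 * s.objectivity +
    0.20 * s.authority +
    0.15 * s.depth +
    0.10 * s.structure +
    0.10 * s.recency;
  return Math.round(score); // 0-100: a trust proxy, not a truth verdict
}
```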
After compare_evidence, every claim in the final report is labeled corroborated, unverified, or contradicted.
Entities, claims, sources, timelines, and report metadata are saved to ~/research-data/research.db. Future sessions can retrieve prior research with list_prior_reports and update entity timelines with update_entity_timeline.
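As a rough sketch of what ends up in research.db, the stored records might look like the shapes below. These row types and field names are assumptions for illustration; the plugin's actual schema may differ:

```typescript
// Hypothetical row shapes for the knowledge graph; the real schema may differ.
interface EntityRow {
  name: string; // unique key: save_entity upserts on it
  kind: "company" | "person" | "technology" | "concept" | "event";
}

interface ClaimRow {
  entityName: string; // references EntityRow.name
  text: string;
  status: "corroborated" | "unverified" | "contradicted";
  sourceUrl: string;
}

interface TimelineEventRow {
  entityName: string;
  date: string;  // ISO date; rows are INSERT-only, never overwritten
  event: string;
}
```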
plan_research (Scaffold): Decompose a research question into sub-questions, knowns vs. unknowns, and concrete search queries. Always the first call on a new topic.
search_sources (Effect): Execute multiple web searches in one call. Returns deduplicated results with URL, title, snippet, and domain. Uses the SearXNG → DuckDuckGo → Bing fallback; no API keys required.
read_source (Effect): Fetch a URL and extract clean body text, word count, published date, named entities, and structural signals. Optionally enriches entities with Wikipedia summaries.
extract_claims (Scaffold): Extract specific verifiable claims from source body text. Returns instructions for the LLM to produce a structured JSON array of claims with type, confidence, and verbatim quote.
score_source (Compute + Store-write): Score a source on 6 heuristic signals. Persists the score to the knowledge graph. Returns score (0–100), verdict, and per-signal breakdown.
check_fact (Effect): Query the Google Fact Check Tools API for professional fact-checks on a specific claim. Returns verdicts from PolitiFact, Snopes, Reuters, and AFP. Returns covered: false for niche claims.
compare_evidence (Scaffold): Compare claims from multiple sources. Returns instructions to classify claims as corroborated, unverified, or contradicted and write a synthesis summary.
save_entity (Store-write): Persist a named entity (company, person, technology, concept, event) and its associated claims to the local knowledge graph. Upserts: calling again updates rather than duplicates.
generate_report (Scaffold + Store-write): Produce a structured research deliverable from collected evidence. Saves report metadata to the knowledge graph. Pass entityNamesJson with entity names from save_entity calls this session.
Formats:

- briefing — 400–600 word executive summary
- dossier — 800–1200 word structured profile
- market_map — player landscape with positioning
- literature_review — academic synthesis with methodology notes
- competitor_comparison — side-by-side strengths/weaknesses/differentiators

list_prior_reports (Store-read): List saved research reports with id, topic, format, date, entity count, and source count. Pass a topic keyword to filter. Does not return full content; metadata only.
update_entity_timeline (Store-write): Append a dated event to an existing entity's history. INSERT-only; never overwrites prior events.
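The deduplication that search_sources performs across queries can be pictured as URL-keyed merging. This is a sketch under my own assumptions about the result shape, not the plugin's actual implementation:

```typescript
// Sketch of cross-query deduplication: keep the first hit per URL.
interface SearchResult {
  url: string;
  title: string;
  snippet: string;
  domain: string;
}

function dedupeResults(batches: SearchResult[][]): SearchResult[] {
  const seen = new Map<string, SearchResult>();
  for (const batch of batches) {
    for (const result of batch) {
      if (!seen.has(result.url)) seen.set(result.url, result);
    }
  }
  return [...seen.values()];
}
```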
"Research Anthropic as a company — funding, products, team, and competitive position"
Auto mode runs the full pipeline and produces a dossier format report with corroboration labels. Anthropic is saved as an entity for future sessions.
"What does the research actually say about transformer attention mechanisms and their computational limits?"
Use depth: "deep" in plan_research. Sources are weighted toward academic platforms (arXiv, Semantic Scholar score higher). Output: literature_review format.
"I want a market map of AI coding assistants — who the players are, how they're positioned, and where the gaps are"
Output: market_map format. Each major player (GitHub Copilot, Cursor, Codeium, etc.) is saved as a company entity with claims attached.
"Is it true that 90% of startups fail in the first year?"
plan_research → check_fact → if not covered, search_sources + extract_claims + compare_evidence. Presents the actual evidence, not a confident number.
"Add to OpenAI's timeline: they announced o3 on 2025-04-16 with a 99.9% score on ARC-AGI-2"
update_entity_timeline — appends the event. Future sessions see the full history.
Build the plugin before loading it in LM Studio:

```shell
cd research-plugin
npm install
npm run build
```

Search fallback chain: SearXNG (self-hosted) → DuckDuckGo HTML scraper → Bing HTML scraper. No API keys are required for search. Run SearXNG locally for best quality:

```shell
docker run -p 8080:8080 searxng/searxng
# Then set the SearXNG URL to http://localhost:8080 in the plugin config
```
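The fallback chain above can be sketched as trying each backend in priority order until one returns results. The backend signature and the empty-result fallthrough are assumptions for illustration:

```typescript
// Sketch: try search backends in priority order, e.g. SearXNG, then
// DuckDuckGo, then Bing (the backend functions themselves are hypothetical).
type SearchBackend = (query: string) => Promise<string[]>;

async function searchWithFallback(
  query: string,
  backends: SearchBackend[],
): Promise<string[]> {
  for (const backend of backends) {
    try {
      const results = await backend(query);
      if (results.length > 0) return results; // first non-empty answer wins
    } catch {
      // backend unreachable or errored: fall through to the next one
    }
  }
  return []; // every backend failed or came back empty
}
```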