Project Files
IMPLEMENTATION_PLAN.md
Upgrade the current LM Studio document RAG plugin from a simple prompt-preprocessor flow into a stronger, measurable, and safer RAG system while preserving a working fast path.
The plan is intentionally staged:
core runtime contracts from MCP-specific schema types.packages/core, packages/adapter-lmstudio, and packages/mcp-server.packages/core while preserving compatibility through temporary re-export shims.packages/adapter-lmstudio behind stable entrypoints.packages/mcp-server behind stable entrypoints.src/index.ts
src/config.ts
LLMDynamicHandle importmodelKeyThe prompt preprocessor is still the best fit for the simple fast path. More advanced iterative retrieval should be added later via a tools provider rather than overloading the preprocessor.
Create a repeatable way to measure whether changes improve the plugin.
eval/cases/basic.jsonleval/cases/hard.jsonlscripts/eval.tssrc/metrics.tssrc/types/eval.tsEach case should include:
idfilesquestionexpected_answer_pointsexpected_sourcesanswerabilitydifficultyeval/ folder structurescripts/eval.tseval/results/Improve answerability handling, retrieval quality, evidence quality, and safety without changing plugin type.
Predict whether retrieval is likely useful before paying the full retrieval cost.
Classify each request into one of:
src/gating.tssrc/types/gating.tssrc/promptPreprocessor.ts
src/config.ts
answerabilityGateEnabledanswerabilityGateThresholdambiguousQueryBehaviorRetrieve from multiple query variants and fuse the results.
src/queryRewrite.tssrc/fusion.tssrc/types/retrieval.tssrc/promptPreprocessor.ts
src/config.ts
multiQueryEnabledmultiQueryCountfusionMethodmaxCandidatesBeforeRerankPass better evidence to the model than raw top chunks.
Each evidence block should include:
src/evidence.tssrc/types/evidence.tssrc/promptPreprocessor.ts
neighborWindowdedupeSimilarityThresholdmaxEvidenceBlocksTreat file content as untrusted data.
src/safety.tssrc/types/safety.tssrc/promptPreprocessor.ts
src/config.ts
sanitizeRetrievedTextstripInstructionalSpansstrictGroundingModeUpgrade the underlying retrieval engine with adaptive chunking, hybrid retrieval, and reranking.
Move from flat text assumptions to document-structure-aware chunking.
src/chunking.tssrc/documentModel.tssrc/types/document.tschunkingModetargetChunkTokensmaxChunkTokensstructureAwareChunkingCombine semantic retrieval with lexical retrieval.
src/lexicalRetrieve.tssrc/hybridRetrieve.tssrc/indexing.tshybridEnabledlexicalWeightsemanticWeighthybridCandidateCountSelect evidence that is sufficient and complementary, not just topically similar.
src/rerank.tssrc/types/rerank.tsVersion 1:
Version 2:
rerankEnabledrerankTopKrerankStrategySupport iterative, multi-hop, or clarification-heavy retrieval workflows.
This should be implemented as a tools provider rather than forcing it into the prompt preprocessor.
src/toolsProvider.tssrc/tools/searchFiles.tssrc/tools/readSection.tssrc/tools/readNeighbors.tssrc/tools/listHeadings.tssrc/tools/verifyClaim.tssrc/types/tools.tssrc/index.ts
src/config.ts
agenticModeEnabledmaxToolCallstoolReadWindowverificationEnabledsearch_files(query)read_section(file, sectionId)read_neighbors(file, chunkId, window)list_headings(file)verify_claim(claim, evidenceIds)Reduce unsupported claims in generated answers.
src/verify.tssrc/claimSplit.tssrc/types/verify.tsclaimVerificationEnabledmaxClaimsToVerifyunsupportedClaimBehaviorMake the plugin safer and more robust against hostile or messy documents.
src/sanitize.tssrc/policy.tssrc/types/policy.tsAdd these to src/config.ts:
Phase 0 + Phase 1A
Phase 1B + Phase 1C
Phase 1D + cleanup
Phase 2A + Phase 2B + Phase 2C
Phase 3 + Phase 4
Phase 5
The plugin should improve along these axes while remaining usable inside LM Studio:
Start with:
Phase 0 baseline eval harnessPhase 1A answerability gatePhase 1B multi-query rewrite and fusionPhase 1C evidence packaging and neighbor expansionPhase 1D retrieved-text safety wrapperThis is the best first cut because it materially improves the current plugin without forcing an architectural jump before there is measurement in place.
rag_answer, rag_search, corpus_inspect, and rerank_only.src/core/pipeline.ts-level eval cases independent of LM Studio runtime objects.packages/ workspace split only after the runtime boundary stops moving.src/promptPreprocessor.ts
README.mdanswerabilityGateEnabledanswerabilityGateThresholdambiguousQueryBehaviormultiQueryEnabledmultiQueryCountfusionMethodmaxCandidatesBeforeRerankneighborWindowdedupeSimilarityThresholdmaxEvidenceBlockssanitizeRetrievedTextstripInstructionalSpansstrictGroundingModechunkingModetargetChunkTokensmaxChunkTokensstructureAwareChunkinghybridEnabledlexicalWeightsemanticWeighthybridCandidateCountrerankEnabledrerankTopKrerankStrategyagenticModeEnabledmaxToolCallstoolReadWindowverificationEnabledclaimVerificationEnabledmaxClaimsToVerifyunsupportedClaimBehavior