Project Files
README.md
A powerful RAG (Retrieval-Augmented Generation) plugin for LM Studio that can index and search through gigabytes or even terabytes (not tested) of document data. Hosted here: ari99/lm_studio_big_rag_plugin on GitHub.
The plugin provides the following configuration options in LM Studio:
maxConcurrentFiles if needed)
maxConcurrentFiles on systems with limited resources
maxConcurrentFiles
maxConcurrentFiles to 1 or 2
success / failed counts after each processed document.
Set BIG_RAG_FAILURE_REPORT_PATH=/absolute/path/report.json when running npm run index (or via LM Studio env settings) to emit a JSON report containing all failure reasons and counts after indexing completes. This is useful when triaging stubborn PDFs such as blueprints or large scanned books.
Automated parser smoke tests cover HTML, Markdown, and plain text ingestion:
For end-to-end validation:
This plugin is based on the LM Studio plugin SDK. For more information:
ISC
Configure the Plugin:
/Users/user/Documents/MyLibrary)
/Users/user/.lmstudio/big-rag-db)
Initial Indexing:
Query Your Documents:
File Scanner (src/ingestion/fileScanner.ts):
Document Parsers (src/parsers/):
htmlParser.ts: Extracts text from HTML/HTM files
pdfParser.ts: Extracts text from PDF files
epubParser.ts: Extracts text from EPUB files
textParser.ts: Reads plain text & Markdown files with optional Markdown stripping
imageParser.ts: OCR for image files
documentParser.ts: Routes to appropriate parser
Vector Store (src/vectorstore/vectorStore.ts):
Index Manager (src/ingestion/indexManager.ts):
Prompt Preprocessor (src/promptPreprocessor.ts):
retrievalAffinityThreshold
cd big-rag-plugin
npm install
npm run build
npm run dev
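The parser list above says that documentParser.ts routes each file to the appropriate parser. A minimal sketch of extension-based routing follows. The names `ParserKind`, `EXTENSION_TO_PARSER`, and `pickParser` are hypothetical, and the image extensions are an assumption, since the README does not list which image formats are supported; the plugin's actual router may work differently.

```typescript
// Hypothetical labels for the parsers listed under src/parsers/.
type ParserKind = "html" | "pdf" | "epub" | "text" | "image";

// Extension table. HTML/HTM, PDF, EPUB, plain text and Markdown come from the
// README; the image extensions below are assumptions for illustration only.
const EXTENSION_TO_PARSER: Record<string, ParserKind> = {
  ".html": "html",
  ".htm": "html",
  ".pdf": "pdf",
  ".epub": "epub",
  ".txt": "text",
  ".md": "text",
  ".png": "image", // assumed
  ".jpg": "image", // assumed
  ".jpeg": "image", // assumed
};

// Return the parser kind for a path, or undefined when the file has no
// extension or the extension is not in the table.
function pickParser(filePath: string): ParserKind | undefined {
  const lastSeparator = Math.max(
    filePath.lastIndexOf("/"),
    filePath.lastIndexOf("\\"),
  );
  const fileName = filePath.slice(lastSeparator + 1);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return undefined; // no extension, or a dotfile such as ".gitignore"
  return EXTENSION_TO_PARSER[fileName.slice(dot).toLowerCase()];
}
```

Lower-casing the extension means that `Book.PDF` and `book.pdf` reach the same parser, and unknown extensions yield `undefined` so the caller can skip the file instead of failing.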
big-rag-plugin/
├── src/
│   ├── config.ts              # Plugin configuration schema
│   ├── index.ts               # Main entry point
│   ├── promptPreprocessor.ts  # RAG integration
│   ├── ingestion/
│   │   ├── fileScanner.ts     # Directory scanning
│   │   └── indexManager.ts    # Indexing orchestration
│   ├── parsers/
│   │   ├── documentParser.ts  # Parser router
│   │   ├── htmlParser.ts      # HTML parsing
│   │   ├── pdfParser.ts       # PDF parsing
│   │   ├── epubParser.ts      # EPUB parsing
│   │   ├── textParser.ts      # Text parsing
│   │   └── imageParser.ts     # OCR parsing
│   ├── vectorstore/
│   │   └── vectorStore.ts     # Vectra integration
│   └── utils/
│       ├── fileHash.ts        # File hashing
│       └── textChunker.ts     # Text chunking
├── manifest.json              # Plugin manifest
├── package.json               # Dependencies
├── tsconfig.json              # TypeScript config
└── README.md                  # This file
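The tree lists textChunker.ts as the text chunking utility. As a rough illustration of what such a step usually does before embedding, here is a minimal sketch of word-based chunking with overlap. The function name `chunkText`, its parameters, and the default sizes are assumptions made for this example and are not taken from the plugin's source.

```typescript
// Hypothetical helper: split text into chunks of at most `chunkSize` words,
// where consecutive chunks share `overlap` words so that context spanning a
// chunk boundary is not lost.
function chunkText(text: string, chunkSize = 200, overlap = 40): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }

  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once the current window has reached the end of the text, so no
    // trailing chunk made only of overlap words is emitted.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

A larger overlap improves the chance that a relevant passage appears whole in at least one chunk, at the cost of more chunks to embed and store.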
npm run test
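The README mentions success / failed counts after each processed document and a JSON report of failure reasons and counts, enabled through BIG_RAG_FAILURE_REPORT_PATH. The sketch below shows one way such a report could be aggregated. The `FileResult` and `FailureReport` shapes and the `buildFailureReport` function are hypothetical; the real report schema is not documented here and may differ.

```typescript
// Hypothetical outcome of processing a single file.
interface FileResult {
  path: string;
  ok: boolean;
  reason?: string; // only meaningful when ok is false
}

// Assumed report shape: overall counts, a count per failure reason, and the
// list of failed files. The plugin's actual JSON layout may be different.
interface FailureReport {
  success: number;
  failed: number;
  reasons: Record<string, number>;
  failures: { path: string; reason: string }[];
}

function buildFailureReport(results: FileResult[]): FailureReport {
  const report: FailureReport = {
    success: 0,
    failed: 0,
    reasons: {},
    failures: [],
  };
  for (const result of results) {
    if (result.ok) {
      report.success += 1;
      continue;
    }
    const reason = result.reason ?? "unknown";
    report.failed += 1;
    report.reasons[reason] = (report.reasons[reason] ?? 0) + 1;
    report.failures.push({ path: result.path, reason });
  }
  return report;
}

// Serialize the report; writing it to the configured path is left to the caller.
function serializeReport(report: FailureReport): string {
  return JSON.stringify(report, null, 2);
}
```

Grouping failures by reason makes it easy to see whether a batch of stubborn PDFs shares a single cause or fails for several unrelated reasons.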