Big RAG Rus — an extension of the original Big RAG Plugin for LM Studio (in development).
Added formats: Word, Excel, CSV, PowerPoint.
Extended settings menu, added embedding model selection.
Full bilingual support — Russian and English.
Original plugin: github.com/ari99/lm_studio_big_rag_plugin
A powerful RAG (Retrieval-Augmented Generation) plugin for LM Studio that can index and search through gigabytes, and potentially terabytes (untested), of document data. Based on ari99/lm_studio_big_rag_plugin with multilingual OCR support (English + Russian by default).
OCR Support: Optional OCR for image files and image-based PDFs using Tesseract with configurable language (English + Russian by default)
Configurable OCR Pipeline: Fine-tune page segmentation mode, image size limits, page limits, timeouts, and local language data path
Vector Search: Uses Vectra with sharded indexes for efficient vector storage and retrieval (avoids single-file size limits)
Incremental Indexing: Automatically detects and skips already-indexed files
Concurrent Processing: Configurable concurrency for optimal performance
Persistent Storage: Vector embeddings are stored locally and persist across sessions
Configurable Embedding Model: Use any embedding model loaded in LM Studio (default: gpustack/text-embedding-bge-m3)
Filename Search: Find indexed files by name using natural language queries (Russian + English), with optional content search within matched files
Supported File Types
Documents: PDF, EPUB, TXT, TEXT, DOCX
Spreadsheets: XLSX, XLS, CSV
Presentations: PPTX
Markdown: MD, MDX, Markdown, MDown, MKD, MKDN
Web Content: HTM, HTML, XHTML
Images (with OCR): BMP, JPEG, JPG, PNG
Archives: RAR (planned, not yet implemented)
Installation
Navigate to the plugin directory:
cd big-rag-plugin
Install dependencies:
npm install
Build the plugin:
npm run build
Run in development mode:
npm run dev
Configuration
The plugin provides the following configuration options in LM Studio:
Response Language
Response Language / Язык ответа (default: ru): Language for RAG instructions sent to the model. Controls the language of inline prompts (citation headers, search instructions, "no results" messages, etc.). Available values:
ru — Russian: all RAG instructions are sent in Russian, so the model will tend to respond in Russian.
en — English: all RAG instructions are sent in English.
Why this matters: The plugin injects instructions and citations directly into the prompt sent to the model. If these instructions are in English, the model may respond in English even if your system prompt says to use Russian. Setting this to ru ensures all injected text is Russian, reinforcing the model's language behavior.
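As an illustration, language-dependent injected text can be modeled as a lookup keyed by this setting. The template name and wording below are assumptions for the sketch, not the plugin's actual strings:

```typescript
// Hypothetical sketch: selecting injected RAG instruction text by the
// Response Language setting. The template contents are illustrative only.
type ResponseLanguage = "ru" | "en";

const NO_RESULTS: Record<ResponseLanguage, string> = {
  ru: "Релевантных фрагментов не найдено.",
  en: "No relevant passages were found.",
};

// Returns the "no results" message in the configured response language.
function noResultsMessage(lang: ResponseLanguage): string {
  return NO_RESULTS[lang];
}
```

With this pattern, every prompt fragment the plugin injects (citation headers, search instructions, fallback messages) stays in one language, which is what reinforces the model's language behavior.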
Required Settings
Documents Directory: Root directory containing your documents (read access required)
Vector Store Directory: Where the vector database will be stored (read/write access required)
Retrieval Settings
Retrieval Limit (1-20, default: 5): Maximum number of chunks to return
Tip: To speed up indexing on a folder with mixed content, disable file types you don't need (e.g., disable images if you only care about text documents). After changing filters, trigger a manual reindex.
Performance Settings
Max Concurrent Files (1-10, default: 1): Number of files to process simultaneously
Parser Delay (ms) (0-5000, default: 500): Wait time before parsing each document (helps avoid WebSocket throttling)
Embedding Model (default: gpustack/text-embedding-bge-m3): Model ID for text embeddings. Must be loaded in LM Studio. Examples: nomic-ai/nomic-embed-text-v1.5-GGUF, gpustack/text-embedding-bge-m3
Filename Search
Enable Filename Search (default: true): When enabled, the plugin detects natural language queries asking to find files by name and searches the indexed file list for matches. Works alongside normal vector content search.
The plugin recognises filename search intent in both Russian and English. Examples of supported query patterns:
| Query language | Example query | Behaviour |
| --- | --- | --- |
| 🇷🇺 Russian | «найди все файлы с именем протокол» | Lists all indexed files whose name contains «протокол» |
| 🇷🇺 Russian | «найди файлы письмо в которых встречается слово договор» | Finds files named «письмо» and searches their content for «договор» |
| 🇷🇺 Russian | «в названии которых есть отчёт» | Lists files whose name contains «отчёт» |
| 🇬🇧 English | «find all files named protocol» | Lists all indexed files whose name contains «protocol» |
| 🇬🇧 English | «show files called report containing budget» | Finds files named «report» and searches their content for «budget» |
| 🇬🇧 English | «list files with name invoice» | Lists files whose name contains «invoice» |
Four search scenarios:
Filename listing only — the query asks just to list files by name (e.g., «найди файлы протокол») → returns a file listing with paths
Filename + content display — the query asks to find a file AND display its content (e.g., «найди файл TestFormat и выведи полностью его содержание») → retrieves ALL indexed chunks from the matched files and presents them in reading order
Filename + content keyword search — the query asks for files by name that also contain specific words (e.g., «найди файлы письмо в которых встречается слово договор») → vector search is performed within the matched files only
No filename intent — the query is a regular question → standard vector search across all indexed documents
Content display is triggered by keywords like: «выведи», «прочитай», «полностью», «целиком», «содержание», «содержимое», «весь текст», «что внутри», «display», «read file», «show content», «full text», «entire content», «what's inside», etc.
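The intent detection described above can be pictured as keyword/regex matching over the query. The patterns below are deliberately simplified assumptions; the plugin's real matching is richer:

```typescript
// Hypothetical sketch of filename-search intent detection.
// These regexes are illustrative, not the plugin's actual patterns.
const FILENAME_INTENT =
  /найди (все )?файлы?|в названии которых|find (all )?files? named|files? with name|files? called/i;
const CONTENT_DISPLAY =
  /выведи|прочитай|полностью|целиком|содержание|содержимое|display|read file|show content|full text/i;

// Classifies a query into one of the filename-search scenarios.
function detectIntent(query: string): "filename" | "filename+content" | "none" {
  if (!FILENAME_INTENT.test(query)) return "none";
  return CONTENT_DISPLAY.test(query) ? "filename+content" : "filename";
}
```

A query with no filename intent falls through to the standard vector search across all indexed documents.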
OCR Settings
Enable OCR (default: true): Enable OCR for image files and image-based PDFs using LM Studio's built-in document parser
OCR Language (default: eng+rus): Tesseract language code for OCR. Supports any Tesseract language combination: eng (English), rus (Russian), eng+rus (both), deu (German), fra (French), etc.
OCR Data Path (default: empty): Path to folder with .traineddata files. Leave empty to auto-detect: the plugin checks its own root folder for .traineddata files matching all requested languages. If any language is missing, Tesseract auto-downloads from CDN on first use. For offline use, place all required .traineddata files (e.g. eng.traineddata, rus.traineddata) in the plugin root or set a custom path. For best quality, download best-traineddata files.
OCR Min Text Length (0-10000, default: 20): Minimum characters for PDF text to be considered valid. Lower values catch short pages (stamps, forms).
OCR Max Pages (1-50000, default: 200): Maximum PDF pages to process with OCR. Increase for large documents.
OCR Max Images Per Page (1-100, default: 10): Maximum images per PDF page for OCR. Increase for pages with many diagrams/tables.
OCR Min Image Area (0-100000, default: 2500): Minimum image area (width×height in px) for OCR. Lower values process smaller images (signatures, stamps).
OCR Max Image Pixels (1M-500M, default: 100M): Maximum image area (px²) to process. Prevents OOM on huge scans. ~100M = 10000×10000.
OCR Image Timeout (ms) (5000-300000, default: 60000): Timeout in ms for loading image data from PDF. Increase for slow systems.
When a PDF exceeds the configured page or image limits, the plugin logs a warning (e.g., ⚠️ PDF "book.pdf" has 500 pages, but maxPages=200) and returns the partially extracted text.
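The image-area limits above amount to a simple gate applied before an image is handed to OCR. A minimal sketch, with assumed field names:

```typescript
// Sketch of how the OCR image-size limits could gate processing.
// Field names are assumptions for illustration.
interface OcrLimits {
  minImageArea: number;   // skip images smaller than this (px²), e.g. tiny icons
  maxImagePixels: number; // skip images larger than this (px²) to avoid OOM
}

// Returns true if an image of the given dimensions should be OCR'd.
function shouldOcrImage(width: number, height: number, limits: OcrLimits): boolean {
  const area = width * height;
  return area >= limits.minImageArea && area <= limits.maxImagePixels;
}
```

With the defaults (2500 and 100M), a 40×40 icon is skipped as too small and a 20000×20000 scan is skipped as too large, while ordinary page scans pass through.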
Reindexing Controls
Manual Reindex Trigger (toggle): Turn this ON and submit any chat message to force indexing to run on every chat session where the plugin is enabled. Flip it OFF once you're done to stop the automatic reindex loop.
Skip Previously Indexed Files (default: true): Controls what each manual reindex run does. When enabled, a run processes only documents that are new or have changed since the last index; when disabled, every run rebuilds the entire index from scratch. Combine this with Manual Reindex Trigger to choose between incremental updates and repeated full refreshes.
Automatic First-Run: If the vector store is empty, the plugin automatically indexes the configured documents the first time any chat message is processed—no manual input is required.
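The interaction of these two settings can be sketched as a single decision function. This is a minimal sketch assuming change detection via file size and mtime; the plugin's actual bookkeeping may differ:

```typescript
// Sketch of the skip/reindex decision. The assumption that the index stores
// size and mtime per file is illustrative, not confirmed plugin behavior.
interface IndexedFileRecord { size: number; mtimeMs: number; }

function needsReindex(
  record: IndexedFileRecord | undefined,          // prior index entry, if any
  stat: { size: number; mtimeMs: number },        // current file stats
  skipPreviouslyIndexed: boolean,
): boolean {
  if (!skipPreviouslyIndexed) return true;        // full refresh every run
  if (!record) return true;                       // new file, never indexed
  return record.size !== stat.size || record.mtimeMs !== stat.mtimeMs; // changed
}
```

Turning Skip Previously Indexed Files off makes every file report "needs reindex", which is what produces the full rebuild on each chat.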
Usage
Configure the Plugin:
Open LM Studio settings
Navigate to the Big RAG plugin configuration
Set your documents directory (e.g., /Users/user/Documents/MyLibrary)
Set your vector store directory (e.g., /Users/user/.lmstudio/big-rag-db)
Initial Indexing:
The first time you send a message, the plugin will automatically scan and index your documents
This process may take a while depending on the size of your document collection
Progress will be shown in the LM Studio interface
Query Your Documents:
Simply chat with your LM Studio model as usual
The plugin will automatically search your indexed documents for relevant content
Retrieved passages will be injected into the context for the model to use
Architecture
Components
File Scanner (src/ingestion/fileScanner.ts):
Recursively scans directories
Filters for supported file types
Collects file metadata
Document Parsers (src/parsers/):
htmlParser.ts: Extracts text from HTML/HTM files
pdfParser.ts: Extracts text from PDF files
epubParser.ts: Extracts text from EPUB files
textParser.ts: Reads plain text & Markdown files with optional Markdown stripping
imageParser.ts: OCR for image files
docxParser.ts: Extracts text from DOCX (Word) files via mammoth
xlsxParser.ts: Extracts text from XLSX/XLS (Excel) files via SheetJS
pptxParser.ts: Extracts text from PPTX (PowerPoint) files via JSZip
documentParser.ts: Routes to appropriate parser
Vector Store (src/vectorstore/vectorStore.ts):
Uses Vectra with sharded indexes (one shard in memory at a time; avoids V8 string size limits)
Supports incremental updates
Efficient similarity search
Index Manager (src/ingestion/indexManager.ts):
Orchestrates the indexing pipeline
Manages concurrent processing
Handles progress reporting
Prompt Preprocessor (src/promptPreprocessor.ts):
Intercepts user queries
Performs vector search
Injects relevant context
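The retrieval path reduces to two primitives: routing each chunk to a shard, and ranking chunks by cosine similarity to the query embedding. A minimal sketch of both, not Vectra's actual internals:

```typescript
// Illustrative sketch of shard routing and similarity ranking.
// Vectra's real sharding strategy and scoring may differ.

// Deterministically maps a document ID to one of N shards,
// so each shard stays small enough to load into memory alone.
function shardFor(docId: string, shardCount: number): number {
  let h = 0;
  for (const ch of docId) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % shardCount;
}

// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Searching shards one at a time trades some latency for bounded memory, which is what lets the index scale past single-file and V8 string size limits.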
Performance Considerations
Large Datasets
Disk Space: The vector store requires additional disk space (typically 10-20% of original document size)
Initial Indexing: Can take several hours for TB-scale collections
Memory Usage: Scales with concurrent processing (reduce maxConcurrentFiles if needed)
Optimization Tips
Start Small: Test with a subset of documents first
Disable OCR: Unless you have many image-based documents, keep OCR disabled
Adjust Concurrency: Lower maxConcurrentFiles on systems with limited resources
Chunk Size: Larger chunks (1024-2048) work better for technical documents
Threshold Tuning: Adjust retrievalAffinityThreshold based on result quality
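The chunk-size tip above can be illustrated with fixed-size chunking with overlap. This is a sketch under assumed parameter names; the plugin's chunker may split on sentence or paragraph boundaries instead:

```typescript
// Sketch of fixed-size chunking with overlap. Overlap keeps context that
// straddles a chunk boundary retrievable from both neighboring chunks.
function chunkText(text: string, chunkSize = 1024, overlap = 128): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance less than chunkSize to overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Larger chunks mean fewer embeddings and more context per retrieved passage, at the cost of coarser retrieval granularity.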
Troubleshooting
No Results Found
Check that documents directory is correctly configured
Verify that indexing completed successfully
Try lowering the retrieval affinity threshold
Check LM Studio logs for errors
Slow Indexing
Reduce maxConcurrentFiles
Disable OCR if not needed
Ensure vector store directory is on a fast drive (SSD recommended)
Out of Memory
Reduce maxConcurrentFiles to 1 or 2
Process documents in batches by organizing them into subdirectories
Increase system swap space
OCR Not Working
Tesseract.js downloads language data on first use for each language (fast model from CDN)
For better quality, download best-traineddata files and set the OCR Data Path to the folder containing them
Ensure internet connectivity during first OCR operation (unless using local .traineddata files)
Check that the OCR Language setting matches your document language (e.g., rus for Russian, eng+rus for mixed)
Try adjusting OCR Page Segmentation Mode — PSM 6 works better for tables and forms, PSM 4 for single-column text
Check that image files are valid and readable
If large PDFs are partially processed, check the logs for ⚠️ warnings and increase OCR Max Pages or OCR Max Images Per Page
Failure Reason Reporting
The CLI logs cumulative success/failure counts after each processed document.
Set BIG_RAG_FAILURE_REPORT_PATH=/absolute/path/report.json when running npm run index (or via LM Studio env settings) to emit a JSON report containing all failure reasons and counts after indexing completes. This is useful when triaging stubborn PDFs such as blueprints or large scanned books.
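A failure report of this kind boils down to grouping failures by reason. The JSON shape below is a hypothetical sketch, not the schema the plugin actually writes:

```typescript
// Sketch of aggregating per-file failures into a report.
// The output shape { total, byReason } is an assumption for illustration.
interface Failure { file: string; reason: string; }

function aggregateFailures(failures: Failure[]) {
  const byReason: Record<string, number> = {};
  for (const f of failures) byReason[f.reason] = (byReason[f.reason] ?? 0) + 1;
  return { total: failures.length, byReason };
}
```

Grouping by reason makes it easy to spot, say, that most failures are OCR timeouts on large scanned books rather than parser crashes.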
CLI Indexing
For standalone indexing (requires LM Studio running for embeddings):
node dist/cliIndex.js /path/to/docs /path/to/db
Environment variables:
BIG_RAG_EMBEDDING_MODEL — embedding model ID (default: gpustack/text-embedding-bge-m3)
BIG_RAG_OCR_LANGUAGE — OCR language (default: eng+rus)
BIG_RAG_OCR_DATA_PATH — path to .traineddata folder
BIG_RAG_OCR_PSM — Tesseract PSM (default: 3)
BIG_RAG_OCR_MAX_PAGES — max OCR pages (default: 200)
BIG_RAG_OCR_MAX_IMAGES_PER_PAGE — max images per page (default: 10)
BIG_RAG_OCR_MIN_IMAGE_AREA — min image area (default: 2500)
BIG_RAG_OCR_MAX_IMAGE_PIXELS — max image pixels (default: 100000000)
BIG_RAG_OCR_IMAGE_TIMEOUT_MS — image timeout ms (default: 60000)
BIG_RAG_FORCE_REINDEX — set to true to force full reindex
BIG_RAG_FAILURE_REPORT_PATH — path to write failure report JSON
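Reading these variables with their documented defaults can be sketched as a small helper; the helper name is an assumption, not the plugin's actual code:

```typescript
// Sketch: read a numeric env var, falling back to its documented default
// when the variable is unset or not a finite number.
function envInt(name: string, fallback: number): number {
  const raw = process.env[name];
  const n = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(n) ? n : fallback;
}

// Examples with the defaults documented above.
const ocrMaxPages = envInt("BIG_RAG_OCR_MAX_PAGES", 200);
const forceReindex = process.env.BIG_RAG_FORCE_REINDEX === "true";
```

Boolean flags like BIG_RAG_FORCE_REINDEX are compared against the literal string "true", so any other value leaves them off.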
Limitations
RAR Archives: Not yet implemented (files are skipped)
Password-Protected Files: Not supported
Very Large Files: Individual files >100MB may cause memory issues
OCR Language Coverage: limited only by available Tesseract language data; any Tesseract language can be set via the OCR Language setting (default: eng+rus)