Forked from mindstudio/big-rag
EXAMPLES.md
These examples show practical settings for different document collections. Paths are examples; replace them with local paths on your machine.
Use this for API references, Markdown docs, JSON/YAML config examples, and generated documentation.
Documents Directory: ~/Documents/tech-library
Vector Store Directory: ~/.lmstudio/tech-library-db
Chunk Size: 768
Chunk Overlap: 150
Chunking Strategy: sentence
Min Chunk Length: 20
Retrieval Limit: 7
Retrieval Affinity Threshold: 0.55
Max Concurrent Files: 3
Enable OCR: false
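To make the interaction between Chunk Size, Chunk Overlap, and Min Chunk Length concrete, here is a minimal sketch. It is not the plugin's implementation: it swaps in a plain character-based sliding window rather than the sentence strategy, and it assumes that sizes are measured in characters, which this document does not state. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 768, overlap: int = 150,
               min_length: int = 20) -> list[str]:
    """Split text into overlapping character windows (illustrative only).

    Assumption: sizes are counted in characters. The real plugin may count
    tokens and may split on sentence boundaries instead.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        # Drop fragments shorter than min_length (headers, stray lines).
        if len(piece) >= min_length:
            chunks.append(piece)
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2000))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

With the defaults above, each window advances by 618 characters, so consecutive chunks share 150 characters. A larger overlap preserves more context across chunk boundaries at the cost of a bigger vector store.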
Example queries:
Notes:
Use this for PDF-heavy academic collections.
Example queries:
Notes:
Use this for DOCX and ODT collections such as policies, contracts, procedures, and reports.
Example queries:
Notes:
mammoth. adm-zip when available and falls back to best-effort XML text extraction.
Use this for contracts, regulations, case files, and scanned legal records.
Example queries:
Notes:
Use this for notes, articles, saved web pages, recipes, and ebooks.
Example queries:
Notes:
Increase minChunkLength if short notes or page fragments create noisy retrieval results.
Use this for operational logs, CSV exports, JSONL traces, and config directories.
Example queries:
Notes:
If users paste large text blocks into chat, set Max Query Length to match the embedding model limit. The plugin truncates only the query used for embedding; the original user prompt is still preserved in the final prompt template.
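The truncation behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the length is assumed to be counted in characters, and the names `prepare_query` and `build_final_prompt`, as well as the prompt layout, are hypothetical.

```python
def prepare_query(user_prompt: str, max_query_length: int) -> tuple[str, str]:
    """Return (embedding_query, original_prompt).

    Only the text sent to the embedding model is shortened; the original
    prompt is passed through untouched.
    Assumption: the limit is measured in characters.
    """
    embedding_query = user_prompt[:max_query_length]
    return embedding_query, user_prompt


def build_final_prompt(original_prompt: str, retrieved_chunks: list[str]) -> str:
    """Assemble a final prompt from retrieved context and the full user prompt.

    The layout below is a placeholder, not the plugin's actual template.
    """
    context = "\n\n".join(retrieved_chunks)
    return f"Context:\n{context}\n\nUser prompt:\n{original_prompt}"


if __name__ == "__main__":
    pasted = "x" * 5000
    query, original = prepare_query(pasted, max_query_length=2000)
    print(len(query), len(original))
```

The point of the split is that retrieval only needs enough of the query to find relevant chunks, while the model still sees everything the user wrote.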
Recommended starting values:
Keep maxConcurrentFiles low for PDFs and OCR.
Increase minChunkLength to remove headers, page numbers, and boilerplate.
Documents Directory: ~/Research/papers
Vector Store Directory: ~/.lmstudio/research-db
Chunk Size: 768
Chunk Overlap: 150
Chunking Strategy: sentence
Min Chunk Length: 30
Retrieval Limit: 10
Retrieval Affinity Threshold: 0.55
Max Concurrent Files: 2
Enable OCR: false
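Retrieval Limit and Retrieval Affinity Threshold work together: the threshold discards weak matches and the limit caps how many survivors are passed on. The sketch below shows one plausible way to combine them. It assumes that affinity is a similarity score where higher means more relevant and that the threshold is inclusive; neither detail is confirmed by this document. `ScoredChunk` and `select_chunks` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    text: str
    score: float  # assumed: similarity score, higher is more relevant


def select_chunks(candidates: list[ScoredChunk], limit: int = 10,
                  threshold: float = 0.55) -> list[ScoredChunk]:
    """Keep chunks at or above the threshold, best first, capped at limit."""
    kept = [c for c in candidates if c.score >= threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:limit]


if __name__ == "__main__":
    candidates = [
        ScoredChunk("methods section", 0.90),
        ScoredChunk("page footer", 0.50),
        ScoredChunk("related work", 0.55),
        ScoredChunk("abstract", 0.70),
    ]
    for chunk in select_chunks(candidates):
        print(f"{chunk.score:.2f} {chunk.text}")
```

Under these assumptions, raising the threshold trades recall for precision, which is why the stricter legal preset uses 0.7 while the looser log-oriented preset uses 0.45.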

Documents Directory: ~/Documents/company-archive
Vector Store Directory: ~/.lmstudio/company-archive-db
Chunk Size: 512
Chunk Overlap: 100
Chunking Strategy: sentence
Min Chunk Length: 20
Retrieval Limit: 6
Retrieval Affinity Threshold: 0.6
Max Concurrent Files: 2
Enable OCR: false

Documents Directory: ~/Legal/documents
Vector Store Directory: ~/.lmstudio/legal-db
Chunk Size: 512
Chunk Overlap: 100
Chunking Strategy: sentence
Min Chunk Length: 25
Retrieval Limit: 5
Retrieval Affinity Threshold: 0.7
Max Concurrent Files: 1
Enable OCR: true

Documents Directory: ~/Knowledge
Vector Store Directory: ~/.lmstudio/knowledge-db
Chunk Size: 512
Chunk Overlap: 100
Chunking Strategy: character
Min Chunk Length: 20
Retrieval Limit: 5
Retrieval Affinity Threshold: 0.5
Max Concurrent Files: 3
Enable OCR: true

Documents Directory: ~/ops-data
Vector Store Directory: ~/.lmstudio/ops-data-db
Chunk Size: 1024
Chunk Overlap: 128
Chunking Strategy: character
Min Chunk Length: 10
Retrieval Limit: 8
Retrieval Affinity Threshold: 0.45
Max Concurrent Files: 4
Enable OCR: false
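When adapting one of these presets, it can help to sanity-check the values before indexing a large collection. The sketch below parses a `Key: Value` block like the ones above and flags a few likely mistakes. The checks are assumptions for illustration (for example, that the affinity threshold lies between 0 and 1); they are not rules taken from the plugin, and `parse_settings` and `check_settings` are hypothetical helpers.

```python
def parse_settings(block: str) -> dict[str, str]:
    """Parse 'Key: Value' lines into a dictionary of strings."""
    settings = {}
    for line in block.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_settings(settings: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems (empty if none found).

    Assumed rules, not confirmed by the plugin documentation.
    """
    problems = []
    chunk_size = int(settings["Chunk Size"])
    overlap = int(settings["Chunk Overlap"])
    threshold = float(settings["Retrieval Affinity Threshold"])
    concurrency = int(settings["Max Concurrent Files"])

    if overlap >= chunk_size:
        problems.append("Chunk Overlap must be smaller than Chunk Size")
    if not 0.0 <= threshold <= 1.0:
        problems.append("Retrieval Affinity Threshold should be between 0 and 1")
    if concurrency < 1:
        problems.append("Max Concurrent Files must be at least 1")
    return problems


if __name__ == "__main__":
    ops_preset = """\
Documents Directory: ~/ops-data
Vector Store Directory: ~/.lmstudio/ops-data-db
Chunk Size: 1024
Chunk Overlap: 128
Retrieval Affinity Threshold: 0.45
Max Concurrent Files: 4
"""
    print(check_settings(parse_settings(ops_preset)))
```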