TESTING.md
This guide provides instructions for testing the Big RAG plugin with various scenarios and dataset sizes.
nomic-ai/nomic-embed-text-v1.5-GGUF
Before larger end-to-end runs, ensure the core parsers succeed:
npm run test
This builds the project and executes the HTML/Markdown/Text regression tests located in src/tests/parseDocument.test.ts.
cd big-rag-plugin
npm install
Create a test directory structure:
mkdir -p ~/test-documents/subfolder1
mkdir -p ~/test-documents/subfolder2/deep
Add some test files:
# Create a simple text file
echo "This is a test document about artificial intelligence and machine learning." > ~/test-documents/test1.txt

# Create an HTML file
cat > ~/test-documents/test2.html << 'EOF'
<!DOCTYPE html>
<html>
<head><title>Test Document</title></head>
<body>
<h1>Machine Learning Basics</h1>
<p>Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.</p>
</body>
</html>
EOF

# Create a Markdown file in a subfolder
# (printf is used because plain echo does not reliably expand \n)
printf '# Deep Learning\n\nDeep learning uses neural networks with multiple layers.\n' > ~/test-documents/subfolder1/test3.md
mkdir -p ~/.lmstudio/big-rag-test-db
Objective: Verify the plugin can index and retrieve from a small dataset.
Steps:
Start the plugin in dev mode:
Configure in LM Studio:
Documents directory: ~/test-documents
Vector store directory: ~/.lmstudio/big-rag-test-db
Send a test query:
Expected Result: The plugin should:
Success Criteria:
Objective: Test with a larger dataset (100+ files).
Steps:
Generate test files:
for i in {1..100}; do
  echo "Document $i: This document discusses topic $((i % 10)) in detail." > ~/test-documents/doc_$i.txt
done
Clear the vector store:
Restart the plugin and send a query
Expected Result:
Success Criteria:
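Before querying the larger dataset, it can help to know what a correct retrieval should look like. The sketch below rebuilds the same 100 files in a scratch directory (`WORK_DIR` is a name introduced here, not part of the plugin) and counts how many documents mention a given topic. Because the topic number is `i % 10`, each topic should appear in exactly 10 files.

```shell
# Scratch copy of the 100-file dataset; point WORK_DIR at ~/test-documents
# instead to check the real test set.
WORK_DIR=$(mktemp -d)

i=1
while [ "$i" -le 100 ]; do
  echo "Document $i: This document discusses topic $((i % 10)) in detail." > "$WORK_DIR/doc_$i.txt"
  i=$((i + 1))
done

# Total number of generated files.
total=$(ls "$WORK_DIR"/doc_*.txt | wc -l | tr -d ' ')

# Number of files that mention topic 3 (documents 3, 13, ..., 93).
topic3=$(grep -l "topic 3 in detail" "$WORK_DIR"/doc_*.txt | wc -l | tr -d ' ')

echo "total files: $total"
echo "files about topic 3: $topic3"
```

A query about "topic 3" therefore has ten equally plausible source documents, which is worth keeping in mind when judging the retrieved citations.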
Objective: Verify all supported file types are processed correctly.
Steps:
Add different file types to test directory:
Clear vector store and reindex
Query for content that should be in each file type
Success Criteria:
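One way to tell which file type a retrieved passage came from is to give each file a marker word that appears nowhere else. The following is a minimal sketch for the three formats covered by the parser regression tests (text, Markdown, HTML); the marker words, file names, and `WORK_DIR` are made up for illustration, and other supported formats would need their own samples.

```shell
# Scratch directory; copy the samples into ~/test-documents before reindexing.
WORK_DIR=$(mktemp -d)

# One file per format, each with a marker word that appears nowhere else.
echo "Plain text sample. Marker: zebrafinch" > "$WORK_DIR/sample.txt"
printf '# Markdown sample\n\nMarker: quokkaberry\n' > "$WORK_DIR/sample.md"
cat > "$WORK_DIR/sample.html" << 'EOF'
<!DOCTYPE html>
<html>
<head><title>HTML sample</title></head>
<body><p>Marker: narwhalstone</p></body>
</html>
EOF

# Sanity check: each marker should occur in exactly one file.
for marker in zebrafinch quokkaberry narwhalstone; do
  hits=$(grep -l "$marker" "$WORK_DIR"/sample.* | wc -l | tr -d ' ')
  echo "$marker: $hits file(s)"
done
```

Querying for one marker at a time then shows whether that particular format was parsed and indexed.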
Objective: Verify that already-indexed files are skipped.
Steps:
Expected Result: The second indexing run should be much faster because files that are already indexed are skipped.
Success Criteria:
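To make the second run easier to interpret, it may help to know exactly which files were added after the first run. The sketch below uses a timestamp marker file and `find -newer`. The marker file name and `WORK_DIR` are hypothetical, and it assumes the plugin only needs to process files it has not seen before.

```shell
# Scratch directory standing in for ~/test-documents.
WORK_DIR=$(mktemp -d)
echo "Already indexed document." > "$WORK_DIR/old.txt"

# Marker written right after the first indexing run finishes.
touch "$WORK_DIR/.first-run-marker"
sleep 1  # make sure later files get a strictly newer timestamp

# A file added after the first run.
echo "Newly added document." > "$WORK_DIR/new.txt"

# Files newer than the marker: the only ones the second run should need to process.
new_files=$(find "$WORK_DIR" -type f -name '*.txt' -newer "$WORK_DIR/.first-run-marker")
echo "$new_files"
```

If the second run reports processing more files than this list contains, the skip logic deserves a closer look.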
Objective: Test different concurrency settings.
Steps:
Set maxConcurrentFiles to 1
Set maxConcurrentFiles to 5
Expected Result: Higher concurrency should index faster, at the cost of higher memory use.
Success Criteria:
Objective: Test different threshold settings.
Steps:
Set retrievalAffinityThreshold to 0.9 (very strict)
Set retrievalAffinityThreshold to 0.3 (very loose)
Expected Result:
Success Criteria:
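The effect of the two settings can be illustrated without the plugin. This sketch assumes that retrievalAffinityThreshold acts as a minimum similarity score, which the source does not spell out. The scores and chunk names are invented values, not plugin output.

```shell
# Hypothetical similarity scores, one "score chunk-name" pair per line (made-up values).
scores="0.95 chunk-a
0.72 chunk-b
0.41 chunk-c
0.18 chunk-d"

count_above() {
  # Count lines whose score (first field) is at least the given threshold.
  printf '%s\n' "$scores" | awk -v t="$1" '$1 >= t { n++ } END { print n + 0 }'
}

strict=$(count_above 0.9)
loose=$(count_above 0.3)

echo "threshold 0.9 keeps $strict chunk(s)"
echo "threshold 0.3 keeps $loose chunk(s)"
```

With these sample values the strict setting keeps one chunk and the loose setting keeps three, which is the general pattern to look for: fewer but closer matches at 0.9, more but noisier matches at 0.3.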
Objective: Verify OCR works for image files.
Steps:
Expected Result: Text from image should be extracted and searchable.
Success Criteria:
Objective: Verify graceful handling of errors.
Test Cases:
Invalid Directory:
Corrupted File:
No Write Permission:
Disk Full (simulated):
Success Criteria:
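The first three error cases can be staged with a few fixtures. This is a sketch only: the directory and file names are hypothetical, and the `.html` extension on the corrupted file is an assumption about which parser you want to exercise. Simulating a full disk depends on the platform and is not covered here.

```shell
# Scratch area for the fixtures.
WORK_DIR=$(mktemp -d)

# Invalid directory: a path that does not exist, to use as the documents directory.
MISSING_DIR="$WORK_DIR/does-not-exist"

# Corrupted file: 1 KiB of random bytes behind a document extension.
dd if=/dev/urandom of="$WORK_DIR/corrupted.html" bs=1024 count=1 2>/dev/null

# No write permission: a read-only directory to use as the vector store location.
READONLY_DIR="$WORK_DIR/readonly-db"
mkdir "$READONLY_DIR"
chmod 555 "$READONLY_DIR"

echo "missing directory:   $MISSING_DIR"
echo "corrupted file:      $WORK_DIR/corrupted.html"
echo "read-only directory: $READONLY_DIR"

# When finished, restore permissions so the fixtures can be deleted:
#   chmod 755 "$READONLY_DIR" && rm -rf "$WORK_DIR"
```

Point the plugin's directory settings at these paths one at a time and check that each failure is reported clearly without crashing.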
Note: Times vary based on hardware, file types, and OCR usage.
The plugin uses LM Studio's logging system. To see debug output:
Add ctl.debug() calls in code for detailed logging
"No relevant content found"
"Vector store not initialized"
Slow indexing
High memory usage
After testing, clean up test data:
# Remove test documents
rm -rf ~/test-documents

# Remove test vector store
rm -rf ~/.lmstudio/big-rag-test-db
For automated testing, consider:
Example test structure:
describe('DocumentParser', () => {
  it('should parse HTML correctly', async () => {
    const result = await parseHTML('test.html');
    expect(result).toContain('expected text');
  });
});
When reporting issues, include:
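Commands and query referenced in the steps above:
Start the plugin in dev mode: npm run dev
Test query: What is machine learning?
Clear the vector store: rm -rf ~/.lmstudio/big-rag-test-db/*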