Forked from mindstudio/big-rag
TESTING.md
This guide provides instructions for testing the Big RAG plugin with various scenarios and dataset sizes.
An embedding model (e.g., nomic-ai/nomic-embed-text-v1.5-GGUF)
Before larger end-to-end runs, ensure the core parsers succeed:
This builds the project and executes the HTML/Markdown/Text regression tests located in src/tests/parseDocument.test.ts.
Create a test directory structure:
Add some test files:
Objective: Verify the plugin can index and retrieve from a small dataset.
Steps:
Success Criteria:
Objective: Test with a larger dataset (100+ files).
Steps:
Generate test files:
Success Criteria:
Objective: Verify all supported file types are processed correctly.
Steps:
Add different file types to test directory:
Clear vector store and reindex
Query for content that should be in each file type
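One way to make the per-type queries unambiguous is to give each fixture its own marker phrase. The sketch below covers only the text-based types named earlier in this guide (text, Markdown, HTML). The `filetypes` subfolder, the `TEST_DIR` variable, and the `zebra-*` marker strings are illustrative assumptions, not part of the plugin.

```shell
# Sketch: one fixture per text-based file type, each with a unique marker.
# TEST_DIR, the "filetypes" folder, and the zebra-* markers are hypothetical.
TEST_DIR="${TEST_DIR:-$HOME/test-documents}"
mkdir -p "$TEST_DIR/filetypes"

printf 'Plain text marker: zebra-txt-001\n' > "$TEST_DIR/filetypes/sample.txt"
printf '# Markdown sample\n\nMarkdown marker: zebra-md-002\n' > "$TEST_DIR/filetypes/sample.md"
cat > "$TEST_DIR/filetypes/sample.html" << 'EOF'
<!DOCTYPE html>
<html>
<head><title>HTML sample</title></head>
<body><p>HTML marker: zebra-html-003</p></body>
</html>
EOF

# List the fixtures that contain a marker, as a sanity check before indexing.
grep -l 'zebra-' "$TEST_DIR"/filetypes/sample.*
```

After reindexing, querying for a single marker (for example `zebra-md-002`) should surface only the matching file, which makes it easy to tell which parser failed if a marker is never retrieved.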
Success Criteria:
Objective: Verify that already-indexed files are skipped.
Steps:
Expected Result: Second indexing should be much faster as files are already indexed.
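To know which files a second run should actually pick up, it can help to snapshot the test directory after the first indexing pass and compare it before the second. The sketch below is a testing aid only and says nothing about how the plugin itself detects already-indexed files. `TEST_DIR`, `SNAPSHOT`, and `incremental_new.txt` are hypothetical names.

```shell
# Sketch: record checksums after the first run, then list new or changed files.
# TEST_DIR, SNAPSHOT, and incremental_new.txt are illustrative assumptions.
TEST_DIR="${TEST_DIR:-$HOME/test-documents}"
SNAPSHOT="${SNAPSHOT:-${TMPDIR:-/tmp}/big-rag-snapshot.txt}"

snapshot() {
  # One line per file: checksum, size, path. Sorted so comm can compare.
  find "$TEST_DIR" -type f -exec cksum {} + | sort
}

mkdir -p "$TEST_DIR"
rm -f "$TEST_DIR/incremental_new.txt"   # start without the extra file
snapshot > "$SNAPSHOT"                   # take this after the first indexing pass

# Add one file between runs.
echo "New file added between indexing runs." > "$TEST_DIR/incremental_new.txt"

# Lines present only in the current listing are new or modified files.
CHANGED="$(snapshot | comm -13 "$SNAPSHOT" -)"
echo "$CHANGED"
```

If the second indexing pass reports processing more files than this list contains, files are probably being reindexed rather than skipped.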
Success Criteria:
Objective: Test different concurrency settings.
Steps:
Set maxConcurrentFiles to 1
Set maxConcurrentFiles to 5
Expected Result: Higher concurrency should be faster (but use more memory).
Success Criteria:
Objective: Test different threshold settings.
Steps:
Set retrievalAffinityThreshold to 0.9 (very strict)
Set retrievalAffinityThreshold to 0.3 (very loose)
Expected Result:
Success Criteria:
Objective: Verify OCR works for image files.
Steps:
Expected Result: Text from image should be extracted and searchable.
Success Criteria:
Objective: Verify graceful handling of errors.
Test Cases:
Success Criteria:
Note: Times vary based on hardware, file types, and OCR usage.
The plugin uses LM Studio's logging system. To see debug output:
Use ctl.debug() calls in code for detailed logging
After testing, clean up test data:
For automated testing, consider:
Example test structure:
When reporting issues, include:
Start the plugin in dev mode:
Configure in LM Studio:
Documents directory: ~/test-documents
Vector store directory: ~/.lmstudio/big-rag-test-db
Send a test query:
Expected Result: The plugin should:
Clear the vector store:
Restart the plugin and send a query
Expected Result:
Invalid Directory:
Corrupted File:
No Write Permission:
Disk Full (simulated):
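The first three cases can be set up with throwaway fixtures, as in the rough sketch below. It assumes a scratch directory from `mktemp`. The names `WORK`, `MISSING_DIR`, `corrupted.html`, and `readonly-db` are hypothetical, and the disk-full case is left out because simulating it is platform-specific.

```shell
# Sketch: fixtures for the error-handling cases (all names are hypothetical).
WORK="$(mktemp -d)"
mkdir -p "$WORK/docs" "$WORK/readonly-db"

# Invalid directory: a path that is guaranteed not to exist.
MISSING_DIR="$WORK/does-not-exist"

# Corrupted file: 1 KiB of random bytes behind a supported extension.
dd if=/dev/urandom of="$WORK/docs/corrupted.html" bs=1024 count=1 2>/dev/null

# No write permission: a vector store directory without write bits.
# Note: a process running as root can still write here.
chmod 555 "$WORK/readonly-db"

echo "invalid directory:   $MISSING_DIR"
echo "corrupted file:      $WORK/docs/corrupted.html"
echo "read-only vector db: $WORK/readonly-db"
```

Point the plugin's documents directory or vector store directory at the printed paths one case at a time, and remove the fixtures afterwards with `rm -rf "$WORK"`.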
"No relevant content found"
"Vector store not initialized"
Slow indexing
High memory usage
npm run test
cd big-rag-plugin
npm install
mkdir -p ~/test-documents/subfolder1
mkdir -p ~/test-documents/subfolder2/deep
# Create a simple text file
echo "This is a test document about artificial intelligence and machine learning." > ~/test-documents/test1.txt
# Create an HTML file
cat > ~/test-documents/test2.html << 'EOF'
<!DOCTYPE html>
<html>
<head><title>Test Document</title></head>
<body>
<h1>Machine Learning Basics</h1>
<p>Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.</p>
</body>
</html>
EOF
# Create a markdown file in subfolder
# printf is used because a plain echo in bash does not expand \n
printf '# Deep Learning\n\nDeep learning uses neural networks with multiple layers.\n' > ~/test-documents/subfolder1/test3.md
mkdir -p ~/.lmstudio/big-rag-test-db
for i in {1..100}; do
  echo "Document $i: This document discusses topic $((i % 10)) in detail." > ~/test-documents/doc_$i.txt
done
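Because the generation loop assigns topic number `i % 10`, each topic from 0 through 9 appears in exactly 10 of the 100 files, which gives a known ground truth for retrieval checks. The sketch below rebuilds the same dataset in a scratch directory (`WORK` is a hypothetical temp location, so the real test directory is untouched) and counts the files for one topic.

```shell
# Sketch: regenerate the dataset in a scratch directory and count ground truth.
WORK="$(mktemp -d)"
i=1
while [ "$i" -le 100 ]; do
  echo "Document $i: This document discusses topic $((i % 10)) in detail." > "$WORK/doc_$i.txt"
  i=$((i + 1))
done

# Total files generated, and how many of them mention topic 3.
TOTAL="$(ls "$WORK"/doc_*.txt | wc -l | tr -d ' ')"
TOPIC3="$(grep -l 'topic 3 in detail' "$WORK"/doc_*.txt | wc -l | tr -d ' ')"
echo "generated: $TOTAL, mentioning topic 3: $TOPIC3"
```

A query such as "topic 3" should therefore draw its citations from documents 3, 13, 23, and so on up to 93. Results from other files suggest the threshold is too loose for this dataset.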
# Remove test documents
rm -rf ~/test-documents
# Remove test vector store
rm -rf ~/.lmstudio/big-rag-test-db
describe('DocumentParser', () => {
  it('should parse HTML correctly', async () => {
    const result = await parseHTML('test.html');
    expect(result).toContain('expected text');
  });
});
npm run dev
What is machine learning?
rm -rf ~/.lmstudio/big-rag-test-db/*