Version: 1.2.0 | License: MIT
A flexible RAG (Retrieval-Augmented Generation) plugin for LM Studio with dynamic embedding model selection, intelligent context management, and multilingual support.
- nomic-ai/nomic-embed-text-v1.5-GGUF (built-in, fast)
- lm-kit/bge-m3-gguf (slower but more accurate)

The plugin will automatically load into LM Studio. You should see "Register with LM Studio" in the terminal output.
Access plugin settings in LM Studio → Plugins → RAG-Flex
| Parameter | Default | Range | Description |
|---|---|---|---|
| Message Language | Auto-detected | EN/ZH-TW/JA | Language for runtime messages |
| Embedding Model | nomic-ai/nomic-embed-text-v1.5 | 4 presets | Select from preset embedding models |
| Custom Embedding Model | (empty) | Text input | Override selection above with model key (e.g. text-embedding-bge-m3), identifier (e.g. lm-kit/bge-m3-gguf), or full path |
| Context Usage Threshold | 0.7 | 0.1 - 1.0 | Trigger point for RAG retrieval (lower = more precise) |
| Retrieval Limit | 5 | 1 - 15 | Number of chunks to retrieve |
| Retrieval Affinity Threshold | 0.4 | 0.0 - 1.0 | Similarity threshold (BGE-M3: 0.4-0.6 recommended) |
| Enable Debug Logging | Off | On/Off | Enable debug logs for developers |
| Debug Log Path | ./logs/lmstudio-debug.log | Custom path | Path to debug log file |
| Model | Size | Speed | Best For | Language Support |
|---|---|---|---|---|
| nomic-ai/nomic-embed-text-v1.5-GGUF | 84 MB | ⚡⚡⚡ Fast | English, general use | English |
| NathanMad/sentence-transformers_all-MiniLM-L12-v2-gguf | 133 MB | ⚡⚡⚡ Fast | Lightweight tasks | English |
| groonga/gte-large-Q4_K_M-GGUF | 216 MB | ⚡⚡ Medium | Balanced performance | Multilingual |
| lm-kit/bge-m3-gguf | 1.16 GB | ⚡ Slow (F16) / ⚡⚡ Medium (Q4) | Chinese, multilingual, high precision | 100+ languages |
Note: Due to SDK limitations, the dropdown only shows preset models. Use the Custom Embedding Model field to specify any downloaded model by entering its model key (e.g. text-embedding-qwen3-embedding-8b), identifier, or full path.
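The three accepted input forms can be told apart mechanically. The sketch below is an illustrative heuristic (the function name and exact rules are assumptions, not the plugin's actual resolution logic):

```typescript
// Classify a user-supplied embedding model string (illustrative heuristic).
type ModelRef = "path" | "identifier" | "key";

function classifyModelInput(input: string): ModelRef {
  // Absolute paths: POSIX ("/models/...") or Windows ("C:\..." / "C:/...")
  if (input.startsWith("/") || /^[A-Za-z]:[\\/]/.test(input)) return "path";
  // Hugging Face-style identifiers carry a publisher prefix, e.g. "lm-kit/bge-m3-gguf"
  if (input.includes("/")) return "identifier";
  // Anything else is treated as a model key, e.g. "text-embedding-bge-m3"
  return "key";
}
```

For example, `classifyModelInput("lm-kit/bge-m3-gguf")` yields `"identifier"`, while `classifyModelInput("text-embedding-bge-m3")` yields `"key"`.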
The threshold determines when to switch from full-text injection to RAG retrieval:
When to adjust:
| Threshold | Behavior | Use Case |
|---|---|---|
| 0.3-0.5 | Forces RAG more often | Large documents, memory constraints |
| 0.6-0.7 | Balanced (default) | General use |
| 0.8-0.9 | Allows more full injection | Small documents, need full context |
Different content types require different similarity thresholds:
| Content Type | Recommended Threshold | Reason |
|---|---|---|
| Natural language text | 0.5-0.7 | Clear semantic matching |
| Technical documentation | 0.4-0.6 | Technical terms vary |
| Code/SQL | 0.3-0.4 | Syntax-heavy, lower semantic similarity |
| Mixed language | 0.4-0.5 | Account for language switching |
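The affinity threshold is applied to similarity scores between the query embedding and each chunk embedding: chunks scoring below the threshold are dropped, and at most the retrieval limit of the best matches is kept. A minimal sketch of that mechanism, assuming cosine similarity (the plugin's internal scoring may differ):

```typescript
// Sketch of affinity-threshold filtering (illustrative, not the plugin's code).
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieve(
  query: number[],
  chunks: { text: string; embedding: number[] }[],
  threshold = 0.4, // Retrieval Affinity Threshold
  limit = 5,       // Retrieval Limit
): { text: string; score: number }[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosine(query, c.embedding) }))
    .filter((c) => c.score >= threshold) // drop weak matches
    .sort((x, y) => y.score - x.score)   // best first
    .slice(0, limit);                    // cap at the retrieval limit
}
```

This is why code/SQL needs a lower threshold: syntactically dense chunks score lower against natural-language queries, so a 0.5 cutoff can filter out every relevant chunk.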
The plugin automatically detects your system language and sets the UI accordingly:
- Checks the LANG, LANGUAGE, and LC_ALL environment variables

Supported Languages: English, 繁體中文, 日本語
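Detection amounts to normalizing a locale string such as `zh_TW.UTF-8` into a supported language tag. A sketch of that mapping, assuming LC_ALL takes precedence (the plugin's actual precedence and fallback rules may differ):

```typescript
// Sketch of system-language detection from environment variables (illustrative).
const SUPPORTED = ["en", "zh-TW", "ja"] as const;
type Lang = (typeof SUPPORTED)[number];

function detectLanguage(env: Record<string, string | undefined>): Lang {
  // e.g. "zh_TW.UTF-8" -> "zh-TW", "ja_JP.UTF-8" -> "ja"
  const raw = env.LC_ALL ?? env.LANGUAGE ?? env.LANG ?? "";
  const tag = raw.split(".")[0].replace("_", "-");
  if (tag.startsWith("zh-TW") || tag.startsWith("zh-Hant")) return "zh-TW";
  if (tag.startsWith("ja")) return "ja";
  return "en"; // fallback when the locale is unset or unsupported
}
```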
📖 For developers: See I18N.md for technical details on the internationalization system, adding new languages, and translation guidelines. Also available in 繁體中文 and 日本語.
Enable debug logging for troubleshooting or development:
Default log location: ./logs/lmstudio-debug.log
Cause: Selected model not downloaded in LM Studio
Solution:
- Download the selected model in LM Studio (e.g. bge-m3)

Alternative: Select a different model in plugin settings
Cause: Retrieval affinity threshold too high for your content
Solutions:
How to adjust: LM Studio → Plugins → RAG-Flex → Retrieval Affinity Threshold
Cause: Large file with high-precision embedding model
Solutions:
- Use a faster model: nomic-embed-text-v1.5 instead of bge-m3

Cause: System locale auto-detection doesn't match your preference
Solution:
Note: This only changes plugin runtime messages (errors, status updates). LM Studio's UI language is controlled by LM Studio itself.
Possible causes:
Solutions:
- Check ./logs/lmstudio-debug.log (use \\ or / as path separators)

💡 Pro Tip: All error messages are AI-friendly - paste them directly into your LLM chat for automated troubleshooting!
| Format | Extension | Processing Method | Notes |
|---|---|---|---|
| PDF | .pdf | Text extraction | Supports text-based PDFs (not scanned images) |
| Word Documents | .docx | Full document parsing | Preserves structure and formatting |
| Plain Text | .txt | Direct read | UTF-8 encoding recommended |
| Markdown | .md | Markdown parsing | Maintains heading structure |
Not supported: Images, audio, video, Excel spreadsheets, scanned PDFs without OCR
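Support boils down to routing each upload by extension to a processing method. A sketch of that dispatch, derived from the table above (the processor names are illustrative assumptions):

```typescript
// Route uploads by file extension (illustrative mapping from the table above).
type Processor = "pdf-text-extraction" | "docx-parsing" | "plain-read" | "markdown-parsing";

const PROCESSORS: Record<string, Processor> = {
  ".pdf": "pdf-text-extraction",
  ".docx": "docx-parsing",
  ".txt": "plain-read",
  ".md": "markdown-parsing",
};

function pickProcessor(filename: string): Processor | null {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";
  return PROCESSORS[ext] ?? null; // null => unsupported (images, .xlsx, ...)
}
```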
| Feature | RAG-v1 | RAG-Flex (v1.2.0) |
|---|---|---|
| Embedding Models | ❌ Hardcoded (nomic only) | ✅ 4 selectable + auto-detection |
| Multilingual Support | ❌ English only | ✅ English, 繁體中文, 日本語 |
| Error Messages | ❌ Technical English | ✅ User-friendly, localized |
| Context Management | ⚙️ Basic threshold | ✅ Smart threshold-based strategy |
| Affinity Threshold | ❌ Fixed at 0.5 | ✅ Configurable (0.0-1.0) |
| No-result Handling | ❌ Exposes system prompt | ✅ Graceful degradation |
| Model Detection | ❌ Manual configuration | ✅ Auto-detects local models |
| Debug Tools | ❌ None | ✅ Optional debug logging |
| Configuration UI | ⚙️ English only | ✅ Multilingual (system language) |
Contributions are welcome! Here's how you can help:
1. Create a feature branch (git checkout -b feature/amazing-feature)
2. Commit your changes (git commit -m 'Add amazing feature')
3. Push to the branch (git push origin feature/amazing-feature)

To add a new language:
- src/locales/types.ts
- src/locales/[lang].ts
- src/locales/index.ts
- src/config.ts language options
- README.[lang].md

MIT License - see LICENSE file for details.
This means you can:
Requirements:
Author: Henry Chen
GitHub: @henrychen95
Repository: rag-flex
LM Studio Plugin Page: lmstudio.ai/yongwei/rag-flex
⭐ If RAG-Flex helps your workflow, please star the repository!
Made with ❤️ for the LM Studio community
```shell
git clone https://github.com/henrychen95/rag-flex.git
cd rag-flex
lms dev
```
📎 Upload: meeting-notes.txt (5 KB)
💬 You: "What were the action items from the meeting?"
🤖 AI: [Reviews entire document] "The action items were:
1. John to prepare Q4 report by Friday
2. Sarah to schedule follow-up meeting..."
📎 Upload: technical-manual.pdf (2 MB)
💬 You: "How do I configure SSL certificates?"
🤖 AI: [Retrieves relevant sections]
"Based on Citation 1 and Citation 3:
To configure SSL certificates, you need to..."
Citation 1: (Page 45) "SSL Configuration involves..."
Citation 3: (Page 89) "Certificate installation steps..."
Scenario: Software developer needs API documentation
Upload: FastAPI-documentation.pdf (3.2 MB)
Ask: "What authentication methods does FastAPI support?"
Result: RAG retrieval mode activated
✓ Retrieved 5 relevant citations
✓ Found JWT, OAuth2, API Key sections
✓ Provided code examples from documentation
Configuration Tips:
- Context Threshold: 0.7 (default)
- Retrieval Limit: 5-7 (for comprehensive coverage)
- Affinity Threshold: 0.5 (technical content)
Scenario: Lawyer reviewing contract terms
Upload: commercial-lease-agreement.docx (250 KB)
Ask: "What are the tenant's responsibilities for maintenance?"
Result: Full-text injection mode (file within threshold)
✓ Entire document injected as context
✓ AI can cross-reference multiple clauses
✓ Comprehensive answer with exact clause numbers
Configuration Tips:
- Context Threshold: 0.8 (allow full injection)
- Language: 繁體中文 (for Traditional Chinese contracts)
Scenario: Understanding database schema
Upload: database-schema.sql (450 KB)
Ask: "Explain the relationship between users and orders tables"
Result: RAG retrieval with lowered threshold
✓ Retrieved relevant CREATE TABLE statements
✓ Found foreign key constraints
✓ Identified junction tables
Configuration Tips:
- Affinity Threshold: 0.3-0.4 (lower for code/SQL)
- Retrieval Limit: 8-10 (capture related tables)
- Model: bge-m3 (better for code with comments in Chinese)
Scenario: Public servant processing applications
Upload: subsidy-application-guidelines-2024.pdf (1.8 MB)
Ask: "申請資格有哪些限制條件?"
Result: Multilingual RAG retrieval
✓ Language auto-detected as Traditional Chinese
✓ Retrieved eligibility criteria sections
✓ Citations include page numbers and article references
Configuration Tips:
- Language: 繁體中文
- Model: bge-m3 (best for Traditional Chinese)
- Affinity Threshold: 0.5-0.6
Scenario: Graduate student literature review
Upload: machine-learning-survey-2024.pdf (4.5 MB)
Ask: "What are the current challenges in transformer architectures?"
Result: Precision RAG retrieval
✓ Retrieved sections from "Challenges" and "Future Work"
✓ Cross-referenced with methodology sections
✓ Provided citations with page numbers
Configuration Tips:
- Context Threshold: 0.6 (force RAG for large papers)
- Retrieval Limit: 10-15 (capture diverse viewpoints)
- Model: gte-large (good balance for academic content)
```
Available Context = Remaining Context × Threshold

If (File Tokens + Prompt Tokens) > Available Context:
  → Use RAG Retrieval (precise mode)
Else:
  → Use Full-Text Injection (comprehensive mode)
```
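The decision logic above translates directly into code. A minimal sketch, with parameter names assumed for illustration:

```typescript
// Direct translation of the mode-selection logic above (sketch).
function chooseMode(
  remainingContextTokens: number,
  fileTokens: number,
  promptTokens: number,
  threshold = 0.7, // Context Usage Threshold
): "rag-retrieval" | "full-text-injection" {
  const availableContext = remainingContextTokens * threshold;
  return fileTokens + promptTokens > availableContext
    ? "rag-retrieval"        // precise mode
    : "full-text-injection"; // comprehensive mode
}
```

For example, with an 8192-token context and the default 0.7 threshold, available context is about 5734 tokens, so a 6000-token document triggers RAG retrieval while a 3000-token one is injected whole.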