This is a Retrieval-Augmented Generation (RAG) plugin for LM Studio. It enhances your local LLM with the ability to answer questions based on the content of the documents you provide.
## Features
- **Retrieval-Augmented Generation (RAG):** Automatically retrieves relevant information from your documents to answer your questions.
- **Two Context Strategies** (see the first sketch after this list):
  - **Inject Full Content:** For smaller documents, the plugin injects the entire content into the context.
  - **Retrieval:** For larger documents, it uses an embedding model to find and inject only the most relevant parts.
- **Automatic Embedding Model Detection:** The plugin can automatically detect and use a compatible embedding model that you have loaded or downloaded in LM Studio (see the second sketch below).
- **Configurable:** You can tune the retrieval parameters to suit your needs (see the Configuration section).
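To make the strategy split concrete, here is a minimal, self-contained TypeScript sketch of the decision. The `FULL_INJECTION_TOKEN_LIMIT` cutoff, the chars/4 token estimate, and all names are illustrative assumptions, not values taken from this plugin's source:

```typescript
// Hypothetical cutoff between "inject everything" and "retrieve chunks".
const FULL_INJECTION_TOKEN_LIMIT = 4000;

function estimateTokens(text: string): number {
  // Crude approximation (~4 chars per token); the real plugin would use
  // the loaded model's tokenizer instead.
  return Math.ceil(text.length / 4);
}

type ContextStrategy =
  | { kind: "inject-full"; content: string }
  | { kind: "retrieval" }; // handled by the embedding pipeline

function chooseStrategy(documentText: string): ContextStrategy {
  if (estimateTokens(documentText) <= FULL_INJECTION_TOKEN_LIMIT) {
    // Strategy 1: the document fits, so inject it wholesale.
    return { kind: "inject-full", content: documentText };
  }
  // Strategy 2: fall back to embedding-based retrieval.
  return { kind: "retrieval" };
}
```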
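And a sketch of what auto-detection could look like with the `@lmstudio/sdk` client. The `system.listDownloadedModels()` and `embedding.model()` calls and the `type`/`modelKey` fields are assumptions about the SDK surface, not code copied from this plugin; consult the SDK documentation before relying on them:

```typescript
import { LMStudioClient } from "@lmstudio/sdk";

// Assumed SDK calls: listDownloadedModels() returning entries with a
// `type` discriminator, and embedding.model() returning a model handle.
async function autoDetectEmbeddingModel(client: LMStudioClient) {
  const downloaded = await client.system.listDownloadedModels();
  const candidate = downloaded.find((m) => m.type === "embedding");
  if (!candidate) {
    throw new Error("No embedding model found; download one in LM Studio first.");
  }
  // Loads the model if it is not already loaded, then returns a handle.
  return client.embedding.model(candidate.modelKey);
}
```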
## Getting Started
### Development
The source code resides in the `src/` directory. To run the plugin in development mode, use:

```bash
lms dev
```
### Publishing
To share your plugin with the community, publish it to LM Studio Hub using:

```bash
lms push
```
The same command can also be used to update an existing plugin.
## Configuration
You can configure the plugin from the LM Studio UI. Here are the available options:
- **Embedding Model:** Choose which embedding model to use. Defaults to "Auto-Detect".
- **Manual Model ID (Optional):** Specify a model ID to override auto-detection.
- **Auto-Unload Model:** If enabled, the embedding model is unloaded from memory after retrieval.
- **Retrieval Limit:** The maximum number of text chunks to retrieve from the documents.
- **Retrieval Affinity Threshold:** The minimum similarity score for a chunk to be considered relevant. The sketch after this list shows how the last two options might interact.
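As a rough illustration, this self-contained TypeScript sketch scores pre-computed chunk embeddings against the query with cosine similarity, drops chunks below the affinity threshold, and caps the result at the retrieval limit. Cosine similarity and every name here are assumptions for illustration; the plugin's actual scoring may differ:

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface ScoredChunk { text: string; score: number; }

function selectChunks(
  queryEmbedding: number[],
  chunks: { text: string; embedding: number[] }[],
  retrievalLimit: number,      // the "Retrieval Limit" option
  affinityThreshold: number,   // the "Retrieval Affinity Threshold" option
): ScoredChunk[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .filter((c) => c.score >= affinityThreshold) // drop low-affinity chunks
    .sort((x, y) => y.score - x.score)           // most relevant first
    .slice(0, retrievalLimit);                   // cap the number of chunks
}
```

Raising the threshold trades recall for precision: fewer, but more on-topic, chunks reach the model's context.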