tupik/top • LM Studio Hub

1. Parse each file via ctl.client.files.parseDocument()
2. Extract full content
3. Format with headers: ** filename full content **
4. Inject into prompt with instructions

This is an Enriched Context Generation scenario.

The following content was found in the files provided by the user.

** document.pdf full content **

[full file content]

** end of document.pdf **

Based on the content above, please provide a response to the user query.

User query: [user query]
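
The formatting and injection steps above can be sketched as a pure function. This is a minimal sketch, not the plugin's actual code: `ParsedDocument` and `buildFullContentPrompt` are hypothetical names, and the text of each file is assumed to have been parsed already (for example via `ctl.client.files.parseDocument()`).

```typescript
// Hypothetical shape for a file whose text has already been parsed.
interface ParsedDocument {
  name: string;
  content: string;
}

// Wraps each document in the header and footer markers from the template
// above, then appends the closing instruction and the user query.
function buildFullContentPrompt(docs: ParsedDocument[], userQuery: string): string {
  const sections = docs.map(
    (doc) => `** ${doc.name} full content **\n\n${doc.content}\n\n** end of ${doc.name} **`
  );
  return [
    "This is an Enriched Context Generation scenario.",
    "The following content was found in the files provided by the user.",
    ...sections,
    "Based on the content above, please provide a response to the user query.",
    `User query: ${userQuery}`,
  ].join("\n\n");
}
```

Keeping the builder free of SDK calls makes the template easy to inspect or unit test independently of file parsing.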

1. Load embedding model (nomic-embed-text-v1.5-GGUF)
2. Perform semantic search via ctl.client.files.retrieve()
3. Filter results by retrievalAffinityThreshold
4. Add found citations to the prompt
5. Attach citations via ctl.addCitations()
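
The filtering step can be illustrated with a small standalone function. This is a sketch under assumptions: `ScoredEntry` and `selectCitations` are hypothetical names, the comparison against `retrievalAffinityThreshold` is assumed to be strict, and the limit is applied here after filtering, although the plugin may instead pass it to the retrieval call.

```typescript
// Hypothetical shape of one retrieval hit: matched text plus a relevance score.
interface ScoredEntry {
  content: string;
  score: number;
}

// Keeps entries scoring above the threshold, best first, capped at the limit.
function selectCitations(
  entries: ScoredEntry[],
  retrievalAffinityThreshold: number,
  retrievalLimit: number
): ScoredEntry[] {
  return entries
    .filter((entry) => entry.score > retrievalAffinityThreshold) // new array, input untouched
    .sort((a, b) => b.score - a.score)
    .slice(0, retrievalLimit);
}
```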

The following citations were found in the files provided by the user:

Citation 1: "[citation text]"

Citation 2: "[citation text]"

Use the citations above to respond to the user query, only if they are relevant. Otherwise, respond to the best of your ability without them.

User Query:

[user query]

Important: No citations were found in the user files for the user query. In less than one sentence, inform the user of this. Then respond to the query to the best of your ability.

User Query:

[user query]
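
Both retrieval prompt variants above (citations found, and no citations found) can be produced by one function. The following is an illustrative sketch only; `buildRetrievalPrompt` is a hypothetical name and the citation texts are assumed to be plain strings.

```typescript
// Builds the retrieval prompt from the templates above. An empty citation
// list selects the "no citations found" variant.
function buildRetrievalPrompt(citations: string[], userQuery: string): string {
  if (citations.length === 0) {
    return [
      "Important: No citations were found in the user files for the user query. " +
        "In less than one sentence, inform the user of this. " +
        "Then respond to the query to the best of your ability.",
      "User Query:",
      userQuery,
    ].join("\n\n");
  }
  // Citations are numbered from 1, matching the template.
  const numbered = citations.map((text, i) => `Citation ${i + 1}: "${text}"`);
  return [
    "The following citations were found in the files provided by the user:",
    ...numbered,
    "Use the citations above to respond to the user query, only if they are relevant. " +
      "Otherwise, respond to the best of your ability without them.",
    "User Query:",
    userQuery,
  ].join("\n\n");
}
```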

| Step | Description |
| --- | --- |
| 1 | Load LLM model via `ctl.client.llm.model()` |
| 2 | Measure current context usage via `measureContextWindow()` |
| 3 | Parse files and count total tokens |
| 4 | Calculate available tokens with 70% target utilization |
| 5 | Compare: `totalFilePlusPromptTokenCount > availableContextTokens` |

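// contextOccupiedPercent and modelRemainingContextLength come from measureContextWindow().
const contextOccupiedFraction = contextOccupiedPercent / 100;
// Aim to use at most 70% of the context, scaled down by what is already occupied.
const targetContextUsePercent = 0.7;
const targetContextUsage = targetContextUsePercent * (1 - contextOccupiedFraction);
// Token budget for file content plus the user prompt.
const availableContextTokens = Math.floor(modelRemainingContextLength * targetContextUsage);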
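
A worked example may make the formula easier to follow. The sketch below wraps the same arithmetic in a function; `computeAvailableContextTokens` and `TARGET_CONTEXT_USE` are hypothetical names, and the input values are assumed for illustration rather than measured from a real model.

```typescript
// Share of the free context the formula aims to fill (70%, per the steps above).
const TARGET_CONTEXT_USE = 0.7;

function computeAvailableContextTokens(
  contextOccupiedPercent: number,
  modelRemainingContextLength: number
): number {
  const contextOccupiedFraction = contextOccupiedPercent / 100;
  const targetContextUsage = TARGET_CONTEXT_USE * (1 - contextOccupiedFraction);
  return Math.floor(modelRemainingContextLength * targetContextUsage);
}

// Assumed readings: 25% of the window is in use and 6144 tokens remain.
// 0.7 * (1 - 0.25) = 0.525, and 6144 * 0.525 = 3225.6, which floors to 3225.
const availableTokens = computeAvailableContextTokens(25, 6144);
console.log(availableTokens); // 3225
```

Note that the budget shrinks twice as the conversation grows: the remaining length drops, and the target usage factor drops with it.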

If totalFileTokenCount + userPromptTokenCount > availableContextTokens
    → retrieval
Else
    → inject-full-content
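
The selection rule above translates directly into a small function. This is a sketch with hypothetical names (`ContextStrategy`, `chooseContextStrategy`); it assumes the `none` case (no usable files) has been handled before this point.

```typescript
type ContextStrategy = "inject-full-content" | "retrieval";

// Falls back to retrieval only when files plus prompt exceed the token budget.
function chooseContextStrategy(
  totalFileTokenCount: number,
  userPromptTokenCount: number,
  availableContextTokens: number
): ContextStrategy {
  const totalFilePlusPromptTokenCount = totalFileTokenCount + userPromptTokenCount;
  return totalFilePlusPromptTokenCount > availableContextTokens
    ? "retrieval"
    : "inject-full-content";
}
```

Because the comparison is strict, content that exactly fills the budget is still injected in full.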

{
  totalTokensInContext: number,        // tokens currently used by the conversation
  modelContextLength: number,          // model's maximum context length, in tokens
  modelRemainingContextLength: number, // tokens still available
  contextOccupiedPercent: number       // share of the context already filled (0–100)
}
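
The relationship between these four fields can be sketched as follows. This is an assumption-based illustration, not the helper itself: `summarizeContextWindow` is a hypothetical name, the two token counts are passed in as plain numbers rather than read from a loaded model, and the remaining length is assumed to be the context length minus the tokens in use.

```typescript
interface ContextMeasurement {
  totalTokensInContext: number;
  modelContextLength: number;
  modelRemainingContextLength: number;
  contextOccupiedPercent: number;
}

// Derives the remaining length and occupancy percentage from the two raw counts.
function summarizeContextWindow(
  totalTokensInContext: number,
  modelContextLength: number
): ContextMeasurement {
  return {
    totalTokensInContext,
    modelContextLength,
    modelRemainingContextLength: modelContextLength - totalTokensInContext,
    contextOccupiedPercent: (totalTokensInContext / modelContextLength) * 100,
  };
}
```

For example, 2048 tokens used in an 8192-token window leaves 6144 tokens and an occupancy of 25%.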

| Status | Message |
| --- | --- |
| Deciding | `Deciding how to handle the document(s)...` |
| Loading parser | `Loading parser for {filename}...` |
| Parser loaded | `{library} loaded for {filename}...` |
| Processing | `Parsing file {filename}... ({progress}%)` |
| Retrieval | `Retrieving relevant citations for user query...` |
| Done | `Retrieved {N} relevant citations for user query` |

import {
  text,
  type Chat,
  type ChatMessage,
  type FileHandle,
  type LLMDynamicHandle,
  type PredictionProcessStatusController,
  type PromptPreprocessorController,
} from "@lmstudio/sdk";

┌─────────────────────────────────────────────────────────────┐
│                    User Message + Files                      │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     preprocess()                             │
│  - Load history                                              │
│  - Filter files (no images)                                  │
│  - Choose strategy                                           │
└─────────────────────────────────────────────────────────────┘
                              │
              ┌───────────────┴───────────────┐
              │                               │
              ▼                               ▼
    ┌──────────────────────┐      ┌─────────────────────┐
    │  inject-full-content │      │      retrieval      │
    │                      │      │                     │
    │  - Parse all files   │      │  - Load embeddings  │
    │  - Format content    │      │  - Semantic search  │
    │  - Build prompt      │      │  - Filter by score  │
    └──────────────────────┘      │  - Build prompt     │
                                  └─────────────────────┘
                                              │
                              ┌───────────────┴───────────────┐
                              │                               │
                              ▼                               ▼
                    ┌─────────────────┐           ┌─────────────────┐
                    │  Results found  │           │  No results     │
                    │  - Add citations│           │  - Inform user  │
                    │  - Continue     │           │  - Continue     │
                    └─────────────────┘           └─────────────────┘

| Parameter | Type | Description |
| --- | --- | --- |
| `retrievalLimit` | number | Maximum number of citations to retrieve |
| `retrievalAffinityThreshold` | number | Relevance threshold for filtering citations (0.0–1.0) |
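
A small validation helper shows how these two parameters might be checked before use. It is a sketch only: `RetrievalConfig` and `validateRetrievalConfig` are hypothetical names, and requiring `retrievalLimit` to be a positive integer is an assumption (the table only states that it is a number).

```typescript
interface RetrievalConfig {
  retrievalLimit: number;
  retrievalAffinityThreshold: number;
}

// Returns a list of problems; an empty list means the config looks usable.
function validateRetrievalConfig(config: RetrievalConfig): string[] {
  const problems: string[] = [];
  // Assumption: a limit only makes sense as a whole number of at least 1.
  if (!Number.isInteger(config.retrievalLimit) || config.retrievalLimit < 1) {
    problems.push("retrievalLimit must be a positive integer");
  }
  // The documented range for the threshold is 0.0 to 1.0; NaN also fails this check.
  const threshold = config.retrievalAffinityThreshold;
  if (!(threshold >= 0 && threshold <= 1)) {
    problems.push("retrievalAffinityThreshold must be between 0.0 and 1.0");
  }
  return problems;
}
```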

Prompt Preprocessor

Overview

Main Function: `preprocess`

Context Injection Strategies

1. `inject-full-content` — Full Content Injection

2. `retrieval` — Semantic Search

3. `none` — No Context

Strategy Selection Algorithm

Algorithm Steps

Calculation Formula

Selection Criteria

Helper Functions

`measureContextWindow()`

`getEffectiveContextFormatted()`

`prepareRetrievalResultsContextInjection()`

`prepareDocumentContextInjection()`

Configuration

User Status Messages

Debug Output

Dependencies

Architecture Diagram