Get Context Length
LLMs and embedding models, due to their fundamental architecture, have a property called context length, and more specifically a maximum context length. Loosely speaking, this is how many tokens the models can "keep in memory" when generating text or embeddings. Exceeding this limit will result in the model behaving erratically.
Use the get_context_length() function on the model object
It's useful to be able to check the context length of a model, especially as an extra check before providing potentially long input to the model.
context_length = model.get_context_length()
The model in the above code snippet is an instance of a loaded model you get from the llm.model method. See Manage Models in Memory for more information.
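Putting the two together, a quick standalone check might look like this (a minimal sketch; lms.llm() returns a handle to the currently loaded model, and the printed message is just illustrative):

import lmstudio as lms

model = lms.llm()  # handle to a currently loaded model
context_length = model.get_context_length()
print(f"This model supports up to {context_length} tokens of context.")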
Example: Check if the input will fit in the model's context window
You can determine if a given conversation fits into a model's context by doing the following:
import lmstudio as lms

def does_chat_fit_in_context(model: lms.LLM, chat: lms.Chat) -> bool:
    # Convert the conversation to a string using the prompt template.
    formatted = model.apply_prompt_template(chat)
    # Count the number of tokens in the string.
    token_count = len(model.tokenize(formatted))
    # Get the currently loaded context length of the model.
    context_length = model.get_context_length()
    return token_count < context_length

model = lms.llm()

chat = lms.Chat.from_history({
    "messages": [
        {"role": "user", "content": "What is the meaning of life?"},
        {"role": "assistant", "content": "The meaning of life is..."},
        # ... more messages
    ]
})

print("Fits in context:", does_chat_fit_in_context(model, chat))