Cancelling Predictions
One benefit of using the streaming API is the ability to cancel the prediction request based on criteria that can't be represented using the stopStrings or maxPredictedTokens configuration settings.
The following snippet illustrates cancelling the request in response to an application-specific cancellation condition (such as polling an event set by another thread).
import lmstudio as lms

model = lms.llm()

prediction_stream = model.respond_stream("What is the meaning of life?")
cancelled = False
for fragment in prediction_stream:
    if ...: # Cancellation condition will be app specific
        cancelled = True
        prediction_stream.cancel()
    # Note: it is recommended to let the iteration complete,
    # as doing so allows the partial result to be recorded.
    # Breaking the loop *is* permitted, but means the partial result
    # and final prediction stats won't be available to the client

# The stream allows the prediction result to be retrieved after iteration
if not cancelled:
    print(prediction_stream.result())
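
As one concrete way to fill in the app-specific condition, the sketch below polls a threading.Event set by another thread. The cancel_requested event (and whichever thread sets it) is illustrative application code, not part of the SDK; only the stream methods shown above are assumed.

import threading

import lmstudio as lms

# Another part of the application calls cancel_requested.set() to ask for cancellation
cancel_requested = threading.Event()

model = lms.llm()

prediction_stream = model.respond_stream("What is the meaning of life?")
cancelled = False
for fragment in prediction_stream:
    if cancel_requested.is_set():
        cancelled = True
        prediction_stream.cancel()
        # Keep iterating so the partial result and stats are still recorded

if not cancelled:
    print(prediction_stream.result())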