lms load
The lms load command loads a model into memory. You can optionally set parameters such as context length, GPU offload, and TTL.
Parameters

[path] (optional) : string
The path of the model to load. If not provided, you will be prompted to select one.

--ttl (optional) : number
If provided, the model will be unloaded after it has been idle for this number of seconds.

--gpu (optional) : string
How much to offload to the GPU. Values: 0-1, off, max.

--context-length (optional) : number
The number of tokens the model will consider as context when generating text.

--identifier (optional) : string
The identifier to assign to the loaded model for API reference.
Load a model

Load a model into memory by running the following command:
lms load <model_key>
You can find the model_key by first running lms ls to list your locally downloaded models.
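For example (the model key below is hypothetical; substitute a key from your own lms ls output):

lms ls                         # list locally downloaded models and their keys
lms load qwen2.5-7b-instruct   # load a model by its key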
Set a custom identifier

Optionally, you can assign a custom identifier to the loaded model for API reference:
lms load <model_key> --identifier "my-custom-identifier"
You will then be able to refer to this model by the identifier my-custom-identifier in subsequent commands and API calls (the model parameter).
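For example, you can target the loaded model from the OpenAI-compatible chat completions endpoint. This is a minimal sketch, assuming the local server has been started with lms server start and is listening on the default port 1234:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-custom-identifier",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'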
Set context length

You can set the context length when loading a model using the --context-length flag:
lms load <model_key> --context-length 4096
This determines how many tokens the model will consider as context when generating text.
Set GPU offload

Control GPU memory usage with the --gpu flag:
lms load <model_key> --gpu 0.5   # Offload 50% of layers to GPU
lms load <model_key> --gpu max   # Offload all layers to GPU
lms load <model_key> --gpu off   # Disable GPU offloading
If not specified, LM Studio will automatically determine optimal GPU usage.
Set TTL

Set an auto-unload timer with the --ttl flag (in seconds):
lms load <model_key> --ttl 3600 # Unload after 1 hour of inactivity
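These flags can be combined in a single invocation. A sketch that loads a model with an 8192-token context, full GPU offload, a 30-minute TTL, and a custom identifier:

lms load <model_key> --context-length 8192 --gpu max --ttl 1800 --identifier "my-model"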
Operate on a remote LM Studio instance

lms load supports the --host flag to connect to a remote LM Studio instance:
lms load <model_key> --host <host>
For this to work, the remote LM Studio instance must be running and reachable from your local machine, for example on the same subnet.
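For example, if the remote instance were reachable at 192.168.1.42 (a hypothetical address), you would run:

lms load <model_key> --host 192.168.1.42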