lms load
The lms load command loads a model into memory. You can optionally set parameters such as context length, GPU offload, and TTL.
Parameters

[path] (optional) : string
The path of the model to load. If not provided, you will be prompted to select one.

--ttl (optional) : number
If provided, the model will be unloaded after this many seconds of inactivity.

--gpu (optional) : string
How much of the model to offload to the GPU. Values: a fraction between 0 and 1, off, or max.

--context-length (optional) : number
The number of tokens the model will consider as context when generating text.

--identifier (optional) : string
The identifier to assign to the loaded model, used to refer to it in API requests.
Load a model into memory by running the following command:
lms load <model_key>
You can find the model_key by first running lms ls to list your locally downloaded models.
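For example, if lms ls lists a model under the key llama-3.2-1b-instruct (a hypothetical key used here for illustration), you would load it with:

lms ls                            # note the model key in the listing
lms load llama-3.2-1b-instruct    # hypothetical model key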
Optionally, you can assign a custom identifier to the loaded model for API reference:
lms load <model_key> --identifier "my-custom-identifier"
You will then be able to refer to this model by the identifier my-custom-identifier in subsequent commands and API calls (the model parameter).
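For example, assuming the LM Studio server is running locally on its default port (1234; adjust the host and port if your setup differs), a chat completion request can target the loaded model by its identifier:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-custom-identifier",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'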
You can set the context length when loading a model using the --context-length flag:
lms load <model_key> --context-length 4096
This determines how many tokens the model will consider as context when generating text.
Control GPU memory usage with the --gpu flag:
lms load <model_key> --gpu 0.5    # Offload 50% of layers to GPU
lms load <model_key> --gpu max    # Offload all layers to GPU
lms load <model_key> --gpu off    # Disable GPU offloading
If not specified, LM Studio will automatically determine optimal GPU usage.
Set an auto-unload timer with the --ttl flag (in seconds):
lms load <model_key> --ttl 3600 # Unload after 1 hour of inactivity
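These flags can be combined in a single invocation. A sketch, using a hypothetical model key and arbitrary values:

lms load llama-3.2-1b-instruct --identifier "my-custom-identifier" --context-length 4096 --gpu max --ttl 3600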
lms load supports the --host flag to connect to a remote LM Studio instance.
lms load <model_key> --host <host>
For this to work, the remote LM Studio instance must be running and reachable from your local machine, for example on the same subnet.
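For example, to load a model on a remote instance at the hypothetical address 192.168.1.42:

lms load <model_key> --host 192.168.1.42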