[WIP] Added Support for importing model from lm studio #679
base: main
Conversation
```python
async def install(self):
    input_model_path = self.source_id_or_path
    if not input_model_path or not os.path.isfile(input_model_path):
```
**Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High).** This path depends on a user-provided value.

Copilot Autofix (AI, 5 days ago):
To fix this vulnerability, the code must ensure that any user-supplied filename or path, once resolved, is guaranteed to remain within the intended LM Studio models directory. This must be enforced before any sensitive file system operation is performed.

Specifically, in the `LMStudioModel` `__init__` method (or in `install`), after constructing the path but before assignment, the code should:

- Normalize the constructed path (using `os.path.abspath` and `os.path.normpath`).
- Check that it is contained within the expected models directory (`lmstudio_models_dir()`), using a strong prefix match (`startswith` only after normalization).
- If this check fails, raise an exception or handle the import as invalid.

Changes to make:

- In `LMStudioModel.__init__`, validate that the resolved absolute model path is inside the LM Studio models directory; if not, raise a `ValueError`. This ensures that even if the user supplies a malicious path, they cannot escape the designated directory.

Implementation specifics:

- Add a `models_dir` variable in `__init__`, resolved via `lmstudio_models_dir()`.
- Use `os.path.abspath(os.path.join(models_dir, model_path))` (or similar) to resolve any model id as a file underneath the intended directory.
- Normalize and check with `startswith` as in the recommendation.
- Raise an exception if the check fails.
```diff
@@ -31,12 +31,24 @@
 class LMStudioModel(basemodel.BaseModel):
     def __init__(self, model_path: str):
-        filename = os.path.basename(model_path)
+        # Validate that the user-supplied path is inside the LMStudio models directory
+        models_dir = lmstudio_models_dir()
+        if not models_dir:
+            raise ValueError("Could not locate LM Studio models directory.")
+        # Support only files underneath the models_dir
+        # This ensures no path traversal outside the safe root
+        abs_models_dir = os.path.abspath(models_dir)
+        abs_model_path = os.path.abspath(os.path.join(abs_models_dir, model_path))
+        if not abs_model_path.startswith(abs_models_dir + os.sep):
+            raise ValueError("Requested model path is outside the LM Studio models directory.")
+        if not os.path.isfile(abs_model_path):
+            raise FileNotFoundError(f"Model file not found: {abs_model_path}")
+        filename = os.path.basename(abs_model_path)
         super().__init__(model_id=filename)

         self.source = "lmstudio"
         self.name = f"{os.path.splitext(filename)[0]} (LM Studio)"
-        self.source_id_or_path = os.path.abspath(model_path)
+        self.source_id_or_path = abs_model_path
         self.model_filename = filename

     async def get_json_data(self):
```
```python
output_filename = self.id
output_path = os.path.join(get_models_dir(), output_filename)

if os.path.exists(output_path):
```
**Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High).** This path depends on a user-provided value.

Copilot Autofix (AI, 5 days ago):
To fix the uncontrolled path expression problem, we must ensure that the constructed output path (`output_path`) is strictly constrained to a subdirectory of the intended models directory. This is best done by:

- Sanitizing the user-controlled `output_filename` using a function like `werkzeug.utils.secure_filename` to strip or replace potentially dangerous characters.
- Resolving the final path (using `os.path.abspath` or `os.path.realpath`) and confirming that it starts with the trusted base directory returned by `get_models_dir()`. If it doesn't, the operation must be rejected.
- Optionally, adding a check to block absolute paths or path components containing directory traversal (`..`).

The main places to edit:

- In `transformerlab/models/lmstudiomodel.py`, method `install()`:
  - Sanitize `output_filename` with `secure_filename`.
  - After constructing `output_path`, normalize it and check that it is within the directory returned by `get_models_dir()`. If not, raise an error.

A new import for `secure_filename` may be required.
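For illustration, here is a rough stdlib-only sketch of the kind of sanitization `secure_filename` performs. This is a hypothetical helper, not werkzeug's actual implementation, and its output differs from werkzeug's in details (werkzeug keeps path components joined with underscores, this version keeps only the last component):

```python
import re
import posixpath
import ntpath

def sanitize_filename(name: str) -> str:
    """Rough stdlib approximation of werkzeug's secure_filename:
    keep only the last path component, replace unsafe characters,
    and strip leading dots so '..' cannot survive."""
    # Drop directory components, whether /- or \-separated
    name = posixpath.basename(ntpath.basename(name))
    # Replace whitespace and anything outside a conservative charset
    name = re.sub(r"[^A-Za-z0-9_.-]", "_", name)
    # Strip leading dots/underscores so the result cannot be '.', '..', or hidden
    return name.lstrip("._")

result = sanitize_filename("../../etc/passwd")  # traversal components removed
```

A caller should still treat an empty result (e.g. from an input of `".."`) as invalid and reject the request.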
```diff
@@ -4,7 +4,7 @@
 import json
 import errno
 import shutil
+from werkzeug.utils import secure_filename
 _LM_MODEL_EXTS = (".gguf", ".safetensors", ".pt", ".bin")


@@ -63,13 +63,18 @@
 from lab.dirs import get_models_dir

-output_filename = self.id
-output_path = os.path.join(get_models_dir(), output_filename)
+output_filename = secure_filename(self.id)
+models_dir = get_models_dir()
+output_path = os.path.join(models_dir, output_filename)
+output_path = os.path.abspath(output_path)
+# Ensure output_path is within models_dir
+models_dir_abs = os.path.abspath(models_dir)
+if not output_path.startswith(models_dir_abs + os.sep):
+    raise ValueError("Invalid model id/path.")

 if os.path.exists(output_path):
     raise FileExistsError(errno.EEXIST, "Model already exists", output_path)
 os.makedirs(output_path, exist_ok=True)

 link_name = os.path.join(output_path, output_filename)
 os.symlink(input_model_path, link_name)
```
```python
if os.path.exists(output_path):
    raise FileExistsError(errno.EEXIST, "Model already exists", output_path)
os.makedirs(output_path, exist_ok=True)
```
**Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High).** This path depends on a user-provided value.

Copilot Autofix (AI, 5 days ago):
The best way to fix the problem is to validate that any untrusted path segment (here, `output_filename`, derived from user input in `self.id`) does not contain directory traversal or absolute path elements before using it to construct file system paths. In particular:

- Normalize the final constructed path (`output_path`) using `os.path.normpath`.
- Verify that the resulting normalized path starts with the intended root directory (`get_models_dir()`), after normalization.
- Optionally, ensure `output_filename` is a "secure" filename, e.g. using `werkzeug.utils.secure_filename` or by allowing only whitelisted model IDs.
- If the path is not contained in the intended directory after normalization, abort the installation and report an error.

This affects the `install` method of `LMStudioModel` in `transformerlab/models/lmstudiomodel.py`, specifically around the lines where `output_path` is constructed and used. We will add path normalization, containment checking, and filename sanitization, and import `werkzeug.utils.secure_filename` at the top.
```diff
@@ -4,7 +4,7 @@
 import json
 import errno
 import shutil
+from werkzeug.utils import secure_filename
 _LM_MODEL_EXTS = (".gguf", ".safetensors", ".pt", ".bin")


@@ -63,9 +63,15 @@
 from lab.dirs import get_models_dir

-output_filename = self.id
-output_path = os.path.join(get_models_dir(), output_filename)
+# Sanitize and validate filename
+output_filename = secure_filename(self.id)
+if not output_filename:
+    raise ValueError("Invalid model filename from model id.")
+models_root = os.path.abspath(get_models_dir())
+output_path = os.path.normpath(os.path.join(models_root, output_filename))
+# Ensure output_path is inside models_root after normalization
+if not output_path.startswith(models_root + os.sep):
+    raise ValueError("Attempted path traversal detected in model id.")

 if os.path.exists(output_path):
     raise FileExistsError(errno.EEXIST, "Model already exists", output_path)
 os.makedirs(output_path, exist_ok=True)
```
```python
os.makedirs(output_path, exist_ok=True)

link_name = os.path.join(output_path, output_filename)
os.symlink(input_model_path, link_name)
```
**Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High).** This path depends on a user-provided value.

Copilot Autofix (AI, 5 days ago):
To fix this issue, we want to ensure that the `input_model_path` used in `os.symlink` is always inside the trusted LM Studio models directory, so the path cannot be abused by a malicious user. This can be achieved by normalizing the input path and checking that it starts with the trusted models directory's absolute path; if it does not, an exception should be raised. This check should be performed inside the `install` method of `LMStudioModel` before `os.symlink` is called.

Specifically, in `transformerlab/models/lmstudiomodel.py`:

- Inside `LMStudioModel.install`, after assigning `input_model_path`, retrieve the trusted root directory (`lmstudio_models_dir()`), apply `os.path.abspath` + `os.path.normpath` to both the user-controlled path (`input_model_path`) and the root, and check that the user-controlled path starts with the root.
- If the check fails, raise an exception and do not proceed.
- Add an import of the utility if not already present.

No existing functionality will be broken, and legitimate users are not affected.
```diff
@@ -61,6 +61,12 @@
 if not input_model_path or not os.path.isfile(input_model_path):
     raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), input_model_path)

+# Validate that input_model_path is inside the trusted lmstudio models dir
+trusted_root = os.path.abspath(os.path.normpath(lmstudio_models_dir()))
+normalized_input_path = os.path.abspath(os.path.normpath(input_model_path))
+if not normalized_input_path.startswith(trusted_root + os.sep):
+    raise PermissionError(f"Model file path {input_model_path} is outside of trusted models directory {trusted_root}.")

 from lab.dirs import get_models_dir

 output_filename = self.id
```
```python
os.makedirs(output_path, exist_ok=True)

link_name = os.path.join(output_path, output_filename)
os.symlink(input_model_path, link_name)
```
**Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High).** This path depends on a user-provided value.

Copilot Autofix (AI, 5 days ago):
To mitigate path traversal and arbitrary file writes, the best practice is to sanitize and constrain the filenames/paths derived from user input before using them in any filesystem operation. Specifically:

- Only allow simple, safe filenames for models (`output_filename`).
- Use `werkzeug.utils.secure_filename()` to strip dangerous characters and patterns.
- Optionally, check that the resulting path after normalization is within the intended workspace/models root directory.

For the provided code, this means:

- In `LMStudioModel.install`, before using `self.id` (`output_filename`) to construct `output_path` and `link_name`, sanitize it via `secure_filename`.
- Since `secure_filename` is only imported in `transformerlab/routers/model.py`, add an import in `transformerlab/models/lmstudiomodel.py`.
- Replace the value of `output_filename` with the sanitized variant.

Files/regions/lines to change:

- Add `from werkzeug.utils import secure_filename` near the top of `transformerlab/models/lmstudiomodel.py`.
- In `install`, before using `self.id`, sanitize it: `output_filename = secure_filename(self.id)`.
- Use `output_filename` as usual below, now guaranteed to be safe.
```diff
@@ -4,7 +4,7 @@
 import json
 import errno
 import shutil
+from werkzeug.utils import secure_filename
 _LM_MODEL_EXTS = (".gguf", ".safetensors", ".pt", ".bin")


@@ -63,7 +63,7 @@
 from lab.dirs import get_models_dir

-output_filename = self.id
+output_filename = secure_filename(self.id)
 output_path = os.path.join(get_models_dir(), output_filename)

 if os.path.exists(output_path):
```
```python
}

model_info_file = os.path.join(output_path, "index.json")
with open(model_info_file, "w") as f:
```
**Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High).** This path depends on a user-provided value.

Copilot Autofix (AI, 5 days ago):
AI 5 days ago
To address this problem, we should sanitize or validate the use of user-controlled data in filesystem paths. The best general-purpose approach for this codebase is to ensure that the output_filename (derived from model_id) is a safe filename. This can be achieved by using werkzeug.utils.secure_filename, which strips dangerous characters and sub-paths (slashes, path traversal, etc.) from filenames, thereby preventing directory traversal and related issues.
The fix should be applied at the point where output_filename is derived (in LMStudioModel.install):
- Import
secure_filenamefromwerkzeug.utilsif not already done in this file. - Replace the assignment to
output_filenamewith a version that passesself.idthroughsecure_filename.
This ensures no untrusted or unsafe input is used as part of any file or directory name, mitigating the risk of path traversal regardless of any other possible input validation or lack thereof.
```diff
@@ -4,7 +4,7 @@
 import json
 import errno
 import shutil
+from werkzeug.utils import secure_filename
 _LM_MODEL_EXTS = (".gguf", ".safetensors", ".pt", ".bin")


@@ -63,7 +63,7 @@
 from lab.dirs import get_models_dir

-output_filename = self.id
+output_filename = secure_filename(self.id)
 output_path = os.path.join(get_models_dir(), output_filename)

 if os.path.exists(output_path):
```
You can probably use this pattern anywhere that it is complaining about user-assigned values for paths.
@dadmobile to review
…ab-api into add/import-lm-studio
Is this still WIP?
```python
        return ollamamodel.OllamaModel(model_source_id)
    case "huggingface":
        return huggingfacemodel.HuggingFaceModel(model_source_id)
    case "lmstudio":
```
I think you'll also have to add a case for "lmstudio" to the list_models_from_source function so they show up on the import modal.
```python
}

model_info_file = os.path.join(output_path, "index.json")
with open(model_info_file, "w") as f:
```
You can probably use this pattern anywhere that it is complaining about user-assigned values for paths.
```diff
@@ -0,0 +1,18 @@
+{
+    "name": "Google LM Studio Server",
```
Let's pull this for now and discuss if we actually want to add a server or not.
Okay @dadmobile. We can probably close this PR for now, keep it as a backup, discuss it separately later, and I'll reopen it then.
Fixes transformerlab/transformerlab-app#160
Should I also go ahead and add an LM Studio server in plugins? LM Studio runs on localhost:1234.
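Since LM Studio's local server listens on port 1234 by default and exposes an OpenAI-compatible API, a plugin could cheaply probe for it before offering the option. A sketch under assumptions (the `/v1/models` endpoint and the default URL are what LM Studio documents; the function itself is hypothetical):

```python
import json
import urllib.error
import urllib.request

def lmstudio_server_running(base_url: str = "http://localhost:1234",
                            timeout: float = 0.5) -> bool:
    """Return True if something answers the OpenAI-style /v1/models
    endpoint at base_url, False on any connection or protocol error."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            json.load(resp)  # must be valid JSON to count as a hit
            return True
    except (urllib.error.URLError, ValueError, OSError):
        return False
```

A probe like this would let the import modal show the LM Studio option only when the server is actually reachable, rather than failing later.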