Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

only allow relative paths in `load_uploaded_files? #461

Open
soxofaan opened this issue Sep 4, 2023 · 6 comments
Open

only allow relative paths in `load_uploaded_files? #461

soxofaan opened this issue Sep 4, 2023 · 6 comments

Comments

@soxofaan
Copy link
Member

soxofaan commented Sep 4, 2023

"name": "paths",
"description": "The files to read. Folders can't be specified, specify all files instead. An exception is thrown if a file can't be read.",
"schema": {
"type": "array",
"subtype": "file-paths",
"items": {
"type": "string",
"subtype": "file-path",
"pattern": "^[^\r\n\\:'\"]+$"

Current regex allows paths that escape the user workspace, like /etc/passwd and ../../etc/passwd.
Wouldn't it be cleaner to at least forbid absolute paths (starting with /)?

@m-mohr
Copy link
Member

m-mohr commented Sep 4, 2023

The file paths are not meant to be mapped to internal structures. The base path that should always be the user workspace, so I'd expect a folder/file.txt in the user workspace to be accessible via /folder/file.txt and folder/file.txt and ./folder/file.txt. A backend should ensure a user can never escape that base path. Maybe we need to clarify this further.

@soxofaan
Copy link
Member Author

soxofaan commented Sep 4, 2023

I'm also thinking about being future proof (e.g. loading files from another user), or allowing back-ends to go further than the official spec and workflows (e.g. supporting an out-of-band upload mechanism that uploads to something like /projects/ESA-123/data456.nc). If the standard regex is too liberal, there might be not enough wiggle room for these extensions.
That's why I'm suggesting to make sure the current standard only allows pure relative paths.

@m-mohr
Copy link
Member

m-mohr commented Sep 5, 2023

For other users, I'd rather use either a new parameter to select the data source or load_url. Encoding that in the path and as such exposing internals doesn't seem like a good idea. Similarly, if you have other data sources, or alternatively a new process.

@m-mohr m-mohr added this to the 2.0.0 milestone Sep 30, 2023
@m-mohr
Copy link
Member

m-mohr commented Jan 3, 2024

I don't see a direct todo here, closing for now. If you think differently, please open a PR with a proposal.

@m-mohr m-mohr closed this as not planned Won't fix, can't repro, duplicate, stale Jan 3, 2024
@soxofaan
Copy link
Member Author

soxofaan commented Jan 4, 2024

A backend should ensure a user can never escape that base path. Maybe we need to clarify this further.

I think this is at least a todo here

@m-mohr
Copy link
Member

m-mohr commented Jan 4, 2024

PRs are welcome.

@m-mohr m-mohr reopened this Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants