OpenConceptLab/ocl_issues#2477 | repo external exports#881
Conversation
filiperochalopes
left a comment
There was a problem hiding this comment.
Was the download audit intentionally left out?
Do we currently have any way to audit which users downloaded files through AWS? This audit trail is needed for some decisions @askanter needs to make regarding which versions should be kept and which users are still using those versions.
Also, locally I’ve had some issues where the 302 redirect did not always happen. I’m currently using a local patch to cover redirect validation to the signed bucket path.
It would be useful to add an integration test ensuring that the download GET works correctly and results in a 302 redirect.
|
|
||
| return path | ||
|
|
||
| def get_external_export_path(self, key, filename): |
There was a problem hiding this comment.
Simple sanitization using .replace() for slashes might be insufficient and risky.
- Path Traversal: It doesn't handle '..' sequences, which can lead to unpredictable path resolution.
- OS Compatibility: It preserves characters like ':', '*', '<', '>', which are invalid on Windows systems.
- Storage/URL Issues: Spaces and special characters are kept, which often break S3 signed URLs and Content-Disposition headers during download.
Recommendation: Use django.utils.text.get_valid_filename(filename) for a robust, industry-standard solution.
from django.utils.text import get_valid_filename
def get_external_export_path(self, key, filename):
base_path = self.get_version_export_path(suffix=None).rstrip(".")
safe_filename = get_valid_filename(filename)
return f"{base_path}/external/{key}_{safe_filename}"|
|
||
| class Meta: | ||
| model = RepoExternalExport | ||
| fields = ('key', 'description', 'url', 'created_at', 'updated_at', 'file_path') |
There was a problem hiding this comment.
Is file_path necessary here? Maybe remove it for security reasons, preventing disclosure of filesystem paths
Linked Issue
Refs OpenConceptLab/ocl_issues#2477
OCL Web2 PR: OpenConceptLab/oclweb2#36