add a safety check for the pull mechanism to ignore HTML pages #14

Open
alexskr opened this issue Apr 12, 2018 · 1 comment

@alexskr
Member

alexskr commented Apr 12, 2018

Sometimes users enter an incorrect pull URL, which causes ncbo_cron to pull HTML pages and create a large number of bad submissions. Ideally the script should quickly determine whether the pulled file is an HTML document.
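
A minimal sketch of what such a quick check might look like, written in Python purely for illustration. It is not ncbo_cron code. The function name `looks_like_html`, the 2048-byte sample size, and the list of markers are all assumptions. The idea is to inspect the `Content-Type` header (when one is available) and the first bytes of the downloaded file before creating a submission.

```python
import re
from typing import Optional

# Tags that typically appear near the start of an HTML document.
# The trailing [\s>] avoids matching unrelated tags such as <header>.
_HTML_MARKERS = re.compile(
    rb"<!doctype\s+html|<html[\s>]|<head[\s>]|<body[\s>]",
    re.IGNORECASE,
)

# Hypothetical limit: how many leading bytes to inspect.
SAMPLE_SIZE = 2048


def looks_like_html(head: bytes, content_type: Optional[str] = None) -> bool:
    """Return True if the first bytes of a download look like an HTML page.

    head         -- the first bytes of the pulled file
    content_type -- the raw Content-Type header value, if the server sent one
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in ("text/html", "application/xhtml+xml"):
            return True

    # Ignore a UTF-8 byte order mark and leading whitespace.
    sample = head[:SAMPLE_SIZE].lstrip(b"\xef\xbb\xbf").lstrip()
    return _HTML_MARKERS.search(sample) is not None
```

For example, `looks_like_html(b"<!DOCTYPE html><html>")` is true, while an RDF/XML file that starts with `<?xml version="1.0"?><rdf:RDF ...>` is not flagged. A check like this is only a heuristic: a misconfigured server could label a valid ontology file as `text/html`, so the header test may need to be relaxed in practice.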

@alexskr
Member Author

alexskr commented Apr 26, 2019

Might as well add a comprehensive file type detection/verification mechanism to filter out files like images, HTML, and JS.
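
One way a broader check could be approached is by matching well-known file signatures ("magic numbers") at the start of the file. The Python sketch below is illustrative only. The names `sniff_file_type`, `should_reject`, and `REJECTED_TYPES` are hypothetical, and the signature table covers just PNG, JPEG, GIF, and HTML. JavaScript has no fixed signature, so it is not detected here and would need a different heuristic.

```python
# Well-known leading byte signatures for common image formats.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]

# Hypothetical policy: detected types that should never become submissions.
REJECTED_TYPES = {"png", "jpeg", "gif", "html"}


def sniff_file_type(head: bytes) -> str:
    """Guess a file type from its first bytes; return "unknown" if no match."""
    for signature, label in _SIGNATURES:
        if head.startswith(signature):
            return label

    # HTML has no binary signature, so fall back to a simple text check.
    text = head.lstrip().lower()
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"

    return "unknown"


def should_reject(head: bytes) -> bool:
    """Return True if the pulled file matches a type on the reject list."""
    return sniff_file_type(head) in REJECTED_TYPES
```

Files that match nothing come back as `"unknown"` and are not rejected, so this sketch only blocks known-bad types. An allow-list of expected ontology formats would be stricter, at the cost of having to enumerate every accepted format.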

syphax-bouazzouni referenced this issue in lifewatch-eric/ncbo_cron Jul 31, 2023
…missions-files

Feature: add option force re-archiving to archive_old_submissions script