Warn dataset owners when dataset resources are HTTP; replace with HTTPS if not 404 #2985
Labels
component/catalog
Related to catalog component playbooks/roles
H2.0/Harvest-Runner
Harvest Source Processing for Harvesting 2.0
not-mvp
User Story
In order to maintain trust and accessibility of datasets we index (by preventing browser warnings), we want to ensure catalog.data.gov doesn't generate mixed http/https content.
Acceptance Criteria
GIVEN a dataset resource has an http:// link
AND there is an equivalent https:// link that doesn't error
WHEN a harvest of that data source happens
THEN a warning about the HTTP link is generated in the harvest report
AND the HTTPS link is the one that's recorded.
GIVEN a dataset resource has an http:// link
AND there is NO equivalent https:// link that doesn't error
WHEN a harvest of that data source happens
THEN a warning about the HTTP link is generated in the harvest report
AND the HTTP link should not be recorded.
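The two scenarios above can be sketched as a small decision function. This is only an illustration of the intended behavior, not the harvester's actual code: `https_equivalent`, `link_ok`, and `resolve_resource_url` are hypothetical names, and treating "doesn't error" as "a HEAD request returns a status below 400" is an assumption.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit


def https_equivalent(url: str) -> str:
    """Return the same URL with its scheme switched to https."""
    return urlsplit(url)._replace(scheme="https").geturl()


def link_ok(url: str, timeout: float = 10.0) -> bool:
    """Hypothetical probe: True if a HEAD request completes without an error.

    Assumption: "doesn't error" means no network failure and a status < 400.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        return False


def resolve_resource_url(
    url: str, probe: Callable[[str], bool] = link_ok
) -> Tuple[Optional[str], Optional[str]]:
    """Return (url_to_record, warning_for_harvest_report).

    - Non-HTTP links are recorded unchanged with no warning.
    - HTTP links with a working HTTPS equivalent are replaced, with a warning.
    - HTTP links without one are not recorded (None), with a warning.
    """
    if urlsplit(url).scheme.lower() != "http":
        return url, None

    candidate = https_equivalent(url)
    if probe(candidate):
        return candidate, f"HTTP resource link {url} was replaced with {candidate}"
    return None, f"HTTP resource link {url} has no working HTTPS equivalent; link not recorded"
```

Passing the probe in as a parameter keeps the decision logic testable without network access; a harvest run would use a real check, while a test can substitute `lambda u: True` or `lambda u: False`.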
Background
https://blog.chromium.org/2020/02/protecting-users-from-insecure.html
Lynda reported this.
https://catalog.data.gov/dataset/fws-critical-habitat-for-threatened-and-endangered-species-datasetd55fc
This dataset links to HTTP resources, and Chrome (and other modern web browsers) will block these downloads.
Example of the problem (at the time of issue creation):
http://ecos.fws.gov/docs/crithab/crithab_all/crithab_all_layers.zip
http://ecos.fws.gov/docs/crithab/crithab_all/crithab_all_shapefiles.zip
Security Considerations (required)
This change prevents catalog.data.gov from ever presenting mixed content.
Sketch
Note the upstream issue we filed.