Automate the build and deployment of sensitive-species-data.xml #64

Open
1 of 3 tasks
nickdos opened this issue Jul 24, 2024 · 2 comments · Fixed by #66
nickdos commented Jul 24, 2024

With the planned decommissioning of sds-webapp2, the generation of the sensitive-species-data.xml file needs to be moved to an automated process.

Details and history: https://confluence.csiro.au/display/ALASD/sds-webapp2+decommission.
The current manual process using sds-webapp2: https://confluence.csiro.au/display/ALASD/SDS+and+Sensitive+Lists

Evaluate whether AWS CodePipeline is a suitable tool for automating this task.

Requirements:

  • Task should be easy to trigger manually by the DM team
  • Download the latest SDS jar from Nexus (or wherever the latest builds are saved)
  • Run the shell command to generate the sensitive-species-data.xml file
  • Perform some basic data-quality (DQ) checks on the XML file (file size, presence of expected entries such as Callocephalon fimbriatum, etc.); see the sketch after this list
  • scp the file to the sensitive-ws server (test or prod) with the date as part of the file name, e.g. sensitive-species-data.xml-20240725
  • Copy the latest file to sensitive-species-data.xml, overwriting the current version (which should retain a dated copy of itself, in case of regression)
  • Send a notification on successful completion (optional)
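
A minimal sketch of the DQ checks in Python. The file path, size threshold, and the list of indicator taxa are assumptions for illustration; the real values would come from the DM team:

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed values for illustration only.
MIN_SIZE_BYTES = 1_000_000
EXPECTED_TAXA = ["Callocephalon fimbriatum"]

def check_xml(path: str) -> None:
    """Basic DQ checks: plausible file size, well-formed XML, expected taxa present."""
    xml_file = Path(path)
    size = xml_file.stat().st_size
    if size < MIN_SIZE_BYTES:
        raise ValueError(f"{path} is suspiciously small ({size} bytes)")
    # Parsing confirms the file is well-formed XML.
    ET.parse(xml_file)
    content = xml_file.read_text(encoding="utf-8")
    for taxon in EXPECTED_TAXA:
        if taxon not in content:
            raise ValueError(f"Expected taxon '{taxon}' not found in {path}")
```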

Check with the data team whether the XML file should be copied into an sds directory on archives.ala.org.au as an additional step. This would allow all historical versions to be kept without clogging up the files on the sensitive-ws servers.


Steps needed & progress

  • Implement Airflow script to automate the creation of the XML file, saving it to S3 (see the sketch after this list)
  • Have the file served from the sensitive-data-service root, e.g. https://sensitive-ws-test.ala.org.au/sensitive-species-data.xml
  • Put a copy onto archives.ala.org.au
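
A rough shape for the Airflow script, assuming Airflow 2.x on MWAA; the DAG id, bucket name, and local file path are illustrative placeholders:

```python
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

BUCKET = "ala-sds-artifacts"  # assumed bucket name
LOCAL_XML = "/tmp/sensitive-species-data.xml"

def generate_xml(**context):
    # Would invoke the shaded jar; see the MWAA local-runner comment below.
    ...

def upload_to_s3(**context):
    s3 = boto3.client("s3")
    # Dated copy, per the requirements, plus the canonical object.
    dated_key = f"sensitive-species-data.xml-{datetime.now():%Y%m%d}"
    s3.upload_file(LOCAL_XML, BUCKET, dated_key)
    s3.upload_file(LOCAL_XML, BUCKET, "sensitive-species-data.xml")

with DAG(
    dag_id="sds_xml_build",
    start_date=datetime(2024, 10, 1),
    schedule=None,  # manually triggered by the DM team
    catchup=False,
) as dag:
    build = PythonOperator(task_id="generate_xml", python_callable=generate_xml)
    upload = PythonOperator(task_id="upload_to_s3", python_callable=upload_to_s3)
    build >> upload
```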

nickdos commented Oct 11, 2024

Managed Airflow in AWS (MWAA) uses a Docker image for the "local runner", where the Python code gets executed. This image can be run locally via this repo: https://github.com/aws/aws-mwaa-local-runner. The image has Java but not the "jar" command (JRE only), so the workaround is to build a shaded jar, which lets the createxml.sh shell script run a single "java" command.
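
For example, the generation step inside the Airflow task could reduce to one subprocess call; the shaded-jar filename below is an assumption, with the main class baked into the jar's manifest:

```python
import subprocess

# A single "java" invocation works in the MWAA local-runner image (JRE only,
# no "jar" tool). The jar name here is illustrative.
subprocess.run(["java", "-jar", "sds-shaded.jar"], check=True)
```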

nickdos added a commit that referenced this issue Oct 12, 2024
…ild-and-deployment-of-sensitive-species-dataxml

#64 automate the build and deployment of sensitive species dataxml
@nickdos nickdos reopened this Oct 14, 2024
@nickdos nickdos assigned nickdos and unassigned joe-lipson Nov 8, 2024

nickdos commented Nov 8, 2024

sensitive-data-service pulls static XML files down from sds.ala.org.au during the build phase (see docker-tasks.yml). This was going to be changed to pull from S3 instead, but that is not needed: the four static files from sds.ala.org.au will still be served, just from CloudFront and S3.
