Replace vendored scripts with augur curate
Vendored scripts for the curate rule have been ported to `augur curate`
and are available starting from Augur 25.0.0.¹

¹ <https://github.com/nextstrain/augur/releases/tag/25.0.0>
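
Because the `augur curate` subcommands used in this commit require Augur 25.0.0 or later, a workflow could check the installed version before running the curate rule. The sketch below is not part of this commit. The helper names (`parse_version`, `is_new_enough`, `installed_augur_version`) are hypothetical, and it assumes that `augur --version` prints a string containing a three-part version number such as `augur 25.0.0`.

```python
import re
import subprocess

# Minimum Augur release named in the commit message.
MIN_AUGUR = (25, 0, 0)


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_new_enough(version_output, minimum=MIN_AUGUR):
    """Return True if the reported version is at least `minimum`."""
    return parse_version(version_output) >= minimum


def installed_augur_version():
    """Return the raw output of `augur --version` (assumed output format)."""
    result = subprocess.run(
        ["augur", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

With Augur on the `PATH`, `is_new_enough(installed_augur_version())` would indicate whether the ported commands are expected to be available.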
j23414 committed Jul 14, 2024
1 parent 7520424 commit c06187a
Showing 2 changed files with 10 additions and 7 deletions.
2 changes: 2 additions & 0 deletions ingest/defaults/config.yaml
@@ -66,6 +66,8 @@ curate:
strain_backup_fields: ["accession"]
# List of date fields to standardize to ISO format YYYY-MM-DD
date_fields: ["date", "date_released", "date_updated"]
# The expected field that contains the GenBank geo_loc_name
genbank_location_field: location
# List of expected date formats that are present in the date fields provided above
# These date formats should use directives expected by datetime
# See https://docs.python.org/3.9/library/datetime.html#strftime-and-strptime-format-codes
15 changes: 8 additions & 7 deletions ingest/rules/curate.smk
@@ -72,6 +72,7 @@ rule curate:
strain_regex=config["curate"]["strain_regex"],
strain_backup_fields=config["curate"]["strain_backup_fields"],
date_fields=config["curate"]["date_fields"],
genbank_location_field=config["curate"]["genbank_location_field"],
expected_date_formats=config["curate"]["expected_date_formats"],
articles=config["curate"]["titlecase"]["articles"],
abbreviations=config["curate"]["titlecase"]["abbreviations"],
@@ -85,30 +86,30 @@ rule curate:
shell:
"""
(cat {input.sequences_ndjson} \
| ./vendored/transform-field-names \
| augur curate rename \
--field-map {params.field_map} \
| augur curate normalize-strings \
| ./vendored/transform-strain-names \
| augur curate transform-strain-name \
--strain-regex {params.strain_regex} \
--backup-fields {params.strain_backup_fields} \
| augur curate format-dates \
--date-fields {params.date_fields} \
--expected-date-formats {params.expected_date_formats} \
| ./vendored/transform-genbank-location \
| augur curate parse-genbank-location \
--location-field {params.genbank_location_field} \
| augur curate titlecase \
--titlecase-fields {params.titlecase_fields} \
--articles {params.articles} \
--abbreviations {params.abbreviations} \
| ./vendored/transform-authors \
| augur curate abbreviate-authors \
--authors-field {params.authors_field} \
--default-value {params.authors_default_value} \
--abbr-authors-field {params.abbr_authors_field} \
| ./vendored/apply-geolocation-rules \
| augur curate apply-geolocation-rules \
--geolocation-rules {input.all_geolocation_rules} \
| ./vendored/merge-user-metadata \
| augur curate apply-record-annotations \
--annotations {input.annotations} \
--id-field {params.annotations_id} \
| augur curate passthru \
--output-metadata {output.metadata} \
--output-fasta {output.sequences} \
--output-id-field {params.id_field} \
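
The new `--location-field` option tells `augur curate parse-genbank-location` which record field holds the GenBank `geo_loc_name` value. As a rough illustration only, and assuming that value follows the `country[: division][, location]` pattern, the split could look like the sketch below. The function is hypothetical and is not Augur's implementation; the real command may handle more cases and apply additional checks.

```python
def parse_genbank_location(record, location_field="location"):
    """Split a GenBank-style location string into three fields.

    Assumes the value looks like 'country[: division][, location]'.
    Returns a new dict; the input record is left unchanged.
    """
    parsed = dict(record)
    raw = record.get(location_field, "")

    # Everything before the first ':' is treated as the country.
    country, _, rest = raw.partition(":")
    # Within the remainder, the first ',' separates division from location.
    division, _, location = rest.partition(",")

    parsed["country"] = country.strip()
    parsed["division"] = division.strip()
    parsed["location"] = location.strip()
    return parsed
```

Under these assumptions, a record with `location` set to `USA: Iowa, Ames` would yield `country` `USA`, `division` `Iowa`, and `location` `Ames`, while a bare `Brazil` would leave `division` and `location` empty.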