Releases · googlegenomics/gcp-variant-transforms

18 Oct 20:32

arostamianfar

v0.5.0

ccb26ee

Release v0.5.0

Main changes since last release:

BigQuery to VCF (alpha release): Generate VCF files from any BigQuery table generated using Variant Transforms. This can be useful when working with tools that only operate on VCF files. The available options support anything from small files (e.g. by specifying genomic regions of interest or a subset of samples) to very large files (e.g. exporting an entire BigQuery table as a VCF file).
Annotation type inference: By default, all annotation fields are loaded as strings. Simply add --infer_annotation_types and the pipeline will automatically infer field types from the annotation content.
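
To illustrate the idea behind annotation type inference, here is a minimal sketch that picks the narrowest type fitting every observed value of a field. It is not the pipeline's actual implementation: the function name infer_field_type and the type labels "Integer", "Float" and "String" are assumptions made for this example.

```python
def infer_field_type(values):
    """Return the narrowest type label that fits every non-empty value.

    Hypothetical helper for illustration only; not part of Variant Transforms.
    """
    def fits(cast, value):
        # A value "fits" a type if the built-in constructor accepts it.
        try:
            cast(value)
            return True
        except ValueError:
            return False

    non_empty = [v for v in values if v != ""]
    if not non_empty:
        # Nothing to infer from, so fall back to the default string type.
        return "String"
    if all(fits(int, v) for v in non_empty):
        return "Integer"
    if all(fits(float, v) for v in non_empty):
        return "Float"
    return "String"


if __name__ == "__main__":
    print(infer_field_type(["12", "7", ""]))
    print(infer_field_type(["0.01", "3"]))
    print(infer_field_type(["0.01", "benign"]))
```

A single non-numeric value is enough to keep the field as a string, which mirrors the safe default described above.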

30 Jul 22:45

arostamianfar

v0.4.2

9c6b229

Release v0.4.2

Patch release consisting mostly of usability improvements: the pipeline now fails on unrecognized or incompatible flags and logs additional information when failures occur. It also includes a small fix for --optimize_for_large_inputs when partitioning is not requested.

16 Jul 18:30

bashir2

v0.4.1

731031b

Release v0.4.1

This is a patch release that makes the following improvements:

The validator/preprocessor tool now catches more mismatch issues between VCF headers and variant records, e.g., type and Number mismatches (Issue #258).
Support for running VEP on GRCh37-based VCF files is added (Issue #201).
A fix in the HTTP request retry logic of google-api-python-client is integrated (details).
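
As a rough illustration of the kind of header/record mismatch the validator looks for, the sketch below checks INFO values against the Number and Type declared in the header. It is a simplified stand-in, not the tool's code: the dictionary layout and the names expected_count and find_mismatches are assumptions. The meanings of Number=A (one value per alternate allele) and Number=R (one value per allele, including the reference) follow the VCF specification.

```python
def expected_count(number, num_alts):
    """Translate a header Number value into a concrete count, or None if unchecked."""
    if number == "A":
        return num_alts          # one value per alternate allele
    if number == "R":
        return num_alts + 1      # one value per allele, reference included
    if number in (".", "G"):
        return None              # unknown or per-genotype: not checked in this sketch
    return int(number)


def find_mismatches(header, info, num_alts):
    """Return a list of human-readable problems for one record's INFO fields.

    header: {field: {"Number": str, "Type": str}}  (hypothetical layout)
    info:   {field: [raw string values]}           (hypothetical layout)
    """
    casts = {"Integer": int, "Float": float}
    problems = []
    for key, values in info.items():
        definition = header.get(key)
        if definition is None:
            problems.append(f"{key}: not defined in header")
            continue
        want = expected_count(definition["Number"], num_alts)
        if want is not None and len(values) != want:
            problems.append(f"{key}: expected {want} value(s), found {len(values)}")
        cast = casts.get(definition["Type"])
        if cast is None:
            continue  # String, Character and Flag need no numeric check
        for value in values:
            if value == ".":
                continue  # "." marks a missing value in VCF
            try:
                cast(value)
            except ValueError:
                problems.append(f"{key}: {value!r} is not a valid {definition['Type']}")
                break
    return problems


if __name__ == "__main__":
    header = {
        "AF": {"Number": "A", "Type": "Float"},
        "DP": {"Number": "1", "Type": "Integer"},
    }
    record_info = {"AF": ["0.1", "0.2"], "DP": ["ten"]}
    for problem in find_mismatches(header, record_info, num_alts=1):
        print(problem)
```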

11 Jun 19:49

arostamianfar

v0.4.0

cf34b99

Release v0.4.0

Main changes since last release:

Native annotation support: Annotate and import VCF files to BigQuery through a single command that uses our newly published VEP v91 docker image and GRCh38 cache! Check out the documentation for more details.
Native partitioning support: Partition the BigQuery output into any number of (configurable) smaller tables based on chromosome and/or regions. This feature can be used to reduce query cost for large tables, especially in applications where particular regions are queried more heavily than others.
Automatic BigQuery schema update on appends: Append data to existing tables even if the BigQuery schema is different (but still compatible); the schema is updated automatically in such cases.
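
The partitioning idea can be sketched as a lookup from a variant's position to an output table. The example below is only an illustration under assumed names: the PARTITIONS layout, the table suffixes, the "residual" fallback and the function partition_for are all hypothetical and do not reflect the pipeline's real configuration format.

```python
# Hypothetical partition config: table suffix -> list of (chromosome, start, end).
# Regions are half-open [start, end); end=None means "to the end of the chromosome".
PARTITIONS = {
    "chr1_head": [("chr1", 0, 100_000_000)],
    "chr1_tail": [("chr1", 100_000_000, None)],
    "chrX": [("chrX", 0, None)],
}


def partition_for(chrom, pos, partitions=PARTITIONS, default="residual"):
    """Return the table suffix whose region contains (chrom, pos)."""
    for name, regions in partitions.items():
        for region_chrom, start, end in regions:
            if chrom == region_chrom and pos >= start and (end is None or pos < end):
                return name
    # Variants outside every configured region go to a catch-all table.
    return default


if __name__ == "__main__":
    for chrom, pos in [("chr1", 5), ("chr1", 100_000_000), ("chr2", 1)]:
        print(chrom, pos, partition_for(chrom, pos))
```

With a layout like this, a query restricted to a heavily used region only has to scan the smaller table that holds it, which is where the cost saving comes from.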

09 May 21:05

arostamianfar

v0.3.0

5512e44

Release v0.3.0

Main changes since last release:

VCF validator/preprocessor: a lightweight tool that can be used to validate VCF files and check for inconsistencies in the data before running the full VCF to BigQuery pipeline. Check out the documentation for more details.
Robustness improvements: several new features enhance the robustness of Variant Transforms when dealing with malformed or incomplete data, such as setting custom headers when parsing and more accurate header inference when headers are missing.
Performance improvements: optimizations for merging variants and writing to BigQuery when loading very large inputs (>5TB, >30B variants).
Annotation enhancements (experimental): added support for running VEP natively as part of the pipeline using a pre-built Docker image and cache files. Check out the documentation for more details.

04 Apr 20:19

arostamianfar

v0.2.0

edbebf6

Release v0.2.0

First release checkpoint of Variant Transforms! Main features of this release:

Highly scalable import of VCF files to BigQuery (500K+ files, TBs of data)
Robust import functions (see --infer_undefined_headers and --allow_incompatible_records)
Annotation support (experimental). Add --annotation_fields when running the pipeline.
See README and documents under the /docs folder for more details.

P.S. We're starting from 0.2.0 rather than 0.1.0 because the initial version at launch should have been 0.1.0.
