Parquet allows readers/writers to treat a set of files under a directory as a single dataset, with the option to specify partitions based on data values that are then encoded into the directory structure, for example:
roads.parquet/highway=primary/slice=1/part0.parquet
For Python, the key docs are:
Some questions:
- What would be the benefits for open-gira intermediate or results datasets? At the least, it could break up particularly large files and avoid (or simplify) the concatenation steps as currently implemented.
- How would the file/directory structure interact with snakemake? Would we need additional flags or workarounds?
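To make the directory encoding concrete, here is a minimal standard-library sketch of how partition values map to a path like the one above, and back again. The helper names `partition_path` and `parse_partitions` are hypothetical and exist only for illustration. In practice a Parquet library would do this for us, for example pandas via `DataFrame.to_parquet(..., partition_cols=[...])`.

```python
from pathlib import PurePosixPath


def partition_path(root: str, partitions: dict, filename: str) -> str:
    """Build a key=value partitioned path: root/key=value/.../filename."""
    parts = [f"{key}={value}" for key, value in partitions.items()]
    return str(PurePosixPath(root, *parts, filename))


def parse_partitions(path: str) -> dict:
    """Recover the key=value pairs encoded in the directory names."""
    pairs = {}
    # Skip the final component, which is the data file itself.
    for component in PurePosixPath(path).parts[:-1]:
        if "=" in component:
            key, _, value = component.partition("=")
            pairs[key] = value  # values come back as strings
    return pairs


path = partition_path(
    "roads.parquet", {"highway": "primary", "slice": 1}, "part0.parquet"
)
print(path)
print(parse_partitions(path))
```

Note that partition values recovered from directory names are strings unless the reader infers or is told a type, which may matter for numeric keys such as `slice`. On the snakemake question, marking the dataset root as a `directory()` output is one option that might apply, though that is untested here.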