Add option to write parquet output #117

Closed
shntnu opened this issue Jan 5, 2021 · 4 comments
Labels: enhancement, good first issue

Comments

@shntnu
Member

shntnu commented Jan 5, 2021

Currently, cyto_utils.output can write out csv, csv.gz, and tsv.

We'd like it to also be able to write out parquet.

The immediate use case we have in mind is converting SQLite backends to Parquet format, which makes it a lot easier to work with large single-cell datasets.

This will be somewhat redundant with cytominer-database, which can write Parquet directly, but it will serve our very frequent need to work with existing large SQLite files.

@shntnu
Member Author

shntnu commented Jan 5, 2021

I wonder if it is as simple as this (needs to be tested)

pyarrow.parquet.write_table(pyarrow.Table.from_pandas(df), output_filename)

@shntnu
Member Author

shntnu commented Jan 5, 2021

I've tested it and that (pyarrow.parquet.write_table(pyarrow.Table.from_pandas(df), output_filename)) is probably all we need to do here

https://colab.research.google.com/drive/1D0J7FiqmzR49IjSKy2lQVtZ8Gez4cDkD?authuser=1#scrollTo=W3FJmSQXgBOd

gwaybio added the enhancement and good first issue labels Jan 5, 2021
@gwaybio
Member

gwaybio commented Jan 7, 2021

related to #112

@gwaybio
Member

gwaybio commented Jun 6, 2023

addressed in 2327ebc

gwaybio closed this as completed Jun 6, 2023