feat(matrices): allow other formats for internal matrices storage #2113
base: dev
b986251 to 0f0a6eb
Hi @MartinBelthle,

Given that the script successfully reduces the size of the matrices folder and needs to be run while the app is down, I was thinking it might be a good idea to integrate it into the application startup process, specifically during the FastAPI setup phase. This way, the migration would happen automatically each time the app starts. Since the script does nothing once all the TSV files have been converted to HDF5, it doesn't matter if it runs multiple times: if it finds no TSV files, it simply performs no actions.

Of course, we'd need to ensure there's enough space for the migration, especially in production, where the data size could be much larger. Testing on the integration and recette environments first, as you suggested, would be crucial.

What do you think about this approach? Here is a possible implementation using a FastAPI event:

    from fastapi import FastAPI

    app = FastAPI()

    def migrate_tsv_to_hdf5():
        # Placeholder for the actual migration logic
        print("Migrating TSV files to HDF5 format...")

    @app.on_event("startup")
    async def startup_event():
        # Runs once, before the app starts serving requests
        migrate_tsv_to_hdf5()
        print("Startup event completed.")
Indeed, I think that's a better way to do it.

I believe that with the new solution Laurent proposed, this PR is mature and can be reviewed.

I talked with Sylvain; we need to discuss this further.
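For reference, the two properties the startup migration relies on (it is a no-op once no TSV files remain, and it checks free disk space before converting) could be sketched roughly as below. This is only an illustration, not the script from this PR: `migrate_matrices`, the `convert` callback, the `.hdf` suffix and the flat `*.tsv` layout are all assumptions made for the example.

```python
import shutil
from pathlib import Path
from typing import Callable


def migrate_matrices(matrix_dir: Path, convert: Callable[[Path, Path], None]) -> int:
    """Convert every ``*.tsv`` file in ``matrix_dir``; return how many were converted.

    ``convert(src, dst)`` is a hypothetical callback that writes ``dst`` from ``src``.
    """
    tsv_files = sorted(matrix_dir.glob("*.tsv"))
    if not tsv_files:
        # Nothing left to migrate, so calling this on every startup is harmless.
        return 0

    # Conservative estimate: assume the converted files take as much room as the sources.
    required = sum(f.stat().st_size for f in tsv_files)
    free = shutil.disk_usage(matrix_dir).free
    if free < required:
        raise RuntimeError(
            f"Not enough disk space to migrate: {required} bytes needed, {free} free"
        )

    converted = 0
    for src in tsv_files:
        dst = src.with_suffix(".hdf")  # assumed target suffix
        convert(src, dst)
        src.unlink()  # remove the source only after its conversion succeeded
        converted += 1
    return converted
```

Calling it a second time on the same folder returns `0` without touching anything, which is what makes it safe to hook into a startup event.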
This PR does several things:
- Adds a `matrixstore_format` field to `application.yaml` that dictates the internal storage format. The default value is still `tsv`, to ensure backward compatibility.
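
To illustrate the backward-compatible default, here is a minimal sketch of how such a setting might be read once the YAML has been loaded into a mapping. The names `MatrixFormat` and `read_matrix_format`, and the `"hdf"` value, are assumptions for the example; only `matrixstore_format` and the `tsv` default come from the description above.

```python
from enum import Enum
from typing import Any, Mapping


class MatrixFormat(str, Enum):
    TSV = "tsv"
    HDF = "hdf"  # assumed spelling of the HDF5 option


def read_matrix_format(config: Mapping[str, Any]) -> MatrixFormat:
    """Return the configured format, falling back to TSV when the key is absent."""
    raw = config.get("matrixstore_format", MatrixFormat.TSV.value)
    try:
        return MatrixFormat(str(raw).lower())
    except ValueError:
        allowed = ", ".join(f.value for f in MatrixFormat)
        raise ValueError(
            f"Unsupported matrixstore_format {raw!r}; expected one of: {allowed}"
        ) from None
```

With this shape, an existing configuration file that does not mention the key keeps behaving as before, and a typo in the value fails loudly at startup instead of silently picking a format.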