Data generation for PDBDev reports #79
Comments
@sureshhewabi what's the plan? i'm happy to help/test
@colin-combe is the expert on the mzIdentML parser and I hope he can help you with this. @aozalevsky, could you please state your requirements here? I can also help @colin-combe with this.
i have a half-working way of doing this: creating an sqlite DB and then querying it using the queries from the API endpoints. So, as long as people are fine with this temporary sqlite file being created, this is fairly straightforward and i'll have something for you to look at and test next week.
sure, i'm happy to start testing asap. the sqlite implementation sounds ok to me. plus we can use an in-memory sqlite db to avoid dealing with additional files/os locks.
My requirements (from the previous issue):
the version in #84: it's currently trying to get all the residue pairs at once. hmm. i'll look into it a bit more. The entry point is
it seems it does eventually work
i'm testing it with PXD036833 (your main test dataset, right? @aozalevsky), which takes a long time to parse anyway.
great! i'll check it out and see if i can help/profile the query and/or the code. Now we also have PXD035508, PXD035519, and PXD035362 if that helps.
i added json encoding for it to #84. it takes a long time. I think sqlite doesn't like all those joins; it seems it was less of a problem with postgres.
OK, added another commit to #84. The time to get the summary of sequences and residue pairs shouldn't be much more than the time to parse ('convert') the file. The thing that's not working is the in-memory sqlite db. I'm trying to share the same in-memory sqlite db between parts of the code that make separate connections to it (using the connection string defined at https://github.com/Rappsilber-Laboratory/xi-mzidentml-converter/blob/pride/parser/process_dataset.py#L169). But i think they're not getting the same in-memory DB, so it isn't working. Maybe someone can make some suggestions or help with this.
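One possible explanation for the in-memory problem, sketched with the standard-library `sqlite3` module only: every plain `:memory:` connection gets its own private database, while a named URI with `mode=memory&cache=shared` lets separate connections in the same process see the same data. The database name `xi_shared_mem` and the `residue_pair` table are hypothetical, chosen for illustration; this is not the converter's actual connection string or schema, and how a wrapper library passes the URI through is not covered here.

```python
import sqlite3

# Two plain ":memory:" connections are two unrelated databases.
private_a = sqlite3.connect(":memory:")
private_b = sqlite3.connect(":memory:")
private_a.execute("CREATE TABLE residue_pair (pep1 TEXT, pep2 TEXT)")
# private_b sees no tables at all.
private_tables = private_b.execute("SELECT name FROM sqlite_master").fetchall()

# Hypothetical name; any name works as long as every connection uses the same one.
SHARED_URI = "file:xi_shared_mem?mode=memory&cache=shared"


def connect_shared():
    """Open a new connection to the shared in-memory database."""
    return sqlite3.connect(SHARED_URI, uri=True)


# Keep one connection open for the lifetime of the process:
# the shared in-memory database is discarded when its last connection closes.
keeper = connect_shared()

# "Parser" side: writes through its own connection.
writer = connect_shared()
writer.execute("CREATE TABLE residue_pair (pep1 TEXT, pep2 TEXT)")
writer.execute("INSERT INTO residue_pair VALUES ('A', 'B')")
writer.commit()

# "Query" side: a separate connection that sees the same data.
reader = connect_shared()
rows = reader.execute("SELECT pep1, pep2 FROM residue_pair").fetchall()
```

Two caveats under these assumptions: the sharing only works within a single process, and the data disappears as soon as the last connection to the named database is closed, which is why the sketch holds `keeper` open.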