HDExaminer data support #349
Comments
Hello,
I have created a pull request that includes the One main reason for keeping replicates is that we use that type of data for other HDX-MS statistical analysis packages, so it is nice to be able to work with the same original data file for different applications. Leaving the data as replicates does tend to inflate the coverage plots, but I appreciate being able to see any replicate-to-replicate variability there (obviously there are other ways to do this as well). My preference is to keep the replicates within a single Is the Currently all of the HDExaminer outputs I am working with are for unpublished projects, but I'll see if I can track down something I am able to share.
With respect to The format doesn't have to be the same for every file, so there can be DynamX-formatted peptide output files or HDExaminer-formatted output files, as long as the metadata specifies which format it is. A reader function can then take that metadata and read the tables according to the format used. Ideally, users should also agree on which fields the returned dataframes contain: e.g. is it 'time', 'exposure' or 'exposure_time' (and in which units); 'd-uptake' or 'uptake'; should there be an m0 field, etc.
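The reader idea in the comment above could look roughly like the following minimal sketch. It is not the PyHDX or hdxms-datasets API: the function name `read_peptide_table`, the `format` metadata key, the target field names, and the vendor column names in `COLUMN_MAPS` are all assumptions made for illustration and would need to be checked against real DynamX and HDExaminer exports. It uses plain rows (lists of dicts) from the standard library rather than dataframes to stay self-contained.

```python
import csv
import io

# Hypothetical mapping from vendor column names to one agreed-upon schema.
# The vendor column names are placeholders, not a verified export layout.
COLUMN_MAPS = {
    "DynamX": {
        "State": "state",
        "Exposure": "exposure",
        "Start": "start",
        "End": "end",
        "Sequence": "sequence",
        "Uptake": "uptake",
    },
    "HDExaminer": {
        "Protein State": "state",
        "Deut Time (sec)": "exposure",
        "Start": "start",
        "End": "end",
        "Sequence": "sequence",
        "#D": "uptake",
    },
}

# Converters for the numeric fields of the shared schema; others stay strings.
CONVERTERS = {"start": int, "end": int, "exposure": float, "uptake": float}


def read_peptide_table(handle, metadata):
    """Read a CSV peptide table and rename its columns to the shared schema.

    `metadata["format"]` selects which column mapping to apply.
    """
    fmt = metadata["format"]
    if fmt not in COLUMN_MAPS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    column_map = COLUMN_MAPS[fmt]

    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for old_name, new_name in column_map.items():
            convert = CONVERTERS.get(new_name, str)
            row[new_name] = convert(raw[old_name].strip())
        rows.append(row)
    return rows


# Usage with an in-memory table standing in for a real export file.
example = io.StringIO(
    "Protein State,Deut Time (sec),Start,End,Sequence,#D\n"
    "apo,30,1,10,MKTAYIAKQR,2.5\n"
)
rows = read_peptide_table(example, {"format": "HDExaminer"})
print(rows[0]["exposure"], rows[0]["uptake"])
```

Note that the sketch only renames and type-converts fields. It does not convert exposure units, so the unit question raised above would still need to be settled, for instance by recording the unit in the metadata.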
PyHDX currently only directly accepts data formatted as 'state data' output from DynamX.
This issue is a continuation of the discussion opened by @tuttlelm at #348:
It would be great to add support for other file formats such as HDExaminer data.
A couple of questions:
Why would you prefer to leave the replicates in the data and not average them before entering the `HDXMeasurement` object? Do you want to perform downstream calculations on each replicate individually?
In the latter case, would it make sense to make one `HDXMeasurement` object per replicate?
Perhaps you could share your input script or make a pull request with your changes to `models.py`?
To be honest, I think that the current `HDXMeasurement` object has become a bit clumsy to work with. I'm planning to change it in the future (probably in the form of a different project altogether).
There is also the hdxms-datasets package, which is still in a beta phase. Maybe you can also share your thoughts on this. The idea there is that there is a datasets format with a `.yaml` specification example containing all required metadata, such that downstream packages like PyHDX can load data from there directly. Ultimately, it would be nice to add support there for 1) cluster data (replicates), 2) HDExaminer output, and 3) other formats. Again, only DynamX state data is currently supported there, simply because that's the only example data I have at the moment.
Do you have any example datasets of HDExaminer data you can share and/or example scripts of how you load the data?