Datasets from SIM-XL, Mascot and Proteome Discoverer in PRIDE #63
Datasets in PRIDE with the word "crosslink" or "cross-link" in the title that contain mzIdentML files, ordered by priority:
Need to check version 1.2, the corresponding peak list and producer |
as noted in meeting, they might not be complete submissions |
wasn't there a "crosslink" tag people were referring to? (I don't know but people spoke of this) |
We will continue with different combinations. We will let you know when errors start to happen. |
OK, great, thanks! |
@sureshhewabi reported the following error in this one: PXD014359 - Error parsing C_Lee_141014_CRM_dialysis_NCE20_2.mzid |
I guess the error message is correct and it is not valid XML |
Another similar error for OpenMS Error parsing XLpeplib_Beveridge_QEx-HFX_DSS_R1.mzid |
Schema seems valid in There should be an issue with the parser. @colin-combe any idea? |
yes, could be an issue with the parser. Or perhaps something to do with character encoding. I looked into it and was confused. @sureshhewabi - I'm not sure what your |
It is a command to check the file's validity against the schema definition file (XSD) |
one problem is the empty location attribute for spectra data: XLpeplib_Beveridge_QEx-HFX_DSS_R3.mzid, line 527672: It is a required attribute, but empty string is enough to make the file schema valid (http://www.datypic.com/sc/xsd/t-xsd_anyURI.html). But this isn't the only problem, there's something else that's still mysterious... |
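A minimal sketch of how files with this problem could be spotted before parsing, since schema validation alone lets an empty `location` through. It uses only the Python standard library; the function name `find_empty_locations` is hypothetical and not part of the parser discussed here, and it assumes the standard mzIdentML layout in which each `SpectraData` element carries `id` and `location` attributes.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_empty_locations(source):
    """Return the ids of SpectraData elements whose location is missing or blank.

    `source` is a path or a binary file-like object containing mzIdentML.
    Hypothetical helper for illustration only.
    """
    empty = []
    # iterparse streams the document, which helps with very large .mzid files.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == "SpectraData":
            location = elem.get("location")
            if location is None or not location.strip():
                empty.append(elem.get("id"))
        # Free the element's children and attributes once it has been inspected.
        elem.clear()
    return empty
```

Calling `find_empty_locations("XLpeplib_Beveridge_QEx-HFX_DSS_R3.mzid")` on a local copy would list the `SpectraData` ids whose location needs fixing by hand. It does not explain the other, still mysterious, parse error.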
This means we cannot use this dataset anyway, doesn't it? Because we cannot find the peak list file. |
we could manually fix the location. |
I think this fixes a problem - #64. Sorry about that. |
PXD021417 Dataset Issues:
|
PXD026603 Dataset Issues:
parser.process_dataset - INFO - parsing AnalysisProtocolCollection- start
|
thanks, will check it |
Similar to before - the parser was treating things that are optional as if they were required. Re PXD026603 - the peak lists are missing? |
Yes, the peak list file is missing too: |
would these actually have complete submission status? I thought complete submission status wasn't previously being given to crosslinking data? |
this one shouldn't be in DB because the sequences are missing |
re. PXD021417 - maybe let's leave this in for testing purposes |
We have to find out the list of datasets with the following conditions:
Please let's update the list in this issue.