Representing Spectra Collection #222

Open
YasinEl opened this issue Sep 10, 2024 · 3 comments

YasinEl commented Sep 10, 2024

Hello, and thank you for maintaining this tool!

We propose adding the following parameters to the MTD (metadata) section so that mgf files, or other file types such as msp, containing consensus scans for the dataset can be referenced (e.g., the result of spectral clustering with a tool like mscluster, or consensus MS2 scans from feature extraction software). To make this possible, I suggest the following parameters:

“spectral_representation[x]-location”: file name or file path of the file
“spectral_representation[x]-key-mz”: key used for the precursor m/z in the file
“spectral_representation[x]-key-rt”: key used for the precursor retention time in the file
“spectral_representation[x]-key-rt-unit”: unit of the retention time in the file (minutes or seconds)
“spectral_representation[x]-key-mslevel”: either a number (most commonly 2) or the key that gives the MS level (mgf files can sometimes include MS1 and MS2, or MS3, MS4, etc.)
“spectral_representation[x]-key-featureID”: key in the file whose value can be matched to feature IDs in the SMF table
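
To illustrate how a reader could pick these entries up, here is a minimal Python sketch. It assumes the proposed keys would be written like any other mzTab-M metadata line (tab-separated, starting with `MTD`). The `spectral_representation[x]-...` keys are only the proposal from this issue, not part of the specification, and the file URI and the function name `collect_spectral_representations` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical MTD lines using the keys proposed in this issue
# (not part of the mzTab-M specification; the file URI is made up).
MTD_LINES = [
    "MTD\tspectral_representation[1]-location\tfile:///path/to/consensus.mgf",
    "MTD\tspectral_representation[1]-key-mz\tPEPMASS",
    "MTD\tspectral_representation[1]-key-rt\tRTINSECONDS",
    "MTD\tspectral_representation[1]-key-rt-unit\tseconds",
    "MTD\tspectral_representation[1]-key-mslevel\t2",
    "MTD\tspectral_representation[1]-key-featureID\tFEATURE_ID",
]

# Captures the index x and the parameter name after the first hyphen.
KEY_PATTERN = re.compile(r"^spectral_representation\[(\d+)\]-(.+)$")


def collect_spectral_representations(lines):
    """Group the proposed spectral_representation[x] entries by index x."""
    grouped = defaultdict(dict)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "MTD":
            continue  # not a metadata line
        match = KEY_PATTERN.match(fields[1])
        if match:
            index, parameter = int(match.group(1)), match.group(2)
            grouped[index][parameter] = fields[2]
    return dict(grouped)


representations = collect_spectral_representations(MTD_LINES)
```

Grouping by index keeps several referenced spectrum files apart, in the same way that indexed entries such as `ms_run[x]` are kept apart.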

Member

nilshoffmann commented Nov 14, 2024

Thanks for your input on this. I think it makes sense to locate this information in the metadata section. For location, I would propose using URIs, as we have done for other file references in mzTab-M (e.g. the ms_run location). Could you add an example here of what key-mz and key-rt would look like? Please note that key-rt-unit may not be necessary unless it is needed for external references, since we decided to always represent retention time in seconds within mzTab-M. What would key-featureID look like? Would this be a bar-separated list of feature IDs?
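
As a rough illustration of the two points above (URIs for the location, seconds for retention time), the following Python sketch shows how a tool might normalise values taken from an external file. The helper names `to_file_uri` and `rt_to_seconds` are hypothetical, and the accepted unit strings are assumed to be exactly the two mentioned in the proposal.

```python
from urllib.parse import quote

# Units named in the proposal; the factors convert to seconds.
SECONDS_PER_UNIT = {"seconds": 1.0, "minutes": 60.0}


def to_file_uri(absolute_posix_path):
    """Build a file URI from an absolute POSIX-style path."""
    if not absolute_posix_path.startswith("/"):
        raise ValueError("a file URI needs an absolute path")
    # quote() percent-encodes spaces and similar characters but keeps "/".
    return "file://" + quote(absolute_posix_path)


def rt_to_seconds(value, unit):
    """Convert a retention time from the external file's unit to seconds."""
    try:
        return float(value) * SECONDS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unsupported retention time unit: {unit!r}") from None


uri = to_file_uri("/data/mzmine Output.mgf")
rt = rt_to_seconds(0.5, "minutes")
```

Under these assumptions, a unit entry would only be consulted when reading the external file; anything stored inside mzTab-M itself would already be in seconds.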

Author

YasinEl commented Nov 18, 2024

Thank you for implementing!

I agree regarding the metadata section and using a URI for the location.

Here is an example of what this would look like for the mgf entry shown below. key-rt-unit is needed because it refers to mgf/msp or other formats outside mzTab-M. key-featureID is the key used in the mgf/msp file that points to the feature in the SMF table with which the scan is associated.


spectral_representation[x]-location: "path/to/mzmineOutput.mgf"
spectral_representation[x]-key-mz: "PEPMASS"
spectral_representation[x]-key-rt: "RTINSECONDS"
spectral_representation[x]-key-rt-unit: "seconds"
spectral_representation[x]-key-mslevel: "MSLEVEL"
spectral_representation[x]-key-featureID: "FEATURE_ID"

One entry from path/to/mzmineOutput.mgf:

BEGIN IONS
FEATURE_ID=2
MSLEVEL=2
RTINSECONDS=18.91
PEPMASS=110.00862
CHARGE=1+
MERGED_SCANS=1451,1698,1944,1572,1326,1818,2064,2310,2556
MERGED_STATS=9 / 10 (0 removed due to low quality, 1 removed due to low cosine).
FILENAME=015_Sa02_Water_POS.mzML;015_Sa02_Water_POS.mzML;015_Sa02_Water_POS.mzML;021_Sa07_Water_POS.mzML;021_Sa07_Water_POS.mzML;021_Sa07_Water_POS.mzML;021_Sa07_Water_POS.mzML;021_Sa07_Water_POS.mzML;021_Sa07_Water_POS.mzML
SCANS=2
Num peaks=116
55.018024 0.862
55.054058 0.465
56.964603 0.439
56.99868 0.527
57.489437 0.8
END IONS
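
To show how the configured keys would be applied to an entry like this one, here is a minimal Python sketch. It uses a shortened copy of the entry above (header lines only), and the function names `read_mgf_headers` and `describe_spectrum` as well as the output field names are hypothetical. It assumes key-mslevel holds either a literal number or a header key, as described in the proposal.

```python
# Shortened copy of the entry from this comment (peak list omitted).
MGF_ENTRY = """BEGIN IONS
FEATURE_ID=2
MSLEVEL=2
RTINSECONDS=18.91
PEPMASS=110.00862
CHARGE=1+
END IONS
"""

# Values of the proposed spectral_representation[x] parameters.
CONFIG = {
    "key-mz": "PEPMASS",
    "key-rt": "RTINSECONDS",
    "key-rt-unit": "seconds",
    "key-mslevel": "MSLEVEL",
    "key-featureID": "FEATURE_ID",
}


def read_mgf_headers(text):
    """Yield one dict of KEY=VALUE header fields per BEGIN IONS block."""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {}
        elif line == "END IONS":
            if current is not None:
                yield current
            current = None
        elif current is not None and "=" in line:
            # Peak lines contain no "=" and are skipped here.
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()


def describe_spectrum(headers, config):
    """Resolve the configured keys against one spectrum's header fields."""
    level = config["key-mslevel"]
    # Either a literal level such as "2" or the name of a header key.
    ms_level = int(level) if level.isdigit() else int(headers[level])
    rt = float(headers[config["key-rt"]])
    if config["key-rt-unit"] == "minutes":
        rt *= 60.0  # convert to seconds
    # PEPMASS may carry an optional intensity after the m/z value.
    mz = float(headers[config["key-mz"]].split()[0])
    return {
        "feature_id": headers[config["key-featureID"]],  # kept as text
        "ms_level": ms_level,
        "precursor_mz": mz,
        "rt_seconds": rt,
    }


spectra = [describe_spectrum(h, CONFIG) for h in read_mgf_headers(MGF_ENTRY)]
```

The feature ID is left as text in this sketch; matching it against the SMF table would require comparing it with the table's IDs in the same type.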

@nilshoffmann
Member

  • Check PSM mechanism in mzTab 1 for proteomics for reference
