fixed problem in tests where pipeline got stuck
mvisani committed Feb 6, 2024
1 parent 22e0284 commit d838de6
Showing 7 changed files with 149 additions and 65 deletions.
177 changes: 128 additions & 49 deletions bindings/sirius/README.md
@@ -1,55 +1,134 @@
# Sirius
SIRIUS is Java software for analyzing metabolites from tandem mass spectrometry data.
It combines the analysis of isotope patterns in MS spectra with the analysis of fragmentation patterns in MS/MS spectra,
and uses CSI:FingerID as a web service to search in molecular structure databases.
Further, it integrates CANOPUS for de novo compound class prediction.

SIRIUS requires **high mass accuracy data**. The mass deviation of your MS and MS/MS spectra should be within 20 ppm. Mass spectrometry instruments such as TOF, Orbitrap and FT-ICR usually provide high mass accuracy data, as do coupled instruments like Q-TOF, IT-TOF or IT-Orbitrap. Spectra measured with a quadrupole or linear trap do not provide the high mass accuracy that is required for our method.

SIRIUS expects MS and MS/MS spectra as input. It is possible to omit the MS data, but doing so makes the analysis more time-consuming and might give you worse results. In this case, you should consider limiting the candidate molecular formulas to those found in PubChem.
# Sirius binding
## Usage
Add this to your `Cargo.toml`:
```toml
[dependencies]
sirius = "0.1"
```
and this to your crate root:
```rust
use sirius::prelude::*;
```

## Examples
If you have an MGF file, you can run Sirius as follows:
```rust
use sirius::prelude::*;
use std::path::Path;
let sirius = SiriusBuilder::<Version5>::default()
    .maximal_mz_default().unwrap()
    .enable_formula().unwrap()
    .enable_zodiac().unwrap()
    .enable_fingerprint().unwrap()
    .enable_structure().unwrap()
    .enable_canopus().unwrap()
    .enable_write_summaries().unwrap()
    .build();
let input_file_path = Path::new("tests/data/input_sirius.mgf");
let output_file_path = Path::new("tests/data/output_sirius_default");
// Check if the path exists before attempting to remove it
if output_file_path.exists() {
    let _ = std::fs::remove_dir_all(output_file_path);
}
sirius.run(input_file_path, output_file_path).unwrap();
```

Or with more options/parameters (the example below uses the parameters used for the ENPKG pipeline):
```rust
use sirius::prelude::*;
use std::path::Path;
let sirius = SiriusBuilder::default()
    .maximal_mz(800.0).unwrap()
    .isotope_settings_filter(true).unwrap()
    .formula_search_db(SearchDB::Bio).unwrap()
    .timeout_seconds_per_tree(0).unwrap()
    .formula_settings_enforced(AtomVector::from(vec![
        Atoms::H,
        Atoms::C,
        Atoms::N,
        Atoms::O,
        Atoms::P,
    ])).unwrap()
    .timeout_seconds_per_instance(0).unwrap()
    .adduct_settings_detectable(AdductsVector::from(vec![
        Adducts::MplusHplus,
        Adducts::MplusHminusTwoH2Oplus,
        Adducts::MplusNaplus,
        Adducts::MplusKplus,
        Adducts::MplusH3NplusHplus,
        Adducts::MplusHminusH2Oplus,
    ])).unwrap()
    .use_heuristic_mz_to_use_heuristic_only(650).unwrap()
    .algorithm_profile(Instruments::Orbitrap).unwrap()
    .isotope_ms2_settings(IsotopeMS2Settings::Ignore).unwrap()
    .ms2_mass_deviation_allowed_mass_deviation(MassDeviation::Ppm(5.0)).unwrap()
    .number_of_candidates_per_ion(1).unwrap()
    .use_heuristic_mz_to_use_heuristic(300).unwrap()
    .formula_settings_detectable(AtomVector::from(vec![
        Atoms::B,
        Atoms::Cl,
        Atoms::Se,
        Atoms::S,
    ])).unwrap()
    .number_of_candidates(10).unwrap()
    .zodiac_number_of_considered_candidates_at_300_mz(10).unwrap()
    .zodiac_run_in_two_steps(true).unwrap()
    .zodiac_edge_filter_thresholds_min_local_connections(10).unwrap()
    .zodiac_edge_filter_thresholds_threshold_filter(0.95).unwrap()
    .zodiac_epochs_burn_in_period(2000).unwrap()
    .zodiac_epochs_number_of_markov_chains(10).unwrap()
    .zodiac_number_of_considered_candidates_at_800_mz(50).unwrap()
    .zodiac_epochs_iterations(20000).unwrap()
    .adduct_settings_enforced_default().unwrap()
    .adduct_settings_fallback(AdductsVector::from(vec![
        Adducts::MplusHplus,
        Adducts::MplusNaplus,
        Adducts::MplusKplus,
    ])).unwrap()
    .formula_result_threshold(true).unwrap()
    .inject_el_gordo_compounds(true).unwrap()
    .structure_search_db(SearchDB::Bio).unwrap()
    .recompute_results(false).unwrap()
    .enable_formula().unwrap()
    .enable_zodiac().unwrap()
    .enable_fingerprint().unwrap()
    .enable_structure().unwrap()
    .enable_canopus().unwrap()
    .build();

let input_file_path = Path::new("tests/data/input_sirius.mgf");
let output_file_path = Path::new("tests/data/output_sirius");
// Check if the path exists before attempting to remove it
if output_file_path.exists() {
    let _ = std::fs::remove_dir_all(output_file_path);
}
sirius.run(input_file_path, output_file_path).unwrap();
```

You can replace the `input_file_path` and `output_file_path` with your own paths.

<!--begin cite-->
# Citing Sirius

Kai Dührkop, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker,
[SIRIUS 4: Turning tandem mass spectra into metabolite structure information.](https://doi.org/10.1038/s41592-019-0344-8)
*Nature Methods* 16, 299–302, 2019.
<!--end cite-->

## Sirius config

Usage: sirius config [-hV] [--AdductSettings.detectable=[M+H]+,[M+K]+,[M+Na]+,
[M+H-H2O]+,[M+H-H4O2]+,[M+NH4]+,[M-H]-,[M+Cl]-,[M-H2O-H]-,
[M+Br]-] [--AdductSettings.enforced=,] [--AdductSettings.
fallback=[M+H]+,[M-H]-,[M+Na]+,[M+K]+]
[--AlgorithmProfile=default] [--CandidateFormulas=,]
[--CompoundQuality=UNKNOWN]
[--ForbidRecalibration=ALLOWED]
[--FormulaResultRankingScore=AUTO]
[--FormulaResultThreshold=true] [--FormulaSearchDB=none]
[--FormulaSettings.detectable=S,Br,Cl,B,Se]
[--FormulaSettings.enforced=C,H,N,O,P] [--FormulaSettings.
fallback=S] [--InjectElGordoCompounds=True]
[--IsotopeMs2Settings=IGNORE] [--IsotopeSettings.
filter=True] [--IsotopeSettings.multiplier=1]
[--MedianNoiseIntensity=0.015] [--MotifDbFile=none] [--ms1.
absoluteIntensityError=0.02] [--ms1.
minimalIntensityToConsider=0.01] [--ms1.
relativeIntensityError=0.08] [--MS1MassDeviation.
allowedMassDeviation=10.0 ppm] [--MS1MassDeviation.
massDifferenceDeviation=5.0 ppm] [--MS1MassDeviation.
standardMassDeviation=10.0 ppm] [--MS2MassDeviation.
allowedMassDeviation=10.0 ppm] [--MS2MassDeviation.
standardMassDeviation=10.0 ppm] [--NoiseThresholdSettings.
absoluteThreshold=0] [--NoiseThresholdSettings.
basePeak=NOT_PRECURSOR] [--NoiseThresholdSettings.
intensityThreshold=0.005] [--NoiseThresholdSettings.
maximalNumberOfPeaks=60] [--NumberOfCandidates=10]
[--NumberOfCandidatesPerIon=1]
[--NumberOfStructureCandidates=10000]
[--PossibleAdductSwitches=[M+Na]+:[M+H]+,[M+K]+:[M+H]+,
[M+Cl]-:[M-H]-] [--PrintCitations=True]
[--RecomputeResults=False]
[--StructurePredictors=CSI_FINGERID]
[--StructureSearchDB=BIO] [--Timeout.secondsPerInstance=0]
[--Timeout.secondsPerTree=0] [--UseHeuristic.
mzToUseHeuristic=300] [--UseHeuristic.
mzToUseHeuristicOnly=650] [--ZodiacClusterCompounds=false]
[--ZodiacEdgeFilterThresholds.minLocalCandidates=1]
[--ZodiacEdgeFilterThresholds.minLocalConnections=10]
[--ZodiacEdgeFilterThresholds.thresholdFilter=0.95]
[--ZodiacEpochs.burnInPeriod=2000] [--ZodiacEpochs.
iterations=20000] [--ZodiacEpochs.numberOfMarkovChains=10]
[--ZodiacLibraryScoring.lambda=1000]
[--ZodiacLibraryScoring.minCosine=0.5]
[--ZodiacNumberOfConsideredCandidatesAt300Mz=10]
[--ZodiacNumberOfConsideredCandidatesAt800Mz=50]
[--ZodiacRatioOfConsideredCandidatesPerIonization=0.2]
[--ZodiacRunInTwoSteps=true] [COMMAND]
```bash
Usage: sirius config [-hV]
[COMMAND]
<CONFIGURATION> Override all possible default configurations of this toolbox
from the command line.
--AdductSettings.detectable=[M+H]+,[M+K]+,[M+Na]+,[M+H-H2O]+,[M+H-H4O2]+,
@@ -340,7 +419,7 @@ from the command line.
As default ZODIAC runs a 2-step approach. First
running 'good quality compounds' only, and
afterwards including the remaining.

```
# Sirius options
2 changes: 1 addition & 1 deletion bindings/sirius/src/builder.rs
@@ -1053,7 +1053,7 @@ impl<V: Version> SiriusBuilder<V> {
/// # Example
/// ```
/// use sirius::prelude::*;
/// let sirius = SiriusBuilder::default().build();
/// let sirius = SiriusBuilder::<Version5>::default().build();
/// ```
pub fn build(self) -> Sirius<V> {
Sirius::from(self.config)
9 changes: 7 additions & 2 deletions bindings/sirius/src/sirius.rs
@@ -22,7 +22,7 @@ impl<V: Version> Sirius<V> {
/// The sirius executable is expected to be available in the environment variable SIRIUS_PATH.
/// The username and password for the sirius account are expected to be available in the environment variables SIRIUS_USERNAME and SIRIUS_PASSWORD.
///
/// This function get the parameters that where set in building the SiriusConfig struct and runs the sirius command with the given input and output file paths.
/// This function gets the parameters that were set in the SiriusBuilder struct and runs the sirius command with the given input and output file paths.
///
/// # Arguments
/// * `input_file_path` - The path to the input file
@@ -114,7 +114,12 @@ impl<V: Version> Sirius<V> {
println!("Running command: sirius {:?}", args);

// Add arguments and spawn the command
command.args(&args).spawn().expect("Sirius failed to start");
let mut child = command.args(&args).spawn().expect("Sirius failed to start");
let status = child.wait().expect("Failed to wait on child");

if !status.success() {
return Err("Sirius failed".to_string());
}

Ok(())
}
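
The change above replaces a bare `spawn()` with `spawn()` followed by `wait()`, so the caller blocks until the child process exits and can inspect its exit status. A minimal standalone sketch of that pattern follows. The helper name `run_and_wait` is hypothetical and not part of the crate, and unlike the commit it maps start and wait failures to `Err` instead of calling `expect`.

```rust
use std::process::Command;

/// Hypothetical helper (not from the crate): runs `program` with `args`,
/// blocks until it exits, and turns a non-zero exit status into an error.
fn run_and_wait(program: &str, args: &[&str]) -> Result<(), String> {
    // `spawn` returns as soon as the child process has been started.
    let mut child = Command::new(program)
        .args(args)
        .spawn()
        .map_err(|e| format!("failed to start {program}: {e}"))?;

    // `wait` blocks until the child exits and yields its `ExitStatus`.
    let status = child
        .wait()
        .map_err(|e| format!("failed to wait on {program}: {e}"))?;

    if status.success() {
        Ok(())
    } else {
        Err(format!("{program} exited with {status}"))
    }
}
```

Without the `wait()` call, the function would return while the child may still be running, so a test that reads the output directory right afterwards could see incomplete results.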
2 changes: 1 addition & 1 deletion bindings/sirius/src/sirius_types/atoms.rs
@@ -503,7 +503,7 @@ impl TryFrom<String> for Atoms {
}
}

/// A vector of atoms that can be read from a string or written to a string
/// Create a vector of atoms
#[cfg_attr(feature = "fuzz", derive(arbitrary::Arbitrary))]
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct AtomVector(Vec<Atoms>);
18 changes: 9 additions & 9 deletions bindings/sirius/src/sirius_types/mass_deviation.rs
@@ -17,13 +17,13 @@ impl MassDeviation {
/// If the value is negative
/// # Example
/// ```
/// use sirius_types::MassDeviation;
/// use sirius::prelude::*;
/// let ppm = MassDeviation::ppm(10.0);
/// ```
/// # Panics
/// If the value is negative
/// ```
/// use sirius_types::MassDeviation;
/// ```should_panic
/// use sirius::prelude::*;
/// let ppm = MassDeviation::ppm(-10.0);
/// ```
pub fn ppm(value: f32) -> Self {
@@ -39,14 +39,14 @@ impl MassDeviation {
/// If the value is negative
/// # Example
/// ```
/// use sirius_types::MassDeviation;
/// use sirius::prelude::*;
/// let da = MassDeviation::da(0.1);
/// ```
/// # Panics
/// If the value is negative
/// ```
/// use sirius_types::MassDeviation;
/// let da = MassDeviation::da(-0.1);
/// ```should_panic
/// use sirius::prelude::*;
/// let x = MassDeviation::da(-0.1);
/// ```
pub fn da(value: f32) -> Self {
// Da can't be negative
@@ -61,14 +61,14 @@ impl MassDeviation {
/// If the value is negative
/// # Example
/// ```
/// use sirius_types::MassDeviation;
/// use sirius::prelude::*;
/// let ppm = MassDeviation::ppm(10.0);
/// assert_eq!(ppm.must_be_positive().unwrap(), MassDeviation::Ppm(10.0));
/// ```
/// # Errors
/// If the value is negative
/// ```
/// use sirius_types::MassDeviation;
/// use sirius::prelude::*;
/// let ppm = MassDeviation::Ppm(-10.0);
/// assert!(ppm.must_be_positive().is_err());
/// ```
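
The doc comments in this file describe two ways of rejecting a negative value: the constructors panic, which is why their negative doctests are now marked `should_panic`, while `must_be_positive` reports the problem through a `Result`. The sketch below illustrates that contrast on a simplified stand-in type. `Deviation` and its error strings are assumptions made for illustration and are not the crate's `MassDeviation` implementation.

```rust
/// Hypothetical stand-in for a mass-deviation type (not the crate's definition).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Deviation {
    Ppm(f32),
    Da(f32),
}

impl Deviation {
    /// Panicking constructor: a negative value aborts the caller.
    pub fn ppm(value: f32) -> Self {
        assert!(value >= 0.0, "ppm value must not be negative");
        Deviation::Ppm(value)
    }

    /// Fallible check: a negative value is reported as an `Err` instead.
    pub fn must_be_positive(self) -> Result<Self, String> {
        match self {
            Deviation::Ppm(v) | Deviation::Da(v) if v < 0.0 => {
                Err(format!("deviation must not be negative, got {v}"))
            }
            ok => Ok(ok),
        }
    }
}
```

A doctest that calls the panicking path only passes when it is annotated `should_panic`, whereas the `Result` path can be checked with an ordinary `is_err()` assertion.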
@@ -1,6 +1,6 @@
use std::fmt::Display;

/// Noise threshold settings
/// The noise threshold settings
#[cfg_attr(feature = "fuzz", derive(arbitrary::Arbitrary))]
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default, Copy)]
pub enum BasePeak {
4 changes: 2 additions & 2 deletions bindings/sirius/tests/test_sirius.rs
@@ -94,7 +94,7 @@ fn test_run_sirius_with_enpkg_params() -> Result<(), String> {
.isotope_settings_filter(true)?
.formula_search_db(SearchDB::Bio)?
.timeout_seconds_per_tree(0)?
.formula_settings_enforced(AtomVector::new(vec![
.formula_settings_enforced(AtomVector::from(vec![
Atoms::H,
Atoms::C,
Atoms::N,
@@ -116,7 +116,7 @@ fn test_run_sirius_with_enpkg_params() -> Result<(), String> {
.ms2_mass_deviation_allowed_mass_deviation(MassDeviation::Ppm(5.0))?
.number_of_candidates_per_ion(1)?
.use_heuristic_mz_to_use_heuristic(300)?
.formula_settings_detectable(AtomVector::new(vec![
.formula_settings_detectable(AtomVector::from(vec![
Atoms::B,
Atoms::Cl,
Atoms::Se,
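
The test now builds its vectors with `AtomVector::from(vec![...])` rather than `AtomVector::new(vec![...])`. One common way to support that call on a tuple-struct wrapper is a `From<Vec<T>>` implementation, sketched below. `Element` and `ElementVector` are hypothetical names, and this is not claimed to be how the crate defines `AtomVector`.

```rust
/// Hypothetical element enum standing in for the crate's `Atoms`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Element {
    H,
    C,
    N,
    O,
    P,
}

/// Hypothetical newtype wrapper standing in for the crate's `AtomVector`.
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct ElementVector(Vec<Element>);

impl From<Vec<Element>> for ElementVector {
    /// Wraps the given vector without copying its elements.
    fn from(elements: Vec<Element>) -> Self {
        ElementVector(elements)
    }
}

impl ElementVector {
    /// Number of elements held by the wrapper.
    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```

Implementing `From` also provides `.into()` for free, so `let v: ElementVector = vec![Element::N].into();` works as well.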