Unified submission #50

Open
dbujold opened this issue Sep 26, 2019 · 7 comments
@dbujold
Member

dbujold commented Sep 26, 2019

Currently, the data hub validator relies on an exact match between the experiment name in an EpiRR record and the name used in the IHEC Data Hub. This is problematic because:
1- An EpiRR record can have more than one experiment of the same type.
2- The name used to describe the experiment can differ between the two sources.
3- The Experiment Type property will soon be optional and can be replaced by an ontology URI.

@dzerbino
Contributor

Result of the Banff discussion: this issue would be resolved by unifying submissions to EpiRR and the IHEC Portal. Renaming the issue accordingly.

@dzerbino dzerbino changed the title from "A way to make connection between a data hub and EpiRR record experiment is needed" to "Unified submission" Nov 29, 2019
@sitag
Contributor

sitag commented Nov 29, 2019

To document: the proposal is to cross-reference all metadata fields that are repeated in the data hub schema from the EpiRR registry.

@dzerbino
Contributor

dzerbino commented Jan 23, 2020

Desired result: a single point of contact. A JSON is submitted to EpiRR, which sends back a template Portal JSON that is then filled in by the team.

TODO:

  • What info can be dropped from the Portal JSON (assuming the portal can read it from EpiRR)?
  • Create Portal JSON generator from EpiRR submission
  • Ensure validators still function properly
  • Set up SOP
  • Create joint RT system
  • Update documentation
  • Test
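
A rough sketch of what the "Portal JSON generator" item could look like, purely as an illustration. The input layout (an `experiments` list whose entries carry `experiment_id` and `sample_id`) and the function name `make_portal_template` are assumptions, not the actual EpiRR format. The output keys follow the retained-fields outline given later in this thread. Datasets are keyed by experiment ID rather than by name, since names are not guaranteed to be unique or to match.

```python
import json


def make_portal_template(epirr_submission):
    """Build an empty Portal JSON skeleton from an EpiRR-style submission.

    The input layout is a hypothetical stand-in for a real EpiRR record.
    """
    datasets = {}
    for exp in epirr_submission["experiments"]:
        # Key by experiment ID so two experiments of the same type cannot collide.
        datasets[exp["experiment_id"]] = {
            "sample_id": exp["sample_id"],
            "experiment_id": exp["experiment_id"],
            "analysis_attributes": {},  # to be filled in by the submitting team
            "browser": {},              # to be filled in by the submitting team
        }
    return {"hub_description": {}, "datasets": datasets}


# Placeholder identifiers, for illustration only.
submission = {
    "experiments": [
        {"experiment_id": "EXP-1", "sample_id": "SAMPLE-A"},
        {"experiment_id": "EXP-2", "sample_id": "SAMPLE-A"},
    ]
}
template = make_portal_template(submission)
print(json.dumps(template, indent=4))
```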

@dzerbino dzerbino self-assigned this Feb 28, 2020
@dzerbino dzerbino added this to the Kiel meeting milestone Mar 3, 2020
@dzerbino
Contributor

dzerbino commented Jul 7, 2020

What info can be dropped from the Portal JSON (assuming the portal can read it from EpiRR)?

{
    "datasets": {
        "experiment_1": {
            "experiment_attributes" // convert to ID
        },
        "experiment_2": {
            ...
        }
    },
    "samples": { ... }
}

What info needs to be retained:

{
    "hub_description": { ... },
    "datasets": {
        "experiment_1": {
            "sample_id": "...",
            "experiment_id": "...",
            "analysis_attributes": { ... },
            "browser": { ... }
        },
        "experiment_2": {
            ...
        }
    }
}
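
To make the split above concrete, here is a minimal sketch that reduces an existing full Portal JSON to the retained keys. The helper name `slim_portal_json` and the sample values are made up for illustration. It also assumes each dataset already carries `sample_id` and `experiment_id`, which is the part that still needs to be put in place.

```python
# Per-dataset keys kept in the Portal JSON, per the outline above.
RETAINED_DATASET_KEYS = ("sample_id", "experiment_id", "analysis_attributes", "browser")


def slim_portal_json(portal_json):
    """Return a copy of the hub holding only what the portal cannot read from EpiRR."""
    slim = {"hub_description": portal_json.get("hub_description", {}), "datasets": {}}
    for name, dataset in portal_json.get("datasets", {}).items():
        # Drops experiment_attributes and any other key not in the retained list.
        slim["datasets"][name] = {
            key: dataset[key] for key in RETAINED_DATASET_KEYS if key in dataset
        }
    # The top-level "samples" block is not copied at all.
    return slim


# Placeholder hub, for illustration only.
full_hub = {
    "hub_description": {"description": "placeholder"},
    "datasets": {
        "experiment_1": {
            "sample_id": "SAMPLE-A",
            "experiment_id": "EXP-1",
            "experiment_attributes": {"experiment_type": "placeholder"},
            "analysis_attributes": {"analysis_group": "placeholder"},
            "browser": {"signal": []},
        }
    },
    "samples": {"SAMPLE-A": {"sample_attribute": "placeholder"}},
}
slim_hub = slim_portal_json(full_hub)
```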

@sitag
Contributor

sitag commented Jul 7, 2020

@dzerbino They use the same schemas, so everything can be dropped as long as we keep the identifiers.

@dbujold
Member Author

dbujold commented Jul 7, 2020

Basically, what's needed is a way to link sample and experiment metadata, which would be obtained from EpiRR, to processed data (bigWigs, bigBeds) and data processing metadata (analysis_attributes), which would be provided to the IHEC Data Portal.
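
As an illustration of that link, assuming the EpiRR metadata has already been fetched into two lookup tables keyed by ID: the function `link_datasets`, the lookup tables, and the `sample_attributes` output key are all hypothetical names for this sketch, not an existing API. Datasets whose IDs do not resolve are reported instead of being matched by name.

```python
def link_datasets(portal_json, epirr_samples, epirr_experiments):
    """Join portal datasets to EpiRR metadata by sample_id and experiment_id.

    Returns (linked, missing): linked maps dataset name to merged metadata,
    missing lists dataset names with an ID that EpiRR does not know.
    """
    linked, missing = {}, []
    for name, dataset in portal_json["datasets"].items():
        sample = epirr_samples.get(dataset.get("sample_id"))
        experiment = epirr_experiments.get(dataset.get("experiment_id"))
        if sample is None or experiment is None:
            missing.append(name)
            continue
        linked[name] = {
            "sample_attributes": sample,          # from EpiRR
            "experiment_attributes": experiment,  # from EpiRR
            "analysis_attributes": dataset.get("analysis_attributes", {}),  # from the hub
            "browser": dataset.get("browser", {}),                          # from the hub
        }
    return linked, missing


# Placeholder data, for illustration only.
epirr_samples = {"SAMPLE-A": {"sample_attribute": "placeholder"}}
epirr_experiments = {"EXP-1": {"experiment_type": "placeholder"}}
hub = {
    "datasets": {
        "experiment_1": {"sample_id": "SAMPLE-A", "experiment_id": "EXP-1", "browser": {}},
        "experiment_2": {"sample_id": "SAMPLE-A", "experiment_id": "EXP-9", "browser": {}},
    }
}
linked, missing = link_datasets(hub, epirr_samples, epirr_experiments)
```

A validator built this way would only need to check that `missing` is empty, rather than comparing experiment names.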

@dzerbino dzerbino removed their assignment Feb 26, 2021

3 participants