Placeholder for discussing support for HCA and FAANG context #95

henrietteharmse opened this issue Apr 9, 2021 · 2 comments
henrietteharmse commented Apr 9, 2021

This ticket serves as a discussion point for adding support for HCA and FAANG contexts. The suggestions below are meant as a concrete starting point that people can point at and say whether it makes sense or not.

Currently, HCA and FAANG use graph restrictions on some of their fields to limit which ontology terms can be used for those fields.

Here is an example from FAANG for their experiments_chip-seq_dna-binding_proteins field:

              "graph_restriction": {
                "ontologies": ["obo:chebi"],
                "classes": ["CHEBI:15358"],
                "relations": ["rdfs:subClassOf"],
                "direct": false,
                "include_self": false
              }

Here is an example from HCA for their cell type field:

            "graph_restriction":  {
                "ontologies" : ["obo:hcao", "obo:cl"],
                "classes": ["CL:0000003"],
                "relations": ["rdfs:subClassOf"],
                "direct": false,
                "include_self": false
            },

Currently our project definition looks as follows:

{
  "name": "Project name",                     // MANDATORY
  "description": "Some description",
  "numberOfReviewsRequired": 3,
  "datasources": [
     "atlas",
     "uniprot",
     "gwas",
     ...
  ],
  "ontologies": [
     "efo",
     "mondo",
     "hp",
     "ordo"
  ],
  "preferredMappingOntologies": [ "efo" ]
}

To support HCA and FAANG, we need to add a `fields` array to our project definition, where each entry names a field and gives the graph restriction that applies to it.
Here is an example for FAANG.

 "fields": [
     {
         "fieldName": "experiments_chip-seq_dna-binding_proteins",
         "graphRestriction": {
             "ontologies": ["obo:chebi"],
             "classes": ["CHEBI:15358"],
             "relations": ["rdfs:subClassOf"],
             "direct": false,
             "includeSelf": false
         }
     },
     {
         "fieldName": "otherField",
         "graphRestriction": {
         ...
         }
     }
 ]

Here is an example for HCA:


 "fields": [
     {
         "fieldName": "cell type",
         "graphRestriction": {
             "ontologies": ["obo:hcao", "obo:cl"],
             "classes": ["CL:0000003"],
             "relations": ["rdfs:subClassOf"],
             "direct": false,
             "includeSelf": false
         }
     },
     {
         "fieldName": "otherField",
         "graphRestriction": {
         ...
         }
     }
 ]
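The proposed `fields` entries could be derived mechanically from the upstream `graph_restriction` blocks. The sketch below assumes the only differences are the wrapper (`fieldName` plus `graphRestriction`) and the renaming of `include_self` to `includeSelf`; the function name `to_project_field` is hypothetical and not part of any existing code.

```python
from typing import Any, Dict

# Assumption: only this key changes between the upstream snake_case
# schema and the camelCase project definition proposed in this ticket.
KEY_RENAMES = {"include_self": "includeSelf"}


def to_project_field(field_name: str, graph_restriction: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap an upstream graph_restriction as a project `fields` entry."""
    converted = {KEY_RENAMES.get(key, key): value
                 for key, value in graph_restriction.items()}
    return {"fieldName": field_name, "graphRestriction": converted}


# The FAANG restriction quoted earlier in this ticket.
faang_restriction = {
    "ontologies": ["obo:chebi"],
    "classes": ["CHEBI:15358"],
    "relations": ["rdfs:subClassOf"],
    "direct": False,
    "include_self": False,
}

faang_field = to_project_field(
    "experiments_chip-seq_dna-binding_proteins", faang_restriction
)
```

All other keys (`ontologies`, `classes`, `relations`, `direct`) are copied through unchanged.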

Currently our upload format looks as follows:

{
  "data": [
    {
      "upstreamId": "ID",     // Optional
      "priority": 3,          // Optional
      "text": "TEXT",         // Mandatory
      "context": "field"      // Optional (if not provided, data points will be auto-assigned to the `default` context)
    }
  ]
}

I do not think our upload file format will need to change, assuming the `context` value of each data point names a field that is part of the list of fields for that project.
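That assumption could be checked at upload time. The following is only a sketch: it assumes the project definition carries the proposed `fields` array with `fieldName` keys, and that a missing `context` falls back to `default` as in the upload format above. The function name `unknown_contexts` is hypothetical.

```python
from typing import Any, Dict, List

DEFAULT_CONTEXT = "default"


def unknown_contexts(project: Dict[str, Any], upload: Dict[str, Any]) -> List[str]:
    """Return contexts used in the upload that the project does not define."""
    known = {field["fieldName"] for field in project.get("fields", [])}
    known.add(DEFAULT_CONTEXT)  # data points without a context land here

    unknown: List[str] = []
    for entry in upload.get("data", []):
        context = entry.get("context", DEFAULT_CONTEXT)
        if context not in known and context not in unknown:
            unknown.append(context)
    return unknown


project = {"name": "Project name", "fields": [{"fieldName": "cell type"}]}
upload = {
    "data": [
        {"text": "TEXT", "context": "cell type"},
        {"text": "TEXT"},                      # no context: uses `default`
        {"text": "TEXT", "context": "organ"},  # not a field of this project
    ]
}
problems = unknown_contexts(project, upload)
```

An empty result means every data point maps to a known field or to the `default` context.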

@henrietteharmse

@mshadbolt @Alexey-ebi @peterwharrison @dosumis @zoependlington @tskir @udp @tudorgroza

Please feel free to comment and raise your concerns and/or ideas.

henrietteharmse commented Apr 14, 2021

For HCA, the only value used for the relations field is rdfs:subClassOf, the direct field is always false, and include_self can be true or false.

| Field | Values | Meaning |
| --- | --- | --- |
| relations | rdfs:subClassOf | The term must be a subclass of one of the terms in the classes field. |
| include_self | true/false | When true, the term may itself be one of the classes listed in the classes field. Used together with an rdfs:subClassOf restriction, it means the term can be one of the classes OR a subclass of one of the classes in the classes field. |
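To make these semantics concrete, here is a minimal sketch of how a term could be checked against a graph restriction. It assumes the class hierarchy is available as a plain dictionary mapping each term to its direct superclasses, ignores the `ontologies` field, and handles only `rdfs:subClassOf`. The function names and the `EX:` identifiers are made up for illustration.

```python
from typing import Any, Dict, Iterable, Set


def ancestors(term: str, parents: Dict[str, Iterable[str]]) -> Set[str]:
    """Return all transitive superclasses of `term`, excluding `term` itself."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    seen.discard(term)  # guard against cycles in the input
    return seen


def satisfies_restriction(term: str, restriction: Dict[str, Any],
                          parents: Dict[str, Iterable[str]]) -> bool:
    """Check `term` against one graph restriction (rdfs:subClassOf only)."""
    classes = set(restriction["classes"])
    # include_self: the term may be one of the listed classes itself.
    if restriction.get("include_self", False) and term in classes:
        return True
    if "rdfs:subClassOf" not in restriction.get("relations", []):
        return False
    # direct: only immediate superclasses count; otherwise any ancestor does.
    if restriction.get("direct", False):
        candidates = set(parents.get(term, ()))
    else:
        candidates = ancestors(term, parents)
    return bool(candidates & classes)


# Toy hierarchy: EX:grandchild -> EX:child -> EX:root
toy_parents = {"EX:child": ["EX:root"], "EX:grandchild": ["EX:child"]}
restriction = {
    "classes": ["EX:root"],
    "relations": ["rdfs:subClassOf"],
    "direct": False,
    "include_self": False,
}
```

With this toy data, `EX:grandchild` is accepted because `direct` is false, while `EX:root` itself is rejected unless `include_self` is set to true.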
