Enju outputs sentence structure information as a tree, but I don't think it maps terms to ontology identifiers.
So, I guess you either need an additional mapping step, or need to combine that step with the translation one.
So, how do you see the hackathon running?
–> Reply:
I think that initially, structure-mapping and identifier-mapping can be separate subprojects.
A structure mapper could simply use term strings as placeholder IDs (or string+position, to prevent duplicates). Using URIs is not obligatory for VSM-JSON.
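To make the string+position idea concrete, here is a minimal sketch. It assumes the parser gives us each term's text and character offset; the function name and the `str`/`id` keys are illustrative only and are not claimed to match the VSM-JSON schema.

```python
def placeholder_ids(tokens):
    """Build a placeholder ID per term from its string plus its character offset.

    `tokens` is a list of (text, begin_offset) pairs. The output keys
    ("str", "id") are hypothetical names chosen for this sketch.
    """
    return [{"str": text, "id": f"{text}@{begin}"} for text, begin in tokens]


# The same string occurring twice still gets two distinct placeholder IDs.
tokens = [("John", 0), ("eats", 5), ("chicken", 10), ("and", 18), ("John", 22), ("sleeps", 27)]
terms = placeholder_ids(tokens)
```

Using the offset rather than a running counter keeps the ID traceable back to the source text, which could help when a later identifier-mapping step replaces the placeholders.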
Indeed, multiple NLP tools will need to be brought together. (Todo: add to Readme).
◦ For example on www.pubannotation.org, we could use both the "Gene name grounding (PubTator)" and "Semantic annotation (MetaMap and SemRep)" annotations, which are shown aligned with the "Dependency parsing (Enju)".
◦ On stanza.run/bio, NER annotations are aligned with the UD tree. Though I only see term categories, no IDs, so PubAnnotation's site wins there.
Of course, we may combine NLP tools from any ecosystem. I like Enju because I see arg1Of and arg2Of relations. Perhaps these can be mapped to VSM subject/object relations (= tridents and bidents)? I like Universal Dependencies because it says 'Universal'. :) So this is where I'd prefer NLP experts to chime in. Hence my hackathon participation.
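As a starting point for that discussion, here is a naive sketch of the arg1Of/arg2Of idea. It assumes PubAnnotation-style relation records with `subj`, `pred` and `obj` keys, and assumes that `subj` is the argument and `obj` is the predicate it attaches to. The output shape (`type`, `subject`, `relation`, `object`) is a hypothetical intermediate form, not VSM-JSON.

```python
from collections import defaultdict


def args_to_connectors(relations):
    """Group arg1Of/arg2Of relations by predicate and emit trident/bident records.

    Assumption: arg1Of marks the subject and arg2Of marks the object of the
    predicate. That is a simplification; it will not hold for every word class.
    """
    by_predicate = defaultdict(dict)
    for rel in relations:
        if rel["pred"] == "arg1Of":
            by_predicate[rel["obj"]]["subject"] = rel["subj"]
        elif rel["pred"] == "arg2Of":
            by_predicate[rel["obj"]]["object"] = rel["subj"]

    connectors = []
    for predicate, args in by_predicate.items():
        # Both arguments present: three-legged connector; otherwise two-legged.
        kind = "trident" if len(args) == 2 else "bident"
        connectors.append({"type": kind, "relation": predicate, **args})
    return connectors


relations = [
    {"subj": "T1", "pred": "arg1Of", "obj": "T2"},
    {"subj": "T3", "pred": "arg2Of", "obj": "T2"},
    {"subj": "T1", "pred": "arg1Of", "obj": "T4"},
]
connectors = args_to_connectors(relations)
```

Whether arg1/arg2 really line up with subject/object for prepositions, adjectives and so on is exactly the kind of question for the NLP experts.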
Back to the IDs: Realistically, I think our best target will be to link as many terms as possible to an ontology/PubDictionaries/etc. ID, and to link the remaining terms to a placeholder string-ID.
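That mixed target could look roughly like this. The sketch assumes some lookup (here a plain dict) has already been built from an ontology or dictionary service; the `EX:` identifiers are made up for illustration, and the function and key names are hypothetical.

```python
def ground_terms(tokens, lookup):
    """Link each term to a known ID when the lookup has one, else to a placeholder.

    `tokens` is a list of (text, begin_offset) pairs; `lookup` maps lowercased
    term strings to identifiers. Ungrounded terms get a string+position ID.
    """
    grounded = []
    for text, begin in tokens:
        known = lookup.get(text.lower())
        grounded.append({
            "str": text,
            "id": known if known is not None else f"{text}@{begin}",
            "grounded": known is not None,
        })
    return grounded


# Made-up identifiers, standing in for real ontology or dictionary IDs.
lookup = {"chicken": "EX:0001"}
result = ground_terms([("John", 0), ("eats", 5), ("chicken", 10)], lookup)
```

Keeping a `grounded` flag makes it easy to report how many terms still carry placeholders after each mapping pass.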
(Moved here from an email.)