about nlp.add_pipe in the demo #79

Open
newbietuan opened this issue Dec 7, 2021 · 6 comments

@newbietuan

Describe the bug
When I run the demo code, the call nlp.add_pipe(quickumls_component) fails with the following error:

Traceback (most recent call last):
File "umlsdemo.py", line 8, in <module>
nlp.add_pipe(quickumls_component)
File "/home/mayt/anaconda3/envs/umls/lib/python3.7/site-packages/spacy/language.py", line 769, in add_pipe
raise ValueError(err)
ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <quickumls.spacy_component.SpacyQuickUMLS object at 0x7f24b35e5cd0> (name: 'None').

  • If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.

  • If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').

  • If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.

To Reproduce

Environment

  • OS: Ubuntu
  • QuickUMLS version 1.4.0 post1
  • UMLS version 2021AB
  • spacy 3.2.0

Additional context
It seems to be related to spaCy, according to https://stackoverflow.com/questions/67906945/valueerror-nlp-add-pipe-now-takes-the-string-name-of-the-registered-component-f but I still don't know how to modify the code.

@ygivenx

ygivenx commented Feb 18, 2022

It can be used like this.

import spacy
from spacy.language import Language
from quickumls.spacy_component import SpacyQuickUMLS

# `nlp` must already be a loaded spaCy pipeline at this point.

@Language.component('quickumls_component')
def quickumls_component(doc):
    # Replace the placeholder with the path to your QuickUMLS install directory.
    return SpacyQuickUMLS(nlp, <Path to quickUmls install dir>)(doc)

nlp.add_pipe('quickumls_component', last=True)

doc = nlp(full_rpts.iloc[0])
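
The E966 message in the original report also mentions @Language.factory for class-based components. A minimal sketch of that route follows, assuming a hypothetical ToyMatcher class in place of SpacyQuickUMLS so that it runs without any UMLS data. The names toy_matcher, create_toy_matcher and resource_path are illustrative only and are not part of QuickUMLS. The point of the sketch is that the factory runs once, when add_pipe is called, so the component is constructed a single time rather than once per document.

```python
import spacy
from spacy.language import Language


class ToyMatcher:
    """Hypothetical stand-in for a class-based component such as SpacyQuickUMLS."""

    def __init__(self, nlp, resource_path):
        # Expensive setup (e.g. loading resources) would happen once, here.
        self.nlp = nlp
        self.resource_path = resource_path
        self.calls = 0

    def __call__(self, doc):
        # A real component would annotate the doc; this one only counts calls.
        self.calls += 1
        return doc


@Language.factory("toy_matcher", default_config={"resource_path": "path/to/resources"})
def create_toy_matcher(nlp, name, resource_path):
    # spaCy calls this factory when nlp.add_pipe("toy_matcher") runs.
    return ToyMatcher(nlp, resource_path)


nlp = spacy.blank("en")  # blank English pipeline, no model download needed
matcher = nlp.add_pipe("toy_matcher", last=True)

nlp("Pt c/o shortness of breath")
nlp("chest pain, nausea")

print(nlp.pipe_names)  # ['toy_matcher']
print(matcher.calls)   # 2: one instance handled both documents
```

With QuickUMLS, the factory body would presumably return SpacyQuickUMLS(nlp, <Path to quickUmls install dir>) instead of ToyMatcher; that substitution is an assumption and has not been tested here.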

@shrimonmuke0202

Hi everyone,
When I use this code, I get this error:
[E090] Extension 'similarity' already exists on Span. To overwrite the existing extension, set force=True on Span.set_extension.

@ghost

ghost commented Feb 22, 2023

@shrimonmuke0202 did you solve this problem?
[E090] Extension 'similarity' already exists on Span. To overwrite the existing extension, set force=True on Span.set_extension.
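
spaCy raises E090 when the same extension attribute is registered a second time. One plausible cause in this thread, which nobody here confirms, is that the wrapper function builds a new SpacyQuickUMLS object for every document and each construction tries to register the Span extensions again. The sketch below reproduces the error with a hypothetical attribute named demo_score (not a QuickUMLS attribute) and shows two ways around it: guarding with Span.has_extension, or passing force=True as the message suggests.

```python
import spacy
from spacy.tokens import Span


def register_score_extension():
    # Guard: only register the (hypothetical) attribute if it is not there yet.
    if not Span.has_extension("demo_score"):
        Span.set_extension("demo_score", default=-1.0)


register_score_extension()
register_score_extension()  # second call is a no-op thanks to the guard

try:
    # Without the guard, registering the same name again raises ValueError [E090].
    Span.set_extension("demo_score", default=-1.0)
    raised = False
except ValueError as err:
    raised = True
    print(err)

# force=True overwrites the existing extension instead of raising.
Span.set_extension("demo_score", default=0.0, force=True)

doc = spacy.blank("en")("chest pain")
print(doc[0:2]._.demo_score)  # 0.0, the default set by the forced registration
```

If repeated construction is indeed the cause, building the component once and reusing it should avoid the error without needing force=True.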

@ysu1213

ysu1213 commented Mar 15, 2023

Hi there, thank you so much for sharing a solution! I was able to get past the add_pipe error, but no further. Could you explain what the line doc = nlp(full_rpts.iloc[0]) does? I tried replacing it with something like
doc = nlp('Pt c/o shortness of breath, chest pain, nausea, vomiting, diarrrhea')
but that does not work. Initially I copied and pasted your code as is, but it fails with an error saying "full_rpts" is not defined. Is there some missing context for this line of code? Thank you so much!

@ygivenx

ygivenx commented Mar 16, 2023

full_rpts.iloc[0] returns a string taken from a pandas DataFrame, so doc = nlp('Pt c/o shortness of breath, chest pain, nausea, vomiting, diarrrhea') is correct. Did you update the QuickUMLS install location in the code below?

def quickumls_component(doc):
    return SpacyQuickUMLS(nlp, <Path to quickUmls install dir>)(doc)

@gah-bo

gah-bo commented Apr 23, 2023

Is the Path to quickUmls install dir supposed to be the same as quickumls_fp in this code block?

matcher = QuickUMLS(quickumls_fp, ...)

If so, I am doing that, yet I get this message:

Loading QuickUMLS resources from a default SAMPLE of UMLS data from here: /opt/conda/envs/python38/lib/python3.8/site-packages/resources/quickumls/QuickUMLS_SAMPLE_lowercase_POSIX_unqlite

and the print statements in the OP's code block produce no output.

However, this works fine:

from quickumls import QuickUMLS

# Initialize QuickUMLS matcher (overlapping_criteria="score", threshold=0.99)
matcher = QuickUMLS("./libraries/quickumls", "score", 0.99)

def quick_UMLS_match(medical_text):
    # Truncate very long inputs to the first 1,000,000 characters
    processed_text = medical_text[:1000000]
    return matcher.match(processed_text, best_match=True, ignore_syntax=False)

But I am trying to use medspacy instead, because I currently extract items from the QuickUMLS output in a very inefficient way and medspacy seems like the proper approach. For what it's worth, this is how I do it:

def quick_UMLS_extractor(matcher_output, return_field, unique=True):
    # matcher_output is a list of lists of match dicts; flatten it and pick one field
    return_items = [entity[return_field] for sublst in matcher_output for entity in sublst]

    if unique:
        # Deduplicate (order is not preserved)
        return_items = list(set(return_items))
    return return_items
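
A possible variant of the extractor above, sketched with plain Python and made-up data so it runs without QuickUMLS. It assumes the matcher output is a list of lists of dicts, as the function above implies; the name extract_field, the sample values and the CUIs are hypothetical. It swaps set() for dict.fromkeys, which drops duplicates while keeping first-seen order.

```python
def extract_field(matcher_output, return_field, unique=True):
    """Flatten a list of lists of match dicts and return one field from each."""
    items = [match[return_field] for group in matcher_output for match in group]
    if unique:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        items = list(dict.fromkeys(items))
    return items


# Hypothetical sample shaped like the matcher output; all values are made up.
sample_output = [
    [{"ngram": "chest pain", "term": "chest pain", "cui": "C0000001", "similarity": 1.0}],
    [
        {"ngram": "nausea", "term": "nausea", "cui": "C0000002", "similarity": 1.0},
        {"ngram": "nausea", "term": "nauseas", "cui": "C0000002", "similarity": 0.9},
    ],
]

print(extract_field(sample_output, "cui"))                # ['C0000001', 'C0000002']
print(extract_field(sample_output, "cui", unique=False))  # ['C0000001', 'C0000002', 'C0000002']
```

Both this variant and the original require the chosen field to hold hashable values when unique=True, since deduplication relies on hashing.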
