Error with annotate_ws.py #69

Open

bfinj opened this issue Aug 1, 2020 · 7 comments

Comments


bfinj commented Aug 1, 2020

Hi!

I was using annotate_ws.py to annotate custom questions, running it on Google Cloud Platform. However, I got this error:
python3 annotate_ws.py --split past,present
annotating /home/Enzo/sqlova-shallow-layer/past.jsonl
loading tables
100%|██████████| 2716/2716 [00:00<00:00, 17256.43it/s]
loading examples
  0%|          | 0/1690 [00:00<?, ?it/s]
Starting server with command: java -Xmx5G -cp /home/Enzo/sqlova-shallow-layer/stanford-corenlp-4.0.0/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 60000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-6f9bf1976d784f04.props -preload tokenize,ssplit,pos,lemma,ner,depparse
  0%|          | 0/1690 [00:40<?, ?it/s]
Traceback (most recent call last):
  File "annotate_ws.py", line 190, in <module>
    a = annotate_example_ws(d, tables[d['table_id']])
  File "annotate_ws.py", line 107, in annotate_example_ws
    _nlu_ann = annotate(example['question'])
  File "annotate_ws.py", line 24, in annotate
    for s in client.annotate(sentence):
TypeError: 'Document' object is not iterable

Could you tell me why this happened? Thank you in advance!

@Daljeetka

I am facing the same issue. Did you find a solution to this problem?

@bfinj
Author

bfinj commented Sep 29, 2020

@Daljeetka Not yet...

@Qingkongji

When running annotate_ws.py, I got an error: ModuleNotFoundError: No module named 'stanza.nlp'. But I have installed stanza. Which other package should I install?

@Qingkongji

I figured it out: change line 8 to from stanza.server import CoreNLPClient. But now I am hitting the same TypeError: 'Document' object is not iterable too.
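For reference, the line 8 change would look roughly like this; the old import path is inferred from the ModuleNotFoundError above and may differ slightly in your copy of annotate_ws.py:

# old import, fails on current stanza releases (the stanza.nlp module no longer exists)
# from stanza.nlp.corenlp import CoreNLPClient

# new import
from stanza.server import CoreNLPClient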

@gouldju1

Try this:

import stanza
nlp = stanza.Pipeline('en')

def annotate(sentence, lower=True, nlp=nlp):
    """
    Input: question string
    Output: tokenized question as a dict:
    {
        'gloss': original token text,
        'words': list of tokens (lowercased if lower=True),
        'after': " " after each token, except "" for the last two tokens
    }
    """
    doc = nlp(sentence)

    words, gloss, after = [], [], []
    for sent in doc.sentences:  # renamed from `sentence` to avoid shadowing the argument
        for token in sent.tokens:
            words.append(token.text)
            gloss.append(token.text)
            after.append(" ")
        after[-2:] = ["", ""]
    if lower:
        words = [w.lower() for w in words]
    return {
        'gloss': gloss,
        'words': words,
        'after': after,
        }
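A quick way to sanity-check the replacement; the question string below is only an illustrative example:

if __name__ == '__main__':
    # Any question string works here.
    ann = annotate("How many singers do we have?")
    print(ann['words'])  # lowercased tokens
    print(ann['gloss'])  # original token text
    print(ann['after'])  # " " per token, "" for the last two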

@dsivakumar

dsivakumar commented Jul 2, 2022

With the latest stanza I had to make these changes to get it working (check the lines marked with ###), and I started the CoreNLP server outside the script (see stanfordnlp/stanza#245 (comment)).


#!/usr/bin/env python3
from argparse import ArgumentDefaultsHelpFormatter, ArgumentParser
import os
import records
import ujson as json
from stanza.server.client import CoreNLPClient ###
from tqdm import tqdm
import copy
from lib.common import count_lines, detokenize
from lib.query import Query
import stanza.server as corenlp ###

client = None

def annotate(sentence, lower=True):
    global client
    if client is None:
        client = CoreNLPClient(annotators='tokenize,ssplit,pos,lemma,ner,depparse',
            start_server=corenlp.StartServer.DONT_START) ###
    words, gloss, after = [], [], []
    objs = client.annotate(sentence) ###
    for s in objs.sentence: ###
        for t in s.token: ###
            words.append(t.word)
            gloss.append(t.originalText)
            after.append(t.after)
    # ... rest of annotate() unchanged
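For completeness, one way to start the CoreNLP server outside the script is to launch the same java command that appears in the traceback above before running annotate_ws.py. A minimal sketch; the classpath and CoreNLP version are assumptions and will differ on your machine:

import subprocess

# Launch the CoreNLP server in the background before running annotate_ws.py.
# The classpath assumes CoreNLP 4.0.0 unpacked next to the repo; adjust as needed.
server = subprocess.Popen([
    "java", "-Xmx5G",
    "-cp", "stanford-corenlp-4.0.0/*",
    "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
    "-port", "9000",
    "-timeout", "60000",
    "-preload", "tokenize,ssplit,pos,lemma,ner,depparse",
])
# ... run annotate_ws.py while the server is up, then:
# server.terminate()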

@jack-jjm

Yes, the code by @dsivakumar looks correct. The return value of client.annotate(sentence) is not an actual Document object, whatever the error message says; it is a protobuf message, as explained (sort of) here. The protobuf's fields are named in the singular (sentence, token) even though they are repeated fields holding multiple sentences and tokens.
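So iteration has to go through those repeated fields explicitly. A minimal illustration, assuming a running server and the client setup from the snippet above; the question string is just an example:

doc = client.annotate("Which country has the most medals?")
first = doc.sentence[0].token[0]  # note the singular field names on the protobuf
print(first.word, first.originalText, repr(first.after))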
