A script to compute sentence probs #92

xiaohui-zhang · 2018-06-27T16:37:04Z

We need a script like rnnlm/compute_sentence_scores.sh in kaldi-rnnlm to compute sentence-level scores for a text file. A good starting point would be pocolm/scripts/get_data_prob.py, which computes the probability of a whole text file. Dongji has offered to do this. Thanks.

DongjiGao · 2018-06-28T03:03:43Z

My ideas are: 1) split the text file into sentences and call get_data_prob.py for each sentence, or 2) add a new argument (--sentence-prob) to get_data_prob.py and compute the sentence probabilities inside it. Which one do you prefer, @danpovey? And do we need to support utterance IDs?

danpovey · 2018-06-28T03:16:12Z

Add a new option, or create a new python script. Calling get_data_prob.py for each sentence would be very slow as it would have to load the model each time. Supporting utterance-ids would be nice, but it's not necessary as we could use `paste` to add them back in afterward.
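
For illustration, here is a minimal sketch of the "load once, score every line" approach described above. It does not use pocolm's actual model-loading or scoring code: `train_toy_unigram` is a hypothetical stand-in (an add-one-smoothed unigram model) for whatever loads the real LM, and `score_sentences` and its `has_utt_ids` flag are likewise assumed names, not existing options of get_data_prob.py.

```python
import math
from collections import Counter


def train_toy_unigram(corpus_lines):
    """Hypothetical stand-in for loading a real LM once.

    Returns a function mapping a word to its add-one-smoothed
    unigram log10 probability.
    """
    counts = Counter()
    for line in corpus_lines:
        counts.update(line.split() + ["</s>"])
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves mass for unseen words

    def logprob(word):
        return math.log10((counts.get(word, 0) + 1) / (total + vocab_size))

    return logprob


def score_sentences(lines, logprob, has_utt_ids=False):
    """Return one (utt_id or None, sentence log10-prob) pair per input line.

    The model (here, the `logprob` function) is created once by the caller
    and reused for every line, so no per-sentence loading cost is paid.
    """
    results = []
    for line in lines:
        words = line.split()
        utt_id = None
        if has_utt_ids and words:
            # Treat the first field as the utterance ID, not as a word.
            utt_id, words = words[0], words[1:]
        total = sum(logprob(w) for w in words + ["</s>"])
        results.append((utt_id, total))
    return results


if __name__ == "__main__":
    model = train_toy_unigram(["the cat sat", "the dog sat"])  # "loaded" once
    text = ["utt1 the cat sat", "utt2 the dog ran"]
    for utt_id, lp in score_sentences(text, model, has_utt_ids=True):
        print(utt_id, "%.4f" % lp)
```

If the script only wrote one score per line without IDs, the IDs could be re-attached afterward with `paste`, as suggested above, provided the output preserves the input line order.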


danpovey · 2018-07-06T01:00:19Z

It's OK. It will be a good exercise for you since you are doing a lot of language modeling work and SRILM is a very standard tool.

On Thu, Jul 5, 2018 at 8:59 PM, DongjiGao wrote: Will do. It might take me some time since I have not used SRILM before.

srgangireddy · 2020-09-07T12:07:45Z

Hi, I wonder if this is implemented in get_data_prob.py? Thank you.

danpovey · 2020-09-07T12:44:56Z

I don't think it is; as I said above in the thread, that will give you the overall prob but not per line.


srgangireddy · 2020-09-07T12:50:32Z

OK, thanks for letting me know.

DongjiGao mentioned this issue Jul 6, 2018

Compute sentence prob #93

Closed
