A script to compute sentence probs #92
Comments
My ideas are 1) split the text file into sentences and call get_data_prob.py for each sentence, or 2) add a new argument (--sentence-prob) to get_data_prob.py and compute the sentence probability inside it. Which one do you prefer, @danpovey? And do we need to support utterance IDs? |
Add a new option, or create a new python script. Calling get_data_prob.py
for each sentence would be very slow as it would have to load the model
each time.
Supporting utterance-ids would be nice, but it's not necessary as we could
use `paste` to add them back in afterward.
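
A minimal sketch of the two points above: load the model once and score every line in the same process, then re-attach utterance ids line by line in the way `paste` would. This is not pocolm code. `load_toy_model`, `score_sentences`, and `paste_ids` are hypothetical names, and the add-one-smoothed unigram model is only a stand-in for the real (expensive to load) language model.

```python
import math
from collections import Counter


def load_toy_model(training_sentences):
    """Stand-in for the expensive model load: an add-one-smoothed unigram LM.

    Hypothetical helper, not part of pocolm. Returns a function that maps a
    word to its natural-log probability.
    """
    counts = Counter()
    for sentence in training_sentences:
        counts.update(sentence.split() + ["</s>"])
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves probability mass for unseen words
    return lambda word: math.log((counts[word] + 1) / (total + vocab_size))


def score_sentences(word_logprob, sentences):
    """Return one log-probability per sentence, reusing the already-loaded model."""
    return [
        sum(word_logprob(w) for w in sentence.split() + ["</s>"])
        for sentence in sentences
    ]


def paste_ids(utt_ids, scores):
    """Re-attach utterance ids line by line, tab-separated like `paste`."""
    if len(utt_ids) != len(scores):
        raise ValueError("ids and scores must have the same number of lines")
    return [f"{utt_id}\t{score:.4f}" for utt_id, score in zip(utt_ids, scores)]


if __name__ == "__main__":
    model = load_toy_model(["the cat sat", "the dog sat"])  # loaded once
    sentences = ["the cat sat", "the dog ran"]
    scores = score_sentences(model, sentences)  # no reload per sentence
    for line in paste_ids(["utt1", "utt2"], scores):
        print(line)
```

Because the scores come out in the same order as the input lines, the ids never need to pass through the scorer itself.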
|
Closed
It's OK. It will be a good exercise for you since you are doing a lot of
language modeling work and SRILM is a very standard tool.
On Thu, Jul 5, 2018 at 8:59 PM, DongjiGao wrote:
Will do. It might take me some time since I have not used SRILM before.
|
Hi, I wonder if this is implemented in get_data_prob.py? Thank you. |
I don't think it is; as I said above in the thread, that will give you the
overall prob but not per line.
|
ok. thanks for letting me know. |
We need a script like rnnlm/compute_sentence_scores.sh in kaldi-rnnlm to compute sentence-level scores for a text file. A starting point would be pocolm/scripts/get_data_prob.py, which computes the probability of a whole text file. Dongji has offered to do this. Thanks.