Intent Classification
Rasa NLU classifies intent with a linear SVM, using spaCy's n-gram model to vectorize utterances.
There's also been research into using seq2seq models for joint intent classification and slot filling, as seen in this paper from Microsoft. There's also a Python implementation here.
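The vectorize-then-classify idea can be sketched with scikit-learn. This is not Rasa's actual pipeline — a `CountVectorizer` with n-gram features stands in for spaCy's vectorization, and the training data is made up:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data: utterance -> intent label (hypothetical examples)
utterances = [
    "what's the weather like today",
    "will it rain tomorrow",
    "book a table for two",
    "reserve a seat at the restaurant",
]
intents = ["weather", "weather", "booking", "booking"]

# Unigram + bigram counts stand in for spaCy's n-gram vectorization
clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(utterances, intents)

print(clf.predict(["is it going to rain"])[0])  # → weather
```

In a real system the classifier would be trained on hundreds of utterances per intent, but the shape of the pipeline is the same.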
Entity Extraction
Rasa NLU has several methods of entity extraction, as documented here. These include a conditional random field (CRF) for custom entity extraction (not pretrained). spaCy also provides entity extraction, in the form of an averaged perceptron. The third option is a Duckling server, which uses a context-free grammar; Facebook maintains an open-source implementation.
As mentioned above, a seq2seq approach can also be used, as documented here.
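To make the Duckling-style approach concrete, here's a minimal sketch of rule-based extraction. Real Duckling composes grammar rules recursively (e.g. "at" + time-of-day → time); this toy version only matches leaf patterns, and the rule set is invented for illustration:

```python
import re

# Minimal rule-based extractor in the spirit of Duckling's grammar rules.
# Each rule maps a regex to a dimension; real Duckling composes rules
# recursively, this just matches leaves.
RULES = [
    ("number", re.compile(r"\b\d+\b")),
    ("time",   re.compile(r"\b\d{1,2}(:\d{2})?\s?(am|pm)\b", re.I)),
]

def extract(text):
    """Return all matched entities with their dimension and span."""
    entities = []
    for dim, pattern in RULES:
        for m in pattern.finditer(text):
            entities.append({"dim": dim, "value": m.group(0),
                             "start": m.start(), "end": m.end()})
    return entities

print(extract("meet me at 5pm with 3 friends"))
```

The appeal of the grammar approach is exactly this determinism: no training data needed, at the cost of hand-writing rules per language.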
Bootstrapping Utterances
Writing utterances by hand is a pain in the rear. There might be a way to bootstrap utterance generation to avoid writing them all manually.
Here's a list of data corpora that should prove useful in that regard. That paper also has an overview of useful methods for building dialogue systems.
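One simple way to bootstrap — not from any of the papers below, just a sketch — is to expand slot-filled templates, so a handful of slot values yields many training utterances. The slot names and values here are made up:

```python
from itertools import product

# Hypothetical slot values; in practice these could be mined from a corpus.
slots = {
    "greeting": ["hi", "hey", "hello"],
    "request":  ["book a flight", "find me a flight"],
    "city":     ["Boston", "Denver"],
}

template = "{greeting}, {request} to {city}"

def expand(template, slots):
    """Generate every combination of slot fillers for a template."""
    keys = sorted(slots)
    for values in product(*(slots[k] for k in keys)):
        yield template.format(**dict(zip(keys, values)))

utterances = list(expand(template, slots))
print(len(utterances))  # 3 * 2 * 2 = 12
```

Template expansion only varies the slots, not the phrasing, so it's a starting point rather than a replacement for real user utterances.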
The paper also has an interesting reference, Luke, I am your father: dealing with out-of-domain requests by using movies subtitles. This should be useful for one-off responses. This Google research blog post has an example of ranking the uniqueness of responses, which will be necessary for generating unique responses.
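One crude way to rank uniqueness — a sketch with made-up data, not the blog post's actual method — is to penalize responses that are frequent in the corpus, so generic replies like "i don't know" score low:

```python
import math
from collections import Counter

# Hypothetical response corpus; frequent replies are the generic ones.
corpus = ["i don't know"] * 50 + ["the meeting is at noon"] * 2 + ["ok"] * 30

freq = Counter(corpus)
total = sum(freq.values())

def uniqueness(response):
    """Higher score = rarer response: -log P(response) under corpus counts."""
    return -math.log(freq[response] / total)

ranked = sorted(set(corpus), key=uniqueness, reverse=True)
print(ranked[0])  # → the meeting is at noon
```

A real system would normalize by context (how surprising is this response given the prompt), but even raw corpus frequency filters out the blandest candidates.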
This repo uses the Cornell Movie-Dialogs Corpus and a seq2seq neural net to implement the approach from the Google blog post.
We should also be able to leverage Reddit using the movie-corpus code I've written already.
Context
The real challenge is going to be handling context.
There's a way to handle context as proposed for the Ubuntu Dialogue Corpus: an affinity model that scores a candidate response against a context c (e.g. five consecutive utterances). The paper is here.
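If I'm reading the paper right, the scoring step is a bilinear affinity: encode the context and candidate response into vectors c and r, then compute sigmoid(cᵀ M r) with a learned matrix M. This numpy sketch shows just that scoring step, with random stand-in embeddings instead of trained encoders:

```python
import numpy as np

rng = np.random.default_rng(0)

def affinity(c, r, M):
    """Probability that response r follows context c: sigmoid(c^T M r)."""
    score = c @ M @ r
    return 1.0 / (1.0 + np.exp(-score))

d = 8                        # embedding dimension (toy size)
M = rng.normal(size=(d, d))  # learned bilinear map; random here for the sketch
c = rng.normal(size=d)       # context embedding (e.g. last 5 utterances encoded)
r = rng.normal(size=d)       # candidate response embedding

p = affinity(c, r, M)
print(p)
```

In the paper c and r come from recurrent encoders trained jointly with M; the bilinear form is what lets one context be scored cheaply against many candidate responses.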
Final Thoughts
The easiest approach would be to follow the paper to build a one-off for out-of-domain requests. A sort of pithy-response bot, as it were.