This is a "just for fun" project based on scikit-learn ML models.
The main idea is to create a Russian-language chatbot that determines the user's main intention from the input phrase and gives the most appropriate answer.
A ready-made dictionary of intentions and example answers to them (JSON) was used to train the bot.
Three different models were tested for determining the user's intention from the phrase. The results are below:
- Logistic Regression (model score = 0.3884)
- Random Forest Classifier (model score = 0.8281)
- MLP Classifier (model score = 0.8247)
So, I decided to use the Random Forest Classifier because it scored highest and is faster and less computationally expensive.
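A comparison like the one above could be sketched as follows. This is only an illustration: the toy phrases, the character n-gram `CountVectorizer`, and the hyperparameters are assumptions, not the project's actual setup, and for brevity the models are scored on the training data itself.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Hypothetical toy data; the real project trains on its JSON intent dictionary.
phrases = ["привет", "здравствуйте", "добрый день", "пока", "до свидания", "увидимся"]
intents = ["hello", "hello", "hello", "bye", "bye", "bye"]

# Assumption: character n-grams as features (the vectorizer is not named in this README).
vectorizer = CountVectorizer(analyzer="char", ngram_range=(1, 3))
X = vectorizer.fit_transform(phrases)

candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=0),
}

scores = {}
for name, model in candidates.items():
    model.fit(X, intents)
    # score() returns the mean accuracy on the data passed to it.
    scores[name] = model.score(X, intents)

for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

For a fairer comparison the scores would normally be computed on a held-out split rather than on the training phrases.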
The input word (phrase) is converted to lower case, then spaces and punctuation marks are removed from it.
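A minimal sketch of such a filter, using only the standard library. The function name `filter_text` and the exact punctuation set are assumptions made for illustration.

```python
import string

# Assumed punctuation set: ASCII punctuation plus Russian-style quotes and ellipsis.
PUNCTUATION = set(string.punctuation + "«»…")


def filter_text(text: str) -> str:
    """Lower-case the text, then drop whitespace and punctuation characters."""
    lowered = text.lower()
    return "".join(
        ch for ch in lowered if not ch.isspace() and ch not in PUNCTUATION
    )
```

For example, `filter_text("Привет, как дела?")` returns `"приветкакдела"`.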
Two words are compared in the body of this function: the input word and the one the model predicts.
The bot processes each input phrase as follows:
- Filter the input data;
- Try to find the answer directly in the dictionary;
- If that fails, use the ML model's intent prediction and take a random answer example from that intent group;
- Otherwise, use one of the Failure Phrases if the model score is not high enough;
- Repeat until the input phrase == one of exit_phrases.
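The steps above can be sketched roughly as follows. Everything named here is hypothetical: the `INTENTS` layout, `FAILURE_PHRASES`, `EXIT_PHRASES`, the `SCORE_THRESHOLD` value, and the `predict` callable (standing in for the trained model, returning an intent name and a score) are assumptions for illustration, not the project's real code.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical data; the real JSON dictionary layout is not shown in this README.
INTENTS: Dict[str, Dict[str, List[str]]] = {
    "hello": {"examples": ["привет", "здравствуйте"],
              "responses": ["Привет!", "Здравствуйте!"]},
    "bye": {"examples": ["пока"], "responses": ["До встречи!"]},
}
FAILURE_PHRASES = ["Я не понял.", "Перефразируйте, пожалуйста."]
EXIT_PHRASES = {"выход", "стоп"}
SCORE_THRESHOLD = 0.5  # assumed cut-off for trusting the model

Predictor = Callable[[str], Tuple[str, float]]


def normalize(text: str) -> str:
    """Lower-case and keep only letters and digits (drops spaces, punctuation)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def get_answer(phrase: str, predict: Predictor) -> str:
    cleaned = normalize(phrase)
    # Step 2: look for the phrase directly among the known examples.
    for data in INTENTS.values():
        if cleaned in {normalize(example) for example in data["examples"]}:
            return random.choice(data["responses"])
    # Step 3: fall back to the model's predicted intent.
    intent, score = predict(cleaned)
    if intent in INTENTS and score >= SCORE_THRESHOLD:
        return random.choice(INTENTS[intent]["responses"])
    # Step 4: the prediction is not confident enough.
    return random.choice(FAILURE_PHRASES)


def chat(predict: Predictor, read=input, write=print) -> None:
    # Step 5: keep answering until the user enters an exit phrase.
    while True:
        phrase = read()
        if normalize(phrase) in EXIT_PHRASES:
            break
        write(get_answer(phrase, predict))
```

Passing `read` and `write` as parameters keeps the loop easy to exercise without a console; calling `chat(predict)` with the defaults would read from standard input and print the replies.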