nerses0/ModelJack

ModelJack

Author: Nerses Nersesyan

ModelJack is a project for emulating interesting language APIs with simple models that run locally, avoiding the latency, costs, and request limits of the hosted services.

Example 1: Comment Toxicity

In the first example, we show how to train a model that emulates the Google Perspective API, using data from the Wikipedia Talk project and the fastText library.

The Perspective API is a demo released by the Google Jigsaw team. The API scores a comment based on its potential impact on a conversation, detecting personal attacks. More detailed information about the project can be found here.

Detecting and reducing toxic comments and personal attacks is very important for most platforms with user-generated content. The Perspective API is potentially very useful, but it is a demo limited to 1000 requests.

Can we emulate it so that developers can integrate this functionality into their platforms today?

Running

Download the data to this directory and run:

python ft_cls.py
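
The contents of `ft_cls.py` are not reproduced here. As a rough sketch of what a fastText classifier script of this kind could look like, the snippet below trains a supervised model and turns its prediction into a single attack score. The function names, the label names `__label__attack` and `__label__ok`, and the hyperparameter values are illustrative assumptions, not taken from the repository.

```python
def train(train_path, epoch=5, lr=0.5, word_ngrams=2):
    """Train a fastText supervised classifier on a file of '__label__X text' lines.

    The hyperparameter defaults are illustrative, not tuned values from this project.
    """
    import fasttext  # imported lazily so the helpers below work without the library

    return fasttext.train_supervised(
        input=train_path, epoch=epoch, lr=lr, wordNgrams=word_ngrams
    )


def attack_score(labels, probs, positive="__label__attack"):
    """Reduce fastText's (labels, probabilities) pair to one score for the positive label."""
    for label, prob in zip(labels, probs):
        if label == positive:
            return float(prob)
    return 0.0  # the positive label was not among the returned labels


def score_comment(model, text):
    """Score one comment; fastText's predict expects a single line of text."""
    labels, probs = model.predict(text.replace("\n", " "), k=-1)
    return attack_score(labels, probs)
```

With a training file in place, `model = train("train.txt")` followed by `score_comment(model, "some comment")` would return a value between 0 and 1, loosely comparable to a score from the hosted API. The file name `train.txt` is hypothetical.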

Results

See Results

Datasets

To train and evaluate the model, we used the Wikipedia Talk project dataset.

The Wikipedia Talk project release includes:

  1. a large historical corpus of discussion comments on Wikipedia talk pages

  2. a sample of over 100k comments with human labels for whether the comment contains a personal attack

  3. a sample of over 100k comments with human labels for whether the comment has an aggressive tone

Please refer to meta.wikimedia.org/wiki/Research:Detox/Data_Release for documentation of the schema of each data set.
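
Because each comment in the labeled samples is rated by several people, the labels have to be aggregated before training. The sketch below shows one possible way to do that and to emit lines in the `__label__<name> text` format that fastText's supervised mode reads. The majority-vote rule, the label names, and the `NEWLINE_TOKEN` / `TAB_TOKEN` placeholders are assumptions to check against the schema documentation linked above. The toy records are made up for illustration.

```python
from collections import defaultdict


def majority_labels(annotations):
    """Map rev_id -> True when more than half of its annotators flagged an attack."""
    votes = defaultdict(list)
    for rev_id, is_attack in annotations:
        votes[rev_id].append(bool(is_attack))
    return {rev_id: sum(v) > len(v) / 2 for rev_id, v in votes.items()}


def clean(comment):
    """Replace the assumed placeholder tokens, lowercase, and collapse whitespace."""
    text = comment.replace("NEWLINE_TOKEN", " ").replace("TAB_TOKEN", " ")
    return " ".join(text.lower().split())


def to_fasttext_line(comment, is_attack):
    """Format one example as a fastText supervised training line."""
    label = "__label__attack" if is_attack else "__label__ok"
    return f"{label} {clean(comment)}"


# Toy data: (rev_id, annotator vote) pairs and the comment text per rev_id.
annotations = [(1, 0), (1, 0), (1, 1), (2, 1), (2, 1), (2, 0)]
comments = {
    1: "Thanks for the edit!NEWLINE_TOKENLooks good.",
    2: "You are an  idiot",
}

labels = majority_labels(annotations)
lines = [to_fasttext_line(comments[r], labels[r]) for r in sorted(comments)]
print("\n".join(lines))
```

Writing `lines` to a text file, one example per line, gives an input that fastText's supervised trainer can read directly.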

References

Ex Machina: Personal Attacks Seen at Scale - documentation on the data collection and modeling methodology from Google and Wikimedia

Conversation AI - The Conversation AI Research Github Organization at Google
