ValidatesLamenessOf is a plugin that I developed to assist in clearing up some of the negative behavior shown in our user comments. I’ve observered that user’s that exhibit a set of behaviors, are likely to exhibit many more. In other words, a good user is likely to leave a comment like “This is a good comment”, whereas a bad user is likely to write something like “OMG THIS IS SOU FUNNY!!!!!! :) lol xoxo”. While the first isn’t really lame at all, the second is lame in many different ways.
The idea behind this plugin is that if we filter for obviously lame things, like an abundance of ALL CAPS words, or excessive exclamation marks, we can use these lame and non-lame comments to train a Bayesian filter to automatically classify comments over time as the definition of lameness evolves. Comments can presently be reported for filtering, although there doesn’t yet exist a validation to check against the filter, as further testing is needed. Over time, additional lameness reporting validations will be added as user behaviors change.
ValidatesLamenessOf is available as a traditional plugin. To install it as a traditional plugin (Rails 2.1 or later):
script/plugin install git://github.com/jcnnghm/validates_lameness_of.git
To use the bayesian filtering, you’ll need to install the classifier gem, and the madeleine gem. To do this, add the following to your environment.rb, then run rake gems:install.
config.gem 'classifier' config.gem 'madeleine'
The validations work like any other validation.
class Comment < ActiveRecord::Base validate_capitilization_of :comment, :minimum_size => 5, :maximum_uppercase_percentage => 40, :report_lameness => true validate_exclamation_marks_of :comment, :maximum_in_composition => 2, :maximum_together => 1, :report_lameness => true end
The only difference is the report_lameness option, which, when set to true, will automatically train a bayesian filter for each field.
You may need to configure ValidatesLamenessOf for your setup. The configuration required is:
String that sets the directory that will be used to store persisted bayesian filter data. In your production environment, you will probably want to set this somewhere that persists between deployments.
ValidatesLamenessOf.data_directory = 'tmp/lameness_data' # sets for current environment (also the default)
-
GitHub Repsitory: github.com/jcnnghm/validates_lameness_of
Based, in part, on the following:
-
Alex Dunae’s validates_email_format_of: github.com/alexdunae/validates_email_format_of
-
Tommaso Baldovino’s Unit Testing Blog Post: usefulfor.com/ruby/2009/01/22/unit-testing-your-ruby-on-rails-plugin/
-
jrhicks on Stack Overflow, who suggested a Bayesian Classifier: stackoverflow.com/questions/1602740/rails-lameness-filter
Copyright © 2009 Justin Cunningham, released under the MIT license