Parserbot

Parserbot is your one-stop shop for natural language parsing, tagging, and entity extraction. It wraps a variety of services and APIs into one app for easy parsing and cross-referencing. The services it currently exposes are listed under "Setting up specific NLP services" below.

Built on Flask for the Artbot project.

Setup

Tested with Python 2.7.x. Setting up within a virtualenv is recommended, or even better, with virtualenvwrapper. After cloning the repo and activating the virtualenv:

  • pip install .
  • Run python key.py. It prints a secret key and an auth header token; save these in environment variables (e.g. in a shell profile or a .env file). The script is a convenience that you can run as many times as you like.
  • Run python run.py to start the server.
  • Navigate to http://localhost:3000 and you should see a welcome message.

NOTE: All services require a PARSERBOT_SECRET_KEY environment variable.

Setting up specific NLP services:

Stanford NER -- /stanford
  • you must have Java of some flavor installed
  • pip install nltk==3.0.1
OpenCalais -- /opencalais
Zemanta -- /zemanta
Freebase
  • not currently configured. If you set it up, let us know!

Use

Python example:

import json
import requests

# Replace <YOUR_TOKEN_HERE> with the auth header token generated by key.py
headers = {'Authorization': '<YOUR_TOKEN_HERE>', 'Content-Type': 'application/json'}
data = json.dumps({'payload': 'This is a test for a man named Pablo Picasso'})
r = requests.post('http://localhost:3000/stanford', data=data, headers=headers)

If you want to experiment in an interactive shell, run python shell.py.
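To send the same text to more than one service for cross-referencing, the request can be wrapped in a small helper. The following is a minimal sketch, not part of Parserbot itself: the names build_request and parse are hypothetical, and it assumes that /opencalais and /zemanta accept the same JSON payload shape as the /stanford example above, which this README does not confirm.

```python
import json

BASE_URL = 'http://localhost:3000'  # local server started by run.py
# Endpoints named in this README; assumed to share one request format.
ENDPOINTS = ('stanford', 'opencalais', 'zemanta')


def build_request(endpoint, text, token, base_url=BASE_URL):
    """Return (url, headers, body) for a POST to one Parserbot endpoint."""
    if endpoint not in ENDPOINTS:
        raise ValueError('unknown endpoint: %s' % endpoint)
    url = '%s/%s' % (base_url.rstrip('/'), endpoint)
    headers = {'Authorization': token, 'Content-Type': 'application/json'}
    body = json.dumps({'payload': text})
    return url, headers, body


def parse(endpoint, text, token, base_url=BASE_URL):
    """Send the text to one endpoint and return the raw response object."""
    import requests  # imported here so build_request works without it
    url, headers, body = build_request(endpoint, text, token, base_url)
    return requests.post(url, data=body, headers=headers)
```

With the server running, something like `[parse(e, 'Pablo Picasso', token) for e in ENDPOINTS]` would collect one response per service. The response format is not described here, so the sketch returns the response object untouched.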

Tests

Tests are built for local setup only for now:

  • pip install pytest pytest-flask
  • python setup.py test

Documentation

You can find the docs in the docs subfolder. To generate new docs:

  • pip install sphinx
  • sphinx-build docs/source docs

Deployment

Currently set up to deploy on Heroku: configure the environment variables you need and it should be good to go. Heroku may also complain unless a JAVAHOME variable is set for the /stanford endpoint. A sample config:

DEBUG="False"
PARSERBOT_SECRET_KEY="<KEY_HERE>"
CALAIS_API_KEY="<KEY_HERE>"
ZEMANTA_API_KEY="<KEY_HERE>"
JAVAHOME="/usr/bin/java"
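Before deploying, it can help to check which of these variables are still unset. The script below is a rough sketch under stated assumptions: the variable names come from the sample config above, but the grouping of variables by service and the function name missing_config are illustrative guesses, not something Parserbot provides.

```python
import os

# Variable names are taken from the sample config; which service needs
# which variable is an assumption made for illustration.
REQUIRED = {
    'all services': ['PARSERBOT_SECRET_KEY'],
    '/opencalais': ['CALAIS_API_KEY'],
    '/zemanta': ['ZEMANTA_API_KEY'],
    '/stanford': ['JAVAHOME'],
}


def missing_config(env=None):
    """Return {service: [unset variable names]} for the given environment."""
    env = os.environ if env is None else env
    missing = {}
    for service, names in REQUIRED.items():
        absent = [name for name in names if not env.get(name)]
        if absent:
            missing[service] = absent
    return missing


if __name__ == '__main__':
    # Print one line per service that still lacks configuration.
    for service, names in sorted(missing_config().items()):
        print('%s: missing %s' % (service, ', '.join(names)))
```

An empty result means every variable in the sample config has a non-empty value. It does not verify that the keys themselves are valid.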

Future

Parsers to add someday:

License

Copyright (C) 2015 Massachusetts Institute of Technology

This program is free software; you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation (http://opensource.org/licenses/GPL-2.0).
