Parserbot is your one-stop shop for natural language parsing, tagging, and entity extraction. It wraps a variety of services and APIs into one app for easy parsing and cross-reference. Currently:
- Stanford NER
- DBpedia
- OpenCalais
- Zemanta
- roughly, Freebase
Built for the Artbot project on Flask.
Tested with Python 2.7.x. Setting up inside a virtualenv is recommended; virtualenvwrapper makes this even more convenient. After cloning the repo and activating the virtualenv:
- run pip install .
- run python key.py. It will spit out a secret key and auth header token; save these in environment variables (e.g. a shell profile, .env file, etc.). This is a convenience function that you can run as many times as you like.
- run python run.py to start the server
- navigate to http://localhost:3000 and you should see a welcome message
NOTE: All services require a PARSERBOT_SECRET_KEY environment variable.
Setting up specific NLP services:
- Stanford NER: you must have Java of some flavor installed, then run pip install nltk==3.0.1
- get an OpenCalais API key and set it as a CALAIS_API_KEY environment variable
- get a Zemanta API key and set it as a ZEMANTA_API_KEY environment variable
- not currently configured. If you set it up, let us know!
Python example:
import json
import requests

headers = {'Authorization': '<YOUR_TOKEN_HERE>', 'Content-Type': 'application/json'}
data = json.dumps({'payload': 'This is a test for a man named Pablo Picasso'})
r = requests.post('http://localhost:3000/stanford', data=data, headers=headers)
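If you call more than one endpoint, it can help to build the request pieces in one place. The following is a minimal sketch, not part of Parserbot: the helper name build_parse_request and the BASE_URL constant are hypothetical, and only the /stanford endpoint, the payload field, and the header names are taken from the example above.

```python
import json

# Hypothetical constant for this sketch: the local server address
# used in the setup steps.
BASE_URL = 'http://localhost:3000'


def build_parse_request(endpoint, text, token):
    """Return (url, headers, body) for posting `text` to a Parserbot endpoint.

    Hypothetical helper; it only assembles the request and sends nothing.
    """
    # Tolerate both 'stanford' and '/stanford'.
    url = '{0}/{1}'.format(BASE_URL, endpoint.strip('/'))
    headers = {'Authorization': token, 'Content-Type': 'application/json'}
    body = json.dumps({'payload': text})
    return url, headers, body


url, headers, body = build_parse_request(
    'stanford', 'This is a test for a man named Pablo Picasso',
    '<YOUR_TOKEN_HERE>')
print(url)
print(body)
```

The returned values can be passed straight to requests.post(url, data=body, headers=headers). Keeping the network call out of the helper makes it easy to check the request without a running server.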
If you want to play in the shell, you can use python shell.py
Tests are built for local setup only for now:
pip install pytest pytest-flask
python setup.py test
You can find the docs in the docs subfolder. To generate new docs:
pip install sphinx
sphinx-build docs/source docs
Currently set up to deploy on Heroku; configure the environment variables you need and it should be good to go. Heroku may also complain that a JAVAHOME variable needs to be set for the /stanford endpoint. A sample config:
DEBUG="False"
PARSERBOT_SECRET_KEY="<KEY_HERE>"
CALAIS_API_KEY="<KEY_HERE>"
ZEMANTA_API_KEY="<KEY_HERE>"
JAVAHOME="/usr/bin/java"
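Before deploying, it may be worth checking which of these variables are actually set. This is a rough sketch rather than anything shipped with Parserbot: the variable names come from the sample config above, while the function name missing_variables and the split into required and optional groups are assumptions made for illustration.

```python
import os

# PARSERBOT_SECRET_KEY is required by all services; treating the
# remaining variables as optional is an assumption of this sketch.
REQUIRED = ['PARSERBOT_SECRET_KEY']
OPTIONAL = ['CALAIS_API_KEY', 'ZEMANTA_API_KEY', 'JAVAHOME']


def missing_variables(names, environ=None):
    """Return the names in `names` that are unset or empty in `environ`."""
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]


print('missing required: {0}'.format(missing_variables(REQUIRED)))
print('missing optional: {0}'.format(missing_variables(OPTIONAL)))
```

Passing a plain dict as environ lets you check a config without touching the real environment.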
Parsers to add someday:
Copyright (C) 2015 Massachusetts Institute of Technology
This program is free software; you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation (http://opensource.org/licenses/GPL-2.0).