Check out a Live Demo of the search engine webapp integrated with Elasticsearch here.
- Dockerized real-time tweet streaming to MongoDB based on search rules; Tweepy is used to connect to the Twitter API;
- MongoDB collection is continuously synced with an Elasticsearch index using Monstache;
- MongoDB queried with Mongo Express, a web-based MongoDB admin interface;
- Kibana used to visualize and search tweets.
- Flask search webapp served by nginx.
All components of the project are dockerized. The Streaming Client is built from `twitter_stream/Dockerfile` and the Search Webapp from `flask_search/Dockerfile`; all remaining containers are created from Docker Hub images. A minimal sketch of the streaming client's core loop is shown below.
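For orientation, this is roughly what the streaming client does, assuming Tweepy 4.x and pymongo; the names and structure are illustrative, and the actual implementation in `twitter_stream/` may differ.

```python
# Minimal sketch of the streaming client (assumed Tweepy 4.x + pymongo; illustrative only).
import os

import tweepy
from pymongo import MongoClient


class TweetStream(tweepy.StreamingClient):
    """Streams tweets matching the configured rule and inserts them into MongoDB."""

    def __init__(self, bearer_token, collection):
        super().__init__(bearer_token)
        self.collection = collection

    def on_tweet(self, tweet):
        # Store the raw tweet payload; Monstache syncs it to Elasticsearch.
        self.collection.insert_one(tweet.data)


if __name__ == "__main__":
    mongo = MongoClient(os.environ["MDB_HOST_NAME"])
    collection = mongo[os.environ["MDB_DATABASE_NAME"]][os.environ["MDB_COLLECTION_NAME"]]

    stream = TweetStream(os.environ["BEARER_TOKEN"], collection)
    stream.add_rules(tweepy.StreamRule(os.environ["SEARCH_RULE"]))
    stream.filter()
```

Monstache then picks up each inserted document from the MongoDB change stream and syncs it into the Elasticsearch index.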
- Clone the repo:
$ git clone https://github.com/tngaspar/twitter-stream-mongo.git
- Create a `.env` file in the project root folder with the following parameters:
API_KEY=[Twitter API key]
API_SECRET_KEY=[Twitter API secret key]
BEARER_TOKEN=[Twitter API bearer token]
MDB_HOST_NAME=mongodb://root:[Password]@mongo:27017/
MDB_DATABASE_NAME=tweetdb
MDB_COLLECTION_NAME=tweets
SEARCH_RULE=[Twitter Filtered Stream rule]
MONGODB_ROOT_PASSWORD=[choose Password]
MONGODB_REPLICA_SET_KEY=[choose ReplicaKey]
Replace all fields between brackets. You may find the Twitter documentation for the SEARCH_RULE
here. By default the rule has `lang:en`, `-is:retweet` and `-is:reply` implicit, so there is no need to add these operators.
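As an illustration (not a value taken from the repo), a rule such as the following would match English-language tweets mentioning jobs together with data or software engineering:

SEARCH_RULE=(data OR "software engineer") jobs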
- Add the password to `mongo-url` in `monstache/monstache.config.toml`:
mongo-url = "mongodb://root:[Password]@mongo:27017"
Replace the field between brackets.
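The file in the repo is authoritative; for orientation, a typical Monstache configuration for this stack might look like the sketch below (the `elasticsearch-urls` host name and the namespace entries are assumptions based on a typical docker-compose service name and the `.env` values above):

```toml
# Sketch only — see monstache/monstache.config.toml in the repo for the real settings.
mongo-url = "mongodb://root:[Password]@mongo:27017"
elasticsearch-urls = ["http://elasticsearch:9200"]   # assumed service name
direct-read-namespaces = ["tweetdb.tweets"]          # initial sync of existing documents
change-stream-namespaces = ["tweetdb.tweets"]        # keep the index in sync afterwards
resume = true
```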
- In the project root directory run docker-compose:
$ docker-compose up -d
After this, all containers should be up and running and streaming should have started.
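To verify, you can list the containers and tail the streaming client's logs; the service name used below is an assumption, so check `docker-compose.yml` for the actual one:

$ docker-compose ps
$ docker-compose logs -f twitter_stream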
If running locally, you can check MongoDB through Mongo Express at `localhost:8081` and search the gathered tweets in Kibana at `localhost:5601`.
The search webapp should also be up and accessible at `0.0.0.0` and `localhost` (port 80).
Kibana allows search and analysis of tweet data from Elasticsearch.
This dashboard may be imported into Kibana by navigating to Stack Management > Saved Objects > Import and importing the file `doc/kibana_dashboard.ndjson`.
Kibana uses syntax from Apache Lucene to query and filter data. Find out more here.
Here's a simple example:
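Assuming the tweet text is indexed under a field named `text` (an assumption about the mapping produced by Monstache), a query such as `text:remote AND NOT text:hybrid` returns tweets that mention remote but not hybrid.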
The `flask_search` webapp provides a user interface to Elasticsearch's search functionality, acting as a search engine over the records present in the index.
The main page shows the search bar and a snapshot of the Kibana Dashboard.
Search example with tweets gathered using `software engineer`, `data`, `jobs`, and other related keywords as the streaming search rule.
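For orientation, the kind of query the webapp could issue against the synced index might look like the sketch below; the index name, field name, and Elasticsearch endpoint are assumptions (Monstache typically names the index after the MongoDB namespace), and the elasticsearch-py 8.x client is assumed rather than taken from the repo.

```python
# Sketch of a full-text search against the synced index (names are assumptions, see above).
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local Elasticsearch endpoint


def search_tweets(query, size=10):
    """Run a match query over the tweet text field and return the matching documents."""
    response = es.search(
        index="tweetdb.tweets",              # assumed index name (MongoDB namespace)
        query={"match": {"text": query}},    # full-text match on the tweet text
        size=size,
    )
    return [hit["_source"] for hit in response["hits"]["hits"]]


for tweet in search_tweets("software engineer jobs"):
    print(tweet.get("text", ""))
```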