A Kafka pipeline that filters Twitter feeds by keywords and users' follower counts and streams them into a local Elasticsearch server in real time
Platform requirements
Java 8
Kafka
ElasticSearch
Follow these steps to get a development environment running
Install Java SE 8
brew cask install java8
Or the latest Java version
brew cask install java
Install Kafka by following the Kafka Documentation, or use Docker for quick provisioning: [Kafka Stack With Docker](https://github.com/simplesteph/kafka-stack-docker-compose)
If you chose the second method, change into the directory of the cloned GitHub repository
cd kafka-stack-docker-compose
Bring up the docker container
docker-compose -f zk-single-kafka-single.yml up
Bring down the docker container when you are done
docker-compose -f zk-single-kafka-single.yml down
Install Elasticsearch by following the Elasticsearch installation guide
- Open the project with your favourite Java IDE (Eclipse, Netbeans, IntelliJ, ...)
- Obtain a Twitter developer token from the Twitter Developer Dashboard
- Place the associated tokens into the kafka-producer-twitter module.
- Run the Main method of the producer module (kafka-producer-twitter) to start the Kafka producer
- After getting Elasticsearch up on port 9200, run either the kafka-consumer-elasticsearch module to consume every tweet or the kafka-streams-filter-tweets module to consume only tweets that meet specific conditions (see the code to modify the behaviour)
- Then test by making a GET request to the endpoint localhost:9200/twitter/_search to query all the consumed data. See the Elasticsearch documentation for further information
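The final check can also be done from Java instead of a browser or curl. The following is a minimal sketch, not part of the project: the class name `TweetSearchCheck` and its helper methods are hypothetical, and it assumes Elasticsearch is reachable at localhost:9200 with an index named `twitter`, as in the endpoint above. It uses only the JDK, so it runs on Java 8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the project modules.
class TweetSearchCheck {

    // Builds the _search URL for an index, e.g. http://localhost:9200/twitter/_search
    static String searchUrl(String host, int port, String index) {
        return "http://" + host + ":" + port + "/" + index + "/_search";
    }

    // Sends a GET request and returns the raw response body (JSON text).
    static String get(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // milliseconds
        conn.setReadTimeout(5000);    // milliseconds
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Assumed host, port, and index, taken from the endpoint in the steps above.
        String url = searchUrl("localhost", 9200, "twitter");
        try {
            System.out.println(get(url));
        } catch (IOException e) {
            System.err.println("Could not query " + url + ": " + e.getMessage());
        }
    }
}
```

If the consumer has indexed any tweets, the printed JSON should contain them; if Elasticsearch is not running, the sketch prints an error message instead of a stack trace.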
- Kafka - Real-time data pipelines and streaming apps
- Maven - Dependency Management
- Elasticsearch - Search engine with a built-in server that provides fast indexing.
- Nhan Le - Initial work - Nhan Le
This project is licensed under the MIT License - see the LICENSE.md file for details
- Amazing course on Kafka: Kafka Beginner