This repo contains all the material related to the implementation of my dissertation.
Configuration files for Apache Hadoop 2.7.7.
All the Ansible playbooks used to install, configure and run:
- Prometheus;
- Grafana;
- InfluxDB.
Files used to adjust the Python/MQTT_Kafka_bridge script to run the right 'test case scenario'.
A few scripts that illustrate how to run the (py)Spark Data_processing scripts on the cluster (master node).
Files used to adjust the Python/Sensor_Emulator script to run the right 'test case scenario'.
All the Ansible playbooks used to install, configure and run:
- Hadoop;
- MQTT Broker;
- Kafka;
- Spark;
- Prometheus exporters.
Application that cleans the 'edge_data' InfluxDB database (used between runs of the 'test case scenarios').
(py)Spark code that processes the data from the Kafka broker in (near) real time.
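As a rough illustration of this kind of processing, the sketch below reads a Kafka topic with Spark Structured Streaming. The broker address (`localhost:9092`), topic name (`sensor_data`) and message schema are assumptions, not the repo's actual values, and running the Spark part requires submitting with the `spark-sql-kafka` package on the classpath. The JSON parser is kept as a plain function so it works without Spark installed.

```python
import json


def parse_sensor_message(raw):
    """Decode one Kafka message value (JSON-encoded bytes) into a dict."""
    return json.loads(raw.decode("utf-8"))


def main():
    # pyspark is imported here so parse_sensor_message above stays
    # importable on machines without Spark
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("kafka_stream_sketch").getOrCreate()

    # assumed message shape: {"sensor_id": "...", "value": <float>}
    schema = (StructType()
              .add("sensor_id", StringType())
              .add("value", DoubleType()))

    # subscribe to the (assumed) 'sensor_data' topic on a local broker
    stream = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")
              .option("subscribe", "sensor_data")
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("m"))
              .select("m.*"))

    # print each micro-batch to the console until interrupted
    stream.writeStream.format("console").start().awaitTermination()


if __name__ == "__main__":
    main()
```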
Simple MQTT (Mosquitto) -> Kafka bridge. Publish to an MQTT topic and have your message republished into a Kafka topic.
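A bridge of this kind can be sketched as below, assuming `paho-mqtt` and `kafka-python` as the client libraries and local Mosquitto/Kafka instances; the `sensors/#` subscription filter and the topic-name mapping are illustrative choices, not the repo's actual configuration.

```python
def mqtt_to_kafka_topic(mqtt_topic):
    """Map an MQTT topic path such as 'sensors/rpi3/temp' to a
    Kafka-friendly name ('sensors.rpi3.temp'), since Kafka topic
    names cannot contain '/'."""
    return mqtt_topic.strip("/").replace("/", ".")


def main():
    # third-party clients imported here so the helper above stays
    # importable without them
    import paho.mqtt.client as mqtt
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    def on_message(client, userdata, msg):
        # republish every MQTT payload into the mapped Kafka topic
        producer.send(mqtt_to_kafka_topic(msg.topic), msg.payload)

    client = mqtt.Client()
    client.on_message = on_message
    client.connect("localhost", 1883)
    client.subscribe("sensors/#")  # assumed topic filter
    client.loop_forever()


if __name__ == "__main__":
    main()
```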
Sensor emulator able to generate random data, emulating one or multiple sensors.
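The core of such an emulator can be sketched with the standard library alone; the field names (`temperature`, `humidity`) and value ranges below are invented for illustration and are not necessarily what the repo's emulator produces.

```python
import json
import random
import time


def make_reading(sensor_id, rng):
    """One fake sensor sample: id, Unix timestamp, and two random values."""
    return {
        "sensor_id": sensor_id,
        "timestamp": int(time.time()),
        "temperature": round(rng.uniform(15.0, 35.0), 2),  # assumed range, °C
        "humidity": round(rng.uniform(30.0, 90.0), 2),     # assumed range, %
    }


def emulate(n_sensors, rng):
    """JSON payloads for one round of readings from n_sensors emulated sensors."""
    return [json.dumps(make_reading(f"sensor-{i}", rng))
            for i in range(n_sensors)]
```

Each payload could then be published to an MQTT topic (one per emulated sensor) in a loop with a configurable interval.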
Simple full end-to-end example of how to use InfluxDB with Python.
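A minimal write/read round trip with the `influxdb` Python client (InfluxDB 1.x) might look like the sketch below; the host, database name and measurement are placeholders, not necessarily those used in the repo.

```python
def make_point(measurement, tags, fields):
    """Build one point in the dict shape that the influxdb client's
    write_points() expects."""
    return {"measurement": measurement, "tags": tags, "fields": fields}


def main():
    # client imported here so make_point stays importable without the library
    from influxdb import InfluxDBClient

    client = InfluxDBClient(host="localhost", port=8086, database="edge_data")
    client.create_database("edge_data")  # no-op if it already exists

    # write one sample point, then read it back
    client.write_points([make_point("temperature",
                                    {"sensor": "sensor-0"},
                                    {"value": 21.5})])
    print(client.query("SELECT * FROM temperature LIMIT 5"))


if __name__ == "__main__":
    main()
```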
Simple example of the usage of (py)Spark, with Spark running under YARN/Hadoop.
All the results (screenshots + .xls with the numbers) collected from the monitoring tool for each test case scenario executed.
The idea here is to test the code implementation using three virtual machines configured to be as similar as possible to the Raspberry Pis.
### Raspbian Desktop
- Virtual Machine Engine: Virtual Box
- OS: Debian Stretch with Raspberry Pi Desktop (Debian 9)
- RAM: 2GB
- Storage: 10GB (32GB for the RPi3)
- Processor: 1 CPU
- Machine 1 name: RPi3_new (OS: rpi3) static_ip: 192.168.56.103
- Machine 2 name: RPi4 (OS: rpi4) static_ip: 192.168.56.202
- Machine 3 name: RPi5 (OS: rpi5) static_ip: 192.168.56.203
- Raspberry Pi Configuration > System > hostname = rpi3 (and 'rpi4', 'rpi5', ...)
- Raspberry Pi Configuration > Interfaces > SSH: Enable
- Edit the network configuration:

  ```
  sudo nano /etc/dhcpcd.conf
  ```

  Add at the end of the file:

  ```
  interface eth1
  static ip_address=192.168.56.<change_for_the_number_that_you_want_for_that_machine>/24
  static routers=192.168.56.1
  static domain_name_servers=192.168.56.1
  ```
- https://www.codesandnotes.be/2018/10/16/network-of-virtualbox-instances-with-static-ip-addresses-and-internet-access/ (until 'Port-forwarding')
- https://developer.ibm.com/recipes/tutorials/building-a-hadoop-cluster-with-raspberry-pi/
- https://www.linode.com/docs/databases/hadoop/how-to-install-and-set-up-hadoop-cluster/