This repository contains three PySpark-based Python scripts.
ddos_detector.py: A DDoS detector that analyzes Apache logs read from a Kafka server. It counts the number of requests per IP address in a moving window and writes to disk any IP address that makes more requests than a set threshold.
load_data.py: A script that loads Apache logs into a Kafka server. It chunks the data by the Apache timestamp. For demonstration purposes, the script may wait between sending batches to the server.
live_view.py: Watches a folder for text files written by PySpark.
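
The moving-window count used by ddos_detector.py can be illustrated without Spark or Kafka. The following is a minimal plain-Python sketch of the idea, not the script's actual code: the function name `find_offenders`, the parameters `window_seconds` and `threshold`, and their default values are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def find_offenders(events, window_seconds=60, threshold=100):
    """Return the IPs that exceed `threshold` requests within any window.

    `events` is an iterable of (timestamp_in_seconds, ip) pairs, ordered by
    timestamp. All names and defaults here are hypothetical.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    offenders = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Drop requests that have fallen out of the moving window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        if len(window) > threshold:
            offenders.add(ip)
    return offenders
```

For example, `find_offenders(events, window_seconds=10, threshold=3)` flags any IP that makes more than three requests within ten seconds. In the PySpark script the same counting would be expressed with Spark's own windowing rather than an in-memory dictionary.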
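
Chunking by the Apache timestamp, as load_data.py is described as doing, can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: it assumes the standard Apache access-log timestamp in square brackets (for example `[10/Oct/2000:13:55:36 -0700]`), it assumes one batch per distinct timestamp, and the names `parse_timestamp` and `chunk_by_timestamp` are hypothetical. Sending each batch to Kafka is left out.

```python
import re
from datetime import datetime
from itertools import groupby

# Matches the bracketed timestamp field of an Apache access-log line.
TIMESTAMP_RE = re.compile(r"\[([^\]]+)\]")


def parse_timestamp(line):
    """Extract the request time from one Apache access-log line."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        raise ValueError("no timestamp found in line: %r" % line)
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def chunk_by_timestamp(lines):
    """Yield (timestamp, batch) pairs, one per run of lines sharing a timestamp.

    Assumes the lines are already in chronological order, as in a log file.
    """
    for ts, group in groupby(lines, key=parse_timestamp):
        yield ts, list(group)
```

A loader could iterate over `chunk_by_timestamp(open_log_file)` and optionally sleep between batches to mimic live traffic.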