- We used 230 million records from bus sensors in Dublin, collected between July 2017 and September 2018, as stream data.
- We used TripAdvisor attractions' ratings that we scraped from the TripAdvisor website.
- We matched this attractions data with attractions from the Open Data Ireland website: link.
- We also used bus stop data that contains the geo-location of each bus stop in Dublin. This data was downloaded from the Smart Dublin website: link.
All relevant data is in the data directory, except for the Dublin data, which is received from the VM or from user input.
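As a rough illustration only (file names and columns here are placeholders; the actual loading code lives in create_all_static_data_dfs.ipynb), the static files under the data directory can be read with PySpark like this:

```python
# Illustrative sketch of loading the static data files with PySpark.
# File names and columns are placeholders, not the project's actual ones.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("static-data").getOrCreate()

# Bus stops with their geo-location (placeholder file name).
bus_stops_df = spark.read.csv("data/bus_stops.csv", header=True, inferSchema=True)

# TripAdvisor attractions' ratings (placeholder file name).
attractions_df = spark.read.json("data/attractions.json")

bus_stops_df.printSchema()
attractions_df.printSchema()
```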
Apache Spark and Jupyter Notebook as processing frameworks.
Elasticsearch as the data warehouse.
- Processing is accomplished using Spark 2.4.5 (PySpark) and Python 3.7.5.
- Please look at the Requirements directory for the required libraries.
- Instructions below assume that the code will run on Databricks.
- Since we use Elasticsearch, in order to run the code you will need to run the following command in your VM's terminal: 'sudo docker-compose up -d'
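Once the containers are up, you can quickly confirm that Elasticsearch is reachable with a simple HTTP check; localhost:9200 is only the usual default, so adjust the host and port to match your docker-compose.yml:

```python
# Sanity check that Elasticsearch responds after 'sudo docker-compose up -d'.
# localhost:9200 is the common default and may differ in your setup.
import requests

resp = requests.get("http://localhost:9200")
print(resp.status_code)   # 200 means the cluster is responding
print(resp.json())        # basic cluster info (name, version, ...)
```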
- Code is written as if we read the stream data from a Kafka server; see the cell 'read stream data' in final_app.ipynb.
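For reference only, reading a stream from Kafka with Spark Structured Streaming generally looks like the sketch below; the bootstrap server and topic name are placeholders, and the project's actual read is in the 'read stream data' cell of final_app.ipynb:

```python
# Generic sketch of reading stream data from Kafka with Spark Structured Streaming.
# Requires the spark-sql-kafka-0-10 package matching your Spark version (2.4.5 here).
# The bootstrap server and topic below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("dublin-bus-stream").getOrCreate()

stream_df = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "dublin_bus")                     # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary, so cast the value to string before parsing.
messages_df = stream_df.select(col("value").cast("string").alias("json_value"))
```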
In order to use the app (as a user), please open the dashboard link and follow the instructions: AppDashboard.
In order to run the app code, please run the following files in this order:
- schema_matching_NLP.ipynb
- create_all_static_data_dfs.ipynb
- final_app.ipynb
  → Notice: this file includes the code that creates the Delay stream data and uploads it to Elasticsearch. Please type your Elasticsearch host in the 'imports' cell.
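As a rough sketch (not the notebook's exact code), uploading a DataFrame to Elasticsearch from PySpark typically goes through the elasticsearch-hadoop connector; the host is the value you type in the 'imports' cell, and the index name and columns below are placeholders:

```python
# Illustrative upload of a DataFrame to Elasticsearch via the elasticsearch-hadoop
# (elasticsearch-spark) connector. Host, index name, and columns are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("es-upload").getOrCreate()

# Tiny stand-in for the Delay stream data.
delay_df = spark.createDataFrame([("stop_1", 120)], ["stop_id", "delay_seconds"])

es_host = "YOUR_ELASTICSEARCH_HOST"   # the host you enter in the 'imports' cell

(delay_df.write
    .format("org.elasticsearch.spark.sql")   # needs the elasticsearch-hadoop package
    .option("es.nodes", es_host)
    .option("es.port", "9200")               # default Elasticsearch REST port
    .mode("append")
    .save("delay_stream"))                   # placeholder index name
```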
In order to run the final app, choose one of the options at the top of the notebook:
- For Stream Sources, enter your API in the "API" option.
- For Batch Sources, enter your JSON path in the "Json path" option.
- For a single source, choose one of the bus stop options presented in the "Source Bus Stop" option.
The input data must include the same DataFrame columns as described in final_app.ipynb, in the cell 'dublin data schema'.
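As a small illustrative check before running the app (the column names below are placeholders; the real ones are listed in the 'dublin data schema' cell), you can verify that your input DataFrame has the required columns:

```python
# Placeholder column check; replace expected_columns with the columns listed in
# the 'dublin data schema' cell of final_app.ipynb.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("schema-check").getOrCreate()

expected_columns = {"stop_id", "timestamp", "delay"}    # placeholders only

input_df = spark.read.json("path/to/your/input.json")   # your "Json path" input

missing = expected_columns - set(input_df.columns)
if missing:
    raise ValueError(f"Input data is missing required columns: {missing}")
```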
In order to run the Warm Up part, please run all files in the warmup_task directory in the following order:
- preprocess_n_save_external_data.ipynb
- train_lr_model_task_2.ipynb
- train_lr_task_3.ipynb
- warmup_final.ipynb
Notice: some of the notebooks create data that is necessary for the notebooks that come after them.