This repository develops and applies machine learning techniques to predict apartment prices in Prague.
Ensure you set up a Python environment before running the code. You can use one of the following commands:
conda env create -f environment.yml
conda create -n re python=3.10.6 && conda activate re && pip install -r requirements.txt
Python Version: The codebase uses Python 3.10.6.
For detailed instructions, use: python main.py --help
Hyperparameter Search: python main.py --train --tune
(data loaded from ../data/dataset.csv)
Default Prediction: python main.py
(runs prediction on data from ../data/dataset.csv and saves results in ../data/result.csv)
Training with New Data: python main.py --train --scrape
(scrapes new data and performs training)
Run a local web server: streamlit run web.py
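The flags listed above could be wired together roughly as follows. This is only a minimal sketch, not the actual contents of main.py: the function names `build_parser` and `plan_steps` are hypothetical, and the rule that `--tune` and `--scrape` only take effect together with `--train` is an assumption. Run `python main.py --help` for the real behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the three flags mentioned in this README."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Predict apartment prices in Prague.",
    )
    parser.add_argument("--train", action="store_true",
                        help="train the model instead of running prediction")
    parser.add_argument("--tune", action="store_true",
                        help="run a hyperparameter search before training")
    parser.add_argument("--scrape", action="store_true",
                        help="scrape new data before training")
    return parser


def plan_steps(argv: list[str]) -> list[str]:
    """Translate command-line arguments into an ordered list of step names."""
    args = build_parser().parse_args(argv)
    if not args.train:
        # Assumption: without --train the other flags are ignored
        # and the default prediction run is performed.
        return ["predict"]
    steps = []
    if args.scrape:
        steps.append("scrape")
    if args.tune:
        steps.append("tune")
    steps.append("train")
    return steps
```

For example, `plan_steps(["--train", "--scrape"])` yields the scrape step followed by the train step, while an empty argument list yields only the prediction step.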
The processing runs in two phases:
- Training Phase: Crawlers obtain all advertisements.
- Inference Phase: Users provide advertisement URL/data via the web app.
ETL class: Handles data acquisition and preprocessing.
Model class: Operates on data preprocessed by ETL.
The model is based on XGBoost.
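The split between the two classes can be illustrated with a small sketch. It does not reproduce the repository's code: the column names (`area_m2`, `rooms`, `price`), the `Dataset` container, and all method names are assumptions made for illustration. Only the use of XGBoost is taken from this README, here through the scikit-learn style `XGBRegressor` wrapper.

```python
import csv
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical container passed from ETL to Model."""
    features: list[list[float]]
    targets: list[float]


class ETL:
    """Illustrative stand-in for the ETL class: loads and cleans rows."""

    # Assumed column names; the real dataset schema is not documented here.
    FEATURE_COLUMNS = ("area_m2", "rooms")
    TARGET_COLUMN = "price"

    def load(self, handle) -> Dataset:
        """Read CSV rows from an open text file, skipping unusable rows."""
        features, targets = [], []
        for row in csv.DictReader(handle):
            try:
                x = [float(row[name]) for name in self.FEATURE_COLUMNS]
                y = float(row[self.TARGET_COLUMN])
            except (KeyError, TypeError, ValueError):
                continue  # missing column, missing value, or non-numeric text
            features.append(x)
            targets.append(y)
        return Dataset(features, targets)


class Model:
    """Illustrative stand-in for the Model class: wraps an XGBoost regressor."""

    def __init__(self, **params):
        self.params = params
        self._regressor = None

    def fit(self, data: Dataset) -> "Model":
        # Imported lazily so the ETL part runs without numpy/xgboost installed.
        import numpy as np
        from xgboost import XGBRegressor

        self._regressor = XGBRegressor(**self.params)
        self._regressor.fit(np.asarray(data.features, dtype=float),
                            np.asarray(data.targets, dtype=float))
        return self

    def predict(self, features: list[list[float]]) -> list[float]:
        import numpy as np

        if self._regressor is None:
            raise RuntimeError("call fit() before predict()")
        return self._regressor.predict(np.asarray(features, dtype=float)).tolist()
```

With xgboost installed, a training run under these assumptions would open the CSV file, pass the handle to `ETL().load(...)`, and hand the resulting `Dataset` to `Model(n_estimators=50).fit(...)`. Keeping the model behind its own class means the same preprocessed data format can serve both the training and inference phases.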
- Hanka Nguyenová (Team leader)
- Daniel Karlík
- Emanuel Frátrik
- (Adam Šumník)