The application periodically scrapes used-car listings from the AutoRia platform. It runs once a day at a specified time and iterates through every results page, from the first to the last. For each car listing, it opens the individual car details page and collects the required data.
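The page-iteration and link-collection logic described above can be sketched with the standard library alone (the real project uses Scrapy, per the tech stack below). The URL pattern and the listing-link CSS class here are assumptions for illustration, not the project's actual selectors:

```python
from html.parser import HTMLParser

# Hypothetical URL pattern; AutoRia's real pagination parameters may differ.
BASE_URL = "https://auto.ria.com/uk/car/used/"

def page_url(page: int) -> str:
    """Build the URL of one results page (the query parameter is an assumption)."""
    return f"{BASE_URL}?page={page}"

class ListingLinkParser(HTMLParser):
    """Collect hrefs of anchors carrying a listing CSS class (class name assumed)."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "m-link-ticket" in attrs.get("class", ""):
            self.links.append(attrs.get("href"))

def extract_listing_links(html: str) -> list[str]:
    """Return the detail-page links found in one results page."""
    parser = ListingLinkParser()
    parser.feed(html)
    return parser.links
```

In the Scrapy spider the same idea maps to `response.follow` calls: one per detail link, plus one for the next results page until no pages remain.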
- Automated scraping on a scheduled basis
- Regularly scheduled database backups
- Application containerized using Docker
- Python
- PostgreSQL
- Scrapy
- Docker
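The scheduled-backup feature could be implemented along these lines. The `pg_dump` flags shown are standard, but the database name, output directory, and schedule are assumptions rather than the project's actual configuration:

```python
import subprocess
from datetime import datetime, timezone

def build_backup_command(db_name: str, out_dir: str = "/backups") -> list[str]:
    """Build a pg_dump invocation writing a timestamped custom-format dump.
    db_name and out_dir are placeholders for this sketch."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return [
        "pg_dump",
        "--format=custom",
        f"--file={out_dir}/{db_name}_{stamp}.dump",
        db_name,
    ]

def run_backup(db_name: str) -> None:
    """Execute the dump; requires pg_dump on PATH and a reachable server."""
    subprocess.run(build_backup_command(db_name), check=True)
```

A cron entry such as `0 3 * * * python backup.py` (time of day assumed) would make this run daily alongside the scraping schedule.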
- Clone the repository:
git clone [email protected]:Katherine-Greg/py-scrape-cars.git
- If you are using PyCharm, it may offer to create a venv for the project and install the requirements automatically; if not, do it manually:
python -m venv venv
venv\Scripts\activate (on Windows)
source venv/bin/activate (on macOS/Linux)
pip install -r requirements.txt
- Run the app:
python main.py
- or run it with Docker:
docker-compose up
- If you want to check the spider's work, run
scrapy crawl cars
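For the Docker route, a docker-compose file for this stack typically wires the app to a PostgreSQL service. This is a sketch only; the service names, image tag, and credentials are assumptions, and the repository's actual docker-compose.yml may differ:

```yaml
version: "3.8"

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: cars          # database name is an assumption
      POSTGRES_USER: scraper
      POSTGRES_PASSWORD: secret  # keep real credentials in a .env file
    volumes:
      - pg_data:/var/lib/postgresql/data

  app:
    build: .
    depends_on:
      - db
    environment:
      DATABASE_URL: postgresql://scraper:secret@db:5432/cars

volumes:
  pg_data:
```

With a file like this, `docker-compose up` starts both containers, and the app reaches the database at host `db` on the internal network.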
- Create a branch for the solution and switch to it:
git checkout -b develop
Feel free to add more data and implement new methods and features!
- Save the solution:
git commit -am 'Solution'
- Push the solution to the repo:
git push origin develop
If you created another branch (not develop), use its name instead
- Create a new pull request:
Go to the original repository on GitHub and click New Pull Request.
Provide details about your changes and submit the pull request for review.