Skip to content

shhovhan/web_scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Web Scraping

Project creates web scraper, to scrap web page like Hacker News and handle the data in a specific way with storing usage info into a database

Requirements

  • Python (3.10)
  • Beautifulsoup4
  • Pytest

Tested on MacOS.

Setup

First check if you have python installed by running command python --version. Most probably you have it already. Otherwise, install corresponding version according to instructions of your OS. For more details follow the link

Next setup python environment:

  • create virtual environment using python -m venv <path_to_env> or use instruction here
  • activate virtual environment: source <path_to_venv>/bin/activate
  • clone code or download zip anywhere in your local machine
  • go to web_crawler project and install requirements: pip install -r requirements.txt

Usage

From web_crawler directory run main.py

python main.py

Database

SQLite is used for storing data in a database. For that, a special storage module has created, which contains UsageDataStorage class. It handles db connection with SQLite, creating a table and adding data in it.

SQLite is light, no need for any special installation or setup for connecting and working with database.

Running Tests

Project contains unit test written in unittest library. You will find them in tests folder.

Pytest is used to run all tests automatically.

To do it, just run the following command and it will automatically run all tests under the folder.

pytest

==================================

Author: Shushanik Hovhannesyan

Email: [email protected]

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages