In this project, I used web scraping to gather data from a paginated webpage: the Hockey Team data on ScrapeThisSite. A paginated webpage splits its content across multiple pages, usually because there is too much data to present at once. Here, the Hockey Team data spans 582 rows distributed across multiple pages.
I scraped the pages with the BeautifulSoup library, a Python library for pulling data out of HTML and XML files.
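A minimal sketch of how a paginated table can be scraped with BeautifulSoup. The row class `team`, the cell classes in `COLUMNS`, and the `fetch_page` callable are assumptions made for illustration, not the project's actual code; check them against the live page's markup before relying on them.

```python
from bs4 import BeautifulSoup

# Assumed cell class names; verify against the page's real HTML.
COLUMNS = ["name", "year", "wins", "losses"]


def parse_teams(html):
    """Return one dict per <tr class="team"> row (assumed markup)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr", class_="team"):
        row = {}
        for col in COLUMNS:
            cell = tr.find("td", class_=col)
            # Missing cells become None instead of raising an error.
            row[col] = cell.get_text(strip=True) if cell is not None else None
        rows.append(row)
    return rows


def scrape_all_pages(fetch_page, max_pages=100):
    """Collect rows page by page until a page yields no rows.

    fetch_page is a hypothetical callable: page number -> HTML string.
    max_pages is a safety cap so the loop always terminates.
    """
    all_rows = []
    for page_num in range(1, max_pages + 1):
        rows = parse_teams(fetch_page(page_num))
        if not rows:
            break
        all_rows.extend(rows)
    return all_rows
```

Separating parsing from fetching keeps the parser testable on saved HTML. In practice, `fetch_page` would wrap an HTTP request for the given page number, for example with the `requests` library.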
After retrieving the data, I conducted Exploratory Data Analysis (EDA) using the pandas, matplotlib, and seaborn libraries. The analysis aimed to visualize patterns and uncover trends in the scraped Hockey Team data, with pandas handling the tabular exploration and matplotlib and seaborn handling the visualization.
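As a rough illustration of this kind of EDA, the sketch below loads scraped rows into pandas, converts the numeric columns, and summarizes average wins per season. The column names (`year`, `wins`, `losses`) and the helper functions are assumptions for illustration, not the analysis actually performed in this project.

```python
import pandas as pd


def to_frame(rows):
    """Build a DataFrame from scraped rows and convert numeric columns."""
    df = pd.DataFrame(rows)
    for col in ("year", "wins", "losses"):
        # Scraped values are strings; unparseable entries become NaN.
        df[col] = pd.to_numeric(df[col], errors="coerce")
    return df


def wins_by_year(df):
    """Average wins per season across teams (NaN values are skipped)."""
    return df.groupby("year")["wins"].mean()


def plot_wins_by_year(df):
    """Line plot of average wins per season.

    Plotting imports are local so the tabular helpers above work
    even where matplotlib and seaborn are not installed.
    """
    import matplotlib.pyplot as plt
    import seaborn as sns

    summary = wins_by_year(df).reset_index()
    ax = sns.lineplot(data=summary, x="year", y="wins")
    ax.set_title("Average wins per season")
    plt.show()
```

Coercing with `errors="coerce"` is a deliberate choice here: scraped tables often contain blank or malformed cells, and turning them into NaN lets the aggregation skip them rather than fail.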