ML_Project1

YouTube Data Harvesting and Warehousing using SQL, MongoDB and Streamlit

Description

YouTube Data Harvesting and Warehousing is a project designed to collect and store YouTube data efficiently using both SQL and MongoDB databases. The project includes a Streamlit web application that allows users to explore and analyze the collected data easily.

Features

Data Harvesting: Uses the YouTube API to fetch video details, channel statistics, and user interactions such as comments.
Dual Database Support: Stores data in both SQL and MongoDB databases for flexibility and performance.
Streamlit Interface: Provides an intuitive Streamlit interface for users to query, visualize, and gain insights into the collected data.
Data Exploration: Supports various queries, including top-viewed videos, most active channels, and more.
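As a rough illustration of the harvesting step, the sketch below flattens a channel response into a single record. It assumes the response follows the YouTube Data API v3 `channels.list` shape (`items[].snippet.title`, `items[].statistics`); the function name `summarize_channel` and the sample payload are hypothetical and are not taken from youtube.py.

```python
from typing import Any


def summarize_channel(response: dict[str, Any]) -> dict[str, Any]:
    """Flatten a channels.list-style response into one channel record."""
    item = response["items"][0]
    stats = item.get("statistics", {})
    return {
        "channel_id": item["id"],
        "channel_name": item["snippet"]["title"],
        # The API returns counts as strings, so convert them to integers.
        "subscribers": int(stats.get("subscriberCount", 0)),
        "total_videos": int(stats.get("videoCount", 0)),
        "views": int(stats.get("viewCount", 0)),
    }


# Hand-written sample payload (not real data) that mimics the response shape.
sample_response = {
    "items": [
        {
            "id": "UC_example",
            "snippet": {"title": "Example Channel"},
            "statistics": {
                "subscriberCount": "1200",
                "videoCount": "45",
                "viewCount": "98000",
            },
        }
    ]
}

channel = summarize_channel(sample_response)
print(channel)
```

Keeping the parsing separate from the network call makes it possible to test this step without an API key.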

Technologies Used

Python
MySQL for SQL database
MongoDB for NoSQL database
Streamlit for the web application

Installation

To set up the project locally, install Python and the project's dependencies (including Streamlit), and make sure that MySQL and MongoDB servers are available.

Usage

Configure the MySQL and MongoDB connections in the configuration files under config/.
Run the Streamlit app to interact with the collected YouTube data.
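One possible way to keep the connection settings out of the code is an INI file read with Python's standard `configparser`. The sketch below is only an assumption about how such a file could look: the section names, keys, and values are hypothetical, and the text is parsed from a string so the example runs on its own.

```python
import configparser

# Hypothetical contents of a file such as config/database.ini.
SAMPLE_CONFIG = """
[mysql]
host = localhost
port = 3306
user = youtube_user
database = youtube_data

[mongodb]
uri = mongodb://localhost:27017
database = youtube_data
"""


def load_db_settings(text: str) -> dict[str, dict[str, str]]:
    """Parse INI text and return one settings dict per section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = load_db_settings(SAMPLE_CONFIG)
print(settings["mysql"]["host"], settings["mongodb"]["uri"])
```

Assuming youtube.py is the app's entry point, the app would typically be started with `streamlit run youtube.py`.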

Folder Structure

youtube.py: Contains the source code for data collection and the Streamlit app.
data/: Stores any sample or processed data files.
config/: Configuration files for database connections.

Database Schema

SQL Database

videos table: title, channel_name, views, likes, dislikes, comments, ...
comments table: comment_id, video_id, comment_text, comment_published_At, ...
channels table: channel_id, channel_name, subscriber, total videos, ...
playlists table: channel_id, channel_name, playlist_name, title, video_count, ...
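To make the schema more concrete, here is a minimal sketch of a "top-viewed videos" query against a videos table with the columns listed above. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs without a server; the column types, the sample rows, and the helper `top_viewed` are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the project's MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE videos (
        title TEXT,
        channel_name TEXT,
        views INTEGER,
        likes INTEGER,
        dislikes INTEGER,
        comments INTEGER
    )
    """
)

# Made-up sample rows, not harvested data.
rows = [
    ("Intro to SQL", "Channel A", 1500, 120, 3, 14),
    ("MongoDB basics", "Channel B", 4200, 310, 8, 52),
    ("Streamlit tour", "Channel A", 2700, 200, 5, 21),
]
conn.executemany("INSERT INTO videos VALUES (?, ?, ?, ?, ?, ?)", rows)


def top_viewed(connection, limit=10):
    """Return (title, channel_name, views) ordered by views, highest first."""
    cursor = connection.execute(
        "SELECT title, channel_name, views FROM videos "
        "ORDER BY views DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


top_two = top_viewed(conn, limit=2)
print(top_two)
```

The same `SELECT ... ORDER BY views DESC LIMIT` statement works in MySQL, although a MySQL driver would normally use `%s` placeholders instead of `?`.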

MongoDB Database

Collections: videos, channels, ...
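Moving data from MongoDB into the SQL tables involves turning each document into a flat row. The sketch below shows one way this could look for the videos collection; the document layout (a nested `statistics` field) and the function `to_video_row` are hypothetical, since the actual document shape is not described here.

```python
def to_video_row(doc):
    """Convert a video document into a tuple matching the videos table columns."""
    stats = doc.get("statistics", {})
    return (
        doc.get("title"),
        doc.get("channel_name"),
        # Missing counters become None, which maps to NULL in SQL.
        stats.get("views"),
        stats.get("likes"),
        stats.get("dislikes"),
        stats.get("comments"),
    )


# Made-up document in the assumed shape of the videos collection.
sample_doc = {
    "_id": "abc123",
    "title": "MongoDB basics",
    "channel_name": "Channel B",
    "statistics": {"views": 4200, "likes": 310, "comments": 52},
}

row = to_video_row(sample_doc)
print(row)
```

Using `dict.get` keeps the conversion tolerant of documents that lack some fields, which is common in schemaless collections.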
