This project aims to predict the number of views for TED talks from features such as the speaker, topic, duration, and transcript. It uses machine learning techniques including linear regression, random forests, and neural networks to build and evaluate different models, and explores the factors that influence the popularity and impact of TED talks. This can help TED organizers and speakers optimize their content and delivery and identify the most engaging and influential topics for their audience. It can also help viewers discover the most interesting and inspiring talks for their interests and needs.
This project contains four files:
- Notebook - all the Colab notebooks for the project.
- Presentation - a presentation of the whole project.
- Technical Document - a Word file with the technical details of the whole project.
- Requirements - all the libraries required for the project, with their versions.
Extreme Gradient Boosting (XGBoost) is a powerful and efficient framework for implementing gradient boosting algorithms. It builds an ensemble of weak learners, typically decision trees, that are sequentially fitted to the residuals of the previous learners. XGBoost uses a novel tree learning algorithm that handles sparsity, missing values, and regularization. It also supports parallel and distributed computing, as well as a variety of objective functions and evaluation metrics.
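The core idea of sequentially fitting trees to residuals can be sketched with plain scikit-learn decision trees; this is a minimal illustration of the boosting principle on synthetic data, not the optimized XGBoost implementation (which adds regularization, sparsity handling, and parallelism):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data: y = x^2 plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.1, size=200)

# Gradient boosting with squared loss: each shallow tree is fitted to the
# residuals of the current ensemble, and its prediction is added back with
# a small learning rate.
n_trees, lr = 50, 0.1
base = y.mean()                      # start from the mean prediction
pred = np.full_like(y, base)
trees = []
for _ in range(n_trees):
    residuals = y - pred             # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)
    pred += lr * tree.predict(X)
    trees.append(tree)

def boosted_predict(X_new):
    out = np.full(len(X_new), base)
    for tree in trees:
        out += lr * tree.predict(X_new)
    return out

mse = np.mean((boosted_predict(X) - y) ** 2)
```

Each added tree reduces the training error of the ensemble, which is exactly the sequential residual-fitting behavior described above.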
- Importing necessary libraries
- Importing dataset
- Data processing for EDA
- Exploratory data analysis
- Deriving insights and short conclusions
- Data cleaning: null/missing value treatment
- Feature engineering
- Data preprocessing
- Model building
- Hyperparameter tuning
- Comparison and selection of models
- Conclusion
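The modeling steps above (preprocessing, model building, hyperparameter tuning, and comparison) can be sketched end to end with scikit-learn; the dataset and column names (`duration`, `num_tags`, `views`) here are illustrative stand-ins, not the project's actual data:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Illustrative stand-in for the TED dataset: numeric features + views target.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "duration": rng.uniform(5.0, 60.0, 300),
    "num_tags": rng.integers(1, 10, 300).astype(float),
})
df["views"] = 1000 * df["duration"] + 500 * df["num_tags"] + rng.normal(0, 500, 300)

# Data cleaning: drop rows missing the target (none here; shown for completeness).
df = df.dropna(subset=["views"])

X, y = df[["duration", "num_tags"]], df["views"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Model building + hyperparameter tuning: each candidate gets its own
# (estimator, parameter grid) pair, tuned with cross-validation.
candidates = {
    "linear": (
        Pipeline([("scale", StandardScaler()), ("model", LinearRegression())]),
        {},
    ),
    "random_forest": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [None, 5]},
    ),
}

# Comparison and selection: evaluate each tuned model on the held-out split.
scores = {}
for name, (est, grid) in candidates.items():
    search = GridSearchCV(est, grid, cv=3)
    search.fit(X_tr, y_tr)
    scores[name] = r2_score(y_te, search.predict(X_te))

best_model = max(scores, key=scores.get)
```

The same pattern extends to the neural network and XGBoost candidates by adding them to the `candidates` dictionary with their own parameter grids.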