Data-Analysis-with-Python

Data analysis with Python to building and evaluating data models

Data Analysis with Python

Tasks	Topic	Content Included
Task 1	User Car Pricing	Data Wrangling: Handling missing values, Data Format, Data Standardization, Data Normalization, Binning and Indicator variable
Task 2	Laptops Pricing	EDA, groups and pivot tables, Pearson Correlation
Task 3	automobile Price	Model Development; Linear Regression, Polynomial Regression, Pipeline, Visualization, R-square, Mean Square Error, Prediction & Decision Making
Task 4	Laptop Pricing	Model Development, Linear Regression, Polynomial Regression, Pipeline, Plot
Task 5	automobile pricing	Model Evaluation, overfitting, Ridge Regression, Cross validation, Grid Search
Task 6	Model Evaluation and Refinement	train, test, cross validation,overfitting, Ridge, Grid search
Task 7 (Project)	Black Friday Sales Prediction	EDA, Data cleaning, Visualization, Linear Regression, Ridge Regression, DecisionTreeRegressor, RandomForestRegressor, ExtraTreesRegressor, XGBRegressor
Task 8 (Project)	Bank Customer Churn Prediction	Feature Scaling, Logistic Regression, SCV, KNeighbor Classifier, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting Classifier, XGBoost

Task 1: Used Cars Pricing

Objectives:

Handling missing values
Correct data formatting
Standardize and normalize data

Task 2 : Laptops Pricing Dataset

Objectives:

Visualize individual feature patterns
Run descriptive statistical analysis on the dataset
Use group and pivot tables to find the effect of categorical varaibles on price
User Pearson Correlation to measure the independence between variables

Task 3 : Model Development; automobile pricing

Objectives: Develop prediction models

In this task, I'll develop several models that will predict the price of the car using the variables or features. This is just an estimate but should give us an objective idea of how much the car should cost.

Simple Linear Regression
Multiple Linear Regression
Pipeline
Conclusion

Task 4: Model Development- Laptop Pricing

Objectives

Use Linear Regression in one variable to fit the parameters to a model
Use Linear Regression in multiple variables to fit the parameters to a model
Use Polynomial Regression in single variable to fit the parameters to a model
Create a pipeline for performing linear regression using multiple features in polynomial scaling
Evaluate the performance of different forms of regression on basis of MSE and R^2 parameters.
- Simple Linear Regression
- Multiple Linear Regression
- Polynomial Regression
- Pipeline

Task 5: Model Evaluation and Refinement: automobile pricing

Objectives

Evaluate and refine prediction models

Tables of Contents

Model Evaluation
Cross validation
Over-fitting, Under-fitting and Model Selection
Ridge Regression
Grid Search

Task 6: Model Evaluation and Refinement; laptop pricing

In this lab, I'll try to refine our model's performance in predicting the price of a labtop.

Objectives

Use training, testing and cross validation to improve the performance of the dataset.
Identify the point of overfitting of a model
Use Ridge Regression to identify the change in performance of a model based on its hyperparameters
Use Grid Search to identify the best performing model using different hyperparameters

Project: Black Friday Sales Prediction

Explore the dynamics of Black Friday sales with predictive modeling. From feature engineering to machine learning, explore the dynamics of one of the largest shopping events globally.Join me as we analyze customer behavior, identify key predictors, and predict sales with machine learning techniques.

Project: Customer Churn Prediction

Problem Statement : Customer churn or customer attrition is a tendency of clients or customers to abandon a brand and stop being a paying client of a particular business or organization. The percentage of customers that discontinue using a company’s services or products during a specific period is called a customer churn rate. Several bad experiences (or just one) are enough, and a customer may quit. And if a large chunk of unsatisfied customers churn at a time interval, both material losses and damage to reputation would be enormous.

Working Flow : In order to create a model these are the following procedure

Split the dataset in 70% of Train set and 30% of Test Set
Feature engineering
Check the accuracy score for both Training and Test Set
Compare the accuracies for both Training and Test set, in order to check for the overfitting issues

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Data-Analysis-with-Python