Data analysis with Python, from data wrangling to building and evaluating data models

Aayush-Basnet/Data-Analysis-with-Python

Data-Analysis-with-Python


Data Analysis with Python

| Tasks | Topic | Content Included |
|---|---|---|
| Task 1 | Used Cars Pricing | Data Wrangling: Handling missing values, Data Format, Data Standardization, Data Normalization, Binning and Indicator variable |
| Task 2 | Laptops Pricing | EDA, groups and pivot tables, Pearson Correlation |
| Task 3 | Automobile Pricing | Model Development: Linear Regression, Polynomial Regression, Pipeline, Visualization, R-square, Mean Square Error, Prediction & Decision Making |
| Task 4 | Laptop Pricing | Model Development: Linear Regression, Polynomial Regression, Pipeline, Plot |
| Task 5 | Automobile Pricing | Model Evaluation: Overfitting, Ridge Regression, Cross validation, Grid Search |
| Task 6 | Model Evaluation and Refinement | Train, test, cross validation, overfitting, Ridge, Grid Search |
| Task 7 (Project) | Black Friday Sales Prediction | EDA, Data cleaning, Visualization, Linear Regression, Ridge Regression, DecisionTreeRegressor, RandomForestRegressor, ExtraTreesRegressor, XGBRegressor |
| Task 8 (Project) | Bank Customer Churn Prediction | Feature Scaling, Logistic Regression, SVC, KNeighbors Classifier, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting Classifier, XGBoost |

Task 1: Used Cars Pricing

Objectives:

  • Handling missing values
  • Correct data formatting
  • Standardize and normalize data

Table of Contents

  1. Identify missing values
    • Deal with missing values
    • Correct data format
  2. Data Standardization
  3. Data Normalization
  4. Binning
  5. Indicator variable

We use data wrangling to convert data from an initial format to a format that may be better for analysis.
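
The wrangling steps listed above can be sketched with pandas as follows. This is a minimal illustration on made-up data, not the notebook's actual code: the column names (`horsepower`, `length`, `fuel_type`), the mean-imputation choice, and the three bin labels are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are illustrative, not taken from the repository.
df = pd.DataFrame({
    "horsepower": [111.0, np.nan, 154.0, 102.0, 200.0],
    "length": [168.8, 171.2, 176.6, 176.6, 192.7],
    "fuel_type": ["gas", "diesel", "gas", "gas", "diesel"],
})

# Handle missing values: replace NaN with the column mean (one common choice).
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].mean())

# Normalize: min-max scale "length" into the range [0, 1].
length_min, length_max = df["length"].min(), df["length"].max()
df["length_scaled"] = (df["length"] - length_min) / (length_max - length_min)

# Binning: split "horsepower" into three equal-width bins.
df["horsepower_bin"] = pd.cut(
    df["horsepower"], bins=3, labels=["low", "medium", "high"]
)

# Indicator variables: one 0/1 (boolean) column per fuel type.
df = pd.concat([df, pd.get_dummies(df["fuel_type"], prefix="fuel")], axis=1)

print(df)
```

Mean imputation suits roughly symmetric numeric columns; for categorical columns, replacing missing entries with the most frequent value is a common alternative.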


Task 2: Laptops Pricing Dataset

Objectives:

  • Visualize individual feature patterns
  • Run descriptive statistical analysis on the dataset
  • Use grouping and pivot tables to find the effect of categorical variables on price
  • Use Pearson correlation to measure the linear dependence between variables
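
As a rough sketch of the grouping, pivoting, and correlation steps, assuming a small hypothetical laptop table (the columns `brand`, `category`, `ram_gb`, and `price` are invented for illustration):

```python
import pandas as pd

# Hypothetical laptop data; not the dataset used in the repository.
df = pd.DataFrame({
    "brand": ["A", "A", "B", "B"],
    "category": ["gaming", "office", "gaming", "office"],
    "ram_gb": [16, 8, 32, 8],
    "price": [1500.0, 700.0, 2300.0, 700.0],
})

# Group by two categorical variables and average the price in each group.
grouped = df.groupby(["brand", "category"], as_index=False)["price"].mean()

# Pivot so that brands become rows and categories become columns.
pivot = grouped.pivot(index="brand", columns="category", values="price")

# Pearson correlation between a numeric feature and the price.
# Series.corr uses the Pearson method by default.
r = df["ram_gb"].corr(df["price"])

print(pivot)
print(f"Pearson r between ram_gb and price: {r:.3f}")
```

A coefficient near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. It does not by itself show that one variable causes the other.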

Task 3: Model Development: Automobile Pricing

Objectives: Develop prediction models

In this task, I'll develop several models that predict the price of a car from its features. The prediction is only an estimate, but it should give us an objective idea of how much the car should cost.

  • Simple Linear Regression
  • Multiple Linear Regression
  • Pipeline
  • Conclusion
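
To illustrate the difference between simple and multiple linear regression, here is a minimal scikit-learn sketch on synthetic data. The two features and the coefficients used to generate `y` are assumptions made for the example and do not come from the automobile dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic data: two hypothetical features and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 5.0 + rng.normal(0, 0.5, 100)

# Simple linear regression: one predictor only.
slr = LinearRegression().fit(X[:, [0]], y)

# Multiple linear regression: both predictors.
mlr = LinearRegression().fit(X, y)

# In-sample R^2 and MSE for comparison.
r2_slr = r2_score(y, slr.predict(X[:, [0]]))
r2_mlr = r2_score(y, mlr.predict(X))
mse_mlr = mean_squared_error(y, mlr.predict(X))

print(f"R^2 simple:   {r2_slr:.3f}")
print(f"R^2 multiple: {r2_mlr:.3f}")
print(f"MSE multiple: {mse_mlr:.3f}")
```

On training data, adding a predictor can never lower R^2, so a fair comparison between models also needs held-out data.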

Task 4: Model Development: Laptop Pricing

Objectives

  • Use Linear Regression in one variable to fit the parameters to a model

  • Use Linear Regression in multiple variables to fit the parameters to a model

  • Use Polynomial Regression in single variable to fit the parameters to a model

  • Create a pipeline that performs linear regression on multiple features with polynomial scaling

  • Evaluate the performance of different forms of regression on the basis of MSE and R^2.

    • Simple Linear Regression

    • Multiple Linear Regression

    • Polynomial Regression

    • Pipeline
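
A pipeline of this kind might look like the sketch below, which chains scaling, polynomial feature expansion, and linear regression in scikit-learn and then reports MSE and R^2. The synthetic data, the degree of 2, and the step names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data with a quadratic relationship (hypothetical, for illustration).
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 2))
y = (1.5 * X[:, 0] ** 2 - 2.0 * X[:, 0] * X[:, 1] + X[:, 1]
     + rng.normal(0, 0.3, 200))

# Scale the features, expand them to degree-2 polynomial terms, then fit.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("model", LinearRegression()),
])
pipe.fit(X, y)

y_hat = pipe.predict(X)
mse = mean_squared_error(y, y_hat)
r2 = r2_score(y, y_hat)

# Baseline: plain linear regression without polynomial terms.
linear_r2 = LinearRegression().fit(X, y).score(X, y)

print(f"Pipeline MSE: {mse:.3f}, R^2: {r2:.3f}")
print(f"Plain linear R^2: {linear_r2:.3f}")
```

Wrapping the steps in a `Pipeline` keeps the preprocessing and the model together, so the same transformations are applied at prediction time.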


Task 5: Model Evaluation and Refinement: Automobile Pricing

Objectives

  • Evaluate and refine prediction models

Table of Contents

  • Model Evaluation
  • Cross validation
  • Over-fitting, Under-fitting and Model Selection
  • Ridge Regression
  • Grid Search
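
One way to look for over-fitting is to compare training and test R^2 as model complexity grows, and to use cross-validation for a more stable estimate. The sketch below assumes synthetic one-dimensional data and arbitrarily chosen polynomial degrees (1, 3, 15). It is not the notebook's code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic nonlinear data (hypothetical).
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.2, 40)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit polynomial models of increasing degree and record train/test R^2.
scores = {}
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    scores[degree] = (
        model.score(X_train, y_train),
        model.score(X_test, y_test),
    )
    print(f"degree={degree:2d}  train R^2={scores[degree][0]:.3f}  "
          f"test R^2={scores[degree][1]:.3f}")

# 4-fold cross-validated R^2 for the degree-3 model.
cv_r2 = cross_val_score(
    make_pipeline(PolynomialFeatures(3), LinearRegression()), X, y, cv=4
)
print("CV R^2 per fold:", cv_r2)
```

Training R^2 keeps rising with the degree. A test R^2 that stalls or drops while training R^2 climbs is the usual sign of over-fitting.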

Task 6: Model Evaluation and Refinement: Laptop Pricing

In this lab, I'll try to refine our model's performance in predicting the price of a laptop.

Objectives

  • Use training, testing and cross validation to improve the performance of the model.
  • Identify the point of overfitting of a model
  • Use Ridge Regression to identify the change in performance of a model based on its hyperparameters
  • Use Grid Search to identify the best performing model using different hyperparameters
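
The Ridge and Grid Search objectives can be combined in one sketch: a `GridSearchCV` over the Ridge `alpha` hyperparameter, evaluated on a held-out test set. The synthetic data, the candidate `alpha` values, and the 4-fold setting are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data with one interaction term (hypothetical).
rng = np.random.default_rng(3)
X = rng.normal(size=(120, 3))
y = (X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
     + rng.normal(0, 0.5, 120))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

pipe = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])

# Candidate regularization strengths; the list is an arbitrary example.
param_grid = {"ridge__alpha": [0.001, 0.1, 1, 10, 100]}

# Cross-validated search over alpha, using only the training data.
search = GridSearchCV(pipe, param_grid, cv=4, scoring="r2")
search.fit(X_train, y_train)

best_alpha = search.best_params_["ridge__alpha"]
test_r2 = search.score(X_test, y_test)

print(f"Best alpha: {best_alpha}")
print(f"Test R^2 with best alpha: {test_r2:.3f}")
```

Keeping the test set out of the search means the final R^2 is measured on data that played no part in choosing `alpha`.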

Task 7 (Project): Black Friday Sales Prediction

Explore the dynamics of Black Friday sales, one of the largest shopping events globally, with predictive modeling. From feature engineering to machine learning, join me as we analyze customer behavior, identify key predictors, and predict sales.


Task 8 (Project): Bank Customer Churn Prediction

Problem Statement: Customer churn, or customer attrition, is the tendency of customers to abandon a brand and stop paying for a business's products or services. The percentage of customers that discontinue using a company's services or products during a specific period is called the customer churn rate. Several bad experiences (or just one) can be enough for a customer to quit. If a large share of dissatisfied customers churn within a short interval, both the material losses and the damage to reputation can be enormous.
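
The churn rate described above can be written as a small helper. This is an illustrative sketch: the function name `churn_rate` and its argument names are made up for this example, and the formula assumes the rate is measured against the customer count at the start of the period.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the percentage of customers lost during a period.

    Assumes the rate is measured against the number of customers
    at the start of the period.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return 100.0 * customers_lost / customers_at_start


# Example: 30 of 200 customers left during the period.
print(churn_rate(200, 30))
```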

Workflow: The model is built with the following procedure:

  • Split the dataset into a 70% training set and a 30% test set
  • Perform feature engineering
  • Check the accuracy score on both the training and test sets
  • Compare the training and test accuracies to check for overfitting
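
The procedure above could be sketched as follows. The data here is synthetic (generated with `make_classification`), feature engineering is reduced to standard scaling, and the stratified split and logistic regression model are assumptions. The project itself compares several classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for a churn dataset (about 20% positives).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=42,
)

# 70% train / 30% test; stratifying keeps the churn ratio similar in both.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Feature scaling followed by a simple classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
gap = train_acc - test_acc

print(f"Train accuracy: {train_acc:.3f}")
print(f"Test accuracy:  {test_acc:.3f}")
print(f"Gap (train - test): {gap:.3f}")
```

A training accuracy far above the test accuracy suggests overfitting. On imbalanced churn data, accuracy alone can also be misleading, so precision and recall are worth checking too.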
