We will use a synthetic dataset for this exercise. The dataset contains the following columns:
: A unique identifier for each customer.
: The age of the customer.
: The monthly bill amount for the customer.
: The number of times the customer contacted customer service.
Churn: This is our target variable, indicating whether the customer churned (Yes) or not (No).
Set Up the Environment
Import necessary libraries: Pandas for data manipulation, Scikit-learn for machine learning, and Matplotlib for visualization.
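A minimal sketch of the imports this walkthrough relies on, assuming pandas, scikit-learn, and Matplotlib are already installed. The specific scikit-learn names imported here are one reasonable choice for the steps that follow, not something the exercise prescribes.

```python
# Data manipulation
import pandas as pd

# Machine learning: splitting, the tree model, and an accuracy metric
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score

# Visualization
import matplotlib
import matplotlib.pyplot as plt

# Printing the versions is a quick check that the environment is usable.
print("pandas:", pd.__version__)
print("scikit-learn:", sklearn.__version__)
print("matplotlib:", matplotlib.__version__)
```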
Create the Dataset
Use Python to create a synthetic dataset. We'll make a small dataset for simplicity.
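One way to build such a dataset is sketched below. The column names (`CustomerID`, `Age`, `MonthlyCharge`, `CustomerServiceCalls`) are hypothetical labels chosen to match the column descriptions above, and the ten rows are invented purely for illustration.

```python
import pandas as pd

# Hypothetical column names and made-up values; only "Churn" is named in the text.
data = {
    "CustomerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Age": [25, 34, 45, 52, 23, 40, 60, 48, 33, 55],
    "MonthlyCharge": [45.0, 80.5, 60.0, 95.0, 30.0, 88.0, 55.5, 99.0, 42.0, 70.0],
    "CustomerServiceCalls": [1, 5, 2, 6, 0, 4, 1, 7, 0, 3],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
}

df = pd.DataFrame(data)

# Inspect the first rows and the class balance of the target.
print(df.head())
print(df["Churn"].value_counts())
```

A dataset this small is only useful for following the mechanics; any pattern the model finds in it should not be read as a real finding.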
Data Preparation
Split the data into features (X) and the target variable (y). Further split the dataset into training and testing sets.
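The split can be sketched as follows, using the same hypothetical column names and invented values as a stand-in for the synthetic dataset. Dropping `CustomerID` from the features, the 70/30 ratio, and `random_state=42` are assumptions made for this example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in synthetic data (hypothetical column names, made-up values).
df = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Age": [25, 34, 45, 52, 23, 40, 60, 48, 33, 55],
    "MonthlyCharge": [45.0, 80.5, 60.0, 95.0, 30.0, 88.0, 55.5, 99.0, 42.0, 70.0],
    "CustomerServiceCalls": [1, 5, 2, 6, 0, 4, 1, 7, 0, 3],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
})

# Features: everything except the identifier and the target.
# The identifier is excluded because it carries no information about behavior.
X = df.drop(columns=["CustomerID", "Churn"])
y = df["Churn"]

# Hold out 30% of the rows for testing; the fixed seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print("Training rows:", len(X_train))
print("Test rows:", len(X_test))
```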
Build the Decision Tree Model
Use Scikit-learn to create a DecisionTreeClassifier. Train the model on the training data.
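A minimal sketch of the training step, again on stand-in data with hypothetical column names. The `max_depth=3` limit and the fixed `random_state` are assumptions added here to keep the tree small and the result repeatable; the exercise itself does not specify hyperparameters.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in synthetic data (hypothetical column names, made-up values).
df = pd.DataFrame({
    "Age": [25, 34, 45, 52, 23, 40, 60, 48, 33, 55],
    "MonthlyCharge": [45.0, 80.5, 60.0, 95.0, 30.0, 88.0, 55.5, 99.0, 42.0, 70.0],
    "CustomerServiceCalls": [1, 5, 2, 6, 0, 4, 1, 7, 0, 3],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
})
X = df.drop(columns=["Churn"])
y = df["Churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Create the classifier; a shallow depth keeps the tree easy to read.
model = DecisionTreeClassifier(max_depth=3, random_state=42)

# Fit on the training rows only, so the test rows stay unseen.
model.fit(X_train, y_train)

print("Tree depth:", model.get_depth())
print("Number of leaves:", model.get_n_leaves())
```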
Evaluate the Model
Make predictions on the test set. Calculate the accuracy of the model, that is, the fraction of test customers whose churn label it predicts correctly.
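The evaluation step might look like the sketch below, which repeats the data, split, and training so that it runs on its own. Column names, values, and hyperparameters are illustrative assumptions.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in synthetic data (hypothetical column names, made-up values).
df = pd.DataFrame({
    "Age": [25, 34, 45, 52, 23, 40, 60, 48, 33, 55],
    "MonthlyCharge": [45.0, 80.5, 60.0, 95.0, 30.0, 88.0, 55.5, 99.0, 42.0, 70.0],
    "CustomerServiceCalls": [1, 5, 2, 6, 0, 4, 1, 7, 0, 3],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
})
X = df.drop(columns=["Churn"])
y = df["Churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)

# Predict labels for the held-out rows.
y_pred = model.predict(X_test)

# Accuracy = correct predictions / total predictions on the test set.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

With only a handful of test rows, each misclassified customer moves the accuracy by a large step, so the number is a rough indication at best.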
Visualize the Decision Tree
Use Matplotlib to visualize how the decision tree makes decisions.
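One way to draw the tree is scikit-learn's `plot_tree`, which renders onto a Matplotlib axes. The sketch below is self-contained; the data, the hyperparameters, the figure size, and the output filename `churn_tree.png` are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Stand-in synthetic data (hypothetical column names, made-up values).
df = pd.DataFrame({
    "Age": [25, 34, 45, 52, 23, 40, 60, 48, 33, 55],
    "MonthlyCharge": [45.0, 80.5, 60.0, 95.0, 30.0, 88.0, 55.5, 99.0, 42.0, 70.0],
    "CustomerServiceCalls": [1, 5, 2, 6, 0, 4, 1, 7, 0, 3],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
})
X = df.drop(columns=["Churn"])
y = df["Churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)

# Each box shows the split rule, the sample counts, and the majority class.
fig, ax = plt.subplots(figsize=(10, 6))
annotations = plot_tree(
    model,
    feature_names=list(X.columns),
    class_names=list(model.classes_),
    filled=True,  # color nodes by majority class
    ax=ax,
)
ax.set_title("Decision tree for customer churn")

# Hypothetical output filename.
fig.savefig("churn_tree.png", bbox_inches="tight")
```

In an interactive session, the `matplotlib.use("Agg")` line can be removed and `plt.show()` called instead of saving to a file.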
Discuss the Results
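Interpret the decision tree: identify which features it splits on first and which thresholds it uses, since these are the strongest churn signals the model found in the data.
Discuss how the company can use these rules to reduce customer churn, for example by flagging customers who fall on a high-risk path for proactive outreach.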
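As a starting point for that discussion, the sketch below prints the learned rules as text, ranks the features by importance, and scores one hypothetical new customer. It fits on all ten stand-in rows for brevity; the data, column names, and the new customer's values are assumptions made for illustration.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in synthetic data (hypothetical column names, made-up values).
df = pd.DataFrame({
    "Age": [25, 34, 45, 52, 23, 40, 60, 48, 33, 55],
    "MonthlyCharge": [45.0, 80.5, 60.0, 95.0, 30.0, 88.0, 55.5, 99.0, 42.0, 70.0],
    "CustomerServiceCalls": [1, 5, 2, 6, 0, 4, 1, 7, 0, 3],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
})
X = df.drop(columns=["Churn"])
y = df["Churn"]
model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)

# Human-readable if/else rules, one line per split or leaf.
rules = export_text(model, feature_names=list(X.columns))
print(rules)

# Relative contribution of each feature to the splits (sums to 1).
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))

# Score a hypothetical new customer to decide whether to reach out.
new_customer = pd.DataFrame(
    [{"Age": 41, "MonthlyCharge": 90.0, "CustomerServiceCalls": 5}]
)
yes_index = list(model.classes_).index("Yes")
churn_probability = model.predict_proba(new_customer)[0][yes_index]
print(f"Estimated churn probability: {churn_probability:.2f}")
```

Rules and importances learned from ten invented rows only illustrate how to read a tree; conclusions for a real retention program would need real customer data.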