AanandhiVB/decisiontree_predictive_modelling

decisiontree_predictive_modelling

Problem Statement

A US-based insurance provider offers health insurance to customers and assigns a PCP (primary care physician) to each customer. The PCP addresses most health concerns of the customers assigned to them. For various reasons, customers sometimes want to change their PCP, and each change involves significant effort for the provider.

You will find a subset of the insurance provider data along with PCP changes. The provider would like to understand why members are likely to leave the recommended PCP. Further, they would like to recommend a PCP that members are less likely to leave.

The dataset consists of the following fields:

| Column | Description |
| --- | --- |
| Id | Identification field |
| Outcome | Whether the member changed to their preferred primary care provider instead of keeping the auto-assigned one. 0: member keeps the auto-assigned provider. 1: member changed to this provider by calling customer service. |
| Distance | Distance between member and provider, in miles |
| Visit_count | Number of claims between member and provider |
| Claims_days_away | Days between the date the member changed to (or was assigned to) the provider and the latest claim between member and provider |
| Tier | Provider tier from service; values 1, 2, 3, 4. Tier 1 is the highest benefit level and the most cost-effective level |
| Fqhc | Value 0 or 1. 1: provider is a certified Federally Qualified Health Center |
| Pcp_lookback | Value 0 or 1. 1: the provider was the member's primary care provider before |
| Family_Assignment | Value 0 or 1. 1: the provider is the PCP of another member in the same family |
| Kid | Value 0 or 1. 1: member is a kid (under 18 for the state of New York) |
| Is_Ped | Value 0 or 1. 1: provider is a pediatrician |
| Same_gender | Value 0 or 1. 1: provider and member are the same gender |
| Same_language | Value 0 or 1. 1: provider and member speak the same language |
| Same_address | Value 0 or 1. 1: the re-assigned provider has the same address as the pre-assigned provider |
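The field descriptions above can be turned into a quick sanity check during preprocessing. The following is a minimal sketch, not the repository's code: the function name `validate_row` is hypothetical, it assumes each record arrives as a dict of strings (for example from `csv.DictReader`) whose keys match the column names listed above, and the non-negative check on `Distance` is an assumption based on it being a distance in miles.

```python
# Columns documented above as taking only the values 0 or 1.
BINARY_COLUMNS = [
    "Outcome", "Fqhc", "Pcp_lookback", "Family_Assignment", "Kid",
    "Is_Ped", "Same_gender", "Same_language", "Same_address",
]


def _as_number(value):
    """Convert a raw cell to float, or return None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name in BINARY_COLUMNS:
        if _as_number(row.get(name)) not in (0.0, 1.0):
            problems.append(f"{name} should be 0 or 1, got {row.get(name)!r}")
    if _as_number(row.get("Tier")) not in (1.0, 2.0, 3.0, 4.0):
        problems.append(f"Tier should be 1-4, got {row.get('Tier')!r}")
    distance = _as_number(row.get("Distance"))
    # Assumption: a distance in miles cannot be negative.
    if distance is None or distance < 0:
        problems.append(f"Distance should be >= 0, got {row.get('Distance')!r}")
    return problems


# Illustrative record (made-up values, not taken from the dataset).
example = {name: "0" for name in BINARY_COLUMNS}
example.update({"Tier": "5", "Distance": "3.5"})
for problem in validate_row(example):
    print(problem)
```

Rows that return a non-empty list can be inspected or dropped before descriptive summarization.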

Aim:

  • Build a Predictive Model
  • Evaluate the model
  • Refine the model, as appropriate

Perform the following tasks:

  • Select a method for performing analytics
  • Preprocess the data to enhance quality
  • Carry out descriptive summarization of data and make observations
  • Identify relevant and irrelevant attributes for building the model
  • Perform appropriate data transformations with justifications
  • Generate new features if needed
  • Carry out the chosen analytic task. Show results including intermediate results, as needed
  • Evaluate the solutions
  • Look for refinement opportunities

Inferences

Objective

A US-based health insurance provider offers health insurance to customers and assigns a PCP (primary care physician) to each customer. The PCP addresses most health concerns of the customers assigned to them. For various reasons, customers sometimes want to change their PCP, and each change involves significant effort for the provider. The data is a subset of the insurance provider's data along with PCP changes. The provider would like to understand why members are likely to leave the recommended PCP and, further, to recommend a PCP that members are less likely to leave.

Exploratory Data Analysis (EDA)

  • Read the data from the “DataSet_PCP_Change.csv” file

  • Data Summary:

    • Summary of Healthcare Insurance Enrolled Members (Auto PCP/Switched PCP): A total of 3,130 members are enrolled in the health insurance benefits. Of these, 127 (4%) switched away from their auto-assigned PCP, while the remaining 3,003 (96%) kept it.

      Enrolled Members

      | Category | Count | Percentage |
      | --- | --- | --- |
      | Auto PCP | 3,003 | 96% |
      | Switched PCP | 127 | 4% |
      | Total | 3,130 | 100% |
    • Summary of Age Group for Enrolled Members Who Switched PCP (127 Members): Of the 127 members who switched PCP, 62 (49%) are kids and 65 (51%) are adults.

      Switched PCP Members

      | Age group | Count | Percentage |
      | --- | --- | --- |
      | Kids | 62 | 49% |
      | Adults | 65 | 51% |
      | Total | 127 | 100% |
    • Summary of Pediatric/Non-Pediatric PCP vs. Switched PCP Members (127 Members): Among the 127 members who switched PCP, some kids ended up with a non-pediatric PCP: 12% of the switched members in Tier-1, 13% in Tier-2, and 6% in Tier-3. The assumption here is that these members were not assigned an appropriate physician, so there is a high possibility that they will request yet another PCP. This is therefore an opportunity to improve the model.

      | PCP type | Member | Tier-1 count | Tier-1 % | Tier-2 count | Tier-2 % | Tier-3 count | Tier-3 % | Tier-4 count | Tier-4 % |
      | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
      | Non-Pediatric PCP | Adult | 35 | 54% | 8 | 27% | 10 | 63% | 11 | 69% |
      | Non-Pediatric PCP | Kid | 8 | 12% | 4 | 13% | 1 | 6% | 0 | 0% |
      | Pediatric PCP | Adult | 0 | 0% | 0 | 0% | 1 | 6% | 0 | 0% |
      | Pediatric PCP | Kid | 22 | 34% | 18 | 60% | 4 | 25% | 5 | 31% |
      | Total | | 65 | 100% | 30 | 100% | 16 | 100% | 16 | 100% |
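The per-tier percentages in this summary are column shares: each count divided by the number of switched members in that tier. A minimal sketch of that arithmetic follows. It is not the repository's code; the function name `tier_breakdown` is hypothetical, and the records are rebuilt from the counts reported above rather than read from the CSV, using the `Tier`, `Is_Ped`, and `Kid` column names from the field list.

```python
from collections import Counter


def tier_breakdown(records):
    """Map (Tier, Is_Ped, Kid) to (count, percentage of that tier's members)."""
    counts = Counter((r["Tier"], r["Is_Ped"], r["Kid"]) for r in records)
    tier_totals = Counter(r["Tier"] for r in records)
    return {
        key: (n, 100 * n / tier_totals[key[0]])
        for key, n in counts.items()
    }


# Counts copied from the summary above: (Tier, Is_Ped, Kid) -> members.
reported = {
    (1, 0, 0): 35, (1, 0, 1): 8, (1, 1, 1): 22,
    (2, 0, 0): 8, (2, 0, 1): 4, (2, 1, 1): 18,
    (3, 0, 0): 10, (3, 0, 1): 1, (3, 1, 0): 1, (3, 1, 1): 4,
    (4, 0, 0): 11, (4, 1, 1): 5,
}
records = [
    {"Tier": tier, "Is_Ped": is_ped, "Kid": kid}
    for (tier, is_ped, kid), n in reported.items()
    for _ in range(n)
]

breakdown = tier_breakdown(records)
# Kids (Kid == 1) whose PCP is not a pediatrician (Is_Ped == 0), per tier.
for tier in (1, 2, 3):
    count, pct = breakdown[(tier, 0, 1)]
    print(f"Tier-{tier}: {count} kids with a non-pediatric PCP ({pct:.1f}%)")
```

With the real dataset, the same function could be applied to the rows where `Outcome` is 1.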

Model Building

  • Used a Decision Tree Regressor to recommend the appropriate PCP to members; 70% of the PCP dataset was used for training and 30% for testing

  • Achieved a model accuracy of 96%
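Because only 4% of members switched PCP, it may help to read the accuracy figure next to a trivial baseline. The sketch below is an illustration under stated assumptions, not the repository's code: it rebuilds the `Outcome` labels from the counts reported in the data summary, performs a 70/30 split that is stratified by class (the README does not say whether the original split was stratified), and scores a hypothetical baseline that always predicts 0. All function names are illustrative.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices into train/test, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Labels rebuilt from the reported counts: 3,003 kept (0), 127 switched (1).
labels = [0] * 3003 + [1] * 127
train_idx, test_idx = stratified_split(labels)
y_test = [labels[i] for i in test_idx]

# Hypothetical baseline: predict "keeps the auto-assigned PCP" for everyone.
baseline_pred = [0] * len(y_test)
baseline_accuracy = accuracy(y_test, baseline_pred)
baseline_recall = recall(y_test, baseline_pred)
print(f"baseline accuracy: {baseline_accuracy:.2%}")
print(f"baseline recall for switchers: {baseline_recall:.2%}")
```

Reporting recall or a confusion matrix for the switched class alongside accuracy is one way to show how much the model adds over such a baseline.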
