Commit
Python Lists Medical Insurance Estimation Project
Showing 6 changed files with 2,238 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Machine Learning Codecademy Project |
373 changes: 373 additions & 0 deletions
... Estimation/.ipynb_checkpoints/Python Lists Medical Insurance Estimation-checkpoint.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,373 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "79f9fe31", | ||
"metadata": {}, | ||
"source": [ | ||
"# Python Lists: Medical Insurance Estimation Project" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5bfa2012", | ||
"metadata": {}, | ||
"source": [ | ||
"In this project, you will examine how factors such as age, sex, BMI, number of children, and smoking status contribute to medical insurance costs.\n", | ||
"\n", | ||
"You will apply your new knowledge of Python Lists to store insurance cost data in a list as well as compare **estimated** insurance costs to **actual** insurance costs.\n", | ||
"\n", | ||
"Let's get started!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "454063d9", | ||
"metadata": {}, | ||
"source": [ | ||
"## Creating a List" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5eb6fdde", | ||
"metadata": {}, | ||
"source": [ | ||
"1. First, take a look at the code in the code block below.\n", | ||
"\n", | ||
" The function `estimate_insurance_cost()` estimates the medical insurance cost for an individual, based on five variables:\n", | ||
" - `age`: age of the individual in years\n", | ||
" - `sex`: 0 for female, 1 for male\n", | ||
" - `bmi`: individual's body mass index\n", | ||
" - `num_of_children`: number of children the individual has\n", | ||
" - `smoker`: 0 for a non-smoker, 1 for a smoker\n", | ||
" \n", | ||
" These variables are used in the following formula to estimate an individual's insurance cost (in USD):\n", | ||
" \n", | ||
" $$\n", | ||
" insurance\\_cost = 250*age - 128*sex + 370*bmi + 425*num\\_of\\_children + 24000*smoker - 12500\n", | ||
" $$\n", | ||
" \n", | ||
" Below the code, observe the estimated insurance costs for three individuals: Maria, Rohan, and Valentina." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"id": "a6e98fae", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Maria's Estimated Insurance Cost: 4222.0 dollars.\n", | ||
"Rohan's Estimated Insurance Cost: 5442.0 dollars.\n", | ||
"Valentina's Estimated Insurance Cost: 36368.0 dollars.\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"# Function to estimate insurance cost:\n", | ||
"def estimate_insurance_cost(name, age, sex, bmi, num_of_children, smoker):\n", | ||
" estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500\n", | ||
" print(name + \"'s Estimated Insurance Cost: \" + str(estimated_cost) + \" dollars.\")\n", | ||
" return estimated_cost\n", | ||
"\n", | ||
"# Estimate Maria's insurance cost\n", | ||
"maria_insurance_cost = estimate_insurance_cost(name = \"Maria\", age = 31, sex = 0, bmi = 23.1, num_of_children = 1, smoker = 0)\n", | ||
"\n", | ||
"# Estimate Rohan's insurance cost\n", | ||
"rohan_insurance_cost = estimate_insurance_cost(name = \"Rohan\", age = 25, sex = 1, bmi = 28.5, num_of_children = 3, smoker = 0)\n", | ||
"\n", | ||
"# Estimate Valentina's insurance cost\n", | ||
"valentina_insurance_cost = estimate_insurance_cost(name = \"Valentina\", age = 53, sex = 0, bmi = 31.4, num_of_children = 0, smoker = 1)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "87f5d6d5", | ||
"metadata": {}, | ||
"source": [ | ||
"2. We want to compare the estimated insurance costs (as calculated by our function) to the actual amounts that Maria, Rohan, and Valentina paid.\n", | ||
"\n", | ||
" Create a list called `names` and fill it with the names of individuals you are estimating insurance costs for:\n", | ||
" - `\"Maria\"`\n", | ||
" - `\"Rohan\"`\n", | ||
" - `\"Valentina\"`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "6e4218d8", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7aad0105", | ||
"metadata": {}, | ||
"source": [ | ||
"3. Next, create a list called `insurance_costs` and fill it with the actual amounts that Maria, Rohan, and Valentina paid for insurance:\n", | ||
" - `4150.0`\n", | ||
" - `5320.0`\n", | ||
" - `35210.0`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "93fc21ce", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "752b283b", | ||
"metadata": {}, | ||
"source": [ | ||
"## Combining Lists" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "bbc0c97c", | ||
"metadata": {}, | ||
"source": [ | ||
"4. Currently the `names` and `insurance_costs` lists are separate, but we want each name to be paired with an insurance cost.\n", | ||
"\n", | ||
" Create a new variable called `insurance_data` that combines `names` and `insurance_costs` using the `zip()` function.\n", | ||
" \n", | ||
" Print this new variable." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "42f299ef", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "f1492a2d", | ||
"metadata": {}, | ||
"source": [ | ||
"5. The output should look something like:\n", | ||
"\n", | ||
" ```\n", | ||
" <zip object at 0x7f1631e86b48>\n", | ||
" ```\n", | ||
" \n", | ||
" This output does not mean much to us. To change it to a format we can actually understand, we must convert the `zip` object to a list by doing the following:\n", | ||
" \n", | ||
" ```\n", | ||
" list(zip(____, ____))\n", | ||
" ```\n", | ||
" \n", | ||
" Convert the `insurance_data` object to a list using this method. Run the code to see the result - you should now see a list of names and insurance costs." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "ef0311a1", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "ff1c4435", | ||
"metadata": {}, | ||
"source": [ | ||
"## Appending to a List" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0ea69efb", | ||
"metadata": {}, | ||
"source": [ | ||
"6. Next, create an empty list called `estimated_insurance_data`.\n", | ||
"\n", | ||
" This is the list we'll use to store the estimated insurance costs for our three individuals." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "02025200", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7eb29c6b", | ||
"metadata": {}, | ||
"source": [ | ||
"7. We want to add our estimated insurance data for Maria, Rohan, and Valentina to the `estimated_insurance_data` list.\n", | ||
"\n", | ||
" Use `.append()` to add `(\"Maria\", maria_insurance_cost)` to `estimated_insurance_data`. Do the same for Rohan and Valentina." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "f0177c39", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "99fb3c71", | ||
"metadata": {}, | ||
"source": [ | ||
"8. Print `estimated_insurance_data`.\n", | ||
"\n", | ||
" Make sure the output is what you expected." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d7066b0c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "21da68be", | ||
"metadata": {}, | ||
"source": [ | ||
"## Inspecting the data" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c41d4c11", | ||
"metadata": {}, | ||
"source": [ | ||
"9. In the output, you should see two lists. The first one represents the **actual** insurance cost data and the second one represents the **estimated** insurance cost data.\n", | ||
"\n", | ||
" However, it's difficult to know this just by looking at the output. As a data scientist, you want to make sure that your data is clean and easy to understand.\n", | ||
" \n", | ||
" Add a label to the print statement for `insurance_data` so that it's clear what the list contains. The output of the print statement should look like:\n", | ||
" \n", | ||
" ```\n", | ||
" Here is the actual insurance cost data: [...list output...]\n", | ||
" ```" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "0da6c2e5", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "48c95b49", | ||
"metadata": {}, | ||
"source": [ | ||
"10. Do the same for the print statement that prints `estimated_insurance_data`. The output should look like:\n", | ||
"\n", | ||
" ```\n", | ||
" Here is the estimated insurance cost data: [...list output...]\n", | ||
" ```" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "dc701f7b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "34e7aaa4", | ||
"metadata": {}, | ||
"source": [ | ||
"11. See the results from both tasks above.\n", | ||
"\n", | ||
" It should be much clearer from the output what each of the two lists represents, helping you better understand the data you're working with.\n", | ||
" \n", | ||
" You may notice that there are differences between the actual insurance costs and estimated insurance costs. This means that our `estimate_insurance_cost()` function does not calculate insurance costs with 100% accuracy.\n", | ||
" \n", | ||
" Compare the estimated insurance data to the actual insurance data. Do the estimated insurance costs seem to be overestimated or underestimated?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "3bce2ea2", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5341142c", | ||
"metadata": {}, | ||
"source": [ | ||
"## Extra" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "761c4808", | ||
"metadata": {}, | ||
"source": [ | ||
"12. Congratulations! In this project, you used Python lists to store **estimated** insurance cost data and then compare that data to **actual** insurance cost data.\n", | ||
"\n", | ||
" As you've seen, lists are data structures in Python that can contain multiple pieces of data in a single object. As a data scientist, you'll find yourself working with this data structure quite often. You now have a solid foundation to move forward in your data science journey!\n", | ||
" \n", | ||
" If you'd like additional practice on lists, here are some ways you might extend this project:\n", | ||
" - Calculate the difference between the actual insurance cost data and the estimated insurance cost data for each individual, and store the results in a list called `insurance_cost_dif`.\n", | ||
" - Estimate the insurance cost for a new individual, Akira, who is a 19-year-old male non-smoker with no children and a BMI of 27.1. Make sure to append his name to `names` and his actual insurance cost, `2930.0`, to `insurance_costs`.\n", | ||
" \n", | ||
" Happy coding!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "8f50b08c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.7.11" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
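The code cells for steps 2 through 11 are empty in this commit. The following is a minimal sketch of one way they might be filled in, not the author's or Codecademy's solution. It reuses `estimate_insurance_cost()` and the input values from the notebook as shown above. The list comprehension and the rounding to two decimals in the last step are assumptions made here for illustration; the notebook only asks for a list called `insurance_cost_dif`.

```python
# Function copied from the notebook's first code cell.
def estimate_insurance_cost(name, age, sex, bmi, num_of_children, smoker):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(name + "'s Estimated Insurance Cost: " + str(estimated_cost) + " dollars.")
    return estimated_cost

maria_insurance_cost = estimate_insurance_cost(
    name="Maria", age=31, sex=0, bmi=23.1, num_of_children=1, smoker=0)
rohan_insurance_cost = estimate_insurance_cost(
    name="Rohan", age=25, sex=1, bmi=28.5, num_of_children=3, smoker=0)
valentina_insurance_cost = estimate_insurance_cost(
    name="Valentina", age=53, sex=0, bmi=31.4, num_of_children=0, smoker=1)

# Steps 2 and 3: names and the actual amounts paid, in the same order.
names = ["Maria", "Rohan", "Valentina"]
insurance_costs = [4150.0, 5320.0, 35210.0]

# Steps 4 and 5: zip() returns a lazy iterator, so wrap it in list()
# to get a printable list of (name, cost) tuples.
insurance_data = list(zip(names, insurance_costs))

# Steps 6 and 7: start with an empty list and append one tuple per person.
estimated_insurance_data = []
estimated_insurance_data.append(("Maria", maria_insurance_cost))
estimated_insurance_data.append(("Rohan", rohan_insurance_cost))
estimated_insurance_data.append(("Valentina", valentina_insurance_cost))

# Steps 8 to 10: label each list so the output is self-explanatory.
print("Here is the actual insurance cost data: " + str(insurance_data))
print("Here is the estimated insurance cost data: " + str(estimated_insurance_data))

# Extra (assumed approach): actual minus estimated for each individual.
# A negative value means the estimate was higher than the actual cost.
insurance_cost_dif = [
    round(actual - estimated, 2)
    for (_, actual), (_, estimated) in zip(insurance_data, estimated_insurance_data)
]
print("Actual minus estimated: " + str(insurance_cost_dif))
```

With the figures given in the notebook, every estimate is higher than the corresponding actual amount (4222.0 vs. 4150.0, 5442.0 vs. 5320.0, 36368.0 vs. 35210.0), so the formula overestimates for these three individuals.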