Showing 2 changed files with 100 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "57f03a22-aedd-4fdd-ae79-af75722a3dd0", | ||
"metadata": {}, | ||
"source": [ | ||
"![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9a711085-097f-460b-b501-b591f2bbc416", | ||
"metadata": { | ||
"tags": [] | ||
}, | ||
"source": [ | ||
"# Module 2 Unit 5 - Getting Started with Data Science Tools\n", | ||
"\n", | ||
"### The case for code\n", | ||
"\n", | ||
"Data sets used by data scientists tend to be big. Large data sets can mean more accurate insights, and a greater variety of questions can be answered. Even ones too small and simple to be considered big data are still much larger than what we have traditionally worked with in schools.\n", | ||
"\n", | ||
"![city](../_images/Module2-Unit5-image.jpeg)\n", | ||
"\n", | ||
"Remember the [Census by Community](https://data.calgary.ca/Demographics/Census-by-Community-2019/rkfr-buzb) from Open Calgary that we looked at in a previous unit? That data set is relatively small, with only 306 rows and 142 columns.\n", | ||
"\n", | ||
"By contrast, the [Community Crime Statistics](https://data.calgary.ca/Health-and-Safety/Community-Crime-Statistics/78gh-n26t) from the same open data portal has 37,500 rows, 12 columns and contains several years of data.\n", | ||
"\n", | ||
"Calgary is a relatively large Canadian city with approximately 1.2 million people. But what if we wanted to compare crime data sets with another city, such as Toronto? At approximately 2.9 million people, Toronto is over double the size of Calgary and the [Major Crime Indicators (MCI) 2014 to 2019]() data set collected by the Toronto Police Service has 206,435 rows and 27 columns.\n", | ||
"\n", | ||
"If a data set is small, some analysis can be done inside a spreadsheet application like Microsoft Excel or Google Sheets, but it takes **code** to really unlock the potential of data science. Many fascinating data sets are simply too large for spreadsheet applications to efficiently work with, but beyond that, code gives us the ability to explore and visualize data in a much greater variety of ways. \n", | ||
"\n", | ||
"Instead of relying on the pre-packaged features in spreadsheet software for performing analysis and creating charts, code lets us re-create nearly any form of exploration or analysis we have heard of, or even invent our own.\n", | ||
"\n", | ||
"In terms of what you can do and create, it's like the difference between having a microwave, and having access to a fully-stocked professional kitchen.\n", | ||
"\n", | ||
"Most teachers and students will not become software developers or professional data scientists, but knowing our way around the basics gives us agency in an increasingly data-oriented world, and passing these skills to our students helps to break down digital inequity.\n", | ||
"\n", | ||
"Python logo\n", | ||
"\n", | ||
"In this course, we'll be using the coding language **Python**, and the web application **Jupyter Notebooks** to run it.\n", | ||
"\n", | ||
"This powerful and versatile combination is used by most professional data scientists.\n", | ||
"\n", | ||
"We'll be working from the **Callysto Hub**, a free educational environment where we can store data, write code, experiment with visualizations, and show our results.\n", | ||
"\n", | ||
"### Callysto Hub\n", | ||
"\n", | ||
"### Jupyter Notebooks\n", | ||
"\n", | ||
"\n", | ||
"A Jupyter notebook serves as both the text editor for writing code and the environment for running it and displaying the output.\n", | ||
"\n", | ||
"For many people familiar with coding, being able to write, and run code in the same document might be a new experience that takes some getting used to, however this system is great for new learners as they can work more quickly and don't need to access as many different tools.\n", | ||
"\n", | ||
"An example of structured data is the [Census by Community](https://data.calgary.ca/Demographics/Census-by-Community-2019/rkfr-buzb) from the City of Calgary's open data portal, [Open Calgary.](https://data.calgary.ca/)\n", | ||
"\n", | ||
"### Additional notes about Python and Jupyter notebooks\n", | ||
" \n", | ||
"### 🏁 Actvity\n", | ||
"\n", | ||
"### Conclusion\n", | ||
"\n", | ||
"In this module, we learned more about what data is, why it's useful, and where to find it. We also explored the qualities that make a data set useful for data science projects, and we familiarized ourselves with some tools for working with data sets with code, Python code and Jupyter notebooks.\n", | ||
"\n", | ||
"In the next module, we'll roll up our sleeves and put what we've learned to work. We'll learn how to add data to Jupyter notebooks and how to organize and transform it in ways that allow us to see it from new perspectives." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d55e408c-db2a-42bb-9426-fd5cf38946a4", | ||
"metadata": {}, | ||
"source": [ | ||
"[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |