This course teaches you how to build pipelines to import data kept in common storage formats. You’ll use pandas, a major Python library for analytics, to get data from a variety of sources, from spreadsheets of survey responses, to a database of public service requests, to an API for a popular review site. Along the way, you’ll learn how to fine-tune imports to get only what you need and to address issues like incorrect data types. Finally, you’ll assemble a custom dataset from a mix of sources.
- `read_csv()` and `read_excel()`
- Setting data types, choosing data to load, handling missing data and errors: `usecols`, `nrows`, `skiprows`, `names`, `dtype`, `na_values`, `.isna()`, `sheet_name` (see the first sketch below)
- `parse_dates` (for standard datetime formats) and `pd.to_datetime()` (for parsing non-standard date formats)
- Putting multiple spreadsheets together by iterating through a dictionary of DataFrames (see the second sketch below)
- Setting custom true/false values
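A minimal sketch of how these import-tuning parameters fit together in one `read_csv()` call. The file name, column names, and the values treated as missing are hypothetical stand-ins, not the course's data.

```python
import pandas as pd

# Hypothetical file and columns, chosen only for illustration.
survey = pd.read_csv(
    "survey.csv",
    usecols=["id", "age", "created", "modified"],  # load only these columns
    nrows=1000,                          # read at most 1,000 rows
    skiprows=[1],                        # skip a known-bad line (0-indexed)
    dtype={"id": str, "modified": str},  # force column data types
    na_values={"age": [0]},              # treat 0 in "age" as missing
    parse_dates=["created"],             # parse a standard datetime column
)

# .isna() flags missing values; summing counts them per column.
print(survey.isna().sum())

# To replace a messy header instead, pair skiprows=1 with
# names=["id", "age", "created", "modified"].

# pd.to_datetime() handles non-standard formats, e.g. "12311999".
survey["modified"] = pd.to_datetime(survey["modified"], format="%m%d%Y")
```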
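A second sketch, covering `sheet_name`, custom true/false values, and combining sheets; the workbook layout and column names are again assumptions.

```python
import pandas as pd

# Hypothetical workbook with one sheet per year and a "completed"
# column recorded as "Yes"/"No".
sheets = pd.read_excel(
    "responses.xlsx",
    sheet_name=None,            # None loads every sheet into a dict of DataFrames
    dtype={"completed": bool},
    true_values=["Yes"],        # custom value read as True
    false_values=["No"],        # custom value read as False
)

# Iterate through the dictionary, tag each DataFrame with its sheet
# name, and stack them into one dataset.
frames = []
for sheet, frame in sheets.items():
    frame["year"] = sheet
    frames.append(frame)

all_responses = pd.concat(frames, ignore_index=True)
```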
- `read_sql()` and `sqlalchemy`
- SQL `SELECT`, `WHERE`, aggregate functions, and joins; installing SQLite and importing a `.db` file
- sqlalchemy's `create_engine()` makes an engine to handle database connections (see the sketch below)
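A sketch of the database workflow under two assumptions: a SQLite file named `data.db` and made-up `service_calls` and `weather` tables. Only `create_engine()` and `read_sql()` come from the notes above.

```python
import pandas as pd
from sqlalchemy import create_engine

# create_engine() builds the engine that manages database connections;
# "sqlite:///data.db" points at a SQLite file in the working directory.
engine = create_engine("sqlite:///data.db")

# read_sql() accepts either a table name or a full query, so SELECT,
# WHERE, aggregate functions, and joins all work here.
query = """
SELECT service_calls.borough,
       COUNT(*) AS call_count
  FROM service_calls
  JOIN weather
    ON service_calls.created_date = weather.date
 WHERE weather.tmax > 90
 GROUP BY service_calls.borough;
"""
hot_day_calls = pd.read_sql(query, engine)
```

Passing the engine rather than a raw connection lets pandas open and close connections itself, which is why `create_engine()` is the usual entry point.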
- `read_json()`, `json_normalize()`, and `requests`
- Working with APIs and nested JSON
- Appending and merging datasets: `requests.get()` (get the data from an API), `.json()` (extract the JSON payload), `.append()` (deprecated; modern pandas uses `pd.concat()`), and `.merge()` (see the sketch below)
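A sketch of the API-to-DataFrame path. The URL, auth header, query parameters, and response shape are invented for illustration, and the merge key is made up too.

```python
import pandas as pd
import requests

# Hypothetical endpoint, credentials, and parameters.
api_url = "https://api.example.com/businesses/search"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
params = {"term": "cafe", "location": "NYC"}

response = requests.get(api_url, headers=headers, params=params)  # get the data from the API
data = response.json()                    # extract the JSON payload as a dict

# json_normalize() flattens nested records: nested keys become
# "parent_child" columns with sep="_".
cafes = pd.json_normalize(data["businesses"], sep="_")

# Fetch a second page and append it (via pd.concat(), since
# DataFrame.append() was removed in pandas 2.0).
params["offset"] = 20
page2 = requests.get(api_url, headers=headers, params=params).json()
cafes = pd.concat(
    [cafes, pd.json_normalize(page2["businesses"], sep="_")],
    ignore_index=True,
)

# Merge with a second (made-up) dataset on a shared key.
ratings = pd.DataFrame({"id": ["a1", "b2"], "rating": [4.5, 3.0]})
cafes_rated = cafes.merge(ratings, on="id")
```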