Sadi edited this page May 9, 2017 · 14 revisions

pandas is an open source library that provides high-performance data structures and data analysis tools for Python.

This page collects some short lessons on using pandas.

Read files (CSV)

To read a CSV file, use the read_csv function. Here is an example:

import pandas as pd

data = pd.read_csv('../waltti.csv', quotechar='"', low_memory=False)

data = data.reindex(columns=['column1', 'column2', 'stopid'])
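As a minimal sketch of these two calls, the example below reads an in-memory CSV instead of ../waltti.csv. The sample rows and the 'extra' column are made up for illustration. It also shows what reindex does: it keeps only the listed columns, in the listed order, and fills any listed column that is missing from the file with NaN.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real CSV file.
csv_text = 'stopid,column1,extra\n1,"a, quoted",x\n2,b,y\n'

# quotechar='"' lets a quoted field contain the delimiter.
data = pd.read_csv(io.StringIO(csv_text), quotechar='"', low_memory=False)

# Keep only the listed columns, in this order. 'column2' is not in the
# sample data, so reindex creates it and fills it with NaN; 'extra' is dropped.
data = data.reindex(columns=['column1', 'column2', 'stopid'])
print(data)
```

If the goal is only to select existing columns, data[['column1', 'stopid']] raises a KeyError on a misspelled name instead of silently creating an empty column.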

Get unique data

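To keep only the first row for each unique value of a column in a DataFrame, we can use the duplicated function. It marks every repeated value as True, so negating the result with ~ selects the first occurrences.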

lines = data.loc[~data['lineid'].duplicated()]
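A small sketch of this on made-up data (the values below are illustrative only). It also shows drop_duplicates, which should give the same result here in a single call.

```python
import pandas as pd

# Hypothetical data: lineid 10 and 20 each appear twice.
data = pd.DataFrame({
    'lineid': [10, 10, 20, 30, 20],
    'stopid': [1, 2, 3, 4, 5],
})

# duplicated() is True for every repeat after the first occurrence,
# so ~duplicated() keeps the first row seen for each lineid.
lines = data.loc[~data['lineid'].duplicated()]

# Equivalent shortcut for this case.
same = data.drop_duplicates(subset='lineid')

print(lines)
```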

Filter data

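To select the rows that match several conditions, combine boolean masks with & and wrap each condition in parentheses:

filtered = data[(data["zone"] == int(zone)) & (data["variation"] == str(variation))]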
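A runnable sketch of this kind of filter, assuming zone and variation arrive as strings (for example from a request parameter). The sample rows are made up, and the result is named filtered here so that it does not shadow Python's built-in filter.

```python
import pandas as pd

# Hypothetical data for illustration.
data = pd.DataFrame({
    'zone': [1, 2, 1],
    'variation': ['A', 'A', 'B'],
    'stopid': [100, 200, 300],
})

# Assumed inputs, e.g. taken from user input as strings.
zone = '1'
variation = 'A'

# Each condition yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than ==.
filtered = data[(data['zone'] == int(zone)) & (data['variation'] == str(variation))]
print(filtered)
```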

Export data

pandas can export data to various formats. A DataFrame has the to_dict, to_json and to_html methods, and many more.

Here are some examples:

out = lines.to_dict(orient="index") # dictionary keyed by index label

out = lines.to_json(orient='records') # JSON string, one object per row
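To make the two orient values concrete, here is a sketch on a made-up two-row DataFrame. The column names and index labels are illustrative assumptions.

```python
import json

import pandas as pd

# Hypothetical data with string index labels.
lines = pd.DataFrame(
    {'lineid': [10, 20], 'name': ['North', 'South']},
    index=['a', 'b'],
)

# orient='index': outer keys are index labels, inner keys are column names.
as_dict = lines.to_dict(orient='index')

# orient='records': a JSON array with one object per row; the index is dropped.
as_json = lines.to_json(orient='records')
records = json.loads(as_json)

print(as_dict)
print(as_json)
```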

Adding a new column

To add a new column to an existing DataFrame, assign a value to it. A single value is repeated for every row:

df.loc[:, 'NewCol'] = 'New_Val'

df.loc[:, 'latitude'] = 0
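A short sketch on a made-up DataFrame. It also shows plain bracket assignment, which is a common alternative for adding a whole column. The 'longitude' column is an illustrative assumption.

```python
import pandas as pd

# Hypothetical starting DataFrame.
df = pd.DataFrame({'stopid': [1, 2, 3]})

# A scalar on the right-hand side is broadcast to every row.
df.loc[:, 'NewCol'] = 'New_Val'
df.loc[:, 'latitude'] = 0

# Bracket assignment does the same thing for a whole column.
df['longitude'] = 0.0

print(df)
```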

Iterating over one DataFrame and appending data on another DataFrame

Here we iterate over the stops information (stdata) and copy each stop's coordinates into the matching rows of the main DataFrame (data).

for i, row in stdata.iterrows():
    stop_id = int(row['stop_id'])
    data.loc[(data.stopid == stop_id), 'lat'] = row['stop_lat']
    data.loc[(data.stopid == stop_id), 'long'] = row['stop_lon']
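Row-by-row loops with iterrows can be slow on large DataFrames. As an alternative to the loop (not a restatement of it), the same lookup can be sketched with a left merge. This assumes each stop_id appears only once in stdata, and the sample values below are made up.

```python
import pandas as pd

# Hypothetical main DataFrame and stops table.
data = pd.DataFrame({'stopid': [1, 2, 1, 3]})
stdata = pd.DataFrame({
    'stop_id': ['1', '2'],
    'stop_lat': [60.1, 60.2],
    'stop_lon': [24.9, 25.0],
})

# Build a lookup table whose key and column names match the main DataFrame.
stops = stdata.assign(stopid=stdata['stop_id'].astype(int))
stops = stops.rename(columns={'stop_lat': 'lat', 'stop_lon': 'long'})
stops = stops[['stopid', 'lat', 'long']]

# how='left' keeps every row of data; stops with no match get NaN.
merged = data.merge(stops, on='stopid', how='left')
print(merged)
```

Unlike the loop, merge returns a new DataFrame with a fresh index rather than modifying data in place, and a stop_id repeated in stdata would produce duplicated rows.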

Get max value

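To get the row that holds the maximum value of a column, look up the index label returned by idxmax:

df.loc[df['Value'].idxmax()]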
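A small sketch of the difference between the maximum value and the row that contains it. The data and index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data with string index labels.
df = pd.DataFrame(
    {'Name': ['a', 'b', 'c'], 'Value': [3, 9, 5]},
    index=['x', 'y', 'z'],
)

# idxmax() returns the index label of the first occurrence of the maximum.
label = df['Value'].idxmax()

# loc with that label returns the whole row as a Series.
row = df.loc[label]

# max() returns only the value itself.
max_value = df['Value'].max()

print(label, max_value)
print(row)
```

If the index contains duplicate labels, df.loc[label] can return several rows instead of one.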