pandas_multi_idx.md

Multi row

creation

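import pandas as pd

df_rows = ['row0', 'row1']   # columns that become the index levels
df_cols = ['col0', 'col1']   # regular data columns
df = pd.DataFrame(columns=(df_rows + df_cols))
df = df.set_index(df_rows)   # set_index returns a new frame, so assign the result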
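
As a quick check of the creation step above, the following sketch builds the same empty frame and inspects the result. The names `index_levels`, `value_cols` and `empty` are illustrative and not part of the original notes.

```python
import pandas as pd

index_levels = ["row0", "row1"]
value_cols = ["col0", "col1"]

empty = pd.DataFrame(columns=index_levels + value_cols)
# set_index does not modify the frame in place by default; keep the return value.
empty = empty.set_index(index_levels)

index_names = list(empty.index.names)   # names of the row index levels
column_names = list(empty.columns)      # remaining data columns
```

The two index columns disappear from `columns` and reappear as the names of the row index levels.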

adding new row

(with values for several columns)

In [27]: df.loc[('321', 4),['predicted_y', 'actual_y', 'predicted_full', 'actual_full']] =  (
           [20, 50, np.array((10, 20, 30), dtype='O'), [40, 50, 60]])

In [28]: df
Out[28]: 
                 predicted_y actual_y predicted_full   actual_full
subj_id org_clip                                                  
123     3                  2        5      [1, 2, 3]     [4, 5, 6]
321     4                 20       50   [10, 20, 30]  [40, 50, 60]
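
A minimal, self-contained sketch of the same idea with scalar values only. The frame `scores` and its contents are made up for illustration; the point is that assigning to a `(level0, level1)` key that does not exist yet appends a row.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([("123", 3)], names=["subj_id", "org_clip"])
scores = pd.DataFrame({"predicted_y": [2], "actual_y": [5]}, index=idx)

# The tuple addresses one row across both index levels; ":" means all columns.
scores.loc[("321", 4), :] = [20, 50]

n_rows = len(scores)
new_pred = scores.loc[("321", 4), "predicted_y"]
```

The number of values on the right-hand side has to match the number of columns selected on the left.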

slicing

Slice by 2nd level index (see https://stackoverflow.com/questions/53927460/select-rows-in-pandas-multiindex-dataframe). Here 'two' is the name of the 2nd index level:

df.query("two > 5")
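
A small sketch of `query` on a named index level. The frame and the level names `one` and `two` are assumptions chosen to match the expression above.

```python
import pandas as pd

idx = pd.MultiIndex.from_product([["a", "b"], [3, 6, 9]], names=["one", "two"])
frame = pd.DataFrame({"value": range(6)}, index=idx)

# query() can refer to index level names the same way it refers to columns.
subset = frame.query("two > 5")
kept_levels = sorted(subset.index.get_level_values("two").unique())
```

This only works when the index levels are named; unnamed levels cannot be referenced this way.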

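Slice a single 2nd-level label by exact match (e.g. 'close' from OHLC data): keep every 1st-level value, keep only the 'close' rows, then drop the now-redundant 2nd level. Pass the plain label 'close'; slice('close') would mean "everything up to 'close'", not an exact match.

prices.stack().loc[(slice(None), 'close'), :].droplevel(1)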
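
The selection can be sketched on a frame that is already in the long `(date, field)` layout, which is roughly what `prices.stack()` is assumed to produce here. The data, the level names and the `xs` variant are illustrative additions, not part of the original notes.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["d1", "d2"], ["close", "open"]], names=["date", "field"]
)
long_prices = pd.DataFrame(
    {"AAA": [2.0, 1.0, 6.0, 5.0], "BBB": [4.0, 3.0, 8.0, 7.0]}, index=idx
)

# slice(None) keeps every date; the plain label "close" matches level 1 exactly.
close_rows = long_prices.loc[(slice(None), "close"), :]
close_only = close_rows.droplevel(1)

# xs() selects on one level and drops that level in a single call.
close_xs = long_prices.xs("close", level="field")
```

Both routes should give a frame indexed by date only, so `xs` may be the shorter option when a single label is wanted.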

deleting top level index

drop level 0

df = df.droplevel(0)
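
A short sketch of dropping the top row level, using made-up data. `droplevel` returns a new frame, which is why the line above reassigns `df`.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("123", 3), ("321", 4)], names=["subj_id", "org_clip"]
)
frame = pd.DataFrame({"predicted_y": [2, 20]}, index=idx)

# Level 0 ("subj_id") is removed; the remaining level becomes a flat index.
flat = frame.droplevel(0)
remaining_name = flat.index.name
remaining_labels = list(flat.index)
```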

Multi col

creation

An empty DataFrame with multi-level columns should be created with at least one column tuple in advance.

df = pd.DataFrame(columns=[('idx0','idx0a'),('idx0','idx0b')])
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['upper_level','lower_level'])
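
As an alternative sketch, the `MultiIndex` can be built first and passed straight to the constructor, which avoids the second assignment. The variable names are illustrative.

```python
import pandas as pd

tuples = [("idx0", "idx0a"), ("idx0", "idx0b")]
columns = pd.MultiIndex.from_tuples(tuples, names=["upper_level", "lower_level"])

# Passing the MultiIndex directly gives an empty frame with two column levels.
empty = pd.DataFrame(columns=columns)

n_levels = empty.columns.nlevels
level_names = list(empty.columns.names)
```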

concatenating several dataframes into a single multi-col dataframe

dfl = [a, b, c, d]
# keys become the top-level column labels: one hashable label per dataframe
df = pd.concat(dfl, axis=1, keys=['a', 'b', 'c', 'd'])
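
A runnable sketch of the concatenation with two small made-up frames. The keys are plain strings here, since they end up as the labels of the top column level.

```python
import pandas as pd

a = pd.DataFrame({"open": [1.0, 2.0], "close": [1.5, 2.5]})
b = pd.DataFrame({"open": [3.0, 4.0], "close": [3.5, 4.5]})

# One key per frame; each key becomes a top-level column label.
combined = pd.concat([a, b], axis=1, keys=["a", "b"])

top_level = list(combined.columns.get_level_values(0).unique())
b_close = combined[("b", "close")].tolist()
```

Passing a dict such as `{"a": a, "b": b}` to `pd.concat` is another way to supply the labels together with the frames.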

adding new col

df[symbol,'fee']=np.nan
df[symbol,'price']=np.nan

adding row

df.loc[today, (symbol,'fee')]   = val
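
The two steps above (new column, then new row) can be sketched together as follows. The ticker names, dates and values are hypothetical.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_tuples([("AAA", "price")], names=["symbol", "field"])
ledger = pd.DataFrame([[10.0]], index=["2024-01-01"], columns=columns)

symbol = "BBB"  # hypothetical ticker
# A (top, bottom) key that does not exist yet creates a new column.
ledger[symbol, "fee"] = np.nan
# A row label that does not exist yet creates a new row; other cells stay NaN.
ledger.loc["2024-01-02", (symbol, "fee")] = 0.25

new_fee = ledger.loc["2024-01-02", (symbol, "fee")]
old_fee_is_nan = bool(np.isnan(ledger.loc["2024-01-01", (symbol, "fee")]))
```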

Move a column ('symbol') up into the top column level

df.reset_index().set_index(['timestamp','symbol']).unstack(1).swaplevel(0,1, axis=1)

or with pivot

cols=df.columns[df.columns != 'symbol']
df.reset_index().pivot(index='timestamp', columns = 'symbol', values=cols).swaplevel(0,1, axis=1)
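
A sketch of the unstack route on a small made-up table. It starts from flat columns, so the `reset_index()` step from the notes is not needed; the final `sort_index` is an addition that groups the columns by symbol.

```python
import pandas as pd

trades = pd.DataFrame(
    {
        "timestamp": ["t1", "t1", "t2", "t2"],
        "symbol": ["AAA", "BBB", "AAA", "BBB"],
        "price": [1.0, 2.0, 3.0, 4.0],
        "fee": [0.1, 0.2, 0.3, 0.4],
    }
)

wide = (
    trades.set_index(["timestamp", "symbol"])
    .unstack("symbol")          # columns become (field, symbol)
    .swaplevel(0, 1, axis=1)    # reorder to (symbol, field)
    .sort_index(axis=1)         # group the columns by symbol
)
bbb_price = wide[("BBB", "price")].tolist()
```

Each `(timestamp, symbol)` pair must be unique for `unstack` to succeed.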

deleting top level index

df.columns = df.columns.droplevel()

slicing

Slice 2nd level and remove it

df1=df.loc[:, (slice(None), 'close')]
df1.columns = df1.columns.droplevel(1)
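
The same slice, sketched on a tiny frame with made-up symbols. The `.copy()` is an addition so that renaming the columns clearly acts on an independent frame.

```python
import pandas as pd

columns = pd.MultiIndex.from_product([["AAA", "BBB"], ["close", "open"]])
prices = pd.DataFrame([[2.0, 1.0, 4.0, 3.0]], columns=columns)

# slice(None) keeps every top-level symbol; "close" picks one 2nd-level label.
closes = prices.loc[:, (slice(None), "close")].copy()
closes.columns = closes.columns.droplevel(1)

close_cols = list(closes.columns)
first_row = closes.iloc[0].tolist()
```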

get level indexes

level 0 columns:

df.columns = df.columns.remove_unused_levels()  # Do not skip this two-step process, or the columns end up in an inconsistent state
df.columns.remove_unused_levels().levels[0]
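
Why `remove_unused_levels` matters can be seen in a small sketch: after selecting a subset of columns, `levels` still lists labels that are no longer in use. The frame and names are illustrative.

```python
import pandas as pd

columns = pd.MultiIndex.from_product([["AAA", "BBB"], ["close", "open"]])
prices = pd.DataFrame([[2.0, 1.0, 4.0, 3.0]], columns=columns)

only_aaa = prices[["AAA"]]  # keep one top-level symbol

# levels is not trimmed by the selection, so "BBB" is still listed.
stale = list(only_aaa.columns.levels[0])
# remove_unused_levels() returns a new MultiIndex with only the labels in use.
fresh = list(only_aaa.columns.remove_unused_levels().levels[0])
```

`only_aaa.columns.get_level_values(0).unique()` is another way to list only the labels actually present.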