
3. Lambda Functions

Klementyna Kasraie edited this page Dec 16, 2019 · 2 revisions

What is a lambda function

Lambda functions are small, anonymous functions in Python. Anything a lambda function does can also be done with an ordinary named function defined with def, but a small, unnamed function is sometimes more convenient and more readable.

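Lambda functions have the following syntax: lambda (arguments): (return value). The arguments are a comma-separated list of names, and the body is a single expression whose value is returned. To create a lambda function that returns the square of a number, you could write lambda x: x*x.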
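To make the syntax concrete, here is a minimal sketch comparing a lambda function with the equivalent named function. The names square_lambda, square_def, and multiply are illustrative only and do not come from this page.

```python
# A lambda and the equivalent named function (names are illustrative).
square_lambda = lambda x: x * x

def square_def(x):
    return x * x

# A lambda may take several arguments, separated by commas.
multiply = lambda x, y: x * y

print(square_lambda(5))  # 25
print(square_def(5))     # 25
print(multiply(3, 4))    # 12
```

Binding a lambda to a name is done here only for comparison; when a function needs a name, def is usually the clearer choice.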

A lambda function can be called as soon as it is created. For example, in a Python interpreter:

>>> (lambda x: x*x)(5)
# returns: 25
>>> (lambda x, y: x*y)(3, 4)
# returns: 12
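In practice, a lambda is more often passed straight to another function than called on the spot. As a small sketch, the built-in functions sorted and map both accept a function argument; the cities list below is made up for illustration.

```python
# Hypothetical data: (city, average temperature in Fahrenheit) pairs.
cities = [("Tucson", 100), ("Chicago", 50), ("Atlanta", 90)]

# Sort by temperature, using a lambda as the key function.
by_temp = sorted(cities, key=lambda pair: pair[1])

# Convert each temperature to Celsius with map and a lambda.
temps_c = list(map(lambda pair: round((pair[1] - 32) * 5 / 9), cities))

print(by_temp)  # [('Chicago', 50), ('Atlanta', 90), ('Tucson', 100)]
print(temps_c)  # [38, 10, 32]
```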

A nice explanation of lambda functions and how to use them can be found here: https://realpython.com/python-lambda/.

Using lambda functions in pandas

One common place where lambda functions are used is processing data in a list or pandas dataframe. Suppose we have a dataframe df showing some cities and their average temperatures (this data is entirely made up for the purpose of illustration):

       city  region  avg_temp_F
0        NY    East          60
1   Atlanta    East          90
2  St. Luis    East          95
3    Tucson    West         100
4        LA    West          92
5   Chicago  Middle          50
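If you want to follow along, one way to build this dataframe is sketched below. It assumes pandas is installed and imported as pd; the values are copied from the table above.

```python
import pandas as pd

# Rebuild the example dataframe shown above (the values are made up).
df = pd.DataFrame({
    "city": ["NY", "Atlanta", "St. Luis", "Tucson", "LA", "Chicago"],
    "region": ["East", "East", "East", "West", "West", "Middle"],
    "avg_temp_F": [60, 90, 95, 100, 92, 50],
})

print(df)
```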

We want to convert the average temperatures to Celsius. We'll create a new column, avg_temp_C, and use a lambda function together with apply to do the conversion:

df['avg_temp_C'] = df.avg_temp_F.apply(lambda F: round((F-32)*5/9))

Now the df looks like:

       city  region  avg_temp_F  avg_temp_C
0        NY    East          60          16
1   Atlanta    East          90          32
2  St. Luis    East          95          35
3    Tucson    West         100          38
4        LA    West          92          33
5   Chicago  Middle          50          10

Using apply, we can also operate on entire rows of the dataframe, not just a single column. You must specify axis=1 so that apply passes each row, rather than each column, to the function:

df.apply(lambda x: x.city + ' is ' + ('hot' if x.avg_temp_F > 70 else 'cold') + '.', axis=1) 
# returns:
# 0         NY is cold.
# 1     Atlanta is hot.
# 2    St. Luis is hot.
# 3      Tucson is hot.
# 4          LA is hot.
# 5    Chicago is cold.
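As noted at the top of this page, a lambda function can always be replaced by a named function. The sketch below repeats the row-wise example with a hypothetical helper called describe (not part of the original page) and checks that both versions give the same result. It assumes pandas is installed and rebuilds the made-up dataframe so that it runs on its own.

```python
import pandas as pd

# Rebuild the example dataframe (made-up values from the table above).
df = pd.DataFrame({
    "city": ["NY", "Atlanta", "St. Luis", "Tucson", "LA", "Chicago"],
    "region": ["East", "East", "East", "West", "West", "Middle"],
    "avg_temp_F": [60, 90, 95, 100, 92, 50],
})

def describe(row):
    """Return a sentence such as 'NY is cold.' for one dataframe row."""
    label = "hot" if row.avg_temp_F > 70 else "cold"
    return row.city + " is " + label + "."

# axis=1 passes each row to the function as a pandas Series.
with_lambda = df.apply(
    lambda x: x.city + " is " + ("hot" if x.avg_temp_F > 70 else "cold") + ".",
    axis=1,
)
with_def = df.apply(describe, axis=1)

# Both approaches should produce identical results.
print(with_def.equals(with_lambda))
```

A named function is easier to test and document once the logic grows beyond a single short expression; for one-liners like this, the lambda keeps the code compact.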