diff --git a/source/.DS_Store b/source/.DS_Store
deleted file mode 100644
index 8d62b5d..0000000
Binary files a/source/.DS_Store and /dev/null differ
diff --git a/source/modules/.DS_Store b/source/modules/.DS_Store
deleted file mode 100644
index 8637b6e..0000000
Binary files a/source/modules/.DS_Store and /dev/null differ
diff --git a/source/modules/lesson04/generators.rst b/source/modules/lesson04/generators.rst
index e2ede31..cb4ef8d 100755
--- a/source/modules/lesson04/generators.rst
+++ b/source/modules/lesson04/generators.rst
@@ -2,47 +2,59 @@ Part 4: Generators
 ##################
 
-Generators give you an iterator object with no access to the underlying
-data ... if it even exists.  Conceptually, iterators are about various
-ways to loop over data.  They can generate data on the fly.  In general,
-you can use either an iterator or a generator --- in fact, a generator is a
-type of iterator.  Generators do some of the book-keeping for you and
-therefore involve simpler syntax.
+Generators are a special kind of object that is like a function that can be paused while preserving state, and then be resumed.
+
+They are designed to work with the iterator protocol, so they can easily make iterables that "generate" values on the fly, rather than those stored in a sequence. This was the original use case, hence the name. But generators can be used in other places where it is handy to pause a function while maintaining state. They are used in pytest fixtures, for example, and in asynchronous programming -- both advanced topics for a later date.
+
+But for now, we will focus on their use for making iterators.
+
+Conceptually, iterators are about various ways to loop over data. They can iterate over a sequence of data that already exists, or they generate values on the fly.
+
+In general, you can use either a custom class or a generator function to make an iterator -- in fact, a generator is a type of iterator. But using a generator function is often easier -- it does some of the book-keeping for you and therefore involves simpler code.
 
+Generator Functions
+===================
 
-yield
-=====
+Generator functions are a special kind of function that returns a generator when called, rather than a simple value.
 
-::
+``yield``
+---------
+
+To make a generator function, you write it like a regular function, except that you use ``yield`` instead of ``return``. Any function with a ``yield`` statement in it is a generator function.
+
+For example:
+
+
+.. code-block:: python
 
     def a_generator_function(params):
         some_stuff
         yield something
 
-|
-| Generator functions "yield" a value, rather than returning a value.
- It \*does\* 'return' a value, but rather than ending execution of the
- function, it preserves function state so that it can pick up where it
- left off.  In other words, state is preserved between yields.
-| A function with ``yield``  in it is a factory for a generator.
- Each time you call it, you get a new generator:
+Generators "yield" a value, rather than returning a value.
+It *does* "return" a value, but rather than ending execution of the
+function, it preserves function state so that it can pick up where it
+left off.  In other words, state is preserved between ``yield`` statements.
+
+A function with ``yield`` in it is a factory for a generator, or "generator function".
+Each time you call it, you get a new generator:
 
 ::
 
     gen_a = a_generator()
     gen_b = a_generator()
 
-|
-| Each instance keeps its own state.
-| To master yield, you must understand that when you call the function,
- the code you have written in the function body does not run.  The
- function only returns the generator object.  The actual code in the
- function is run when next() is called on the generator itself.
-| An example: an implementation of range() as a generator:
+Each instance keeps its own state.
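
A minimal sketch of that independence, assuming a hypothetical generator function ``count_up`` (it is not part of the lesson files touched by this diff):

```python
def count_up(limit):
    """Hypothetical generator function: yields 0, 1, ..., limit - 1."""
    n = 0
    while n < limit:
        yield n   # pause here; n is remembered until the next next() call
        n += 1


gen_a = count_up(3)
gen_b = count_up(3)

first_a = next(gen_a)    # runs gen_a's body up to its first yield
second_a = next(gen_a)   # resumes gen_a right after that yield
first_b = next(gen_b)    # gen_b starts from the beginning, unaffected by gen_a
```

Advancing ``gen_a`` twice leaves ``gen_b`` untouched, which is what "each instance keeps its own state" means in practice.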
 
-::
+To master yield, you must understand that when you call the function,
+the code you have written in the function body does not run.  The
+function only returns the generator object.  The actual code in the
+function is run when next() is called on the generator itself.
+
+An example: an implementation of range() as a generator:
+::
 
     def y_range(start, stop, step=1):
         i = start
@@ -50,16 +62,23 @@ yield
             yield i
             i += step
 
-|
-| Generator Comprehensions: yet another way to make a generator:
+Generator Comprehensions
+........................
 
-::
+Generator comprehensions are yet another way to make a generator, using the comprehension syntax:
+
+.. code-block:: python
 
     >>> [x * 2 for x in [1, 2, 3]]
     [2, 4, 6]
+
     >>> (x * 2 for x in [1, 2, 3])
     <generator object <genexpr> at 0x10911bf50>
+
     >>> for n in (x * 2 for x in [1, 2, 3]):
     ...   print n
     ...
     2
     4
     6
+
+They are the same as list comprehensions, except that they don't run through the whole loop and make a list, but rather, make a generator that will loop through the items when called by the iterator protocol.
+
diff --git a/source/modules/lesson04/iterators-and-iterables.rst b/source/modules/lesson04/iterators-and-iterables.rst
index 0acb854..3326f40 100755
--- a/source/modules/lesson04/iterators-and-iterables.rst
+++ b/source/modules/lesson04/iterators-and-iterables.rst
@@ -9,7 +9,7 @@ First, we are going to look at some background from Python 2, before
 moving on to cover the main topic.
 
 Background
-----------
+==========
 
 Python used to be all about sequences --- a good chunk of anything you
 did was stored in a sequence or involved manipulating one.
@@ -29,109 +29,117 @@ did was stored in a sequence or involved manipulating one.
 - zip()
 
- In **Python2** those are all sequences.  It turns out, however, that
- the most common operation for sequences is to iterate through them:
+In **Python2** those are all sequences.  It turns out, however, that
+the most common operation for sequences is to iterate through them:
 
 ::
 
     for item in a_sequence:
         do_something_with_item
 
-|
-| So fairly early in Python2, Python introduced the idea of the
- "iterable".  An iterable is something you can, well, iterate over in a
- for loop, but often does not keep the whole sequence in memory at
- once.  After all, why make a copy of something just to look at all its
- items?
-| For example, in python2: \`\`dict.keys()\`\` returns a list of all the
- keys in the dict.  But why make a full copy of all the keys, when all
- you want to do is:
+So fairly early in Python2, Python introduced the idea of the
+"iterable".  An iterable is something you can, well, iterate over in a
+for loop, but often does not keep the whole sequence in memory at
+once.  After all, why make a copy of something just to look at all its
+items?
+
+For example, in Python2: ``dict.keys()`` returns a list of all the
+keys in the dict.  But why make a full copy of all the keys, when all
+you want to do is:
 
 ::
 
     for key in dict.keys():
         do_something_with(key)
 
-|
-| Even worse ``dict.items()`` created a full list of
- ``(key,value)`` tuples --- a complete copy of all the data in the
- dict.  Yet worse ``enumerate(dict.items())`` created a whole list
- of
-| ``(index, (key, value))`` tuples --- lots of copies of everything.
-| Python2 then introduced "iterable" versions of a number of functions
- and methods:
+Even worse ``dict.items()`` created a full list of
+``(key,value)`` tuples -- a complete copy of all the data in the
+dict.
+Yet worse ``enumerate(dict.items())`` created a whole list of
+``(index, (key, value))`` tuples -- lots of copies of everything.
+
+Python2 then introduced "iterable" versions of a number of functions
+and methods:
+
+
+| ``itertools.izip``
+| ``dict.iteritems()``
+| ``dict.iterkeys()``
+| ``dict.itervalues()``
 
-|
-| itertools.izip
-| dict.iteritems()
-| dict.iterkeys()
-| dict.itervalues()
-|
-| So you could now iterate through that stuff without copying anything.
+So you could now iterate through those things without copying anything.
 
 Python 3
 --------
 
-|
-| **Python3** embraces iterables --- now everything that can be an
- iterator is already an iterator --- no unnecessary copies.  An
- iterator is an iterable that has been made more efficient by removing
- as much from memory as possible. Therefore, if you need a list, you
- have to make the list explicitly, as in:
-
-::
+**Python3** embraces iterables --- now everything that can be an
+iterator is already an iterator --- no unnecessary copies.  An
+iterator is an iterable that has been made more efficient by removing
+as much from memory as possible. Therefore, if you need a list, you
+have to make the list explicitly, as in::
 
     list(dict.keys())
 
-
 Also, there is an entire module: ``itertools`` that provides nifty
-ways to iterate through stuff.  So, while we used to think in terms of
-sequences, we can now think in terms of iterables.
+ways to iterate through various iterables.  So, while we used to think in terms of sequences, we can now think in terms of iterables.
 
-|
-| Iterators and Iterables
-| Let's see an example of how Iteration makes Python code so readable:
+Iterators and Iterables
+=======================
 
-::
+Let's see an example of how Iteration makes Python code so readable:
+
+.. code-block:: python
 
     for x in just_about_anything:
         do_stuff(x)
 
-| An iterable is anything that can be looped over sequentially, so it
- does not have to be a "sequence": list, tuple, etc.  For example, a
- string is iterable. So is a set.
+An *iterable* is anything that can be looped over sequentially, so it
+does not have to be a typical "sequence" type: list, tuple, etc.
+For example, a string is iterable (iterate over the characters).
+So is a set.
 
-| An iterator is an iterable that remembers state. All sequences are
- iterable, but not all sequences are iterators. To make a sequence an
- iterator, you can call it with iter:
-
-::
+An *iterator* is an iterable that remembers state. All sequences are
+iterable, but not all sequences are iterators. To make a sequence an
+iterator, you can call it with iter::
 
     my_iter = iter(my_sequence)
 
-| Iterables
-| ---------
-| To make an object iterable, you simply have to implement the
- __getitem__ method.
+The distinction between `iterables` and `iterators` is a bit subtle:
 
-::
+An *iterable* is something that you can iterate over -- that is, put in a for loop.
+
+An *iterator* is the object that actually manages the iteration -- it keeps track of which items have been used, and delivers them when asked for.
+
+The ``for`` statement usually takes care of managing all that, but you can do it bit by bit by hand as well.
+
+
+The Iteration Protocol
+----------------------
+
+Python defines a protocol for making and working with iterables. Any object that conforms to that protocol can be used in a for loop, and other contexts in which an iterable is expected (the list constructor, for instance).
+
+The easiest way to make an object iterable is to implement the
+``__getitem__`` method.
+
+.. code-block:: python
 
     class T:
         def __getitem__(self, position):
             if position > 5:
             return position
 
-| iter()
-| ------
-| How do you get the iterator object from an "iterable"?  The iter()
- function will make any iterable an iterator.  It first looks for the
- __iter__() method, and if none is found, uses get_item to create
- the iterator.  The \`\`iter()\`\` function:
+``iter()``
+----------
 
-::
+Part of this protocol is the ``iter()`` function -- it takes any iterable as an argument, and returns an iterator -- primed and ready to be used.
+
+It first looks for the ``__iter__()`` method, and if none is found, uses ``__getitem__()`` to create the iterator.
+The ``iter()`` function in action:
+
+.. code-block:: ipython
 
     In []: iter([2,3,4])
     Out[]:
@@ -140,19 +148,31 @@ sequences, we can now think in terms of iterables.
     In []: iter( ('a', 'tuple') )
     Out[]:
 
-List as an Iterator
--------------------
+``next()``
+----------
+
+So how do you get the items from an iterator?
 
-::
+Another key part of the iterator protocol is the ``next()`` function. When passed an iterator, it returns the next item:
+
+.. code-block:: ipython
 
     In []: a_list = [1,2,3]
+
+    # first get the iterator from the list
     In []: list_iter = iter(a_list)
+
+    # then ask for its items, one by one:
     In []: next(list_iter)
     Out[]: 1
+
     In []: next(list_iter)
     Out[]: 2
+
     In []: next(list_iter)
     Out[]: 3
+
+    # what happens when there are no more?
     In []: next(list_iter)
     --------------------------------------------------
     StopIteration     Traceback (most recent call last)
@@ -160,40 +180,43 @@ List as an Iterator
     ----> 1 next(list_iter)
     StopIteration:
 
-Use iterators when you can
---------------------------
+As you can see, when you call ``next()`` when there are no more items (the iterator is "exhausted"), a ``StopIteration`` exception is raised, indicating that you are done.
+
+
+Use Iterators When You Can
+==========================
+
 Consider the example from the trigrams problem: (http://codekata.com/kata/kata14-tom-swift-under-the-milkwood/)
+
 You have a list of words and you want to go through it, three at a time, and match up pairs with the following word.
-The \*non-pythonic\* way to do that is to loop through the indices:
+The *non-pythonic* way to do that is to loop through the indices:
 
-::
+.. code-block:: python
 
     for i in range(len(words)-2):
         triple = words[i:i+3]
 
-It works, and is fairly efficient, but what about:
+It works, and is fairly efficient, but what about::
 
-::
-     for triple in zip(words[:-2], words[1:-1], words[2:-2]):
+    for triple in zip(words[:-2], words[1:-1], words[2:]):
 
- zip() returns an iterable --- it does not build up the whole list, so
- this is quite efficient.  However, we are still slicing: ([1:]), which
- produces a copy --- so we are creating three copies of the list ---
- not so good if memory is tight.  Note that they are shallow copies, so
- this is not terribly bad.  Nevertheless, we can do better.
+zip() returns an *iterable* --- it does not build up the whole list, so
+this is quite efficient.  However, we are still slicing: (``[1:]``), which
+produces a copy --- so we are creating three copies of the list ---
+not so good if memory is tight.  Note that they are shallow copies, so
+this is not terribly bad.  Nevertheless, we can do better.
 
- The ``itertools`` module has a ``islice()`` (iterable slice)
- function.  It returns an iterator over a slice of a sequence --- so no
- more copies:
+The ``itertools`` module has an ``islice()`` (iterable slice)
+function.  It returns an iterator over a slice of a sequence --- so no
+more copies:
 
-::
+.. code-block:: python
 
      from itertools import islice
-     triplets = zip(words, islice(words, 1, None), islice(words, 2,
- None))
+     triplets = zip(words, islice(words, 1, None), islice(words, 2, None))
      for triplet in triplets:
          print(triplet)
      ('this', 'that', 'the')
@@ -202,29 +225,60 @@ It works, and is fairly efficient, but what about:
      ('other', 'and', 'one')
      ('and', 'one', 'more')
 
+
 The Iterator Protocol
-----------------------
- The main thing that differentiates an iterator from an iterable
- (sequence) is that an iterator saves state.  An iterable must have the
- following methods:
+---------------------
 
-::
+There are two perspectives to the iterator protocol:
 
-     an_iterator.__iter__()
- Usually returns the iterator object itself.
+1) Working with an iterable:
 
-::
-     an_iterator.__next__()
- Returns the next item from the container. If there are no further
- items it raises the ``StopIteration`` exception.
+You use ``iter()`` to get an iterator, and ``next()`` to get the next item in the iterable.
 
-Making an Iterator
-------------------
-A simple version of ``range()``
+2) Making a custom iterable:
 
-::
+An iterable must have an ``__iter__()`` method that returns an iterator.
+
+An iterator must have a ``__next__()`` method that returns the next item,
+and raises ``StopIteration`` when there are no more items.
+
+
+The main thing that differentiates an iterator from an iterable (sequence) is that an iterator saves state -- it keeps track of which items have been already used, and which are left.
+
+The ``__iter__()`` method
+.........................
+
+A class's ``__iter__()`` method needs to return an iterator, ready to have ``__next__()`` called.
+
+
+Often a custom iterable will return the iterator object itself.
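
The built-in types show the same split between iterable and iterator. A small sketch using only a plain list (the variable names are illustrative):

```python
numbers = [1, 2, 3]

# A list is an iterable: iter() hands back a separate list iterator object.
list_iterator = iter(numbers)

# An iterator's __iter__() returns the iterator itself.
returns_itself = iter(list_iterator) is list_iterator

# The list builds a new, independent iterator on every iter() call.
fresh_each_time = iter(numbers) is not iter(numbers)
```

This is why an iterator can be passed straight to a ``for`` loop: calling ``iter()`` on it is harmless and simply returns the same object.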
 
-     class IterateMe_1:
+The main thing that differentiates an iterator from an iterable (sequence or other object) is that an iterator saves state -- it keeps track of which items have been already used, and which are left.
+
+If an iterable returns itself, then the ``__iter__()`` method is the place to initialize (or reset) the counter on where the iteration has been so far.
+
+
+The ``__next__()`` method
+.........................
+
+The ``__next__()`` method returns the next item from the container (or other source).
+If there are no further items it raises the ``StopIteration`` exception.
+
+Probably the best way to understand this is with an example.
+
+Making an Iterable
+------------------
+
+The ``range()`` builtin object is an iterable that produces a range of numbers, as they are asked for -- it does not create the sequence of numbers ahead of time, but rather, creates each one as it is needed.
+
+.. note:: In Python 2, ``range()`` *did* create the full list as soon as you called it. As this was pretty inefficient, an ``xrange()`` function was created that generated the numbers on the fly. In Python 3, the built-in ``range()`` no longer creates a list, so ``xrange()`` is no longer needed. You may still see ``xrange()`` in old Python 2 code -- change it to ``range()`` when porting to Python 3.
+
+
+You can create a simple version of ``range()`` like this:
+
+.. code-block:: python
+
+     class My_range:
         def __init__(self, stop=5):
             self.current = 0
             self.stop = stop
@@ -238,22 +292,24 @@ A simple version of ``range()``
                 raise StopIteration
 
 
-What does *for* do?
+What does ``for`` do?
+---------------------
 
- Now that we know the iterator protocol, we can write something like a
- for loop:
+Now that we know the iterator protocol, we can write something like a
+for loop:
 
- :download:\`my_for.py
- <../examples/iterators_generators/my_for.py>`
+(:download:`my_for.py <../examples/iterators_generators/my_for.py>`)
 
-::
+.. code-block:: python
 
     def my_for(an_iterable, func):
         """
         Emulation of a for loop.
 
         func() will be called with each item in an_iterable
-        """
-        # equiv of "for i in l:"
+
+        equiv of "for i in l: func(i)"
+        """
+
         iterator = iter(an_iterable)
         while True:
             try:
@@ -264,6 +320,8 @@ What does *for* do?
 
 Summary
 -------
+
 Iterators and Iterables are fundamental concepts in Python. Although the language can be confusing, the underlying concepts are quite straightforward.
+
 In the lesson assignment you will have opportunities to practice and apply using them.
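
Only the opening lines of ``my_for`` are visible in the hunk above. A complete version might look roughly like the sketch below. The loop body is an assumption based on the iterator protocol described in the lesson, not the actual contents of ``my_for.py``.

```python
def my_for(an_iterable, func):
    """Emulate ``for i in an_iterable: func(i)`` using iter() and next()."""
    iterator = iter(an_iterable)      # step 1: get an iterator from the iterable
    while True:
        try:
            item = next(iterator)     # step 2: ask for the next item
        except StopIteration:
            break                     # step 3: the iterator is exhausted, stop looping
        func(item)


collected = []
my_for("abc", collected.append)       # collected now holds each character in order
```

Keeping only the ``next()`` call inside the ``try`` block means a ``StopIteration`` raised by ``func`` itself is not mistaken for the end of the iteration.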