Py3k memory conservation by returning iterators rather than lists


Many methods that used to return lists in Python 2.x now return iterators in Py3k.

Are these iterators the same thing as generator expressions? Is this lazy evaluation?

So the memory footprint of Python programs should drop drastically, shouldn't it?

What about programs converted from Python 2 to 3 using the built-in 2to3 script?

Does the built-in tool explicitly convert all the returned iterators into lists for compatibility? If so, the lower memory footprint of Py3k would not really show up in converted programs, would it?


One of the biggest benefits of iterators over lists isn't memory, it is actually computation time. For instance, in Python 2:

for i in range(1000000):  # spend a bunch of time making a big list
    if i == 0:
        break  # Building the list was a waste since we only looped once

Now take for instance:

for i in xrange(1000000):  # starts loop almost immediately
    if i == 0:
        break  # we didn't waste time even if we break early

Although the example is contrived, the use case isn't: loops are often broken out of midway. Building an entire list only to use part of it is a waste unless you are going to use it more than once. If that is the case, you can explicitly build a list: r = list(range(100)).

This is why iterators are the default in more places in Python 3: you lose nothing, since you can still explicitly create lists (or other containers) when you need them, but you aren't forced to when all you plan to do is iterate over an iterable once (which I would argue is the much more common case).
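In Python 3 the same early-break pattern needs no xrange, because the built-in range is itself lazy. A minimal sketch (the exact sizes reported are implementation details, but the comparison holds):

```python
import sys

lazy = range(1000000)          # Python 3: lazy, no million-int list is built
eager = list(range(1000000))   # an explicit list, for when you really need one

# The lazy range object stays tiny no matter how long the range is.
print(sys.getsizeof(lazy) < sys.getsizeof(eager))  # True

# Breaking out early costs almost nothing with the lazy version.
for i in lazy:
    if i == 0:
        break
```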

By : eestrada

Many of them are not exactly iterators, but special view objects. For instance, range() now returns something similar to the old xrange object: it can still be indexed, but it constructs the integers lazily, as needed.
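For illustration, a few things the Python 3 range object supports that a plain one-shot iterator would not:

```python
r = range(1000000)

print(r[5])         # 5 -- indexable, like the old xrange
print(len(r))       # 1000000 -- length known without building a list
print(999999 in r)  # True -- membership is computed arithmetically
```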

Similarly dict.keys() gives a dict_keys object implementing a view on the dict, rather than creating a new list with a copy of the keys.
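A short sketch of the view behavior: the dict_keys object tracks the dict it came from rather than copying it.

```python
d = {'a': 1, 'b': 2}
keys = d.keys()           # a dict_keys view, not a list copy

d['c'] = 3                # mutate the dict after taking the view...
print('c' in keys)        # True -- the view reflects the change

# Key views also behave like sets:
print(sorted(keys & {'a', 'c'}))  # ['a', 'c']
```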

How this affects memory footprint probably depends on the program. There is certainly more emphasis on using iterators unless you really need a list, whereas lists were generally the default in Python 2, so the average program will probably be more memory efficient. Cases with really big savings, however, are likely already implemented as iterators in Python 2 programs, because very large memory usage stands out and is more likely to have been addressed. (eg. the file iterator is already much more memory efficient than the older file.readlines() method)
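To make the readlines() comparison concrete, here is a sketch using a small throwaway file (the file name and contents are just for illustration):

```python
import os
import tempfile

# Create a small sample file.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as f:
    f.write('line1\nline2\nline3\n')

# readlines() pulls every line into memory at once.
with open(path) as f:
    all_lines = f.readlines()
print(len(all_lines))  # 3

# Iterating the file object reads lazily, one line at a time,
# so breaking early never reads the rest of the file.
with open(path) as f:
    for line in f:
        first = line
        break
print(repr(first))  # 'line1\n'
```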

Conversion is done by the 2to3 tool, which will generally leave calls like range() as iterators where it can safely determine that a real list isn't needed, so code like:

for x in range(10): print x

will switch to the new range() object, no longer creating a list, and so will obtain the reduced memory benefit, but code like:

x = range(20)

will be converted as:

x = list(range(20))

as the converter can't know if the code expects a real list object in x.
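The wrapped form behaves exactly like the Python 2 original, at the cost of materializing the list. A quick sketch:

```python
x = list(range(20))   # what 2to3 emits for `x = range(20)`

print(isinstance(x, list))  # True -- a real list, not a range object
x.append(20)                # so list-only operations keep working
print(x[-1])                # 20
```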

By : Brian

Are these iterators the same thing as generator expressions? Is this lazy evaluation?

An iterator is just an object with a __next__ method (next in Python 2). What the documentation usually means when it says a function returns an iterator is that its result is lazily evaluated.
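As an illustration of the protocol, here is a minimal hand-written iterator (the Countdown name is just for this example):

```python
class Countdown:
    """A minimal iterator: __iter__ returns self, __next__ produces values."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the for-loop to stop
        value = self.current
        self.current -= 1
        return value

print(list(Countdown(3)))  # [3, 2, 1]
```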

So the memory footprint of Python programs should drop drastically, shouldn't it?

It depends. I'd guess that the average program wouldn't notice a huge difference. The performance advantages of iterators over lists are really only significant if you have a large dataset. You may want to see this question.
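As for generator expressions: they are one convenient way of creating an iterator, and the memory difference versus a list comprehension is easy to see. A rough sketch:

```python
import sys

squares_list = [n * n for n in range(10000)]   # all values stored at once
squares_gen = (n * n for n in range(10000))    # generator: computed on demand

# The generator object is tiny; the list holds every element.
print(sys.getsizeof(squares_gen) < sys.getsizeof(squares_list))  # True

# Both produce the same values when consumed.
print(sum(squares_gen) == sum(squares_list))   # True
```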

By: admin