map, reduce, and filter in Python

  • 2020-04-02 13:35:31
  • OfStack

1. What is an iterable object

Take the built-in max function as an example and look at its docstring:


>>> print max.__doc__
max(iterable[, key=func]) -> value
max(a, b, c, ...[, key=func]) -> value
With a single iterable argument, return its largest item.
With two or more arguments, return the largest argument.

In the first form of max, the first argument is an iterable object. So which objects are iterable? Strings, tuples, and lists all are:

>>> max('abcx')
'x'
>>> max('1234')
'4'
>>> max((1,2,3))
3
>>> max([1,2,4])
4
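
One rough way to check whether an object is iterable is to pass it to the built-in iter(), which raises TypeError for objects that cannot be iterated. The helper name is_iterable below is hypothetical and exists only for this illustration. print is written with parentheses so that the sketch runs under both Python 2 and Python 3.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

print(is_iterable('abcx'))     # True: strings are iterable
print(is_iterable((1, 2, 3)))  # True: tuples are iterable
print(is_iterable([1, 2, 4]))  # True: lists are iterable
print(is_iterable(42))         # False: a plain integer is not
```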

We can also use yield to write a generator function, which produces an iterable object (there are other ways, too):

def my_range(start,end):
    '''Yield the integers from start to end, inclusive.'''
    while start <= end:
        yield start
        start += 1

Execute the following code:

for num in my_range(1, 4):
    print num
print max(my_range(1, 4))

The output:

1
2
3
4
4
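
As one of the "other ways" mentioned above, a class can be made iterable by defining __iter__. The class name CountRange is hypothetical and used only for this sketch. Unlike a generator object, which is exhausted after one pass, each call to __iter__ here starts a fresh pass, so the same object can be iterated more than once.

```python
class CountRange(object):
    """Hypothetical iterable over the integers start..end, inclusive."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        # Each call creates a new generator, so iteration restarts from start.
        current = self.start
        while current <= self.end:
            yield current
            current += 1

numbers = CountRange(1, 4)
print(list(numbers))  # [1, 2, 3, 4]
print(max(numbers))   # 4 (the object can be iterated a second time)
```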


2. map

The map function is described in http://docs.python.org/2/library/functions.html#map as follows:


map(function, iterable, ...)
Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. If one iterable is shorter than another it is assumed to be extended with None items. If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation). The iterable arguments may be a sequence or any iterable object; the result is always a list.

The map function applies a custom function to each element of the iterable and returns all the results as a list. For example:

def func(x):
    '''Return the square of x.'''
    return x*x
print map(func, [1,2,4,8])
print map(func, my_range(1, 4))

The output:

[1, 4, 16, 64]
[1, 4, 9, 16]

The same result can be obtained with a list comprehension:

print [x*x for x in [1,2,4,8]]
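
The documentation quoted above also allows several iterables, in which case the function receives one item from each in parallel. A minimal sketch follows; the function name add is hypothetical. The call is wrapped in list() so that it prints a list under Python 3 as well, where map returns an iterator, and the inputs have equal length because the two versions treat unequal lengths differently (Python 2 pads with None, Python 3 stops at the shortest).

```python
def add(x, y):
    """Return the sum of one item from each iterable."""
    return x + y

# add is called as add(1, 10), add(2, 20), add(3, 30).
sums = list(map(add, [1, 2, 3], [10, 20, 30]))
print(sums)  # [11, 22, 33]
```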

3. reduce

The reduce function is described in http://docs.python.org/2/library/functions.html#reduce as follows:


reduce(function, iterable[, initializer])
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned.

This is very clear. The call

reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])

is equivalent to computing

((((1+2)+3)+4)+5)

And:

reduce(lambda x, y: x+y, [1, 2, 3, 4, 5],6)

is equivalent to computing

(((((6+1)+2)+3)+4)+5)
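
To make the accumulation explicit, here is a rough pure-Python sketch of the behaviour described above. The names my_reduce and _MISSING are hypothetical; this is an illustration, not the actual implementation. reduce is a built-in in Python 2; the import from functools works in Python 2.6+ and is required in Python 3.

```python
from functools import reduce

_MISSING = object()  # sentinel meaning "no initializer was passed"

def my_reduce(function, iterable, initializer=_MISSING):
    """Fold iterable from left to right into a single value."""
    it = iter(iterable)
    if initializer is _MISSING:
        try:
            accumulated = next(it)  # the first item seeds the accumulator
        except StopIteration:
            raise TypeError('my_reduce() of empty iterable with no initializer')
    else:
        accumulated = initializer
    for item in it:
        accumulated = function(accumulated, item)
    return accumulated

add = lambda x, y: x + y
print(my_reduce(add, [1, 2, 3, 4, 5]))     # 15, i.e. ((((1+2)+3)+4)+5)
print(my_reduce(add, [1, 2, 3, 4, 5], 6))  # 21, i.e. (((((6+1)+2)+3)+4)+5)
print(my_reduce(add, [], 6))               # 6, the initializer is the default
print(reduce(add, [1, 2, 3, 4, 5], 6))     # 21, same as the real reduce
```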


4. filter

The filter function is described in http://docs.python.org/2/library/functions.html#filter as follows:


filter(function, iterable)
Construct a list from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If iterable is a string or a tuple, the result also has that type; otherwise it is always a list. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.

The function argument is called on each element of the iterable; if it returns true for an element, that element is kept in the result. For example, to filter the character a out of a string (because the input is a string, the result is also a string):

def func(x):
    '''Return True unless x is the character a.'''
    return x != 'a'
print filter(func, 'awake')

The output:

wke

This can also be done with a list comprehension:

print ''.join([x for x in 'awake' if x != 'a'])
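
The documentation quoted above also covers the case where function is None: every element that is false is removed. A small sketch with made-up sample values follows. The call is wrapped in list() so the result is a list under Python 3 as well, where filter returns an iterator.

```python
# Sample data (hypothetical) mixing true and false values.
values = [0, 1, '', 'a', None, [], [2]]

truthy = list(filter(None, values))
print(truthy)                             # [1, 'a', [2]]

# The equivalent list comprehension given in the documentation.
print([item for item in values if item])  # [1, 'a', [2]]
```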

