/ python

Notes - Book - Functional Python Programming By Steven F. Lott


Chapter 1 : Introducing Functional Programming
Chapter 2 : Introducing Some Functional Features
Chapter 3 : Functions, Iterators and Generators
Chapter 4 : Working with Collections
Chapter 5 : Higher-order Functions
Chapter 6 : Recursions and Reductions
Chapter 7 : Additional Tuple Techniques
Chapter 8 : The Itertools Module
Chapter 9 : More Itertools Techniques
Chapter 10 : The Functools Module
Chapter 11 : Decorator Design Techniques
Chapter 12 : The Multiprocessing and Threading Modules
Chapter 13 : Conditional Expressions and the Operator Module
Chapter 14 : The PyMonad Library
Chapter 15 : A Functional Approach to Web Services
Chapter 16 : Optimizations and Improvements

Chapter 1 : Introducing Functional Programming

Chapter 2 : Introducing Some Functional Features

A python lambda is a pure function

Lambdas cant have assignment statement, they're always pure functions and suitable for functional programming

Use tuples and namedtuples extensively

Process a list of tuples

use higher order functions

use wrap-process-unwrap pattern

Functional programming's efficiency stems, in part, from being able to defer a computation until it's required

Python's generator expressions and generator functions are lazy

A generator can be used only once. Be cautious how you use it

Tail recursion >> the recursive call is the last line of the function

Default recursion limit in Python is 1000

In Python, when we use a generator expression instead of a recursive function, we essentially do the tail-call optimization manually

Object oriented >> suffix notation some_object.foo().bar().yet_more()

Functional >> prefix notation yet_more(bar(foo(some_object)))

Monad purely functional constructs that allow us to structure a sequential pipeline of processing in a flexible way

Chapter 3 : Functions, Iterators and Generators

A pure function has no side effects

Use tuples and namedtuples as a way to create stateless objects

Use iterable collections as your primary design tool for functional programming

In a recursive generator function, be careful of the return statement. Do no use the following command line:

return recursive_iter(args)

It returns only the generator object; it doesn't evaluate the function to return the generated values. Use either of the following:

for result in recursive_iter(args):
    yield result


    yield from recursive_iter(args)

You can use itertools.tee() to overcome the once-only limitation. While appealing, this doesn't workout well in the long run. Once consumed, an iterable will not provide any more values. When we want to compute multiple kinds of reductions for example sums, counts, minimums, maximums we need to design with this one-pass-only limitation in mind

Use generator functions to clean raw data

While there's shorthand syntax for list, dict and set, there's no shorthand syntax for a tuple. To materialize a tuple, we must use the tuple() function. For this reason, it often seems most consistent to use the list(), tuple() and set() functions as the preferred syntax

from collections import namedtuple

To reduce memory use and increase performance prefer to use generator expressions and functions as much as possible

Use bisect module for creating sorted object, which we can then search

Implement a subclass of collections.abc.Mapping for creating classes whose instances can be used for dict like mapping

Chapter 4 : Working with Collections

Mapping produce a new collection from and existing collection(s)

Scalar Functions apply to individual values and compute and individual result

Collection() Functions Work with iterable collections

unwrap ( process ( wrap ( iterable ) ) )

Two very heavily used functions when working with collections:

fst = lambda x: x[0]
snd = lambda x: x[1]

A good functional design allows us to freely replace any function with its equivalent, which makes refactoring quite simple

Example of functional approach
Problem > Create pairs of consecutive items from an iterable.
Solution >

def pairs(iterable):
    def pair_from(head, iterable_tail):
        nxt = next(iterable_tail)
        yield head, nxt
        yield from pair_from(nxt, iterable_tail)
        return pair_from(next(iterable), iterable)
    except StopIteration:

A good strategy for performing tail-call optimization is to replace the recursion with a generator expression. Example:

def legs(lat_lon_iter):
    begin = next(lat_lon_iter)
    for end in lat_lon_iter:
        yield begin, end
        begin = end

Sequence v/s Iterator
A sequence isn't an iterator; it doesn't implement the next() function. The for statement handles this gracefully, by creating an iterator from a sequence automatically

Adding mappings is common when a design is evolving

Once common refactoring of complex expressions is to separate the generator expression from any materialized collection

itertools.tee() is wasteful. We can make our processing more efficient by materializing intermediate results

The len() doesn't apply to iterables; it only applies to sequences

You may like to use zip_longest() instead of zip()

Chapter 5 : Higher-order Functions

Use the reduce(extract()) design pattern to perform a reduction on specific fields extracted from a larger tuple

Some useful inbuilt functions:
max(), min()
sorted(), reversed()

We can assign lambdas to variables as follows (however this is against PEP guidelines)

start = lambda x: x[0]
end = lambda x: x[-1]

A lambda is a callable object and can be used like a function

We can use the functools.partial function to implement currying

Predicate > a decision function
A decision of true means that the value is passed; otherwise, the value is rejected

filter(predicate, iterable)

Often using the filter() function with defined functions is clearer than using it with the lambda forms

Ability to decompose a fairly complex problem into a number of independent functions, each one of which can be easily tested in isolation. Our processing is a composition of simpler functions. This can lead to succinct, expressive functional programming

The default behavior of list.pop() is to pop(-1)

Three largely equivalent ways to express a mapping:

map() function map(f, C)

generator expression (f(x) for x in C)

generator function

def mymap(f, C):
    for x in C:
        yield f(x)
mymap(f, C)

Unwrapping data while mapping

(f(x) for x in C)

Wrapping additional data while mapping

((f(x), x) for x in C)

When considering multiple statement generator functions, we need to be cautious that we don't stray from the guiding principles of functional programming: stateless function evaluation

Building higher-order functions with Callables:
The class object, created by the class statement, defines essentially a function that emits a function. Commonly, we will use a callable object to crate a composite function that combines two other functions into something relatively complex.

from collections.abc import Callable
class NullAware(Callable):
    def __init__(self, some_func):
        self.some_func = some_func
    def __call__(self, arg):
        return None if arg is None else self.some_func(arg)
null_log_scale = NullAware(math.log)
some_data = [10, 100, None, 50, 60]
scaled = map(null_log_scale)

In this case, the function that was created will gracefully tolerate None values without raising exceptions.

Using __slots__ helps restrict addition of any properties to objects

from collections.abc import Callable

class Sum_Filter(Callable):
    __slots__ = ['filter', 'function']
    def __init__(self, filter, function):
        self.filter = filter
        self.function = function
    def __call__(self, iterable):
        return sum(self.function(x) for x in iterable if self.filter(x))

It doesn't prevent all modifications to the resulting object, but it limits it to just two attributes. Attempting to add attributes results in and exception

is_not_none = lambda x: x is not None
# the following function would count the number of non-null values in the iterable
count_not_none = Sum_Filter(is_not_none, lambda x: 1)

The use of key=function is a common design pattern

One significant advantage of using the lambda forms is that it follows the functional paradigm very closely

Chapter 6 : Recursions and Reductions

Generally a functional programming language compiler will optimize a recursive function to transform a call in the tail of the function to a loop. The tail-call optimization technique available in Python is to use an explicit for loop !

Group-by reductions from many to fewer
binning > group the data into different bins

Split a list

C = [1, 2, 3, 4, 5]
head, *tail = C

# head is 1
# tail is [2, 3, 4, 5]

Grouping or partitioning data by key values

def group_by(key, data):
    def group_into(key, collection, dictionary):
        if len(collection) == 1:
            return dictionary
        head, *tail = collection
        return group_into(key, tail, dictionary)
    return group_into(key, data, defaultdict(list))

Writing more general group-by reductions

def sum_f(function, data):
    return sum(function(x) for x in data)
# count of items in data
N = sum_f(lambda x: 1, data)

# sum of items in data
S = sum_f(lambda x: x, data)

# sum of squares of data
S2 = sum_f(lambda x: x*x, data)

Chapter 7 : Additional Tuple Techniques

Use an immutable namedtuple as a record
However do remember that, while in some cases, the namedtuple adds clarity. In other cases, the namedtuple is a needless change in syntax from prefix to suffix

Avoid stateful classes by using families of tuples

Chapter 8 : The Itertools Module

Functional programming emphasizes stateless programming. In Python, this leads us to work with generator expressions generator functions and iterables

Primary limitation of Python iterators: they can be used only once. This can be astonishing because there's no error. Once exhausted, they appear to have no elements and will raise the StopIteration exception every time they're used

Working with infinite iterators

count()     # unlimited version of range() function
cycle()     # reiterate a cycle of values
repeat()    # repeat a single value an indefinite number of times

Using the finite iterators

groupby()   # partition an iterator
zip_longest()   # pads the shorter iterables with the given fill-value
dropwhile(), takewhile()

Apply a function to data via starmap() and map()

map(function, arg_iter) == (function(a) for a in arg_iter)

starmap(function, arg_iter) = (function(*a) for a in arg_iter)

Chapter 9 : More Itertools Techniques

In relational database theory, a join between tables can be thought of as a filtered product

Chapter 10 : The Functools Module

Some very useful functions of the functools module


In Python 2, reduce() was a builtin function. In Python 3 it has been moved to the functools module

Example of usage of @lru_cache

from functools import lru_cache

def fibc(n):
    """Fibonacci numbers with naive recursion and caching"""

# to clear the cache

For cases where similar values are computed repeatedly, the speedup can be impressive. For situations where the cached values are rarely reused, the overheads of maintaining the cached values outweigh any speedups

Applications that work with float values might not benefit much from memoization because all floats differ by small amounts. The least-significant bits of a float value are sometimes just random noise which prevents the exact equality test in the lru_cache decorator from working

Apply partial arguments with partial()
A partially applied function is a new function built from an old function and a subset of the required arguments

def add_n(n, x):
    """adds n to x and returns the result"""
    return n+x
from functools import partial
add_1 = partial(add_n, 1)
add_2 = partial(add_n, 2)

add_1(3)    # prints 4
add_2(3)    # prints 5

Combining map() and reduce()

def map_reduce(map_func, reduce_func, iterable):
    return reduce(reduce_func, map(map_func, iterable))

This is a very useful helper function

from operator import add
square = lambda x: x*x

def sum_squares(iterable):
    """returns sum of squares of the values in the iterable"""
    return map_reduce(square, add, iterable)

Chapter 11 : Decorator Design Techniques

We have two tiers of higher-order functions involved in defining a decorator as follows:
The decorator function applies a wrapper to a base function and returns the new wrapper. This function can do some one-time only evaluation as part of building the decorated function

The wrapper function can (ad usually does) evaluate the base function. This function will be evaluated every time the decorated function is evaluated

Example of a simple decorator:

from functools import wraps
def nullable(function):
    def null_wrapper(arg):
        return None if arg is None else function(arg)
    return null_wrapper

Always use the functools.wraps() function to assure that the decorated function retains the attributes of the original function

Its important that decorators only return function, and not attempt any processing of data.

Decorators are meta-programming: a code that creates code

The @wraps decorator applies the update_wrapper() function to preserve a few attributes of a wrapped function

Cross-cutting concerns

The idea is to have a library of common decorators that can provide implementations for common concerns. We often call these cross-cutting concerns because they apply across several functions
Handling incomplete data

A great deal of functional programming amount to f(g(x)) kinds of constructs
We can always resort to the map(f, map(g, x)) method
It might be more clear, however, to use the map(f_g, x) method to apply a composite to a collection. It's important to note that there's no inherent performance advantage to either technique

Example: Pre-processing bad data

import decimal
def bad_data(function):
    def wrap_bad_data(text, *args, **kwargs):
            return function(text, *args, **kwargs)
        except (ValueError, decimal.InvalidOperation):
            cleaned = text.replace(',', '')
            return function(cleaned, *args, **kwargs)
    return wrap_bad_data

In this case, we've provided Python *args and **kwargs parameters. This assures that the wrapped functions can have additional argument values provided

We can use this function to create a suite of functions that can do conversions of good data as well as a limited amount of data cleansing to handle specific kinds of bad data

bd_int = bad_data(int)
bd_float = bad_data(float)
bd_decimal = bad_data(Decimal)

bd_int("13")    # prints 13
bd_int("1,236") # prints 1236
bd_int("1,371", base=16)    # prints 4977

Adding a parameter to a Decorator
Example: Improving our bad-data aware decorator to create a slightly more flexible conversion

import decimal

def clean_char(text, char):
    return text.replace(char, '')

def clean_list(text, char_list):
    if not char_list:
        return text
    c, *other_chars = char_list
    return clean_list(clean_char(text, c), other_chars)

def bad_char_remove(*char_list):
    def cr_decorator(function):
        def wrap_char_remove(text, *args, **kwargs):
                return function(text, *args, **kwargs)
            except (ValueError, decimal.InvalidOperation):
                cleaned = clean_list(text, char_list)
                return function(cleaned, *args, **kwargs)
        return wrap_char_remove
    return cr_decorator

A parametrized decorator has three parts:

The overall decorator. This defines and returns the abstract decorator. In this case the bad_char_remove is the overall decorator and the cr_decorator is the abstract decorator

The abstract decorator. In this case, cr_decorator

The decorating wrapper. In this case the wrap_char_remove

Sample Usage of the above parametrized decorator

@bad_char_remove('$', ',')
def currency(text, **kwargs):
    return Decimal(text, **kw)
currency("13")      # prints Decimal('13')
currency("$3.14")   # prints Decimal('3.14')
currency("$1,700.00")   # prints Decimal('1700.00')

Keep the nested decorators limited

def h(x):

This has a meaning somewhat like f * g * h (x). However the name is merely h(x).
If f_wrap and g_wrap handle cross-cutting concerns, then this is acceptable.
If, on the other hand, we're using a decoration to create a composite function, it might also be better to use the following command:

f_g_h = f_wrap(g_wrap(h))

A succinct and expressive program is the goal

Generally, decorators work well when we have a number relatively simple and fixed aspects that we want to include with a given function (or a class). Decorators are also important when these additional aspects can be looked at as an infrastructure or a support, and not something essential to the meaning of the application code.

The typical examples of logging or security testing can be considered as the kind of background processing that isn't specific to the problem domain

Chapter 12 : The Multiprocessing and Threading Modules

When we eliminate complex, shared state and design around non-strict processing, we can leverage parallelism to improve performance

The biggest difficulty in developing parallel programs is coordinating updates to shared resources

Using a multiprocessing pool for concurrent processing

import multiprocessing

with multiprocessing.Pool(4) as workers:
    workers.map(analysis, glob.glob(pattern)

If we start p processes in the pool, our overall application will include p+1 processes. There will be one parent process and p children.

The ordinary Linux parent/child process rules apply to the subprocess created by this module. If the parent crashes without properly collecting final status from the child processes, then "zombie" processes can be left running. For this reason, a process Pool object is a context manager. When we use a pool via the with statement, at the end of the context, the children are properly terminated


In some cases, it can help to have more workers than CPUs. This might be true when each worker has I/O-intensive processing. Having many worker processes waiting for I/O to complete can improve the elapsed running time of the application

Pool object has 4 map-like methods

# order is preserved

# lazier than map

# order of results is not preserved

# each item in the iterable must be a tuple; the tuple is passed
# to the function using the * modifies so that each value of 
# the tuple becomes a positional argument value

The behavior of the map(), starmap() and apply() functions is to allocate work to a subprocess in the Pool object and then collect the response from the subprocess when that response is ready. This can cause the child to wait fro the parent to gather the results

The _async() function variants do not wait for the child to finish. These functions return an object that can be queried to get the individual results from the child processes

import multiprocessing

pattern = '*.gz'
combined = Counter()

with multiprocessing.Pool() as workers:
    results = workers.map_async(analysis, glob.glob(pattern))
    data = results.get()
    for c in data:

The response from the map_async() function is a MapResult object that we can query for the result and overall status of the pool of workers

Using the concurrent.futures module

import concurrent.futures

pool_size = 4
pattern = '*.gz'
combined = Counter()

with concurrent.futures.ProcessPoolExecutor(max_workers=pool_size) as workers:
    for result in workers.map(analysis, glob.glob(pattern)):

Using the threading and queue modules
The threading api isn't ideally suited to functional programming

You can use thread-safe queues in the `queue` module to pass objects from thread to thread

Its best to focus on concurrent.futures module as the most accessible way to write concurrent functional programs

The benefit of using functional programming technique s that each part of the overall process can be defined as mapping. This makes it practical to consider different architectures to locate an optimal design

Chapter 13 : Conditional Expressions and the Operator Module

Functional programming emphasizes lazy or non-strict ordering of operations

Use the operator module instead of lambdas for simple operations

fst = lambda x: x[1]

# this is equivalent to

from operator import itemgetter

# also consider using attrgetter for named attributes
from operator import attrgetter

Use itertools.starmap. Its useful when we have a sequence of tuples

operator.truediv() v/s fractions.Fraction()
truediv is the / operator. Fraction() will create exact rational values that don't suffer from the limitations of floating point approximations

Reducing with operators
operator.and() and operator.or() are the bit-wise & and / operators. If we want the proper Boolean operators, we have to use the all()and any() functions instead of the reduce() function

Chapter 14 : The PyMonad Library


Chapter 15 : A Functional Approach to Web Services

The Web Server Gateway Interface (WSGI) defines a relatively simple, standardized design pattern for crating a response to a web request

Chapter 16 : Optimizations and Improvements

Memoization and Caching

One memory optimization technique we have in Python is to use an iterable

Optimizing Accuracy
use fractions.Fraction(), decimal.Decimal

It's important to use decimal.Decimal values to work with currency. It's a common error to use a float value. When using a float value, additional noise bits are introduced because of the mismatch between Decimal values provided as input and the binary approximation used by floating point values