Collections

Unite

merge(*colls)

Merges several collections of the same type into one: dicts, sets, lists, tuples, iterators or strings. For dicts, values from later dicts override values from earlier ones with the same keys.

Can be used in a variety of ways, but merging dicts is probably the most common:

def utility(**options):
    defaults = {...}
    options = merge(defaults, options)
    ...

If you merge sequences and don’t need to preserve the collection type, use concat() or iconcat() instead.
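
The override rule for dicts can be sketched in plain Python. The helper merge_dicts_sketch() and the option names are hypothetical and only illustrate the behaviour described above; this is not the library's implementation:

```python
def merge_dicts_sketch(*dicts):
    """Hypothetical stand-in for the dict case; not the library's code."""
    result = {}
    for d in dicts:
        result.update(d)  # a key seen again is overwritten by the later dict
    return result


defaults = {"verbose": False, "retries": 3}   # made-up option names
options = merge_dicts_sketch(defaults, {"retries": 5})

print(options)   # {'verbose': False, 'retries': 5}
print(defaults)  # {'verbose': False, 'retries': 3}, the input is left untouched
```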

join(colls)

Joins collections of the same type into one. Same as merge(), but accepts an iterable of collections.

Use cat() and icat() for a non-type-preserving sequence join.

Transform and select

All functions in this section support Extended function semantics.

walk(f, coll)

Returns a collection of the same type as coll consisting of its elements mapped with the given function:

walk(inc, {1, 2, 3}) # -> {2, 3, 4}
walk(inc, (1, 2, 3)) # -> (2, 3, 4)

When walking a dict, (key, value) pairs are mapped, i.e. these lines flip a dict, just like flip():

swap = lambda kv: (kv[1], kv[0])
walk(swap, {1: 10, 2: 20})

walk() works with strings too:

walk(lambda x: x * 2, 'ABC')   # -> 'AABBCC'
walk(compose(str, ord), 'ABC') # -> '656667'

You should probably use map() or imap() when you don’t need to preserve the collection type.
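
To make the type-preserving behaviour concrete, here is a minimal sketch that covers only dict, str, list, tuple and set. The name walk_sketch() is hypothetical, and the real function handles more collection types than this:

```python
def walk_sketch(f, coll):
    """Hypothetical stand-in covering only dict, str, list, tuple and set."""
    if isinstance(coll, dict):
        # f receives (key, value) pairs and must return pairs
        return dict(f(pair) for pair in coll.items())
    if isinstance(coll, str):
        # results are joined back into a string
        return "".join(f(ch) for ch in coll)
    # list, tuple and set can be rebuilt from an iterable of results
    return type(coll)(f(x) for x in coll)


print(walk_sketch(lambda x: x + 1, (1, 2, 3)))                    # (2, 3, 4)
print(walk_sketch(lambda kv: (kv[1], kv[0]), {1: 10, 2: 20}))     # {10: 1, 20: 2}
print(walk_sketch(lambda ch: ch * 2, "ABC"))                      # AABBCC
```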

walk_keys(f, coll)

Walks keys of coll, mapping them with the given function. Works with mappings and collections of pairs:

walk_keys(str.upper, {'a': 1, 'b': 2}) # {'A': 1, 'B': 2}
walk_keys(int, json.loads(some_dict))  # restore key type lost in translation

Note that it preserves the collection type, whether it is a simple dict, defaultdict, OrderedDict, any other mapping class or a collection of pairs.
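
The "key type lost in translation" case can be reproduced with the standard json module. The helper walk_keys_sketch() below is a hypothetical plain-Python stand-in that handles dict and OrderedDict only (a defaultdict would need its factory passed separately):

```python
import json
from collections import OrderedDict


def walk_keys_sketch(f, mapping):
    """Hypothetical stand-in; works for dict and OrderedDict only."""
    # Rebuild the same mapping class from (new_key, value) pairs.
    return type(mapping)((f(k), v) for k, v in mapping.items())


encoded = json.dumps({1: "one", 2: "two"})   # JSON object keys are always strings
decoded = json.loads(encoded)                # {'1': 'one', '2': 'two'}
restored = walk_keys_sketch(int, decoded)    # {1: 'one', 2: 'two'}

ordered = walk_keys_sketch(str.upper, OrderedDict([("a", 1), ("b", 2)]))
print(type(ordered).__name__)                # OrderedDict
```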

walk_values(f, coll)

Walks values of coll, mapping them with the given function. Works with mappings and collections of pairs.

Common use is to process values somehow:

clean_values = walk_values(int, form_values)
sorted_groups = walk_values(sorted, groups)

Hint: you can use partial(sorted, key=...) instead of sorted() to sort in a non-default way.

Note that walk_values() has special handling for defaultdicts. It constructs a new one with values mapped the same way as for an ordinary dict, but the default factory of the new defaultdict is a composition of f and the old default factory:

d = defaultdict(lambda: 'default', a='hi', b='bye')
walk_values(str.upper, d)
# -> defaultdict(lambda: 'DEFAULT', a='HI', b='BYE')
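
The composed default factory can be sketched as follows. The helper walk_values_dd_sketch() is hypothetical and shows only the defaultdict case described above, not the library's code:

```python
from collections import defaultdict


def walk_values_dd_sketch(f, d):
    """Hypothetical stand-in for the defaultdict case only."""
    old_factory = d.default_factory
    # The new factory is the composition of f and the old factory.
    return defaultdict(lambda: f(old_factory()),
                       ((k, f(v)) for k, v in d.items()))


d = defaultdict(lambda: "default", a="hi", b="bye")
upper = walk_values_dd_sketch(str.upper, d)

print(upper["a"])        # HI
print(upper["missing"])  # DEFAULT, produced by the composed factory
```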

select(pred, coll)

Filters elements of coll by pred, constructing a collection of the same type. When filtering a dict, pred receives (key, value) pairs. See select_keys() and select_values() to filter it by keys or values respectively:

select(even, {1, 2, 3, 10, 20})
# -> {2, 10, 20}

select(lambda kv: kv[0] == kv[1], {1: 1, 2: 3})
# -> {1: 1}

select_keys(pred, coll)

Selects the part of a dict or a collection of pairs whose keys pass the given predicate.

This way the public part of an instance attribute dictionary can be selected:

is_public = complement(re_tester('^_'))
public = select_keys(is_public, instance.__dict__)

select_values(pred, coll)

Selects the part of a dict or a collection of pairs whose values pass the given predicate.

Strip falsy values from a dict:

select_values(bool, some_dict)

compact(coll)

Removes falsy values from the given collection. When compacting a dict, all keys with falsy values are dropped.

Extract integer data from request:

compact(walk_values(silent(int), request_dict))
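
The same pipeline spelled out in plain Python, as a sketch. The helper to_int_or_none() and the request_dict contents are made up for illustration. Note that a valid 0 is also falsy, so it is removed together with the failed conversions:

```python
def to_int_or_none(value):
    """Hypothetical helper: returns None instead of raising on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


request_dict = {"page": "2", "limit": "abc", "offset": "0"}  # made-up input

converted = {k: to_int_or_none(v) for k, v in request_dict.items()}
compacted = {k: v for k, v in converted.items() if v}  # keep truthy values only

print(converted)  # {'page': 2, 'limit': None, 'offset': 0}
print(compacted)  # {'page': 2}
```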

Dict utils

merge_with(f, *dicts)
join_with(f, dicts)

Merges several dicts, combining values for the same key with the given function:

merge_with(list, {1: 1}, {1: 10, 2: 2})
# -> {1: [1, 10], 2: [2]}

merge_with(sum, {1: 1}, {1: 10, 2: 2})
# -> {1: 11, 2: 2}

join_with(first, ({n % 3: n} for n in range(100, 110)))
# -> {0: 102, 1: 100, 2: 101}
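
One way to picture this: values are first grouped per key in order of appearance, then f is applied to each group. The helper merge_with_sketch() is a hypothetical plain-Python stand-in, not the library's implementation:

```python
def merge_with_sketch(f, *dicts):
    """Hypothetical stand-in: group values per key, then combine with f."""
    grouped = {}
    for d in dicts:
        for key, value in d.items():
            grouped.setdefault(key, []).append(value)
    return {key: f(values) for key, values in grouped.items()}


as_lists = merge_with_sketch(list, {1: 1}, {1: 10, 2: 2})
as_sums = merge_with_sketch(sum, {1: 1}, {1: 10, 2: 2})
# Taking the first value of each group mirrors the join_with(first, ...) example.
firsts = merge_with_sketch(lambda vs: vs[0], *({n % 3: n} for n in range(100, 110)))

print(as_lists)  # {1: [1, 10], 2: [2]}
print(as_sums)   # {1: 11, 2: 2}
```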

zipdict(keys, vals)

Returns a dict with the keys mapped to the corresponding vals. Stops pairing at the end of the shorter sequence:

zipdict('abcd', range(4))
# -> {'a': 0, 'b': 1, 'c': 2, 'd': 3}

zipdict('abc', count())
# -> {'a': 0, 'b': 1, 'c': 2}

flip(mapping)

Flips the passed dict, swapping its keys and values. Also works for sequences of pairs. Preserves the collection type:

flip(OrderedDict(['aA', 'bB']))
# -> OrderedDict([('A', 'a'), ('B', 'b')])

project(mapping, keys)

Returns a dict containing only those entries in mapping whose key is in keys.

Most useful to shrink some common data or options to a predefined subset. One particular case is constructing a dict of used variables:

merge(project(__builtins__, names), project(globals(), names))

izip_values(*dicts)

Yields tuples of corresponding values of given dicts. Skips any keys not present in all of the dicts. Comes in handy when comparing two or more dicts:

max_change = max(abs(x - y) for x, y in izip_values(items, old_items))

izip_dicts(*dicts)

Yields tuples like (key, (value1, value2, ...)) for each common key of all given dicts. A neat way to process several dicts at once:

changed_items = [id for id, (new, old) in izip_dicts(items, old_items)
                 if abs(new - old) >= PRECISION]

lines = {id: cnt * price for id, (cnt, price) in izip_dicts(amounts, prices)}

See also izip_values().
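
A plain-Python sketch of the iteration, assuming the (key, (value1, value2, ...)) shape that the examples above unpack. The helper zip_dicts_sketch() and the sample data are hypothetical:

```python
def zip_dicts_sketch(*dicts):
    """Hypothetical stand-in: yields (key, values) for keys present in all dicts."""
    common = set(dicts[0]).intersection(*dicts[1:])
    for key in common:
        yield key, tuple(d[key] for d in dicts)


amounts = {"apple": 3, "pear": 2, "plum": 7}   # made-up data
prices = {"apple": 0.5, "pear": 1.0}           # no price for "plum"

lines = {item: cnt * price
         for item, (cnt, price) in zip_dicts_sketch(amounts, prices)}

print(sorted(lines.items()))  # [('apple', 1.5), ('pear', 2.0)]
```

Keys missing from any of the dicts ("plum" here) are skipped, which is the same rule izip_values() follows.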

get_in(coll, path, default=None)

Returns a value corresponding to path in nested collection:

get_in({"a": {"b": 42}}, ["a", "b"])    # -> 42
get_in({"a": {"b": 42}}, ["c"], "foo")  # -> "foo"

set_in(coll, path, value)

Creates a nested collection with the value set at the specified path. The original collection is not changed:

set_in({"a": {"b": 42}}, ["a", "b"], 10)
# -> {"a": {"b": 10}}

set_in({"a": {"b": 42}}, ["a", "c"], 10)
# -> {"a": {"b": 42, "c": 10}}

update_in(coll, path, update, default=None)

Creates a nested collection with a value at specified path updated:

update_in({"a": {}}, ["a", "cnt"], inc, default=0)
# -> {"a": {"cnt": 1}}
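
How the three functions relate can be sketched for nested dicts. All three helpers below are hypothetical, dict-only stand-ins; they show the copy-on-write idea and how an update can be built from a get followed by a set:

```python
def get_in_sketch(coll, path, default=None):
    """Hypothetical stand-in: follow path, return default on any missing step."""
    for key in path:
        try:
            coll = coll[key]
        except (KeyError, IndexError, TypeError):
            return default
    return coll


def set_in_sketch(coll, path, value):
    """Dict-only: copies each dict along the path instead of mutating it."""
    if not path:
        return value
    copy = dict(coll)
    copy[path[0]] = set_in_sketch(coll.get(path[0], {}), path[1:], value)
    return copy


def update_in_sketch(coll, path, update, default=None):
    # Read the current value (or default), transform it, write it back.
    return set_in_sketch(coll, path, update(get_in_sketch(coll, path, default)))


original = {"a": {"b": 42}}
changed = set_in_sketch(original, ["a", "c"], 10)
counted = update_in_sketch({"a": {}}, ["a", "cnt"], lambda n: n + 1, default=0)

print(changed)   # {'a': {'b': 42, 'c': 10}}
print(original)  # {'a': {'b': 42}}
print(counted)   # {'a': {'cnt': 1}}
```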

Data manipulation

where(mappings, **cond)
iwhere(mappings, **cond)

Looks through each dict in the given sequence, returning a list or an iterator of all the dicts that contain all key-value pairs in cond:

where(plays, author="Shakespeare", year=1611)
# => [{"title": "Cymbeline", "author": "Shakespeare", "year": 1611},
#     {"title": "The Tempest", "author": "Shakespeare", "year": 1611}]

The iterator version can be used for efficiency or when you don’t need the whole list, e.g. when you are looking for the first match:

first(iwhere(plays, author="Shakespeare"))
# => {"title": "The Two Gentlemen of Verona", ...}
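
The lazy matching can be sketched with a generator. The helper iwhere_sketch() is a hypothetical stand-in, and the last record in plays is made up to show that a dict lacking a key from cond never matches:

```python
def iwhere_sketch(mappings, **cond):
    """Hypothetical stand-in: lazily yields dicts containing all pairs in cond."""
    for m in mappings:
        if all(k in m and m[k] == v for k, v in cond.items()):
            yield m


plays = [
    {"title": "Cymbeline", "author": "Shakespeare", "year": 1611},
    {"title": "The Tempest", "author": "Shakespeare", "year": 1611},
    {"title": "Untitled", "author": "Anonymous"},  # made-up record with no year
]

matches = iwhere_sketch(plays, author="Shakespeare", year=1611)
first_match = next(matches, None)  # only the first dict has been inspected so far

print(first_match["title"])  # Cymbeline
```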

pluck(key, mappings)
ipluck(key, mappings)

Returns a list or an iterator of values for key in each mapping in the given sequence. Essentially a shortcut for:

map(operator.itemgetter(key), mappings)

pluck_attr(attr, objects)
ipluck_attr(attr, objects)

Returns a list or an iterator of values for attr in each object in the given sequence. Essentially a shortcut for:

map(operator.attrgetter(attr), objects)

Useful when dealing with collections of ORM objects:

users = User.query.all()
ids = pluck_attr('id', users)

invoke(objects, name, *args, **kwargs)
iinvoke(objects, name, *args, **kwargs)

Calls the named method with the given arguments for each object in objects and returns a list or an iterator of results.
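
By analogy with the pluck() shortcuts above, the behaviour can be approximated with operator.methodcaller from the standard library. Both helper names below are hypothetical stand-ins, not the library's code:

```python
from operator import methodcaller


def iinvoke_sketch(objects, name, *args, **kwargs):
    """Hypothetical stand-in: lazily calls obj.name(*args, **kwargs) per object."""
    return map(methodcaller(name, *args, **kwargs), objects)


def invoke_sketch(objects, name, *args, **kwargs):
    """Hypothetical stand-in: same as above, but returns a list."""
    return list(iinvoke_sketch(objects, name, *args, **kwargs))


rows = invoke_sketch(["a,b", "c,d"], "split", ",")
print(rows)  # [['a', 'b'], ['c', 'd']]
```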

Content tests

is_distinct(coll, key=identity)

Checks if all elements in the collection are different:

assert is_distinct(field_names), "All fields should be named differently"

Uses key to differentiate values. This way one can check if all first letters of words are different:

is_distinct(words, key=0)
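
A plain-Python sketch of the check, assuming the keys are hashable. The helper is_distinct_sketch() and the word list are hypothetical, and an explicit lambda replaces the key=0 shorthand used above:

```python
def is_distinct_sketch(coll, key=lambda x: x):
    """Hypothetical stand-in; requires key(item) to be hashable."""
    seen = set()
    for item in coll:
        k = key(item)
        if k in seen:
            return False  # stop at the first repeated key
        seen.add(k)
    return True


words = ["apple", "banana", "avocado"]  # made-up data

print(is_distinct_sketch(words))                     # True
print(is_distinct_sketch(words, key=lambda w: w[0])) # False, two words start with "a"
```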

all([pred, ]seq)

Checks if pred holds for every element in seq. If pred is omitted, checks if all elements of seq are true (the same as the built-in all()):

they_are_ints = all(isinstance(n, int) for n in seq)
they_are_even = all(even, seq)

Note that the first example could be rewritten using isa() like this:

they_are_ints = all(isa(int), seq)

any([pred, ]seq)

Returns True if pred holds for any item in the given sequence. If pred is omitted, checks if any element of seq is true.

Check if there is a needle in a haystack, using extended predicate semantics:

any(r'needle', haystack_strings)

none([pred, ]seq)

Checks if none of the items in the given sequence pass pred, or if none of them is true when pred is omitted.

Just a stylish way to write not any(...):

assert none(' ' in name for name in names), "Spaces in names not allowed"

one([pred, ]seq)

Returns true if exactly one of the items in seq passes pred. Checks for boolean true if pred is omitted.

some([pred, ]seq)

Finds the first item in seq passing pred, or the first that is true if pred is omitted.
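
The last three tests can be approximated with built-ins. The helpers below are hypothetical stand-ins that always take an explicit pred. Returning None from some_sketch() when nothing matches is an assumption of this sketch, since the text above does not say what happens in that case:

```python
def none_sketch(pred, seq):
    """Hypothetical stand-in: true if no item passes pred."""
    return not any(pred(x) for x in seq)


def one_sketch(pred, seq):
    """Hypothetical stand-in: true if exactly one item passes pred."""
    return sum(1 for x in seq if pred(x)) == 1


def some_sketch(pred, seq):
    """Hypothetical stand-in: first item passing pred (None here if there is none)."""
    return next((x for x in seq if pred(x)), None)


def is_even(n):
    return n % 2 == 0


numbers = [1, 3, 4, 7]  # made-up data

print(none_sketch(is_even, [1, 3, 5]))  # True
print(one_sketch(is_even, numbers))     # True
print(some_sketch(is_even, numbers))    # 4
```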

Low-level helpers

empty(coll)

Returns an empty collection of the same type as coll.

iteritems(coll)

Returns an iterator of items of coll. This means (key, value) pairs for dictionaries and just elements for other collections:

list(iteritems({1, 2, 42}))
# -> [1, 42, 2]

list(iteritems({'a': 1}))
# -> [('a', 1)]

itervalues(coll)

Returns an iterator of values of a coll. This means values for any dictionaries and just elements for other collections:

list(itervalues({1, 2, 42}))
# -> [1, 42, 2]

list(itervalues({'a': 1}))
# -> [1]