10 Enumerables and Streams¶
{% include toc.html %}
10.1 Enumerables¶
Elixir provides the concept of enumerables and the ``Enum` module </docs/stable/elixir/Enum.html>`__ to work with them. We have already learned two enumerables: lists and maps.
iex> Enum.map([1, 2, 3], fn x -> x * 2 end)
[2, 4, 6]
iex> Enum.map(%{1 => 2, 3 => 4}, fn {k, v} -> k * v end)
[2, 12]
The Enum
module provides a huge range of functions to transform,
sort, group, filter and retrieve items from enumerables. It is one of
the modules developers use frequently in their Elixir code.
Elixir also provides ranges:
iex> Enum.map(1..3, fn x -> x * 2 end)
[2, 4, 6]
iex> Enum.reduce(1..3, 0, &+/2)
6
Since the Enum module was designed to work across different data types,
its API is limited to functions that are useful across many data types.
For specific operations, you may need to reach to modules specific to
the data types. For example, if you want to insert an element at a given
position in a list, you should use the List.insert_at/3
function
from the ``List` module </docs/stable/elixir/List.html>`__, as it
would make little sense to insert a value into, for example, a range.
We say the functions in the Enum
module are polymorphic because they
can work with diverse data types. In particular, the functions in the
Enum
module can work with any data type that implements the
``Enumerable` protocol </docs/stable/elixir/Enumerable.html>`__. We are
going to discuss Protocols in a later chapter, for now we are going to
move on to a specific kind of enumerable called streams.
10.2 Eager vs Lazy¶
All the functions in the Enum
module are eager. Many functions
expect an enumerable and return a list back:
iex> odd? = &(rem(&1, 2) != 0)
#Function<6.80484245/1 in :erl_eval.expr/5>
iex> Enum.filter(1..3, odd?)
[1, 3]
This means that when performing multiple operations with Enum
, each
operation is going to generate an intermediate list until we reach the
result:
iex> 1..100_000 |> Enum.map(&(&1 * 3)) |> Enum.filter(odd?) |> Enum.sum
7500000000
The example above has a pipeline of operations. We start with a range
and then multiply each element in the range by 3. This first operation
will now create and return a list with 100_000
items. Then we keep
all odd elements from the list, generating a new list, now with
50_000
items, and then we sum all entries.
10.2.1 The pipe operator¶
The |>
symbol used in the snippet above is the pipe operator: it
simply takes the output from the expression on its left side and passes
it as the input to the function call on its right side. It’s similar to
the Unix |
operator. Its purpose is to highlight the flow of data
being transformed by a series of functions. To see how it can make the
code cleaner, have a look at the example above rewritten without using
the |>
operator:
iex> Enum.sum(Enum.filter(Enum.map(1..100_000, &(&1 * 3)), odd?))
7500000000
Find more about the pipe operator `by reading its documentation <http://elixir-lang.org/docs/stable/elixir/Kernel.html#|>/2>`__.
10.3 Streams¶
As an alternative to Enum
, Elixir provides the ``Stream`
module </docs/stable/elixir/Stream.html>`__ which supports lazy
operations:
iex> 1..100_000 |> Stream.map(&(&1 * 3)) |> Stream.filter(odd?) |> Enum.sum
7500000000
Streams are lazy, composable enumerables. Instead of generating
intermediate lists, streams create a series of computations that are
invoked only when we pass it to the Enum
module. Streams are useful
when working with large, possibly infinite, collections.
They are lazy because, as shown in the example above,
1..100_000 |> Stream.map(&(&1 * 3))
returns a data type, an actual
stream, that represents the map
computation over the range
1..100_000
:
iex> 1..100_000 |> Stream.map(&(&1 * 3))
#Stream<[enum: 1..100000, funs: [#Function<34.16982430/1 in Stream.map/2>]]>
Furthermore, they are composable because we can pipe many stream operations:
iex> 1..100_000 |> Stream.map(&(&1 * 3)) |> Stream.filter(odd?)
#Stream<[enum: 1..100000, funs: [...]]>
Many functions in the Stream
module accept any enumerable as
argument and return a stream as result. It also provides functions for
creating streams, possibly infinite. For example, Stream.cycle/1
can
be used to create a stream that cycles a given enumerable infinitely. Be
careful to not call a function like Enum.map/2
on such streams, as
they would cycle forever:
iex> stream = Stream.cycle([1, 2, 3])
#Function<15.16982430/2 in Stream.cycle/1>
iex> Enum.take(stream, 10)
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
On the other hand, Stream.unfold/2
can be used to generate values
from a given initial value:
iex> stream = Stream.unfold("hełło", &String.next_codepoint/1)
#Function<39.75994740/2 in Stream.unfold/2>
iex> Enum.take(stream, 3)
["h", "e", "ł"]
Another interesting function is Stream.resource/3
which can be used
to wrap around resources, guaranteeing they are opened right before
enumeration and closed afterwards, even in case of failures. For
example, we can use it to stream a file:
iex> stream = File.stream!("path/to/file")
#Function<18.16982430/2 in Stream.resource/3>
iex> Enum.take(stream, 10)
The example above will fetch the first 10 lines of the file you have selected. This means streams can be very useful for handling large files or even slow resources like network resources.
The amount of functions and functionality in
`Enum
</docs/stable/elixir/Enum.html>`__ and
`Stream
</docs/stable/elixir/Stream.html>`__ modules can be daunting
at first but you will get familiar with them case by case. In
particular, focus on the Enum
module first and only move to
Stream
for the particular scenarios where laziness is required to
either deal with slow resources or large, possibly infinite,
collections.
Next we’ll look at a feature central to Elixir, Processes, which allows us to write concurrent, parallel and distributed programs in an easy and understandable way.