10 Functionals

10.1 Introduction

“To become significantly more reliable, code must become more transparent. In particular, nested conditions and loops must be viewed with great suspicion. Complicated control flows confuse programmers. Messy code often hides bugs.”

— Bjarne Stroustrup

A functional is a function that takes a function as an input and returns a vector as output. Here’s a simple functional: it calls the function provided as input with 1000 random uniform numbers.
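A minimal sketch of such a functional follows. The name randomise() is chosen here for illustration; any function that accepts a numeric vector can be passed in.

```r
# randomise() (illustrative name) calls f with 1000 random uniform numbers
randomise <- function(f) f(runif(1e3))

randomise(mean)  # close to 0.5
randomise(sum)   # close to 500
```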

The chances are that you’ve already used a functional. You might have used for-loop replacements like base R’s lapply(), apply(), or tapply(), or purrr’s map() or one of its variants; or maybe you’ve used a mathematical functional like integrate() or optim(). All functionals take a function as input (among other things) and return a vector as output.

A common use of functionals is as an alternative to for loops. For loops have a bad rap in R. They have a reputation for being slow (although that reputation is only partly true, see Section 3.5.1 for more details). But the real downside of for loops is that they’re not very expressive. A for loop conveys that it’s iterating over something, but doesn’t clearly convey a high level goal. Instead of using a for loop, it’s better to use a functional. Each functional is tailored for a specific task, so when you recognise the functional you immediately know why it’s being used. Functionals play other roles as well as replacements for for-loops. They are useful for encapsulating common data manipulation tasks like split-apply-combine, for thinking “functionally”, and for working with mathematical functions.

Functionals reduce bugs in your code by better communicating intent. Functionals implemented in base R and purrr are well tested and efficient, because they’re used by so many people, so bugs in them are rare. Many are written in C, and use special tricks to enhance performance. That said, using functionals will not always produce the fastest code. Instead, they help you clearly communicate and build tools that solve a wide range of problems. It’s a mistake to focus on speed until you know it’ll be a problem. Once you have clear, correct code you can make it fast using the techniques you’ll learn in Section 24.

Using functionals is a pattern matching exercise. You look at the for loop, and find a functional that matches the basic form. If one doesn’t exist, don’t try and torture an existing functional to fit the form you need. Instead, just leave it as a for loop!

It’s not about eliminating for loops. It’s about having someone else write them for you!

Outline

Prerequisites

This chapter will focus on functionals provided by the purrr package. These functions have a consistent interface that makes it easier to understand the key ideas than their base equivalents, which have grown organically over many years. I’ll compare and contrast base R functions as we go, and then wrap up the chapter with a discussion of base functionals that don’t have purrr equivalents.

Many R users feel guilty about using for loops instead of apply functions. It’s natural to blame yourself for failing to understand and internalise the apply family of functions. However, I think this is like blaming yourself when you embarrass yourself by failing to pull open a door when it’s supposed to be pushed open [1]. It’s not actually your fault, because many people suffer the same problem; it’s a failing of design. Similarly, I think the reason why the apply functions are so hard for so many people is because their design is suboptimal.

10.2 My first functional: map()

The most fundamental functional is purrr::map() [2]. It takes a vector and a function, calls the function once for each element of the vector, and returns the results in a list. In other words, map(1:3, f) is equivalent to list(f(1), f(2), f(3)).

Or, graphically:

The implementation of map() is quite simple. We allocate a list the same length as the input, and then fill in the list with a for loop. The basic implementation is only a handful of lines of code:
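A sketch of that basic implementation, using the name simple_map() that the chapter refers to later. It follows the description above and is not purrr's actual source.

```r
simple_map <- function(x, f, ...) {
  # Allocate a list the same length as the input
  out <- vector("list", length(x))
  # Fill it in, calling f once per element
  for (i in seq_along(x)) {
    out[[i]] <- f(x[[i]], ...)
  }
  out
}

triple <- function(x) x * 3
simple_map(1:3, triple)
```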

The real purrr::map() function has a few differences: it is written in C to eke out every last iota of performance, preserves names, and supports a few shortcuts that you’ll learn about shortly.

The base equivalent to map() is lapply(). The only difference is that lapply() does not support the helpers that you’ll learn about below, so if you’re only using map() from purrr, you can skip the additional package and use base::lapply() directly.

10.2.1 Producing atomic vectors

map() returns a list. This makes map() the most general of the “map” family because you can put anything in a list. There are four more specific variants, map_lgl(), map_int(), map_dbl() and map_chr(), that return atomic vectors:
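To make the four variants concrete, here is a small sketch that applies each one to the columns of the built-in mtcars data frame. The helper n_unique() is defined here purely for illustration.

```r
library(purrr)

# map_chr() always returns a character vector
map_chr(mtcars, typeof)

# map_lgl() always returns a logical vector
map_lgl(mtcars, is.double)

# map_int() always returns an integer vector
n_unique <- function(x) length(unique(x))  # illustrative helper
map_int(mtcars, n_unique)

# map_dbl() always returns a double vector
map_dbl(mtcars, mean)
```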

These examples rely on the fact that data frames are lists containing vectors of the same length:

As with map(), the output is always the same length as the input, so each call to the function must return a single value; you cannot return multiple values. When debugging problems like this, it’s often useful to switch back to map() so you can see what the problematic output is.

Base R has two similar functions: sapply() and vapply().

sapply() tries to simplify the result to an atomic vector, wherever possible. But this simplification depends on the input, so sometimes you’ll get a list, sometimes a vector, and sometimes a matrix. This makes it difficult to program with.

vapply() allows you to provide a template that describes the output shape. If you want to stick with base R code you should always use vapply() in your functions, not sapply(). The primary downside of vapply() is its verbosity: the equivalent to map_dbl(x, mean, na.rm = TRUE) is vapply(x, mean, na.rm = TRUE, FUN.VALUE = double(1)).
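The difference is easiest to see with an empty input. The following base R sketch uses a small data frame invented for illustration:

```r
df <- data.frame(x = c(1, 2, NA), y = c(4, 5, 6))

# FUN.VALUE = double(1) promises one double per element
col_means <- vapply(df, mean, na.rm = TRUE, FUN.VALUE = double(1))
col_means  # x = 1.5, y = 5

# With a zero-length input, sapply() falls back to a list ...
empty_s <- sapply(list(), length)
# ... while vapply() still returns the promised type
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))
```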

10.2.2 Anonymous functions and helpers

Instead of using map() with an existing function, you can create an inline anonymous function (as mentioned in Section (first-class-functions)):

Anonymous functions are very useful, but the syntax is verbose. So purrr offers a shorthand:

That also makes for a handy way of generating random data:
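The forms just described can be sketched as follows, again using mtcars as convenient example data:

```r
library(purrr)

# A full anonymous function
long <- map_dbl(mtcars, function(x) length(unique(x)))

# The same computation with the ~ shorthand; .x is the current element
short <- map_dbl(mtcars, ~ length(unique(.x)))

# Generating random data: the elements of 1:3 are ignored and
# only determine how many times runif(2) is called
x <- map(1:3, ~ runif(2))
str(x)
```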

Reserve this syntax for short and simple functions. A good rule of thumb is that if your function spans lines or uses {}, it’s time to name your function.

Inside all purrr functions you can create an anonymous function using a ~ (the usual formula operator, pronounced “twiddle”). You can see what happens by calling as_mapper(): the map functions normally do that for you, but it’s useful to do it “by hand” to see what’s going on:

The function arguments look a little quirky but allow you to refer to . for one argument functions, .x and .y for two argument functions, and ..1, ..2, ..3, etc, for functions with an arbitrary number of arguments.

purrr also provides helpers for extracting elements from a vector, powered by purrr::pluck(). You can use a character vector to select elements by name, an integer vector to select by position, or a list to select by both name and position. These are very useful for working with deeply nested lists, which often arise when working with JSON.
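A short sketch of the extraction helpers, using a small nested list made up for this example:

```r
library(purrr)

x <- list(
  list(-1, x = 1, y = c(2), z = "a"),
  list(-2, x = 4, y = c(5, 6), z = "b"),
  list(-3, x = 8, y = c(9, 10, 11))
)

by_name <- map_dbl(x, "x")           # select by name: 1 4 8
by_pos  <- map_dbl(x, 1)             # select by position: -1 -2 -3
by_both <- map_dbl(x, list("y", 1))  # name, then position: 2 5 9
```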

In base R functions, like lapply(), you can provide the name of the function as a string. This isn’t tremendously useful as most of the time lapply(x, "f") is exactly equivalent to lapply(x, f), just more typing.

10.2.3 Passing arguments with ...

It’s often convenient to pass along additional arguments to the function that you’re calling. For example, you might want to pass na.rm = TRUE along to mean(). One way to do that is with an anonymous function:

But because the map functions pass ... along, there’s a simpler form available:
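Both forms side by side, on a small list invented for the example:

```r
library(purrr)

x <- list(1:5, c(1:10, NA))

# Extra argument supplied inside an anonymous function
via_lambda <- map_dbl(x, ~ mean(.x, na.rm = TRUE))

# Extra argument passed through ... to mean()
via_dots <- map_dbl(x, mean, na.rm = TRUE)

via_dots  # 3.0 5.5
```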

This is easiest to understand with a picture: any arguments that come after f in the call to map() are inserted after the data in individual calls to f():

It’s important to note that these arguments are not decomposed; or said another way, map() is only vectorised over its first argument. If an argument after f is a vector, it will be passed along as is, not decomposed like the first argument:

Note there’s a subtle difference between placing extra arguments inside an anonymous function compared with passing them to map(). Putting them in anonymous function means that they will be evaluated every time f() is executed, not just once when you call map(). This is easiest to see if we make the additional argument random:
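A sketch of that difference, with a hypothetical helper plus() and a fixed seed so the run is repeatable:

```r
library(purrr)

plus <- function(x, y) x + y  # illustrative helper
x <- c(0, 0, 0, 0)
set.seed(1)

# runif(1) is evaluated once, when map_dbl() is called,
# so every element receives the same random number
same <- map_dbl(x, plus, runif(1))

# runif(1) is evaluated on every call to the anonymous function,
# so every element receives a different random number
differ <- map_dbl(x, ~ plus(.x, runif(1)))
```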

10.2.4 Argument names

In the diagrams, I’ve omitted argument names to focus on the overall structure. But I recommend writing out the full names in your code, as it makes it easier to read. map(x, mean, 0.1) is perfectly valid code, but it generates mean(x[[1]], 0.1) so it relies on the reader remembering that the second argument to mean() is trim. To avoid unnecessary burden on the brain of the reader [3], be kind, and write map(x, mean, trim = 0.1).

This is the reason why the arguments to map() are a little odd: instead of being x and f, they are .x and .f. It’s easiest to see the problem that leads to these names using simple_map() defined above. simple_map() has arguments x and f so you’ll have problems whenever the function you are calling has arguments x or f:
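Here is a sketch that reproduces the clash. simple_map() is repeated so the example is self-contained, and the body of bootstrap_summary() is an assumption made for illustration; only its argument names matter.

```r
simple_map <- function(x, f, ...) {
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    out[[i]] <- f(x[[i]], ...)
  }
  out
}

# Hypothetical function that also has arguments named x and f
bootstrap_summary <- function(x, f) {
  f(sample(x, replace = TRUE))
}

# f = mean is matched to simple_map()'s own f, so bootstrap_summary
# ends up in ... and is passed to mean() as trim, which is an error
msg <- tryCatch(
  simple_map(mtcars, bootstrap_summary, f = mean),
  error = function(e) conditionMessage(e)
)
msg
```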

The error is a little bewildering until you remember that the call to simple_map() is equivalent to simple_map(x = mtcars, f = mean, bootstrap_summary) because named matching beats positional matching.

purrr functions reduce the likelihood of such a clash by using .f and .x instead of the more common f and x. Of course this technique isn’t perfect (because the function you are calling might still use .f and .x), but it avoids 99% of issues. The remaining 1% of the time, use an anonymous function.

Base functions that pass along ... use a variety of naming conventions to prevent undesired argument matching:

  • The apply family mostly uses capital letters (e.g. X and FUN).

  • transform() uses the more exotic prefix _: this makes the name non-syntactic, so it must always be surrounded by backticks (`), as described in Section 3.2.1. This makes undesired matches extremely unlikely.

  • Other functionals like uniroot() and optim() make no effort to avoid clashes; but they tend to be used with specially created functions so clashes are less likely.

10.2.5 Varying another argument

So far the first argument to map() has always become the first argument to the function. But what happens if the first argument should be constant, and you want to vary a different argument? How do you get the result in this picture?

It turns out that there’s no way to do it directly, but there are two tricks you can use. To illustrate them, imagine I have a vector that contains a few unusual values, and I want to explore the effect of different amounts of trimming when computing the mean. In this case, the first argument to mean() will be constant, and I want to vary the second argument, trim.

You’ll see one more approach to this problem in Section 10.4.5.
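The two tricks might look like this. The data is simulated from a Cauchy distribution here purely as an assumption, to get a few unusual values.

```r
library(purrr)

trims <- c(0, 0.1, 0.2, 0.5)
set.seed(42)
x <- rcauchy(1000)

# Trick 1: an anonymous function that puts the varying value where you need it
by_lambda <- map_dbl(trims, ~ mean(x, trim = .x))

# Trick 2: name the constant argument. mean(trims[[i]], x = x) matches x by
# name first, so the varying value falls into the next slot, trim.
by_name <- map_dbl(trims, mean, x = x)
```

The second form is compact but relies on R’s argument matching rules, so the anonymous function is usually easier for a reader to follow.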

10.2.6 Exercises

  1. Use as_mapper() to explore how purrr generates anonymous functions for the integer, character, and list helpers. What helper allows you to extract attributes? Read the documentation to find out.

  2. map(1:3, ~ runif(2)) is a useful pattern for generating random numbers, but map(1:3, runif(2)) is not. Why not? Can you explain why it returns the result that it does?

  3. Use the appropriate map() function to:

    1. Compute the standard deviation of every column in a numeric data frame.

    2. Compute the standard deviation of every numeric column in a mixed data frame. (Hint: you’ll need to do it in two steps.)

    3. Compute the number of levels for every factor in a data frame.

  4. The following code simulates the performance of a t-test for non-normal data. Extract the p-value from each test, then visualise.

  5. The following code uses a map nested inside another map to apply a function to every element of a nested list. Why does it fail, and what do you need to do to make it work?

  6. Use map() to fit linear models to the mtcars using the formulas stored in this list:

  7. Fit the model mpg ~ disp to each of the bootstrap replicates of mtcars in the list below, then extract the \(R^2\) of the model fit (Hint: you can compute the \(R^2\) with summary())

10.3 Purrr style

Before we go on to explore more map variants, let’s take a quick look at how you tend to use multiple purrr functions to solve a moderately realistic problem: fitting a model to each subgroup and extracting a coefficient of the model.

For this toy example, I’m going to break the mtcars data set down into groups defined by the number of cylinders, using the base split function:

Now imagine we want to fit a linear model, then extract the second coefficient (i.e. the slope). The following code shows how you might do that with purrr:
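A sketch of the whole pipeline follows. The model formula mpg ~ wt is an assumption chosen for illustration; the text above only says that a linear model is fitted to each group.

```r
library(purrr)

# Split mtcars into one data frame per number of cylinders
by_cyl <- split(mtcars, mtcars$cyl)

slopes <- by_cyl %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%  # fit a model to each group
  map(coef) %>%                       # extract the coefficients
  map_dbl(2)                          # pull out the second one (the slope)

slopes
```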

(If you haven’t seen %>%, the pipe, before, it’s described in Section 6.3.)

I think this code is easy to read because each line encapsulates a single step, you can easily distinguish the functional from what it does, and the purrr helpers allow us to very concisely describe what to do in each step.

How would you attack this problem with base R? You certainly could replace each purrr function with the equivalent base function:

But this isn’t really base R since we’re using the pipe. To tackle it purely in base R I think you’d use an intermediate variable, and do more in each step:

Or, of course, you could use a for loop:
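The two pure base R approaches might be sketched like this, again assuming the illustrative model mpg ~ wt:

```r
by_cyl <- split(mtcars, mtcars$cyl)

# Intermediate variable plus two apply calls
models <- lapply(by_cyl, function(data) lm(mpg ~ wt, data = data))
slopes_apply <- vapply(models, function(m) coef(m)[[2]], double(1))

# A single for loop that fits and extracts in one iteration
slopes_loop <- double(length(by_cyl))
for (i in seq_along(by_cyl)) {
  model <- lm(mpg ~ wt, data = by_cyl[[i]])
  slopes_loop[[i]] <- coef(model)[[2]]
}
```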

It’s interesting to note that as you move from purrr to base apply functions to for loops you tend to do more and more in each iteration. In purrr we iterate 3 times (map(), map(), map_dbl()), with apply functions we iterate twice (lapply(), vapply()), and with a for loop we iterate once. The advantage of breaking the problem into smaller steps is that it’s easier to understand and later modify as needs change.

10.4 Map variants

There are 23 primary variants of map(). So far, you’ve learned about five (map(), map_lgl(), map_int(), map_dbl() and map_chr()). That means that you’ve got 18 (!!) more to learn. That sounds like a lot, but fortunately the design of purrr means that you only need to learn five new ideas:

  • Output same type as input with modify().
  • Iterate over two inputs with map2().
  • Iterate with an index using imap().
  • Return nothing with walk().
  • Iterate over any number of inputs with pmap().

The map family of functions has orthogonal input and outputs, meaning that we can organise all the family into a matrix, with inputs in the rows and outputs in the columns. Once you’ve mastered the idea in a row, you can combine it with any column; once you’ve mastered the idea in column, you can combine it with any row.

                       List     Atomic           Same type    Nothing
One argument           map()    map_lgl(), …     modify()     walk()
Two arguments          map2()   map2_lgl(), …    modify2()    walk2()
One argument + index   imap()   imap_lgl(), …    imodify()    iwalk()
N arguments            pmap()   pmap_lgl(), …    (none)       pwalk()

10.4.1 Same type of output as input: modify()

Imagine you wanted to double every column in a data frame. You might first try using map(), but map() always returns a list:

If you want to keep the output as a data frame, you can use modify(), which always returns the same type of output as the input:

Despite the name, modify() doesn’t modify in place, it returns a modified copy, so if you wanted to permanently modify df, you’d need to assign it:
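A small sketch of the contrast, using a two-column data frame made up for the example:

```r
library(purrr)

df <- data.frame(x = 1:3, y = 6:4)

as_list <- map(df, ~ .x * 2)   # a plain list
as_df <- modify(df, ~ .x * 2)  # still a data frame

# df itself is unchanged until you assign the result
df <- modify(df, ~ .x * 2)
```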

As usual, the basic implementation of modify() is simple, and in fact it’s even simpler than map() because we don’t need to create a new output vector; we can just progressively replace the input. The real code is a little complex to handle edge cases more gracefully.
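A sketch of that basic idea, under the illustrative name simple_modify(); it ignores the edge cases that the real code handles.

```r
simple_modify <- function(x, f, ...) {
  # No new output vector: progressively replace each element of the input
  for (i in seq_along(x)) {
    x[[i]] <- f(x[[i]], ...)
  }
  x
}

doubled <- simple_modify(data.frame(x = 1:3, y = 6:4), function(v) v * 2)
doubled
```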

In Section @(predicate-map) you’ll learn about a very useful variant of modify(), called modify_if(). This allows you to (e.g.) only double numeric columns of a data frame with modify_if(df, is.numeric, ~ .x * 2).

10.4.2 Two inputs: map2() and friends

map() is vectorised over a single argument, .x. This means it only varies .x when calling .f, all other arguments are passed along unchanged. This makes it poorly suited for some problems. For example, how would you find a weighted mean when you have a list of observations and a list of weights? Imagine we have the following data:

You can use map_dbl() to compute the unweighted means:

But passing ws as an additional argument doesn’t work because arguments after .f are not transformed:

We need a new tool: a map2(), which is vectorised over two arguments. This means both .x and .y are varied in each call to .f:

The arguments to map2() are slightly different to the arguments to map() as two vectors come before the function, rather than one. Additional arguments still go afterwards:
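The weighted mean example can be sketched as follows. The observations and weights are simulated, which is an assumption of this sketch, with one missing value added to show an extra argument being passed along.

```r
library(purrr)

set.seed(1)
xs <- map(1:8, ~ runif(10))
xs[[1]][[1]] <- NA
ws <- map(1:8, ~ rpois(10, 5) + 1)

# Unweighted means: only xs varies
unweighted <- map_dbl(xs, mean)

# map_dbl(xs, weighted.mean, w = ws) would pass the whole list ws to every call.
# map2_dbl() instead pairs xs[[i]] with ws[[i]]
weighted <- map2_dbl(xs, ws, weighted.mean)

# Additional constant arguments still go after the function
weighted_narm <- map2_dbl(xs, ws, weighted.mean, na.rm = TRUE)
```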

The basic implementation of map2() is simple, and quite similar to that of map(). Instead of iterating over one vector, we iterate over two in parallel:
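A sketch of that implementation, under the illustrative name simple_map2(). Unlike purrr’s map2(), it assumes x and y already have the same length.

```r
simple_map2 <- function(x, y, f, ...) {
  out <- vector("list", length(x))
  # Walk over both inputs in parallel
  for (i in seq_along(x)) {
    out[[i]] <- f(x[[i]], y[[i]], ...)
  }
  out
}

sums <- simple_map2(1:3, 4:6, function(a, b) a + b)
```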

One of the big differences between map2() and the simple function above is that map2() recycles its inputs to make sure that they’re the same length (an input of length 1 is repeated to match the other):

In other words, map2(x, y, f) will automatically behave like map(x, f, y) when needed. This is helpful when writing functions; in scripts you’d generally just use the simpler form directly.

The closest base equivalent to map2() is Map(), which is discussed in Section 10.4.5.

10.4.3 No outputs: walk() and friends

Most functions are called for the value that they return, so it makes sense to capture and store it with a map() function. But some functions are called primarily for their side-effects (e.g. cat(), write.csv(), or ggsave()) and it doesn’t make sense to capture their results. Take this simple example that displays a welcome message using cat(). cat() returns NULL, so while map works (in the sense that it generates the desired welcomes), it also returns list(NULL, NULL).

You could avoid this problem by assigning the results of map() to a variable that you never use, but that would muddy the intent of the code. Instead, purrr provides the walk family of functions that ignore the return values of .f and instead return .x invisibly [4].

My visual depiction of walk attempts to capture the important difference from map(): the outputs are ephemeral, and the input is returned invisibly.
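A sketch of the welcome example; the function welcome() and the names are invented for illustration.

```r
library(purrr)

welcome <- function(x) {
  cat("Welcome ", x, "!\n", sep = "")
}
people <- c("Hadley", "Jenny")

# map() prints the messages, then returns list(NULL, NULL)
mapped <- map(people, welcome)

# walk() prints the messages and returns its input invisibly
walked <- walk(people, welcome)
```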

One of the most useful walk() variants is walk2() because a very common side-effect is saving something to disk, and when saving something to disk you always have a pair of values: the object and the path that you want to save it to.

For example, imagine you have a list of data frames (which I’ve created here using split), and you’d like to save each one to a separate csv file. That’s easy with walk2():
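A sketch of that task, writing to a temporary directory. The file naming scheme is an assumption of the example.

```r
library(purrr)

# Work in a fresh temporary directory
temp <- tempfile()
dir.create(temp)

cyls <- split(mtcars, mtcars$cyl)
paths <- file.path(temp, paste0("cyl-", names(cyls), ".csv"))

# Pair each data frame with its path and write it out
walk2(cyls, paths, write.csv)

dir(temp)
# "cyl-4.csv" "cyl-6.csv" "cyl-8.csv"
```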

Here the walk2() is equivalent to write.csv(cyls[[1]], paths[[1]]), write.csv(cyls[[2]], paths[[2]]), write.csv(cyls[[3]], paths[[3]]).

There is no base equivalent to walk(); you can either wrap the result of lapply() in invisible() or save it to a variable that is never used.

10.4.4 Iterating over values and indices

There are three basic ways to loop over a vector with a for loop:

  • Loop over the elements: for (x in xs)
  • Loop over the numeric indices: for (i in seq_along(xs))
  • Loop over the names: for (nm in names(xs))

The first form is analogous to the map() family. The second and third forms are equivalent to the imap() family which allows you to iterate over the values and the indices of a vector in parallel.

imap() is like map2() in the sense that your .f gets called with two arguments, but here both are derived from the vector. imap(x, f) is equivalent to map2(x, names(x), f) if x has names, and map2(x, seq_along(x), f) if it does not.

imap() is often useful for constructing labels:

If the vector is unnamed, the second argument will be the index:
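Both cases in a short sketch, with vectors made up for the example:

```r
library(purrr)

# Named input: .y is the name of each element
named <- imap_chr(c(a = 10, b = 20), ~ paste0(.y, ": ", .x))
named    # "a: 10" "b: 20"

# Unnamed input: .y is the position of each element
unnamed <- imap_chr(c(10, 20), ~ paste0(.y, ": ", .x))
unnamed  # "1: 10" "2: 20"
```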

imap() is a useful helper if you want to work with the values in a vector along with their positions.

10.4.5 Any number of inputs: pmap() and friends

Since we have map() and map2(), you might expect map3(), map4(), map5(), and so on. But where would you stop? Instead of generalising to an arbitrary number of arguments, purrr takes a slightly different tack with pmap(): you supply it a single list, which contains any number of arguments. In most cases, that will be a list of equal-length vectors, i.e. something very similar to a data frame. In diagrams, I’ll emphasise that relationship by drawing the input similar to a data frame.

There’s a simple equivalence between map2() and pmap(): map2(x, y, f) becomes pmap(list(x, y), f). The pmap() equivalent to the map2_dbl(xs, ws, weighted.mean) used above is:

As before, the varying arguments come before .f (although now they must be wrapped in a list), and the constant arguments come afterwards.

A big difference between pmap() and the other map functions is that pmap() gives you much finer control over argument matching because you can name the components of the list. Returning to our example from Section 10.2.5, where we wanted to vary the trim argument to mean(), we could instead use pmap():

I think it’s good practice to name the list to make it very clear how the function will be called.
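The following sketch shows both uses of pmap() discussed above, on small inputs invented for the example:

```r
library(purrr)

xs <- list(c(1, 2, 3), c(4, 5, 6))
ws <- list(c(1, 1, 1), c(3, 2, 1))

# map2() and pmap() give the same result; the list is matched by position
two <- map2_dbl(xs, ws, weighted.mean)
many <- pmap_dbl(list(xs, ws), weighted.mean)

# A named list controls matching: trim varies, x is held constant
trims <- c(0, 0.1, 0.2, 0.5)
x <- c(1:9, 100)
trimmed <- pmap_dbl(list(trim = trims), mean, x = x)
trimmed
```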

It’s often convenient to call pmap() with a data frame. A handy way to create that data frame is with tibble::tribble(), which allows you to describe a data frame row-by-row (rather than column-by-column, as usual). Thinking about the parameters to a function as a data frame is a very powerful pattern. The following example shows how you might draw random uniform numbers with varying parameters:

Here, the column names are critical: I’ve carefully chosen to match them to the arguments to runif(), so pmap(params, runif) is equivalent to runif(n = 1L, min = 0, max = 1), runif(n = 2L, min = 10, max = 100), runif(n = 3L, min = 100, max = 1000).
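A sketch of that pattern. To stay free of extra dependencies it builds params with base data.frame(), column by column, instead of tibble::tribble(); the values match the calls listed above.

```r
library(purrr)

# Each row describes one call to runif(); column names match its arguments
params <- data.frame(
  n   = 1:3,
  min = c(0, 10, 100),
  max = c(1, 100, 1000)
)

draws <- pmap(params, runif)
lengths(draws)  # 1 2 3
```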

There are two base equivalents to the pmap() family: Map() and mapply(). Both have significant drawbacks:

  • Map() vectorises over all arguments so you can not supply arguments that do not vary.

  • mapply() is the multidimensional version of sapply(); conceptually it takes the output of Map() and simplifies it if possible. This gives it similar issues to sapply(), and there’s no multi-input equivalent of vapply().

10.4.6 Exercises

  1. Explain the results of modify(mtcars, 1).

  2. Rewrite the following code to use iwalk() instead of walk2(). What are the advantages and disadvantages?

  3. Explain how the following code transforms a data frame using functions stored in a list.

    Compare and contrast the map2() approach to this map() approach:

  4. What does write.csv() return? i.e. what happens if you use it with map2() instead of walk2()?

10.5 Reduce

After the map family, the next most important family of functions is the reduce family. This family is much smaller, with only two main variants, and used less commonly, but it’s a powerful idea, gives us the opportunity to discuss some useful algebra, and powers the map-reduce framework frequently used when working with large data.

10.5.1 Basics

reduce() takes a vector of length n, and produces a vector of length one, by calling a function with a pair of values at a time. In other words, reduce(1:4, f) is equivalent to f(f(f(1, 2), 3), 4).

reduce() is a useful way to generalise a function that works with two inputs (a binary function) to work with any number of inputs. Imagine you have a list of numeric vectors, and you want to find the values that occur in every element:

To solve this challenge we need to use intersect() repeatedly:

reduce() automates this solution for us, so we can write:

We could apply the same idea if we wanted to list all the elements that appear in at least one entry. All we have to do is switch from intersect() to union():
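The steps above can be sketched with a small list of numeric vectors, chosen by hand so the results are easy to check:

```r
library(purrr)

l <- list(c(1, 2, 3, 4), c(2, 3, 4, 5), c(3, 4, 6))

# By hand: apply intersect() repeatedly
out <- l[[1]]
out <- intersect(out, l[[2]])
out <- intersect(out, l[[3]])
out  # 3 4

# reduce() automates the same pattern
in_all <- reduce(l, intersect)  # 3 4
in_any <- reduce(l, union)      # 1 2 3 4 5 6
```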

Like the map family, you can also pass additional arguments. intersect() and union() don’t take any extra arguments so I can’t demonstrate them here, but the principle is straightforward and I drew you a picture.

As usual, the essence of reduce() can be reduced to a simple wrapper around a for loop:
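A sketch of that wrapper, named simple_reduce() as in the exercises later in the chapter. It is deliberately naive and misbehaves for inputs of length 0 or 1.

```r
simple_reduce <- function(x, f) {
  out <- x[[1]]
  # Combine the running result with each remaining element in turn.
  # Note: seq(2, length(x)) counts backwards when length(x) < 2.
  for (i in seq(2, length(x))) {
    out <- f(out, x[[i]])
  }
  out
}

total <- simple_reduce(1:4, `+`)  # ((1 + 2) + 3) + 4
total
```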

The base equivalent is Reduce(). Note that the argument order is different: the function comes first, followed by the vector; there is no way to supply additional arguments.

10.5.2 Accumulate

The first reduce() variant, accumulate(), is useful for understanding how reduce() works, because instead of returning just the final result, it returns all the intermediate results as well:

Another useful way to understand reduce is to think about sum(): sum(x) is equivalent to x[[1]] + x[[2]] + x[[3]] + .... And then accumulate() gives you the cumulative sum:
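A quick sketch with a three-element vector made up for the example:

```r
library(purrr)

x <- c(4, 3, 10)

total <- reduce(x, `+`)        # 17, the same as sum(x)
running <- accumulate(x, `+`)  # 4 7 17, the same as cumsum(x)
```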

10.5.3 Output types

In the above example using +, what should reduce() return when x is short, i.e. length 1 or 0? When x is length 1, reduce just returns it without applying the reduce function:

This means that reduce() has no way to check that the input is valid:

What if it’s length 0? We get an error that suggests we need to use the .init argument:

What should .init be here? To figure that out, we need to see what happens when .init is supplied:

So if we call reduce(1, `+`, init) the result will be 1 + init. Now we know that the result should be just 1, so that suggests that .init should be 0:

This also ensures that reduce() checks that length 1 inputs are valid for the function that you’re calling:
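The cases discussed in this section can be collected into one sketch. The helper fails() is defined here only to report whether a call errors.

```r
library(purrr)

# Illustrative helper: TRUE if evaluating expr throws an error
fails <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

reduce(1, `+`)                       # 1: f is never called
reduce("a", `+`)                     # "a": the invalid input goes unnoticed
fails(reduce(integer(), `+`))        # TRUE: empty input and no .init

reduce(integer(), `+`, .init = 0)    # 0
fails(reduce("a", `+`, .init = 0))   # TRUE: 0 + "a" is now attempted
```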

If you want to get algebraic about it, 0 is called the identity of the numbers under the operation of addition: if you add a 0 to any number, you get the same number back. R applies the same principle to determine what a summary function with a zero length input should return:

If you’re using reduce() in a function, you should always supply .init. Think carefully about what your function should return when passed a vector of length zero or one, and make sure to test your implementation.

10.5.4 Multiple inputs

Very occasionally you need to pass two arguments to the function that you’re reducing. For example, you might have a list of data frames that you want to join together, and the variables that you are joining by vary from element to element. This is a very specialised scenario, so I don’t want to spend much time on it, except to note that it exists.

Note that the length of the second argument varies based on whether or not .init is supplied: if you have four elements of x, f will only be called three times. If you supply .init, f will be called four times.

10.5.5 Map-reduce

You might have heard of map-reduce, the idea that powers technology like Hadoop. Now you can see how simple and powerful the underlying idea is: map-reduce is just a map combined with a reduce. The special idea for large data is that the data is spread over multiple computers. Each computer performs the map on the data that it has, then it sends the result back to a coordinator which reduces the individual results back to a single result.

10.6 Predicate functionals

A predicate is a function that returns a single TRUE or FALSE, like is.character(), is.null(), or all(), and we say a predicate matches a vector if it returns TRUE.

10.6.1 Basics

A predicate functional applies a predicate to each element of a vector. purrr provides six useful functions which come in three pairs:

  • some(.x, .p) returns TRUE if any element matches; every(.x, .p) returns TRUE if all elements match.

  • detect(.x, .p) returns the value of the first match; detect_index(.x, .p) returns the location of the first match.

  • keep(.x, .p) keeps all matching elements; discard(.x, .p) drops all matching elements.

The following example shows how you might use these functionals with a data frame:
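A sketch using a small data frame invented for the example; stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version.

```r
library(purrr)

df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

some(df, is.character)            # TRUE: at least one column matches
every(df, is.character)           # FALSE: not all columns match
detect(df, is.character)          # "a" "b" "c": the first matching column
detect_index(df, is.character)    # 2: its position
names(keep(df, is.character))     # "y"
names(discard(df, is.character))  # "x"
```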

All of these functions could be implemented by first computing a logical vector, e.g. map_lgl(.x, .p), and then computing on that. However, that is a little inefficient because you can often exit early. For example, some() can stop as soon as it sees the first TRUE, and every() as soon as it sees the first FALSE.

10.6.3 Exercises

  1. Why isn’t is.na() a predicate function? What base R function is closest to being a predicate version of is.na()?

  2. What’s the relationship between which() and Position()? What’s the relationship between where() and Filter()?

  3. simple_reduce() has a problem when x is length 0 or length 1. Describe the source of the problem and how you might go about fixing it.

  4. Implement the span() function from Haskell: given a list x and a predicate function f, span() returns the location of the longest sequential run of elements where the predicate is true. (Hint: you might find rle() helpful.)

  5. Implement arg_max(). It should take a function and a vector of inputs, and return the elements of the input where the function returns the highest value. For example, arg_max(-10:5, function(x) x ^ 2) should return -10. arg_max(-5:5, function(x) x ^ 2) should return c(-5, 5). Also implement the matching arg_min() function.

  6. The function below scales a vector so it falls in the range [0, 1]. How would you apply it to every column of a data frame? How would you apply it to every numeric column in a data frame?

10.7 Base functionals

To finish up the chapter, here I provide a survey of important base functions that are not members of the map, reduce, or predicate families, and hence have no equivalent in purrr. This is not to say that they’re not important, but they have more of a mathematical/statistical flavour, so they are generally less useful in data analyses.

10.7.1 Matrices and arrays

map() and friends are specialised to work with 1d vectors. base::apply() is specialised to work with 2d and higher vectors, i.e. matrices and arrays. You can think of apply() as an operation that summarises a matrix or array by collapsing each row or column to a single value. It has four arguments:

  • X, the matrix or array to summarise.
  • MARGIN, an integer vector giving the dimensions to summarise over, 1 = rows, 2 = columns, etc.
  • FUN, a summary function.
  • ... other arguments passed on to FUN.

A typical example of apply() looks like this

You can specify multiple dimensions to MARGIN, which is useful for high-d arrays:

There are two caveats to using apply():
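A sketch of both uses, with a small matrix and array made up for the example:

```r
a2d <- matrix(1:20, nrow = 5)
row_means <- apply(a2d, 1, mean)  # 8.5 9.5 10.5 11.5 12.5
col_means <- apply(a2d, 2, mean)  # 3 8 13 18

a3d <- array(1:24, c(2, 3, 4))
first_dim <- apply(a3d, 1, mean)        # 12 13
first_two <- apply(a3d, c(1, 2), mean)  # a 2 x 3 matrix
dim(first_two)
```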

10.7.2 “Ragged” arrays

You can think about tapply() as a generalisation to apply() that allows for “ragged” arrays, arrays where each row can have a different number of columns. This is often needed when you’re trying to summarise a data set. For example, imagine you’ve collected pulse rate data from a medical trial, and you want to compare the two groups:

tapply() works by creating a “ragged” data structure from a set of inputs, and then applying a function to the individual elements of that structure. The first task is actually what the split() function does. It takes two inputs and returns a list which groups elements together from the first vector according to elements, or categories, from the second vector:

Then tapply() is just the combination of split() and sapply().
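A sketch of that equivalence, using a handful of made-up pulse rates in two groups:

```r
pulse <- c(70, 72, 68, 75, 80, 79)
group <- c("A", "A", "A", "B", "B", "B")

# One step with tapply()
by_tapply <- tapply(pulse, group, mean)  # A = 70, B = 78

# Two steps: split() builds the ragged structure, sapply() summarises it
pieces <- split(pulse, group)
by_split <- sapply(pieces, mean)         # A = 70, B = 78
```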

10.7.3 Mathematical

Functionals are very common in mathematics. The limit, the maximum, the roots (the set of points where f(x) = 0), and the definite integral are all functionals: given a function, they return a single number (or vector of numbers). At first glance, these functions don’t seem to fit in with the theme of eliminating loops, but if you dig deeper you’ll find out that they are all implemented using an algorithm that involves iteration.

Base R provides a useful set:

  • integrate() finds the area under the curve defined by f()
  • uniroot() finds where f() hits zero
  • optimise() finds the location of lowest (or highest) value of f()

The following example shows how they might be used with a simple function, sin():
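A sketch of the three functionals applied to sin(); the intervals are chosen here for illustration.

```r
# Area under sin() between 0 and pi (exactly 2)
area <- integrate(sin, 0, pi)

# Where sin() crosses zero between pi/2 and 3*pi/2 (at pi)
root <- uniroot(sin, pi * c(1 / 2, 3 / 2))

# Location of the lowest value on [0, 2*pi] (at 3*pi/2)
lowest <- optimise(sin, c(0, 2 * pi))

# Location of the highest value on [0, pi] (at pi/2)
highest <- optimise(sin, c(0, pi), maximum = TRUE)

c(area$value, root$root, lowest$minimum, highest$maximum)
```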

10.7.4 Exercises

  1. How does apply() arrange the output? Read the documentation and perform some experiments.

  2. There’s no equivalent to split() + vapply(). Should there be? When would it be useful? Implement one yourself.

  3. Implement a pure R version of split(). (Hint: use unique() and subsetting.) Can you do it without a for loop?

  4. Challenge: read about the fixed point algorithm. Complete the exercises using R.


  1. These are sometimes called Norman doors after Don Norman who described them in his book, “The Design of Everyday Things”. There’s a nice video about them at https://99percentinvisible.org/article/norman-doors/.

  2. Not to be confused with base::Map(), which is considerably more complex, and we’ll come back to in Section 10.4.5.

  3. Who is highly likely to be future you!

  4. In brief, invisible values are only printed if you explicitly request it. This makes them well suited for functions called primarily for their side-effects, as it allows their output to be ignored by default, while still providing an option to capture it. See Section 6.7.2 for more details.