The purrr package in R
29/07/2024
Welcome
this session is for 🌶🌶 intermediate users
we’ll get going properly at 15.05
you’ll need R + RStudio / Posit Workbench / posit.cloud to follow along
if you can’t access the chat, you might need to join our Teams channel: tinyurl.com/kindnetwork
you can find session materials at tinyurl.com/kindtrp
The KIND network
a social learning space for staff working with knowledge, information, and data across health, social care, and housing in Scotland
we offer social support, free training, mentoring, community events, …
Teams channel / mailing list
R training sessions
Testing R code: 15:00-16:30 Wed 7th August 2024 (R, 🌶🌶: intermediate-level)
Flexdashboard: 13:00-14:30 Thu 15th August 2024 (R, 🌶🌶: intermediate-level)
Session outline
a digression about Linnaeus
functionals
base-R functional programming
map and walk
map2 and pmap
niceties and add-ons
A digression about Linnaeus
used an existing system of binomial classification
Homo sapiens
Homo = generic name, which applies to similar species
sapiens = specific name, for that species and that species only
Pan troglodytes and Pan paniscus = two similar species in a genus
Elephas maximus and Loxodonta africana = two similar species in different genera
Functionals
Here are some numbers:
Let’s find their average. We’d usually do this by passing those numbers to a function:
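The code for this step is not shown here, so the following is a minimal sketch. It assumes n1 holds the integers 7 to 9, which is consistent with the outputs later in the session, and it uses mean as the averaging function.

```r
# Assumption: n1 is 7:9 (inferred from later outputs, not stated on the slide)
n1 <- 7:9

# The usual direction: pass the numbers to the function
avg <- mean(n1)
avg
```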
But in R, interestingly, we can also do this the other way round by passing a function name:
my_num_f <- function(funct = mean) funct(n1)
my_num_f(mean)
We’d describe this as a functional: a function that takes another function as an argument. It’s fun, but a bit messy and annoying (e.g. how would you change the numbers you’re averaging?).
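One possible way round that annoyance is to let the functional take the data as well as the function. This is only a sketch: the name apply_to_nums is made up for illustration, and n1 is assumed to be 7:9.

```r
n1 <- 7:9  # assumed values

# A functional that takes both the data and the function to apply
# (apply_to_nums is a hypothetical name, not part of the session code)
apply_to_nums <- function(x, funct = mean) funct(x)

res_mean <- apply_to_nums(n1)          # falls back to the default, mean
res_max  <- apply_to_nums(11:13, max)  # different numbers, different function
```

This is roughly the shape that lapply, sapply, and map share: data first, function second.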
Functional programming in base R
Say we’ve got a function we want to apply:
round_root <- function(n) round(n^0.5, 1)
There are several ways of applying functions to stuff in base R:
we could use a loop: that’s another session
we could just exploit the vectorised nature of most functions in R
or we could use some of the apply family of functions
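The three options can be sketched side by side. This assumes n1 is 7:9 and reuses the round_root function defined above; the by_* names are just labels for this example.

```r
n1 <- 7:9  # assumed values
round_root <- function(n) round(n^0.5, 1)

# 1. a loop, filling a pre-allocated vector
by_loop <- numeric(length(n1))
for (i in seq_along(n1)) {
  by_loop[i] <- round_root(n1[i])
}

# 2. vectorisation: ^ and round() already work element-wise
by_vector <- round_root(n1)

# 3. the apply family
by_sapply <- sapply(n1, round_root)
```

All three give the same numbers here, which is why the apply family only starts to earn its keep on more awkward inputs.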
lapply and sapply
lapply(n1, round_root) # returns a list
[[1]]
[1] 2.6
[[2]]
[1] 2.8
[[3]]
[1] 3
sapply(n1, round_root) # simplifies that list to a vector
There’s no real reason to use these functions when things are this simple, but when our applications become more complicated…
n2 <- 11:13
lapply(list(n1, n2), round_root)
[[1]]
[1] 2.6 2.8 3.0
[[2]]
[1] 3.3 3.5 3.6
sapply(list(n1, n2), round_root) # oddball output: a matrix, one column per input
     [,1] [,2]
[1,]  2.6  3.3
[2,]  2.8  3.5
[3,]  3.0  3.6
lapply(list(n1, n2[1:2]), round_root) # quirky
[[1]]
[1] 2.6 2.8 3.0
[[2]]
[1] 3.3 3.5
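A sketch of why simplification can surprise: the type sapply returns depends on whether the results happen to be the same length. This assumes n1 is 7:9; same_len and diff_len are names invented for the example.

```r
n1 <- 7:9    # assumed values
n2 <- 11:13
round_root <- function(n) round(n^0.5, 1)

# Equal-length results: sapply simplifies to a matrix
same_len <- sapply(list(n1, n2), round_root)
is.matrix(same_len)

# Unequal-length results: sapply cannot simplify, so it quietly returns a list
diff_len <- sapply(list(n1, n2[1:2]), round_root)
is.list(diff_len)
```

Code that consumes the result has to cope with either shape, which is part of the case for functions with a predictable return type.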
map
map is our purrr type specimen
library(purrr)
map(n1, round_root)
[[1]]
[1] 2.6
[[2]]
[1] 2.8
[[3]]
[1] 3
Pleasingly, map will handle all kinds of odd inputs without fuss:
map(c(n1, n2), round_root)
[[1]]
[1] 2.6
[[2]]
[1] 2.8
[[3]]
[1] 3
[[4]]
[1] 3.3
[[5]]
[1] 3.5
[[6]]
[1] 3.6
map(dplyr::tibble(n1 = n1, n2 = n2), round_root)
$n1
[1] 2.6 2.8 3.0
$n2
[1] 3.3 3.5 3.6
map(rbind(n1, n2), round_root)
[[1]]
[1] 2.6
[[2]]
[1] 3.3
[[3]]
[1] 2.8
[[4]]
[1] 3.5
[[5]]
[1] 3
[[6]]
[1] 3.6
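The different shapes of output above follow from what map iterates over. A rough sketch, assuming n1 is 7:9, and using a base data.frame in place of a tibble to keep the example free of extra packages:

```r
library(purrr)

n1 <- 7:9    # assumed values
n2 <- 11:13

# A data frame is a list of columns, so map() visits one column at a time
by_column <- map(data.frame(n1 = n1, n2 = n2), sum)

# A matrix is a vector with dimensions, so map() visits one cell at a time,
# running down each column in turn
m <- rbind(n1, n2)
by_cell <- map_int(m, \(x) x)
```

That column-by-column order is why the matrix example alternates between values from n1 and n2.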
map will always return a list - that’s because, no matter what the output, you can always cram it into a list. If you want different output, you can have it. You just need to find the right species:
try(map_int(n1, round_root)) # surly and strict
Error in map_int(n1, round_root) : ℹ In index: 1.
Caused by error:
! Can't coerce from a number to an integer.
round_root_int <- function(n) as.integer(n^0.5)
map_int(n1, round_root_int)
round_root_lgl <- function(n) as.integer(n^0.5) %% 2 == 0
map_lgl(n1, round_root_lgl)
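The main species can be lined up in one place. This is a sketch assuming n1 is 7:9; the roots_* names are invented for the example, and each suffix names the vector type that comes back.

```r
library(purrr)

n1 <- 7:9  # assumed values

roots_dbl <- map_dbl(n1, \(n) round(n^0.5, 1))              # double vector
roots_int <- map_int(n1, \(n) as.integer(n^0.5))            # integer vector (truncated)
roots_lgl <- map_lgl(n1, \(n) as.integer(n^0.5) %% 2 == 0)  # logical vector
roots_chr <- map_chr(n1, \(n) format(n^0.5, digits = 3))    # character vector
```

Each variant errors rather than guessing if the function returns the wrong type, which is the strictness seen with map_int above.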
anonymous functions
If you’re comfortable with the new anonymous function syntax (the \(x) shorthand, available from R 4.1), you can build an anonymous function in place:
map_lgl(1:4, \(x) x %% 2 == 0)
[1] FALSE TRUE FALSE TRUE
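For comparison, here is the same check written three ways. The is_even name is made up for this sketch; the three calls should be interchangeable.

```r
library(purrr)

is_even <- function(x) x %% 2 == 0  # hypothetical helper for illustration

named      <- map_lgl(1:4, is_even)                  # a named function
long_form  <- map_lgl(1:4, function(x) x %% 2 == 0)  # classic anonymous function
short_form <- map_lgl(1:4, \(x) x %% 2 == 0)         # backslash shorthand
```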
walk
walk is intended for code where the side-effect is the point: graphs, pipes, and R Markdown especially. Otherwise, it works like map:
walk(n1, round_root) # prints nothing: walk returns its input invisibly
round_root_print <- function(n) print(n^0.5)
walk(n1, round_root_print)
[1] 2.645751
[1] 2.828427
[1] 3
round_root_cat <- function(n) cat(n^0.5, "\n")
walk(n1, round_root_cat)
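A small sketch of what walk hands back, assuming n1 is 7:9. The function is called purely for its side effect, and the original input comes back unchanged, which is what makes walk usable in the middle of a pipe.

```r
library(purrr)

n1 <- 7:9  # assumed values

# walk() calls the function for its side effect (here, printing)...
returned <- walk(n1, \(n) cat(n^0.5, "\n"))

# ...and returns its input unchanged, so the pipe can carry on
total <- n1 |>
  walk(\(n) cat("processing", n, "\n")) |>
  sum()
```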
map2
map2 is for 2-argument functions:
map2_int(n1, n2, `+`) # the best terrible way of adding I know
round_root_places <- function(n, dp = 1) round(n^0.5, dp)
round_root_places(n1, 0)
map2(n1, 0, round_root_places)
[[1]]
[1] 3
[[2]]
[1] 3
[[3]]
[1] 3
beware of recycling rules
map2 will recycle a length-1 input (like the 0 above), but it will refuse inputs of any other mismatched lengths:
try(map2(1:3, 0:3, round_root_places))
Error in map2(1:3, 0:3, round_root_places) :
Can't recycle `.x` (size 3) to match `.y` (size 4).
This makes expand.grid valuable if you’re looking to try out all the combinations of two vectors, for example.
dat <- expand.grid(nums = 1:3, dplaces = 0:3)
map2(dat$nums, dat$dplaces, round_root_places)
[[1]]
[1] 1
[[2]]
[1] 1
[[3]]
[1] 2
[[4]]
[1] 1
[[5]]
[1] 1.4
[[6]]
[1] 1.7
[[7]]
[1] 1
[[8]]
[1] 1.41
[[9]]
[1] 1.73
[[10]]
[1] 1
[[11]]
[1] 1.414
[[12]]
[1] 1.732
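To see why the twelve results above come out in that order, it can help to look at what expand.grid builds. A minimal sketch using the same two vectors:

```r
nums <- 1:3
dplaces <- 0:3

# One row per combination; the first argument varies fastest
dat <- expand.grid(nums = nums, dplaces = dplaces)
nrow(dat)     # length(nums) * length(dplaces)
head(dat, 4)  # nums cycles through 1, 2, 3 before dplaces moves on
```

Both columns are the same length, so map2 has nothing to recycle.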
# or in a tibble
expand.grid(nums = n1, dplaces = 0:3) |>
  dplyr::as_tibble() |>
  dplyr::mutate(rr = map2_vec(nums, dplaces, round_root_places))
# A tibble: 12 × 3
nums dplaces rr
<int> <int> <dbl>
1 7 0 3
2 8 0 3
3 9 0 3
4 7 1 2.6
5 8 1 2.8
6 9 1 3
7 7 2 2.65
8 8 2 2.83
9 9 2 3
10 7 3 2.65
11 8 3 2.83
12 9 3 3
pmap
pmap is for n-argument functions.
round_roots_places <- function(n, root = 2, places = 1) round(n^(1 / root), places) # brackets matter: n^1 / root is just n / root
round_roots_places(n1, root = 4, places = 2) # use named arguments to avoid misery
pmap(list(n = n1, root = 4, places = 2), round_roots_places)
[[1]]
[1] 1.63
[[2]]
[1] 1.68
[[3]]
[1] 1.73
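pmap can also be driven by a data frame, where each row supplies the arguments for one call. The sketch below uses a hypothetical helper, nth_root, written just for this example; the column names are matched to the argument names, so column order does not matter.

```r
library(purrr)

# Hypothetical helper for illustration: the root-th root of n, rounded
nth_root <- function(n, root = 2, places = 1) round(n^(1 / root), places)

# One row per call; columns are matched to arguments by name
args <- data.frame(places = c(1, 2), root = c(2, 3), n = c(16, 27))
res <- pmap_dbl(args, nth_root)
```

Naming the list elements (or columns) is the same "avoid misery" advice as above: unnamed inputs are matched by position instead.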
Niceties and add-ons
imap(list("a", "b", "c"), \(x, y) paste0(y, ": ", x)) |> # index map where y is the name or index
  list_c()
map(n1, \(x) dplyr::tibble("Val" = x, "sq_val" = x^2)) |>
  list_rbind()
# A tibble: 3 × 2
Val sq_val
<int> <dbl>
1 7 49
2 8 64
3 9 81
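A short sketch of the "name or index" behaviour of imap, using the typed variant imap_chr so the result is already a character vector. The unnamed and named variables are labels for this example only.

```r
library(purrr)

# Unnamed input: the second argument is the position
unnamed <- imap_chr(list("a", "b", "c"), \(x, y) paste0(y, ": ", x))

# Named input: the second argument is the name
named <- imap_chr(c(first = "a", second = "b"), \(x, y) paste0(y, ": ", x))
```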
Feedback and resources