R reading group notes

Authors
Affiliation

KIND Network members

Brendan Clarke

NHS Education for Scotland

Published

November 21, 2024

These notes cover the vectors chapter of Advanced R (2nd ed.), which is mainly about how vectors and lists work.

Types of vector

There are a few odd extras, but nearly all vectors in the wild can be taxonomised like this:

  • atomic vectors (same type)
    • numeric
      • integer
        • 1234L, 1e4L, or 0xcafeL
      • double
        • decimal (0.1234), scientific (1.23e4), hex (0xcafe), Inf, -Inf, and NaN
    • logical
    • character
  • lists (potentially mixed type)
  • NULL (not strictly a vector, but it behaves like a generic 0-length vector)
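
To check the taxonomy above at the console, here is a minimal sketch that runs typeof() over one example value per branch. The names examples and types, and the labels inside the list, are made up for illustration and are not from the session.

```r
# One example value per branch of the taxonomy (labels are illustrative only)
examples <- list(
  int_plain = 1234L,
  int_sci   = 1e4L,
  int_hex   = 0xcafeL,
  dbl_plain = 0.1234,
  dbl_hex   = 0xcafe,
  dbl_nan   = NaN,
  lgl       = TRUE,
  chr       = "steve",
  lst       = list(1, "a"),
  nul       = NULL
)

# typeof() reports the underlying storage type of each value
types <- vapply(examples, typeof, character(1))
types
```

The hex literal is the one to watch: 0xcafe is a double, and only the L suffix makes it an integer.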

Scalars and vectors

  • scalars = individual values = length-one vectors
"steve" # scalar
[1] "steve"
9 # scalar
[1] 9
length(9)
[1] 1
  • make longer vectors with c()
c(T, FALSE)
[1]  TRUE FALSE
flat <- c(c("nested", "vectors"), c("get", "flattened"))
flat
[1] "nested"    "vectors"   "get"       "flattened"
typeof(flat)
[1] "character"
  • the difference between lists and vectors is largely about hierarchical structure. Vectors are flat, while lists are hierarchical:
c(c("this"), c("gets completely"), "flattened")
[1] "this"            "gets completely" "flattened"      
list(c("this"), list("gets", "not", "at all"), "flattened")
[[1]]
[1] "this"

[[2]]
[[2]][[1]]
[1] "gets"

[[2]][[2]]
[1] "not"

[[2]][[3]]
[1] "at all"


[[3]]
[1] "flattened"
  • we then did a lot of experimenting with empty lists and vectors
c("d", 1) == c("d", "1")
[1] TRUE TRUE
list("c", 1)
[[1]]
[1] "c"

[[2]]
[1] 1
## create an empty vector
vector("character", 0)
character(0)
## create an empty list
vector("list", 0) # the weird one
list()
unlist(vector("list", 0)) # because NULL is effectively a 0-length vector
NULL
is.null(unlist(list()))
[1] TRUE
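
As a rough recap of the flat-versus-hierarchical point and the NULL experiments above, the sketch below reaches into a nested list, flattens it again with unlist(), and shows NULL vanishing inside c(). The names nested, flat_again, and combined are illustrative, not from the session.

```r
nested <- list(c("this"), list("gets", "not", "at all"), "flattened")

length(nested)                 # 3 top-level elements: the hierarchy is kept
nested[[2]][[3]]               # reach into the inner list: "at all"
flat_again <- unlist(nested)   # unlist() flattens recursively
length(flat_again)             # 5

# NULL behaves like a 0-length vector when combined
length(NULL)                   # 0
combined <- c(NULL, 1, NULL)   # the NULLs vanish
length(combined)               # 1
length(list(NULL))             # 1: a list can hold NULL as an element
```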

Testing for vectors

Coercion matters because many R functions coerce silently, which can cause serious trouble. The chapter discusses the ordering: character is the most general type, then double, then integer, then logical. Combining values of different types gives the most general type present. So you only get a logical vector from all-logical inputs, an integer vector from logical or integer inputs, a double vector from logical, integer, or double inputs, and a character vector from anything.

as.numeric(TRUE)
[1] 1
is.logical(TRUE)
[1] TRUE
is.integer(1L)
[1] TRUE
is.double(1.0)
[1] TRUE
is.character("one")
[1] TRUE
NA_character_ # if you really want to ruin your life
[1] NA
# character → double → integer → logical

c(T, T) |> typeof()
[1] "logical"
c(T, F) |> typeof()
[1] "logical"
c(T, 1L) |> typeof()
[1] "integer"
c(T, 1) |> typeof()
[1] "double"
c(T, NA_character_) |> typeof()
[1] "character"
c(T, 1, NA)
[1]  1  1 NA
paste(NA_character_)
[1] "NA"
paste(NA)
[1] "NA"
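
Two consequences of the coercion rules, sketched with made-up values: logicals become 0 and 1 in arithmetic, and explicit coercion that cannot succeed returns NA with a warning. The variable names are illustrative, and suppressWarnings() is only there to keep the output quiet.

```r
# Logicals coerce to 0 and 1, so sum() counts TRUEs and mean() gives a proportion
passed <- c(TRUE, FALSE, TRUE, TRUE)
n_passed <- sum(passed)        # 3
share_passed <- mean(passed)   # 0.75

# Explicit coercion: "1.5" is truncated to 1 and "a" becomes NA
# (suppressWarnings() hides the "NAs introduced by coercion" warning)
converted <- suppressWarnings(as.integer(c("1", "1.5", "a")))
converted                      # 1 1 NA
```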

The danger zone

This area is rich in false friends - functions with plausible-sounding names that don't do quite what the name suggests:

testy1 <- c(1,2,3)
testy2 <- list(one = 1, two = 2, three = 3)
testy3 <- expression(x = 1)

is.vector(testy1) 
[1] TRUE
is.vector(testy2) 
[1] TRUE
is.vector(testy3) # includes lists and expressions
[1] TRUE
is.atomic(testy1) 
[1] TRUE
is.atomic(testy2)
[1] FALSE
is.atomic(testy3) # atomic means not recursive, so lists and expressions fail
[1] FALSE
is.numeric(testy1)
[1] TRUE
is.numeric(testy2) # even though everything is numeric, this list doesn't count as numeric
[1] FALSE
sapply(testy2, is.numeric)
  one   two three 
 TRUE  TRUE  TRUE 
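
One more false friend, sketched under the assumption that the default mode = "any" is used: is.vector() returns FALSE as soon as a vector carries any attribute other than names. The second half shows one way to ask whether every element of a list is numeric. The names plain, scores, and all_numeric, and the units attribute, are hypothetical.

```r
plain <- c(1, 2, 3)
before <- is.vector(plain)       # TRUE
attr(plain, "units") <- "cm"     # hypothetical attribute
after <- is.vector(plain)        # FALSE, although the data are unchanged

# Check every element of a list, with a guaranteed logical result
scores <- list(one = 1, two = 2, three = 3)
all_numeric <- all(vapply(scores, is.numeric, logical(1)))
all_numeric                      # TRUE
```

vapply() is used here rather than sapply() because it fails loudly if any element returns something other than a single logical value.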

Attributes

Vectors can have attributes - like names. Most attributes are pretty fragile, and get removed very easily:

testy4 <- c(one = 1, two = 2, three = 3)
typeof(testy4) # definitely not a list
[1] "double"
names(testy4)
[1] "one"   "two"   "three"
attributes(testy4)
$names
[1] "one"   "two"   "three"
attr(testy4, "names")
[1] "one"   "two"   "three"
attr(testy4, "supernames") <- c("one but secret", "two but secret", "three but secret")
attributes(testy4)
$names
[1] "one"   "two"   "three"

$supernames
[1] "one but secret"   "two but secret"   "three but secret"
str(attributes(testy4))
List of 2
 $ names     : chr [1:3] "one" "two" "three"
 $ supernames: chr [1:3] "one but secret" "two but secret" "three but secret"
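
To see the fragility in action, here is a minimal sketch: subsetting keeps names but drops the custom attribute, and a summary function such as sum() drops everything. The name fragile stands in for testy4 so that the snippet runs on its own.

```r
fragile <- c(one = 1, two = 2, three = 3)
attr(fragile, "supernames") <- c("one but secret", "two but secret", "three but secret")

# Subsetting keeps names but loses the custom attribute
after_subset <- attributes(fragile[1:2])
names(after_subset)              # "names"

# Summarising loses all attributes
after_sum <- attributes(sum(fragile))
after_sum                        # NULL
```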