From the manual,
It is possible to go far using R interactively
However, we will also study the language with the goals of
An object consists of a symbol (name) and a value
Also relevant: typeof(), mode() and storage.mode()
typeof() gives the type or internal storage mode of an object
Atomic vectors: informally, often just called 'vectors'
Contiguous collections of objects of the same type
Common types include: “logical”, “integer”, “double”, “complex”, “character”, “raw”
R has no scalars, just vectors of length 1
age <- 15 # Length 1 vector
name <- 'Bob'
old_enough <- age >= 18 # Same as old_enough <- FALSE, since age is 15
print(name)
old_enough
Comments:
16 -> age # Valid, but harder to read
typeof(age) # Note: age is a double
class(age)
typeof(name)
class(name)
age <- 19L
typeof(age)
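To make the relationship between typeof(), mode() and storage.mode() concrete, the sketch below compares all three on a few length 1 vectors. The variable names (x_int, x_dbl, x_chr, x_lgl) are illustrative and not part of the original example.

```r
# Compare typeof(), mode() and storage.mode() on a few atomic vectors.
# All variable names here are illustrative.
x_int <- 19L
x_dbl <- 15
x_chr <- "Bob"
x_lgl <- TRUE

examples <- list(x_int, x_dbl, x_chr, x_lgl)
types   <- sapply(examples, typeof)        # Internal storage type
modes   <- sapply(examples, mode)          # Coarser grouping: integer and double are both "numeric"
storage <- sapply(examples, storage.mode)  # Agrees with typeof() for these atomic vectors

print(data.frame(typeof = types, mode = modes, storage.mode = storage))
```

The main difference visible here is that mode() reports both integer and double vectors as "numeric", while typeof() and storage.mode() distinguish them.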
The c() function (concatenate) creates vectors
people <- c("Alice", "Bob", 'Carol') # single/double quotes
years <- 1991 : 2000 # Watch out for: years <- 2000:1991
even_years <- (years %% 2) == 0
class(people)
typeof(years)
is.vector(even_years)
Use brackets [] to index subelements of a vector
First element of a vector is indexed by 1
people[1] # First element is indexed by 1
years[1 : 5] # Index with a subvector of integers
years[c(1, 3, length(years))]
Negative numbers exclude elements
people[-1] # All but the first element
years[c(-1, -length(years))] # All but first and last elements
years[ - c(1,length(years))] # Equivalently
Index with logical vectors
even_years # Same as print(even_years)
years[even_years] # Index with a logical vector
Sample 100 Gaussian random variables and find the mean of the positive elements
xx <- rnorm(100, 0, 1) # Sample 100 Gaussians
indx_xx_pos <- (xx > 0) # Is each element positive?
xx_pos <- xx[indx_xx_pos] # Extract positive elements
xx_pos_mean <- mean(xx_pos) # calculate mean
More terse:
xx <- rnorm(100, 0, 1) # Sample 100 Gaussians
xx_pos_mean <- mean(xx[xx > 0]) # calc. mean of positives
xx_pos_mean
Can assign single elements
people[1] <- 'Dave'; print(people)
or multiple elements:
years[even_years] <- years[even_years] + 1; print(years)
or assign multiple elements a single value (more on this when we look at recycling)
years[-c(1,length(years))] <- 0; print(years)
How about years <- 0?
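One way to explore this question is to compare rebinding the name with assigning into every element. The sketch below uses two hypothetical copies, yrs_a and yrs_b, so that the original years vector is left alone.

```r
# yrs_a and yrs_b are hypothetical copies used only for this comparison.
yrs_a <- 1991:2000
yrs_b <- 1991:2000

yrs_a <- 0      # Rebinds the name: yrs_a is now a length 1 double vector
yrs_b[] <- 0    # Empty index: assigns 0 to every element, length is kept

length(yrs_a)   # 1
length(yrs_b)   # 10
typeof(yrs_b)   # "double": 0 is a double, so the integer vector was coerced
```

So a plain assignment discards the old vector entirely, whereas indexed assignment recycles the single value across the existing elements.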
What if we assign an element a value of the wrong type?
vals <- 1 : 3
typeof(vals)
vals[2] <- 'two'; print(vals)
typeof(vals)
R will coerce the vector to the most flexible type
In increasing flexibility: logical, integer, double, and character
The c() function does the same
stuff <- c( TRUE , 3L, 3.14, 'pi')
stuff
typeof(stuff)
Use lists if you really want a heterogeneous collection
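The coercion order can be checked pairwise, and a list shows the contrast. This is a small sketch; stuff_list is an illustrative name, not from the original.

```r
# Each pair is coerced to the more flexible of the two types
typeof(c(TRUE, 3L))      # "integer"
typeof(c(3L, 3.14))      # "double"
typeof(c(3.14, "pi"))    # "character"

# A list keeps each element's own type (stuff_list is an illustrative name)
stuff_list <- list(TRUE, 3L, 3.14, "pi")
sapply(stuff_list, typeof)   # "logical" "integer" "double" "character"
```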
Atomic vectors are always flat, even for nested c() calls
Example from Advanced R, Hadley Wickham:
c(1, c(2, c(3, 4)))
A vector of vectors is still just a vector
Use lists/matrices/arrays if you want nested structure
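A short sketch of the difference, assuming the illustrative names flat and nested: the nested c() call collapses to four elements, while a nested list keeps its structure.

```r
# Nested c() calls are flattened into a single atomic vector
flat <- c(1, c(2, c(3, 4)))
identical(flat, c(1, 2, 3, 4))   # TRUE
length(flat)                     # 4

# A nested list preserves the hierarchy (names are illustrative)
nested <- list(1, list(2, list(3, 4)))
length(nested)                   # 2: a number and a sub-list
str(nested)                      # Displays the nesting
```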
What if we assign to an element outside the vector?
years[length(years) + 1] <- 2015
length(years); years
We have increased the vector length by 1
In general, this is an inefficient way to go about things
Much more efficient is to first allocate the entire vector
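The two approaches can be sketched side by side. The loop below is only an illustration: n, grown and filled are hypothetical names, and the squares are a stand-in for whatever values are being computed.

```r
n <- 1000   # Illustrative size

# Growing: the vector is extended by one element on each iteration
grown <- c()
for (i in seq_len(n)) {
  grown[i] <- i^2
}

# Preallocating: reserve all n slots first, then fill them in
filled <- numeric(n)
for (i in seq_len(n)) {
  filled[i] <- i^2
}

identical(grown, filled)   # Same result either way
```

Both loops produce the same vector; the preallocated version avoids repeatedly extending the vector, which tends to matter more as n grows.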
vals <- 1 : 3
typeof(vals)
vals[6] <- 6L
print(vals)
Also get NAs if we access elements outside the range of the vector
NA is a length 1 constant to handle missing values
Different from NaN (not a number), which results from e.g. dividing 0 by 0
NA can be coerced into any of the earlier data types except raw
A useful command is is.na()
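The points above can be tried out directly. This is a minimal sketch; obs is a hypothetical data vector introduced only for illustration.

```r
vals <- 1:3
vals[5]              # NA: index beyond the end of the vector
0 / 0                # NaN

typeof(NA)           # "logical"
typeof(c(NA, 1L))    # "integer": NA is coerced along with the rest
typeof(c(NA, "a"))   # "character"

obs <- c(2.5, NA, 4.0, NaN)   # Hypothetical data with missing entries
is.na(obs)           # FALSE TRUE FALSE TRUE (is.na() is also TRUE for NaN)
is.nan(obs)          # FALSE FALSE FALSE TRUE
mean(obs, na.rm = TRUE)   # 3.25: NA and NaN are dropped before averaging
```

Note that is.na() flags both NA and NaN, while is.nan() flags only NaN.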
Unary transformations of a vector: mean, sum, power, etc.
Binary operations are usually elementwise
What if vectors have different lengths?
Recycle: repeat shorter vector till the lengths match
Very convenient, but can allow bugs to remain undetected
R gives a warning if the longer length is not a multiple of the shorter
val <- 1 : 6
val + 1
val + c(1,2)
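The sketch below spells out what recycling does in these cases, and adds one case where the lengths do not divide evenly. The names res_ok, res_odd and msg are illustrative; tryCatch() is used here only to capture the warning text.

```r
val <- 1:6

res_ok <- val + c(1, 2)   # c(1, 2) is repeated three times: 2 4 4 6 6 8
res_ok

# Length 6 is not a multiple of 4: R still recycles, but warns
res_odd <- suppressWarnings(val + 1:4)   # 1:4 reused as 1 2 3 4 1 2
res_odd                                  # 2 4 6 8 6 8

# Capture the warning message instead of printing it
msg <- tryCatch(val + 1:4, warning = function(w) conditionMessage(w))
msg
```

Because the uneven case still returns a result, a length mismatch can go unnoticed unless warnings are checked.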