Semicolons indicate the end of a statement
Newlines do too, but only when the statement so far is syntactically complete
Whenever R encounters a syntactically correct statement it executes it and a value is returned
The value of a block is the value of the last statement
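A minimal sketch of these rules; the variable names are made up for illustration:

```r
# Two statements on one line, separated by a semicolon
a <- 1; b <- 2

# A statement may span several lines: the trailing '+' leaves it incomplete
total <- a +
  b

# A braced block returns the value of its last statement
block_value <- {
  tmp <- total * 10
  tmp + 1
}
block_value
```

Here `total` is 3 and `block_value` is 31, the value of `tmp + 1`.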
if statements allow conditional execution of statements
if( condition ) {
statement_block1 # executed if condition is true
} else { # else is optional
statement_block2
}
The value of condition is coerced to logical
If the value has length greater than one, R 4.2.0 and later signal an error; older versions used only the first element (with a warning)
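Since else is optional, don’t put it on its own line! At top level, R treats the if statement as complete at the closing brace, so a following else is a syntax error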
Can dispense with braces for one-line statements:
if(condition) statement1 else statement2
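A small sketch of two of the points above: coercion of the condition, and where else may be placed. The helper `parses()` is a hypothetical name introduced only for this illustration; it checks whether a piece of source text parses as top-level R code.

```r
# Numeric conditions are coerced: zero is FALSE, anything else is TRUE
msg <- if (3) "treated as TRUE" else "treated as FALSE"

# The same if/else as source text, with 'else' in two different positions
same_line <- "if (FALSE) {\n 1\n} else {\n 2\n}"
own_line  <- "if (FALSE) {\n 1\n}\nelse {\n 2\n}"

# Hypothetical helper: TRUE if the text parses as top-level R code
parses <- function(src) {
  !inherits(try(parse(text = src), silent = TRUE), "try-error")
}
parses(same_line)
parses(own_line)
```

`parses(same_line)` is TRUE, while `parses(own_line)` is FALSE because the parser has already finished the if statement when it meets else. Inside a function body or any other braced block the second layout is accepted, since the parser keeps reading until the enclosing brace closes.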
if/else statements can be nested:
if( condition1 ) {
statements1
} else if( condition2 ) {
statements2
} else {
statements3
}
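As a concrete instance of the nested form, here is a minimal sketch; `sign_label()` is a hypothetical helper, not part of base R:

```r
# Hypothetical helper that classifies a single number by its sign
sign_label <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}
sign_label(-2.5)
sign_label(0)
```

The function returns the value of whichever branch runs, because the if/else chain is the last statement in its body.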
p <- rnorm(1)
if( p > 0 ) {
p_logp <- p * log(p)
} else {
p_logp <- 0 # Assuming p >= 0, this is the case p == 0
}
print(c(p,p_logp))
if( p > 0 ) p_logp <- p * log(p) else p_logp <- 0
if is a function that returns values, so we can also write
p_logp <- if( p > 0 ) p * log(p) else 0
# Less clear:
p_logp <- 'if'( p > 0, {p * log(p)}, 0)
! : logical negation
& and && : logical ‘and’
| and || : logical ‘or’
& and | perform elementwise comparisons on vectors
&& and || operate on single values, evaluate left to right, and stop as soon as the result is determined
Also useful are xor(), any(), all()
c(TRUE, TRUE) & c(TRUE, FALSE)
c(TRUE, TRUE) && c(TRUE, FALSE) # Error in R >= 4.3.0; older versions used only the first elements
NA | c(TRUE, FALSE)
TRUE && (pi >1) && {print("Hello"); TRUE}
TRUE && (pi == 3.14) && {print("Hello"); TRUE}
c(TRUE, TRUE) & c(TRUE, FALSE) & {print("Hello!"); TRUE}
c(TRUE, TRUE) & c(FALSE, FALSE) & {print("Hello!"); TRUE}
We will look at lazy evaluation later
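A short sketch of the helper functions mentioned above, and of using && as a guard. `is_positive_number()` is a hypothetical name used only for this example.

```r
v <- c(TRUE, FALSE, TRUE)
any(v)                        # is at least one element TRUE?
all(v)                        # are all elements TRUE?
xor(v, c(TRUE, TRUE, FALSE))  # elementwise exclusive 'or'

# Short-circuiting makes && handy as a guard: each right-hand side is
# only evaluated when everything to its left is TRUE
is_positive_number <- function(x) {
  is.numeric(x) && length(x) == 1 && x > 0
}
is_positive_number("a")
is_positive_number(2)
```

For a character input the comparison `x > 0` is never reached, because `is.numeric(x)` already settles the result.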
for(), while() and repeat
for(elem in vect) { # vect can be a vector or a list
Do_stuff_with_elem # for successive elements of vect
}
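A runnable sketch of the template above, looping over a list directly and then over its indices. The list `things` is made up for illustration.

```r
# Looping over the elements of a list (elements may have different types)
things <- list(1:3, c("a", "b"), TRUE)
for (elem in things) {
  print(length(elem))
}

# Looping over indices instead; seq_along() is safe even for empty input
lens <- integer(length(things))
for (i in seq_along(things)) {
  lens[i] <- length(things[[i]])
}
lens
```

Looping over indices is the usual choice when results need to be stored in a preallocated vector.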
x <- 0
for(ii in 1:50000) x <- x + log(ii) # Horrible
x <- sum(log(1:50000)) # Much simpler and more efficient!
system.time({x<-0; for(i in 1:50000) x <- x + log(i)})
system.time( x <- sum(log(1:50000)) )
An aside on increasing vector lengths
system.time({x<-0; for(i in 1:10000) x[i] <- i})
mean(x)
system.time({x<-rep(0,10000); for(i in 1:10000) x[i] <- i })
mean(x)
Vectorization allows concise and fast loop-free code
Example: Entropy $H(p) = -\sum_{i=1}^{|p|} p_i \log p_i$ of a probability distribution
p <- c(.0,.5,.5)
H <- -sum( p * log(p) ); print(H) # Vectorized but wrong: NaN because p[1] == 0
H <- 0
for(ii in seq_along(p)) # Correct but slow
if(p[ii] > 0) H <- H - p[ii] * log(p[ii])
pos <- p > 0; H <- -sum(p[pos] * log(p[pos])) # Vectorized and correct
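One way to package the subsetting idea is a small function. This is only a sketch: `entropy()` is a hypothetical helper, and it assumes the convention that terms with p_i = 0 contribute nothing to the sum.

```r
# Hypothetical helper: entropy (in nats) of a probability vector p,
# using the convention 0 * log(0) = 0
entropy <- function(p) {
  pos <- p > 0
  -sum(p[pos] * log(p[pos]))
}

entropy(c(0, 0.5, 0.5))   # equals log(2)
entropy(rep(0.25, 4))     # equals log(4)
```

Subsetting before taking the logarithm means `log()` is never applied to zero, so no NaN can enter the sum.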
Vectorization isn’t always possible though
See the third and fourth Circles in The R Inferno, Patrick Burns
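One illustration of a computation that resists vectorization is a recurrence in which each value depends on the previous one. The logistic map below is an assumed example, not taken from the original notes.

```r
# Logistic map: x[k + 1] = r * x[k] * (1 - x[k])
# Each step needs the previous result, so an explicit loop is natural
r <- 3.7
n <- 100
x <- numeric(n)      # preallocate rather than grow the vector
x[1] <- 0.5
for (k in seq_len(n - 1)) {
  x[k + 1] <- r * x[k] * (1 - x[k])
}
head(x, 3)
```

The loop is still written with a preallocated result vector, which avoids the cost of growing `x` one element at a time.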
"Premature optimization is the root of all evil" -Donald Knuth
ifelse()
ifelse() has syntax:
ifelse(bool_vec, true_vec, false_vec)
Returns a vector of length equal to bool_vec whose i-th element is
true_vec[i] if bool_vec[i] is TRUE
false_vec[i] if bool_vec[i] is FALSE
true_vec and false_vec are recycled if necessary
Entropy revisited:
H <- -sum(ifelse( p > 0, p * log(p), 0 ))
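A brief sketch of the recycling behaviour described above, with made-up inputs:

```r
x <- c(-2, -1, 0, 1, 2)

# Scalar 'yes' and 'no' values are recycled to the length of the test
labels <- ifelse(x < 0, "negative", "non-negative")

# 'yes' is a full-length vector here, 'no' is a recycled scalar
clipped <- ifelse(x > 0, x, 0)
```

The result always has the length of the test vector, whatever the lengths of the other two arguments.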
ifelse() is not lazy: it usually evaluates all of true_vec and false_vec
(unless bool_vec is all TRUE or all FALSE)
x <- c(6:-4)
sqrt(x) # gives warning
sqrt(ifelse(x >= 0, x, NA)) # no warning
## Note: the following also gives the warning, since sqrt(x) is evaluated for every element!
ifelse(x >= 0, sqrt(x), NA)
I prefer to subset vectors
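A sketch of what the subsetting approach might look like for the square-root example; the names `root` and `ok` are introduced here for illustration.

```r
x <- 6:-4
root <- rep(NA_real_, length(x))  # result, NA where sqrt is undefined
ok <- x >= 0                      # logical index of valid inputs
root[ok] <- sqrt(x[ok])           # sqrt only ever sees non-negative values
root
```

Because `sqrt()` is applied only to the selected elements, no warning is produced and no work is done on values that would be discarded.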
while( condition ) {
stuff # Repeat while condition evaluates to TRUE
}
If stuff doesn’t affect condition, we loop forever.
Then, we need a break statement. Useful if there are many exit conditions
while(TRUE) { # Or use ‘repeat { ... }’
stuff1
if( condition1 ) break
stuff2
if( condition2 ) break
}
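A runnable sketch of the pattern above with two exit conditions. Newton's iteration for a square root is an assumed example, chosen only because it needs both a convergence test and a safety cap.

```r
# Newton's iteration for sqrt(target), with two exit conditions
target <- 2
x <- 1            # starting guess
iter <- 0
max_iter <- 100   # safety cap so the loop cannot run forever
repeat {
  iter <- iter + 1
  x_new <- (x + target / x) / 2
  converged <- abs(x_new - x) < 1e-10
  x <- x_new
  if (converged) break          # exit 1: estimate has stopped changing
  if (iter >= max_iter) break   # exit 2: too many iterations
}
x
```

Writing the exits as separate break statements keeps each condition readable, instead of folding both into one long while condition.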
i <- 4
while( i > 0 ) {
print(i)
i <- i - 1
}
i <- 5
while( i <- i - 1) { # while condition has a ‘side effect’
print(i) # Not recommended
}
i <- 4
while( { print(i); i <- i - 1} ) {}
# Correct but ridiculous
Might be useful if the block is a function
break transfers control to the first statement outside the loop
next halts the current iteration and advances the looping index
Both these commands apply to the innermost loop
Useful to avoid writing up complicated conditions
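A minimal sketch of both commands, including the innermost-loop rule; the task and variable names are made up for illustration.

```r
# Sum the squares of odd numbers, stopping once a square exceeds 100
total <- 0
for (i in 1:20) {
  if (i %% 2 == 0) next   # skip even numbers
  if (i^2 > 100) break    # leave the loop entirely
  total <- total + i^2
}
total   # 1 + 9 + 25 + 49 + 81 = 165

# break only leaves the innermost loop: the outer loop still runs 3 times
outer_count <- 0
for (i in 1:3) {
  for (j in 1:10) {
    if (j == 2) break
  }
  outer_count <- outer_count + 1
}
outer_count
```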
switch() is another potentially useful alternative to if
See documentation (I don’t use it much)
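For reference, a small sketch of switch() on a character value; `centre()` is a hypothetical helper defined only for this example.

```r
# Hypothetical helper: pick a summary statistic by name
centre <- function(x, type) {
  switch(type,
         mean   = mean(x),
         median = median(x),
         stop("unknown type: ", type))   # unnamed last argument is the default
}
centre(c(1, 2, 10), "median")
centre(c(1, 2, 10), "mean")
```

With a character first argument, switch() evaluates only the alternative whose name matches, and falls through to the unnamed default when nothing matches.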
*apply family
Useful functions for repeated operations on vectors, lists etc.
Note (Circle 4 of The R Inferno):
# Calc. mean of each element of my_list
rslt_list <- lapply(my_list, FUN = mean)
Stackexchange has a nice summary
The plyr package (discussed later) is nicer
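A short sketch comparing a few members of the family with the equivalent explicit loop. The contents of `my_list` are made up here, since the notes do not define it.

```r
# my_list is a made-up example input
my_list <- list(a = 1:10, b = c(2.5, 3.5), c = c(TRUE, FALSE))

rslt_list <- lapply(my_list, FUN = mean)   # always returns a list
rslt_vec  <- sapply(my_list, mean)         # simplifies to a named vector here
rslt_safe <- vapply(my_list, mean, FUN.VALUE = numeric(1))  # checks each result's type and length

# The equivalent explicit loop, for comparison
rslt_loop <- numeric(length(my_list))
for (i in seq_along(my_list)) rslt_loop[i] <- mean(my_list[[i]])
```

vapply() is often preferred in non-interactive code because it fails loudly if any result does not match the declared shape, whereas the output type of sapply() depends on the data.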