3 Control flow
This chapter provides a very basic introduction to control flow constructs. The basic principles are standard to almost all high-level programming languages. If you already have experience with other languages, these constructs will be familiar to you.
Control flow constructs tell a computer which code to run and when: either by making a choice depending on whether some condition is met, or by performing some action repeatedly.
A more technical term for choice is conditional evaluation. This means that we
want the computer to perform some actions only if a condition (or a set of
conditions) is met. E.g. we want to compute the square root of a number, but
only if that number is greater than zero. The operations used here are called
if and else statements.
Telling a computer to do things repeatedly is known as repeated evaluation. In
this case, we want to iterate over a collection of items. In programming, this
is referred to as a loop. E.g. we have a list of subjects, and we want to
compute some quantity for each person in that list. In this case we could write
a for loop over the list.
3.1 If and else: conditional evaluation
The R syntax for an if statement is:
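As a minimal sketch of the general form (the variable x and its value are assumptions chosen purely for illustration):

```r
x <- 10  # hypothetical example value

# general form: if (condition) { code to run when the condition is TRUE }
if (x > 5) {
  print("x is greater than 5")
}
#> [1] "x is greater than 5"
```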
The condition must be a logical test that returns either the value TRUE or
FALSE. The parentheses (round brackets) around the condition are mandatory. The
code within the curly braces is executed only if the condition evaluates to
TRUE. Simple statements that fit onto one line can also be written without
curly braces:
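A one-line version might look like the following sketch (again, x and its value are hypothetical):

```r
x <- 10  # hypothetical example value

# no curly braces: the single statement after the condition is the body
if (x > 5) print("x is greater than 5")
#> [1] "x is greater than 5"
```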
Exercise
How would you calculate the square root of a number only if that number is positive?
Solution
Let’s say that the number is 4:
First, we need to test whether the number is greater than 0.
Of course, 4 is greater than 0, therefore the test will return TRUE.
If we choose a negative number, e.g. -4, the same test will evaluate to FALSE.
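The two tests described above can be sketched as follows; the variable name number is taken from the text, everything else is plain base R:

```r
number <- 4
number > 0
#> [1] TRUE

number <- -4
number > 0
#> [1] FALSE
```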
Now we can put it all together.
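One possible way to combine the assignment, the test and the computation (a sketch, using print() so that the result is shown explicitly):

```r
number <- 4

# sqrt() is only called when the test number > 0 returns TRUE
if (number > 0) {
  print(sqrt(number))
}
#> [1] 2
```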
What happens if we assign a negative value to our number, say -9?
The code block sqrt(number) is not evaluated.
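This can be checked with a short sketch; the flag was_evaluated is a hypothetical helper added only to make the behaviour visible:

```r
number <- -9
was_evaluated <- FALSE  # hypothetical flag, only for illustration

if (number > 0) {
  was_evaluated <- TRUE
  print(sqrt(number))
}
# nothing is printed, because number > 0 is FALSE

was_evaluated
#> [1] FALSE
```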
3.1.1 else
Of course, we can also specify what should happen if the condition is not met by
using an else statement.
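A minimal sketch, continuing the square root example (the message text is an assumption):

```r
number <- -9

if (number > 0) {
  print(sqrt(number))
} else {
  # this branch runs whenever the condition is FALSE
  print("number must be positive")
}
#> [1] "number must be positive"
```

Note that, at the top level of a script, else has to follow the closing curly brace on the same line; otherwise R treats the if statement as already complete.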
3.1.2 else if
It is also possible to extend conditional evaluation to more than one condition.
In this case, we can also provide one or more else if statements:
If condition A is met, action A is performed, if condition B is met, action B is
performed, and if neither is met, action C is performed. More generally, the
code block corresponding to the first condition that evaluates to TRUE is
evaluated. If none of the preceding conditions are TRUE, the else block is
evaluated.
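The pattern described above can be sketched as follows. The variable names and the concrete conditions are hypothetical; the comments mark which part plays the role of each condition and action:

```r
x <- 0  # hypothetical example value

if (x < 0) {                # condition A
  sign_of_x <- "negative"   # action A
} else if (x > 0) {         # condition B
  sign_of_x <- "positive"   # action B
} else {
  sign_of_x <- "zero"       # action C: neither condition was TRUE
}

sign_of_x
#> [1] "zero"
```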
This is best illustrated by an example.
Suppose you want to compare two numbers, x and y, and return a statement about which one is greater. E.g. let’s compare the numbers 3 and 4:
x <- 3
y <- 4
if (x < y) {
  print("x is less than y")
} else if (x > y) {
  print("x is greater than y")
} else {
  print("x is equal to y")
}
Only one of the print statements is evaluated; which one depends on the values of x and y.
Exercise
Now it’s your turn. Write code that assigns a grade, depending on a person’s test score. E.g. you could use:
- grade A if a person gets a score of 80% or more
- grade B for 65% up to (but not including) 80%
- grade C for 50% up to (but not including) 65%
- grade D for anything under 50%.
Solution
score <- 0.79
if (score >= 0.8) {
  print("Your grade is: A")
} else if (score >= 0.65) {
  print("Your grade is: B")
} else if (score >= 0.5) {
  # >= so that a score of exactly 50% gets a C rather than a D
  print("Your grade is: C")
} else {
  print("Your grade is: D")
}
#> [1] "Your grade is: B"
Conditional evaluation is frequently used when writing functions. We will see examples of this in the next chapter.
3.1.3 ifelse()
Instead of an if...else construct, it is often possible to use the function
ifelse(). This can lead to code that is both shorter and easier to read.
The basic syntax is:
ifelse(condition, true_action, false_action)
Here, wherever the condition is TRUE, the result takes the corresponding value
of true_action. Wherever the condition is FALSE, the result takes the
corresponding value of false_action. Note that these names are merely
placeholders; you can use any values or expressions.
At the end of chapter 2, we determined whether numbers were even or odd using the modulo operator:
- x %% 2 == 0 for even numbers
- x %% 2 != 0 for odd numbers.
We can simply use the ifelse() function to return the values “even” or “odd”,
because ifelse() is vectorized: it operates on whole vectors, instead of
just a single TRUE/FALSE value:
x <- 1:20
ifelse(x %% 2 == 0, "even", "odd")
#> [1] "odd" "even" "odd" "even" "odd" "even" "odd" "even" "odd" "even"
#> [11] "odd" "even" "odd" "even" "odd" "even" "odd" "even" "odd" "even"
A further useful function is if_else() from the dplyr package. It is stricter
than ifelse(): it checks that the true and false values have compatible types.
3.2 Loops: repeated evaluation
Loops are used to perform the same computations repeatedly. In brief, if you
know how many times something should be performed, you use a for loop. If you
do not know how many repetitions there will be, you use a while loop.
Due to the design of R, these kinds of explicit loops can often be avoided. On the one hand, many functions in R are vectorized: they operate on whole vectors, and thus we do not need to iterate over the elements of a vector when applying simple functions.
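As a small illustration (the values in x are hypothetical), a vectorized function or operator is applied to every element at once, without an explicit loop:

```r
x <- c(1, 4, 9, 16)  # hypothetical example values

# sqrt() is applied to each element of x
roots <- sqrt(x)
roots
#> [1] 1 2 3 4

# arithmetic operators are vectorized as well
doubled <- x * 2
doubled
#> [1]  2  8 18 32
```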
On the other hand, there exist many functions that perform the iterations for
us. These functions are known as the *apply family of functions. They can be
applied to any collection of elements, such as lists or data frames. A more
recent alternative to for loops is the map_* functions from the purrr package,
which is part of the tidyverse. While their functionality is similar to that of
the apply functions, they have a more consistent interface.
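A minimal sketch using base R only; the list scores and its contents are made up for illustration. Here vapply() is one member of the *apply family, and it calls mean() once per list element:

```r
# hypothetical data: test scores for two people
scores <- list(anna = c(101, 98), ben = c(103, 110, 95))

# apply mean() to each element; numeric(1) states that every result
# must be a single number
mean_scores <- vapply(scores, mean, numeric(1))
mean_scores
```

With purrr installed, map_dbl(scores, mean) should give the same named numeric vector.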
3.2.1 for loops
The basic structure of a for loop is:
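As a minimal sketch of the general form (the sequence 1:3 is an arbitrary choice for illustration):

```r
# general form: for (iterator in sequence) { code to run for each element }
for (i in 1:3) {
  print(i)
}
#> [1] 1
#> [1] 2
#> [1] 3
```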
The iteration part must be written in parentheses. As with if statements, the
curly braces can be omitted if the statements fit onto one line.
The iterator takes on each element of the sequence in turn. We often use the
letters i, j or k (however, you can use any valid variable name).
In the next example, we want to loop over a vector of IQ scores and print every
person’s IQ. The iterator i will take on each value in x in turn. At the
first iteration, i will equal 101, at the second iteration, i will equal 98
and at the third iteration, i will equal 103.
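A sketch of such a loop, with the IQ scores taken from the description above:

```r
x <- c(101, 98, 103)

# i takes on the values of x themselves, not their positions
for (i in x) {
  print(i)
}
#> [1] 101
#> [1] 98
#> [1] 103
```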
If we want to iterate over a vector x, and we need the index of its elements, it
is tempting to use 1:length(x) as the vector to iterate over.
In this case, it is better to use the special function seq_along(), as this
handles cases where x has length zero.
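The difference only shows up for an empty vector, as in this short sketch (empty is a hypothetical variable name):

```r
empty <- numeric(0)  # a vector of length zero

# 1:0 counts down, so a loop over it would run twice (with i = 1 and i = 0)
1:length(empty)
#> [1] 1 0

# seq_along() returns an empty sequence, so a loop over it would not run at all
seq_along(empty)
#> integer(0)
```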
E.g. in the following example, the iterator i will take on the values 1, 2 and
3. If we also want the IQ values at each index, we need to use i to index the
vector x.
x <- c(101, 98, 103)  # the IQ scores described above
for (i in 1:length(x)) {
  print(paste("Person", i, "has an IQ of", x[i]))
}
#> [1] "Person 1 has an IQ of 101"
#> [1] "Person 2 has an IQ of 98"
#> [1] "Person 3 has an IQ of 103"
1:length(x) should be replaced with seq_along(x).
for (i in seq_along(x)) {
  print(paste("Person", i, "has an IQ of", x[i]))
}
#> [1] "Person 1 has an IQ of 101"
#> [1] "Person 2 has an IQ of 98"
#> [1] "Person 3 has an IQ of 103"
If a for loop generates output, it is important to pre-allocate the output;
otherwise R will have to create a new object at each iteration of the loop. For
instance, if we have a list of names, and we want to create a list of email
addresses, we can loop over the names, but first we have to create the output
vector.
names <- c("boris.mayer", "andrew.ellis")
domain <- "psy.unibe.ch"
# pre-allocate output
email_adresses <- vector("character", length(names))
for (i in seq_along(names)) {
  email_adresses[i] <- paste0(names[i], "@", domain)
}
email_adresses
#> [1] "boris.mayer@psy.unibe.ch" "andrew.ellis@psy.unibe.ch"
There is an easier way of doing this: because many R functions are vectorized, we can simply write paste0(names, "@", domain). An alternative to paste() or paste0() is str_c() from the stringr package.
3.2.2 while loops
In some cases, we want to repeatedly perform some computations while a condition
is true. In this case, we use a while loop.
The condition must evaluate to TRUE or FALSE.
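As a minimal sketch of the general form (the variable countdown and its starting value are hypothetical):

```r
# general form: while (condition) { code to run as long as the condition is TRUE }
countdown <- 3  # hypothetical starting value

while (countdown > 0) {
  print(countdown)
  countdown <- countdown - 1  # update the variable used in the condition
}
#> [1] 3
#> [1] 2
#> [1] 1
```

The body has to change something that the condition depends on; otherwise the loop never terminates.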
set.seed(1733)
STOP <- FALSE
while (!STOP) {
  STOP <- purrr::rbernoulli(1)
  print(STOP)
}
#> [1] FALSE
#> [1] FALSE
#> [1] FALSE
#> [1] FALSE
#> [1] FALSE
#> [1] TRUE
The above (pointless) while loop prints the value of STOP at each iteration and
keeps running as long as STOP is FALSE. We start with STOP equal to FALSE, and
then we flip a coin (we draw a random number from a Bernoulli distribution,
using the rbernoulli() function from the purrr package), which returns either
TRUE or FALSE. The while loop runs until STOP happens to be TRUE, and therefore
!STOP is FALSE. Even though the example is rather silly, it nevertheless
illustrates the key points of while loops. We first ensure that the condition
is TRUE (here by setting STOP to FALSE), and then reassign the condition
variable at each iteration.
Every for loop can be rewritten as a while loop. The for loop using
seq_along() from the above IQ example can be expressed like this. Note that in
order to iterate over the indices of x, we have to manually increment i.
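One way to write this, as a sketch that reuses the IQ scores from the for loop example:

```r
x <- c(101, 98, 103)
i <- 1  # start at the first index

while (i <= length(x)) {
  print(paste("Person", i, "has an IQ of", x[i]))
  i <- i + 1  # manual increment; without it the loop would never end
}
#> [1] "Person 1 has an IQ of 101"
#> [1] "Person 2 has an IQ of 98"
#> [1] "Person 3 has an IQ of 103"
```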
3.2.3 Combining repeated and conditional evaluation
Of course, we can combine logical tests (if) with for loops. If we have a
vector of numbers that we want to take the square root of, we can write a
for loop, with conditional evaluation inside:
numbers <- 6:-4  # assumed input, consistent with the output below
for (x in numbers) {
  if (x > 0) {
    print(sqrt(x))
  } else {
    print("Error: x must be positive!")
  }
}
#> [1] 2.44949
#> [1] 2.236068
#> [1] 2
#> [1] 1.732051
#> [1] 1.414214
#> [1] 1
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
Exercise
It’s your turn again: extend the code that assigns grades to iterate over a list of people.
Solution
scores <- c(0.79, 0.3, 0.95, 0.64)
for (score in scores) {
  if (score >= 0.8) {
    print("Your grade is: A")
  } else if (score >= 0.65) {
    print("Your grade is: B")
  } else if (score >= 0.5) {
    # >= so that a score of exactly 50% gets a C rather than a D
    print("Your grade is: C")
  } else {
    print("Your grade is: D")
  }
}
#> [1] "Your grade is: B"
#> [1] "Your grade is: D"
#> [1] "Your grade is: A"
#> [1] "Your grade is: C"
Exercise
Let’s simulate throwing dice. In this particular game, we want to throw two dice
and add their numbers. Do this ten times, using a for loop. Maybe you can also
save the output?
Solution
One throw is the sum of two dice:
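A single throw could be sketched as follows. The name die is taken from the loop in this solution; its definition as 1:6 is an assumption. The argument replace = TRUE is used here so that both dice can show the same number, since sample() draws without replacement by default:

```r
die <- 1:6  # assumption: a fair six-sided die

# draw two numbers independently and add them up
throw <- sum(sample(die, size = 2, replace = TRUE))
throw
```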
Now, we can iterate over a vector of length 10. We should also pre-allocate the output vector.
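die <- 1:6  # assumed definition of a six-sided die
set.seed(45)
n_times <- 10
output <- vector("numeric", length = n_times)
for (i in 1:n_times) {
  # note: without replace = TRUE, sample() never draws the same number twice,
  # so doubles cannot occur in this version
  output[i] <- sum(sample(die, size = 2))
}
output
#> [1] 8 8 9 8 10 10 6 10 8 7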
We could equivalently have written the for loop using seq_along() to iterate
over the pre-allocated output.