3 Control flow

This chapter provides a very basic introduction to control flow constructs. The basic principles are standard to almost all high-level programming languages. If you already have experience with other languages, these constructs will be familiar to you.

Control flow constructs tell a computer which code to run and when, either by making a choice depending on whether some condition is met, or by performing some action repeatedly.

A more technical term for choice is conditional evaluation. This means that we want the computer to perform some actions only if a condition (or a set of conditions) is met. E.g. we want to compute the square root of a number, but only if that number is greater than zero. The constructs used here are called if and else statements.

Telling a computer to do things repeatedly is known as repeated evaluation. In this case, we want to iterate over a collection of items. In programming, this is referred to as a loop. E.g. we have a list of subjects, and we want to compute some quantity for each person in that list. In this case we could write a for loop over the list.

3.1 If and else: conditional evaluation

The R syntax for an if statement is:

if (condition) {
    Do something
}

The condition must be a logical test that returns either the value TRUE or FALSE. The parentheses (round brackets) around the condition are mandatory. The code within the curly braces is performed only if the condition provides the value TRUE. Simple statements that fit onto one line can also be written without curly braces:

if (condition) Do something

Exercise

How would you calculate the square root of a number only if that number is positive?

Solution

Let’s say that the number is 4:

number <- 4

First, we need to test whether the number is greater than 0.

number > 0
#> [1] TRUE

Of course, 4 is greater than 0, therefore the test will return TRUE.

If we choose a negative number, e.g. -4, the same test will evaluate to FALSE.

-4 > 0
#> [1] FALSE

Now we can put it all together.

if (number > 0) {
    sqrt(number)
}
#> [1] 2

What happens if we assign a negative value to our number, say -9?

number <- -9

if (number > 0) {
    sqrt(number)
}

The code block sqrt(number) is not evaluated, so nothing is printed.

3.1.1 else

Of course, we can also specify what should happen if the condition is not met by using an else statement.

if (condition) {
    Do action A
} else {
    Do action B
}
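As a minimal sketch, the square root example from above can be extended with an else branch. The variable result and the message text are only illustrative choices:

```r
number <- -9

if (number > 0) {
    result <- sqrt(number)
} else {
    # the condition is FALSE for -9, so this branch is evaluated
    result <- NA
    print("The number is not positive.")
}
#> [1] "The number is not positive."
```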

3.1.2 else if

It is also possible to extend conditional evaluation to more than one condition. In this case, we can also provide one or more else if statements:

if (condition 1) {
    Do action A
} else if (condition 2) {
    Do action B
} else {
    Do action C
}

If condition 1 is met, action A is performed; if condition 1 is not met but condition 2 is, action B is performed; and if neither is met, action C is performed. More generally, the code block corresponding to the first condition that evaluates to TRUE is evaluated. If none of the preceding conditions is TRUE, the else block is evaluated.

This is best illustrated by an example.

Suppose you want to compare two numbers, x and y, and return a statement about which one is greater. E.g. let’s compare the numbers 3 and 4:

x <- 3
y <- 4

if (x < y) {
    print("x is less than y")
} else if (x > y) {
    print("x is greater than y")
} else {
    print("x is equal to y")
}
#> [1] "x is less than y"

Only one of the print statements is evaluated; which one depends on the values of x and y.

Exercise

Now it’s your turn. Write code that assigns a grade, depending on a person’s test score. E.g. you could use:

  • grade A if a person gets a score of 80% or more
  • grade B for 65% up to (but not including) 80%
  • grade C for 50% up to (but not including) 65%
  • grade D for anything under 50%.

Solution

score <- 0.79

if (score >= 0.8) {
    print("Your grade is: A")
} else if (score >= 0.65) {
    print("Your grade is: B")
} else if (score >= 0.5) {
    print("Your grade is: C")
} else {
    print("Your grade is: D")
}
#> [1] "Your grade is: B"

Conditional evaluation is frequently used when writing functions. We will see examples of this in the next chapter.

3.1.3 ifelse()

Instead of an if...else construct, it is often possible to use the function ifelse(). This can lead to code that is both shorter and easier to read.

The basic syntax is:

ifelse(condition, true_action, false_action)

Here, for every element of condition that is TRUE, the corresponding value of true_action is returned. For every element that is FALSE, the corresponding value of false_action is returned. Note that these names are merely placeholders; both can be any values or expressions, not just function calls.

At the end of chapter 2, we determined whether numbers were even or odd using the modulo operator:

  • x %% 2 == 0 for even numbers
  • x %% 2 != 0 for odd numbers.

We can simply use the ifelse() function to return the values “even” or “odd”, because ifelse() is vectorized - it operates on whole vectors, instead of just a single TRUE/FALSE value:

x <- 1:20
ifelse(x %% 2 == 0, "even", "odd")
#>  [1] "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even"
#> [11] "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even"

A further useful function is if_else() from the dplyr package. It is stricter than ifelse(): it checks that the true and false values have compatible types.
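To see why such a type check can be useful, here is a small base R sketch (the variable name mixed is just an illustration). When the two values have different types, ifelse() silently converts the result to a common type:

```r
# the TRUE value is a number, the FALSE value is a string
mixed <- ifelse(c(TRUE, FALSE), 1, "a")
mixed
#> [1] "1" "a"

# the number 1 has been converted to the string "1"
class(mixed)
#> [1] "character"
```

A stricter function is designed to reject such a mix instead of converting it quietly.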

3.2 Loops: repeated evaluation

Loops are used to perform the same computations repeatedly. In brief, if you know how many times something should be performed, you use a for loop. If you do not know how many repetitions there will be, you use a while loop.

Due to the design of R, these kinds of explicit loops can often be avoided. On the one hand, many functions in R are vectorized, i.e. they operate on whole vectors, and thus we do not need to iterate over the elements of a vector when applying simple functions.
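A minimal sketch of what this means in practice (the vector values is made up for illustration):

```r
values <- c(1, 4, 9)

# sqrt() is vectorized: it is applied to every element at once,
# so no explicit loop is needed
roots <- sqrt(values)
roots
#> [1] 1 2 3
```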

On the other hand, there exist many functions that perform the iterations for us. These functions are known as the *apply family of functions. They can be applied to any collection of elements, such as lists or data frames. A more recent alternative to for loops are the map_* functions from the purrr package, which is part of the tidyverse. While their functionality is similar to that of the apply functions, they have a more consistent interface.
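As a rough illustration using base R only, sapply() can compute a summary for each element of a list without an explicit loop. The list groups and its values are hypothetical:

```r
# a hypothetical list with two groups of scores
groups <- list(a = c(100, 102, 104), b = c(95, 105))

# sapply() calls mean() once per list element and simplifies the result
group_means <- sapply(groups, mean)
group_means
#>   a   b 
#> 102 100
```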

3.2.1 for loops

The basic structure of a for loop is:

for (iterator in collection) {
    Do something
}

The iteration part must be written in parentheses (round brackets). As with if statements, the curly braces can be omitted if the statement fits onto one line.

The iterator takes on each value of the collection in turn. When iterating over indices, we often use the letters i, j or k (however, you can use any valid variable name).

In the next example, we want to loop over a vector of IQ scores and print every person’s IQ. The iterator i will take on each value in x in turn. At the first iteration, i will equal 101, at the second iteration, i will equal 98 and at the third iteration, i will equal 103.

x <- c(101, 98, 103)

for (i in x) {
    print(i)
}
#> [1] 101
#> [1] 98
#> [1] 103

If we want to iterate over a vector x, and we need the index of its elements, it is tempting to use 1:length(x) as the vector to iterate over.

# this returns the numbers 1,2,3 because x has three elements
1:length(x)
#> [1] 1 2 3

In this case, it is better to use the special function seq_along(), as this handles cases where x has length zero.
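The difference shows up with an empty vector. In this short sketch (the name empty is just for illustration), 1:length() counts down from 1 to 0, whereas seq_along() returns nothing to iterate over:

```r
empty <- numeric(0)

# length(empty) is 0, so this is 1:0, which counts down from 1 to 0
1:length(empty)
#> [1] 1 0

# seq_along() returns an empty vector, so a loop body would never run
seq_along(empty)
#> integer(0)
```

A for loop over 1:length(empty) would therefore run twice, with indices that do not exist in the vector.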

E.g. in the following example, the iterator i will take on the values 1, 2 and 3. If we also want the IQ values at each index, we need to use i to index the vector x.

for (i in 1:length(x)) {
    print(paste("Person", i, "has an IQ of", x[i]))
}
#> [1] "Person 1 has an IQ of 101"
#> [1] "Person 2 has an IQ of 98"
#> [1] "Person 3 has an IQ of 103"

1:length(x) should be replaced with seq_along(x).

for (i in seq_along(x)) {
    print(paste("Person", i, "has an IQ of", x[i]))
}
#> [1] "Person 1 has an IQ of 101"
#> [1] "Person 2 has an IQ of 98"
#> [1] "Person 3 has an IQ of 103"

If a for loop generates output, it is important to pre-allocate the output; otherwise R has to create a new, larger object at each iteration of the loop. For instance, if we have a vector of names, and we want to create a vector of email addresses, we can loop over the names, but first we have to create the output vector.

names <- c("boris.mayer", "andrew.ellis")
domain <- "psy.unibe.ch"

# pre-allocate output 
email_adresses <- vector("character", length(names))

for (i in seq_along(names)) {
    email_adresses[i] <- paste0(names[i], "@", domain)
}

email_adresses
#> [1] "boris.mayer@psy.unibe.ch"  "andrew.ellis@psy.unibe.ch"

There is an easier way of doing this: because paste0() is vectorized, we can simply write paste0(names, "@", domain). An alternative to paste() or paste0() is str_c() from the stringr package.
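Written out as code, the vectorized version might look like this (the result is stored in a new variable, email_vectorized, purely for illustration):

```r
names <- c("boris.mayer", "andrew.ellis")
domain <- "psy.unibe.ch"

# paste0() recycles "@" and domain across all elements of names
email_vectorized <- paste0(names, "@", domain)
email_vectorized
#> [1] "boris.mayer@psy.unibe.ch"  "andrew.ellis@psy.unibe.ch"
```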

3.2.2 while loops

In some cases, we want to repeatedly perform some computations while a condition is true. In this case, we use a while loop.

while (condition) {
  Do something
}

The condition must evaluate to TRUE or FALSE.

set.seed(1733)

STOP <- FALSE
while (!STOP) {
    STOP <- purrr::rbernoulli(1)
    print(STOP)
}
#> [1] FALSE
#> [1] FALSE
#> [1] FALSE
#> [1] FALSE
#> [1] FALSE
#> [1] TRUE

The above (admittedly pointless) while loop keeps running as long as STOP is FALSE, printing the value of STOP at each iteration. We start with STOP equal to FALSE, and then flip a coin (we draw a random number from a Bernoulli distribution, using the rbernoulli() function from the purrr package), which returns either TRUE or FALSE. The while loop runs until STOP happens to be TRUE, and therefore !STOP is FALSE. Even though the example is rather silly, it nevertheless illustrates the key points of while loops: we first ensure that the condition is TRUE (here by setting STOP to FALSE, so that !STOP is TRUE), and then update the condition variable at each iteration.
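As far as we are aware, rbernoulli() is marked as deprecated in recent purrr releases. A minimal base R sketch of the same loop could use runif() instead. This is an alternative under that assumption, not the original code, and the counter n_flips is added only for illustration. Because the random numbers are generated differently, the printed values will generally not match the output above, so no output is shown.

```r
set.seed(1733)

STOP <- FALSE
n_flips <- 0  # hypothetical counter, not part of the original example

while (!STOP) {
    # runif(1) < 0.5 is TRUE with probability 0.5, like a fair coin
    STOP <- runif(1) < 0.5
    n_flips <- n_flips + 1
    print(STOP)
}
```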

Every for loop can be rewritten as a while loop. The for loop using seq_along() from the above IQ example can be expressed as follows. Note that in order to iterate over the indices of x, we have to manually increment i.

i <- 1 
while (i <= length(x)) {
    print(paste("Person", i, "has an IQ of", x[i]))
    i <- i + 1
}
#> [1] "Person 1 has an IQ of 101"
#> [1] "Person 2 has an IQ of 98"
#> [1] "Person 3 has an IQ of 103"

3.2.3 Combining repeated and conditional evaluation

Of course, we can combine logical tests (if) with for loops. If we have a vector of numbers that we want to take the square root of, we can write a for loop, with conditional evaluation inside:

numbers <- 6:-4
numbers
#>  [1]  6  5  4  3  2  1  0 -1 -2 -3 -4
for (x in numbers) {
    if (x > 0) {
        print(sqrt(x))
    } else {
        print("Error: x must be positive!")
    }
}
#> [1] 2.44949
#> [1] 2.236068
#> [1] 2
#> [1] 1.732051
#> [1] 1.414214
#> [1] 1
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"
#> [1] "Error: x must be positive!"

Exercise

It’s your turn again: extend the code that assigns grades to iterate over a list of people.

Solution

scores <- c(0.79, 0.3, 0.95, 0.64)

for (score in scores) {
    if (score >= 0.8) {
        print("Your grade is: A")
    } else if (score >= 0.65) {
        print("Your grade is: B")
    } else if (score >= 0.5) {
        print("Your grade is: C")
    } else {
        print("Your grade is: D")
    }
}
#> [1] "Your grade is: B"
#> [1] "Your grade is: D"
#> [1] "Your grade is: A"
#> [1] "Your grade is: C"
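Because ifelse() is vectorized, the same grades could also be assigned without an explicit loop. The following is only a sketch of that alternative. It uses >= 0.5 for grade C, so that a score of exactly 50% is not graded D, in line with the grading rules of the exercise:

```r
scores <- c(0.79, 0.3, 0.95, 0.64)

# nested ifelse() calls are checked from the outside in, like else if
grades <- ifelse(scores >= 0.8, "A",
          ifelse(scores >= 0.65, "B",
          ifelse(scores >= 0.5, "C", "D")))
grades
#> [1] "B" "D" "A" "C"
```

With more than a few categories, nested calls become hard to read, and the explicit loop may be the clearer choice.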

Exercise

Let’s simulate throwing dice. In this particular game, we want to throw two dice and add their numbers. Do this ten times, using a for loop. Maybe you can also save the output?

Solution

# die is the singular form of dice
die <- 1:6 

One throw is the sum of two dice:

sum(sample(die, size = 2))
#> [1] 9

Now, we can iterate over a vector of length 10. We should also pre-allocate the output vector.

set.seed(45)

n_times <- 10
output <- vector("numeric", length = n_times)

for (i in 1:n_times) {
    output[i] <- sum(sample(die, size = 2))
}
output
#>  [1]  8  8  9  8 10 10  6 10  8  7

We could equivalently have written the for loop using seq_along() to iterate over the pre-allocated output.

set.seed(45)

n_times <- 10
output <- vector("numeric", length = n_times)

for (i in seq_along(output)) {
    output[i] <- sum(sample(die, size = 2))
}
output
#>  [1]  8  8  9  8 10 10  6 10  8  7
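One caveat: by default, sample() draws without replacement, so in the code above the two dice can never show the same number. If doubles should be possible, as with two physical dice, a sketch of the same loop with replace = TRUE might look as follows. No output is shown, because the values will differ from those above.

```r
set.seed(45)

die <- 1:6
n_times <- 10
output <- vector("numeric", length = n_times)

for (i in seq_along(output)) {
    # replace = TRUE lets both dice show the same face
    output[i] <- sum(sample(die, size = 2, replace = TRUE))
}
output
```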