Introduction

0.1 Requirements

Please install the latest versions of R and RStudio, preferably in that order:

R: R version 3.6.3 (2020-02-29)

RStudio: RStudio Desktop

Platform-specific instructions are provided.

On Windows, you might need to install Rtools if you want to compile packages from source. On macOS, you will be prompted to install Xcode if necessary. Both installs are optional and not required for this course.

The R installer package (on macOS and Windows) installs all the necessary tools for running R, plus a rather outdated graphical user interface (GUI). In this course, we will be interacting with R via RStudio, an integrated development environment (IDE) with a much more modern GUI.

Note that RStudio does require a working R installation.
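A quick way to confirm that RStudio has found a working R installation is to type the following into the console. This is only a minimal sketch; the comparison against 3.6.3 assumes the version mentioned above, and the output depends on your installation.

```r
# Show which R installation is currently running
R.version.string

# Check whether it is at least the version recommended above
getRversion() >= "3.6.3"
```

If the first line prints a version string, R is installed and RStudio is able to use it.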

0.2 What is R?

R is both a programming language and a statistical computing environment. It is open source, which means that the source code is freely available under the GNU General Public License. R is also free to use.

R was developed by Ross Ihaka and Robert Gentleman, both at the University of Auckland at the time, as an open-source implementation of the commercial S language. The name R is partly due to both developers’ first names starting with R, and partly because of the similarity to S.

R is maintained by the R Core Team, whose responsibilities include ensuring stability and backwards-compatibility.

The R language comes with a set of base packages, which are automatically installed. R’s functionality can be extended by installing contributed packages from a central repository, the Comprehensive R Archive Network (CRAN).
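The difference between base and contributed packages can be sketched as follows. The variable names base_pkgs and pkg are only illustrative, and tidyverse is used as the example package because it appears later in this course. The sketch does not install anything by itself; it only tells you what to run if the package is missing.

```r
# Base packages ship with every R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
"stats" %in% base_pkgs
#> [1] TRUE

# Contributed packages are installed once from the repository and
# then loaded in every new session
pkg <- "tidyverse"
if (requireNamespace(pkg, quietly = TRUE)) {
  library(pkg, character.only = TRUE)
} else {
  message("Not installed yet; run install.packages(\"", pkg, "\") once.")
}
```

Installing a package is needed only once per R installation, whereas library() has to be called in each session in which you want to use the package.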

The R language can be somewhat tricky to learn; its syntax can be considered rather arcane. We will be using the tidyverse, a family of R packages with a consistent design philosophy. These packages are designed to make interacting with R much easier.

Even though learning to use a text-based program such as R can be a daunting prospect, it is certainly worth the investment. R is currently the state of the art for statistics and data science, and there exist packages for almost every problem.

0.3 Further literature

For anyone interested in a more advanced course, there is an excellent online textbook, R for Data Science, written by Garrett Grolemund and Hadley Wickham. It covers many advanced topics, such as systematic approaches to model building.

For anyone interested in the R programming language, we recommend the book Hands-On Programming with R by Garrett Grolemund. A very advanced online textbook is Advanced R by Hadley Wickham.

DataCamp offer various online courses. These are not free, but the quality is certainly very good. This introductory course is free to use.

In addition to this, there are many excellent blogs about R.

0.4 Typographic conventions used in this book

In the text, we will use the following coloured text blocks:


This block is used for comments and explanations.

This block is for advanced information.

This block indicates that you should try things out for yourself.


In addition to prose, this book contains R code. Code chunks look like this:

x <- seq(from = 1, to = 10, by = 1)

Anything inside a code chunk can be pasted into an R console. If you hover your pointer over the top right corner of a code chunk, you can copy the text to the clipboard.

Code chunks can also have an output:

x
#>  [1]  1  2  3  4  5  6  7  8  9 10

In this block, x is the input and #> [1] 1 2 3 4 5 6 7 8 9 10 is the output. Here we printed the variable x, which we created in the previous code chunk by assigning the numbers 1 to 10 to it.
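As a further illustration of this convention, here is a small sketch with a second variable. The name y is hypothetical and does not appear elsewhere in this book; it is created with the shorthand 1:10, which also produces the numbers 1 to 10.

```r
# y is a hypothetical second variable, created with the shorthand 1:10
y <- 1:10

# Typing an expression on its own prints its value
length(y)
#> [1] 10

# Arithmetic is applied to every element of the variable
y * 2
#>  [1]  2  4  6  8 10 12 14 16 18 20
```

Lines starting with #> are never typed; they only show what R prints in response to the line above them.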

This book also includes exercises whose solutions are hidden. These can be revealed (and hidden again) by clicking on the Show/Hide button:

Solution

devtools::session_info() 
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  de_CH.UTF-8                 
#>  ctype    de_CH.UTF-8                 
#>  tz       Europe/Zurich               
#>  date     2020-03-05                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.5   2019-10-02 [1] CRAN (R 3.6.0)
#>  bookdown      0.17    2020-01-11 [1] CRAN (R 3.6.0)
#>  callr         3.4.2   2020-02-12 [1] CRAN (R 3.6.0)
#>  cli           2.0.1   2020-01-08 [1] CRAN (R 3.6.0)
#>  codetools     0.2-16  2018-12-24 [1] CRAN (R 3.6.2)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.2.2   2020-02-17 [1] CRAN (R 3.6.0)
#>  digest        0.6.24  2020-02-12 [1] CRAN (R 3.6.0)
#>  ellipsis      0.3.0   2019-09-20 [1] CRAN (R 3.6.0)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 3.6.0)
#>  fs            1.3.1   2019-05-06 [1] CRAN (R 3.6.0)
#>  glue          1.3.1   2019-03-12 [1] CRAN (R 3.6.0)
#>  htmltools     0.4.0   2019-10-04 [1] CRAN (R 3.6.0)
#>  knitr       * 1.28    2020-02-06 [1] CRAN (R 3.6.0)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.6.0)
#>  pkgbuild      1.0.6   2019-10-09 [1] CRAN (R 3.6.0)
#>  pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 3.6.0)
#>  processx      3.4.2   2020-02-09 [1] CRAN (R 3.6.0)
#>  ps            1.3.2   2020-02-13 [1] CRAN (R 3.6.0)
#>  R6            2.4.1   2019-11-12 [1] CRAN (R 3.6.0)
#>  Rcpp          1.0.3   2019-11-08 [1] CRAN (R 3.6.0)
#>  remotes       2.1.1   2020-02-15 [1] CRAN (R 3.6.0)
#>  rlang         0.4.4   2020-01-28 [1] CRAN (R 3.6.0)
#>  rmarkdown     2.1     2020-01-20 [1] CRAN (R 3.6.0)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.6.0)
#>  rstudioapi    0.11    2020-02-07 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.0)
#>  stringi       1.4.6   2020-02-17 [1] CRAN (R 3.6.0)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.3.1   2019-12-01 [1] CRAN (R 3.6.0)
#>  usethis       1.5.1   2019-07-04 [1] CRAN (R 3.6.0)
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.12    2020-01-13 [1] CRAN (R 3.6.0)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 3.6.0)
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

0.5 License

This work is licensed under a Creative Commons Attribution 4.0 International License.