# Chapter 6 Element 1: Functions

## 6.1 Learning Objectives

In R, everything that happens is a function call. As a first step, you should know how to:

- Use naming and positional matching to provide arguments,
- Predict what the output of a function should look like, given the input, and
- Recognise that operators are short-cuts for functions.

## 6.2 Introduction

There are three programming paradigms found in R: *functional*, *object-oriented* and *reactive.* In this workshop, we’ll introduce you to the first two.

In the context of functional programming, *everything that happens is a function call*. We’ll get deep into objects in the next chapter. Here, we need to know some very basic things, like how to make objects and do math, but our focus is on functions.

## 6.3 Arithmetic Operators

R is basically a simple, yet very powerful, calculator, which means it can handle all the usual *arithmetic operators*:^{16}

Operator | Description |
---|---|
`+` | Addition |
`-` | Subtraction |
`*` | Multiplication |
`/` | Division |
`^` or `**` | Exponentiation |
`%/%` | Integer division: `5 %/% 2` is `2` |
`%%` | Modulus/remainder: `5 %% 2` is `1` |
`()` | Set order of operations |

Consider the following examples. Use R’s text editor to enter the following commands:

```
34 + 6 # Addition
34 / 6 # Division
34 %/% 6 # Integer division
34 %% 6 # Remainder
34^6 # Exponentiation
```

```
# [1] 40
# [1] 5.666667
# [1] 5
# [1] 4
# [1] 1544804416
```
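Because operators are short-cuts for functions, each of the commands above can also be written as an ordinary function call by wrapping the operator in backticks. A small sketch for illustration:

```r
# An operator wrapped in backticks can be called like any other function
`+`(34, 6)    # same as 34 + 6
`%/%`(34, 6)  # same as 34 %/% 6
`%%`(34, 6)   # same as 34 %% 6
`^`(34, 6)    # same as 34^6

# The operator form and the function form give identical results
identical(`+`(34, 6), 34 + 6)
```

The operator form is what you would normally type; the function form simply makes it visible that a function call is happening underneath.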

Don’t forget your order of operations! BEDMAS = Brackets, exponents, division, multiplication, addition, subtraction:

```
# [1] 1.25
# [1] -0.25
```
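The commands that produced this output are not shown. One pair of expressions consistent with it, chosen here purely for illustration, shows how brackets change the result:

```r
# Division is evaluated before subtraction
2 - 3 / 4    # 2 - 0.75 = 1.25

# Brackets force the subtraction to happen first
(2 - 3) / 4  # -1 / 4 = -0.25
```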

Now, assign the numbers 34 and 6 to objects `n` and `p`, respectively.

The whole point is that our objects, `n` and `p`, are place-holders *for*, and can be used in place *of*, the values.^{17}

```
# [1] 40
# [1] 5.7
# [1] 5
# [1] 4
# [1] 1.5e+09
```
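The commands behind the output above are not shown. A minimal sketch, assuming the same five operations as before are repeated with `n` and `p` in place of the numbers:

```r
n <- 34  # assign the value 34 to the object n
p <- 6   # assign the value 6 to the object p

n + p    # Addition
n / p    # Division
n %/% p  # Integer division
n %% p   # Remainder
n^p      # Exponentiation
```

The values are the same as with the bare numbers; how many digits are printed depends on R's `digits` option.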

We’ll return to objects in great detail in the next chapter. For now, let’s dig into functions.

## 6.4 What are Functions?

Functions are commands^{18} that accept zero or more user-specified arguments. Typically, functions follow the generic form

`function_name(argument_1 = , ... )`

where `function_name` is the function’s name and `argument_1 =` specifies the first argument to be passed to the function. A function may have many arguments, but as long as they are given in order they don’t need to be explicitly named.^{19}

We already saw several functions, so let’s review what we know. First, we saw that there are some convenience functions, such as `log2(8)`.

They basically provide an easier way of writing the full function, `log(8, base = 2)`.

Here, the first argument (`8`) is assigned via positional matching and the second (`base = 2`) is explicitly named. This combination is *very* typical, but we often only use positional matching for simple functions.
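As a quick check of the two forms described above, the following sketch calls the convenience function and the full function side by side:

```r
log2(8)               # convenience function: base 2 is built in
log(8, base = 2)      # full function: 8 matched by position, base by name
log(x = 8, base = 2)  # the same call with both arguments named
```

All three calls return the same value, because `x` and `base` are the first and second arguments of `log()`.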

OK, so what do you think about this?
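The call being asked about is not reproduced here. One call that fits the description, assumed for illustration, names the first argument of `log()` and leaves the second to positional matching:

```r
# Named arguments are matched first; the unnamed 2 then fills the
# remaining argument, base
log(2, x = 8)

# Clearer ways of writing the same thing
log(8, base = 2)
log2(8)
```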

It’s a combination of positional matching and naming, and the result is the same, but it’s really confusing! Don’t do that! Try to keep your function calls straightforward. The table below lists some very common functions.

Function | Description |
---|---|
`c()` | Combine arguments |
`seq()` | Create a sequence of numbers |
`rep()` | Create a repetitive sequence |

For example, `c()` is a frequently used function that *combines* values into a single object.^{20}

`# [1] 3 8 9 23`

`# [1] "healthy" "tissue" "quantity"`
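The calls behind these two results are not shown. A sketch that would produce them follows; the name `xx` is the one used in Exercise 6.2, while `yy` is a hypothetical name for the character vector:

```r
xx <- c(3, 8, 9, 23)                      # combine four numbers
xx

yy <- c("healthy", "tissue", "quantity")  # combine three character strings
yy
```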

`seq()` is convenient for producing a *sequence* of numbers:

`# [1] 1 8 15 22 29 36 43 50 57 64 71 78 85 92 99`

But recall that we can just leave out the names, since `seq()` is common and straightforward.

`# [1] 1 8 15 22 29 36 43 50 57 64 71 78 85 92 99`
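A sketch of calls that produce the sequence above, first with the arguments named and then relying on positional matching (`from`, `to` and `by` are the first three arguments of `seq()`). The end point 100 is an assumption, since any value from 99 to 105 gives the same result:

```r
seq(from = 1, to = 100, by = 7)  # all arguments named
seq(1, 100, 7)                   # same call, positional matching only
```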

Things get interesting when objects are used as arguments:

`# [1] 1 7 13 19 25 31`
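The sequence above is what `seq()` returns when the objects `n` and `p` from Section 6.3 are passed as `to` and `by`. A sketch, assuming the result is stored in `foo2` (the six-value object used in the exercises later in this chapter):

```r
n <- 34
p <- 6

foo2 <- seq(1, n, p)  # same as seq(from = 1, to = 34, by = 6)
foo2
```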

## 6.5 The `:` operator

If we want to generate a regular integer sequence, we can use `:`, the *colon operator*, as a short-cut:

`# [1] 1 2 3 4 5 6 7 8 9 10`

`# [1] 1 2 3 4 5 6 7 8 9 10`
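Two calls that both print the integers 1 to 10, shown as a sketch of what the outputs above correspond to:

```r
seq(1, 10, 1)  # from 1 to 10 in steps of 1
1:10           # the colon operator as a short-cut
```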

## 6.6 Descriptive statistics

Function | Description |
---|---|
`length()` | Get the length of a vector |
`mean()` | Mean (average), \(\bar{x}\) |
`names(sort(-table()))[1]` | Mode (most frequent value) |
`median()` | Median (middle value) |
`var()` | Variance, \(s^2\) |
`sd()` | Standard deviation, \(s\) |
`IQR()` | The inter-quartile range |
`max()` | Maximum number |
`min()` | Minimum number |
`range()` | Range: in R, the min and max |

Table 6.3 lists some common functions for descriptive statistics. We can calculate some descriptive statistics on `foo1` and `foo2`:

`# [1] 15`

`# [1] 50`
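The calls that produced these two values are not shown. A sketch, assuming `foo1` holds the 15-value sequence from Section 6.4 and `foo2` the six-value one; under that assumption both the mean and the median of `foo1` equal 50:

```r
foo1 <- seq(1, 100, 7)  # assumed definition: 1, 8, ..., 99
foo2 <- seq(1, 34, 6)   # assumed definition: 1, 7, ..., 31

length(foo1)  # number of values in foo1
mean(foo1)    # arithmetic mean
median(foo1)  # middle value
range(foo2)   # two values: the min and the max
```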

With the exception of `range()`,^{23} all the functions in Table 6.3 return a single value. Notice that in this situation, we can describe mathematical functions in two fundamental ways:

**Aggregation** functions *summarise* data and return (typically) a single value.

**Transformation** functions *mutate* each individual data point in exactly the same way. The number of output values equals the number of input values.

We just saw some aggregation functions, but we’ve already seen a transformation function!

`# [1] 1 7 13 19 25 31`

`# [1] 0.0 2.8 3.7 4.2 4.6 5.0`
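These two lines of output correspond to printing `foo2` and its base-2 logarithm. A sketch, assuming `foo2` is the six-value sequence 1, 7, ..., 31:

```r
foo2 <- seq(1, 34, 6)

foo2        # six input values
log2(foo2)  # six output values, one per input
```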

What are some other examples of transformation functions?

`# [1] 1.0 2.6 3.6 4.4 5.0 5.6`
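The output above matches the square root of each value in `foo2`. A sketch, assuming `foo2` is the six-value sequence 1, 7, ..., 31:

```r
foo2 <- seq(1, 34, 6)

sqrt(foo2)            # square root of each value
round(sqrt(foo2), 1)  # rounded to one decimal place for comparison
```

Other functions that work element by element in the same way include `exp()`, `abs()` and `round()`.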

Of course we can combine aggregation and transformation functions. For example, when calculating a Z-score normalisation (\(z = \frac {x_i - \bar{x}}{s}\)):

`# [1] -1.34 -0.80 -0.27 0.27 0.80 1.34`

What we just saw is a profound and important point in R. What happened?

Following the order of operations: the mean of `foo2` was calculated (an aggregation function), then subtracted from each *individual* value in the whole `foo2` series (a transformation function). Then the same thing happened with the division by the standard deviation.
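The printed Z-scores have six values, which matches the six-value series `foo2`. A step-by-step sketch of the calculation follows; `centred` and `z` are hypothetical names introduced for illustration:

```r
foo2 <- seq(1, 34, 6)

centred <- foo2 - mean(foo2)  # one mean, subtracted from all six values
z <- centred / sd(foo2)       # one standard deviation, dividing all six values
round(z, 2)

# The same calculation in a single command
(foo2 - mean(foo2)) / sd(foo2)
```

In both steps a single aggregated value is reused for every element of the longer vector.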

**Exercise 6.1 (Predict output)** Given `foo2`, can you predict the outcome of the following commands? Are they transformation or aggregation functions?

`> foo2 + 100`

`> foo2 + foo2`

`> sum(foo2) + foo2`

`> 1:3 + foo2`

## Solution

Here is the solution to the exercise:

`# [1] 101 107 113 119 125 131`

`# [1] 2 14 26 38 50 62`

`# [1] 97 103 109 115 121 127`

`# [1] 2 9 16 20 27 34`
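A sketch of the four commands with a note on what happens in each, assuming `foo2` is the six-value sequence 1, 7, ..., 31:

```r
foo2 <- seq(1, 34, 6)

foo2 + 100        # 100 is recycled and added to every value
foo2 + foo2       # values are added pairwise
sum(foo2) + foo2  # sum(foo2) aggregates to 96, which is added to every value
1:3 + foo2        # 1, 2, 3 is recycled twice to match the six values
```

In each case the shorter vector is reused until it matches the length of the longer one.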

This is an interesting, important and very convenient concept, so let’s make sure you understand what’s going on here.

**Exercise 6.2 (Calculate results)** Use the linear equation from the previous exercises to calculate \(y = 1.12x - 0.4\) for `xx`.

So far so good. If you understand the premise, you should have gotten 2.96, 8.56, 9.68, 25.36 from a single command.
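One way to get these values from a single command, assuming `xx` holds the four numbers 3, 8, 9 and 23 combined with `c()` earlier in the chapter:

```r
xx <- c(3, 8, 9, 23)

# The slope and intercept are single values, so they are applied to every x
1.12 * xx - 0.4
```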

**Exercise 6.3 (Multiple iterations)** Continuing from the previous exercise, how would you calculate the result of our equation if we had more than one slope? e.g.

`> m2 <- c(0, 1.12)`

So this is a bit trickier, because of vector recycling! Of course there are ways of dealing with this, which brings us to the topic of *creating* functions of our own.
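To see the recycling problem, and one possible way around it, consider the following sketch. The helper `linear()`, its arguments and the object `res` are hypothetical names introduced here for illustration, and `xx` is assumed to hold 3, 8, 9 and 23:

```r
xx <- c(3, 8, 9, 23)
m2 <- c(0, 1.12)

# Recycling pairs the slopes with the x values in turn (0, 1.12, 0, 1.12),
# so each x is only ever multiplied by one of the two slopes
m2 * xx - 0.4

# A small user-defined function for the linear equation
linear <- function(x, m, b = -0.4) {
  m * x + b
}

# Apply the function once per slope: one column of results per slope
res <- sapply(m2, function(m) linear(xx, m))
res
```

This is only one of several possible approaches; the point is that wrapping the equation in a function lets each slope be applied to the whole of `xx`.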

^{16} In addition to *arithmetic operators* we will also encounter *relational operators* (p. @ref(sec:relational_operators)), *logical operators* (p. @ref(sec:logical_operators)) and special characters used in *regular expressions* (p. **??**).

^{17} We already saw this in **??** when we assigned the result of `log2(8)` to the object `n`. What happens when we now assign the number `34` to `n`?

^{18} Functions are either built-in, provided by extension packages, or user-defined.

^{19} Omitting the argument name is common for functions with only a few arguments or for specifying the first few arguments of complex functions. However, it is useful to specify argument names until you become more experienced in writing R scripts.

^{20} We’ll see in the next chapter that this is a vector.

^{21} R recognizes words in quotation marks as a character string, and not the name of an object.

^{22} Note that the `::` operator tells us that the `std.error()` function is in the `plotrix` package. See **??** for details on how to use packages.

^{23} In R, `range()` is a bit of an anomaly, since we typically think of the range as a single value, i.e. the difference between the max and min. Instead, R returns two values, i.e. the min and max.