Chapter 6 Element 2: Functions

6.1 Learning Objectives

In R, everything that happens is a function call. As a first step, you should know how to:

  • Use naming and positional matching to provide arguments,
  • Predict what the output of a function should look like, given the input, and
  • Recognise that operators are short-cuts for functions.

6.2 Introduction

There are three programming paradigms found in R: functional, object-oriented and reactive. In this workshop, we’ll introduce you to the first two.

In the context of functional programming, everything that happens is a function call. We’ll get deep into objects in the next chapter. Here, we need to know some very basic things, like how to make objects and do math, but our focus is on functions.

6.3 Arithmetic Operators

R is basically a simple, yet very powerful, calculator, which means it can handle all the usual arithmetic operators:

Table 6.1: Arithmetic operators.16
Operator    Description
+           Addition
-           Subtraction
*           Multiplication
/           Division
^ or **     Exponentiation
%/%         Integer division, e.g. 5 %/% 2 is 2
%%          Modulus (remainder), e.g. 5 %% 2 is 1
()          Set order of operations

Consider the following examples. Use R’s text editor to enter the following commands:
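The commands themselves are missing above the output. A reconstruction consistent with the five results shown, assuming the operands 34 and 6 that are used in the rest of this section, might be:

```r
# Assumed operands: 34 and 6 (the values later assigned to n and p)
34 + 6    # addition
34 / 6    # division
34 %/% 6  # integer division
34 %% 6   # modulus (remainder)
34^6      # exponentiation
```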

# [1] 40
# [1] 5.666667
# [1] 5
# [1] 4
# [1] 1544804416

Don’t forget your order of operations! BEDMAS = Brackets, exponents, division, multiplication, addition, subtraction:
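The original commands are not shown here. One pair of expressions that reproduces the two results below, chosen purely for illustration, is:

```r
# Illustrative values only; any expressions with the same structure would do
2 - 3 / 4    # division first, then subtraction: 2 - 0.75
(2 - 3) / 4  # brackets first: -1 / 4
```

The only difference between the two lines is the pair of brackets, which is enough to change the result.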

# [1] 1.25
# [1] -0.25

Now, assign the numbers 34 and 6 to objects n and p, respectively:
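A minimal sketch of the assignment, using the <- assignment operator:

```r
n <- 34  # store 34 under the name n
p <- 6   # store 6 under the name p
```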

The whole point is that our objects, n and p, are place-holders for – and can be used in place of – the values.17
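For instance, the arithmetic from the start of this section can be repeated with n and p standing in for 34 and 6. This is a sketch; the exact commands are assumed from the output that follows:

```r
n <- 34
p <- 6

n + p    # same as 34 + 6
n / p    # same as 34 / 6
n %/% p  # integer division
n %% p   # modulus (remainder)
n^p      # exponentiation
```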

# [1] 40
# [1] 5.7
# [1] 5
# [1] 4
# [1] 1.5e+09

We’ll return to objects in great detail in the next chapter. For now, let’s dig into functions.

6.4 What are Functions?

Functions are commands18 that accept zero or more user-specified arguments. Typically, functions follow the generic form

function_name(argument_1 = , ... )

where function_name is the function’s name and argument_1 = specifies the first argument to be passed to the function. A function may have many arguments, but as long as they are given in order they don’t need to be explicitly named.19

We already saw several functions, so let’s review what we know. First, we saw that there are some convenience functions:
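The convenience function seen earlier was log2(), going by the footnote that mentions assigning the result of log2(8) to n:

```r
log2(8)  # base-2 logarithm of 8
# [1] 3
```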

They basically provide an easier way of writing the full function:
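Assuming the full function meant here is log() with an explicit base, which is what the description of the arguments (8 and base = 2) suggests, the equivalent call is:

```r
log(8, base = 2)  # 8 is matched by position, base is matched by name
```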

Here, the first argument (8) is assigned via positional matching and the second (base = 2) is explicitly named. This combination is very typical, but we often only use positional matching for simple functions.

OK, so what do you think about this?
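The call in question is not shown. A plausible candidate, assumed here, puts the named argument first and the positional one second:

```r
# Assumed example: base is matched by name, so 8 fills the first free argument, x
log(base = 2, 8)
```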

It’s a combination of positional matching and naming, and the result is the same, but it’s really confusing! Don’t do that! Try to keep your function calls straightforward. Table 6.2 lists some very common functions.

Table 6.2: Examples of some simple and frequently used functions for creating sequences.
Function    Description
c()         Combine arguments
seq()       Create a sequence of numbers
rep()       Create a repetitive sequence

For example, c() is a frequently used function that combines values into a single object.20
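A sketch of the two calls behind the output below. The object name xx is taken from Exercise 6.2, whose expected results match these four numbers; the name my_words is hypothetical:

```r
xx <- c(3, 8, 9, 23)  # combine four numbers into one object
xx

# my_words is a hypothetical name, used only for illustration
my_words <- c("healthy", "tissue", "quantity")  # combine three character strings
my_words
```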

# [1]  3  8  9 23
# [1] "healthy"  "tissue"   "quantity"

Note that the second result consists of character strings rather than numbers.21

seq() is convenient for producing a sequence of numbers:
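The output below matches a sequence starting at 1 and increasing in steps of 7. The upper limit of 100 is an assumption; any limit from 99 up to (but not including) 106 gives the same fifteen values:

```r
# Assumed upper limit: 100
seq(from = 1, to = 100, by = 7)
```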

#  [1]  1  8 15 22 29 36 43 50 57 64 71 78 85 92 99

But recall that we can just leave out the names, since seq() is common and straightforward:
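With positional matching only, and again assuming an upper limit of 100, the call becomes:

```r
seq(1, 100, 7)  # from, to and by are matched by position
```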

#  [1]  1  8 15 22 29 36 43 50 57 64 71 78 85 92 99

Things get interesting when objects are used as arguments:
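Assuming n and p still hold 34 and 6, the output below is consistent with passing them as the to and by arguments. The name foo2 is taken from the exercises later in this chapter, where the same six values are used:

```r
n <- 34
p <- 6

foo2 <- seq(1, n, p)  # same as seq(from = 1, to = 34, by = 6)
foo2
```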

# [1]  1  7 13 19 25 31

6.5 The : operator

If we want to generate a regular integer sequence, we can use :, the colon operator as a short-cut:
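A sketch of both routes to the same sequence. The backtick call at the end is an extra illustration, not part of the original text, showing that the colon operator is itself a function:

```r
seq(1, 10, 1)  # full function call
1:10           # colon operator short-cut
`:`(1, 10)     # the operator called like any other function
```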

#  [1]  1  2  3  4  5  6  7  8  9 10
#  [1]  1  2  3  4  5  6  7  8  9 10

6.6 Descriptive statistics

Table 6.3: Frequently used functions in descriptive statistics.
Function                    Description
length()                    Get the length of a vector
mean()                      Mean (average), \(\bar{x}\)
names(sort(-table()))[1]    Mode (most frequent value)
median()                    Median (middle value)
var()                       Variance, \(s^2\)
sd()                        Standard deviation, \(s\)
IQR()                       The inter-quartile range
max()                       Maximum number
min()                       Minimum number
range()                     Range; in R, the minimum and maximum

22

Table 6.3 lists some common functions for descriptive statistics. We can calculate some descriptive statistics on foo1 and foo2:
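The definitions of foo1 and foo2 are not shown. The sketch below assumes that foo1 is the 15-value sequence and foo2 the 6-value sequence created in Section 6.4, which is consistent with the results that follow and with the exercises:

```r
foo1 <- seq(1, 100, 7)  # assumed definition: 15 values from 1 to 99
foo2 <- seq(1, 34, 6)   # assumed definition: 1 7 13 19 25 31

length(foo1)  # number of values
mean(foo1)    # arithmetic mean (median(foo1) gives the same value here)
```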

# [1] 15
# [1] 50

With the exception of range(),23 all the functions in Table 6.3 return a single value. Notice that we can describe mathematical functions in two fundamental ways:

Aggregation functions summarise data and return (typically) a single value.

Transformation functions mutate each individual data point in exactly the same way. The number of output values equals the number of input values.

We just saw some aggregation functions, but we’ve already seen a transformation function!
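Judging by the output below, the transformation function we already saw is log2(), applied here to foo2 (assumed to hold 1, 7, 13, 19, 25, 31):

```r
foo2 <- seq(1, 34, 6)  # assumed definition

foo2
log2(foo2)  # one output value per input value
```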

# [1]  1  7 13 19 25 31
# [1] 0.0 2.8 3.7 4.2 4.6 5.0

What are some other examples of transformation functions?
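One answer that is consistent with the output below is the square root (again assuming the six-value foo2):

```r
foo2 <- seq(1, 34, 6)  # assumed definition

sqrt(foo2)  # square root of each value, six values in and six values out
```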

# [1] 1.0 2.6 3.6 4.4 5.0 5.6

Of course, we can combine aggregation and transformation functions, for example when calculating a Z-score normalisation (\(z_i = \frac{x_i - \bar{x}}{s}\)):
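A sketch of the Z-score as a single expression, assuming it is applied to the six-value foo2 (which matches the six results below):

```r
foo2 <- seq(1, 34, 6)  # assumed definition

(foo2 - mean(foo2)) / sd(foo2)  # subtract the mean, then divide by the standard deviation
```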

# [1] -1.34 -0.80 -0.27  0.27  0.80  1.34

What we just saw is a profound and important point in R. What happened?

Following the order of operations: the mean of foo2 was calculated (an aggregation function), then subtracted from each individual value in the whole foo2 series (a transformation function). The same then happened with the division: the standard deviation was calculated once (aggregation), and each centred value was divided by it (transformation).
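The same calculation can be written out step by step. This is a sketch, assuming the six-value foo2; the intermediate names centre, spread and centred are hypothetical:

```r
foo2 <- seq(1, 34, 6)     # assumed definition: 1 7 13 19 25 31

centre <- mean(foo2)      # aggregation: a single value, 16
spread <- sd(foo2)        # aggregation: a single value, about 11.22
centred <- foo2 - centre  # transformation: 16 is subtracted from all six values
centred / spread          # transformation: each centred value divided by the same sd
```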

Exercise 6.1 (Predict output) Given foo2, can you predict the outcome of the following commands? Are they transformation or aggregation functions?
> foo2 + 100
> foo2 + foo2
> sum(foo2) + foo2
> 1:3 + foo2
Solution

Here is the solution to the exercise
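The four commands from the exercise, with comments on what each one does (assuming foo2 holds 1, 7, 13, 19, 25, 31):

```r
foo2 <- seq(1, 34, 6)  # assumed definition

foo2 + 100        # transformation: 100 is added to every value
foo2 + foo2       # transformation: values are added element by element
sum(foo2) + foo2  # aggregation (sum gives 96), then transformation (+)
1:3 + foo2        # transformation: 1:3 is recycled twice to match the six values
```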

# [1] 101 107 113 119 125 131
# [1]  2 14 26 38 50 62
# [1]  97 103 109 115 121 127
# [1]  2  9 16 20 27 34

This is an interesting, important and very convenient concept, so let’s make sure you understand what’s going on here.

Exercise 6.2 (Calculate results) Use the linear equation from the previous exercises to calculate \(y=1.12x-0.4\) for xx.

So far so good. If you understand the premise, you should have gotten 2.96, 8.56, 9.68, 25.36 from a single command.
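One such command, assuming xx holds 3, 8, 9 and 23 (the values that give the results quoted above):

```r
xx <- c(3, 8, 9, 23)  # assumed definition

1.12 * xx - 0.4  # the slope and intercept are applied to every value of xx
```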

Exercise 6.3 (Multiple iterations) Continuing from the previous exercise, how would you calculate the result of our equation if we had more than one slope? e.g.
> m2 <- c(0, 1.12)

So this is a bit trickier, because of vector recycling! Of course there are ways of dealing with this, which brings us to the topic of creating functions of our own.
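To see the problem, and one possible way round it, consider the sketch below. It assumes xx holds 3, 8, 9 and 23; lin_eq() is a hypothetical helper, and this is not necessarily the approach taken later in the workshop:

```r
xx <- c(3, 8, 9, 23)  # assumed definition
m2 <- c(0, 1.12)

# Naive attempt: m2 is recycled along xx, so the slopes alternate
# 0, 1.12, 0, 1.12 instead of each slope being applied to every x
m2 * xx - 0.4

# lin_eq() is a hypothetical helper, named here only for illustration
lin_eq <- function(x, m, b = -0.4) {
  m * x + b
}

# Apply the equation once per slope; the result has one column per slope
sapply(m2, function(m) lin_eq(xx, m))
```

The naive attempt returns only four values, mixing the two slopes, whereas the second version returns a matrix with four rows (one per value of xx) and two columns (one per slope).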


  16. In addition to arithmetic operators we will also encounter relational operators (p. @ref(sec:relational_operators)), logical operators (p. @ref(sec:logical_operators)) and special characters used in regular expressions (p. ??).

  17. We already saw this in ?? when we assigned the result of log2(8) to the object n. What happens when we now assign the number 34 to n?

  18. Functions are either built-in, provided by extension packages, or user-defined.

  19. Omitting the argument name is common for functions with only a few arguments or for specifying the first few arguments of complex functions. However, it is useful to specify argument names until you become more experienced in writing R scripts.

  20. We’ll see in the next chapter that this is a vector.

  21. R recognizes words in quotation marks as a character string, and not the name of an object.

  22. Note that the :: operator tells us that the std.error() function is in the plotrix package. See ?? for details on how to use packages.

  23. In R, range() is a bit of an anomaly, since we typically think of the range as a single value, i.e. the difference between the max and min. Instead, R returns two values, i.e. the min and max.