28 Data and Design
The first step in thinking creatively about data visualisation is to appreciate that graphics are built upon an underlying grammar. When we write a manuscript, we take great care to form our sentences so that they communicate the specific message we want to convey. That is what we need to do with graphics.
We are not limited to specific, standard forms of expression. Indeed, we can, and should, communicate in the way that best suits our goal.
28.1 Graphics have a Grammar
We now have all the elements necessary to start building meaningful plots using the grammar of graphics. Just as every word in a sentence has a clear grammatical definition, so too does every element of a graphic. Consider one of the most well-known sentences in English:
If we keep the structure of the original sentence but replace specific words, the meaning changes dramatically.
We build graphics in much the same way.¹ The grammatical elements of our graphics are:
Data: The data set being plotted.
Aesthetics: The scales onto which we map our data.
Geometries: The visual elements used for our data.
Statistics: Representations of our data to aid understanding.
Coordinates: The space on which the data will be plotted.
Facets: Plotting small multiples.
Themes: All non-data ink.
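One way to make the grammatical elements concrete is to treat a plot as a record with one slot per element. The following is a minimal sketch in plain Python, not the API of any plotting library: the `PlotSpec` class, its field names, its default values and the `mapped_values` helper are all hypothetical names introduced here for illustration. The data values are taken from the first rows of the Iris table later in this chapter.

```python
from dataclasses import dataclass, field, replace


@dataclass
class PlotSpec:
    """Hypothetical container with one slot per grammatical element."""
    data: list                                   # Data: the records being plotted
    aesthetics: dict                             # Aesthetics: scale name -> variable name
    geometries: list = field(default_factory=lambda: ["point"])  # Geometries
    statistics: list = field(default_factory=list)               # Statistics
    coordinates: str = "cartesian"               # Coordinates
    facets: list = field(default_factory=list)   # Facets: variables defining small multiples
    theme: dict = field(default_factory=dict)    # Themes: all non-data ink


flowers = [
    {"Sepal.Length": 5.099182, "Sepal.Width": 3.505011, "Species": "setosa"},
    {"Sepal.Length": 4.906041, "Sepal.Width": 2.996428, "Species": "setosa"},
    {"Sepal.Length": 4.682389, "Sepal.Width": 3.185782, "Species": "setosa"},
]

spec = PlotSpec(
    data=flowers,
    aesthetics={"x": "Sepal.Length", "y": "Sepal.Width", "colour": "Species"},
)


def mapped_values(spec, aesthetic):
    """Look up the data values that one aesthetic is mapped to."""
    variable = spec.aesthetics[aesthetic]
    return [row[variable] for row in spec.data]


# Replacing a single element keeps the structure of the "sentence" intact
# while changing what the graphic says.
boxplot_spec = replace(spec, geometries=["boxplot"])
```

Changing only the `geometries` slot mirrors replacing one word in a sentence: the data and the aesthetic mappings are untouched, but the resulting graphic communicates something different.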
This concept will be explored throughout the rest of the workshop. We will make use of the structure depicted in figure @ref(fig:GrammarGraphics), which lists some attributes that we have already encountered in the workshop.
28.2 Case Study: The Iris Dataset
Understanding the data layer is crucial to building plots efficiently using the grammar of graphics framework. Consider the following example using the classic Iris data set.
The data set contains observations for 150 flowers, 50 from each of three species of Iris: Iris setosa, Iris versicolor and Iris virginica. Four variables were measured on each flower: the petal length and width, and the sepal length and width. Sepals are typically the green outer layer of a flower, but in Iris they are a colourful part of the flower.
In the form presented in table XX, the measurements of the three Iris species can be plotted on a single plotting space. This is practical for making comparisons among a small number of sub-sets.
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
---|---|---|---|---|
5.099182 | 3.505011 | 1.411859 | 0.1909784 | setosa |
4.906041 | 2.996428 | 1.382701 | 0.2169218 | setosa |
4.682389 | 3.185782 | 1.281282 | 0.2083006 | setosa |
Species | key | value |
---|---|---|
setosa | Sepal Length | 5.1 |
setosa | Sepal Length | 4.9 |
setosa | Sepal Length | 4.7 |
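The key/value layout above can be derived from the wide layout by stacking every measured column into a single `value` column and recording its origin in `key`. The following is a minimal sketch in plain Python under stated assumptions: the `to_long` function is a hypothetical helper written for this illustration, only the three rows shown in the wide table are used, and values are rounded to one decimal place to match the key/value table.

```python
# First three rows of the wide table.
wide = [
    {"Sepal.Length": 5.099182, "Sepal.Width": 3.505011,
     "Petal.Length": 1.411859, "Petal.Width": 0.1909784, "Species": "setosa"},
    {"Sepal.Length": 4.906041, "Sepal.Width": 2.996428,
     "Petal.Length": 1.382701, "Petal.Width": 0.2169218, "Species": "setosa"},
    {"Sepal.Length": 4.682389, "Sepal.Width": 3.185782,
     "Petal.Length": 1.281282, "Petal.Width": 0.2083006, "Species": "setosa"},
]


def to_long(rows, id_column="Species"):
    """Stack every measured column into key/value pairs (hypothetical helper)."""
    measured = [name for name in rows[0] if name != id_column]
    long_rows = []
    for name in measured:              # one block of rows per measured variable
        for row in rows:
            long_rows.append({
                id_column: row[id_column],
                "key": name.replace(".", " "),    # "Sepal.Length" -> "Sepal Length"
                "value": round(row[name], 1),     # rounded to match the table
            })
    return long_rows


long_rows = to_long(wide)
```

Each flower now contributes four rows instead of one, so the three wide rows become twelve long rows. In this form a single variable, `key`, identifies what was measured, which makes it available for mapping onto an aesthetic or a facet.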
Species | Part | Length | Width |
---|---|---|---|
setosa | Petal | 1.411859 | 0.1909784 |
setosa | Petal | 1.382701 | 0.2169218 |
setosa | Petal | 1.281282 | 0.2083006 |
Species | Part | Length | Width |
---|---|---|---|
setosa | Petal | 1.4 | 0.2 |
setosa | Petal | 1.4 | 0.2 |
setosa | Petal | 1.3 | 0.2 |
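The Species/Part/Length/Width layout treats the flower part as a variable in its own right, so that length and width can each be mapped to an axis. A minimal sketch of that rearrangement in plain Python follows; `split_by_part` is a hypothetical helper, and the input is again limited to the three rows shown in the wide table, left unrounded.

```python
# First three rows of the wide table.
wide = [
    {"Sepal.Length": 5.099182, "Sepal.Width": 3.505011,
     "Petal.Length": 1.411859, "Petal.Width": 0.1909784, "Species": "setosa"},
    {"Sepal.Length": 4.906041, "Sepal.Width": 2.996428,
     "Petal.Length": 1.382701, "Petal.Width": 0.2169218, "Species": "setosa"},
    {"Sepal.Length": 4.682389, "Sepal.Width": 3.185782,
     "Petal.Length": 1.281282, "Petal.Width": 0.2083006, "Species": "setosa"},
]


def split_by_part(rows):
    """Return one row per flower and part, with Length and Width as columns."""
    by_part = []
    for part in ("Petal", "Sepal"):
        for row in rows:
            by_part.append({
                "Species": row["Species"],
                "Part": part,
                "Length": row[f"{part}.Length"],   # e.g. "Petal.Length"
                "Width": row[f"{part}.Width"],     # e.g. "Petal.Width"
            })
    return by_part


by_part = split_by_part(wide)
```

Each flower contributes two rows here, one for its petals and one for its sepals.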
Species | Part | Length | Width | PartSpec |
---|---|---|---|---|
setosa | Petal | 1.411859 | 0.1909784 | setosa.Petal |
setosa | Petal | 1.382701 | 0.2169218 | setosa.Petal |
setosa | Petal | 1.281282 | 0.2083006 | setosa.Petal |
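The `PartSpec` column above combines two existing variables into a single grouping variable. As a rough sketch in plain Python, assuming the label is simply the species and the part joined by a dot (which is what the table shows), it can be added like this; the name `with_group` is introduced only for this illustration.

```python
# Rows in the Species/Part/Length/Width layout, copied from the table.
by_part = [
    {"Species": "setosa", "Part": "Petal", "Length": 1.411859, "Width": 0.1909784},
    {"Species": "setosa", "Part": "Petal", "Length": 1.382701, "Width": 0.2169218},
    {"Species": "setosa", "Part": "Petal", "Length": 1.281282, "Width": 0.2083006},
]

# Add a combined label so that each species/part pair forms one group.
with_group = [
    {**row, "PartSpec": f'{row["Species"]}.{row["Part"]}'}
    for row in by_part
]
```

A combined variable of this kind gives every species and part combination its own level, so a single aesthetic can distinguish all of the groups at once.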
Species | Measure | Petal | Sepal |
---|---|---|---|
setosa | Length | 1.411859 | 5.099182 |
setosa | Length | 1.382701 | 4.906041 |
setosa | Length | 1.281282 | 4.682389 |
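The Species/Measure/Petal/Sepal layout is the complementary arrangement: the kind of measurement becomes a variable, and the two flower parts become columns. A minimal sketch in plain Python, with `split_by_measure` as a hypothetical helper and the same three rows from the wide table as input:

```python
# First three rows of the wide table.
wide = [
    {"Sepal.Length": 5.099182, "Sepal.Width": 3.505011,
     "Petal.Length": 1.411859, "Petal.Width": 0.1909784, "Species": "setosa"},
    {"Sepal.Length": 4.906041, "Sepal.Width": 2.996428,
     "Petal.Length": 1.382701, "Petal.Width": 0.2169218, "Species": "setosa"},
    {"Sepal.Length": 4.682389, "Sepal.Width": 3.185782,
     "Petal.Length": 1.281282, "Petal.Width": 0.2083006, "Species": "setosa"},
]


def split_by_measure(rows):
    """Return one row per flower and measure, with Petal and Sepal as columns."""
    by_measure = []
    for measure in ("Length", "Width"):
        for row in rows:
            by_measure.append({
                "Species": row["Species"],
                "Measure": measure,
                "Petal": row[f"Petal.{measure}"],   # e.g. "Petal.Length"
                "Sepal": row[f"Sepal.{measure}"],   # e.g. "Sepal.Length"
            })
    return by_measure


by_measure = split_by_measure(wide)
```

Which arrangement is appropriate depends on the comparison the plot should make: this one places petal and sepal values side by side, so they can be mapped to the two axes of a single plot.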
Further examples
Species | Measure | Petal | Sepal | MeasSpec |
---|---|---|---|---|
setosa | Length | 1.411859 | 5.099182 | setosa.Length |
setosa | Length | 1.382701 | 4.906041 | setosa.Length |
setosa | Length | 1.281282 | 4.682389 | setosa.Length |
¹ To highlight the relationship between language and visual grammar, I have colour-coded each element. This is for illustrative purposes only and should not be dwelt on too literally.