Data Visualization and Exploration
A grammar
expresses the fundamental principles or rules of an art or science.
provides structural insight into complicated graphics.
makes available more flexibility and expressiveness in creation of graphics.
provides a consistent framework and guidelines to think about graphics.
constitutes boundaries by principled rules rather than an API.
data
aesthetic mapping
geometric objects
scales
statistical transformations
position adjustments
facet specification
coordinate system
Data is the most fundamental part: all other components depend on it.
In our discussion, we assume we are dealing with tidy data:
Variables
Observations
Values
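As a minimal sketch of what "tidy" means here (each variable a column, each observation a row, each value a cell), the example below reshapes a small untidy table. The table `wide` and its numbers are hypothetical and only serve as illustration; `pivot_longer()` is from the tidyr package.

```r
library(tibble)
library(tidyr)

# Hypothetical untidy table: one column per year, so the variable "year"
# is hidden in the column names instead of having its own column.
wide <- tibble(
  category = c("shoes", "computers"),
  `2006`   = c(100, 1000),
  `2007`   = c(70, 900)
)

# Tidy form: category, year and price are each a column,
# and every row is one observation (one category in one year).
tidy <- pivot_longer(wide, cols = c(`2006`, `2007`),
                     names_to = "year", values_to = "price")
tidy
```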
An aesthetic is a visual property of the objects in your plot.
Examples:
Position on the x, y plane
Colour
Shape
Size
A geom is the geometrical object that a plot uses to represent data.
Examples:
Points
Lines
Bars
Polygons
The aesthetic mapping
associates variables in the data
with visual properties of geometric objects.
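A small sketch of an aesthetic mapping in ggplot2 may help. It assumes the four-row example table used in this section, rebuilt here as a plain data frame named `scale_example`; the choice of which variable goes to which aesthetic is only an illustration.

```r
library(ggplot2)

# The small example table from this section, typed in by hand.
scale_example <- data.frame(
  idx      = 1:4,
  category = c("shoes", "shoes", "computers", "trousers"),
  price    = c(100, 70, 1000, 80)
)

# aes() pairs each aesthetic (left) with a variable in the data (right);
# geom_point() is the geometric object that receives these properties.
p <- ggplot(scale_example, aes(x = idx, y = price, colour = category)) +
  geom_point(size = 3)
p
```

Changing the mapping, for example `colour = category` to `shape = category`, changes the visual property that encodes the variable while the data stay the same.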
A scale controls the mapping from data values to aesthetic values.
idx | category | price |
---|---|---|
1 | shoes | 100 |
2 | shoes | 70 |
3 | computers | 1000 |
4 | trousers | 80 |
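library(tidyverse)   # tibble, dplyr, ggplot2
library(colorspace)  # qualitative_hcl(), scale_fill_discrete_qualitative()
library(gridExtra)   # grid.arrange()
# scale_example rebuilt from the table above (its definition is not shown)
scale_example <- tibble(
  idx      = 1:4,
  category = c("shoes", "shoes", "computers", "trousers"),
  price    = c(100, 70, 1000, 80)
)
# Top panel: the three palette colours, drawn as tiles
palette <- tibble(color = qualitative_hcl(3)) %>%
  mutate(x=rank(color), y=0)
p1 <- ggplot(palette, aes(fill=color, x=x, y=y)) +
  geom_tile(width=.9) +
  scale_fill_identity() +
  labs(title="scale") +
  theme_void()
# Bottom panel: one tile per row of the table, filled by category
p2 <- scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=idx, fill=category, y=yidx)) +
  geom_tile(color='black')
grid.arrange(p1, p2)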
# Same plot, now with the discrete qualitative fill scale from colorspace
palette <- tibble(color = qualitative_hcl(3)) %>%
  mutate(x=rank(color), y=0)
p1 <- ggplot(palette, aes(fill=color, x=x, y=y)) +
  geom_tile(width=.9) +
  scale_fill_identity() +
  labs(title="scale") +
  theme_void()
p2 <- scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=idx, fill=category, y=yidx)) +
  geom_tile(color='black') +
  scale_fill_discrete_qualitative()
grid.arrange(p1, p2)
# Same plot, with theme_void() removing axes and background from the bottom panel
palette <- tibble(color = qualitative_hcl(3)) %>%
  mutate(x=rank(color), y=0)
p1 <- ggplot(palette, aes(fill=color, x=x, y=y)) +
  geom_tile(width=.9) +
  scale_fill_identity() +
  labs(title="scale") +
  theme_void()
p2 <- scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=idx, fill=category, y=yidx)) +
  geom_tile(color='black') +
  scale_fill_discrete_qualitative() +
  theme_void()
grid.arrange(p1, p2)
# Same plot, with the x-axis labels (idx) switched back on
palette <- tibble(color = qualitative_hcl(3)) %>%
  mutate(x=rank(color), y=0)
p1 <- ggplot(palette, aes(fill=color, x=x, y=y)) +
  geom_tile(width=.9) +
  scale_fill_identity() +
  labs(title="scale") +
  theme_void()
p2 <- scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=idx, fill=category, y=yidx)) +
  geom_tile(color='black') +
  scale_fill_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text())
grid.arrange(p1, p2)
library(ggrepel)  # geom_label_repel()
# Position scale: price is mapped to x
scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=price, y=yidx)) +
  geom_point(color='black') +
  geom_label_repel(aes(label=idx)) +
  scale_y_continuous(limits = c(0, 0.01)) +
  scale_x_continuous(breaks = c(0,250,500,750,1000)) +
  scale_fill_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text(),
        axis.line.x.bottom = element_line())
# Map category to colour; the fixed color="black" in geom_point() still overrides it for the points
scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=price, color=category, y=yidx)) +
  geom_point(color="black") +
  geom_label_repel(aes(label=idx)) +
  scale_y_continuous(limits = c(0, 0.01)) +
  scale_x_continuous(breaks = c(0,250,500,750,1000)) +
  scale_fill_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text(),
        axis.line.x.bottom = element_line())
# Drop the fixed colour so that the points follow the colour mapping
scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=price, color=category, y=yidx)) +
  geom_point() +
  geom_label_repel(aes(label=idx)) +
  scale_y_continuous(limits = c(0, 0.01)) +
  scale_x_continuous(breaks = c(0,250,500,750,1000)) +
  scale_fill_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text(),
        axis.line.x.bottom = element_line())
# Move the legend to the top
scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=price, color=category, y=yidx)) +
  geom_point() +
  geom_label_repel(aes(label=idx)) +
  scale_y_continuous(limits = c(0, 0.01)) +
  scale_x_continuous(breaks = c(0,250,500,750,1000)) +
  scale_fill_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text(),
        axis.line.x.bottom = element_line(),
        legend.position = 'top')
# Use the colour scale instead of the fill scale, since category is mapped to colour
scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=price, color=category, y=yidx)) +
  geom_point() +
  geom_label_repel(aes(label=idx)) +
  scale_y_continuous(limits = c(0, 0.01)) +
  scale_x_continuous(breaks = c(0,250,500,750,1000)) +
  scale_color_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text(),
        axis.line.x.bottom = element_line(),
        legend.position = 'top')
# Switch the x axis to a log10 scale
# (0 is dropped from the breaks: it has no position on a log scale)
scale_example %>%
  mutate(yidx = 0) %>%
  ggplot(aes(x=price, color=category, y=yidx)) +
  geom_point() +
  geom_label_repel(aes(label=idx)) +
  scale_y_continuous(limits = c(0, 0.01)) +
  scale_x_log10(breaks = c(250,500,750,1000)) +
  scale_color_discrete_qualitative() +
  theme_void() +
  theme(axis.text.x = element_text(),
        axis.line.x.bottom = element_line(),
        legend.position = 'top')
Transforms the data, typically by summarizing it.
Examples:
Adjustment of the position of graphical objects to avoid overplotting.
Examples:
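A layer is the combination of data, aesthetic mapping, geometric object, statistical transformation, and position adjustment.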
You can stack several layers on top of each other.
Create multiple plots with the same layers, each on a different subset of data
Maps the position of objects onto the plane of the plot.
aes
geom_*
scale_*
stat_*
facet_*
coord_*
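To see how these function families fit together, here is a hedged sketch that uses one function from each family in a single plot. It assumes the `gapminder` package for data; the particular choices (a linear smoother, faceting by continent, the y limits) are illustrative and not prescribed by the grammar.

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +  # data + aes
  geom_point(alpha = 0.3) +                                # geom_*
  stat_smooth(method = "lm", formula = y ~ x) +            # stat_*
  scale_x_log10() +                                        # scale_*
  facet_wrap(~ continent) +                                # facet_*
  coord_cartesian(ylim = c(20, 90))                        # coord_*
p
```

Each `+` adds one component; removing any single line still leaves a valid plot, because every component has a default.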
In the following we will use the gapminder dataset.
# A tibble: 1,704 × 6
country continent year lifeExp pop gdpPercap
<fct> <fct> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.8 8425333 779.
2 Afghanistan Asia 1957 30.3 9240934 821.
3 Afghanistan Asia 1962 32.0 10267083 853.
4 Afghanistan Asia 1967 34.0 11537966 836.
5 Afghanistan Asia 1972 36.1 13079460 740.
6 Afghanistan Asia 1977 38.4 14880372 786.
7 Afghanistan Asia 1982 39.9 12881816 978.
8 Afghanistan Asia 1987 40.8 13867957 852.
9 Afghanistan Asia 1992 41.7 16317921 649.
10 Afghanistan Asia 1997 41.8 22227415 635.
# ℹ 1,694 more rows
Why is the scale outside the layer definition?
library(gapminder)  # provides the gapminder data frame
# Two point layers, one per year, sharing a single log10 x scale
ggplot() +
  layer(
    data = filter(gapminder, year == 1952),
    mapping = aes(x=gdpPercap, y=lifeExp,
                  color=factor(year)),
    geom = 'point',
    stat = 'identity',
    position = 'identity'
  ) +
  layer(
    data = filter(gapminder, year == 2007),
    mapping = aes(x=gdpPercap, y=lifeExp,
                  color=factor(year)),
    geom = 'point',
    stat = 'identity',
    position = 'identity'
  ) +
  scale_x_log10()
ggplot() +
  layer(
    data = gapminder,
    mapping = aes(x=gdpPercap, y=lifeExp),
    geom = 'point', stat = 'identity',
    position = 'identity'
  ) +
  layer(
    data = gapminder,
    mapping = aes(x=gdpPercap, y=lifeExp),
    geom = 'line', stat = 'smooth',
    position = 'identity',
    params = list(
      method = 'gam',
      color = 'blue',
      size = 1
    )
  ) + scale_x_log10()
Oftentimes data and aesthetic mapping are shared across all layers.
In such cases, we can provide the “default” data in the ggplot function.
Using specialized functions like geom_* or stat_*, we can use the default values in all the other components of a layer.
Each geom has a default stat, and each stat has a default geom.
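As a sketch of these defaults, the points-plus-smoother plot can be written without explicit `layer()` calls. The name `p_short` is only for illustration, the `gapminder` package is assumed for data, and the explicit `formula` argument is an assumption made to avoid the message about the default smoothing formula.

```r
library(ggplot2)
library(gapminder)

# Data and mapping are given once in ggplot() and inherited by both layers.
p_short <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  # geom_point() fills in stat = "identity" and position = "identity"
  geom_point() +
  # stat_smooth() would draw its default geom; here it is overridden with "line"
  stat_smooth(geom = "line", method = "gam", formula = y ~ s(x, bs = "cs"),
              colour = "blue") +
  scale_x_log10()
p_short
```

Only the components that differ from the defaults have to be spelled out; everything else is filled in by the `geom_*` or `stat_*` function.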
# A tibble: 1,704 × 6
# Groups: continent, year [60]
country continent year lifeExp pop gdpPercap
<fct> <fct> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.8 8425333 779.
2 Afghanistan Asia 1957 30.3 9240934 821.
3 Afghanistan Asia 1962 32.0 10267083 853.
4 Afghanistan Asia 1967 34.0 11537966 836.
5 Afghanistan Asia 1972 36.1 13079460 740.
6 Afghanistan Asia 1977 38.4 14880372 786.
7 Afghanistan Asia 1982 39.9 12881816 978.
8 Afghanistan Asia 1987 40.8 13867957 852.
9 Afghanistan Asia 1992 41.7 16317921 649.
10 Afghanistan Asia 1997 41.8 22227415 635.
# ℹ 1,694 more rows
# A tibble: 60 × 3
# Groups: continent [5]
continent year pop
<fct> <int> <dbl>
1 Africa 1952 237640501
2 Africa 1957 264837738
3 Africa 1962 296516865
4 Africa 1967 335289489
5 Africa 1972 379879541
6 Africa 1977 433061021
7 Africa 1982 499348587
8 Africa 1987 574834110
9 Africa 1992 659081517
10 Africa 1997 743832984
# ℹ 50 more rows
gapminder %>%
  drop_na(pop) %>%
  group_by(continent, year) %>%
  # as.numeric() avoids integer overflow when summing populations
  summarise(pop = sum(as.numeric(pop))) %>%
  filter(year == 2007)
# A tibble: 5 × 3
# Groups: continent [5]
continent year pop
<fct> <int> <dbl>
1 Africa 2007 929539692
2 Americas 2007 898871184
3 Asia 2007 3811953827
4 Europe 2007 586098529
5 Oceania 2007 24549947
# A tibble: 1,704 × 6
# Groups: year, continent [60]
country continent year lifeExp pop gdpPercap
<fct> <fct> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.8 8425333 779.
2 Afghanistan Asia 1957 30.3 9240934 821.
3 Afghanistan Asia 1962 32.0 10267083 853.
4 Afghanistan Asia 1967 34.0 11537966 836.
5 Afghanistan Asia 1972 36.1 13079460 740.
6 Afghanistan Asia 1977 38.4 14880372 786.
7 Afghanistan Asia 1982 39.9 12881816 978.
8 Afghanistan Asia 1987 40.8 13867957 852.
9 Afghanistan Asia 1992 41.7 16317921 649.
10 Afghanistan Asia 1997 41.8 22227415 635.
# ℹ 1,694 more rows
gapminder %>%
  drop_na(gdpPercap) %>%
  group_by(year, continent) %>%
  summarise(
    ymax = max(gdpPercap),
    ymin = min(gdpPercap)
  )
# A tibble: 60 × 4
# Groups: year [12]
year continent ymax ymin
<int> <fct> <dbl> <dbl>
1 1952 Africa 4725. 299.
2 1952 Americas 13990. 1398.
3 1952 Asia 108382. 331
4 1952 Europe 14734. 974.
5 1952 Oceania 10557. 10040.
6 1957 Africa 5487. 336.
7 1957 Americas 14847. 1544.
8 1957 Asia 113523. 350
9 1957 Europe 17909. 1354.
10 1957 Oceania 12247. 10950.
# ℹ 50 more rows
gapminder %>%
  drop_na(gdpPercap) %>%
  group_by(year, continent) %>%
  summarise(
    ymax = max(gdpPercap),
    ymin = min(gdpPercap)
  ) %>%
  filter(continent == "Europe")
# A tibble: 12 × 4
# Groups: year [12]
year continent ymax ymin
<int> <fct> <dbl> <dbl>
1 1952 Europe 14734. 974.
2 1957 Europe 17909. 1354.
3 1962 Europe 20431. 1710.
4 1967 Europe 22966. 2172.
5 1972 Europe 27195. 2860.
6 1977 Europe 26982. 3528.
7 1982 Europe 28398. 3631.
8 1987 Europe 31541. 3739.
9 1992 Europe 33966. 2497.
10 1997 Europe 41283. 3193.
11 2002 Europe 44684. 4604.
12 2007 Europe 49357. 5937.
# A tibble: 1,704 × 6
# Groups: continent, year [60]
country continent year lifeExp pop gdpPercap
<fct> <fct> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.8 8425333 779.
2 Afghanistan Asia 1957 30.3 9240934 821.
3 Afghanistan Asia 1962 32.0 10267083 853.
4 Afghanistan Asia 1967 34.0 11537966 836.
5 Afghanistan Asia 1972 36.1 13079460 740.
6 Afghanistan Asia 1977 38.4 14880372 786.
7 Afghanistan Asia 1982 39.9 12881816 978.
8 Afghanistan Asia 1987 40.8 13867957 852.
9 Afghanistan Asia 1992 41.7 16317921 649.
10 Afghanistan Asia 1997 41.8 22227415 635.
# ℹ 1,694 more rows
gapminder %>%
  drop_na(pop) %>%
  group_by(continent, year) %>%
  summarise(minGdp = min(gdpPercap),
            maxGdp = max(gdpPercap))
# A tibble: 60 × 4
# Groups: continent [5]
continent year minGdp maxGdp
<fct> <int> <dbl> <dbl>
1 Africa 1952 299. 4725.
2 Africa 1957 336. 5487.
3 Africa 1962 355. 6757.
4 Africa 1967 413. 18773.
5 Africa 1972 464. 21011.
6 Africa 1977 502. 21951.
7 Africa 1982 462. 17364.
8 Africa 1987 390. 11864.
9 Africa 1992 411. 13522.
10 Africa 1997 312. 14723.
# ℹ 50 more rows
gapminder %>%
  drop_na(pop) %>%
  group_by(continent, year) %>%
  summarise(minGdp = min(gdpPercap),
            maxGdp = max(gdpPercap)) %>%
  filter(year == 2007)
# A tibble: 5 × 4
# Groups: continent [5]
continent year minGdp maxGdp
<fct> <int> <dbl> <dbl>
1 Africa 2007 278. 13206.
2 Americas 2007 1202. 42952.
3 Asia 2007 944 47307.
4 Europe 2007 5937. 49357.
5 Oceania 2007 25185. 34435.
gapminder %>%
  drop_na(pop) %>%
  group_by(continent, year) %>%
  summarise(minGdp = min(gdpPercap),
            maxGdp = max(gdpPercap)) %>%
  filter(year == 2007) %>%
  ggplot(aes(x=continent, y=minGdp,
             xend=continent, yend=maxGdp,
             color=continent)) +
  geom_segment() +
  geom_point(mapping = aes(y=minGdp),
             size=3) +
  geom_point(mapping = aes(y=maxGdp),
             size=3)
Sometimes it is easier to express a layer in terms of a statistical transformation.
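As an illustration, the minimum-to-maximum GDP plot can be sketched without summarising the data by hand, letting `stat_summary()` compute the range inside the plot. This is one possible formulation rather than the only one; the name `p_range` and the choice of `geom = "linerange"` are assumptions made for this example.

```r
library(ggplot2)
library(dplyr)
library(gapminder)

p_range <- gapminder %>%
  filter(year == 2007) %>%
  ggplot(aes(x = continent, y = gdpPercap, colour = continent)) +
  # ymin and ymax are computed per continent by the stat
  stat_summary(geom = "linerange", fun.min = min, fun.max = max) +
  # one point at the minimum and one at the maximum
  stat_summary(geom = "point", fun = min, size = 3) +
  stat_summary(geom = "point", fun = max, size = 3)
p_range
```

The raw data go into the plot unchanged; the summarising step that was previously a dplyr pipeline now lives in the layer definition.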
Usually a stat introduces new variables that can be mapped to aesthetics.
To know which ones, look at the help pages.
For instance, stat_summary introduces
ymin
ymax
y (overwrites the original y)
group aesthetic
By default, the group is set to the interaction of all discrete variables in the plot.
For most applications you can simply specify the grouping with various aesthetics,
that is, colour, shape, fill, linetype, as well as with facets.
Let’s first assign the ribbon plot we created before to a variable:
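europe_gdp <- gapminder %>%
  drop_na(gdpPercap) %>%
  filter(continent == "Europe")
ribbon_plot <- ggplot(
  data=europe_gdp,
  aes(x=year,
      y=gdpPercap,
      fill=continent)) +
  # No summary function given: stat_summary() falls back to mean_se()
  stat_summary(geom='ribbon',
               alpha=0.7) +
  # Wider, lighter ribbon from the yearly minimum to the yearly maximum
  stat_summary(geom='ribbon',
               fun.max = max,
               fun.min = min,
               alpha=0.2) +
  # Line through the yearly mean
  geom_line(stat='summary')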
Let’s add a line for each country.
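One way to draw a line per country is to set the group aesthetic explicitly in the line layer. The sketch below rebuilds a simplified version of the European ribbon so that it runs on its own; the name `country_lines` and the alpha values are illustrative assumptions.

```r
library(ggplot2)
library(dplyr)
library(gapminder)

europe_gdp <- gapminder %>% filter(continent == "Europe")

# fill = continent is the only discrete aesthetic, so by default
# all European countries would fall into a single group.
country_lines <- ggplot(europe_gdp,
                        aes(x = year, y = gdpPercap, fill = continent)) +
  stat_summary(geom = "ribbon", fun.min = min, fun.max = max, alpha = 0.2) +
  # group = country overrides the default grouping for this layer only
  geom_line(aes(group = country), alpha = 0.4)
country_lines
```

Without `group = country`, `geom_line()` would connect all observations of all countries in one zigzag path, because they belong to the same default group.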
Sometimes you need to adjust the position of the plot elements.
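A common case is overplotting on a discrete axis. The sketch below uses jittering as one example of a position adjustment; the dataset subset `gap_2007`, the jitter width, and the seed are assumptions chosen for illustration.

```r
library(ggplot2)
library(dplyr)
library(gapminder)

gap_2007 <- filter(gapminder, year == 2007)

p_jitter <- ggplot(gap_2007, aes(x = continent, y = lifeExp)) +
  # Without the adjustment, all points of a continent share one x position.
  # position_jitter() shifts each point horizontally by up to 0.2;
  # height = 0 leaves the y values (the data of interest) untouched.
  geom_point(position = position_jitter(width = 0.2, height = 0, seed = 1),
             alpha = 0.6)
p_jitter
```

Other adjustments, such as `position_dodge()` or `position_stack()`, follow the same pattern: they are passed through the `position` argument of a layer.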
Scales are functions that map from data values to aesthetic values.
Examples:
Scales are usually linear, but not necessarily.
In some cases we can apply a non-linear transformation to improve readability.
[1] "10" "100" "1 000" "10 000"
[5] "100 000" "1 000 000" "10 000 000" "100 000 000"
[9] "1 000 000 000" "10 000 000 000"
Linear scale (default):
Logarithmic scale:
In a logarithmic scale, equal multiples are equally spaced: the step from 10 to 100 is as wide as the step from 100 to 1 000.
We can use it to display data that is spread unevenly over a very wide range.
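The equal spacing of multiples can be checked with a few lines of base R. This is only a numerical illustration of the statement above, using powers of ten as example values.

```r
# Powers of ten: each value is 10 times the previous one.
x <- 10^(1:5)

# On a linear scale, positions are the values themselves,
# so the gap between neighbours grows tenfold at every step.
linear_gaps <- diff(x)
linear_gaps

# On a log10 scale, positions are log10(x),
# so the gaps between neighbours are all the same.
log_gaps <- diff(log10(x))
log_gaps
```

This is why points that pile up near zero on a linear axis spread out on a logarithmic one.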
Data Visualization and Exploration - Layered Grammar of Graphics - ozan-k.com