Data Visualization and Exploration
ggplot2 is a tidyverse library for plotting.
It builds on top of a “grammar of graphics”.
This makes building plots modular.
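As a minimal sketch of what "modular" means here, a plot can be assembled one piece at a time: data and mappings first, then one layer after another. The example assumes the gapminder package provides the data and uses a linear fit purely for illustration; the object names are hypothetical.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

# 1. Data and aesthetic mappings: which columns go on which axis.
base <- ggplot(data = gapminder,
               mapping = aes(x = gdpPercap, y = lifeExp))

# 2. Add a geom layer: one point per row of the data.
with_points <- base + geom_point()

# 3. Add a second layer that reuses the same mappings.
#    method = "lm" is an illustrative choice, not the one used in the slides.
with_smooth <- with_points + geom_smooth(method = "lm", formula = y ~ x)

# Printing the object renders the plot.
with_smooth
```

Each intermediate object is a complete plot specification, so any of them can be printed, extended, or reused.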
# A tibble: 1,704 × 6
country continent year lifeExp pop gdpPercap
<fct> <fct> <int> <dbl> <int> <dbl>
1 Afghanistan Asia 1952 28.8 8425333 779.
2 Afghanistan Asia 1957 30.3 9240934 821.
3 Afghanistan Asia 1962 32.0 10267083 853.
4 Afghanistan Asia 1967 34.0 11537966 836.
5 Afghanistan Asia 1972 36.1 13079460 740.
6 Afghanistan Asia 1977 38.4 14880372 786.
7 Afghanistan Asia 1982 39.9 12881816 978.
8 Afghanistan Asia 1987 40.8 13867957 852.
9 Afghanistan Asia 1992 41.7 16317921 649.
10 Afghanistan Asia 1997 41.8 22227415 635.
# ℹ 1,694 more rows
What happens if we swap the order of two geoms?
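One way to explore this question is to build both orderings and compare them side by side. ggplot2 draws layers in the order they are added, so the layer added last ends up on top. The sketch below assumes the gapminder package and uses a linear fit for simplicity; the object names are hypothetical.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp))

# Points first, smoother second: the line is drawn over the points.
points_then_smooth <- p +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Smoother first, points second: the points are drawn over the line.
smooth_then_points <- p +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point()

points_then_smooth
smooth_then_points
```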
ggplot applies the scale transformations before fitting the model line.
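A rough way to check this claim is to compare a fit made through a log scale with a fit on a variable that was log-transformed by hand. The sketch assumes the gapminder package and uses a linear model (rather than the GAM used elsewhere in these slides) so the comparison stays simple; `layer_data()` extracts the values computed for a layer.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

# Scale-based version: raw GDP in the mapping, log10 applied by the scale.
p_scale <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  scale_x_log10()

# Manual version: the variable is transformed before it reaches ggplot.
p_manual <- ggplot(gapminder, aes(x = log10(gdpPercap), y = lifeExp)) +
  geom_smooth(method = "lm", formula = y ~ x)

# Fitted values computed by the smoothing layer in each plot.
fit_scale  <- layer_data(p_scale, 1)
fit_manual <- layer_data(p_manual, 1)

# If the scale is applied before the model is fitted, the two sets of
# fitted values should agree up to floating-point tolerance.
same_fit <- isTRUE(all.equal(fit_scale$y, fit_manual$y))
same_fit
```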
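library(ggplot2)
library(gapminder)

# Base plot: data plus the mappings shared by all layers
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))

# Points, a GAM smoother, a log10 x axis with dollar labels, and titles
p + geom_point() +
  geom_smooth(method = "gam") +
  scale_x_log10(labels = scales::dollar) +
  labs(
    x = "GDP per capita",
    y = "Life Expectancy in Years",
    title = "Economic growth and life expectancy",
    caption = "Source: Gapminder."
  )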
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))
p + geom_point() +
  geom_smooth(method = "gam") +
  scale_x_log10(labels = scales::dollar) +
  labs(
    x = "GDP per capita",
    y = "Life Expectancy in Years",
    title = "Economic growth and life expectancy",
    caption = "Source: Gapminder."
  ) +
  theme_bw()
Look again at this picture: can we do better?
Looking at the dataset, which information are we ignoring?
That’s quite a mess!
What is happening here?
Maybe in this case it’s better to have a global smoothing line.
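One possible way to get a single global smoothing line while still coloring the points by continent is to move the color mapping out of `ggplot()` and into `geom_point()` only. The sketch below assumes the gapminder package; the object names are hypothetical.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

# Only x and y are mapped globally, so every layer sees one group.
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))

p_global <- p +
  # Color is mapped for the points only.
  geom_point(mapping = aes(color = continent)) +
  # The smoother inherits no color mapping, so it fits one line to all rows.
  geom_smooth(method = "gam") +
  scale_x_log10(labels = scales::dollar)

p_global
```

Mappings set in `ggplot()` are inherited by every layer, while mappings set inside a geom apply to that layer alone.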
What happens if you map year to color?
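A small sketch for trying this out, assuming the gapminder package. Because `year` is stored as an integer, ggplot2 treats it as a continuous variable and uses a color gradient; wrapping it in `factor()` is one way to get one discrete color per year instead.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

# year as a number: continuous color gradient with a color bar legend.
p_continuous <- ggplot(gapminder,
                       aes(x = gdpPercap, y = lifeExp, color = year)) +
  geom_point() +
  scale_x_log10()

# year as a factor: one discrete color per distinct year.
p_discrete <- ggplot(gapminder,
                     aes(x = gdpPercap, y = lifeExp, color = factor(year))) +
  geom_point() +
  scale_x_log10()

p_continuous
p_discrete
```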
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp,
                          color = continent,
                          fill = continent))
p + geom_point(color = "gray") +
  geom_smooth(method = "gam") +
  scale_x_log10(labels = scales::dollar) +
  labs(
    x = "GDP per capita",
    y = "Life Expectancy in Years",
    title = "Economic growth and life expectancy",
    caption = "Source: Gapminder."
  ) +
  theme_bw()
Look closely at the legend. How is it related to the geoms you use?
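To experiment with how each geom contributes to the legend, one option is the `show.legend` argument that every layer accepts. The sketch below assumes the gapminder package and excludes the point layer from the legend; comparing it with the version without `show.legend = FALSE` shows what that layer was adding to the legend keys.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp,
                          color = continent,
                          fill = continent))

p_legend <- p +
  # The gray points are kept in the plot but left out of the legend.
  geom_point(color = "gray", show.legend = FALSE) +
  # The smoother still uses color and fill, so it supplies the legend keys.
  geom_smooth(method = "gam") +
  scale_x_log10(labels = scales::dollar)

p_legend
```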
You can save your plots within Quarto documents.
You can export them to an external file.
You can use the execution options to set some parameters, either globally or locally on the R chunks.
See https://quarto.org/docs/computations/execution-options.html
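As an illustration of the local form, Quarto reads chunk options from `#|` comment lines at the top of a chunk. The sketch below shows the body of a hypothetical R chunk; the label and the option values are placeholders chosen for the example, and the gapminder package is assumed.

```r
#| label: gdp-lifeexp
#| echo: false
#| warning: false
#| fig-width: 6
#| fig-height: 4

library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

# The options above apply to this chunk only: the code is hidden,
# warnings are suppressed, and the figure is rendered at 6 x 4 inches.
p_chunk <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10()

p_chunk
```

Outside Quarto the `#|` lines are ordinary R comments, so the chunk body also runs unchanged in a plain R session.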
Save the last plot you rendered to a PNG file.
ggsave(filename = "myplot.png")
Save a specific plot object to a PDF file.
ggsave(filename = "myplot.pdf", plot = p)
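A slightly fuller sketch of saving a specific plot, assuming the gapminder package. `ggsave()` writes whatever object is passed as `plot`, so an object that holds only the `ggplot()` call, without layers, would be saved without points or lines; here the layers are stored in the object first. The file location and dimensions are illustrative choices.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data

# Store the complete plot, layers included, in one object.
full_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar)

# Write to a temporary directory so the example leaves no files behind.
out_file <- file.path(tempdir(), "myplot.png")

# The output format is inferred from the file extension.
ggsave(filename = out_file, plot = full_plot,
       width = 6, height = 4, units = "in", dpi = 300)
```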
Data Visualization and Exploration - Plotting basics - ozan-k.com