Data Visualization and Exploration
The retina of the eye has two kinds of receptors:
Rods: little role in the perception of colors.
Cones: concentrated around the visual axis.
Responsivity of human cone cells
Image from Ware (2008)
Showing small blue text on a black background is a bad idea. There is insufficient luminance contrast.
This effect is due to the low sensitivity of cones to blue wavelengths.
Showing small yellow text on a white background is a bad idea. There is insufficient luminance contrast.
Yellow wavelengths excite two different types of cones, making it almost as light as pure white.
The brain combines signals from different cones to build three channels: black-white (luminance), red-green, and yellow-blue.
When there is a strong positive or negative signal on one of the three channels, and a neutral one on the other two,
we have “special” colors.
In most languages, these six colors are identified as the basic ones.
[Brent Berlin and Paul Kay, 1969. Basic Color Terms: Their Universality and Evolution]
A considerable number of people are missing one or more color channels.
Most commonly, the missing channel is the red-green one.
When designing a color scale we need to take this into account in order to be inclusive.
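One way to check a color scale for inclusiveness is to simulate a red-green deficiency. The sketch below assumes the colorspace package is installed and uses its deutan() function; the two hex codes are illustrative values, not taken from the slides.

```r
library(colorspace)

# Two colors that differ mostly along the red-green channel
# (illustrative values, not from the slides).
cols <- c(red = "#E41A1C", green = "#4DAF4A")

# Simulate how they appear with a red-green deficiency (deuteranopia).
cols_deutan <- deutan(cols)

# Show the original and the simulated colors one row above the other.
swatchplot("Original" = cols, "Deuteranope" = cols_deutan)
```

If the two simulated swatches look alike, the pair is a poor choice for distinguishing categories.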
Image from Ware (2008)
The effect of contrast is distortion of a patch of color in a way that increases the difference between a color and its surroundings.
We talk about luminance contrast when it occurs on the black-white channel, and chromatic when it occurs on the other two channels.
This phenomenon is called simultaneous contrast, where the background interferes with our perception of a patch of color.
It can create problems when reading values from a graphic.
The luminance channel is more effective at conveying spatial details.
We perceive three-dimensional surfaces through changes of luminance, rather than through chromatic changes.
The more vivid a color, the more saturated it is said to be.
More saturated colors are those that have strong signals on one or both of the chromatic channels.
The maximum saturation for a given hue varies with luminance.
When colors are dark, the difference between cone signals on chromatic channels is smaller.
When colors are light, there is a reduction in saturation due to the color reproduction technology rather than perception.
Remember the discriminability issue? Here it is at play!
To work with colors, we need to agree on a representation. Such representations of colors are called color spaces.
\((red, green, blue)\)
In computer representations, each component goes from 0 to 255.
Why red, green and blue?
These are the primary colors with the widest gamut, that is, the set of all colors that can be obtained by combining the three primaries. A color is written as a triple or as a hexadecimal code, e.g., (235, 91, 52) or #eb5b34.
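The two notations can be converted into each other with base R. This is a minimal sketch using rgb() and col2rgb() from grDevices, applied to the example color above.

```r
# Convert an RGB triple (0-255 per component) to its hexadecimal code.
hex_code <- rgb(235, 91, 52, maxColorValue = 255)
print(hex_code)   # "#EB5B34"

# Convert back: col2rgb() returns a matrix with rows red, green, blue.
components <- col2rgb("#eb5b34")
print(components[, 1])
```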
RGB is computationally convenient.
However, it is a poor fit for how our eyes work: it is not perceptually accurate.
In a perceptually uniform color space,
colors with the same perceptual distance are at the same distance in the space.
HCL (Hue-Chroma-Luminance): color space models that are designed to accord with human perception of color.
Hue: what we intuitively think of as pure colors.
Chroma: the “colorness” or intensity of the color, from “vivid” to “muted”.
Luminance: intuitively, the brightness of the color, or the amount of black mixed into the color, from “dark” to “light”.
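The three coordinates can be explored one at a time with base R's hcl() function. The sketch below is only an illustration: the specific hue, chroma, and luminance values are arbitrary choices, and colors that fall outside the displayable range are adjusted by hcl() automatically.

```r
# Vary one HCL coordinate at a time, keeping the other two fixed.
hues       <- hcl(h = seq(0, 300, by = 60), c = 60, l = 65)   # hue changes
chromas    <- hcl(h = 260, c = seq(0, 80, by = 20), l = 65)   # muted to vivid
luminances <- hcl(h = 260, c = 40, l = seq(20, 90, by = 10))  # dark to light

# Draw each ramp as a row of colored tiles.
ramps <- list(Hue = hues, Chroma = chromas, Luminance = luminances)
op <- par(mfrow = c(3, 1), mar = c(1, 1, 2, 1))
for (name in names(ramps)) {
  cols <- ramps[[name]]
  image(matrix(seq_along(cols)), col = cols, axes = FALSE, main = name)
}
par(op)
```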
Designing a color space for color picking is about finding which tradeoffs to make. In particular, independent control of hue, lightness and chroma cannot be achieved in a color space that also maps sRGB to a simple geometrical shape.
Visit https://bottosson.github.io/misc/colorpicker
Visit https://www.hsluv.org/
When the bars touch, the dark areas seem darker and the light areas lighter.
Source: https://socviz.co/lookatdata.html#perception-and-data-visualization
Because of the effects above, it is best not to encode more than 3 to 5 levels using the Chroma or Luminance channel, if we want our readers to be able to distinguish the levels (discriminability).
Fortunately, almost all of the work has been done for us already.
Different color spaces have been defined and standardized in ways that account for uneven or nonlinear aspects of human color perception.
Our decisions about color will focus more on when and how it should be used.
A colormap specifies a mapping between colors and data values
Categorical
Ordered
Continuous vs. discrete
There are mainly two things to pay attention to.
Discriminability: we can distinguish only about 12 bins of color, so it is better to stick to at most 6.
Luminance contrast: we need our colored marks to “stand out” from the background.
Source: Munzner, ch.10, fig 10.8.
The rainbow colormap is often used to encode ordered data. Why is it confusing?
Monotonically increasing luminance colormap
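Whether a colormap has monotonically increasing luminance can be checked numerically. The sketch below uses only base R and assumes that the confusing colormap in question is the rainbow; lightness() and is_monotone() are hypothetical helper names introduced for this illustration.

```r
# CIE L* (lightness) of each color, via base R's convertColor().
lightness <- function(cols) {
  srgb <- t(col2rgb(cols)) / 255   # one row per color, values in [0, 1]
  convertColor(srgb, from = "sRGB", to = "Lab")[, 1]
}

rainbow_l <- lightness(rainbow(7))
viridis_l <- lightness(hcl.colors(7, "viridis"))

# Rainbow: lightness goes up and down along the scale.
print(round(rainbow_l))
# Viridis-like palette: designed to go from dark to light.
print(round(viridis_l))

# TRUE when lightness only ever increases along the colormap.
is_monotone <- function(l) all(diff(l) > 0)
print(is_monotone(rainbow_l))
```

With the rainbow, the order of the colors does not match the order of their lightness, so readers cannot rely on "darker means less, lighter means more".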
You don’t need to create your colormaps from scratch.
R supports a rich collection of colormaps ready for use.
The palettes in the RColorBrewer package
Bivariate colormaps can be used to encode two different variables with color: use with extreme care!
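library(tidyverse)   # tibble(), crossing(), mutate(), ggplot()
library(colorspace)  # hex(), mixcolor(), hex2RGB()

# One three-step palette per variable, from neutral to saturated
pal1 <- tibble(c1 = c("#f5f5f5", "#EE744B", "#9F1401"),
               dim1 = c("low", "mid", "high"))
pal2 <- tibble(c2 = c("#f5f5f5", "#878FD3", "#07489C"),
               dim2 = c("low", "mid", "high"))

crossing(pal1, pal2) %>%                  # all 9 combinations
  mutate(
    # mix each pair of colors in equal parts
    color = hex(mixcolor(0.5, hex2RGB(c1), hex2RGB(c2))),
    across(starts_with("dim"),
           ~ factor(.x, levels = c("low", "mid", "high"), ordered = TRUE))
  ) %>%
  ggplot(aes(x = dim1, y = dim2, fill = color)) +
  geom_tile() +
  scale_fill_identity() +                 # use the hex codes as they are
  theme_classic() +
  theme(axis.line = element_blank())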
Step by step: crossing the two palettes yields all nine combinations of colors.
pal1 <- tibble(c1 = c("#f5f5f5", "#EE744B", "#9F1401"),
               dim1 = c("low", "mid", "high"))
pal2 <- tibble(c2 = c("#f5f5f5", "#878FD3", "#07489C"),
               dim2 = c("low", "mid", "high"))
crossing(pal1, pal2)
# A tibble: 9 × 4
c1 dim1 c2 dim2
<chr> <chr> <chr> <chr>
1 #9F1401 high #07489C high
2 #9F1401 high #878FD3 mid
3 #9F1401 high #f5f5f5 low
4 #EE744B mid #07489C high
5 #EE744B mid #878FD3 mid
6 #EE744B mid #f5f5f5 low
7 #f5f5f5 low #07489C high
8 #f5f5f5 low #878FD3 mid
9 #f5f5f5 low #f5f5f5 low
Mixing each pair of colors in equal parts, and turning the two dimensions into ordered factors, gives the table behind the plot.
pal1 <- tibble(c1 = c("#f5f5f5", "#EE744B", "#9F1401"),
               dim1 = c("low", "mid", "high"))
pal2 <- tibble(c2 = c("#f5f5f5", "#878FD3", "#07489C"),
               dim2 = c("low", "mid", "high"))
crossing(pal1, pal2) %>%
  mutate(
    color = hex(mixcolor(0.5, hex2RGB(c1), hex2RGB(c2))),
    across(starts_with("dim"),
           ~ factor(.x, levels = c("low", "mid", "high"), ordered = TRUE))
  )
# A tibble: 9 × 5
c1 dim1 c2 dim2 color
<chr> <ord> <chr> <ord> <chr>
1 #9F1401 high #07489C high #532E4F
2 #9F1401 high #878FD3 mid #93526A
3 #9F1401 high #f5f5f5 low #CA857B
4 #EE744B mid #07489C high #7B5E74
5 #EE744B mid #878FD3 mid #BB828F
6 #EE744B mid #f5f5f5 low #F2B5A0
7 #f5f5f5 low #07489C high #7E9FC9
8 #f5f5f5 low #878FD3 mid #BEC2E4
9 #f5f5f5 low #f5f5f5 low #F5F5F5
In R you have access to pre-made colormaps through the following libraries, among others: RColorBrewer, viridis, and colorspace.
RColorBrewer
The display.brewer.all() command will plot all available palettes in the package.
There are three types of palettes available.
They can be selected by using the type parameter in the command above:
seq, div: for ordered data
qual: for categorical data
RColorBrewer
To select a particular palette, you can use brewer.pal(n, name), where you specify the number of colors you need and the name of the palette.
[Pro tip: visit http://colorbrewer2.org/]
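A short sketch of the RColorBrewer workflow, assuming the package is installed and that display.brewer.all() and brewer.pal() are the commands meant above. The palette names "Blues" and "Set2" are just examples.

```r
library(RColorBrewer)

# Plot only the sequential palettes (type can also be "div", "qual", or "all").
display.brewer.all(type = "seq")

# Five colors from the sequential "Blues" palette, as hex codes.
blues <- brewer.pal(n = 5, name = "Blues")
print(blues)

# Eight colors from the qualitative "Set2" palette, for categorical data.
set2 <- brewer.pal(n = 8, name = "Set2")
print(set2)
```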
Viridis
There are eight colormaps in viridis:
magma (A)
inferno (B)
plasma (C)
viridis (D)
cividis (E)
rocket (F)
mako (G)
turbo (H)
Viridis
All these palettes are ordered.
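A minimal sketch of how these colormaps can be requested, assuming the viridis package is installed. The option argument accepts either the letter or the name of the colormap.

```r
library(viridis)

# Six colors from the default viridis colormap (option "D").
default_cols <- viridis(6)
print(default_cols)

# The same colormap can be chosen by letter or by name.
magma_by_letter <- viridis(6, option = "A")
magma_by_name   <- viridis(6, option = "magma")
print(identical(magma_by_letter, magma_by_name))
```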
[Figures: the viridis, magma, inferno, plasma, cividis, rocket, mako, and turbo colormaps]
colorspace
You can list all the colormaps in the colorspace package using hcl_palettes().
This command returns a whole lot of results, divided into qualitative, sequential (single-hue and multi-hue), and diverging palettes.
colorspace
There are also specialized functions to get these palettes.
Note that some palettes come from the RColorBrewer and Viridis packages.
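A sketch of the listing command and of the specialized functions, assuming they are hcl_palettes(), qualitative_hcl(), sequential_hcl(), and diverging_hcl() from the colorspace package. The palette names below are examples picked from the package's list.

```r
library(colorspace)

# Overview of the available palettes, here restricted to one type.
print(hcl_palettes(type = "qualitative"))

# One function per palette type; each returns a vector of hex codes.
qual_cols <- qualitative_hcl(4, palette = "Dark 3")
seq_cols  <- sequential_hcl(5, palette = "Blues 3")
div_cols  <- diverging_hcl(7, palette = "Blue-Red")

# An HCL approximation of a palette that originates in another package.
viridis_like <- sequential_hcl(5, palette = "Viridis")
print(viridis_like)
```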
The colorspace package provides the swatchplot function to visualize palettes.
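For instance, several palettes can be compared in one call by passing a named list. This is a sketch with arbitrarily chosen palettes, assuming colorspace is installed.

```r
library(colorspace)

# A named list of palettes; each name becomes a row label in the plot.
pals <- list(
  Sequential  = sequential_hcl(7, palette = "Blues 3"),
  Diverging   = diverging_hcl(7, palette = "Blue-Red"),
  Qualitative = qualitative_hcl(7, palette = "Dark 3")
)

# Draw one row of color swatches per palette.
swatchplot(pals)
```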
You can use the hcl_wizard() function.
It requires shiny to be installed.
The contrast ratio, as defined by the World Wide Web Consortium, is a number quantifying the contrast between a color and its background. As a general guideline, it should be higher than 4 for text (WCAG level AA asks for at least 4.5 for normal-size text).
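The ratio can be computed by hand from the WCAG 2 definition of relative luminance. The sketch below uses only base R; relative_luminance() and wcag_contrast() are hypothetical helper names introduced here, not functions from a package.

```r
# Relative luminance of a color, following the WCAG 2 definition.
relative_luminance <- function(col) {
  srgb <- col2rgb(col)[, 1] / 255   # red, green, blue scaled to [0, 1]
  # Undo the sRGB gamma encoding to get linear light.
  lin <- ifelse(srgb <= 0.03928, srgb / 12.92, ((srgb + 0.055) / 1.055)^2.4)
  sum(c(0.2126, 0.7152, 0.0722) * lin)
}

# Contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 to 21.
wcag_contrast <- function(fg, bg) {
  l <- sort(c(relative_luminance(fg), relative_luminance(bg)), decreasing = TRUE)
  (l[1] + 0.05) / (l[2] + 0.05)
}

print(wcag_contrast("black", "white"))              # 21, the maximum
print(round(wcag_contrast("blue", "black"), 2))     # 2.44
print(round(wcag_contrast("yellow", "white"), 2))   # 1.07
```

The last two values put numbers on the earlier examples: pure blue on black and pure yellow on white both fall well below the guideline for text.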
ggplot2
Color scales in ggplot2 are set with functions named scale_<aes>_<type>(), where <aes> is either fill or color and <type> is one of
continuous: continuous color map
discrete: categorical color map
manual: manually specify colors
identity: use the values as color codes
[Examples: ggplot2 with the default, manual, viridis, RColorBrewer, and colorspace color scales]
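The variants listed above can be sketched as follows. This is not the code from the original slides: the datasets (faithfuld and mpg, both shipped with ggplot2), the palette names, and the manual colors are assumptions chosen for illustration, and the colorspace package is assumed to be installed.

```r
library(ggplot2)
library(colorspace)

# Base plot on a built-in dataset; fill encodes a continuous variable.
p <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster()

p_default    <- p                                           # ggplot2 default scale
p_viridis    <- p + scale_fill_viridis_c(option = "magma")  # viridis family
p_brewer     <- p + scale_fill_distiller(palette = "Blues") # RColorBrewer
p_colorspace <- p + scale_fill_continuous_sequential(palette = "Blues 3")

# Categorical data with manually specified colors, one per level of drv.
p_manual <- ggplot(mpg, aes(class, fill = drv)) +
  geom_bar() +
  scale_fill_manual(values = c("4" = "#1b9e77", f = "#d95f02", r = "#7570b3"))

print(p_viridis)
```

Swapping fill for color in the function names gives the corresponding scales for points and lines.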
Data Visualization and Exploration - Colors - ozan-k.com