How to Create a Histogram With ggplot in R
- Use geom_histogram to Create a Histogram With ggplot in R
- Use fill, colour and size Parameters to Modify the Histogram Visuals in R
- Use facet_wrap to Construct Multiple Histograms Grouped by Category in R
This article demonstrates how to create a histogram with ggplot in R.
Use geom_histogram to Create a Histogram With ggplot in R
A simple histogram is constructed with the geom_histogram function, which needs only one variable to draw the graph. In this case, we use the price column of the diamonds data set (shipped with ggplot2) and map it to the x-axis. Unless the user passes the bins or binwidth argument explicitly, geom_histogram falls back to a default of 30 bins and prints a message suggesting that a better value be picked with binwidth.
library(ggplot2)

# Map the price column to the x-axis; geom_histogram() bins it with the default settings
p1 <- ggplot(diamonds, aes(x = price)) +
  geom_histogram()
p1
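The default bin count is rarely the best choice for a given data set, so it can help to set the binning explicitly. The following is a minimal sketch, assuming the bins and binwidth arguments of geom_histogram; the values 50 and 500 and the names p_bins and p_width are arbitrary illustrations rather than recommendations.

```r
library(ggplot2)

# Request a fixed number of bins (50 is an arbitrary illustrative value)
p_bins <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(bins = 50)

# Alternatively, fix the width of each bin in data units (here, 500 price units)
p_width <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 500)

p_bins
p_width
```

Only one of the two arguments is needed per histogram: bins controls how many bars are drawn, while binwidth controls how much of the x-axis each bar covers.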
The following example expands the previous code snippet to specify the breakpoints on each axis using the scale_x_continuous and scale_y_continuous functions. The breaks parameter receives the values generated by the seq function, whose arguments follow the pattern (from, to, by). We also use the grid.arrange function from the gridExtra package to display the two graphs in one figure for visual comparison; with nrow = 2, they are stacked one above the other.
library(ggplot2)
library(gridExtra)

# p1: default axis breaks
p1 <- ggplot(diamonds, aes(x = price)) +
  geom_histogram()

# p2: explicit axis breaks generated with seq(from, to, by)
p2 <- ggplot(diamonds, aes(x = price)) +
  geom_histogram() +
  scale_y_continuous(breaks = seq(1000, 14000, 2000)) +
  scale_x_continuous(breaks = seq(0, 18000, 2000))

# Arrange both plots in two rows of a single figure
grid.arrange(p1, p2, nrow = 2)
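To see exactly which tick positions the seq calls above produce, the vectors can be inspected on their own. This short sketch uses only base R; the variable names x_breaks and y_breaks are illustrative.

```r
# seq(from, to, by): start at `from`, step by `by`, and stop before exceeding `to`
x_breaks <- seq(from = 0, to = 18000, by = 2000)
y_breaks <- seq(from = 1000, to = 14000, by = 2000)

print(x_breaks)  # 0, 2000, ..., 18000 (10 values)
print(y_breaks)  # 1000, 3000, ..., 13000 (7 values; 14000 is not reached)
```

Note that the last y-axis break is 13000 rather than 14000, because stepping by 2000 from 1000 never lands on the upper limit.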
Use fill, colour and size Parameters to Modify the Histogram Visuals in R
Common parameters such as fill, colour and size can be used to change the appearance of the histogram bins. The fill parameter specifies the color that fills the bins, whereas colour sets the color of the bin outlines. size takes a numeric value that sets the width of the bin outlines; note that ggplot2 3.4.0 and later deprecate size for line widths in favor of linewidth, so size still works there but raises a deprecation warning. The following code snippet also passes the name parameter to both axis scales to set the axis titles.
library(ggplot2)
library(gridExtra)

# p3: pink bars with brown outlines at the default outline width
p3 <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(fill = "pink", colour = "brown") +
  scale_y_continuous(breaks = seq(1000, 14000, 2000)) +
  scale_x_continuous(breaks = seq(0, 18000, 2000))

# p4: thinner outlines (size = .3) and axis titles set through name
p4 <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(fill = "pink", colour = "brown", size = .3) +
  scale_y_continuous(breaks = seq(1000, 14000, 2000), name = "Number of diamonds") +
  scale_x_continuous(breaks = seq(0, 18000, 2000), name = "Price")

grid.arrange(p3, p4, nrow = 2)
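On ggplot2 3.4.0 or later, the outline width is set with linewidth instead of size. The sketch below is one possible way to pick the argument based on the installed version; the helper names stroke_arg, hist_layer and p_stroke are hypothetical and not part of the original example.

```r
library(ggplot2)

# ggplot2 3.4.0 renamed the line-width argument from `size` to `linewidth`
stroke_arg <- if (utils::packageVersion("ggplot2") >= "3.4.0") {
  list(linewidth = 0.3)
} else {
  list(size = 0.3)
}

# Build the histogram layer with whichever argument the installed version expects
hist_layer <- do.call(
  geom_histogram,
  c(list(fill = "pink", colour = "brown"), stroke_arg)
)

p_stroke <- ggplot(diamonds, aes(x = price)) + hist_layer
p_stroke
```

If the code only needs to run on a recent ggplot2, writing geom_histogram(fill = "pink", colour = "brown", linewidth = .3) directly is simpler.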
Use facet_wrap to Construct Multiple Histograms Grouped by Category in R
The facet_wrap function can be used to draw multiple histograms, one per level of a grouping variable. The diamonds data set provides several categorical columns to choose from. Here, we use the cut column to display a separate price histogram for each cut quality. The theme function can also be combined with geom_histogram to specify custom formatting for graph elements.
library(ggplot2)

p5 <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(fill = "pink", colour = "brown", size = .3) +
  scale_y_continuous(name = "Number of diamonds") +
  scale_x_continuous(name = "Price") +
  # Draw one panel per level of the cut column
  facet_wrap(~cut) +
  theme(
    # The "Calibri" font must be available to the graphics device
    axis.title.x = element_text(
      size = rel(1.2), lineheight = .9,
      family = "Calibri", face = "bold", colour = "black"
    ),
    axis.title.y = element_text(
      size = rel(1.2), lineheight = .9,
      family = "Calibri", face = "bold", colour = "black"
    ),
    # Fill the whole plot background with yellow
    plot.background = element_rect(fill = "yellow")
  )
p5
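Because the cut categories contain very different numbers of diamonds, panels that share one y-axis can leave the smaller groups hard to read. The following is a minimal sketch, assuming the scales and ncol arguments of facet_wrap; the choice of "free_y", two columns and the name p_free is illustrative only.

```r
library(ggplot2)

p_free <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(bins = 30) +
  # One panel per cut level; each panel gets its own y-axis range
  facet_wrap(~cut, scales = "free_y", ncol = 2)
p_free
```

Free y-axes make the shape of each distribution easier to see, at the cost of making bar heights no longer directly comparable across panels.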
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.