Transportation change since COVID-19 in Canada
Published:
COVID Transportation Trends
The global pandemic has completely changed how people move around, even here in Canada. Data from Apple sheds light on these trends for driving, walking, and driving. This notebook will go over how to access and clean these data for further analysis and visualization.
First things first: let’s load in the packages we need.
#specify the packages of interest
packages = c("ggplot2","directlabels","reshape", "readxl", "tidytext", "dplyr", "data.table", "tidyr", "pracma", "stringr", "grid", "zoo")
#use this function to check if each package is on the local machine
#if a package is installed, it will be loaded, else installed then loaded.
package.check <- lapply(packages, FUN = function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE)
library(x, character.only = TRUE)
}
})
Loading in the data
Next we load in the data. Apple provides this data but only in csv format, meaning that everytime it updates (ie. everyday), we’d have to reload the data and run this script again. ActiveConclusion on github has made this data readily available with daily automatic updates.
To load it, I use the fread
command from data.table
and simply plug in the url:
# Read in Apple mobility data from Github
mobility <- fread("https://raw.githubusercontent.com/ActiveConclusion/COVID19_mobility/master/apple_reports/applemobilitytrends.csv", header = T)
head(mobility)
Cleaning the data
Now that we have it loaded, we need to clean it to get it in a format useful for us. For this I use the famous dplyr
package and filter region
for the Canadian cities listed in the dataset. Then I select which columns to keep and lastly pivot the table so that it’s in long format.
# Pipeline to process the data and pull out Canadian cities
mobility <- mobility %>%
filter(region %in% c("Calgary", "Edmonton", "Halifax", "Montreal", "Ottawa", "Toronto", "Vancouver")) %>%
select(-c(1,4:6)) %>% #Remove undeeded columns and update with new dates
pivot_longer(-c(region, transportation_type), names_to = "Date", values_to = "Index")
Some data cleaning
Next we have to clean the data up a bit. First is to get transportation_type
into sentence case and Date
into date format.
After that we need to generate a 7-day moving average of the data to plot a smooth curve. NOTE: This needs to be done by both region
and transportation_type
or else the function will simply go down the column and average out different modes of transportation and cities.
# Change data type of these two columns
mobility$transportation_type <- str_to_sentence(mobility$transportation_type)
mobility$Date <- as.Date(mobility$Date)
# Get 7-day moving average
setDT(mobility)[, "MA" := frollmean(Index, 7),
.SDcols = c("Index","transportation_type"),
by = list(region, transportation_type)]
Plotting
Next we plot our graph. I use the ggplot2
package; one geom_line
for the raw data and another for the moving average. Then I add some features to make it more readable: an intercept at 100 to show the baseline, a rectangle to show the week lockdowns began, and a dashed line to show Labour day.
From there I set my prefered themes and customizations and voila!
# Create figure
ggplot(mobility) +
geom_line(aes(x=Date, Index, group = 1), size=1, color="lightblue", alpha = .4, na.rm = T) +
geom_line(aes(x=Date, MA, group = 1), size=1, color="darkred", na.rm = T) +
geom_hline(yintercept = 100, linetype = "twodash", size = 1.2) +
geom_hline(yintercept = 0, linetype = "solid", size = 1.2) +
geom_vline(xintercept = as.Date("2020-09-07"), linetype = "dashed", color = "navyblue") +
geom_rect(aes(xmin = as.Date("2020-03-17"), xmax = as.Date("2020-03-22"), ymin = 1, ymax = Inf), alpha = .1, fill = "lightgrey") +
theme_void() +
facet_grid(transportation_type ~ region, scales = "free") +
labs(title = "Index (January 13, 2020 = 100)\n",
x = "Month") +
theme(plot.title = element_text(size = 16, face = "bold"),
strip.text = element_text(size = 22, vjust = .2),
axis.text=element_text(size=15),
strip.background = element_rect(colour = "white"),
strip.text.x = element_text(vjust = .9),
#axis.title=element_text(size=22, vjust = 2),
axis.title = element_blank(),
axis.ticks = element_blank(),
axis.text.x =element_text(colour="black", angle = 40),
axis.text.y=element_text(colour="black"),
panel.grid.minor = element_blank(),
panel.grid.major.y = element_line(colour = "grey"),
panel.spacing.x = unit(6, "mm"))
# Add some text
grid.text("Labour Day", x = .955, y = .9)
grid.text("Lockdowns \nbegin", x = .83, y = .9)
grid.text("7-day moving \naverage", x = .97, y = .81)