Code
# Set up packages and functions
library(tidyverse)
library(kableExtra)
library(htmlTable)
library(gt)
source("code/cummax_ignore_na.R")
source("code/cummin_ignore_na.R")
For monthly temperature data, this file shows:
# Set up packages and functions
library(tidyverse)
library(kableExtra)
library(htmlTable)
library(gt)
source("code/cummax_ignore_na.R")
source("code/cummin_ignore_na.R")
Three data sets were downloaded from the NOAA National Weather service NOWData tool.
All data sets are for Paso Robles, CA, from the start of data collection (“por”) to present (2025).
We requested the monthly maximums, monthly minimums, and monthly averages.
After clicking “Go”, a data table pops up. We copied it and pasted into Microsoft Excel, removed the summary rows included at the bottom, then saved as a csv.
This process yielded 3 csv files: paso_max.csv, paso_min.csv, and paso_avg.csv.
Read in, combine, and process data from the csv files obtained above.
## Read in Csv's for min, mean, max
<- read.csv("data/paso_min.csv", na.strings = "M")
min_data <- min_data %>%
min_data select(-Annual) %>%
pivot_longer( ## one row = one month
cols = Jan:Dec,
names_to ="month",
values_to = "min"
)
<- read.csv("data/paso_avg.csv", na.strings = "M")
mean_data
<- mean_data %>%
mean_data select(-Annual) %>%
pivot_longer( ## one row = one month
cols = Jan:Dec,
names_to ="month",
values_to = "mean"
)
<- read.csv("data/paso_max.csv", na.strings = "M")
max_data
<- max_data %>%
max_data select(-Annual) %>%
pivot_longer( ## one row = one month
cols = Jan:Dec,
names_to ="month",
values_to = "max"
)
## combine data sets and add date formatting
<- bind_cols(min_data,
monthly_weather 3],
mean_data[,3]) %>%
max_data[,mutate(date = ym(paste0(year, "-", month)))
## add seasonal variables
<- monthly_weather %>%
monthly_weather mutate(
season = case_when(
%in% c("Dec", "Jan", "Feb") ~ "Winter",
month %in% c("Mar", "Apr", "May") ~ "Spring",
month %in% c("June", "July", "Aug") ~ "Summer",
month %in% c("Sept", "Oct", "Nov") ~ "Fall"
month
),season_color = case_when(
%in% c("Dec", "Jan", "Feb") ~ "#2f77c3",
month %in% c("Mar", "Apr", "May") ~ "#61bf9a",
month %in% c("June", "July", "Aug") ~ "#f94994",
month %in% c("Sept", "Oct", "Nov") ~ "#eb9911"
month
), season_year = case_when(
== "Dec" ~ year + 1, # December belongs to *next* Jan/Feb
month .default = year
),season_label = paste(season, season_year)
%>%
) group_by(season, year) %>%
ungroup() %>%
mutate(
month = factor(month, levels = c("Dec", "Jan", "Feb", "Mar", "Apr", "May" , "June", "July", "Aug", "Sept", "Oct", "Nov") ))
<- monthly_weather %>%
monthly_weather group_by(season_label, season_year) %>%
mutate(
xmin = min(date),
xmax = max(date),
seas_avg = mean(mean),
seas_max = max(max),
seas_min = min(min),
season_color = unique(season_color),
season = unique(season)
%>%
) ungroup() %>%
mutate(season = factor(season, levels = c("Winter", "Spring", "Summer", "Fall")))
This gets us three monthly time series, which we can visualize.
## color palatte
<- c("max" = "red", "min" = "blue", "mean"="grey", "Winter" = "#2f77c3", "Spring" = "#61bf9a", "Summer" = "#f94994", "Fall" = "#eb9911")
plot_cols
%>%
monthly_weather ggplot(aes(x = date)) +
geom_line(aes(y = min, col = "min")) +
geom_line(aes(y = max, col = "max")) +
geom_line(aes(y = mean, col = "mean")) +
scale_color_manual(values = plot_cols)
We can clearly see that the maximum series is consistently above the average, which is consistently above the minimum, as we expect.
It’s hard to see the monthly variation on this long time scale, so we will zoom in to just 10 years of data starting after the missing data period at the beginning (after 1909):.
%>%
monthly_weather filter(between(year, 1910, 1919)) %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, col = "min")) +
geom_point(aes(y = min, col = "min"))+
geom_line(aes(y = max, col = "max")) +
geom_point(aes(y = max, col = "max"))+
geom_line(aes(y = mean, col = "mean")) +
geom_point(aes(y = mean, col = "mean"))+
scale_color_manual(values = plot_cols)
Each dot on the plot corresponds to one month in a given year. The annual seasonal patterns are defined by the apparent winter valleys and summer peaks.
We can add the information about each season by coloring the points in the above plot:
%>%
monthly_weather filter(between(year, 1910, 1919)) %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, col = "min")) +
geom_point(aes(y = min, col = season))+
geom_line(aes(y = max, col = "max")) +
geom_point(aes(y = max, col = season))+
geom_line(aes(y = mean, col = "mean")) +
geom_point(aes(y = mean, col = season))+
scale_color_manual(values = plot_cols)
Summer and winter are confirmed as the high and low points, and we see spring and fall on the appropriate “sides” of the distribution. We see these relative seasonal patterns in all three summary statistics, which makes sense: we expect the lowest high to be in winter, and the highest low to be in summer, and so on.
Since months and seasons both capture the structure of the annual variation, we can aggregate to the monthly or seasonal level to gain additional insight to the series as a whole (without having to zoom in).
We will visually (plots) and numerically (tables) examine the following statistics:
For the full history January 1900- May 2025.
<- monthly_weather %>%
summary_table_all summarise(
## mean
mean_Min = mean(min, na.rm = T),
mean_Mean = mean(mean, na.rm = T),
mean_Max = mean(max, na.rm = T),
## sd
sd_Min= sd(min, na.rm = T),
sd_Mean = sd(mean, na.rm = T),
sd_Max = sd(max, na.rm = T),
## max
max_Min = max(min, na.rm = T),
max_Mean = max(mean, na.rm = T),
max_Max = max(max, na.rm = T),
## min
min_Min = min(min, na.rm = T),
min_Mean = min(mean, na.rm = T),
min_Max = min(max, na.rm = T),
)
## chat help
<- c("min", "mean", "max")
facet_vars
# Step 1: Pivot your summary table
<- summary_table_all %>%
vline_data pivot_longer(cols = contains(c("Min_min", "Mean_mean", "Max_max")),
names_to = "name", values_to = "xintercept")
$name = facet_vars
vline_data
%>%
monthly_weather bind_cols(summary_table_all[rep(1, nrow(monthly_weather)), ]) %>%
select(-c(xmin, xmax)) %>%
pivot_longer(cols = contains(c("min", "mean", "max"))) %>%
filter(name %in% facet_vars) %>%
select(year, month, date, season, name, value) %>%
ggplot(aes(x = value, fill = name)) +
geom_histogram(bins = 30) +
geom_vline(
data = vline_data,
aes(xintercept = xintercept),
color = "black", linewidth = 2, linetype = "solid"
+
) geom_vline(
data = vline_data,
aes(xintercept = xintercept, color = name),
linewidth = 1, linetype = "solid"
+
) facet_grid(name~., scales = "fixed") +
theme_minimal() +
scale_fill_manual(values = plot_cols)+
scale_color_manual(values = plot_cols)
## chat help
<- c("min", "mean", "max")
facet_vars
# Step 1: Pivot your summary table
<- summary_table_all %>%
vline_data pivot_longer(cols = contains(c("min_Min", "Mean_mean", "Max_max")),
names_to = "name", values_to = "xintercept")
$name = facet_vars
vline_data
%>%
monthly_weather bind_cols(summary_table_all[rep(1, nrow(monthly_weather)), ]) %>%
select(-c(xmin, xmax)) %>%
pivot_longer(cols = contains(c("min", "mean", "max"))) %>%
filter(name %in% facet_vars) %>%
select(year, month, date, season, name, value) %>%
ggplot(aes(x = value, fill = name)) +
geom_boxplot() +
geom_vline(
data = vline_data,
aes(xintercept = xintercept),
color = "black", linewidth = 2, linetype = "solid"
+
) geom_vline(
data = vline_data,
aes(xintercept = xintercept, color = name),
linewidth = 1, linetype = "solid"
+
) facet_grid(name~., scales = "fixed") +
theme_minimal() +
scale_fill_manual(values = plot_cols)+
scale_color_manual(values = plot_cols)
%>%
summary_table_all select(-c(max_Min, min_Max, min_Mean, max_Mean))%>%
gt() %>%
tab_header(
title = "All-time Summaries"
%>%
) fmt_number(
columns = where(is.numeric),
decimals = 2
%>%
)cols_label(
min_Min = "Min",
mean_Min = "Mean",
sd_Min = "SD",
mean_Mean = "Mean",
sd_Mean = "SD",
max_Max = "Max",
mean_Max = "Mean",
sd_Max= "SD"
%>%
) tab_spanner(
label = "Monthly Minimums",
columns = c(min_Min, mean_Min, sd_Min)
%>%
) tab_spanner(
label = "Monthly Means",
columns = c(mean_Mean, sd_Mean)
%>%
) tab_spanner(
label = "Monthly Maximums",
columns = c(mean_Max, sd_Max, max_Max)
)
All-time Summaries | |||||||
---|---|---|---|---|---|---|---|
Monthly Means
|
Monthly Maximums
|
Monthly Minimums
|
|||||
Mean | SD | Mean | SD | Max | Min | Mean | SD |
59.02 | 9.22 | 90.69 | 13.27 | 117.00 | 0.00 | 32.13 | 8.52 |
= c("Dec", "Jan", "Feb", "Mar", "Apr", "May" , "June", "July", "Aug", "Sept", "Oct", "Nov")
month.abb <- monthly_weather %>%
summary_table mutate(month = factor(month, levels = month.abb)) %>%
group_by(month) %>%
summarise(
## mean
mean_Min = mean(min, na.rm = T),
mean_Mean = mean(mean, na.rm = T),
mean_Max = mean(max, na.rm = T),
## sd
sd_Min= sd(min, na.rm = T),
sd_Mean = sd(mean, na.rm = T),
sd_Max = sd(max, na.rm = T),
## max
max_Min = max(min, na.rm = T),
max_Mean = max(mean, na.rm = T),
max_Max = max(max, na.rm = T),
## min
min_Min = min(min, na.rm = T),
min_Mean = min(mean, na.rm = T),
min_Max = min(max, na.rm = T),
%>%
)arrange(month)
<- c(plot_cols, "min_Min" = "blue", "mean_Mean" = "grey", "max_Max" = "red")
plot_cols
<- summary_table %>%
agg_plotggplot(aes(x = month)) +
geom_line(aes(y = min_Min, col = "min_Min", group = 1), linewidth = .75) +
geom_line(aes(y = mean_Mean, col = "mean_Mean", group = 1), linewidth = .75) +
geom_line(aes(y = max_Max, col = "max_Max", group = 1), linewidth = .75) +
geom_point(aes(y = min_Min, col = "min_Min", group = 1), size = 3) +
geom_point(aes(y = mean_Mean, col = "mean_Mean", group = 1), size = 3) +
geom_point(aes(y = max_Max, col = "max_Max", group = 1), size = 3) +
scale_color_manual(values = plot_cols, name = "Legend") +
ylab("Temperature (F)")+ xlab("Month") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5)) + ylim(c(0,120))
agg_plot
+
agg_plot geom_point(data = monthly_weather, aes(x = month, y = min, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = month, y = mean, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = month, y = max, col = season), alpha = 0.1, position = position_jitter()) +
ylab("Temperature (F)")+ xlab("Month") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5)) +
ylim(c(0,120))
%>%
summary_table select(-c(max_Min, min_Max, min_Mean, max_Mean))%>%
gt() %>%
tab_header(
title = "All-time Summaries"
%>%
) fmt_number(
columns = where(is.numeric),
decimals = 2
%>%
)cols_label(
min_Min = "Min",
mean_Min = "Mean",
sd_Min = "SD",
mean_Mean = "Mean",
sd_Mean = "SD",
max_Max = "Max",
mean_Max = "Mean",
sd_Max= "SD"
%>%
) tab_spanner(
label = "Monthly Minimums",
columns = c(min_Min, mean_Min, sd_Min)
%>%
) tab_spanner(
label = "Monthly Means",
columns = c(mean_Mean, sd_Mean)
%>%
) tab_spanner(
label = "Monthly Maximums",
columns = c(mean_Max, sd_Max, max_Max)
)
All-time Summaries | ||||||||
---|---|---|---|---|---|---|---|---|
month |
Monthly Means
|
Monthly Maximums
|
Monthly Minimums
|
|||||
Mean | SD | Mean | SD | Max | Min | Mean | SD | |
Dec | 46.95 | 2.47 | 73.08 | 4.52 | 87.00 | 7.00 | 21.21 | 4.44 |
Jan | 47.05 | 2.71 | 72.75 | 4.51 | 83.00 | 0.00 | 21.74 | 5.42 |
Feb | 50.06 | 2.68 | 75.99 | 4.91 | 85.00 | 13.00 | 25.34 | 4.28 |
Mar | 53.06 | 2.79 | 81.32 | 5.29 | 91.00 | 20.00 | 29.23 | 3.69 |
Apr | 56.75 | 2.63 | 89.06 | 5.51 | 100.00 | 24.00 | 31.55 | 3.29 |
May | 61.84 | 2.53 | 96.31 | 4.90 | 110.00 | 30.00 | 35.67 | 3.34 |
June | 67.56 | 2.52 | 103.65 | 4.09 | 115.00 | 31.00 | 39.69 | 3.40 |
July | 71.77 | 2.36 | 106.84 | 3.44 | 115.00 | 36.00 | 43.12 | 3.33 |
Aug | 71.47 | 2.14 | 105.86 | 3.35 | 117.00 | 32.00 | 42.58 | 3.66 |
Sept | 68.33 | 2.48 | 103.47 | 4.54 | 115.00 | 28.00 | 39.08 | 3.78 |
Oct | 61.41 | 2.50 | 96.05 | 4.43 | 108.00 | 19.00 | 31.71 | 4.30 |
Nov | 52.72 | 2.39 | 84.25 | 4.96 | 95.00 | 14.00 | 24.86 | 4.47 |
<- monthly_weather %>%
summary_table_seasonal mutate(season = factor(season, levels = c("Winter", "Spring", "Summer", "Fall"))) %>%
mutate(month = factor(month, levels = month.abb)) %>%
group_by(season) %>%
summarise(
## mean
mean_Min = mean(min, na.rm = T),
mean_Mean = mean(mean, na.rm = T),
mean_Max = mean(max, na.rm = T),
## sd
sd_Min= sd(min, na.rm = T),
sd_Mean = sd(mean, na.rm = T),
sd_Max = sd(max, na.rm = T),
## max
max_Min = max(min, na.rm = T),
max_Mean = max(mean, na.rm = T),
max_Max = max(max, na.rm = T),
## min
min_Min = min(min, na.rm = T),
min_Mean = min(mean, na.rm = T),
min_Max = min(max, na.rm = T),
%>%
)arrange(season)
<- c(plot_cols, "min_Min" = "blue", "mean_Mean" = "grey", "max_Max" = "red")
plot_cols
%>%
summary_table_seasonal ggplot(aes(x = season)) +
geom_line(aes(y = min_Min, col = "min_Min", group = 1), linewidth = .75) +
geom_line(aes(y = mean_Mean, col = "Mean_mean", group = 1), linewidth = .75) +
geom_line(aes(y = max_Max, col = "max_Max", group = 1), linewidth = .75) +
geom_point(aes(y = min_Min, col = season, group = 1), size = 3) +
geom_point(aes(y = mean_Mean, col = season, group = 1), size = 3) +
geom_point(aes(y = max_Max, col = season, group = 1), size = 3) +
scale_color_manual(values = plot_cols, name = "Legend") +
ylab("Temperature (F)")+ xlab("Season") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5)) + ylim(c(0,120))
%>%
summary_table_seasonal ggplot(aes(x = season)) +
geom_line(aes(y = min_Min, col = "min_Min", group = 1), linewidth = .75) +
geom_line(aes(y = mean_Mean, col = "mean_Mean", group = 1), linewidth = .75) +
geom_line(aes(y = max_Max, col = "max_Max", group = 1), linewidth = .75) +
geom_point(aes(y = min_Min, col = season, group = 1), size = 3) +
geom_point(aes(y = mean_Mean, col = season, group = 1), size = 3) +
geom_point(aes(y = max_Max, col = season, group = 1), size = 3) +
scale_color_manual(values = plot_cols, name = "Legend") +
geom_point(data = monthly_weather, aes(x = season, y = min, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = season, y = mean, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = season, y = max, col = season), alpha = 0.1, position = position_jitter()) +
ylab("Temperature (F)")+ xlab("Year") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5))+ ylim(c(0, 120))
## chat help
<- c("min", "mean", "max")
facet_vars
# Step 1: Pivot your summary table
<- summary_table_seasonal%>%
vline_data group_by(season) %>%
pivot_longer(cols = contains(c("min_Min", "mean_Mean", "max_Max")),
names_to = "Statistic", values_to = "xintercept")
$Statistic = rep(facet_vars, times = 4)
vline_data
%>%
monthly_weather bind_cols(summary_table_seasonal[rep(1, nrow(monthly_weather)), ]) %>%
select(-c(xmin, xmax)) %>%
pivot_longer(cols = contains(c("min", "mean", "max")), names_to = "Statistic") %>%
filter(Statistic %in% facet_vars) %>%
select(year, month, date, season...7, Statistic, value) %>%
rename(season=season...7) %>%
ggplot(aes(x = value, fill = season)) +
geom_histogram(bins = 30) +
geom_vline(
data = vline_data,
aes(xintercept = xintercept, col = Statistic),
linewidth = 1, linetype = "dashed"
+
) facet_grid(Statistic~season, scales = "fixed") +
theme_minimal() +
scale_fill_manual(values = plot_cols) +
scale_color_manual(values = plot_cols)
%>%
summary_table_seasonal select(-c(max_Min, min_Max, min_Mean, max_Mean))%>%
gt() %>%
tab_header(
title = "All-time Summaries"
%>%
) fmt_number(
columns = where(is.numeric),
decimals = 2
%>%
)cols_label(
min_Min = "Min",
mean_Min = "Mean",
sd_Min = "SD",
mean_Mean = "Mean",
sd_Mean = "SD",
max_Max = "Max",
mean_Max = "Mean",
sd_Max= "SD"
%>%
) tab_spanner(
label = "Monthly Minimums",
columns = c(min_Min, mean_Min, sd_Min)
%>%
) tab_spanner(
label = "Monthly Means",
columns = c(mean_Mean, sd_Mean)
%>%
) tab_spanner(
label = "Monthly Maximums",
columns = c(mean_Max, sd_Max, max_Max)
)
All-time Summaries | ||||||||
---|---|---|---|---|---|---|---|---|
season |
Monthly Means
|
Monthly Maximums
|
Monthly Minimums
|
|||||
Mean | SD | Mean | SD | Max | Min | Mean | SD | |
Winter | 48.03 | 2.99 | 73.94 | 4.86 | 87.00 | 0.00 | 22.77 | 5.07 |
Spring | 57.19 | 4.48 | 88.89 | 8.05 | 110.00 | 20.00 | 32.15 | 4.35 |
Summer | 70.27 | 3.02 | 105.45 | 3.87 | 117.00 | 31.00 | 41.79 | 3.77 |
Fall | 60.82 | 6.87 | 94.62 | 9.17 | 115.00 | 14.00 | 31.90 | 7.16 |
When the seasonal max or min reaches a new high or low, indicate this with a true/false.
<- monthly_weather %>%
out group_by(season) %>%
arrange(date) %>%
mutate(
start_flag = year > 1977,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
%>%
) ungroup()
$new_max <- out$new_max
monthly_weather$new_min <- out$new_min monthly_weather
We can either set this to start from the beginning of the series, or start the “tracker” at a certain specific time. Since we will have three parts to our music and we want the exceedances to be part 3, we will start the tracker in 1977 (see Sonifcation Design page for reasoning behind choice of 1977).
<- monthly_weather %>%
out_all arrange(date) %>%
mutate(
start_flag = year > 1899,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
%>%
) ungroup()
$new_max_all_time <- out_all$new_max
monthly_weather$new_min_all_time <- out_all$new_min
monthly_weather
%>%
monthly_weather ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max_all_time),
aes(x = date, y = max, color = "max"), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min_all_time),
aes(x = date, y = min, color = "min"), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
<- monthly_weather %>%
out arrange(date) %>%
mutate(
start_flag = year > 1977,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
%>%
) ungroup()
$new_max <- out$new_max
monthly_weather$new_min <- out$new_min
monthly_weather
%>%
monthly_weather ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max),
aes(x = date, y = max, color = "max"), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min),
aes(x = date, y = min, color = "min"), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
Most of the all time highs occur in summer, and most all-time lows occur in winter. This is unsurprising, and basically a consequence of the statistics we are looking at and the properties of the maximum and the minimum.
Since our goal is to summarize data for sonification, we want to make sure the information does not get frontloaded all at the beginning. So, looking at seasonal exceedances gives us more chances to hear extremes.
Note that the unusually high fall temperatures are both for all time and if we start the “new extreme” timer in 1977.
<- monthly_weather %>%
out_all group_by(season)%>%
arrange(date) %>%
mutate(
start_flag = year > 1899,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
%>%
) ungroup()
$new_max_all_time <- out_all$new_max
monthly_weather$new_min_all_time <- out_all$new_min
monthly_weather
%>%
monthly_weather ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max_all_time),
aes(x = date, y = max, color = season), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min_all_time),
aes(x = date, y = min, color = season), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
<- monthly_weather %>%
out group_by(season)%>%
arrange(date) %>%
mutate(
start_flag = year > 1977,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
%>%
) ungroup()
$new_max <- out$new_max
monthly_weather$new_min <- out$new_min
monthly_weather
%>%
monthly_weather ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max),
aes(x = date, y = max, color = season), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min),
aes(x = date, y = min, color = season), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
As you will hear (and see) in the animation, there are occasionally missing data points, resulting in silence!
## code inspired by https://r-graph-gallery.com/79-levelplot-with-ggplot2.html
<- monthly_weather$date
x <- c("min", "avg", "max")
y <- expand.grid(X=x, Y=y)
data $Z <- c(ifelse(is.na(monthly_weather$min),"missing", "observed"),
dataifelse(is.na(monthly_weather$mean),"missing", "observed"),
ifelse(is.na(monthly_weather$max),"missing", "observed"))
<- nrow(monthly_weather)/12
n_year
<- seq(from = ymd(monthly_weather$date[1]),
x_breaks to = ymd(monthly_weather$date[1392]),
length.out = floor(n_year/10))
<- year(x_breaks)
x_labs
<- ymd("1910-01-01")
change_date_part_1 <- monthly_weather[which.max(monthly_weather$max),"date"] %>% pull()
change_date_part_2 <- ymd("1977-01-01")
change_date_part_3
%>% tibble() %>%
data mutate(X = as.Date(X, "%Y-%m-%d"))%>%
ggplot(aes(x= X, y=factor(Y), fill = Z)) +
geom_tile(height = 0.5) +
scale_x_date(breaks = x_breaks, labels = x_labs) +
scale_fill_manual(name = "",
values = c("missing" = "black", "observed" = "grey90")
+ ylab("Summary \nStatistic") + xlab("Year (one line per month)") +
) theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5))
Since we are looking at monthly level data, presumably, each data point represents the summary (average, max, min) of 28, 29, 30, or 31 data points. However, there are exceptions
May 2025 values of all 3 summary statistics only include at most 9 points, since the data set was sourced May 9, 2025
the NOWData tool only reports the monthly values, and does not indicate whether any of the daily (or lower level data that the daily values may be aggregated from) values were missing.
While busy, this is a visualization of the entire data set we will be working with for our sonification.
The full monthly history of the series is too much for one plot– this is why we want to animate and/or sonify it– to add a time dimension to our perceptualization and allowing us to “zoom in” as we did on the beginning of the series above.
ggplot(monthly_weather, aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = season)) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = season)) +
geom_line(aes(y = mean, color = "mean")) +
geom_point(aes(y = mean, color = season)) +
geom_point(data = filter(monthly_weather, new_max),
aes(x = date, y = max, color = season), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min),
aes(x = date, y = min, color = season), size = 6, shape = 15) +
scale_color_manual(values = plot_cols) +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
This file is read in at the top of the sonification design details.
save(monthly_weather, file = "data/monthly_weather")
save(plot_cols, file = "data/plot_cols")