Code
# Set up packages and functions
library(tidyverse)
library(kableExtra)
library(htmlTable)
library(gt)
source("code/cummax_ignore_na.R")
source("code/cummin_ignore_na.R")For monthly temperature data, this file shows:
# Set up packages and functions
library(tidyverse)
library(kableExtra)
library(htmlTable)
library(gt)
source("code/cummax_ignore_na.R")
source("code/cummin_ignore_na.R")Three data sets were downloaded from the NOAA National Weather service NOWData tool.
All data sets are for Paso Robles, CA, from the start of data collection (“por”) to present (2025).

We requested the monthly maximums, monthly minimums, and monthly averages.
After clicking “Go”, a data table pops up. We copied it and pasted into Microsoft Excel, removed the summary rows included at the bottom, then saved as a csv.
This process yielded 3 csv files: paso_max.csv, paso_min.csv, and paso_avg.csv.
Read in, combine, and process data from the csv files obtained above.
## Read in Csv's for min, mean, max
min_data <- read.csv("data/paso_min.csv", na.strings = "M")
min_data <- min_data %>%
select(-Annual) %>%
pivot_longer( ## one row = one month
cols = Jan:Dec,
names_to ="month",
values_to = "min"
)
mean_data <- read.csv("data/paso_avg.csv", na.strings = "M")
mean_data <- mean_data %>%
select(-Annual) %>%
pivot_longer( ## one row = one month
cols = Jan:Dec,
names_to ="month",
values_to = "mean"
)
max_data <- read.csv("data/paso_max.csv", na.strings = "M")
max_data <- max_data %>%
select(-Annual) %>%
pivot_longer( ## one row = one month
cols = Jan:Dec,
names_to ="month",
values_to = "max"
)
## combine data sets and add date formatting
monthly_weather <- bind_cols(min_data,
mean_data[,3],
max_data[,3]) %>%
mutate(date = ym(paste0(year, "-", month)))
## add seasonal variables
monthly_weather <- monthly_weather %>%
mutate(
season = case_when(
month %in% c("Dec", "Jan", "Feb") ~ "Winter",
month %in% c("Mar", "Apr", "May") ~ "Spring",
month %in% c("June", "July", "Aug") ~ "Summer",
month %in% c("Sept", "Oct", "Nov") ~ "Fall"
),
season_color = case_when(
month %in% c("Dec", "Jan", "Feb") ~ "#2f77c3",
month %in% c("Mar", "Apr", "May") ~ "#61bf9a",
month %in% c("June", "July", "Aug") ~ "#f94994",
month %in% c("Sept", "Oct", "Nov") ~ "#eb9911"
),
season_year = case_when(
month == "Dec" ~ year + 1, # December belongs to *next* Jan/Feb
.default = year
),
season_label = paste(season, season_year)
) %>%
group_by(season, year) %>%
ungroup() %>%
mutate(
month = factor(month, levels = c("Dec", "Jan", "Feb", "Mar", "Apr", "May" , "June", "July", "Aug", "Sept", "Oct", "Nov") ))
monthly_weather <- monthly_weather %>%
group_by(season_label, season_year) %>%
mutate(
xmin = min(date),
xmax = max(date),
seas_avg = mean(mean),
seas_max = max(max),
seas_min = min(min),
season_color = unique(season_color),
season = unique(season)
) %>%
ungroup() %>%
mutate(season = factor(season, levels = c("Winter", "Spring", "Summer", "Fall")))This gets us three monthly time series, which we can visualize.
## color palatte
plot_cols <- c("max" = "red", "min" = "blue", "mean"="grey", "Winter" = "#2f77c3", "Spring" = "#61bf9a", "Summer" = "#f94994", "Fall" = "#eb9911")
monthly_weather %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, col = "min")) +
geom_line(aes(y = max, col = "max")) +
geom_line(aes(y = mean, col = "mean")) +
scale_color_manual(values = plot_cols)
We can clearly see that the maximum series is consistently above the average, which is consistently above the minimum, as we expect.
It’s hard to see the monthly variation on this long time scale, so we will zoom in to just 10 years of data starting after the missing data period at the beginning (after 1909):.
monthly_weather %>%
filter(between(year, 1910, 1919)) %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, col = "min")) +
geom_point(aes(y = min, col = "min"))+
geom_line(aes(y = max, col = "max")) +
geom_point(aes(y = max, col = "max"))+
geom_line(aes(y = mean, col = "mean")) +
geom_point(aes(y = mean, col = "mean"))+
scale_color_manual(values = plot_cols)
Each dot on the plot corresponds to one month in a given year. The annual seasonal patterns are defined by the apparent winter valleys and summer peaks.
We can add the information about each season by coloring the points in the above plot:
monthly_weather %>%
filter(between(year, 1910, 1919)) %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, col = "min")) +
geom_point(aes(y = min, col = season))+
geom_line(aes(y = max, col = "max")) +
geom_point(aes(y = max, col = season))+
geom_line(aes(y = mean, col = "mean")) +
geom_point(aes(y = mean, col = season))+
scale_color_manual(values = plot_cols)
Summer and winter are confirmed as the high and low points, and we see spring and fall on the appropriate “sides” of the distribution. We see these relative seasonal patterns in all three summary statistics, which makes sense: we expect the lowest high to be in winter, and the highest low to be in summer, and so on.
Since months and seasons both capture the structure of the annual variation, we can aggregate to the monthly or seasonal level to gain additional insight to the series as a whole (without having to zoom in).
We will visually (plots) and numerically (tables) examine the following statistics:
For the full history January 1900- May 2025.
summary_table_all <- monthly_weather %>%
summarise(
## mean
mean_Min = mean(min, na.rm = T),
mean_Mean = mean(mean, na.rm = T),
mean_Max = mean(max, na.rm = T),
## sd
sd_Min= sd(min, na.rm = T),
sd_Mean = sd(mean, na.rm = T),
sd_Max = sd(max, na.rm = T),
## max
max_Min = max(min, na.rm = T),
max_Mean = max(mean, na.rm = T),
max_Max = max(max, na.rm = T),
## min
min_Min = min(min, na.rm = T),
min_Mean = min(mean, na.rm = T),
min_Max = min(max, na.rm = T),
)
## chat help
facet_vars <- c("min", "mean", "max")
# Step 1: Pivot your summary table
vline_data <- summary_table_all %>%
pivot_longer(cols = contains(c("Min_min", "Mean_mean", "Max_max")),
names_to = "name", values_to = "xintercept")
vline_data$name = facet_vars
monthly_weather %>%
bind_cols(summary_table_all[rep(1, nrow(monthly_weather)), ]) %>%
select(-c(xmin, xmax)) %>%
pivot_longer(cols = contains(c("min", "mean", "max"))) %>%
filter(name %in% facet_vars) %>%
select(year, month, date, season, name, value) %>%
ggplot(aes(x = value, fill = name)) +
geom_histogram(bins = 30) +
geom_vline(
data = vline_data,
aes(xintercept = xintercept),
color = "black", linewidth = 2, linetype = "solid"
) +
geom_vline(
data = vline_data,
aes(xintercept = xintercept, color = name),
linewidth = 1, linetype = "solid"
) +
facet_grid(name~., scales = "fixed") +
theme_minimal() +
scale_fill_manual(values = plot_cols)+
scale_color_manual(values = plot_cols)
## chat help
facet_vars <- c("min", "mean", "max")
# Step 1: Pivot your summary table
vline_data <- summary_table_all %>%
pivot_longer(cols = contains(c("min_Min", "Mean_mean", "Max_max")),
names_to = "name", values_to = "xintercept")
vline_data$name = facet_vars
monthly_weather %>%
bind_cols(summary_table_all[rep(1, nrow(monthly_weather)), ]) %>%
select(-c(xmin, xmax)) %>%
pivot_longer(cols = contains(c("min", "mean", "max"))) %>%
filter(name %in% facet_vars) %>%
select(year, month, date, season, name, value) %>%
ggplot(aes(x = value, fill = name)) +
geom_boxplot() +
geom_vline(
data = vline_data,
aes(xintercept = xintercept),
color = "black", linewidth = 2, linetype = "solid"
) +
geom_vline(
data = vline_data,
aes(xintercept = xintercept, color = name),
linewidth = 1, linetype = "solid"
) +
facet_grid(name~., scales = "fixed") +
theme_minimal() +
scale_fill_manual(values = plot_cols)+
scale_color_manual(values = plot_cols)
summary_table_all %>%
select(-c(max_Min, min_Max, min_Mean, max_Mean))%>%
gt() %>%
tab_header(
title = "All-time Summaries"
) %>%
fmt_number(
columns = where(is.numeric),
decimals = 2
)%>%
cols_label(
min_Min = "Min",
mean_Min = "Mean",
sd_Min = "SD",
mean_Mean = "Mean",
sd_Mean = "SD",
max_Max = "Max",
mean_Max = "Mean",
sd_Max= "SD"
) %>%
tab_spanner(
label = "Monthly Minimums",
columns = c(min_Min, mean_Min, sd_Min)
) %>%
tab_spanner(
label = "Monthly Means",
columns = c(mean_Mean, sd_Mean)
) %>%
tab_spanner(
label = "Monthly Maximums",
columns = c(mean_Max, sd_Max, max_Max)
)| All-time Summaries | |||||||
|---|---|---|---|---|---|---|---|
Monthly Means
|
Monthly Maximums
|
Monthly Minimums
|
|||||
| Mean | SD | Mean | SD | Max | Min | Mean | SD |
| 59.02 | 9.22 | 90.69 | 13.27 | 117.00 | 0.00 | 32.13 | 8.52 |
month.abb = c("Dec", "Jan", "Feb", "Mar", "Apr", "May" , "June", "July", "Aug", "Sept", "Oct", "Nov")
summary_table <- monthly_weather %>%
mutate(month = factor(month, levels = month.abb)) %>%
group_by(month) %>%
summarise(
## mean
mean_Min = mean(min, na.rm = T),
mean_Mean = mean(mean, na.rm = T),
mean_Max = mean(max, na.rm = T),
## sd
sd_Min= sd(min, na.rm = T),
sd_Mean = sd(mean, na.rm = T),
sd_Max = sd(max, na.rm = T),
## max
max_Min = max(min, na.rm = T),
max_Mean = max(mean, na.rm = T),
max_Max = max(max, na.rm = T),
## min
min_Min = min(min, na.rm = T),
min_Mean = min(mean, na.rm = T),
min_Max = min(max, na.rm = T),
)%>%
arrange(month)
plot_cols <- c(plot_cols, "min_Min" = "blue", "mean_Mean" = "grey", "max_Max" = "red")
agg_plot<- summary_table %>%
ggplot(aes(x = month)) +
geom_line(aes(y = min_Min, col = "min_Min", group = 1), linewidth = .75) +
geom_line(aes(y = mean_Mean, col = "mean_Mean", group = 1), linewidth = .75) +
geom_line(aes(y = max_Max, col = "max_Max", group = 1), linewidth = .75) +
geom_point(aes(y = min_Min, col = "min_Min", group = 1), size = 3) +
geom_point(aes(y = mean_Mean, col = "mean_Mean", group = 1), size = 3) +
geom_point(aes(y = max_Max, col = "max_Max", group = 1), size = 3) +
scale_color_manual(values = plot_cols, name = "Legend") +
ylab("Temperature (F)")+ xlab("Month") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5)) + ylim(c(0,120))
agg_plot
agg_plot +
geom_point(data = monthly_weather, aes(x = month, y = min, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = month, y = mean, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = month, y = max, col = season), alpha = 0.1, position = position_jitter()) +
ylab("Temperature (F)")+ xlab("Month") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5)) +
ylim(c(0,120))
summary_table %>%
select(-c(max_Min, min_Max, min_Mean, max_Mean))%>%
gt() %>%
tab_header(
title = "All-time Summaries"
) %>%
fmt_number(
columns = where(is.numeric),
decimals = 2
)%>%
cols_label(
min_Min = "Min",
mean_Min = "Mean",
sd_Min = "SD",
mean_Mean = "Mean",
sd_Mean = "SD",
max_Max = "Max",
mean_Max = "Mean",
sd_Max= "SD"
) %>%
tab_spanner(
label = "Monthly Minimums",
columns = c(min_Min, mean_Min, sd_Min)
) %>%
tab_spanner(
label = "Monthly Means",
columns = c(mean_Mean, sd_Mean)
) %>%
tab_spanner(
label = "Monthly Maximums",
columns = c(mean_Max, sd_Max, max_Max)
)| All-time Summaries | ||||||||
|---|---|---|---|---|---|---|---|---|
| month |
Monthly Means
|
Monthly Maximums
|
Monthly Minimums
|
|||||
| Mean | SD | Mean | SD | Max | Min | Mean | SD | |
| Dec | 46.95 | 2.47 | 73.08 | 4.52 | 87.00 | 7.00 | 21.21 | 4.44 |
| Jan | 47.05 | 2.71 | 72.75 | 4.51 | 83.00 | 0.00 | 21.74 | 5.42 |
| Feb | 50.06 | 2.68 | 75.99 | 4.91 | 85.00 | 13.00 | 25.34 | 4.28 |
| Mar | 53.06 | 2.79 | 81.32 | 5.29 | 91.00 | 20.00 | 29.23 | 3.69 |
| Apr | 56.75 | 2.63 | 89.06 | 5.51 | 100.00 | 24.00 | 31.55 | 3.29 |
| May | 61.84 | 2.53 | 96.31 | 4.90 | 110.00 | 30.00 | 35.67 | 3.34 |
| June | 67.56 | 2.52 | 103.65 | 4.09 | 115.00 | 31.00 | 39.69 | 3.40 |
| July | 71.77 | 2.36 | 106.84 | 3.44 | 115.00 | 36.00 | 43.12 | 3.33 |
| Aug | 71.47 | 2.14 | 105.86 | 3.35 | 117.00 | 32.00 | 42.58 | 3.66 |
| Sept | 68.33 | 2.48 | 103.47 | 4.54 | 115.00 | 28.00 | 39.08 | 3.78 |
| Oct | 61.41 | 2.50 | 96.05 | 4.43 | 108.00 | 19.00 | 31.71 | 4.30 |
| Nov | 52.72 | 2.39 | 84.25 | 4.96 | 95.00 | 14.00 | 24.86 | 4.47 |
summary_table_seasonal <- monthly_weather %>%
mutate(season = factor(season, levels = c("Winter", "Spring", "Summer", "Fall"))) %>%
mutate(month = factor(month, levels = month.abb)) %>%
group_by(season) %>%
summarise(
## mean
mean_Min = mean(min, na.rm = T),
mean_Mean = mean(mean, na.rm = T),
mean_Max = mean(max, na.rm = T),
## sd
sd_Min= sd(min, na.rm = T),
sd_Mean = sd(mean, na.rm = T),
sd_Max = sd(max, na.rm = T),
## max
max_Min = max(min, na.rm = T),
max_Mean = max(mean, na.rm = T),
max_Max = max(max, na.rm = T),
## min
min_Min = min(min, na.rm = T),
min_Mean = min(mean, na.rm = T),
min_Max = min(max, na.rm = T),
)%>%
arrange(season)
plot_cols <- c(plot_cols, "min_Min" = "blue", "mean_Mean" = "grey", "max_Max" = "red")
summary_table_seasonal %>%
ggplot(aes(x = season)) +
geom_line(aes(y = min_Min, col = "min_Min", group = 1), linewidth = .75) +
geom_line(aes(y = mean_Mean, col = "Mean_mean", group = 1), linewidth = .75) +
geom_line(aes(y = max_Max, col = "max_Max", group = 1), linewidth = .75) +
geom_point(aes(y = min_Min, col = season, group = 1), size = 3) +
geom_point(aes(y = mean_Mean, col = season, group = 1), size = 3) +
geom_point(aes(y = max_Max, col = season, group = 1), size = 3) +
scale_color_manual(values = plot_cols, name = "Legend") +
ylab("Temperature (F)")+ xlab("Season") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5)) + ylim(c(0,120))
summary_table_seasonal %>%
ggplot(aes(x = season)) +
geom_line(aes(y = min_Min, col = "min_Min", group = 1), linewidth = .75) +
geom_line(aes(y = mean_Mean, col = "mean_Mean", group = 1), linewidth = .75) +
geom_line(aes(y = max_Max, col = "max_Max", group = 1), linewidth = .75) +
geom_point(aes(y = min_Min, col = season, group = 1), size = 3) +
geom_point(aes(y = mean_Mean, col = season, group = 1), size = 3) +
geom_point(aes(y = max_Max, col = season, group = 1), size = 3) +
scale_color_manual(values = plot_cols, name = "Legend") +
geom_point(data = monthly_weather, aes(x = season, y = min, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = season, y = mean, col = season), alpha = 0.1, position = position_jitter()) +
geom_point(data = monthly_weather, aes(x = season, y = max, col = season), alpha = 0.1, position = position_jitter()) +
ylab("Temperature (F)")+ xlab("Year") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5))+ ylim(c(0, 120))
## chat help
facet_vars <- c("min", "mean", "max")
# Step 1: Pivot your summary table
vline_data <- summary_table_seasonal%>%
group_by(season) %>%
pivot_longer(cols = contains(c("min_Min", "mean_Mean", "max_Max")),
names_to = "Statistic", values_to = "xintercept")
vline_data$Statistic = rep(facet_vars, times = 4)
monthly_weather %>%
bind_cols(summary_table_seasonal[rep(1, nrow(monthly_weather)), ]) %>%
select(-c(xmin, xmax)) %>%
pivot_longer(cols = contains(c("min", "mean", "max")), names_to = "Statistic") %>%
filter(Statistic %in% facet_vars) %>%
select(year, month, date, season...7, Statistic, value) %>%
rename(season=season...7) %>%
ggplot(aes(x = value, fill = season)) +
geom_histogram(bins = 30) +
geom_vline(
data = vline_data,
aes(xintercept = xintercept, col = Statistic),
linewidth = 1, linetype = "dashed"
) +
facet_grid(Statistic~season, scales = "fixed") +
theme_minimal() +
scale_fill_manual(values = plot_cols) +
scale_color_manual(values = plot_cols)
summary_table_seasonal %>%
select(-c(max_Min, min_Max, min_Mean, max_Mean))%>%
gt() %>%
tab_header(
title = "All-time Summaries"
) %>%
fmt_number(
columns = where(is.numeric),
decimals = 2
)%>%
cols_label(
min_Min = "Min",
mean_Min = "Mean",
sd_Min = "SD",
mean_Mean = "Mean",
sd_Mean = "SD",
max_Max = "Max",
mean_Max = "Mean",
sd_Max= "SD"
) %>%
tab_spanner(
label = "Monthly Minimums",
columns = c(min_Min, mean_Min, sd_Min)
) %>%
tab_spanner(
label = "Monthly Means",
columns = c(mean_Mean, sd_Mean)
) %>%
tab_spanner(
label = "Monthly Maximums",
columns = c(mean_Max, sd_Max, max_Max)
)| All-time Summaries | ||||||||
|---|---|---|---|---|---|---|---|---|
| season |
Monthly Means
|
Monthly Maximums
|
Monthly Minimums
|
|||||
| Mean | SD | Mean | SD | Max | Min | Mean | SD | |
| Winter | 48.03 | 2.99 | 73.94 | 4.86 | 87.00 | 0.00 | 22.77 | 5.07 |
| Spring | 57.19 | 4.48 | 88.89 | 8.05 | 110.00 | 20.00 | 32.15 | 4.35 |
| Summer | 70.27 | 3.02 | 105.45 | 3.87 | 117.00 | 31.00 | 41.79 | 3.77 |
| Fall | 60.82 | 6.87 | 94.62 | 9.17 | 115.00 | 14.00 | 31.90 | 7.16 |
When the seasonal max or min reaches a new high or low, indicate this with a true/false.
out <- monthly_weather %>%
group_by(season) %>%
arrange(date) %>%
mutate(
start_flag = year > 1977,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
) %>%
ungroup()
monthly_weather$new_max <- out$new_max
monthly_weather$new_min <- out$new_minWe can either set this to start from the beginning of the series, or start the “tracker” at a certain specific time. Since we will have three parts to our music and we want the exceedances to be part 3, we will start the tracker in 1977 (see Sonifcation Design page for reasoning behind choice of 1977).
out_all <- monthly_weather %>%
arrange(date) %>%
mutate(
start_flag = year > 1899,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
) %>%
ungroup()
monthly_weather$new_max_all_time <- out_all$new_max
monthly_weather$new_min_all_time <- out_all$new_min
monthly_weather %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max_all_time),
aes(x = date, y = max, color = "max"), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min_all_time),
aes(x = date, y = min, color = "min"), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
out <- monthly_weather %>%
arrange(date) %>%
mutate(
start_flag = year > 1977,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
) %>%
ungroup()
monthly_weather$new_max <- out$new_max
monthly_weather$new_min <- out$new_min
monthly_weather %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max),
aes(x = date, y = max, color = "max"), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min),
aes(x = date, y = min, color = "min"), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
Most of the all time highs occur in summer, and most all-time lows occur in winter. This is unsurprising, and basically a consequence of the statistics we are looking at and the properties of the maximum and the minimum.
Since our goal is to summarize data for sonification, we want to make sure the information does not get frontloaded all at the beginning. So, looking at seasonal exceedances gives us more chances to hear extremes.
Note that the unusually high fall temperatures are both for all time and if we start the “new extreme” timer in 1977.
out_all <- monthly_weather %>%
group_by(season)%>%
arrange(date) %>%
mutate(
start_flag = year > 1899,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
) %>%
ungroup()
monthly_weather$new_max_all_time <- out_all$new_max
monthly_weather$new_min_all_time <- out_all$new_min
monthly_weather %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max_all_time),
aes(x = date, y = max, color = season), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min_all_time),
aes(x = date, y = min, color = season), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
out <- monthly_weather %>%
group_by(season)%>%
arrange(date) %>%
mutate(
start_flag = year > 1977,
cummax = cummax_ignore_na(max, start_flag),
cummin = cummin_ignore_na(min, start_flag),
new_max = cummax == max,
new_min = cummin == min
) %>%
ungroup()
monthly_weather$new_max <- out$new_max
monthly_weather$new_min <- out$new_min
monthly_weather %>%
ggplot(aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = "min"), alpha = 0.1) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = "max"), alpha = 0.1) +
geom_point(data = filter(monthly_weather, new_max),
aes(x = date, y = max, color = season), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min),
aes(x = date, y = min, color = season), size = 6, shape = 15) +
scale_color_manual(values = plot_cols, name = "Exceedance\nType") +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
As you will hear (and see) in the animation, there are occasionally missing data points, resulting in silence!
## code inspired by https://r-graph-gallery.com/79-levelplot-with-ggplot2.html
x <- monthly_weather$date
y <- c("min", "avg", "max")
data <- expand.grid(X=x, Y=y)
data$Z <- c(ifelse(is.na(monthly_weather$min),"missing", "observed"),
ifelse(is.na(monthly_weather$mean),"missing", "observed"),
ifelse(is.na(monthly_weather$max),"missing", "observed"))
n_year <- nrow(monthly_weather)/12
x_breaks <- seq(from = ymd(monthly_weather$date[1]),
to = ymd(monthly_weather$date[1392]),
length.out = floor(n_year/10))
x_labs <- year(x_breaks)
change_date_part_1 <- ymd("1910-01-01")
change_date_part_2 <- monthly_weather[which.max(monthly_weather$max),"date"] %>% pull()
change_date_part_3 <- ymd("1977-01-01")
data %>% tibble() %>%
mutate(X = as.Date(X, "%Y-%m-%d"))%>%
ggplot(aes(x= X, y=factor(Y), fill = Z)) +
geom_tile(height = 0.5) +
scale_x_date(breaks = x_breaks, labels = x_labs) +
scale_fill_manual(name = "",
values = c("missing" = "black", "observed" = "grey90")
) + ylab("Summary \nStatistic") + xlab("Year (one line per month)") +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_blank(),
axis.title.y = element_text(angle = 0, vjust = .5))
Since we are looking at monthly level data, presumably, each data point represents the summary (average, max, min) of 28, 29, 30, or 31 data points. However, there are exceptions
May 2025 values of all 3 summary statistics only include at most 9 points, since the data set was sourced May 9, 2025
the NOWData tool only reports the monthly values, and does not indicate whether any of the daily (or lower level data that the daily values may be aggregated from) values were missing.
While busy, this is a visualization of the entire data set we will be working with for our sonification.
The full monthly history of the series is too much for one plot– this is why we want to animate and/or sonify it– to add a time dimension to our perceptualization and allowing us to “zoom in” as we did on the beginning of the series above.
ggplot(monthly_weather, aes(x = date)) +
geom_line(aes(y = min, color = "min")) +
geom_point(aes(y = min, color = season)) +
geom_line(aes(y = max, color = "max")) +
geom_point(aes(y = max, color = season)) +
geom_line(aes(y = mean, color = "mean")) +
geom_point(aes(y = mean, color = season)) +
geom_point(data = filter(monthly_weather, new_max),
aes(x = date, y = max, color = season), size = 6, shape = 17) +
geom_point(data = filter(monthly_weather, new_min),
aes(x = date, y = min, color = season), size = 6, shape = 15) +
scale_color_manual(values = plot_cols) +
scale_x_date(labels = monthly_weather$season_year[monthly_weather$season == "Winter" & monthly_weather$season_year %%5 ==0],
breaks = monthly_weather$xmin[monthly_weather$season == "Winter"& monthly_weather$season_year %%5 ==0]) +
theme(axis.text.x = element_text(angle = 25))
This file is read in at the top of the sonification design details.
save(monthly_weather, file = "data/monthly_weather")
save(plot_cols, file = "data/plot_cols")