## Annual cycles

The seasonality is annual – let’s examine the annual cycle instead of the usual time course plot.

- Can you see the start and stop dates in the plot?
- Any other observations?

## Pooling sites

How would you estimate the annual cycle based on data at each site?

Daily average temperatures at 0.2m depth for 37 sites, 2017-2019.

## As an estimation problem

Modeling the trend can be formulated as estimating \(f\) in the model:

\[
Y_{i, t} = f(t) + \epsilon_{i, t}
\]

Where:

\(Y_{i, t}\) is the temperature at site \(i\) and time \(t\)

\(f(t)\) is the mean temperature at time \(t\)

\(\epsilon_{i, t}\) is a random error

But how do you estimate an arbitrary function?

## Basis functions

A **basis function** is an element of a basis for a function space.

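If \(\{f_j\}\) form a basis for a function space \(C\), then every function in \(C\) is a linear combination of the basis functions:

\[
f \in C \quad\Longleftrightarrow\quad f = \sum_j c_j f_j \ \text{ for some coefficients } c_j
\]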

A finite subset of basis functions can be used to approximate functions in the space:

\[
f \approx \sum_{j = 1}^J c_j f_j
\]
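As a small illustration of truncating a basis expansion, the sketch below uses the monomials \(f_j(t) = t^{j-1}\) with coefficients \(c_j = 1/(j-1)!\), which are the power-series coefficients of \(e^t\). The target function, the interval \([-1, 1]\), and the helper names are choices made here for illustration and do not come from the temperature data.

```python
import math


def basis_approx(t, n_terms):
    """Evaluate sum_{j=1}^{J} c_j * f_j(t) with f_j(t) = t**(j-1), c_j = 1/(j-1)!."""
    return sum(t ** (j - 1) / math.factorial(j - 1) for j in range(1, n_terms + 1))


def max_error(n_terms, grid_size=201):
    """Largest absolute gap between exp(t) and its J-term approximation on [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(math.exp(t) - basis_approx(t, n_terms)) for t in grid)


# Compare a few truncation levels J (illustrative choices)
errors = {J: max_error(J) for J in (2, 4, 8)}
for J, err in errors.items():
    print(f"J = {J}: max error = {err:.2e}")
```

The maximum error shrinks as \(J\) grows, which is the sense in which a finite subset of basis functions approximates \(f\).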

## Basis approximation

A nifty trick is to estimate \(f\) using a suitable basis approximation:

\[
Y_{i, t} = \beta_0 + \color{maroon}{\underbrace{\sum_{j = 1}^J \beta_j f_j(t)}_{\tilde{f}(t) \approx f(t)}} + \epsilon_{i, t}
\]

This model can be fit using standard linear regression. (Think of the \(f_j(t)\)’s as \(J\) ‘new’ predictors.)
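A minimal sketch of this idea with NumPy, assuming synthetic data and a small monomial basis chosen purely for illustration (the function `fit_basis_regression`, the basis, and the simulated values are hypothetical, not the course's data or code):

```python
import numpy as np


def fit_basis_regression(t, y, basis_functions):
    """Least-squares fit of y on an intercept plus the basis functions evaluated at t."""
    t = np.asarray(t, dtype=float)
    # Each basis function contributes one 'new' predictor column
    columns = [np.ones_like(t)] + [f(t) for f in basis_functions]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef, X


# Illustrative basis: f_1(t) = t, f_2(t) = t^2, f_3(t) = t^3
basis = [lambda t: t, lambda t: t ** 2, lambda t: t ** 3]

# Synthetic data: a known mean function plus random error
rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 200)
f_true = 2.0 + 3.0 * t - t ** 2
y = f_true + rng.normal(scale=0.1, size=t.size)

coef, X = fit_basis_regression(t, y, basis)
f_hat = X @ coef  # fitted mean function evaluated at the observed times
print(coef)
```

The first entry of `coef` estimates \(\beta_0\) and the remaining entries estimate \(\beta_1, \dots, \beta_J\).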

## Spline basis

The spline basis is a basis for piecewise polynomials of a specified order.

Bases for piecewise polynomials of order 1 through 4 joined at evenly-spaced knot points.

The basis functions are generated recursively from a set of ‘knots’, the locations where the polynomial pieces join.
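The recursion can be sketched with the Cox-de Boor formula, assuming the basis meant here is the B-spline basis (the slides do not name it explicitly). The function name `bspline` and the evenly spaced knots are illustrative choices; order \(k\) corresponds to polynomial pieces of degree \(k - 1\).

```python
def bspline(j, order, knots, t):
    """Value at t of the j-th B-spline basis function of the given order."""
    if order == 1:
        # Order 1: indicator of the half-open interval between consecutive knots
        return 1.0 if knots[j] <= t < knots[j + 1] else 0.0
    left_width = knots[j + order - 1] - knots[j]
    right_width = knots[j + order] - knots[j + 1]
    left = 0.0
    if left_width > 0:  # convention: a term with zero knot span contributes 0
        left = (t - knots[j]) / left_width * bspline(j, order - 1, knots, t)
    right = 0.0
    if right_width > 0:
        right = (knots[j + order] - t) / right_width * bspline(j + 1, order - 1, knots, t)
    return left + right


# Evenly spaced knots 0, 1, ..., 10 and order 4 (cubic pieces)
knots = list(range(11))
order = 4
n_basis = len(knots) - order  # number of basis functions these knots support
values = [bspline(j, order, knots, 4.3) for j in range(n_basis)]
print(values)
```

At any point in the interior of the knot range, only `order` of these values are nonzero, and they sum to one.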

## Knot spacing

Knot spacing determines how densely the basis functions are concentrated in particular regions of the data.

Here are bases generated on some unevenly-spaced knots:

Check your understanding: where would this spline basis have the most flexible approximation capability?

## Knot placement

Appropriate placement of knots is essential for quality function approximation.

Where would you put them for our data?

## A first attempt: spline basis

Model: \(Y_{i,t} = \beta_0 + \beta_1\cdot\text{elev}_i + \sum_{j = 1}^7 \gamma_j \cdot f_j(t) + \epsilon_{i, t}\)

## A problem

The choice of basis must match the problem context.

Here, the fitted curve needs to meet at the boundaries: the end of one annual cycle must join the start of the next.

In other words, we need a *periodic* function.
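The boundary requirement can be written out as follows, where \(P\) denotes the length of one cycle (notation introduced here; for an annual cycle in daily data, roughly \(P = 365\)):

\[
\tilde{f}(t + P) = \tilde{f}(t) \ \text{ for all } t, \quad\text{so in particular}\quad \tilde{f}(0) = \tilde{f}(P)
\]

An unconstrained spline basis fit on \([0, P]\) does not enforce this condition, so the fitted values at the two ends of the year can differ.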

## Fourier basis

The **Fourier basis** consists of sine-cosine pairs and is a basis for square-integrable functions on a closed interval.
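A minimal sketch of a Fourier-basis regression, assuming a period of 365 days and two sine-cosine pairs. The helper `fourier_design`, the number of pairs, and the simulated temperatures are all hypothetical choices for illustration, not the course data.

```python
import numpy as np


def fourier_design(t, period, n_pairs):
    """Design matrix: intercept, then sin/cos pairs at harmonics 1..n_pairs."""
    t = np.asarray(t, dtype=float)
    columns = [np.ones_like(t)]
    for k in range(1, n_pairs + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)


# Synthetic two-year daily series with an annual cycle (made-up values)
period = 365.0
days = np.arange(0, 730)
rng = np.random.default_rng(0)
signal = 10.0 - 8.0 * np.cos(2.0 * np.pi * days / period)
y = signal + rng.normal(scale=1.0, size=days.size)

X = fourier_design(days, period, n_pairs=2)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# The fitted curve takes the same value at the start and end of a cycle
edge_fit = fourier_design([0.0, period], period, n_pairs=2) @ coef
print(edge_fit)
```

Because every column of the design matrix is periodic with period 365, any fitted curve built from it meets at the boundaries by construction.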

## Forecasting

Does this forecast make sense? Why or why not?

## Next time

Fit a time series model to the residuals

\[
e_{i, t} = Y_{i, t} - \underbrace{\left(\hat{\beta}_0 + \hat{\beta}_1\text{elev}_i + \hat{f}(t)\right)}_{\text{mean function } \hat{\mu}(i, t)}
\]

Forecast \(\hat{e}_{i, t} = \mathbb{E}\left(e_{i, t}|e_{i, t - 1}\right)\) using the residual model

“Feed forward” residual forecasts to obtain temperature forecasts

\[
\hat{Y}_{i, t} = \hat{\mu}(i, t) + \hat{e}_{i, t}
\]
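The feed-forward step can be sketched as below, assuming a first-order autoregressive model \(e_{i, t} = \phi\, e_{i, t - 1} + \text{noise}\) for the residuals at a single site. The AR(1) choice, the function names, and the numbers are assumptions made for illustration; the residual model itself has not been specified yet.

```python
def fit_ar1(residuals):
    """Least-squares estimate of phi in e_t = phi * e_{t-1} + noise (no intercept)."""
    prev = residuals[:-1]
    curr = residuals[1:]
    return sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)


def forecast_next(mu_next, last_residual, phi):
    """Temperature forecast: mean function plus the forecast residual phi * e_{t-1}."""
    return mu_next + phi * last_residual


# Made-up residuals for one site and a made-up mean-function value for the next day
residuals = [1.0, 0.5, 0.25, 0.125]
phi_hat = fit_ar1(residuals)
y_hat = forecast_next(mu_next=12.0, last_residual=residuals[-1], phi=phi_hat)
print(phi_hat, y_hat)
```

Under this model the residual forecast is \(\hat{e}_{i, t} = \hat{\phi}\, e_{i, t - 1}\), which is then added to \(\hat{\mu}(i, t)\).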