--- title: "Base functions" author: "Alexander Häußer" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Base functions} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.height = 5, fig.width = 7 ) ``` ## Load package ```{r packages, message = FALSE, warning = FALSE} library(echos) ``` ## Prepare dataset In a first example, we want to model the well-known `AirPassengers` time series (`ts` object). The dataset contains monthly totals of international airline passengers (in thousands) from January 1949 to December 1960 with 144 observations in total. The first 132 observations are used for model training (`n_train`) and the last 12 observations are used for testing (`n_ahead`). `xtrain` and `xtest` are numeric vectors containing the training and testing data. ```{r data} # Convert 'AirPassengers' dataset from ts to numeric vector xdata <- as.numeric(AirPassengers) # Forecast horizon n_ahead <- 12 # Number of observations (total) n_obs <- length(xdata) # Number of observations (training data) n_train <- n_obs - n_ahead # Prepare train and test data xtrain <- xdata[(1:n_train)] xtest <- xdata[((n_train+1):n_obs)] xtrain xtest ``` ## ESN architecture and automatic model selection An Echo State Network consists of an input layer, a recurrent reservoir, and a readout layer. In `echos`, the input and reservoir weight matrices are randomly initialized and kept fixed. Only the readout layer is estimated, using ridge regression. The main reservoir hyperparameters are `n_states`, `alpha`, `rho`, and `density`. The argument `n_states` controls the reservoir size. The leakage rate `alpha` controls how quickly reservoir states react to new inputs. The spectral radius `rho` scales the recurrent reservoir matrix and affects memory and stability. The argument `density` controls the sparsity of the reservoir matrix. The function `train_esn()` also performs automatic model selection for the ridge readout. If `n_models = NULL`, the function evaluates `n_states * 2` candidate readout models. Candidate ridge penalties are sampled from the interval specified by `lambda`, and the best readout model is selected using the information criterion specified by `inf_crit`. ## Train ESN model The function `train_esn()` is used to automatically train an ESN to the input data `xtrain`, where the output `xmodel` is an object of class `esn`. The object `xmodel` is a list containing the `actual` and `fitted` values, `residuals`, the internal states `states_train`, estimated coefficients from the ridge regression estimation, hyperparameters, etc. We can summarize the model by using the generic S3 method `summary()` to get detailed information on the trained model. ```{r model, fig.alt = "Plot actual and fitted values"} # Train ESN model xmodel <- train_esn(y = xtrain) # Summarize model summary(xmodel) ``` From the output above, we get the following information about the trained ESN model: | Value | Description | |:-----------------|:---------------------------------------------------------------------------------------------------------------| | `n_obs` | Number of observations (i.e., length of the input time series) | | `n_diff` | Number of differences required to achieve (weak-) stationarity of the input training data | | `lags` | Lags of the output variable (response), which are used as model input | | `n_states` | Number of internal states (i.e., predictor variables or reservoir size). | | `alpha` | Leakage rate (smoothing parameter) | | `rho` | Spectral radius for scaling the reservoir weight matrix | | `density` | Density of the reservoir weight matrix | | `scale_inputs` | Input training data are scaled to the interval `(-0.5, 0.5)` | | `scale_win` | Input weight matrix is drawn from a random uniform distribution on `(-0.5, 0.5)` | | `scale_wres` | Reservoir weight matrix is drawn from a random uniform distribution on `(-0.5, 0.5)` | | `n_models` | Number of models evaluated during the random search optimization to find the regularization parameter `lambda` | | `df` | Effective degrees of freedom in the model | | `lambda` | Regularization parameter for the ridge regression estimation | ## Forecast ESN model The function `forecast_esn()` is used to forecast the trained model `xmodel` for `n_ahead` steps into the future. The output `xfcst` is a list of class `forecast_esn` containing the `point` forecasts, `actual` and `fitted` values, the forecast horizon `n_ahead` and the model specification `model_spec`. We can use the generic S3 method `plot()` to visualize the point forecast within `xfcst` and add the holdout test data `xtest`. ```{r forecast, fig.alt = "Plot forecast and test data"} # Forecast ESN model xfcst <- forecast_esn(xmodel, n_ahead = n_ahead) # Extract point and interval forecasts xfcst$point xfcst$interval # Plot forecast and test data plot(xfcst, test = xtest) ``` ## Hyperparameter tuning We now call the function `tune_esn()` to evaluate a grid of hyperparameter values using time series cross-validation. In this example, we test different values for the leakage rate `alpha` while keeping the spectral radius `rho` and the reservoir-size scaling parameter `tau` fixed. The leakage rate `alpha` controls how quickly the reservoir states react to new inputs. The spectral radius `rho` scales the recurrent reservoir weight matrix and affects the memory and stability of the reservoir. The parameter `tau` is used to determine the reservoir size when `n_states = NULL`. The tuning procedure uses rolling-origin evaluation with expanding training windows. Here, `n_ahead = 12` produces 12-step-ahead forecasts, and `n_split = 5` creates five rolling train/test splits. For each split and each hyperparameter combination, the model is fitted on the training window and evaluated on the subsequent test window. The resulting object stores the validation errors, including mean squared error (MSE) and mean absolute error (MAE), for each candidate configuration and split. Runtime increases with the number of grid combinations and validation splits, so coarse grids are recommended as a starting point before moving to finer tuning ranges. The S3 method `summary()` reports the best hyperparameter set according to the default accuracy metric (MSE unless you specify `metric = "mae"`), along with the corresponding performance. `plot()` visualizes the actual data together with the point forecasts from the selected “best” configuration. Forecasts are drawn as separate line segments over each test window, and vertical dashed lines indicate where each test window begins, making it easy to see how performance varies across splits. ```{r tuning, fig.alt = "Time series cross-validation"} # Tune hyperparameters via time series cross-validation xfit <- tune_esn( y = xdata, n_ahead = 12, n_split = 5, alpha = seq(0.1, 1.0, 0.1), rho = c(1.0), tau = c(0.4) ) # Summarize and visualize optimal hyperparameter configuration summary(xfit) plot(xfit) ```