Quantile Forecasts
library(nixtlar)
#> Registered S3 method overwritten by 'tsibble':
#> method from
#> as_tibble.grouped_df dplyr
1. Uncertainty quantification via quantiles
For uncertainty quantification, TimeGPT can generate both prediction intervals and quantiles, offering a measure of the range of potential outcomes rather than just a single point forecast. In real-life scenarios, forecasting often requires considering multiple alternatives, not just one prediction. This vignette will explain how to use quantiles with TimeGPT via the nixtlar package.
Quantiles are cut points of the forecast distribution, each defined by a cumulative probability. For instance, the 0.9 quantile (the 90th percentile) is the value below which 90% of the outcomes are expected to fall. Notably, the 0.5 quantile corresponds to the median forecast value provided by TimeGPT. The quantiles are produced using conformal prediction, a framework for creating distribution-free uncertainty intervals for predictive models.
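To make the definition concrete, here is a minimal base R sketch. It uses simulated normal draws as a stand-in for a forecast distribution (an assumption made purely for illustration; TimeGPT does not expose such draws) and checks that the empirical quantiles split the sample at the expected proportions.

```r
# Simulated draws standing in for a forecast distribution (illustrative only).
set.seed(42)
draws <- rnorm(10000, mean = 40, sd = 5)

# Empirical quantiles at the 0.1, 0.5 and 0.9 levels.
q <- quantile(draws, probs = c(0.1, 0.5, 0.9), names = FALSE)

# Share of draws at or below each quantile: close to 0.1, 0.5 and 0.9.
coverage <- sapply(q, function(v) mean(draws <= v))

# The 0.5 quantile coincides with the sample median.
is_median <- isTRUE(all.equal(q[2], median(draws)))
```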
This vignette assumes you have already set up your API key. If you haven’t done this, please read the Get Started vignette first.
2. Load data
For this vignette, we’ll use the electricity dataset that is included in nixtlar, which contains the hourly prices of five different electricity markets.
df <- nixtlar::electricity
head(df)
#> unique_id ds y
#> 1 BE 2016-10-22 00:00:00 70.00
#> 2 BE 2016-10-22 01:00:00 37.10
#> 3 BE 2016-10-22 02:00:00 37.10
#> 4 BE 2016-10-22 03:00:00 44.75
#> 5 BE 2016-10-22 04:00:00 37.10
#> 6 BE 2016-10-22 05:00:00 35.61
3. Forecast with quantiles
TimeGPT can generate quantiles when using the following functions:
- nixtlar::nixtla_client_forecast()
- nixtlar::nixtla_client_historic()
- nixtlar::nixtla_client_cross_validation()
For any of these functions, simply set the quantiles argument to a numeric vector of the desired values. Keep in mind that every quantile must be a number between 0 and 1. You can use either quantiles or level for uncertainty quantification, but not both in the same call.
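The two arguments describe the same kind of information from different angles: the bounds of a central prediction interval are themselves quantiles. As a rough sketch, the helper below (a hypothetical function, not part of nixtlar) maps each quantile to the central interval level, in percent, that has that quantile as its lower or upper bound. Whether nixtlar performs this conversion internally is not stated here, so treat it as a conceptual aid only. The helper also assumes quantiles strictly between 0 and 1.

```r
# Illustrative helper (not part of nixtlar): central interval level, in percent,
# whose lower or upper bound sits at quantile q.
quantile_to_level <- function(q) {
  stopifnot(is.numeric(q), all(q > 0 & q < 1))
  # round() removes floating-point noise such as 80.00000000000001.
  round(100 * abs(1 - 2 * q), 10)
}

quantiles <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
interval_levels <- quantile_to_level(quantiles)
```

For example, the 0.1 and 0.9 quantiles are the lower and upper bounds of an 80% central interval, and the 0.5 quantile corresponds to a degenerate 0% interval, that is, the median itself.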
# Forecast 8 steps ahead for each series and request the 0.1 to 0.9 quantiles
fcst <- nixtla_client_forecast(
  df,
  h = 8,
  id_col = "unique_id",
  quantiles = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
)
#> Frequency chosen: H
head(fcst)
#> unique_id ds TimeGPT TimeGPT-q-10 TimeGPT-q-20 TimeGPT-q-30
#> 1 BE 2016-12-31 00:00:00 45.19045 40.42074 42.34211 43.54112
#> 2 BE 2016-12-31 01:00:00 43.24445 36.91513 40.05255 41.62179
#> 3 BE 2016-12-31 02:00:00 41.95839 35.55863 38.39862 39.92430
#> 4 BE 2016-12-31 03:00:00 39.79649 33.45859 36.34654 38.08909
#> 5 BE 2016-12-31 04:00:00 39.20454 30.35095 34.39800 36.65258
#> 6 BE 2016-12-31 05:00:00 40.10878 31.60236 34.85969 37.43258
#> TimeGPT-q-40 TimeGPT-q-50 TimeGPT-q-60 TimeGPT-q-70 TimeGPT-q-80 TimeGPT-q-90
#> 1 44.72518 45.19045 45.65572 46.83979 48.03880 49.96017
#> 2 42.51711 43.24445 43.97178 44.86710 46.43634 49.57376
#> 3 41.13472 41.95839 42.78206 43.99248 45.51815 48.35815
#> 4 38.62703 39.79649 40.96594 41.50388 43.24643 46.13438
#> 5 38.17931 39.20454 40.22976 41.75650 44.01107 48.05812
#> 6 39.16840 40.10878 41.04916 42.78498 45.35787 48.61520
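A quick sanity check on output of this shape is to confirm that the quantile columns never decrease from left to right within a row and that the 0.5 quantile matches the point forecast. The sketch below retypes the first two rows of the output above by hand so that it runs without an API call. The object names `fcst_head`, `q_cols` and `q_mat` are illustrative.

```r
# First two rows of the forecast shown above, entered by hand.
# check.names = FALSE keeps the hyphens in the column names.
fcst_head <- data.frame(
  unique_id = c("BE", "BE"),
  ds = c("2016-12-31 00:00:00", "2016-12-31 01:00:00"),
  TimeGPT = c(45.19045, 43.24445),
  `TimeGPT-q-10` = c(40.42074, 36.91513),
  `TimeGPT-q-20` = c(42.34211, 40.05255),
  `TimeGPT-q-30` = c(43.54112, 41.62179),
  `TimeGPT-q-40` = c(44.72518, 42.51711),
  `TimeGPT-q-50` = c(45.19045, 43.24445),
  `TimeGPT-q-60` = c(45.65572, 43.97178),
  `TimeGPT-q-70` = c(46.83979, 44.86710),
  `TimeGPT-q-80` = c(48.03880, 46.43634),
  `TimeGPT-q-90` = c(49.96017, 49.57376),
  check.names = FALSE
)

# Select the quantile columns by their "TimeGPT-q-" prefix.
q_cols <- grep("^TimeGPT-q-", names(fcst_head), value = TRUE)
q_mat <- as.matrix(fcst_head[, q_cols])

# Within each row, quantiles must not decrease from left to right.
is_monotone <- all(apply(q_mat, 1, function(r) all(diff(r) >= 0)))

# In this output the 0.5 quantile equals the point forecast.
median_matches <- isTRUE(
  all.equal(fcst_head[["TimeGPT-q-50"]], fcst_head$TimeGPT)
)
```

The same two checks can be applied to the full `fcst` data frame by replacing `fcst_head` with `fcst`.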
4. Plot quantiles
nixtlar includes a function to plot the historical data and any output from nixtlar::nixtla_client_forecast, nixtlar::nixtla_client_historic, nixtlar::nixtla_client_detect_anomalies and nixtlar::nixtla_client_cross_validation. If you have long series, you can use max_insample_length to only plot the last N historical values (the forecast will always be plotted in full).
When available, nixtlar::nixtla_client_plot will automatically plot the quantiles.
nixtla_client_plot(df, fcst, id_col = "unique_id", max_insample_length = 100)
#> Frequency chosen: H