
library(nixtlar)
#> Registered S3 method overwritten by 'tsibble':
#>   method               from 
#>   as_tibble.grouped_df dplyr

1. Uncertainty quantification via prediction intervals

For uncertainty quantification, TimeGPT can generate both prediction intervals and quantiles, offering a measure of the range of potential outcomes rather than just a single point forecast. In real-life scenarios, forecasting often requires considering multiple alternatives, not just one prediction. This vignette will explain how to use prediction intervals with TimeGPT via the nixtlar package.

A prediction interval is a range of values expected to contain the actual future value with a given probability, often referred to as the confidence level. A 95% prediction interval should therefore contain the actual future value with a probability of 95%. Prediction intervals are part of probabilistic forecasting, which, unlike point forecasting, aims to generate the full forecast distribution instead of just the mean or the median of that distribution.
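To make the meaning of a 95% level concrete, here is a minimal base R sketch that does not call TimeGPT. It assumes, purely for illustration, that future values follow a normal distribution with made-up parameters (mean 50, standard deviation 5), builds the central 95% interval from that distribution, and measures how often simulated values fall inside it.

```r
# Minimal sketch: empirical coverage of a 95% prediction interval.
# Assumption: future values are Normal(mean = 50, sd = 5); these numbers
# are illustrative and do not come from TimeGPT.
set.seed(123)

level <- 95
alpha <- 1 - level / 100

# Central (equal-tailed) interval: cut alpha / 2 from each tail.
lower <- qnorm(alpha / 2, mean = 50, sd = 5)
upper <- qnorm(1 - alpha / 2, mean = 50, sd = 5)

# Simulate many "actual" future values and count how many land inside.
actual <- rnorm(10000, mean = 50, sd = 5)
coverage <- mean(actual >= lower & actual <= upper)

print(c(lower = lower, upper = upper))
print(coverage)
```

The printed coverage should be close to 0.95. With a real forecasting model the distribution is unknown, so coverage on held-out data can differ from the nominal level.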

This vignette assumes you have already set up your API key. If you haven’t done this, please read the Get Started vignette first.

2. Load data

For this vignette, we’ll use the electricity dataset included in nixtlar, which contains the hourly prices of five different electricity markets.

df <- nixtlar::electricity
head(df)
#>   unique_id                  ds     y
#> 1        BE 2016-10-22 00:00:00 70.00
#> 2        BE 2016-10-22 01:00:00 37.10
#> 3        BE 2016-10-22 02:00:00 37.10
#> 4        BE 2016-10-22 03:00:00 44.75
#> 5        BE 2016-10-22 04:00:00 37.10
#> 6        BE 2016-10-22 05:00:00 35.61

3. Forecast with prediction intervals

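TimeGPT can generate prediction intervals when using several nixtlar functions, including nixtlar::nixtla_client_forecast() and nixtlar::nixtla_client_detect_anomalies(), both of which are used in this vignette.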

For any of these functions, simply set the level argument to the desired confidence level for the prediction intervals. Keep in mind that level should be a vector with numbers between 0 and 100. You can use either quantiles or level for uncertainty quantification, but not both.
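The link between level and quantiles can be sketched in a few lines of base R. The helper below, level_to_quantiles, is hypothetical and not part of nixtlar. It assumes central (equal-tailed) intervals, the usual convention, under which an 80% interval is bounded by the 0.10 and 0.90 quantiles. It also assumes the bounds 0 and 100 are excluded, which the text above does not specify.

```r
# Hypothetical helper (not part of nixtlar): convert confidence levels to
# the quantiles that bound the matching central intervals.
# Assumption: levels lie strictly between 0 and 100.
level_to_quantiles <- function(level) {
  stopifnot(is.numeric(level), all(level > 0), all(level < 100))
  alpha <- 1 - level / 100
  sort(unique(c(alpha / 2, 1 - alpha / 2)))
}

# Levels 80 and 95 map to four quantiles: two lower and two upper bounds.
q <- level_to_quantiles(c(80, 95))
print(q)
```

Under this convention, each level yields one lower and one upper quantile, which is why the two arguments describe the same uncertainty in different ways.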

fcst <- nixtla_client_forecast(df, h = 8, id_col = "unique_id", level=c(80,95))
#> Frequency chosen: H
head(fcst)
#>   unique_id                  ds  TimeGPT TimeGPT-lo-95 TimeGPT-lo-80
#> 1        BE 2016-12-31 00:00:00 45.19045      32.60115      40.42074
#> 2        BE 2016-12-31 01:00:00 43.24445      29.30454      36.91513
#> 3        BE 2016-12-31 02:00:00 41.95839      28.17721      35.55863
#> 4        BE 2016-12-31 03:00:00 39.79649      25.42790      33.45859
#> 5        BE 2016-12-31 04:00:00 39.20454      23.53869      30.35095
#> 6        BE 2016-12-31 05:00:00 40.10878      26.90472      31.60236
#>   TimeGPT-hi-80 TimeGPT-hi-95
#> 1      49.96017      57.77975
#> 2      49.57376      57.18435
#> 3      48.35815      55.73957
#> 4      46.13438      54.16507
#> 5      48.05812      54.87038
#> 6      48.61520      53.31284
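As a quick sanity check on output like this, the 80% interval should sit inside the 95% interval, with the point forecast between the two 80% bounds. The sketch below does not call the API. It rebuilds the first three rows of the output above by hand (fcst_head is a name introduced here for illustration) and verifies the nesting.

```r
# First three rows of the forecast output shown above, transcribed by hand.
# check.names = FALSE keeps the hyphenated column names intact.
fcst_head <- data.frame(
  "TimeGPT"       = c(45.19045, 43.24445, 41.95839),
  "TimeGPT-lo-95" = c(32.60115, 29.30454, 28.17721),
  "TimeGPT-lo-80" = c(40.42074, 36.91513, 35.55863),
  "TimeGPT-hi-80" = c(49.96017, 49.57376, 48.35815),
  "TimeGPT-hi-95" = c(57.77975, 57.18435, 55.73957),
  check.names = FALSE
)

# Non-syntactic column names are easiest to access with [[ ]].
lo95 <- fcst_head[["TimeGPT-lo-95"]]
lo80 <- fcst_head[["TimeGPT-lo-80"]]
point <- fcst_head[["TimeGPT"]]
hi80 <- fcst_head[["TimeGPT-hi-80"]]
hi95 <- fcst_head[["TimeGPT-hi-95"]]

# TRUE for a row when lo-95 <= lo-80 <= point <= hi-80 <= hi-95.
nested <- lo95 <= lo80 & lo80 <= point & point <= hi80 & hi80 <= hi95

# A higher level should give a wider interval.
width_80 <- hi80 - lo80
width_95 <- hi95 - lo95

print(nested)
print(data.frame(width_80, width_95))
```

The same check can be run on the full fcst data frame by replacing fcst_head with fcst.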

Note that the level argument in the nixtlar::nixtla_client_detect_anomalies() function uses only the maximum value when there are multiple values. Hence, setting level=c(90,95,99), for example, is equivalent to setting level=c(99), which is the default value.

anomalies <- nixtla_client_detect_anomalies(df, id_col = "unique_id") # default level=c(99); level=c(90,95,99) gives the same result
#> Frequency chosen: H
head(anomalies) # only the 99% confidence level is used 
#>   unique_id                  ds     y anomaly TimeGPT-lo-99  TimeGPT
#> 1        BE 2016-10-27 00:00:00 52.58       0     -28.58336 56.07623
#> 2        BE 2016-10-27 01:00:00 44.86       0     -32.23986 52.41973
#> 3        BE 2016-10-27 02:00:00 42.31       0     -31.84485 52.81474
#> 4        BE 2016-10-27 03:00:00 39.66       0     -32.06933 52.59026
#> 5        BE 2016-10-27 04:00:00 38.98       0     -31.98661 52.67297
#> 6        BE 2016-10-27 05:00:00 42.31       0     -30.55300 54.10659
#>   TimeGPT-hi-99
#> 1      140.7358
#> 2      137.0793
#> 3      137.4743
#> 4      137.2498
#> 5      137.3326
#> 6      138.7662
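The anomaly column can be related to the 99% interval with a short base R sketch. It assumes that an observation is flagged when it falls outside the interval; the output above is consistent with this rule, but the rule itself is an assumption here rather than a documented guarantee. The data frame anom_head is rebuilt by hand from the first three rows of the output.

```r
# First three rows of the anomaly output shown above, transcribed by hand.
anom_head <- data.frame(
  y = c(52.58, 44.86, 42.31),
  anomaly = c(0, 0, 0),
  "TimeGPT-lo-99" = c(-28.58336, -32.23986, -31.84485),
  "TimeGPT-hi-99" = c(140.7358, 137.0793, 137.4743),
  check.names = FALSE
)

# Assumed rule: flag an observation when it lies outside the 99% interval.
outside <- anom_head$y < anom_head[["TimeGPT-lo-99"]] |
  anom_head$y > anom_head[["TimeGPT-hi-99"]]
flag <- as.integer(outside)

# Compare the recomputed flag with the anomaly column from the output.
print(data.frame(y = anom_head$y, anomaly = anom_head$anomaly, flag = flag))
```

For these rows every observation lies well inside its interval, so the recomputed flag matches the anomaly column.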

4. Plot prediction intervals

nixtlar includes a function, nixtlar::nixtla_client_plot, that plots the historical data together with any output from nixtlar::nixtla_client_forecast, nixtlar::nixtla_client_historic, nixtlar::nixtla_client_detect_anomalies and nixtlar::nixtla_client_cross_validation. If you have long series, you can use max_insample_length to plot only the last N historical values (the forecast will always be plotted in full).

When available, nixtlar::nixtla_client_plot will automatically plot the prediction intervals.

nixtla_client_plot(df, fcst, id_col = "unique_id", max_insample_length = 100)
#> Frequency chosen: H
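The effect of max_insample_length on the historical data can be pictured as keeping only the last N rows of each series. The sketch below illustrates that idea in base R on made-up toy data. The helper last_n_per_series is hypothetical and is not how nixtlar implements the argument.

```r
# Toy data: two series with five hourly observations each (values are made up).
toy <- data.frame(
  unique_id = rep(c("A", "B"), each = 5),
  ds = rep(
    seq(as.POSIXct("2016-10-22 00:00:00", tz = "UTC"), by = "hour", length.out = 5),
    times = 2
  ),
  y = c(1:5, 11:15)
)

# Hypothetical helper (not part of nixtlar): keep the last n rows per series.
# Assumes rows are already sorted by ds within each unique_id.
last_n_per_series <- function(data, n, id_col = "unique_id") {
  pieces <- lapply(split(data, data[[id_col]]), tail, n = n)
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

trimmed <- last_n_per_series(toy, n = 2)
print(trimmed)
```

Trimming the history this way only changes what is drawn; it does not change the forecast itself.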

nixtlar::nixtla_client_plot(df, anomalies, id_col = "unique_id", plot_anomalies = TRUE)
#> Frequency chosen: H