Learn how to incorporate external numeric variables to improve your forecasting accuracy.
Exogenous variables, or external factors, are crucial in time series forecasting because they provide additional information that may influence the prediction. These variables could include holiday markers, marketing spend, weather data, or any other external data that correlates with the time series you are forecasting.
For example, if you’re forecasting ice cream sales, temperature data could serve as a useful exogenous variable. On hotter days, ice cream sales may increase.
To incorporate exogenous variables in TimeGPT, you’ll need to pair each point in your time series data with the corresponding external data.
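As a minimal sketch of that pairing, the snippet below joins a hypothetical daily ice cream sales series with temperature readings on the timestamp. The values and the column name `temperature` are made up for illustration; the `ds` and `y` names follow the dataset used later in this tutorial. Plain Python is used so the logic is visible; with pandas the same step would typically be a merge on `ds`.

```python
from datetime import date

# Hypothetical target series: one (ds, y) pair per day.
sales = [
    (date(2024, 7, 1), 120.0),
    (date(2024, 7, 2), 135.0),
    (date(2024, 7, 3), 180.0),
]

# Hypothetical exogenous readings keyed by the same timestamps.
temperature = {
    date(2024, 7, 1): 24.5,
    date(2024, 7, 2): 26.0,
    date(2024, 7, 3): 31.2,
}


def pair_with_exogenous(series, exog, exog_name):
    """Attach one exogenous value to every point of the target series.

    Raises ValueError if any timestamp lacks an exogenous value, since
    every target observation needs a matching external observation.
    """
    missing = [ds for ds, _ in series if ds not in exog]
    if missing:
        raise ValueError(f"no {exog_name} value for: {missing}")
    return [{"ds": ds, "y": y, exog_name: exog[ds]} for ds, y in series]


rows = pair_with_exogenous(sales, temperature, "temperature")
for row in rows:
    print(row)
```

Failing loudly on a missing timestamp is a deliberate choice here: a silently unmatched row would leave a gap in the exogenous column.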
Import the required libraries and initialize the Nixtla client.
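A minimal sketch of this step, assuming the `nixtla` Python package is installed and the API key is stored in an environment variable named `NIXTLA_API_KEY`. The helper name `make_client` is ours, not part of the library.

```python
import os


def make_client(api_key=None):
    """Return a NixtlaClient, reading the key from the environment if omitted."""
    api_key = api_key or os.environ.get("NIXTLA_API_KEY")
    if not api_key:
        raise ValueError("Set NIXTLA_API_KEY or pass api_key explicitly.")

    # Imported here so a missing key is reported before the import is attempted.
    from nixtla import NixtlaClient

    return NixtlaClient(api_key=api_key)
```

A typical call would be `nixtla_client = make_client()`. Keeping the key in the environment avoids hard-coding it in a notebook or script.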
In this tutorial, we’ll predict day-ahead electricity prices. The dataset contains:
- Electricity prices (y) from various markets (identified by unique_id)
- Exogenous variables (Exogenous1 to day_6)

unique_id | ds | y | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
---|---|---|---|---|---|---|---|---|---|---|---|
BE | 2016-10-22 00:00:00 | 70.00 | 57253.0 | 49593.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-10-22 01:00:00 | 37.10 | 51887.0 | 46073.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-10-22 02:00:00 | 37.10 | 51896.0 | 44927.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-10-22 03:00:00 | 44.75 | 48428.0 | 44483.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-10-22 04:00:00 | 37.10 | 46721.0 | 44338.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
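The sample rows are consistent with `day_0` to `day_6` being one-hot day-of-week indicators with Monday as 0: 22 October 2016 was a Saturday, and `day_5` is the column set to 1.0. That reading is an inference from the rows shown, not something the dataset description states. Under that assumption, indicators of this kind can be derived from the timestamp alone:

```python
from datetime import datetime


def day_of_week_one_hot(ds):
    """Return day_0..day_6 flags for a timestamp, with Monday as day_0."""
    weekday = ds.weekday()  # Monday == 0, Sunday == 6
    return {f"day_{i}": float(i == weekday) for i in range(7)}


# First timestamp in the table above; only day_5 (Saturday) is 1.0.
flags = day_of_week_one_hot(datetime(2016, 10, 22, 0, 0))
print(flags)
```

Calendar features like these are known in advance, so they are straightforward to supply for the forecast horizon as well.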
First, let’s create a baseline forecast without using any exogenous variables.
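A sketch of the baseline call, assuming `df` is a pandas DataFrame with the columns shown above and `client` is an initialized `NixtlaClient`. The 24-step horizon is our choice to match a day-ahead forecast on hourly data, and the function name is hypothetical.

```python
def forecast_without_exogenous(client, df, h=24):
    """Forecast y from its own history only.

    Selecting just the id, timestamp, and target columns ensures that no
    exogenous column is sent along with the request.
    """
    history = df[["unique_id", "ds", "y"]]
    return client.forecast(df=history, h=h)
```

The result gives a reference point: any gain from the exogenous variables shows up as a difference from this forecast.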
Next, let’s create a forecast using the exogenous variables. To do so, you need to provide both the historical and the future exogenous values. Below is an example dataset containing future exogenous variables. Note that it contains only the future values of the exogenous variables, not the target variable y, which is what we need to forecast.
unique_id | ds | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
---|---|---|---|---|---|---|---|---|---|---|
BE | 2016-12-31 00:00:00 | 70318.0 | 64108.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-12-31 01:00:00 | 67898.0 | 62492.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-12-31 02:00:00 | 68379.0 | 61571.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-12-31 03:00:00 | 64972.0 | 60381.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
BE | 2016-12-31 04:00:00 | 62900.0 | 60298.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
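With both pieces in hand, the call might look like the sketch below. It assumes `df` holds the historical data (target plus exogenous columns), `future_ex_vars_df` holds future rows like those shown above, and `client` is an initialized `NixtlaClient`; the variable and function names are ours. `X_df` is the `forecast` argument that carries the future exogenous values.

```python
def forecast_with_exogenous(client, df, future_ex_vars_df, h=24):
    """Forecast y using historical and future exogenous values.

    df must contain y and the exogenous columns; future_ex_vars_df must
    contain the same exogenous columns for the next h timestamps of each
    series, without y.
    """
    return client.forecast(df=df, X_df=future_ex_vars_df, h=h)
```

The horizon `h` should match the number of future rows supplied per series, 24 in a day-ahead hourly setting.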
Ensure you maintain consistent data formatting and columns in both historical and future exogenous datasets (e.g., dates, unique_id, variable names).
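A few of these consistency checks can be automated before calling the API. The sketch below works on plain column lists and timestamps for a single series, so it can be fed from any data structure. The function name, the hourly default step, and the assumed last historical timestamp in the usage lines are ours, and the checks reflect the advice above rather than the library's own validation.

```python
from datetime import datetime, timedelta


def check_future_exog(hist_columns, future_columns, last_hist_ds, future_ds,
                      h, step=timedelta(hours=1),
                      id_col="unique_id", time_col="ds", target_col="y"):
    """Return a list of problems found for one series (empty if none)."""
    problems = []

    # The future data should not carry the target column.
    if target_col in future_columns:
        problems.append(f"future data should not contain '{target_col}'")

    # Both datasets should name the same exogenous columns.
    hist_exog = set(hist_columns) - {id_col, time_col, target_col}
    future_exog = set(future_columns) - {id_col, time_col, target_col}
    if hist_exog != future_exog:
        problems.append(
            f"exogenous columns differ: {sorted(hist_exog ^ future_exog)}"
        )

    # Future timestamps should continue the history for exactly h steps.
    expected = [last_hist_ds + step * (i + 1) for i in range(h)]
    if list(future_ds) != expected:
        problems.append("future timestamps do not cover the next h steps")

    return problems


# Usage with the column layout from this tutorial (last timestamp assumed).
hist_cols = ["unique_id", "ds", "y", "Exogenous1", "Exogenous2"]
future_cols = ["unique_id", "ds", "Exogenous1", "Exogenous2"]
last_ds = datetime(2016, 12, 30, 23)
future_ds = [last_ds + timedelta(hours=i + 1) for i in range(24)]
print(check_future_exog(hist_cols, future_cols, last_ds, future_ds, h=24))  # []
```

With pandas inputs, the column lists would come from `list(df.columns)` and the timestamps from the `ds` column of each series.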
Once you have generated your forecasts, you can visualize the results to compare forecasts between the two methods above.
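One way to put both forecasts on the same chart is to rename the forecast column of each result and merge them before plotting. The sketch assumes both results are pandas DataFrames whose forecast column is named `TimeGPT` with no prediction-interval columns, and that the client's `plot` method accepts the `models` and `max_insample_length` arguments as used here. The new column names, the default of 365 plotted history points, and the function name are ours.

```python
def plot_forecast_comparison(client, df, fcst_no_ex_df, fcst_ex_df,
                             max_insample_length=365):
    """Plot the baseline and exogenous forecasts against recent history."""
    # Give each forecast a distinct column name so both can be plotted.
    no_ex = fcst_no_ex_df.rename(columns={"TimeGPT": "TimeGPT_no_ex"})
    ex = fcst_ex_df.rename(columns={"TimeGPT": "TimeGPT_ex"})
    both = no_ex.merge(ex, on=["unique_id", "ds"])

    return client.plot(
        df[["unique_id", "ds", "y"]],
        both,
        models=["TimeGPT_no_ex", "TimeGPT_ex"],
        max_insample_length=max_insample_length,
    )
```

Limiting the plotted history keeps the forecast window readable when the training series is long.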
Congratulations! You have mastered the fundamentals of adding exogenous variables to your TimeGPT forecasts. Keep refining your approach by