Time series forecasting is a long-established discipline. Improving the accuracy of demand forecasts has an impact in countless real-world applications and in sectors such as retail, industry and finance. Even so, many practitioners still rely on traditional methods to guide their decisions. After reinforcement learning, computer vision and natural language processing, time series forecasting is next in line to be disrupted by deep learning.

Conventional vs deep learning

Traditional methods like the simple and exponential moving average and ARIMA need no introduction. They are easy to understand and widely used. Alongside these, more sophisticated methods exist, such as exponential smoothing and the Holt-Winters’ model. Even though these models can capture trend and seasonality, they remain limited by the fact that a separate model has to be fitted for each individual time series.
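To make the "one model per series" limitation concrete, here is a minimal sketch of a simple moving average and simple exponential smoothing applied to two households. The household names, readings, window size and smoothing factor are all made up for illustration; they do not come from the experiments in this article.

```python
from typing import Dict, List


def moving_average_forecast(series: List[float], window: int) -> float:
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series: List[float], alpha: float) -> float:
    """Simple exponential smoothing: level = alpha * y + (1 - alpha) * level."""
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical consumption readings for two households (illustration only).
households: Dict[str, List[float]] = {
    "house_a": [1.0, 2.0, 3.0, 4.0],
    "house_b": [10.0, 10.0, 10.0, 10.0],
}

# Each series gets its own forecast: nothing learned from house_a
# is reused when forecasting house_b.
forecasts = {
    name: {
        "sma": moving_average_forecast(values, window=2),
        "ses": exponential_smoothing_forecast(values, alpha=0.5),
    }
    for name, values in households.items()
}
```

Every household is fitted in isolation here, which is exactly what a general deep learning model avoids.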

Deep learning models such as RNNs and CNNs can learn from parallel, highly correlated time series, which can drastically increase model performance. Unlike traditional methods, they can be trained on the historical data of all series at once and used as one general model (e.g. one model to forecast sales demand in every store, or energy demand in every household). On top of that, known categorical features such as holidays, location, etc. can easily be included. We see this in Amazon’s DeepAR, which provides a hands-on implementation of a state-of-the-art probabilistic RNN and also gives information about the certainty of its predictions.

Another active area of research is multi-horizon forecasting. Rather than predicting a single fixed time step, multi-horizon forecasts provide stakeholders with optimized information across multiple time steps in the future. While most deep learning architectures yield ‘black-box’ models when implementing multi-horizon forecasting, the Temporal Fusion Transformer combines high performance with interpretable insights.

Smart meters

The upcoming smart meter marks an exciting milestone in grid technology. It opens up opportunities for innovation, energy efficiency, grid reliability and grid management.

However, for some people the smart meter comes with complications. More specifically, solar panel owners are pushed to consume energy at times when their production is high. This follows from the fact that one is barely compensated for energy injected into the grid, while the price of consuming energy from the grid remains the same. In compensation, the option is given to use the smart meter as a reverse counter for a maximum period of 15 years, while still having to pay the prosumer tax.

Some major questions remain, one of them being: how can households increase their self-consumption and become more profitable in anticipation of these inevitable changes? An energy demand forecast might provide some of the answers.
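As a rough illustration of what "increasing self-consumption" means, the sketch below computes the share of solar production that is used directly in the home, before and after shifting demand towards the hours with production. The readings are invented, and the ratio used (energy consumed at the moment it is produced, divided by total production) is one common way to define self-consumption, assumed here for the example.

```python
from typing import List


def self_consumption_ratio(production: List[float], consumption: List[float]) -> float:
    """Share of produced energy that is consumed in the same interval."""
    if len(production) != len(consumption):
        raise ValueError("series must have the same length")
    total_production = sum(production)
    if total_production == 0:
        return 0.0
    # Per interval, only the overlap of production and consumption is self-consumed.
    directly_used = sum(min(p, c) for p, c in zip(production, consumption))
    return directly_used / total_production


# Hypothetical readings for four intervals (illustration only).
production = [0.0, 2.0, 3.0, 0.0]
evening_heavy = [1.0, 1.0, 1.0, 3.0]   # most demand falls after production stops
shifted = [1.0, 2.0, 3.0, 0.0]         # same total demand, moved to sunny intervals

ratio_before = self_consumption_ratio(production, evening_heavy)
ratio_after = self_consumption_ratio(production, shifted)
```

Total demand is identical in both cases; only its timing changes, which is where a demand forecast becomes useful.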

Energy demand forecasting

To find out, we trained an energy demand forecast model. The model was trained on a dataset containing the energy consumption of 370 households, of which 320 remained after a cleanup. The data was split into 300 households for training and 20 for testing. All households have 3 years of data at a granularity of 15-minute intervals. It is worth mentioning that every trained model was used out of the box. As we are looking for general trends, no model was optimized for maximum performance. Therefore, default model architectures are used and the only implemented feature is the time of day. Other features like day of week, day of year, weather forecasts … could further help the model recognize weekly and yearly seasonality.
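A time-of-day feature for 15-minute data can be derived directly from the timestamp. The sketch below is one possible encoding, not necessarily the one used in our experiments: it returns the step index within the day plus a sine/cosine pair, so that 23:45 and 00:00 end up close together. The start date is hypothetical.

```python
import math
from datetime import datetime, timedelta
from typing import List, Tuple

STEPS_PER_DAY = 24 * 60 // 15  # 96 readings per day at 15-minute granularity


def time_of_day_features(ts: datetime) -> Tuple[int, float, float]:
    """Return (step index within the day, sine, cosine) for a timestamp."""
    step = (ts.hour * 60 + ts.minute) // 15
    angle = 2 * math.pi * step / STEPS_PER_DAY
    return step, math.sin(angle), math.cos(angle)


# Hypothetical start date; one full day of 15-minute timestamps.
start = datetime(2012, 1, 1, 0, 0)
timestamps: List[datetime] = [
    start + timedelta(minutes=15 * i) for i in range(STEPS_PER_DAY)
]
features = [time_of_day_features(ts) for ts in timestamps]
```

Day of week or day of year could be encoded in the same way by changing the period of the angle.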

One month of the test set:

Note: We later decided to remove the 12th test series. Because of its extremely high energy consumption, the models failed to make a relevant prediction for it.


The following figure is an example of an energy demand forecast. It shows a prediction horizon of one day, given 7 days of context.


A rolling metric:

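Only the near future of each energy consumption forecast is evaluated (e.g. a test horizon of 4 hours), and only one week of the test set is taken into account.

import numpy as np
import pandas as pd
# Assumption: mae is scikit-learn's mean absolute error.
from sklearn.metrics import mean_absolute_error as mae

def rolling_metric(model, context_data, target_data, config, test_length, test_horizon):
    """Mean MAE over all rolling windows, scored on the first test_horizon steps."""
    m_ar = []
    for i in range(test_length):
        # context_data(i) and target_data(i) return the i-th context and target window
        window = pd.concat([context_data(i), target_data(i)], ignore_index=True)
        test_prediction, test_x = model.predict(window, return_x=True)
        actual = np.array(target_data(i)[config['target']])
        prediction = np.array(test_prediction[0])
        # select test horizon (4 hours)
        a = actual[0:test_horizon]
        p = prediction[0:test_horizon]
        m_ar.append(mae(a, p))
    return np.mean(m_ar)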

1. Absolute vs cumulative - Personalised vs general model

For some applications it is interesting to know when and how much energy will be consumed (absolute energy demand), while for others the most valuable information is the total amount of energy consumed over the coming period (cumulative energy demand). The two predictions have different objectives and therefore lead to different optimizations. We’ll test this by training models for both objectives and comparing their performance on the absolute energy demand signal.

Secondly, we investigate the difference between personalised models (one model for each household) and one generalized model. The personalised approach trains a separate model for each household of the test set, using 2 of the 3 available years for training. The generalized model is far more practical to deploy. It was trained on two years of energy consumption data from the 300 training households, none of which are part of the test set.
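The relation between the two signals can be sketched as follows. The cumulative signal is the running sum of the absolute one, and a cumulative prediction can be differenced back into absolute values before scoring. Converting by differencing is an assumption made for this illustration, and all numbers are invented.

```python
from itertools import accumulate
from typing import List


def to_cumulative(absolute: List[float]) -> List[float]:
    """Running sum of the absolute demand signal."""
    return list(accumulate(absolute))


def to_absolute(cumulative: List[float]) -> List[float]:
    """Difference consecutive cumulative values; the first step starts from zero."""
    previous = [0.0] + cumulative[:-1]
    return [c - p for c, p in zip(cumulative, previous)]


def mean_absolute_error(actual: List[float], predicted: List[float]) -> float:
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values (illustration only).
actual_absolute = [0.5, 1.0, 0.25, 2.0]
actual_cumulative = to_cumulative(actual_absolute)

# Pretend output of a model trained on the cumulative objective.
cumulative_prediction = [0.5, 1.5, 2.0, 4.0]
predicted_absolute = to_absolute(cumulative_prediction)

# Score the cumulative model on the absolute signal.
error_on_absolute = mean_absolute_error(actual_absolute, predicted_absolute)
```

A small error on the cumulative curve can still translate into a misplaced peak once the curve is differenced, which is one reason the two objectives are optimized differently.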

Personalised model: absolute vs cumsum, in total 40 single models trained


General model trained on 300 households


Summary: Mean of all household errors

Model | Absolute energy | Cumulative energy
Personalised model | 38,243 | 87,979
General model | 34,529 | 49,749

2. ARIMA vs DeepAR vs TFT

In the following experiment we compare the performance of 20 single-model ARIMAs (one per test household) with general models trained using DeepAR and the Temporal Fusion Transformer (TFT). All of these models are trained to forecast absolute energy demand.

Because TFT and DeepAR didn’t receive any kind of hyperparameter optimization, ARIMA is also trained with a default setting of (p, d, q) = (1, 1, 1). Just like the other models, ARIMA is given 7 days of context to predict the whole next day, after which the rolling metric is applied to evaluate performance.


As a general model, DeepAR consistently scores better than the Temporal Fusion Transformer. DeepAR took 3h20m to train, while TFT took about 2h10m of local training time.


Summary: Mean of all households

Model | ARIMA | TFT | DeepAR
Personalised model | 41,781 | 38,243 | (not tested)
General model | N.A. | 34,529 | 24,708

3. context/prediction length, multi-horizon optimization

Up until now we’ve always fixed the context length to 7 days to predict a horizon of 1 day. The next experiment shows the performance increase due to multi-horizon optimization. Again, since we want to evaluate general trends, the trained multi-horizon model was evaluated as is, on the first try. Context lengths from 1 day up to 7 days were considered, and the prediction horizon was optimized between 4 hours and 24 hours.
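In terms of 15-minute steps, a context of 7 days and a horizon of 1 day correspond to 672 and 96 readings, while a 4-hour horizon is 16 readings. The helper below is a hypothetical sketch of how such context/target windows can be cut from a series; the function name and the synthetic series are assumptions for illustration, not the code used in the experiments.

```python
from typing import List, Tuple

STEPS_PER_HOUR = 4                    # 15-minute readings
STEPS_PER_DAY = 24 * STEPS_PER_HOUR   # 96 readings per day


def make_window(
    series: List[float], end_of_context: int, context_days: int, horizon_hours: int
) -> Tuple[List[float], List[float]]:
    """Split a series into (context, target) around the index `end_of_context`."""
    context_len = context_days * STEPS_PER_DAY
    horizon_len = horizon_hours * STEPS_PER_HOUR
    start = end_of_context - context_len
    stop = end_of_context + horizon_len
    if start < 0 or stop > len(series):
        raise ValueError("window does not fit inside the series")
    return series[start:end_of_context], series[end_of_context:stop]


# Synthetic 10-day series; the values are just the step indices.
series = [float(i) for i in range(10 * STEPS_PER_DAY)]
split = 8 * STEPS_PER_DAY

# Fixed setting used so far: 7 days of context, 1 day of horizon.
fixed_context, fixed_target = make_window(series, split, context_days=7, horizon_hours=24)

# Shortest setting considered: 1 day of context, 4 hours of horizon.
short_context, short_target = make_window(series, split, context_days=1, horizon_hours=4)
```

A multi-horizon setup would train on windows across this whole range instead of a single fixed pair of lengths.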


Summary: Mean of all households

Model | Fixed context/horizon | Multi-horizon
Generalized model TFT | 34,529 | 30,153

Accelerated innovation

1. The smart meter

Now that the smart meter is aware of your energy consumption, it could recommend measures to optimize your energy efficiency, ultimately showing you how much you would save by acting differently.

A house battery could be another very efficient way of increasing self-consumption, but at this point its expected lifespan does not compensate for the investment. However, the technology is evolving fast. In fact, an accurate energy demand forecast could increase energy efficiency to such an extent that it becomes more profitable to switch to the new system immediately, leaving the reverse counter behind for good.

2. The smart grid

Additionally, a consumption forecast would be very useful for grid operators. It would enable them to build and optimize a smart control center that accurately anticipates user demand. When aggregated, the energy demand forecast provides information about when and where energy is needed, resulting in better control operations, better grid monitoring and less wasted energy.

The future of Time Series Forecasting

Energy demand forecasting is only the tip of the iceberg of the countless real world applications with a meaningful impact. We are seeing a lot of untapped potential in sectors like retail, healthcare and finance. Now that deep learning technology has proven to deliver a substantial increase in prediction performance, time series forecasting will surely be a force to be reckoned with in the future.