- Review on: A Hybrid Transfer Learning model for Short-term Electric Load Forecasting DOI:10.1007/s00202-020-00930-x
- Source: [1] Xu, X., Meng, Z. A hybrid transfer learning model for short-term electric load forecasting. Electr Eng (2020).
Literature Review by Q.Dang [qiyun.dang@mail.mcgill.ca]
Edited by D.Wu [di.wu5@mail.mcgill.ca]
Disclaimer: This website contains copyrighted material, and its use is not always specifically authorized by the copyright owner. Take all necessary steps to ensure that the information you receive from the post is correct and verified.
1. Paper Motivation
Ordinary transfer learning methods can introduce negative transfer into load forecasting, because time series prediction is not the same as a conventional regression problem.
To address this, the paper proposes a cross-location load prediction method based on transfer learning combined with seasonal decomposition of the time series.
- By seasonal decomposing, the authors mean removing the seasonal fluctuation pattern from the load profile before training the model, in order to improve model accuracy. Two power load datasets are used in the paper: one from Australia and one from the USA.
The Australian electricity market dataset covers five states of Australia and therefore a wide geographic area. In the GEFCom2012 dataset, the 20 zones in the USA appear to lie in a smaller geographic area, since the temperature patterns provided in the dataset are very similar.
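To make the idea of seasonal decomposing more concrete, here is a minimal Python sketch of a classical additive decomposition (load = trend + seasonal + irregular) that uses a centered moving average for the trend. It is an illustration only and not the authors' implementation: the additive form, the function name `decompose_additive`, the synthetic series, and the period of 4 are all assumptions made for this example.

```python
from statistics import mean

def decompose_additive(load, period):
    """Split a series into trend, seasonal and irregular parts (additive model)."""
    n = len(load)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2

    # Trend: centered moving average; None where the window does not fit.
    trend = [None] * n
    for t in range(half, n - half):
        window = load[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the window has period + 1 points, so the two
            # end points get half weight each.
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # Seasonal: average detrended value per position in the cycle,
    # shifted so that the seasonal effects sum to zero over one period.
    index = []
    for k in range(period):
        values = [load[t] - trend[t] for t in range(k, n, period)
                  if trend[t] is not None]
        index.append(mean(values))
    offset = mean(index)
    seasonal = [index[t % period] - offset for t in range(n)]

    # Irregular: whatever is left after removing trend and seasonal parts.
    irregular = [None if trend[t] is None else load[t] - trend[t] - seasonal[t]
                 for t in range(n)]
    return trend, seasonal, irregular

# Synthetic example (assumed data): linear trend plus a repeating pattern.
period = 4
pattern = [5.0, -5.0, 10.0, -10.0]
load = [100.0 + 0.5 * t + pattern[t % period] for t in range(24)]
trend, seasonal, irregular = decompose_additive(load, period)
```

On this noise-free synthetic series the irregular component is zero wherever the trend is defined. On real load data, the irregular component is the part that remains after the trend and seasonal pattern are removed.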
2. Methods
Main Algorithm: The two-stage TrAdaBoost.R2 with seasonal decomposing of time series.
The authors’ contribution is combining seasonal decomposing with the two-stage TrAdaBoost.R2 algorithm (Pardoe, 2010): https://www.cs.utexas.edu/~pstone/Papers/bib2html-links/ICML10-pardoe.pdf
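- The name two-stage TrAdaBoost.R2 can be read as TrAdaBoost (AdaBoost with transfer learning) combined with AdaBoost.R2 (AdaBoost for regression), run in two stages.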
- Algorithm lineage, in order of appearance:
  - AdaBoost (also known as ‘Adaptive Boosting’) (Freund, 1995)
  - AdaBoost.R (Freund, 1996)
  - AdaBoost.R2 (Drucker, 1997)
  - TrAdaBoost (Dai, 2007)
  - Two-stage TrAdaBoost.R2 Algorithm (Pardoe, 2010)
Comments: The “R” in AdaBoost.R stands for regression. AdaBoost was first used for classification problems (see figure 3) and was not originally designed for regression. AdaBoost.R2 is an improved version of AdaBoost.R.

With AdaBoost.R2, boosting can now be applied to regression problems.
AdaBoost is one of the first boosting algorithms to be adopted in practice. It combines multiple “weak classifiers” into a single “strong classifier”.
TrAdaBoost, in turn, refers to AdaBoost with transfer learning (“Tr”).
..as extending boosting-based learning algorithms (Freund & Schapire, 1997) in transfer learning, TrAdaBoost allows users to utilize a small amount of newly labeled data to leverage the old data to construct a high-quality classification model for the new data.
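As a rough illustration of the boosting idea behind this family of algorithms, the sketch below shows the instance re-weighting step of a single AdaBoost.R2 round with the linear loss, following the description in Drucker (1997). Instances that the current weak learner predicts well are down-weighted, so the next learner focuses on the harder ones. The function name `adaboost_r2_reweight` and the toy numbers are assumptions for this example. The transfer variants (TrAdaBoost and two-stage TrAdaBoost.R2) update source and target instance weights differently and are not shown here.

```python
def adaboost_r2_reweight(weights, y_true, y_pred):
    """One AdaBoost.R2 re-weighting step with the linear loss.

    Returns (new_weights, beta); beta is None when no update is made.
    """
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    max_err = max(abs_err)
    if max_err == 0:
        # Perfect predictions: nothing to re-weight.
        return list(weights), None

    total = sum(weights)
    probs = [w / total for w in weights]
    # Linear loss in [0, 1]: error relative to the largest error.
    losses = [e / max_err for e in abs_err]
    avg_loss = sum(p * l for p, l in zip(probs, losses))
    if avg_loss <= 0.0 or avg_loss >= 0.5:
        # AdaBoost.R2 stops boosting once the average loss reaches 0.5;
        # this sketch simply returns the weights unchanged in that case.
        return list(weights), None

    # beta < 1 measures confidence in the current weak learner.
    beta = avg_loss / (1.0 - avg_loss)
    # Low-loss instances are multiplied by roughly beta (shrunk),
    # high-loss instances by roughly 1 (kept).
    updated = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
    norm = sum(updated)
    return [w / norm for w in updated], beta

# Toy example (assumed data): only the last instance is predicted badly.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
new_weights, beta = adaboost_r2_reweight(weights, y_true, y_pred)
```

In this toy case the badly predicted instance ends up with half of the total weight, while the three well-predicted instances share the other half.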
- So, why use transfer learning to build a load prediction model?
According to the authors, building a reliable electricity load forecasting model requires a large amount of data, which is not easy to acquire. With transfer learning, data from related locations can be applied to the target location.
3. Paper structure
Present the proposed improved transfer learning algorithm first ⇨ Trace the algorithm’s history (two-stage TrAdaBoost.R2 is based on AdaBoost.R2) ⇨ Introduce the new feature (seasonal trend decomposition) ⇨ Compare time series prediction results against 4 benchmark methods
The 4 benchmark models are briefly described as follows (descriptions from [1]):
- Gradient boosting decision tree (GBDT): GBDT is a traditional machine learning technique, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.
- AdaBoost regression: The approach described in Algorithm 1 is applied to the original dataset directly, without transfer learning.
- Two-stage TrAdaBoost.R2: Two-stage TrAdaBoost.R2, described in Algorithm 2, is applied to the original dataset directly, without seasonal decomposition of the time series.
- GBDT with seasonal decomposing: Unlike the proposed hybrid model, after seasonal decomposition of the time series, all three decomposed components are predicted by the GBDT model directly instead of by the transfer learning approach.
4. Summarization
We can roughly see that the red curve (the authors’ method) captures more details of the blue curve than the green curve does. Zone 20 is located in the USA.

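Comment: The authors claim that the model is also better when measured by mean absolute percentage error (MAPE), not only by visual inspection of the figure above. MAPE expresses the average absolute error as a percentage of the actual value and is used to evaluate the performance of the different models. In its standard form, MAPE is calculated as follows: MAPE = (100% / n) × Σ |y_t − ŷ_t| / |y_t|, where y_t is the actual load, ŷ_t is the forecast, and n is the number of forecast points.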
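For readers who want to reproduce this kind of score on their own forecasts, the following is a small Python sketch of the standard MAPE definition. The function name `mape` and the example numbers are assumptions for illustration and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |actual - forecast| / |actual| over all points.
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

# Toy example (assumed data): errors of 10%, 10% and 0%.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
score = mape(actual, forecast)
print(f"MAPE = {score:.2f}%")
```

Because each error is divided by the actual value, MAPE is scale-independent, which makes it convenient for comparing zones with very different load levels. It is undefined whenever an actual value is zero.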
Table 1. MAPE performance comparison of the proposed method and the 4 benchmark methods
Source: [1] Xu, X., Meng, Z. A hybrid transfer learning model for short-term electric load forecasting. Electr Eng (2020).
Target zone ID (source zone ID) | GBDT (%) | AdaBoost Regressor (%) | Two-stage TrAdaBoost.R2 (%) | GBDT with seasonal decomposing (%) | Proposed method (%) |
---|---|---|---|---|---|
11 (1) | 25.09 | 23.54 | 22.63 | 20.68 | **17.53** |
1 (11) | 25.09 | 24.86 | 24.43* | 21.11 | **18.01** |
11 (12) | 25.13 | 23.47 | 24.01* | 21.04 | **20.99** |
12 (11) | 25.02 | 24.34 | 25.41* | 23.03 | **19.80** |
12 (20) | 24.76 | 23.89 | 24.12* | 23.64 | **20.50** |
20 (12) | 19.44 | 20.82 | 19.35* | 16.56 | **16.52** |
12 (15) | 25.05 | 24.14 | 24.18* | 24.21 | **21.06** |
15 (12) | 22.12 | 22.02 | 22.74* | 19.01 | **18.96** |
11 (20) | 25.02 | 23.28 | 23.21 | 21.04 | **18.41** |
20 (11) | 19.43 | 20.64 | 20.08* | 18.23 | **16.67** |
- Negative transfers are marked with ‘*’, and the best performance in each row is highlighted in boldface.
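To get a quick overall picture of Table 1, the Python sketch below averages the MAPE of each method over the ten target/source pairs and checks whether the last column (the paper's proposed method) is the best in every row. The numbers are copied from Table 1. The helper name `column_means` and the short label "Proposed" are assumptions for this example, and a plain average over rows is only a rough summary that the paper itself does not necessarily report.

```python
# MAPE values (%) copied from Table 1, keyed by "target (source)" zone IDs.
METHODS = (
    "GBDT",
    "AdaBoost Regressor",
    "Two-stage TrAdaBoost.R2",
    "GBDT with seasonal decomposing",
    "Proposed",  # last column of Table 1
)
TABLE_1 = {
    "11 (1)": (25.09, 23.54, 22.63, 20.68, 17.53),
    "1 (11)": (25.09, 24.86, 24.43, 21.11, 18.01),
    "11 (12)": (25.13, 23.47, 24.01, 21.04, 20.99),
    "12 (11)": (25.02, 24.34, 25.41, 23.03, 19.80),
    "12 (20)": (24.76, 23.89, 24.12, 23.64, 20.50),
    "20 (12)": (19.44, 20.82, 19.35, 16.56, 16.52),
    "12 (15)": (25.05, 24.14, 24.18, 24.21, 21.06),
    "15 (12)": (22.12, 22.02, 22.74, 19.01, 18.96),
    "11 (20)": (25.02, 23.28, 23.21, 21.04, 18.41),
    "20 (11)": (19.43, 20.64, 20.08, 18.23, 16.67),
}

def column_means(table):
    """Average MAPE per method over all rows of the table."""
    sums = [0.0] * len(METHODS)
    for row in table.values():
        for j, value in enumerate(row):
            sums[j] += value
    return {m: s / len(table) for m, s in zip(METHODS, sums)}

means = column_means(TABLE_1)
# True if the last column has the lowest MAPE in every row.
best_everywhere = all(min(row) == row[-1] for row in TABLE_1.values())

for method, value in means.items():
    print(f"{method}: {value:.2f}%")
print("Proposed method best in every row:", best_everywhere)
```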
The contribution of this paper is summarized as follows:
The paper developed a hybrid transfer learning method for cross-location short-term load forecasting, which integrates a time series decomposition technique with a two-stage transfer regression approach.
- Notes:
(1) After decomposing the electric load data, the trend and seasonal components are handled by a standard machine learning approach, and the two-stage transfer regression model (two-stage TrAdaBoost.R2) is built on the irregular component.
(2) According to the authors, the proposed hybrid model effectively avoids negative transfer.
By negative transfer, the authors mainly mean the rows of Table 1 in which ‘Two-stage TrAdaBoost.R2’ yields a worse (higher) MAPE than other columns, that is, cases where transferring source-zone data hurts rather than helps.
(3) The proposed model is evaluated on two real-world datasets.
One dataset includes electric load data from 20 zones of the USA, which are relatively close to each other geographically. The other dataset includes electric load data from 5 states of Australia, which represent a wide area.
In summary, it is impressive that the proposed model achieves better prediction accuracy on both datasets (electric load in the USA and Australia). As the authors suggest, this demonstrates the scalability of the proposed model.
5. Suggested future work
- Try a similar transfer learning approach that keeps the core algorithm, “Two-stage TrAdaBoost.R2 Algorithm (Pardoe, 2010)”, but combines it with a different signal decomposition procedure (other than seasonal trend decomposition). Such a hybrid model may also yield good forecasting results.
- Apply the algorithm proposed in the paper to other interesting problems that involve time series forecasting, such as real-time electricity price forecasting.
Appendix
- Useful training dataset download links (1 & 2):
(1 of 2) The Global Energy Forecasting Competition (GEFCom2012) datasets by Kaggle. https://doi.org/10.1016/j.ijforecast.2013.07.001

(2 of 2) Australian Energy Market Operator (AEMO) Aggregated price and demand data. https://aemo.com.au/energy-systems/electricity/national-electricity-market-nem/data-nem/aggregated-data
Fig.A.2. Screenshot of dataset 2 archive
————————–
<End of Review>