This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Forecasting Recessions in a Big Data Environment
Open Access


Subject / Keyword
Bayesian Model Averaging
Machine Learning
Big Data
Dynamic Factor Analysis
Type of item
Degree grantor
University of Alberta
Author or creator
Sties, Max
Supervisor and department
Sebastian Fossati (Economics)
Examining committee member and department
Beyza Ural Marchand (Economics)
Dmytro Hryshko (Economics)
Saraswata Chaudhuri (Economics)
Denise Young (Economics)
Haifang Huang (Economics)
Sebastian Fossati (Economics)
Department of Economics

Date accepted
Graduation date
2017-11:Fall 2017
Degree
Doctor of Philosophy
Abstract
This thesis examines the predictability of Canadian recessions, with special emphasis on variable selection in a big data environment. The first paper addresses variable selection from a traditional point of view, employing a prescreened set of individual variables as well as data aggregation via factor analysis. Dynamic factors are estimated from panels of macroeconomic time series for Canada and the US. The factors are derived from financial, stock market, and real activity indicators for both countries. The predictive power of these factors is compared with that of observed data. Additionally, the predictive content of US versus domestic data is evaluated. Results show that factor-augmented probit regressions outperform models based solely on observed data, with a real-activity factor performing particularly well at short forecast horizons. Further, while US interest rate spreads are consistently part of the best-performing models at longer forecast horizons, there is little gain in predictive accuracy from adding US data.

The second paper uses modern machine learning techniques that allow for a much larger set of candidate variables. Logistic lasso and gradient boosting perform variable selection and model estimation simultaneously, thus making variable prescreening obsolete. The algorithms identify new leading indicators of recessions and provide evidence of structural instability in the forecasting model. I find that variables from the US labour and housing markets best complement Canadian yield spreads as short-term indicators, particularly during the 2008/2009 recession, when yield spreads lose predictive power. Longer-term forecasts are dominated by Canadian yield spreads and other financial indicators. US yield spreads and variables from the Canadian oil and gas sector do not hold predictive power at any forecast horizon.
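The abstract notes that logistic lasso performs variable selection and model estimation simultaneously. The following is a minimal sketch of that idea, not the thesis's implementation: it fits an L1-penalized logistic regression by proximal gradient descent on purely synthetic data. The variable names (yield_spread, candidate_2, and so on), the data-generating process, the penalty value, and the step size are all illustrative assumptions and are not taken from the thesis.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrinks toward zero and
    # returns exactly zero when |value| <= threshold.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_logistic_lasso(X, y, penalty, step=0.5, n_iter=300):
    # Proximal gradient descent on the average logistic loss plus
    # penalty * sum(|weights|). The intercept is left unpenalized.
    n, p = len(X), len(X[0])
    weights, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        residuals = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - target
            for row, target in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        grad_w = [sum(r * row[j] for r, row in zip(residuals, X)) / n
                  for j in range(p)]
        intercept -= step * grad_b
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return intercept, weights


# Synthetic data (assumption): only the first candidate variable drives
# the binary "recession" outcome; the remaining candidates are pure noise.
rng = random.Random(0)
names = ["yield_spread", "candidate_2", "candidate_3", "candidate_4", "candidate_5"]
X, y = [], []
for _ in range(400):
    row = [rng.gauss(0.0, 1.0) for _ in names]
    latent = -row[0] + 0.5 * rng.gauss(0.0, 1.0)  # low spread raises recession risk
    X.append(row)
    y.append(1 if latent > 0 else 0)

intercept, weights = fit_logistic_lasso(X, y, penalty=0.2)

# Variables with a nonzero coefficient are the ones the lasso "selects".
selected = [name for name, w in zip(names, weights) if w != 0.0]
print("selected:", selected)
print("weights:", [round(w, 3) for w in weights])
```

In this sketch the soft-thresholding step is what sets uninformative coefficients to exactly zero, so the fitted model and the selected variable set come out of the same estimation pass. In practice a library implementation with a cross-validated penalty would normally be used instead of a hand-written loop.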
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication

File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 10787322 bytes
Last modified: 2017-11-08 17:57:56-07:00
Filename: Sties_Max_201708_PhD.pdf
Original checksum: 2b23e2d4455efaeade5cf87afce151e6