
Canadian Unemployment Forecasting (Macro ML)

Forecasting Canadian unemployment using macroeconomic indicators (StatCan, Bank of Canada, FRED) and exploring how monetary policy influences short-term labor market dynamics.

OVERVIEW

Research question: Can Canadian unemployment be reliably forecast using macroeconomic indicators, and how does monetary policy influence short-term labor market dynamics?

Using monthly data from 2009 to 2024 (177 months after preprocessing), I built seven forecasting models: baselines (Persistence, Historical Mean), statistical (ARIMA, VAR), and machine learning (Random Forest, XGBoost, Ridge). Main result: Persistence and Ridge achieve the best performance (RMSE ~0.23–0.24, MAPE ~2.7–3.6%). A scenario simulation with a Ridge model trained without the unemployment lag shows that rate cuts are associated with higher predicted unemployment over 12 months, and rate hikes with lower—consistent with delayed monetary policy transmission.

DATA & METHODS

Data sources: Statistics Canada (unemployment, employment, CPI, GDP), Bank of Canada (overnight rate, 10-year bond yield), FRED (CAD/USD exchange rate, WTI oil price). Target variable: seasonally adjusted monthly unemployment rate.

Train/test: Chronological 80/20 split—train 2010-04 to 2021-12 (141 months), test 2022-01 to 2024-12 (36 months). Features include levels, lags, changes, and rolling measures; two leaky features (unemp_change_1m, unemp_rolling_12m) were excluded.
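As a rough illustration of the split described above, the sketch below generates 177 monthly labels starting at 2010-04 and cuts them in time order, without shuffling. The helper names `month_range` and `chronological_split` are hypothetical and not taken from the project code; the real pipeline presumably operates on a full feature table rather than on bare month labels.

```python
def month_range(start_year, start_month, n_months):
    """Return n_months consecutive month labels as 'YYYY-MM' strings."""
    labels = []
    year, month = start_year, start_month
    for _ in range(n_months):
        labels.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def chronological_split(rows, train_frac=0.8):
    """Split time-ordered rows without shuffling: earliest rows train, latest rows test."""
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]


# 177 usable months after preprocessing, starting at 2010-04 (figures from the text).
months = month_range(2010, 4, 177)
train, test = chronological_split(months)

print(train[0], train[-1], len(train))  # 2010-04 2021-12 141
print(test[0], test[-1], len(test))     # 2022-01 2024-12 36
```

Cutting by position rather than sampling at random keeps every test month strictly later than every training month, which is what makes the evaluation a genuine out-of-sample forecast.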

Models: Persistence (next month = this month), Historical Mean, ARIMA(1,0,2), VAR(2), Random Forest, XGBoost, Ridge (alpha=1, standardized features). For scenario simulation, a separate Ridge model was trained without unemp_lag_1m to isolate macro→unemployment links under hypothetical BoC rate paths.
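To make the two best-performing models concrete, here is a minimal sketch of a persistence forecast and of ridge regression on standardized features with alpha=1. It uses a closed-form NumPy solution rather than whatever library the project actually used, and the data is synthetic, generated only for illustration. The function names `persistence_forecast`, `fit_ridge`, and `predict_ridge` are assumptions.

```python
import numpy as np


def persistence_forecast(series):
    """Forecast each month as the previous month's value.

    Returns forecasts aligned with series[1:].
    """
    series = np.asarray(series, dtype=float)
    return series[:-1]


def fit_ridge(X_train, y_train, alpha=1.0):
    """Closed-form ridge on standardized features with an unpenalized intercept."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    Z = (X_train - mu) / sigma
    y_mean = y_train.mean()
    # Solve (Z'Z + alpha * I) w = Z'(y - mean(y)).
    A = Z.T @ Z + alpha * np.eye(Z.shape[1])
    w = np.linalg.solve(A, Z.T @ (y_train - y_mean))
    return {"mu": mu, "sigma": sigma, "w": w, "intercept": y_mean}


def predict_ridge(model, X):
    """Apply the training-set scaling, then the fitted linear model."""
    Z = (np.asarray(X, dtype=float) - model["mu"]) / model["sigma"]
    return Z @ model["w"] + model["intercept"]


# Synthetic data for illustration only (not the project's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(141, 3))
y = 6.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.05, size=141)

model = fit_ridge(X, y, alpha=1.0)
print(predict_ridge(model, X[:3]))
print(persistence_forecast([6.0, 6.1, 6.3]))
```

The scaling statistics come from the training rows only and are reused at prediction time, so no information from the test period enters the fit.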

Correlation matrix between macroeconomic variables

RESULTS

Persistence and Ridge outperform all other models, with Persistence slightly ahead of Ridge on every metric. Historical Mean is by far the worst (RMSE 1.65). The more complex models (VAR, Random Forest, XGBoost) do not improve on the simpler approaches at this 1-month horizon.

Model              RMSE     MAE      MAPE (%)
Persistence        0.233    0.156     2.71
Ridge              0.245    0.201     3.57
ARIMA              0.338    0.236     4.26
VAR                0.455    0.364     6.56
Random Forest      0.476    0.391     7.32
XGBoost            0.479    0.421     7.66
Historical Mean    1.645    1.531    28.38
Model performance comparison (RMSE, MAE, MAPE)
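For reference, the three error metrics in the comparison can be computed as in the sketch below. These are the standard definitions, with MAPE expressed in percent; the toy values are made up for illustration and are not project data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Toy unemployment rates in percent, for illustration only.
actual = [5.0, 5.2, 5.1]
predicted = [5.0, 5.0, 5.2]

print(f"RMSE={rmse(actual, predicted):.3f}")
print(f"MAE={mae(actual, predicted):.3f}")
print(f"MAPE={mape(actual, predicted):.2f}%")
```

RMSE penalizes large misses more heavily than MAE, so a model can rank differently on the two metrics if its errors are unevenly distributed.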

Scenario simulation (12-month horizon)

Predicted unemployment under five BoC rate scenarios (Ridge without unemp_lag_1m). This step asks how predicted unemployment would change if the Bank of Canada followed different rate paths, holding the other inputs fixed. In the model, rate cuts are associated with higher predicted unemployment and rate hikes with lower, at roughly 0.4 percentage points of unemployment per 100 bp at every horizon shown. This is directionally consistent with policy timing and macro links, but it is an association in the fitted model, not a causal claim.

Scenario            Rate change    New rate (%)    Month 1    Month 6    Month 12
Rate cut -200bp     −2.0 pp        1.43            7.40       7.45       7.44
Rate cut -100bp     −1.0 pp        2.43            7.00       7.03       7.02
Status quo           0.0 pp        3.43            6.60       6.60       6.60
Rate hike +100bp    +1.0 pp        4.43            6.20       6.17       6.18
Rate hike +200bp    +2.0 pp        5.43            5.79       5.74       5.75
Scenario simulation: 12-month unemployment trajectories
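Because the scenario model is linear, the table can be summarized by a single implied sensitivity per horizon. The sketch below fits a least-squares slope to the predicted values copied from the scenario table. The helper `ols_slope` is illustrative, and the slopes describe the fitted model's behaviour, not a causal effect.

```python
# Values copied from the scenario table.
rate_change_pp = [-2.0, -1.0, 0.0, 1.0, 2.0]
predicted = {
    "Month 1": [7.40, 7.00, 6.60, 6.20, 5.79],
    "Month 6": [7.45, 7.03, 6.60, 6.17, 5.74],
    "Month 12": [7.44, 7.02, 6.60, 6.18, 5.75],
}


def ols_slope(x, y):
    """Least-squares slope of y on x (simple one-variable regression)."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    num = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    den = sum((xi - x_mean) ** 2 for xi in x)
    return num / den


slopes = {horizon: ols_slope(rate_change_pp, y) for horizon, y in predicted.items()}
for horizon, slope in slopes.items():
    print(f"{horizon}: {slope:+.3f} pp unemployment per +1 pp policy rate")
# Month 1: -0.402, Month 6: -0.428, Month 12: -0.422
```

The near-identical slopes across horizons show that the trajectories are almost flat after month 1: most of the scenario effect comes from the initial level shift rather than from dynamics that build over the year.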

CONCLUSION

Overall, the project shows that Canadian unemployment is predictable at a 1-month horizon, but mainly because the series itself is highly persistent. Macro indicators add at most incremental signal on top of “next month ≈ this month”: Ridge, which uses them, comes close to Persistence (RMSE 0.245 vs 0.233) but does not beat it on the test set. A separate scenario module reuses the forecasting setup to ask policy questions about different interest-rate paths, providing directional guidance on how predicted unemployment might evolve under alternative decisions.
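One way to read the persistence result is to express each model's RMSE as a multiple of the Persistence RMSE. The sketch below does this arithmetic on the RMSE values reported in the results table; a ratio above 1 means the model is less accurate than simply carrying the last observation forward.

```python
# RMSE values copied from the model comparison table.
rmse_by_model = {
    "Persistence": 0.233,
    "Ridge": 0.245,
    "ARIMA": 0.338,
    "VAR": 0.455,
    "Random Forest": 0.476,
    "XGBoost": 0.479,
    "Historical Mean": 1.645,
}

baseline = rmse_by_model["Persistence"]
relative = {name: value / baseline for name, value in rmse_by_model.items()}

for name, ratio in sorted(relative.items(), key=lambda kv: kv[1]):
    print(f"{name:<16} {ratio:.2f}x")
```

On these figures Ridge sits about 5% above the baseline, while the tree-based models and VAR are roughly twice as far off as Persistence.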

TAKEAWAYS & LIMITATIONS

  • Among the seven models tested, Persistence and Ridge provide the best balance of accuracy and interpretability, while the more complex models (VAR, Random Forest, XGBoost) do not deliver clear gains at this horizon.
  • Unemployment is very sticky—today’s rate is already a strong predictor of next month—so feature engineering and model choice matter less than making sure the baseline persistence structure is handled correctly.
  • The scenario model is best viewed as a structured “what if” tool layered on top of the forecasting pipeline: it translates hypothetical rate paths into plausible unemployment trajectories, but should not be interpreted as a causal estimate.
  • Key limitations include a relatively short sample (177 months), possible structural breaks, and the need to extrapolate beyond the historical rate range in some scenarios—so results should be read as directional, not precise forecasts.
  • Future work could add richer labour‑market data (e.g. vacancies, sectoral detail), extend to multi‑step horizons, and benchmark against professional forecasts or alternative macro models.