This repository aims to predict future movements in the value of the Canadian dollar versus the Japanese yen.
Tools utilised include:
Relevant data for the analysis is contained in cad_jpy.csv.
In this notebook, I loaded historical Canadian Dollar-Yen exchange rate futures data and applied time series analysis and modeling to determine whether there is any predictable behavior.
This is to check for any long or short term patterns.
There was a significant decline in the JPY price relative to CAD in the 1990s. We see another significant decline for JPY at the time of the 2009 financial crisis.
See graph below:
Hodrick-Prescott Filter was used to decompose the exchange rate price into trend and noise.
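As a rough illustration of what the filter does, the sketch below solves the Hodrick-Prescott problem directly in plain Python: it finds the trend that stays close to the price while penalising changes in its slope. The function name `hp_filter`, the sample prices, and the smoothing value `lamb=1600` (a conventional default for quarterly data) are assumptions for illustration, not values taken from the notebook, which may well call a library routine instead.

```python
def hp_filter(y, lamb=1600.0):
    """Return (trend, noise) from a Hodrick-Prescott decomposition of y.

    The trend minimises sum((y - trend)^2) + lamb * sum(second_diff(trend)^2),
    which has the closed form (I + lamb * D'D) * trend = y.
    """
    n = len(y)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # Build A = I + lamb * D'D, where D is the (n-2) x n second-difference operator.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    for r in range(n - 2):
        cols = (r, r + 1, r + 2)
        coef = (1.0, -2.0, 1.0)
        for ci, vi in zip(cols, coef):
            for cj, vj in zip(cols, coef):
                A[ci][cj] += lamb * vi * vj
    # Solve A * trend = y by Gaussian elimination (A is symmetric positive definite).
    b = [float(v) for v in y]
    for k in range(n):
        pivot = A[k][k]
        for i in range(k + 1, n):
            f = A[i][k] / pivot
            if f != 0.0:
                for j in range(k, n):
                    A[i][j] -= f * A[k][j]
                b[i] -= f * b[k]
    # Back-substitution on the upper-triangular system.
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * trend[j] for j in range(i + 1, n))
        trend[i] = s / A[i][i]
    noise = [yi - ti for yi, ti in zip(y, trend)]
    return trend, noise


# Hypothetical prices, for illustration only.
prices = [80.1, 80.4, 79.9, 80.8, 81.2, 80.7, 81.5, 82.0]
trend, noise = hp_filter(prices)
```

The dense solve is only practical for short series; it is meant to show the decomposition, not to replace an optimised implementation.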
We see a JPY peak in mid-2015, and lows in late 2016 and 2020.
See a plot of price vs trend for 2015 to present below (note that price is blue, trend is orange):
See a plot of the settle noise below:
I created an ARMA model and fit it to the returns data. The AR and MA ("p" and "q") parameters were set to p=2 and q=1.
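To make the p=2, q=1 structure concrete, here is a minimal sketch of how an ARMA(2,1) produces multi-step forecasts once its coefficients are known. It assumes the intercept form r_t = const + phi1*r_(t-1) + phi2*r_(t-2) + theta1*e_(t-1) + e_t; the function name and all numeric values are hypothetical, and fitting the coefficients is left to whatever library the notebook uses.

```python
def arma21_forecast(returns, residuals, const, phi1, phi2, theta1, steps=5):
    """Forecast `steps` values ahead from an ARMA(2,1) with known coefficients."""
    r_prev1, r_prev2 = returns[-1], returns[-2]  # two AR lags (p=2)
    eps_prev = residuals[-1]                     # one MA lag (q=1)
    forecasts = []
    for _ in range(steps):
        r_hat = const + phi1 * r_prev1 + phi2 * r_prev2 + theta1 * eps_prev
        forecasts.append(r_hat)
        # Roll the lags forward; the forecast stands in for the unknown return.
        r_prev1, r_prev2 = r_hat, r_prev1
        # Future shocks are unknown and have an expected value of zero.
        eps_prev = 0.0
    return forecasts


# Hypothetical inputs, for illustration only.
example = arma21_forecast(
    returns=[0.2, 0.1], residuals=[0.05],
    const=0.0, phi1=0.5, phi2=0.25, theta1=0.5, steps=5,
)
```

Note that the moving-average term only affects the first step, after which the forecast is driven by the autoregressive lags alone.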
The ARMA Model results are set out below:
I used the p-values to judge whether the model is a good fit. Some of the coefficients have p-values above 0.05, so those terms are not statistically significant: the one-lag term is significant, but the two-lag term is not. On this basis, the model is not necessarily a good fit.
The 5 day returns forecast for the ARMA model is below:
This indicates negative returns in the near term.
Using the raw CAD/JPY exchange rate price, I estimated an ARIMA model.
I set P=5, D=1, and Q=1 in the model, where:
- P = number of auto-regressive lags
- D = number of differences (this is usually 1)
- Q = number of moving average lags
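The role of D=1 can be sketched in a few lines: the model works on day-to-day price changes rather than on the price itself, and forecasts of those changes are added back to the last observed price. The helper names and numbers below are hypothetical and only illustrate the differencing step, not the full ARIMA fit.

```python
def difference(prices):
    """First difference (D=1): change from one observation to the next."""
    return [b - a for a, b in zip(prices, prices[1:])]


def undifference(last_price, diff_forecasts):
    """Turn forecast changes back into price levels by cumulative summation."""
    levels = []
    level = last_price
    for change in diff_forecasts:
        level += change
        levels.append(level)
    return levels


# Hypothetical prices and forecast changes, for illustration only.
prices = [80.0, 80.5, 80.25]
changes = difference(prices)
price_forecast = undifference(prices[-1], [-0.25, -0.25])
```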
The ARIMA Model results are set out below:
Because every coefficient's p-value is above 0.05, none of the model's terms is statistically significant.
The 5-day exchange rate price forecast for the ARIMA model is below. It predicts a decrease in the JPY price in the short term:
I used GARCH to predict near-term volatility of Japanese Yen exchange rate returns. Being able to accurately predict volatility can be extremely useful if you want to trade in derivatives or quantify maximum loss.
I set the parameters of the GARCH model to p=2 and q=1.
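The sketch below shows the variance recursion behind a GARCH(2,1) forecast, assuming p counts the lagged squared shocks (the alpha terms) and q counts the lagged variances (the beta term), which fits the alpha(2) coefficient mentioned in the results. The function name and every numeric value are hypothetical; in practice the coefficients come from the fitted model.

```python
import math


def garch21_forecast(resid, sigma2_last, omega, alpha1, alpha2, beta1, steps=5):
    """Forecast conditional volatility `steps` periods ahead for a GARCH(2,1).

    Variance recursion:
    sigma2_t = omega + alpha1 * e_(t-1)^2 + alpha2 * e_(t-2)^2 + beta1 * sigma2_(t-1)
    """
    e2_prev1, e2_prev2 = resid[-1] ** 2, resid[-2] ** 2
    s2_prev = sigma2_last
    vols = []
    for _ in range(steps):
        s2 = omega + alpha1 * e2_prev1 + alpha2 * e2_prev2 + beta1 * s2_prev
        vols.append(math.sqrt(s2))
        # Beyond today, the expected squared shock equals the forecast variance.
        e2_prev1, e2_prev2 = s2, e2_prev1
        s2_prev = s2
    return vols


# Hypothetical residuals and coefficients, for illustration only.
vols = garch21_forecast(
    resid=[1.0, 2.0], sigma2_last=1.0,
    omega=0.5, alpha1=0.25, alpha2=0.125, beta1=0.5, steps=5,
)
```

A rising sequence in `vols` corresponds to a forecast of increasing volatility.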
See the GARCH Model results below:
P-values for GARCH volatility forecasts tend to be much lower than those for ARMA/ARIMA return and price forecasts. Here, all p-values are below 0.05 except for alpha(2), indicating a much better fit overall. In practice, in financial markets, it's easier to forecast volatility than it is to forecast returns or prices.
The 5 day volatility forecast based on the GARCH Model is below:
The GARCH Model predicts that volatility will increase in the near term.
Based on this time series analysis, I would not buy yen now, as the forecast indicates negative returns, as well as increased volatility.
Risk of the yen is expected to increase, given that volatility is predicted to increase in the near term.
When it comes to using these models for trading, GARCH is the best of the models, given two of the p-values are below 0.05. The ARIMA and ARMA models have a number of p-values above 0.05, so I would be less confident using these for trading.
In this notebook, I built an SKLearn linear regression model to predict Yen futures ("settle") returns with lagged CAD/JPY exchange rate returns.
I created a series of "Price" percentage returns, dropped any NaNs, and created a lagged return using the shift function.
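In plain Python, the same preparation might look like the sketch below. It assumes returns are expressed in percent (multiplied by 100) and represents each row as a (lagged return, return) pair; both choices, and the helper names, are illustrative. With pandas, `pct_change()` and `shift()` play the corresponding roles.

```python
def pct_returns(prices):
    """Percentage change between consecutive prices.

    The first price has no predecessor, so it yields no return (the dropped NaN).
    """
    return [(b - a) / a * 100 for a, b in zip(prices, prices[1:])]


def with_lag(returns):
    """Pair each return with the previous one, dropping the row with no lag."""
    return [(prev, cur) for prev, cur in zip(returns, returns[1:])]


# Hypothetical prices, for illustration only.
returns = pct_returns([100.0, 110.0, 99.0, 99.0])
rows = with_lag(returns)  # each row is (lagged_return, return)
```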
I then created a train/test split for the data using 2018-2019 for testing and the rest for training.
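A date-based split along these lines can be sketched as follows. The row layout (date, lagged return, return) and the function name are assumptions for illustration.

```python
from datetime import date


def split_by_year(rows, test_years=(2018, 2019)):
    """Split rows whose first element is a date into (train, test) by year."""
    first, last = test_years
    test = [r for r in rows if first <= r[0].year <= last]
    train = [r for r in rows if not (first <= r[0].year <= last)]
    return train, test


# Hypothetical rows of (date, lagged_return, return), for illustration only.
rows = [
    (date(2017, 12, 29), 0.10, -0.20),
    (date(2018, 1, 2), -0.20, 0.30),
    (date(2019, 12, 31), 0.30, 0.05),
    (date(2020, 1, 2), 0.05, -0.10),
]
train, test = split_by_year(rows)
```

If the data extends beyond 2019, "the rest" includes rows that come after the test period; a strictly chronological split would keep those out of training.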
I created a Linear Regression model and fit it to the training data.
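With a single predictor (the lagged return), an ordinary least-squares fit reduces to a slope and an intercept. The sketch below computes them in closed form rather than through scikit-learn, so it is a stand-in for the fitted model, not the notebook's code; the names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to new feature values."""
    return [intercept + slope * xi for xi in x]


# Hypothetical training data, for illustration only.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
predictions = predict([4.0, 5.0], slope, intercept)
```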
I evaluated the model using data that it has never seen before.
See below a plot of the first 20 predictions (orange) versus the true values for returns (blue):
I evaluated the model using "out-of-sample" data. Out-of-sample data is data that the model hasn't seen before (testing data).
I evaluated the model using "in-sample" data. In-sample data is data that the model was trained on (training data).
The model performs better on the out-of-sample data as compared to the in-sample data. This is indicated by the lower RMSE for the out-of-sample data.
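For reference, the RMSE behind this comparison is the square root of the mean squared difference between actual and predicted returns. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values, for illustration only.
example_rmse = rmse([0.0, 0.0], [3.0, 3.0])
```

Computing this once on the training rows and once on the testing rows gives the in-sample and out-of-sample figures; the lower of the two indicates the better fit on that data.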