Multifactorial Heath-Jarrow-Morton model using principal component analysis

ABSTRACT


INTRODUCTION
In recent decades, models for valuing financial instruments sensitive to interest rates have developed considerably. Identifying the characteristics that set these models apart is useful for the professionals who implement them. The financial literature offers a range of models [1]-[6] that address the complexities of describing the behavior of the short rate. These short-rate models do not aim exclusively to produce accurate forecasts but rather to explain, to a great extent, the statistical properties of market behavior [2], [3]. As stated in [7], short-rate models assess the following characteristics: trend, mean reversion, skewness, kurtosis, heavy tails, confidence intervals, occurrence probabilities and average prices, among others. Many such models have been proposed in both the discrete and the continuous case, sometimes based on prior knowledge and sometimes on methodologies that allow specific characteristics of the process to be inferred [6], [7]. Time series models (in the discrete case) and stochastic differential equations (in the continuous case) have been the most widely used methodologies for modelling and analyzing interest rate processes. Additionally, short-rate models, which are mostly driven by a single source of uncertainty, can be insufficient to describe the complete evolution of the term structure under certain circumstances, as suggested in [6], [8]. To overcome this limitation, alternatives such as multifactor models [2]-[4], [6], [9], [10] have been proposed, which can provide a more realistic description of the behavior of the term structure of interest rates. Studies of this type typically consider two or three factors. This study addresses the three-factor case.

Because single-factor models perform poorly in capturing the empirical features that explain the dynamics of the short rate [2], [4], the Heath-Jarrow-Morton (HJM) model emerged; it expands and generalizes previous models by allowing multiple factors in the interest rate dynamics. Furthermore, this model allows volatility to change over time and therefore requires more complex specifications than previous functional forms. The advantage of the HJM model over short-rate models is that it automatically fits the yield curve, whereas short-rate models require additional estimates, as [3], [7] suggest. However, the HJM model is one of the most complex models to implement, because the principal factors or components (PCs) used in the model must first be found. These factors can be determined by applying principal component analysis (PCA) to the underlying term structure of interest rates.
In this study, we propose implementing the three-factor HJM interest rate model using an approach that integrates PCA and Monte Carlo simulation (MCS). By integrating PCA and MCS with the multifactor HJM model, we capture the PCs driving the evolution of short-term interest rates and provide a framework for deriving spot interest rates through parameter calibration and forward rate estimation in the United States (US) market. This is crucial for risk management and investment decision-making. To do that, we extract the PCs that best explain the dynamics of US Treasury yields, using daily historical data on US Treasury bonds from June 2017 to December 2019. In that sense, we provide a robust and precise approach to characterizing interest rate dynamics, and the findings and insights from this study support the applicability of the proposed methodology.
The article has three sections apart from this introduction. The second section presents the main approaches and interest rate models. The third section introduces the HJM model, along with its formal development and the process of practical implementation, including the respective adjustments and parameter calibration. The fourth section presents the results. Finally, the conclusions are presented along with a discussion of the scope of the proposed approach.

METHOD
The volatility of interest rates can be affected by multiple factors that determine the value of interest rates. These factors can be considered random, which in turn has spurred interest in designing analytical tools that replace classical deterministic models with new mathematical models incorporating this randomness. In the first generation of stochastic interest rate models, the state variable is the instantaneous short rate [7], [8]. Short-rate models offer significant advantages due to their simplicity and the availability of analytical solutions for valuing bonds and options. Furthermore, these models are particularly tractable for short-term calculations, allowing derivative prices to be determined quickly. This capability becomes essential in scenarios that require the simultaneous valuation of multiple derivatives.
Interest rates are modelled using stochastic differential equations (SDEs). Single-factor models use an SDE to represent the short rate, and the SDEs employed capture some of the key properties of interest rates, such as mean reversion and volatility [7]. Short-rate models consist of two parts: the first is the average rate of change ('drift') of the short rate at each instant, $\mu(t, r(t))$, and the second specifies the instantaneous volatility of the short rate, $\sigma(t, r(t))$:

$$dr(t) = \mu(t, r(t))\,dt + \sigma(t, r(t))\,dW(t) \qquad (1)$$

where $W(t)$ is a standard Wiener process. Furthermore, it is assumed that, for any $t \geq 0$, the price of the bond depends on the instantaneous short rate [8]. For most models, the drift component is determined using a numerical technique to match the initial point of the term structure, so the bond price can be expressed as (2).
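As an illustration of how a single-factor SDE of the form (1) can be simulated, the following is a minimal sketch using an Euler-Maruyama discretization. It assumes a Vasicek-type mean-reverting drift, $\kappa(\theta - r)$, with constant volatility; the function name and all parameter values are hypothetical and are not calibrated to the data used in this study.

```python
import numpy as np

def simulate_short_rate(r0, kappa, theta, sigma, horizon, n_steps, n_paths, seed=0):
    """Euler-Maruyama paths of dr = kappa * (theta - r) dt + sigma dW.

    The mean-reverting drift is an assumed (Vasicek-type) choice for mu(t, r).
    """
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    rates = np.empty((n_paths, n_steps + 1))
    rates[:, 0] = r0
    for k in range(n_steps):
        # Brownian increments over one step have variance dt
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        drift = kappa * (theta - rates[:, k])
        rates[:, k + 1] = rates[:, k] + drift * dt + sigma * dw
    return rates

# Hypothetical parameters: 2% initial rate reverting towards 3%
paths = simulate_short_rate(r0=0.02, kappa=0.5, theta=0.03, sigma=0.01,
                            horizon=5.0, n_steps=500, n_paths=2000)
print(paths.shape, paths[:, -1].mean())
```

With the volatility set to zero, the simulated path reduces to the deterministic mean-reverting trend, which is a convenient sanity check on the discretization.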
This equation, the fundamental bond pricing equation (or Vasicek model [1]), describes the term structure and its behavior. Note that the bond price equation was initially derived as the solution to a second-order partial differential equation under certain assumptions but is generally valid for any arbitrage-free term structure model. The equation holds even in the case of multiple factors or multiple sources of risk if the terms are interpreted as scalar products of vectors. Vasicek [1] introduced a stochastic mean-reverting model for interest rates, which became the conceptual foundation for interest rate modelling. Subsequently, more elaborate models were introduced, as in [2] or its generalization in [11].
Following [7], [8], these interest rate models can be classified into equilibrium and no-arbitrage models. Equilibrium models are based on a set of assumptions about the economy and derive a process for the short-term interest rate. In these models, the term structures of interest rates and volatilities are determined endogenously [1], [2]. However, since their inputs do not carry enough information, the bond prices given by equilibrium models can differ from market bond prices. This means that equilibrium models are not arbitrage-free [6]-[8]. Thus, equilibrium models cannot be perfectly calibrated to the yield curve. Most of these models assume that the only relevant state variable is the instantaneous interest rate, $r$, which is modelled according to the SDE indicated in (1).
Meanwhile, no-arbitrage models treat these structures as exogenous to ensure that the security prices given by the model match those observed in the market. To do that, these models take the current yield curve as input. Models such as [3]-[6], [9], [10] fall into this group. Therefore, for no-arbitrage models, the model's bond prices match those of the market, and the yield curve of the model fits that of the market perfectly. All interest rate models under the no-arbitrage scheme are special cases of the general form (3), in which the drift and volatility are suitably chosen functions of the short-term interest rate and are the same for most of the models presented here. Equation (3) is a one-factor model that only reflects the relationship with the short-term rate. No-arbitrage models are a widely accepted framework for pricing interest rate derivatives because they guarantee, at a minimum, that the model reproduces the market prices of bonds.
Another way to classify interest rate models is based on the number of random factors under analysis. While single-factor models [1], [4], [5] consider the short rate as the only factor, multifactor models include at least two significant factors in their term structure of interest rates, for example, the short rate and its trend. Models such as [3], [6], [12]-[14] are in this group. However, the most comprehensive model is the HJM model. Heath et al. [3] propose a complete framework for modelling interest rates, which incorporates most of the models in the literature, including recent market models [15]-[18] and extensions of previous models as in [19]-[24].

The HJM model
The HJM model provides a general framework for modelling the evolution of the term structure of interest rates. It describes the behavior of the future price at time $t$ of a zero-coupon bond, denoted as $P(t, T)$, which pays one dollar at maturity $T$, as in [25]-[27]. The model is calibrated to the observed yield curve [28]-[31]. To estimate the prices of zero-coupon bonds at different maturities, the model begins with an exogenous specification of the stochastic dynamics of the forward rate and subsequently determines the zero-coupon bond price endogenously, in a risk-neutral world. Hence, the model is implemented like those in [4], [9], which require the initial yield curve provided by the market at a previous date. Likewise, the trend of the instantaneous forward rate is calibrated so that the volatility-standardized risk premium is zero [31], [32]. However, it has notable differences from these models: a) The valuation process starts with an exogenous specification of the forward rate. b) The expectation hypothesis in HJM for valuing a bond is that the face value is discounted with the average forward rate over its maturity, making the bond price a random variable. c) Calibration of the model is implicit and does not require fitting arguments as in [4], [9]. However, the model is limited by the fact that negative forward rates occur with positive probability. Notably, once the face value is discounted with the average short rate over the bond's maturity, the conditional expected value is taken based on the information available at the issuance date. Based on the Feynman-Kac theorem, this is the only expectation hypothesis consistent with the partial differential equation approach. The model adjusts to the current bond prices and generates the dynamics of the forward rates through (4).
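In (4), the diffusion coefficient is the volatility of the bond; if the coefficients and the Brownian motion are vectors, their products are interpreted as inner products. For the exogenous specification of the instantaneous forward rate, consider a standard Brownian motion $(W(t))_{t \in [0,T]}$ defined on a fixed probability space with its augmented filtration, $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,T]})$. For simplicity, we write $W(t) = W_t$. In the HJM model, the forward rate dynamics, $f(t, T)$, are assumed to be exogenously specified by the SDE

$$df(t, T) = \mu(t, T)\,dt + \sigma(t, T)\,dW_t$$

and the price of a zero-coupon bond is assumed to be given by

$$P(t, T) = \exp\left\{-\int_t^T f(t, s)\,ds\right\}$$

To determine the SDE for the short rate, first note that the short rate is the instantaneous forward rate at maturity $t$, that is, $r_t = f(t, t)$. Integrating the forward rate SDE and evaluating it at $T = t$ gives the stochastic expression for the short rate:

$$r_t = f(0, t) + \int_0^t \mu(s, t)\,ds + \int_0^t \sigma(s, t)\,dW_s \qquad (7)$$

Taking the partial derivatives of the integrals on the right-hand side using Leibniz's rule:

$$dr_t = \left[\frac{\partial f(0, t)}{\partial t} + \mu(t, t) + \int_0^t \frac{\partial \mu(s, t)}{\partial t}\,ds + \int_0^t \frac{\partial \sigma(s, t)}{\partial t}\,dW_s\right]dt + \sigma(t, t)\,dW_t \qquad (8)$$

Equation (8) describes the behavior of the short rate, whose trend contains the slope of the initial forward rate curve. Due to the integrals in the trend of (8), the evolution of the short rate is in general not Markovian [31]. Meanwhile, the multifactorial generalization of the HJM model has been developed as

$$df(t, T) = \mu(t, T)\,dt + \sum_{i=1}^{n} \sigma_i(t, T)\,dW_i(t) \qquad (9)$$

where $\mu(t, T)$ is the drift of the forward rate with maturity $T$ and $\sigma_i(t, T)$ are its volatility coefficients. Since bond prices depend on forward interest rates, the dynamics of the bond price follow from (9) through the exponential relation between bond prices and forward rates.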
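The multifactor dynamics in (9) can be illustrated numerically. The text does not state the drift restriction explicitly; the sketch below assumes the standard HJM no-arbitrage condition under the risk-neutral measure, $\mu(t, T) = \sum_i \sigma_i(t, T) \int_t^T \sigma_i(t, s)\,ds$, and evaluates it on a grid of fixed maturities before taking one Euler step of the forward curve from $t = 0$. The three volatility shapes (level, slope, short-end hump) and the flat initial curve are hypothetical and are not the calibrated functional forms of this study.

```python
import numpy as np

def hjm_drift(vols, maturities):
    """Risk-neutral HJM drift on a maturity grid whose first point is the current time.

    vols has shape (n_factors, n_maturities); row i holds sigma_i(t, T_j).
    """
    vols = np.asarray(vols, dtype=float)
    steps = np.diff(maturities)
    # Trapezoidal cumulative integral of each sigma_i from t up to T_j
    increments = 0.5 * (vols[:, 1:] + vols[:, :-1]) * steps
    integral = np.concatenate(
        [np.zeros((vols.shape[0], 1)), np.cumsum(increments, axis=1)], axis=1
    )
    return np.sum(vols * integral, axis=0)

def euler_step(forward, vols, maturities, dt, rng):
    """One Euler step of df = mu dt + sum_i sigma_i dW_i for fixed maturities."""
    vols = np.asarray(vols, dtype=float)
    dw = rng.normal(0.0, np.sqrt(dt), size=vols.shape[0])
    return forward + hjm_drift(vols, maturities) * dt + vols.T @ dw

maturities = np.arange(0.0, 11.0)                 # annual grid, 0 to 10 years
vols = np.vstack([
    np.full(maturities.size, 0.010),              # assumed level factor
    0.005 * (maturities - 5.0) / 5.0,             # assumed slope factor
    0.003 * np.exp(-0.5 * maturities),            # assumed short-end factor
])
forward0 = np.full(maturities.size, 0.02)         # hypothetical flat 2% curve
forward1 = euler_step(forward0, vols, maturities, dt=1 / 252,
                      rng=np.random.default_rng(1))
print(forward1)
```

For a single constant volatility $\sigma$, the drift returned by this sketch reduces to $\sigma^2 (T - t)$, which provides a simple check of the numerical integration.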

Principal component analysis
PCA is a multivariate technique used to combine two or more correlated variables into a smaller number of factors, known as PCs. To do that, a set of correlated variables is transformed into a set of uncorrelated factors ordered by their contribution to the total variability. The PCs are linear combinations of $p$ random variables $X_1, X_2, X_3, \ldots, X_p$. The technique depends solely on the covariance matrix $\Sigma$ of the original variables $[X_1, X_2, X_3, \ldots, X_p]$. Let $X = [X_1, X_2, X_3, \ldots, X_p]$ be the random vector with covariance matrix $\Sigma$, whose eigenvalues are $\lambda_1 > \lambda_2 > \lambda_3 > \cdots > \lambda_p > 0$. Consider the following linear combinations:

$$Y_i = a_i^{\top} X = a_{i1} X_1 + a_{i2} X_2 + \cdots + a_{ip} X_p, \qquad i = 1, 2, \ldots, p$$

The PCs are those uncorrelated linear combinations $[Y_1, Y_2, Y_3, \ldots, Y_p]$ whose variances are the largest. These linear combinations represent a new coordinate system obtained by rotating the original variable system. In that sense, PCA makes it possible to analyze the main risks determining the dynamics of yield curve rates based on historical data. The proposed approach, although it differs from other methodologies [32]-[36], allows a simple application of the multifactorial HJM model and provides a comprehensive analysis of interest rate dynamics by capturing the PCs that drive the evolution of short-term interest rates. Additionally, compared to previous works such as [37], [38], the proposed PCA-integrated three-factor HJM interest rate model enhances the efficiency and interpretability of the model, providing insights into the PCs influencing yield curve rates.
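The construction described above can be sketched with an eigendecomposition of the sample covariance matrix. The example below is illustrative only: it uses synthetic daily yield changes generated from an assumed level and slope structure plus noise, not the US Treasury data of this study, and the function name is hypothetical.

```python
import numpy as np

def pca_from_covariance(data):
    """Principal components from the covariance matrix of the columns of data."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is suited to symmetric matrices; it returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]           # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                 # uncorrelated combinations Y_i
    explained = eigvals / eigvals.sum()         # share of total variance
    return eigvals, eigvecs, scores, explained

# Synthetic yield changes for the 11 maturities (in years) used in the text
rng = np.random.default_rng(42)
maturities = np.array([1 / 12, 0.25, 0.5, 1, 2, 3, 5, 7, 10, 20, 30])
n_days = 600
level = rng.normal(0, 0.05, size=(n_days, 1)) * np.ones(maturities.size)
slope = rng.normal(0, 0.02, size=(n_days, 1)) * (maturities / 30.0)
noise = rng.normal(0, 0.005, size=(n_days, maturities.size))
changes = level + slope + noise

eigvals, eigvecs, scores, explained = pca_from_covariance(changes)
print(np.round(np.cumsum(explained)[:3], 4))
```

The columns of `scores` are uncorrelated by construction, and their sample variances equal the sorted eigenvalues, mirroring the ordering of the eigenvalues in the text.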

RESULTS AND DISCUSSION
We propose implementing the three-factor HJM model using the daily rates from the yield curve of US Treasury bonds from 9 June 2017 to 31 December 2019, considering maturities of 1 month, 3 months, 6 months, 1 year, 2 years, 3 years, 5 years, 7 years, 10 years, 20 years and 30 years. Thus, the model captures the fluctuations in US Treasury yields. Using PCA, we obtain the factors that best explain the dynamics of interest rates in the US. The results of the PCA are presented in Table 1. To select the PCs, a cumulative weight of 95% is taken as the reference. The factors satisfying this condition (cumulatively) are p1, p7 and p8. These three PCs are selected because they explain a significant share of the variance in our term structure, as seen in Figure 1. Once the factors have been identified, their functional forms must be specified. These factors are then used to construct the tree of the HJM model, from which the spot rates are derived. The purpose at this point is to calibrate the model's parameters by minimizing the squared differences between the rebalanced values of the matrix of fictitious values for the factor and the estimated values of that factor, presented in Table 2.
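The selection rule based on a cumulative reference can be sketched as follows. The weights used here are hypothetical placeholders, not the values reported in Table 1, and the function name is illustrative.

```python
import numpy as np

def n_components_for(explained_ratio, threshold=0.95):
    """Smallest number of components whose cumulative weight reaches threshold."""
    ordered = np.sort(np.asarray(explained_ratio, dtype=float))[::-1]
    cumulative = np.cumsum(ordered)
    if cumulative[-1] < threshold:
        raise ValueError("threshold is not reached by the supplied weights")
    # argmax returns the first index at which the condition holds
    return int(np.argmax(cumulative >= threshold) + 1), cumulative

# Hypothetical variance shares of four components
weights = [0.82, 0.11, 0.05, 0.02]
n_keep, cumulative = n_components_for(weights)
print(n_keep, cumulative)
```

With these placeholder weights the first two components fall short of the 95% reference, so three components are retained.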
This same process, applied to factor 2, is replicated for factor 3. That is, we calibrated the model's parameters by minimizing the squared differences between the rebalanced values of the matrix and the estimated values of the factor. Results are presented in Table 3. After determining the final structure of the functional forms and the calibration, constructing the forward rate tree in the HJM model requires the base scenario of the annual yield curve. Since time $t$ is annual in the model, the rates corresponding to each annual instant must be used. However, there are instants $t$ without a quoted rate, such as years 4, 6, 8, 9 and 11. This was resolved using the triple exponential smoothing algorithm, which was used to interpolate the years without an associated rate and to establish a rate for year 11. From the consolidated annual yield curve, with the rates presented in Figure 2, the lattice structure of the model for forward and spot rates can be obtained, as shown in Tables 4 and 5. The interest rates for the years without a market reference are shown in grey. The discounted bond prices for the simulated paths are found by multiplying the successive prices from $t = 0$. For instance, the price along the path at the origin for a zero-coupon bond with two years to maturity is 0.9842 × 0.9696 = 0.9543. Prices are also derived using the implicit forward rates in the initial rate curve, given by the first row of spot rates and forward rates. For example, the real price at the origin of a zero-coupon bond with two years to maturity is: exp(−(1.59% + 1%) × 1) = 0.9689. The comparison of the price series is shown in Figure 3.
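The path-price arithmetic above can be reproduced with a short sketch. The two one-period prices and the 1.59% first-year rate are taken from the text; the function names are hypothetical, and continuous compounding over annual periods is assumed.

```python
import math

def path_bond_price(one_period_prices):
    """Price along a simulated path: product of successive one-period prices."""
    return math.prod(one_period_prices)

def curve_bond_price(rates, dt=1.0):
    """Price implied by a sequence of continuously compounded period rates."""
    return math.exp(-sum(rates) * dt)

# Two-year zero-coupon bond along the path quoted in the text
path_price = path_bond_price([0.9842, 0.9696])

# One-year discount factor implied by the 1.59% first-year rate
one_year_price = curve_bond_price([0.0159])

print(round(path_price, 4), round(one_year_price, 4))
```

The one-year discount factor implied by the 1.59% rate coincides, to four decimals, with the first one-period price on the path, since both describe the same initial period.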

CONCLUSION
The PCs exhibit several attractive features, including their intuitive interpretation. They can effectively explain various types of changes in the shape of the interest rate term structure. The first principal component exerts a similar, parallel influence on interest rates across all maturities and accounts for approximately 80% of the overall variation in the term structure. These findings align with previous research, which shows that two factors explain around 95% of the term structure's movements, while three factors account for approximately 99% of the variation. The remaining variation is commonly considered noise.
Despite the advantages of the HJM model, its calibration poses considerable challenges, making it difficult to apply. The original formulation of the HJM model, based on instantaneous forward rates, lacked an obvious equivalence with any market-traded instrument. Additionally, the authors acknowledged that in the continuous-time limit and under true lognormal forward rates, their process exhibits a positive probability of explosion. Overall, the PCs approach provides valuable insights into the term structure dynamics, emphasizing the significant impact of parallel changes represented by the first principal component. However, the complexity of calibrating the HJM model and its inherent limitations have limited its widespread acceptance in the field.

Figure 2. US treasury yield curve forecast

Table 1. Cumulative weights of factors

Table 2. Functional form of factor 2

Table 3. Functional form of factor 3

Table 4. Forward interest rates