An evaluation and comparison of Value at Risk and Expected Shortfall

As a risk measure, Value at Risk (VaR) is neither sub-additive nor coherent. These drawbacks have prompted regulatory authorities to introduce and mandate Expected Shortfall (ES) as a mainstream regulatory risk management metric. VaR is, however, still needed to estimate the tail conditional expectation (the ES): the average of losses that are greater than the VaR at a given significance level. These two risk measures behave quite differently during growth and recession periods in developed and emerging economies. Using equity portfolios assembled from securities of the banking and retail sectors in the UK and South Africa, historical, variance-covariance and Monte Carlo approaches are used to determine VaR (and hence ES). The results are back-tested and compared, and normality assumptions are tested. The key finding is that the results of the variance-covariance and Monte Carlo approaches are more consistent in all environments than the historical outcomes, regardless of the equity portfolio considered. The industries and periods analysed influenced the accuracy of the risk measures; the different economies did not.


INTRODUCTION
Financial institutions are continuously exposed to credit, market, operational, liquidity, reputational and other risks. Although hedging helps to minimize and mitigate some of these risks, the first step towards managing risks is measuring them (BCBS, 1994). The focus of this article is idiosyncratic market risk (i.e., market risk that can be diversified away) along with the metrics which claim to measure it, although the accurate assessment of market risk is non-trivial (Riskmetrics, 1996). Portfolio return volatility weighs upside and downside risks equally, so risk measures such as Value at Risk (VaR) and Expected Shortfall (ES) were introduced to emphasize only downside risk: both measures now constitute part of the market risk regulatory framework of the Basel Committee on Banking Supervision (BCBS, 2016). All qualifying financial institutions must comply with the BCBS rules and must retain sufficient capital reserves to protect them from adverse scenarios with a given level of confidence over a specified period (BCBS, 2016). It is important to estimate these reserves correctly: because regulatory capital does not generate returns, capital retention is costly for institutions (Riskmetrics, 1996).
Calculating market risk has become a pursuit of burgeoning complexity, due largely to the increasing degree of global investment and the growing number of interacting securities which constitute trading book portfolios (Riskmetrics, 1996). The now familiar VaR measure, first introduced by JP Morgan in 1994, has enjoyed the status of principal market risk metric for several decades (Riskmetrics, 1996). Adopted by the BCBS in 1994, VaR in its various manifestations has been used to estimate the minimum capital required for financial institutions' market risk exposures (BCBS, 1994). VaR measures the maximum potential change in the value of a portfolio of financial instruments with a given probability over a pre-set horizon (Riskmetrics, 1996). Thus, if the random variable $X$ describes potential portfolio profits and losses, with related quantile $x_\alpha$, then
$$\mathrm{VaR}_\alpha(X) = -x_\alpha(X) = -\sup\{x \mid P[X \le x] < \alpha\}.$$
VaR is measured using one of three approaches (historical, variance-covariance and Monte Carlo simulation), but it does not provide an estimate of the loss severity should a suitably large loss occur (as determined by the confidence level): it only provides a measure of the loss frequency (Acerbi & Tasche, 2001). ES estimates the loss severity: it is the probability-weighted average of the losses greater than VaR. It is a superior risk measure because, unlike VaR, it is sub-additive and coherent (Acerbi & Tasche, 2001).
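If the random variable of profits/losses $X$ is continuous, the ES is called the tail conditional expectation (TCE). The TCE is the average of losses that are greater than the boundary VaR value at a significance level $\alpha$ (Acerbi & Tasche, 2001):
$$\mathrm{TCE}_\alpha(X) = -E\left[X \mid X \le -\mathrm{VaR}_\alpha(X)\right], \qquad \mathrm{TCE}_\alpha(X) \ge \mathrm{VaR}_\alpha(X).$$
For a more general distribution, when the random variable of profits/losses $X$ is discontinuous, the TCE is not sub-additive either and, in this case, the coherent risk measure ES is:
$$\mathrm{ES}_\alpha(X) = -\frac{1}{\alpha}\int_0^{\alpha} F^{-1}(p)\,dp,$$
where $F(x) = P[X \le x]$ and $F^{-1}(p) = \inf\{x \mid F(x) \ge p\}$ is its generalized inverse. The Expected Shortfall can also be expressed in terms of the TCE and VaR:
$$\mathrm{ES}_\alpha(X) = \mathrm{TCE}_\alpha(X) + (\lambda - 1)\left(\mathrm{TCE}_\alpha(X) - \mathrm{VaR}_\alpha(X)\right), \qquad \lambda = \frac{P\left[X \le -\mathrm{VaR}_\alpha(X)\right]}{\alpha} \ge 1.$$
For normally distributed profits/losses, the ES is
$$\mathrm{ES}_\alpha(X) = -\frac{1}{\alpha}\int_{-\infty}^{-\mathrm{VaR}_\alpha} x f(x)\,dx,$$
where $f(x)$ is the probability density function of $N(0, \sigma^2)$. Inserting the probability density function $f(x)$ into the integral leads to (1):
$$\mathrm{ES}_\alpha(X) = \frac{\sigma}{\alpha\sqrt{2\pi}}\exp\left(-\frac{\mathrm{VaR}_\alpha^2}{2\sigma^2}\right). \tag{1}$$
All three approaches for calculating VaR and each approach's associated ES are calculated and compared for different markets (developing and developed), different time periods (highly volatile and calm) and different market sectors (retail and banks). To ascertain which measure more accurately describes the tail risks, results are back-tested.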

Approaches for VaR and ES
The Historical Simulation (HS) approach makes no assumption about the return distribution and asserts that tomorrow's returns will behave as they did in the past (a contradiction of the efficient market hypothesis). Unless the historical period selected for simulation covers a turbulent era, losses in the tail region will only comprise a few observations. The VaR measure can thus be volatile (Sharma, 2012).
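The historical approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `historical_var_es`, the tail convention (the `ceil(alpha * n)` worst observations) and the sample returns are all hypothetical.

```python
import math

def historical_var_es(returns, alpha=0.01):
    """Historical-simulation VaR and ES, both returned as positive loss fractions.

    Assumption (not from the paper): the tail is the k = ceil(alpha * n)
    worst observations; VaR is the smallest loss in that tail and ES is
    the average of the whole tail.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # The small tolerance guards against float noise in alpha * n.
    k = max(1, math.ceil(alpha * len(losses) - 1e-9))
    tail = losses[:k]
    var = tail[-1]        # boundary loss of the tail
    es = sum(tail) / k    # average loss at or beyond the boundary
    return var, es

# Hypothetical sample: 95 quiet days and 5 loss days.
sample = [0.001] * 95 + [-0.02, -0.03, -0.05, -0.08, -0.10]
var_95, es_95 = historical_var_es(sample, alpha=0.05)
var_99, es_99 = historical_var_es(sample, alpha=0.01)
```

With 100 observations and a 1% significance level, the tail holds a single observation, so VaR and ES coincide and both depend entirely on the worst day in the window. This is the sparsity that makes the HS estimate volatile.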
The Variance-Covariance (VCV) method assumes that the portfolio returns are normally distributed and takes correlations between constituent assets into account (Benninga & Wiener, 1998).
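A minimal sketch of the variance-covariance calculation follows, assuming zero-mean normal returns and a known covariance matrix. The function name `vcv_var_es`, the equal weights and the covariance figures are hypothetical and serve only as illustration.

```python
from math import sqrt
from statistics import NormalDist

def vcv_var_es(weights, cov, alpha=0.01):
    """Parametric VaR and ES of a portfolio under zero-mean normal returns.

    weights: list of portfolio weights; cov: covariance matrix as nested lists.
    Both measures are returned as positive loss fractions.
    """
    n = len(weights)
    # Portfolio variance w' * Sigma * w, which captures the correlations.
    variance = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    sigma = sqrt(variance)
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha)          # e.g. about 2.326 for alpha = 1%
    var = z * sigma
    es = sigma * std_normal.pdf(z) / alpha     # closed-form normal ES
    return var, es

# Hypothetical daily covariance matrix for three equally weighted stocks.
cov = [[4.00e-4, 1.00e-4, 0.50e-4],
       [1.00e-4, 2.25e-4, 0.60e-4],
       [0.50e-4, 0.60e-4, 1.00e-4]]
weights = [1 / 3, 1 / 3, 1 / 3]
var_99, es_99 = vcv_var_es(weights, cov)

# Single asset with 2% daily volatility as a sanity check.
single_var, single_es = vcv_var_es([1.0], [[0.0004]])
```

Because both measures are multiples of the same portfolio volatility, their ratio under this approach depends only on the significance level, not on the portfolio.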

Coherent risk measures
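There are four axioms a coherent risk measure must fulfil. A function $\rho: V \to \mathbb{R}$, where $V$ is a set of real-valued random variables, is a coherent risk measure if it is (Artzner et al., 1999; Acerbi & Tasche, 2001):
monotonic: $X \ge 0 \Rightarrow \rho(X) \le 0$;
sub-additive: $\rho(X + Y) \le \rho(X) + \rho(Y)$;
positively homogeneous: $\rho(hX) = h\rho(X)$ for $h \ge 0$;
translation invariant: $\rho(X + a) = \rho(X) - a$ for $a \in \mathbb{R}$.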

Risk aversion
Other ways to evaluate coherent risk measures are risk aversion functions (Acerbi, 2001). A coherent risk measure must have a decreasing and positive risk aversion function $\phi(x)$. In addition, $\phi(x)$ represents the rational attitude towards risk and may be thought of as a function weighing all cases from worst to best.
The risk aversion function for the VaR measure is a spike function, assuming normality (Acerbi, 2001). Thus, it is not an overall declining function (further confirmation that VaR is not a coherent risk measure) (Acerbi, 2001). Alternatively, an example of a risk aversion function for ES that fulfils coherency is shown in Figure 1. For the 20% worst cases, a risk aversion of 5 is assigned. A rational investor may assign their own subjective risk aversion by simply changing the profile of the weight function $\phi(x)$ (Acerbi, 2001). The only requirements for coherency are that the function is positive, decreasing and normalized to 1 in the interval [0,1]. Within this framework, however, any option for $\phi(x)$ is a legitimate attitude toward risk (Acerbi, 2001).
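The coherency requirements on the weight function can be checked numerically. The sketch below rests on several assumptions: the function names are hypothetical, "positive" and "decreasing" are read as non-negative and non-increasing (so the flat ES profile qualifies), the weight function is sampled on a midpoint grid, and the VaR spike is approximated by a narrow rectangle.

```python
def phi_es(p, alpha=0.2):
    """ES weight: 1/alpha on the worst alpha share of cases, zero elsewhere."""
    return 1.0 / alpha if p <= alpha else 0.0

def phi_var(p, alpha=0.2, width=0.01):
    """Narrow rectangle around the alpha-quantile, a stand-in for the VaR spike."""
    return 1.0 / width if abs(p - alpha) <= width / 2 else 0.0

def is_coherent_weight(phi, n=1000):
    """Check non-negativity, monotone decline and normalisation on [0, 1]."""
    grid = [(i + 0.5) / n for i in range(n)]       # midpoints, worst case first
    values = [phi(p) for p in grid]
    non_negative = all(v >= 0.0 for v in values)
    non_increasing = all(a >= b for a, b in zip(values, values[1:]))
    normalised = abs(sum(values) / n - 1.0) < 1e-6  # integral of phi equals 1
    return non_negative and non_increasing and normalised

def spectral_measure(returns, phi):
    """Weighted average loss, with phi applied from the worst return upwards."""
    ordered = sorted(returns)                       # worst return first
    n = len(ordered)
    return -sum(phi((i + 0.5) / n) * x for i, x in enumerate(ordered)) / n

# Hypothetical returns: with alpha = 20%, the two worst of ten are averaged.
returns = [-0.10, -0.04, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03, 0.03, 0.05]
es_20 = spectral_measure(returns, phi_es)
```

Here `is_coherent_weight(phi_es)` holds, whereas `is_coherent_weight(phi_var)` fails because the spike rises from zero before falling back, so it is not an overall declining function.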

LITERATURE STUDY
After the definitive introduction to VaR by JP Morgan (1994), in which the development of the metric was introduced and analyzed, Linsmeier and Pearson (1996) explored VaR alternatives such as sensitivity analyses or cash flow at risk. A sensitivity analysis was used to measure the impact of different factors on a dependent variable within a set of assumptions, while cash flow at risk is like VaR, but related to cash flows (Linsmeier & Pearson, 1996). Linsmeier and Pearson's (1996) most significant findings were that the HS and MC simulation are both able to capture the risks of portfolios that include derivatives, while the VCV method did not always provide satisfactory results. They further found that any of the three methods can produce misleading results if the recent past was atypical.
VaR was quickly found not to be a coherent risk measure (Artzner et al., 1999). This led Artzner et al. (1999) to introduce ES, then called tail conditional expectation (see equation (1)). However, the definition of ES from Artzner et al. (1999) was insufficient for discontinuous distributions, as it did not satisfy the sub-additivity axiom. A more general version of the ES and a proof of its sub-additivity were then provided by Acerbi and Tasche (2001).
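A textbook-style example illustrates how VaR can violate sub-additivity while ES does not. The sketch assumes two independent positions that each lose 100 with probability 4% and nothing otherwise; the figures and function names are hypothetical and are not drawn from the cited papers.

```python
def combine_independent(a, b):
    """Loss distribution of the sum of two independent discrete losses."""
    out = {}
    for loss_a, prob_a in a:
        for loss_b, prob_b in b:
            out[loss_a + loss_b] = out.get(loss_a + loss_b, 0.0) + prob_a * prob_b
    return sorted(out.items())

def var_discrete(dist, alpha):
    """Smallest loss l such that P(L <= l) >= 1 - alpha."""
    cumulative = 0.0
    for loss, prob in sorted(dist):
        cumulative += prob
        if cumulative >= 1 - alpha - 1e-12:
            return loss
    return max(loss for loss, _ in dist)

def es_discrete(dist, alpha):
    """Average loss over the worst alpha share of probability mass."""
    remaining, total = alpha, 0.0
    for loss, prob in sorted(dist, reverse=True):   # largest loss first
        take = min(prob, remaining)
        total += take * loss
        remaining -= take
        if remaining <= 1e-15:
            break
    return total / alpha

bond = [(0, 0.96), (100, 0.04)]             # hypothetical single position
portfolio = combine_independent(bond, bond)  # two independent positions

alpha = 0.05
var_single = var_discrete(bond, alpha)
var_portfolio = var_discrete(portfolio, alpha)
es_single = es_discrete(bond, alpha)
es_portfolio = es_discrete(portfolio, alpha)
```

At the 5% level each position alone has a VaR of 0, yet the combined portfolio has a VaR of 100, so the portfolio VaR exceeds the sum of the individual VaRs. The ES of the portfolio (about 103.2) stays below the sum of the individual ES values (2 x 80 = 160).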
Yamai and Yoshiba's (2002) early comparison of VaR and ES under general market conditions analyzed daily logarithmic changes of exchange rates. Their historical data included three established economic markets and 18 emerging markets, and they found that VaR and ES do not estimate tail risk accurately in all cases. VaR and ES were both found to underestimate the risk of currencies with fat tails and a high potential for losses. In a further analysis by Yamai and Yoshiba (2002), only Southeast Asian currencies (from emerging economies, except for the Singapore dollar) were examined, and tail dependence was found to be disregarded by both VaR and ES. Their conclusion was that neither risk measure on its own was sufficient: a combined approach using both measures to analyze financial risk was more sophisticated than either of them alone. In addition, Yamai and Yoshiba (2002) asserted that the profit/loss distribution should be explored from different angles, including tail fatness and asymptotic dependence of the distribution.
Liang and Park (2007) explored risk measures for hedge funds using semi-deviation, VaR, ES, and tail risk, which all measure downside risk. Comparing the performances of 1,500 hedge funds, Liang and Park (2007) concluded that skewness and kurtosis of the underlying distribution cannot be ignored. Furthermore, they confirmed the finding that Expected Shortfall is superior to VaR for evaluating financial risk, specifically when analyzing hedge funds' performances.
Acerbi, Nordio, and Sirtori (2001) compared VaR and ES in a unique non-empirical approach by reviewing classical arguments. A common misconception in the literature is the definition of VaR. VaR is often defined as the maximum potential loss that a portfolio can suffer in the 1% worst cases in a set time period (Riskmetrics, 1994), but Acerbi et al. (2001) assert that a better definition is "VaR is the minimum potential loss that a portfolio can suffer in the 1% worst cases in a set time period". Another definition, which amounts to the same thing, is: "VaR is the maximum potential loss that a portfolio can suffer in the 99% best cases in a set time period" (Acerbi et al., 2001). Acerbi et al. (2001) also mathematically proved the non-subadditivity of VaR and derived an exact definition of ES, as well as proof of the coherence of this measure. They asserted that ES should replace VaR in risk management, keeping in mind the reliability of approximations and transparency of the underlying hypotheses. They concluded that ES was a solid measure to assess risk with no restrictions on applicability.
Nadarajah, Zhang, and Chan (2013) discuss different estimation methods for ES by providing an overview of the most common approaches to calculate ES. They list 45 different ways to estimate ES and distinguish between 32 underlying distributions. Generally, calculation methods can be categorized into parametric, nonparametric and semiparametric (Nadarajah et al., 2013). Parametric methods assume that the sample data are from a population following a probability distribution. Thus, the method is based on a fixed set of parameters, with the most typical distributions being the normal and Student's t-distribution. Nonparametric estimation methods do not make any assumptions about the distribution and are often more straightforward to use, even when the use of a parametric approach is applicable. Nonparametric methods have greater robustness than parametric ones and tend to leave less room for improper use and misunderstanding (Nadarajah et al., 2013). According to Nadarajah et al. (2013), the best-known nonparametric method is the HS, but other methods are the filtered HS and the Yamai
Miletic, Korenak, and Lutovac (2014) apply VaR methods to the Belgrade Stock Exchange using four equally-weighted stocks from the 15 securities included in the Belgrade-15 index. VaR is measured using the HS and two parametric methods: one assumes a normal distribution and the other uses the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model with a Student's t-distribution. Using six years of return data, they conclude that both methods predict market risk well. Even though these results are limited to the Serbian stock market, Miletic et al. (2014) have shown that VaR can be a good predictor of portfolio risk in a developing country during extraordinary financial events, e.g., the global credit crisis of 2008.
Wimmerstedt (2015) focused on back-testing ES results to explore its elicitability property. Elicitability is a mathematical concept which implies that a law-invariant risk measure uses a probability distribution to transform into a single-valued point forecast (Brehmer, 2017).

Data
The data comprise closing prices of securities from the London Stock Exchange and the Johannesburg Stock Exchange. Retail and bank portfolios using data from South Africa and the United Kingdom were used. Each of the four portfolios consists of the three biggest retailers or banks by market capitalization in each country as of April 2018. The two industries are compared because banks were directly involved in the financial crisis and the UK banks were heavily exposed to the housing market in the United States (Figure 3). The UK's unemployment rate rose from 5% to 8% over this period, while South Africa's increased from 23% to 25% (an absolute increase of only two percentage points and a relative change of 9%, due to the high levels of unemployment in South Africa).

Jarque-Bera test for normality
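Normality of log returns is a chief assumption of the VCV method. Many approaches to test for normality exist, including the Kolmogorov-Smirnov, Shapiro-Wilk, $\chi^2$ and Jarque-Bera tests (Thadewald & Büning, 2004). In this work, the Jarque-Bera test was used. The hypotheses are:
$H_0$: the log returns are normally distributed;
$H_1$: the log returns are not normally distributed.
The Jarque-Bera test essentially checks whether the sample data skewness and kurtosis match those of a normal distribution. The test statistic is:
$$JB = \frac{n}{6}\left(S^2 + \frac{(C - 3)^2}{4}\right),$$
where $n$ is the number of observations, $S$ is the sample skewness and $C$ is the sample kurtosis.
Skewness is:
$$S = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{\left(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{3/2}},$$
and kurtosis is:
$$C = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^4}{\left(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{2}}.$$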

Actual vs. estimated volatility
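One-day and 10-day log returns were calculated and a two-sided statistical test was applied. Normality of return data is assumed, so the one-day volatility is scaled by $\sqrt{10}$ to obtain the 10-day volatility. The null hypothesis is that the calculated and actual 10-day figures are equal; the alternative hypothesis is that they differ.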
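Under independent, identically distributed normal returns, variance grows linearly with the horizon, so volatility grows with its square root. A minimal sketch of this scaling and of the aggregation of daily log returns into 10-day returns follows; the function names and the sample figures are hypothetical.

```python
from math import sqrt

def scale_volatility(sigma_1d, horizon_days):
    """Square-root-of-time rule: valid only under i.i.d. returns."""
    return sigma_1d * sqrt(horizon_days)

def aggregate_log_returns(daily, horizon_days):
    """Sum consecutive daily log returns into non-overlapping horizon returns.

    Log returns are additive over time, so a 10-day log return is the
    sum of ten daily log returns. Incomplete trailing windows are dropped.
    """
    usable = len(daily) - len(daily) % horizon_days
    return [sum(daily[i:i + horizon_days])
            for i in range(0, usable, horizon_days)]

sigma_10d = scale_volatility(0.02, 10)            # 2% daily volatility
ten_day = aggregate_log_returns([0.01] * 25, 10)  # two full 10-day windows
```

A 2% daily volatility therefore corresponds to roughly 6.3% over 10 days, not 20%. Comparing the scaled figure with the volatility of the aggregated returns is one way to probe how far the i.i.d. assumption holds.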

Back-testing
The BCBS (2016) approach to back-testing allows for a comparison between risk measures. The BCBS back-test is a forecast and uses one full year of daily return data to estimate the VaR and ES (Wimmerstedt, 2015). Back-testing counts the number of losses that exceed the predicted risk measure. For VaR, these exceedances determine the zone and the multiplier $k$ in Table 1, which scales the market risk capital $\mathrm{MRC}_t$ at time $t$. Table 1 shows that multiplier $k$ increases in the yellow zone: bank penalties are manifest in higher capital reserves when they are exposed to higher market risks.

Portfolio performances
In the aftermath of the credit crisis, only the South African retail portfolio was worth more by the end of 2010 than its original value in 2008. The portfolio increased by 55%, as shown in Figure 6. All other portfolios lost value over the period. The South African bank portfolio was worth 83% of its original value three years later in December 2010. The UK retail portfolio decreased by a similar amount and was worth 79% of its starting value at the end of 2010, while the UK bank portfolio more than halved as a result of the financial crisis. An investment in the UK bank portfolio in January 2008 would have been worth only 38% of its initial value by the end of 2010.
Table 2 shows the correlation between the returns of all securities in this report during the pre-crisis period and the crisis period. The correlation between returns from either a South African bank or retailer and a UK retailer or bank is close to zero in most cases during both periods. Generally, the correlation between South African stocks is higher than the correlation between UK companies (see top left box). While UK retailers' stocks performance remain

Normality
The Jarque-Bera test results are shown in Table 3. The test was conducted at a 99% confidence level with two degrees of freedom. Using (2), the JB statistic must be smaller than 9.21 for the null hypothesis, which states normality of returns, not to be rejected. The results show that only the returns of FirstRand Bank during the pre-crisis period were normally distributed. The assumption of returns following a normal distribution, a requirement of the variance-covariance approach, is violated in all other cases. Comparison of the JB statistics from both periods indicates that stock returns were 'more normal' in the pre-crisis period than during the crisis, since the JB statistics are smaller for most securities during the growth years.
The historical stock price data for Barclays show that the bank experienced several days with share price losses greater than 15% and days with gains of more than 20% during the crisis. These unusual events explain the very large value of the JB statistic. In general, South African returns tend to be more normal than UK returns.
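The effect of a few extreme days on the JB statistic can be reproduced with a short sketch. It assumes the usual form of the statistic, $n/6 \cdot (S^2 + (C-3)^2/4)$, with moment-based skewness and kurtosis; the data are synthetic (evenly spaced normal quantiles scaled to a hypothetical 2% daily volatility) and are not the returns used in the study.

```python
import math
from statistics import NormalDist

def jarque_bera(xs):
    """Return (JB statistic, sample skewness, sample kurtosis)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # central moments
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, skew, kurt

# Chi-squared critical value with 2 degrees of freedom at the 99% level.
# For 2 degrees of freedom the CDF is 1 - exp(-x / 2), so it inverts exactly.
critical_99 = -2 * math.log(0.01)

# Synthetic "normal" sample: evenly spaced normal quantiles, 2% volatility.
n = 1000
std_normal = NormalDist()
calm = [0.02 * std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]

# Same sample plus two hypothetical extreme days (a 15% loss and a 20% gain).
stressed = calm + [-0.15, 0.20]

jb_calm, skew_calm, kurt_calm = jarque_bera(calm)
jb_stressed, skew_stressed, kurt_stressed = jarque_bera(stressed)
```

Two outliers among roughly a thousand observations are enough to push the statistic far above the critical value of about 9.21, because the kurtosis term is driven by fourth powers of the deviations.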

Estimated and actual volatility
The two-sided test findings of calculated and actual returns over a 10-day period can be seen in Table 4. The test was done at a 99% confidence level. The null hypothesis stating that the means are the same cannot be rejected for $|z| < 2.58$. All portfolios during the credit crisis and pre-crisis period fulfil this. If the calculated and actual 10-day means are equal, the normality assumption of portfolio returns also holds. Even though only the share price returns of FirstRand Bank followed a normal distribution, all multi-asset portfolios follow a normal distribution. The least differences of

General trends
Tables 5 through 7 show that the VaR and ES are generally higher in crisis periods: an expected result. While the economy was growing in the pre-crisis period, VaR and ES were smaller for the UK bank portfolio than for the South African one. However, the VaR and ES for the UK retail portfolio were slightly larger than the South African retail portfolio during the pre-crisis period, regardless of the approach used. During the crisis, VaR and ES were higher for the UK companies than for the South African ones.

VaR results
VaR is similar during both economic periods for both industries in South Africa using the HS and VCV methods. In the UK, this differs. UK banks' VaR during 2008 and 2009 is significantly higher using the HS than using the VCV method (Tables 5 and 6). Otherwise, UK VaR results using HS and VCV are similar. The MC simulation in Table 7 shows similar outcomes to the VCV method for all VaR measures in both industries, periods and countries. The HS VaR is thus also higher in 2008 and 2009 than the VaR produced by the MC simulation for the UK banking portfolio. Overall, the VCV and MC outcomes are similar, while both differ from the HS outcomes.
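One possible reason for the agreement between MC and VCV results can be sketched as follows. The sketch assumes, purely for illustration, that the simulation draws returns from a zero-mean normal distribution with the same volatility as the VCV method; the simulation model actually used is not stated in this section, and the function names, seed and volatility are hypothetical.

```python
import random
from statistics import NormalDist

def mc_var_es(sigma, alpha=0.01, n_draws=100_000, seed=42):
    """Simulated VaR and ES from normal draws (illustrative assumption)."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    losses = sorted((-rng.gauss(0.0, sigma) for _ in range(n_draws)),
                    reverse=True)                  # largest loss first
    k = max(1, round(alpha * n_draws))             # size of the simulated tail
    tail = losses[:k]
    return tail[-1], sum(tail) / k

def normal_var_es(sigma, alpha=0.01):
    """Closed-form VaR and ES under the same normal assumption."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha)
    return z * sigma, sigma * std_normal.pdf(z) / alpha

sigma = 0.02                                       # hypothetical daily volatility
mc_var, mc_es = mc_var_es(sigma)
vcv_var, vcv_es = normal_var_es(sigma)
var_gap = abs(mc_var - vcv_var) / vcv_var
es_gap = abs(mc_es - vcv_es) / vcv_es
```

When the simulated returns share the distributional assumption of the parametric method, the simulated quantile converges to the closed-form value as the number of draws grows, so the two approaches differ only by sampling noise.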

ES results
The VCV and MC results for ES are similar, while HS generates significantly higher ES measures during the credit crisis for the UK and South African bank portfolios. For example, the one-day ES for the UK bank portfolio in 2009 is 16.8%. This means that, when the HS is used, the portfolio is estimated to lose almost 17% of its value in a single day in the 1% worst events. The MC simu-

ES/VaR ratios
Examining the ratio of ES/VaR is instructive. This ratio differs across methods, countries, industries and time periods.
MC results show that the difference between ES and VaR is often smaller than 15%. This is the case when the bars do not reach the dashed normal ratio line, which occurs 50% of the time (12 out of 24 results) (Figure 9). This implies that VaR and ES values are similar for MC. It therefore matters less which risk measure (VaR or ES) is superior, because the differences between VaR and ES are relatively small using the MC approach. Basel III capital requirements for the MC method are therefore similar for financial institutions, regardless of the risk measure used.
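The 15% benchmark is consistent with the ES/VaR ratio implied by a normal distribution at the 1% significance level. A short sketch, assuming zero-mean normal returns (the function name is hypothetical):

```python
from statistics import NormalDist

def normal_es_var_ratio(alpha):
    """ES/VaR for zero-mean normal returns; the volatility cancels out."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha)       # VaR in units of volatility
    return std_normal.pdf(z) / (alpha * z)  # ES in units of volatility, over VaR

ratio_99 = normal_es_var_ratio(0.01)
ratio_975 = normal_es_var_ratio(0.025)
```

At the 1% level the ratio is about 1.146, so under normality ES exceeds VaR by roughly 15% whatever the volatility. Ratios above this level point to tails that are fatter than normal.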

Back-testing results
VaR and ES for the HS and VCV methods were back-tested using the BCBS rules (testing MC results is not necessary, since returns are simulated). Back-testing results for HS and VCV methods are shown in Tables 8 and 9. If the risk measure is flagged yellow, the multiplier is greater than three. Thus, banks must hold higher capital reserves (see Table 1 and equation (2)). In addition, yellow implies that the model used may not be accurate, but to be certain, more information is needed. As expected, the risk measure VaR is flagged yellow for both bank portfolios during the credit crisis (Table 8). It is surprising that HS VaR also flags yellow during a pre-crisis period in 2005 for South African retailers and banks. The VCV VaR
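The zone logic can be sketched as below. The thresholds and multipliers follow the standard Basel traffic-light table for 250 daily observations (green up to 4 exceedances, yellow from 5 to 9, red from 10); they are assumed to match Table 1, which is not reproduced here. The capital formula is the usual Basel form, taken as an assumption about the capital equation referred to above, and all function names are hypothetical.

```python
# Multiplier k per number of exceedances in the yellow zone (standard Basel table).
YELLOW_MULTIPLIERS = {5: 3.40, 6: 3.50, 7: 3.65, 8: 3.75, 9: 3.85}

def count_exceedances(returns, var_forecasts):
    """Number of days on which the loss exceeded the VaR forecast for that day."""
    return sum(1 for r, v in zip(returns, var_forecasts) if -r > v)

def traffic_light(exceedances):
    """Return (zone, multiplier k) for a one-year VaR back-test."""
    if exceedances < 0:
        raise ValueError("exceedances must be non-negative")
    if exceedances <= 4:
        return "green", 3.0
    if exceedances <= 9:
        return "yellow", YELLOW_MULTIPLIERS[exceedances]
    return "red", 4.0

def market_risk_capital(var_previous_day, var_last_60_days, k):
    """Assumed Basel form: max of yesterday's VaR and k times the 60-day average."""
    average = sum(var_last_60_days) / len(var_last_60_days)
    return max(var_previous_day, k * average)

# Hypothetical three-day illustration: only the first day breaches the 3% VaR.
breaches = count_exceedances([-0.05, 0.01, -0.01], [0.03, 0.03, 0.03])
zone, k = traffic_light(5)
capital = market_risk_capital(0.03, [0.02] * 60, 3.0)
```

Five exceedances in a year are enough to move a model from the green to the yellow zone and raise the multiplier from 3.0 to 3.4, which increases the required capital by about 13%.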

CONCLUSION AND LIMITATIONS
A developed and an emerging economy, different industries and two periods of contrasting economic growth were considered for evaluating VaR (in its three common manifestations) and ES risk measures in this paper.
ES and VaR were found to be considerably higher during recession periods. The MC simulation and VCV method are more accurate (in terms of estimating exceedances) than the HS approach, especially in times of recession, and the ratio of ES/VaR using the VCV method and the MC simulation is more consistent than that of the HS method.
The assumption of normality for single stock returns, a requirement of the VCV approach, has been demonstrated to be largely untrue, as has often been shown in the literature (e.g., Richardson & Smith, 1993; Sheikh & Qiao, 2009). Nevertheless, for the portfolios of stocks used in this work, this is a reasonable approximation. No statistically significant differences are found using the different risk measures, whether applied to a developed or an emerging economy. Differences arise between industries. Banks were directly exposed to the crisis initiated by the collapse of the US housing market, while retailers were only affected indirectly due to diminished consumer spending. VaR and ES are generally higher for the banking industry in both countries during the crisis period. Risk models were also inaccurate for banks, in line with Linsmeier and Pearson (1996), who found that if the recent past is atypical, all three estimation models are flawed.
ES was shown to be a superior risk measure (Liang & Park, 2007). Nevertheless, the extent to which this holds depends on which VaR method is used. Financial institutions should therefore report both measures. Both risk measures are insufficient for evaluating all potential portfolio risks (as demonstrated by Yamai & Yoshiba, 2002). Extraordinary market events, as witnessed during the crisis, are exceedingly difficult to predict. This is emphasized by the back-testing results, which showed more than four exceedances (flagged yellow in Tables 8 and 9) using the VCV and HS methods for the UK banking sector from 2008 to 2010. UK banks experienced substantial losses during the crisis, and the risk measures were unable to adapt sufficiently quickly. Overall, ES performs better than VaR, and both VCV and MC simulation approaches provide more consistent results.
The research conducted is limited to one developed and one emerging economy, which are compared with each other. Also, only specific industries are compared and therefore the results might differ for other industry sectors or economies. Lastly, the time periods regarded might not reflect other past or future time periods. However, because both recession and growth timeframes were studied it is not unlikely that future periods of these economic conditions will lead to similar outcomes regarding the risk measures.
Future research could involve additional testing of risk measures under extreme market conditions. Also, back-testing and validation of test results are only in the early stages of development and can be researched in more detail. Finally, since much research focuses on comparing VaR and ES to ascertain which risk measure is superior (in terms of back-test exceedance numbers), future focus could shift to assessing and comparing the various methods used to calculate both risk measures. For example, the BCBS back-test is not a good method to validate VaR and ES using the MC simulation approach, since it does not make sense to back-test simulated returns that have not actually occurred in the past. The development of a universal back-test could be the next step for research in this field. This would allow a reliable comparison of all methods: currently, determining which of the three methods is best under all market conditions is complex and can be contradictory.
New methods to calculate more accurate (and easier to apply) VaR and ES measurements could be developed. If policymakers take the initiative and decide on one framework and one risk measure, the risk measures would become much more comparable, allowing differences and similarities between them to be studied more easily. The most suitable method for this currently seems to be the MC method, because it is more consistent than the HS and, unlike the VCV approach, requires no false assumption.