A Dynamic Factor Model Applied to Investor Sentiment in the European Context

This paper proposes an Investor Sentiment Index for the European market and tests its predictive power over returns and volatility. The index draws upon three well-established and two recent individual sentiment proxies, combined through a dynamic factor model that is novel in behavioral finance. It is estimated over an extended period of analysis and validated against other sentiment index measures. The effect of the index on volatility and returns is tested with a TGARCH model. An in-sample and out-of-sample analysis, based on a recursive rolling window prediction, examines the index's forecasting power over returns against Fama and French's three-factor model. The findings demonstrate that the proposed index closely predicts STOXX600 variance and returns and confirm a strong spillover effect between European and US stock markets. This study also concludes that the proposed European Sentiment Index is a valid alternative method for investors to monitor and predict market behavior. The developed sentiment measure is a vital tool for predicting market movements, of use to financial information providers, investors, bankers, and financial analysts.


INTRODUCTION
Investor sentiment can influence decisions and affect asset prices, as investors suffer from biases that affect market behavior (Pompian, 2011; Piccione & Spiegler, 2014; Benhabib et al., 2016; Jitmaneeroj, 2017). Investor emotions partially explain stock market changes, particularly in periods of senseless and unjustified panic or exaggerated optimism. Human emotions such as extreme excitement or fear, together with perceptual mistakes, may generate emotional biases when evaluating the stock market's future performance, provoking peak movements and instability (Reis & Pinto, 2021). Sentiment may drive a wedge between the asset price implied by fundamentals and the actual price (Giglio & Kelly, 2017; Zhou, 2018) and cause asset mispricing. Since investor sentiment is not directly observable, academia relies on proxy measures, which carry a risk of error because sentiment affects asset prices to varying degrees. The proxy measure must efficiently capture the essence and the strength or scale of the corresponding sentiment (Reis & Pinho, 2020a). The proxy can also indicate degrees of contagion, spillover, and market co-movement. There have been several academic attempts to measure investor sentiment through market sentiment data, survey data, or text analysis. Despite the extensive literature on the investor sentiment influence on market returns and volatility, only a few of

LITERATURE REVIEW
Feelings such as belief, uncertainty, joy, or distrust that investors hold about firms' expected performance significantly affect financial market behavior. Sentiment may be a feeling, judgment, or opinion supported by a thought about a situation, or even a rational stance on something. As an unobservable variable, investor sentiment affects asset returns ahead of company fundamentals or macroeconomic circumstances. This topic has long been debated, both regarding its measurement and regarding its ability to forecast market crashes, bubbles, or spillovers among financial markets. Zhou (2018) argues that the measurement of investor sentiment has been strongly discussed by academia. The sentiment measures that coexist today are based on market data, survey or questionnaire data, and text/media analysis data, and research debates which measure is the best sentiment proxy and best forecasts market movements and co-movements.
Measures based on market data analysis included trading volumes and were applied by several works (Ma et al.). Market volatility was also considered as a proxy measure for sentiment, as Jitmaneeroj (2017) and Rehman and Apergis (2019) testify. The closed-end fund discount, considered a predictor of future market evolution because it measures the future sentiment related to the evolution of equities, was one of the most relevant individual measures of sentiment, as numerous articles demonstrate (Lee et al., 1991; Baker & Wurgler, 2006, 2007; Baker et al., 2012). The dividend premium, an indicator that accounts for potential market optimism, is also an important indicator of individual investor sentiment used by Baker and Wurgler (2006, 2007). Finally, text data indicators are derived from collecting words from media sources such as Facebook, Twitter, Instagram, Google, LinkedIn, journals, magazines, and websites such as Bloomberg, CNBC, and Reuters (Loughran & Mcdonald, 2016; Da et al., 2015; Gao et al., 2020). Most of the cited studies build their investor sentiment measures using just one or a few individual proxies and may be subject to a simplification proxy error. A composite investor sentiment index containing a full range of individual proxy measures may produce better results when predicting market behavior.
Some studies address the construction of investor sentiment indices, mainly using text data, but few use market data measures.

Investor sentiment indices
In this context, the global investor sentiment index based upon six individual sentiment proxy measures proposed by Baker and Wurgler (2006) is widely seen as a preferred approach, considering that it is commonly referenced and used in several behavioral finance studies (Chen et al. […]). […] implemented a sentiment measure for emerging markets using principal component analysis (PCA) for US and Japan markets. Also, Reis and Pinho (2020b) refer that […] announce that partial least squares (PLS) can be used to attain a reliable investor sentiment index for evaluating the equity risk premium for US markets. […] promote a sentiment index at the firm level that also uses PCA. Kumari […] (2006) method. However, they employ volatility, share turnover, and the Consumer Confidence Index as single sentiment proxies. The researchers conclude that investor sentiment influences the future returns of stocks that are difficult to price and more expensive and riskier to arbitrage. The sensitivity of the results depends on the sentiment index and is prone to country-specific features. Reis and Pinho (2020b) built a European investor sentiment index based on a modified Baker and Wurgler (2006) method and different individual investor sentiment proxy constituents, concluding that their index has strong predictive power over stock returns.
This paper constructs a statistical model that seeks to understand the stochastic processes describing investor sentiment, applying rarely explored methods such as diffusion indices, also named dynamic factors. Addressing volatility and returns with a TGARCH model, this paper builds and tests a sentiment index based on the referred Dynamic Factor Model. It then tests the index's predictive power on European returns and volatility, comparing it with other market sentiment indices and supplying further robustness tests through an in-sample and out-of-sample analysis against the Fama and French three-factor model.
Table 1 presents the variables in this study's model and summarizes descriptive statistics. As European investor sentiment proxies, the paper used the consumer confidence index, the gold bullion price, economic sentiment, the VSTOXX volatility index, and the spread between the German 10-year and 3-year government bond yields, as applied in Reis and Pinho (2020b). Reis and Pinho (2020a, 2020b) tested the last two measures with good results, and thus this paper considers that they may influence a broader sentiment index. Gold stands as a haven for major European stock markets, acting as a stabilizing force in the markets, as Baur and McDermott (2010) and Reis and Pinho (2020a) stated. Investors develop a more risk-averse profile in times of adverse sentiment, searching for safe assets such as bonds from countries with low default risk, such as Germany (Bolton & Jeanne, 2011; Gómez-Puig et al., 2014). If the yield curve inverts and short-term sovereign interest rates are higher than long-term sovereign yields, investors opt for the trade-off of lower rates but longer terms. STOXX600 is the stock index used for the log return calculation, with the Baker and Wurgler (2006) and Reis and Pinho (2020b) sentiment indices serving for comparison.

Macroeconomic variables
The paper defined a dummy variable analogous to the NBER (National Bureau of Economic Research) recession indicator using EUROREC, the OECD Recession Indicator for the Euro Area. The OECD identifies the months representing turning points, and a dummy variable conventionally represents that event. The dummy variable takes a value of zero in periods of expansion and one in periods of recession. This work controlled for the effects of several macroeconomic aggregates (GDP growth, population growth, exports of goods and services divided by GDP, and the inflation rate) on the sentiment proxies (Reis & Pinho, 2020b).

Statistical analysis
The paper applied two statistical methods. The first, dynamic factor modeling (DFM), extracts a common factor from the set of all sentiment proxies, and the second uses the common factor to model returns and volatility. Before estimating the dynamic factor, this work removed the macroeconomic effects from the single sentiment proxies to orthogonalize them. Each proxy variable was regressed on the macroeconomic variables by OLS (ordinary least squares) to extract systematic risk. The standardized residuals (mean = 0 and variance = 1) are widely seen as the most appropriate proxies for sentiment, as also applied by Baker and Wurgler (2006, 2007) and Reis and Pinho (2020b). The Dickey-Fuller test assessed whether each series had a unit root or was stationary, as required for all variables. Then the paper proceeded to dynamic factor modeling.
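The orthogonalization step described above can be illustrated with a minimal sketch. It assumes a single proxy and two macroeconomic regressors on synthetic data; the function names `solve_linear` and `orthogonalize` are hypothetical helpers introduced here, not the paper's code, and only the Python standard library is used.

```python
import random
import statistics

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def orthogonalize(proxy, macro):
    """Regress one proxy on the macro variables (with intercept) by OLS and
    return the standardized residuals (mean 0, variance 1)."""
    rows = [[1.0] + list(z) for z in macro]  # design matrix with a constant
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, proxy)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    resid = [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(rows, proxy)]
    mu, sd = statistics.fmean(resid), statistics.pstdev(resid)
    return [(e - mu) / sd for e in resid]

# Synthetic illustration (assumed data): a proxy partly driven by two macro series.
random.seed(0)
macro = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(240)]
proxy = [0.5 + 1.2 * g - 0.7 * p + random.gauss(0, 1) for g, p in macro]
clean = orthogonalize(proxy, macro)
```

By construction, the resulting series is uncorrelated with each macroeconomic regressor, which is the property the orthogonalized proxies need before they enter the factor model.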
Afterward, this paper conducts an in-sample and out-of-sample analysis, using the three-factor model of Fama and French as a benchmark, to test whether this study's sentiment measure better predicts European returns over a recursive rolling window of 50 months and an out-of-sample period from 2012 to July 2019.

Dynamic factor models
Dynamic factor models (or diffusion indexes) are models for multivariate time series in which unobserved factors have a vector autoregressive structure, and exogenous covariates are allowed both in the equations for the latent factors and in the equations for the observable dependent variables (Statacorp, 2019). They were originally proposed by Geweke (1977) as a time-series extension of factor models established for cross-sectional data. The main advantage of the approach is that a small number of factors can explain a large fraction of the variance in many financial series (Stock & Watson, 2011). Another advantage is that if the disturbances in equations (1) and (2) (see below) follow a Gaussian distribution, efficient forecasts can be obtained by regressing the variable of interest on the lagged factors and on its own lags (Stock & Watson, 2011).
Consequently, the models rely on a few factors substituting for a usually much larger number of variables.
Additionally, errors in the equations for dependent variables might be autocorrelated. The coefficients of dynamic factor models are obtained by maximum likelihood using the stationary Kalman filter (De Jong, 1991) and the De Jong (1988) diffuse Kalman filter. Accordingly, dynamic factor models extract a common component from a set of time series. Dynamic factor models have been frequently applied in macroeconomics (Stock & Watson, 1989, 1998; Watson & Engle, 1983; Reijer, 2013; Stock & Watson, 2002), but rarely in behavioral finance, with a single study on the American market (Lutz, 2016). This paper assumes that the individual sentiment proxies reflect a common dynamic explaining factor:
$$X_t = \partial X_{t-1} + \Lambda F_t + \varepsilon_t \qquad (1)$$
where $X_t$ is an $n \times 1$ vector of investor sentiment proxies orthogonalized with macro variables (Table 1), $X_{t-1}$ is its first lag, $\partial$ is a matrix of sentiment proxy loadings, $\Lambda$ is the $n \times k$ matrix of factor loadings, $F_t$ is a $k \times 1$ vector of period-specific factors, and $\varepsilon_t$ is an $n \times 1$ vector of measurement errors.
The dynamic factor model assumes factors with a dynamic autoregressive form:
$$F_t = \phi(L) F_{t-1} + \mu_t \qquad (2)$$
where $\phi(L)$ is a lag polynomial that describes the autoregressive structure of the data-generating process, and $\mu_t$ represents the error. The extracted factor is then standardized to mean 0 and variance 1. After reducing the explanatory variables to a single factor, the next step is the prediction of returns and volatility following the methodology of Stock and Watson (1991, 2002), Lutz (2016), and Statacorp (2019).
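To make the factor extraction concrete, the following is a minimal sketch of a Kalman filter for a single common factor with an AR(1) structure. It is a simplification under stated assumptions: the lagged-proxy term is dropped, measurement errors are taken as independent, and the loadings, noise variances, and autoregressive coefficient are treated as known, whereas the paper estimates them by maximum likelihood. The function name `kalman_factor` and all numerical values are hypothetical.

```python
import random

def kalman_factor(data, loadings, noise_var, phi, shock_var=1.0):
    """Filter f_t from x_t = loadings * f_t + e_t, f_t = phi * f_{t-1} + u_t.
    Measurement errors are assumed independent, so each series is
    processed one at a time (sequential update)."""
    f, p = 0.0, shock_var / (1.0 - phi ** 2)  # unconditional mean and variance
    out = []
    for x_t in data:
        f, p = phi * f, phi ** 2 * p + shock_var         # prediction step
        for x, lam, r in zip(x_t, loadings, noise_var):  # update step
            gain = p * lam / (lam ** 2 * p + r)
            f += gain * (x - lam * f)
            p *= 1.0 - gain * lam
        out.append(f)
    return out

# Simulated proxies (assumed values, five series as in the paper's proxy set).
random.seed(1)
phi, loadings, noise_var = 0.7, [0.9, 0.8, 0.7, 0.6, 0.5], [0.3] * 5
factor, truth, data = 0.0, [], []
for _ in range(300):
    factor = phi * factor + random.gauss(0, 1)
    truth.append(factor)
    data.append([lam * factor + random.gauss(0, r ** 0.5)
                 for lam, r in zip(loadings, noise_var)])

filtered = kalman_factor(data, loadings, noise_var, phi)
mean = sum(filtered) / len(filtered)
sd = (sum((f - mean) ** 2 for f in filtered) / len(filtered)) ** 0.5
stdf = [(f - mean) / sd for f in filtered]  # standardized to mean 0, variance 1
```

In practice a library estimator (for example a state-space dynamic factor routine) would estimate the parameters jointly with the factor; the sketch only shows how a single latent series is recovered from several noisy proxies and then standardized.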

Threshold autoregressive conditional heteroskedasticity
This work applied a threshold autoregressive conditional heteroskedasticity model (TGARCH), as in the works of Aydogan (2017) and Rupande et al. (2019), which assumes that periods of high volatility tend to be followed by further periods of high volatility, while periods of low volatility are followed by other periods of low volatility (volatility clustering). Consequently, the error term is conditionally heteroscedastic and can be represented by an ARCH (autoregressive conditional heteroskedasticity), GARCH (generalized autoregressive conditional heteroskedasticity), or TARCH (threshold autoregressive conditional heteroskedasticity) model. Therefore, this paper's model includes two equations, the ARCH mean equation (3) and the volatility equation (4). In the mean equation, $y_t$ is the log of the STOXX600 return, its one- and two-period lags enter as independent variables along with the level and the one-period lag of this work's standardized sentiment factor (stdf), obtained by dynamic factor modeling according to equations (1) and (2), and $\mu_t$ represents the residuals.
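As a hedged sketch, the mean equation described above could be written as follows. The coefficient names $\beta_0, \dots, \beta_4$ are assumptions introduced here for illustration; only $y_t$, stdf, and $\mu_t$ come from the text.

```latex
y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2}
      + \beta_3 \, stdf_t + \beta_4 \, stdf_{t-1} + \mu_t
```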
The equation for the conditional variance of the stock index returns at time t combines ARCH, GARCH, and TGARCH effects. The TGARCH term captures the leverage effect, and a positive value of its coefficient indicates that negative news raises upcoming volatility more than positive news. A negative leverage coefficient shows that positive shocks have larger effects on next-period volatility than negative shocks of the same magnitude. Sudha (2015) argues that a value of $\vartheta_i + \omega_j$ very close to unity indicates volatility that is persistent in both the short run and the long run. The volatility equation also includes regressors that capture a structural component of volatility, known as multiplicative heteroskedasticity (HET). This work includes the dynamic factor F (stdf in this paper's model), with its effects represented by the $\delta$ coefficient as defined by Judge et al. (1985) and Statacorp (2019). The work assumed a Gaussian distribution for all estimations. This paper's model used one ARCH term and one GARCH term, so that equation (4) simplifies to i = 1 and j = 1. Applying an ARCH-type model requires the absence of serial correlation and the presence of heteroscedasticity and ARCH effects in the residuals of the mean equation. This paper applied the Breusch-Godfrey LM test for residual correlation (Breusch, 1978; Godfrey, 1978), the White (1980) test for heteroscedasticity, and the Engle (1982) ARCH LM test for ARCH effects, as also applied by Reis and Pinho (2020b). After estimating the model, this paper assessed the residuals and ensured the absence of serial correlation. The work then ran the Toda and Yamamoto (1995) and Dolado and Lutkepohl (1996) causality tests, a modified Wald test for Granger causality that does not rely on a prior cointegration test, to assess whether this paper's measure of sentiment causes returns.
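The asymmetry discussed above can be illustrated with a small sketch of a TGARCH(1,1) variance recursion. The functional form is an assumption: the sentiment factor enters through a multiplicative term exp(const + delta * stdf_t), theta and omega stand for the ARCH and GARCH coefficients, and gamma is a hypothetical name for the leverage coefficient, applied when the previous shock is negative. All parameter values are invented for illustration.

```python
import math

def tgarch_variance(resid, stdf, const, delta, theta, gamma, omega):
    """Conditional variance path of an assumed TGARCH(1,1) with a
    multiplicative heteroskedasticity term exp(const + delta * stdf_t)."""
    sigma2 = [sum(e * e for e in resid) / len(resid)]  # start at sample variance
    for t in range(1, len(resid)):
        shock = resid[t - 1] ** 2
        # The leverage term only switches on after a negative shock.
        leverage = gamma * shock if resid[t - 1] < 0 else 0.0
        sigma2.append(math.exp(const + delta * stdf[t])
                      + theta * shock + leverage + omega * sigma2[t - 1])
    return sigma2

# Same-sized negative and positive shocks, neutral sentiment (assumed values).
params = dict(const=-8.0, delta=0.2, theta=0.1, gamma=0.15, omega=0.8)
after_bad = tgarch_variance([-0.05, 0.0], [0.0, 0.0], **params)[1]
after_good = tgarch_variance([0.05, 0.0], [0.0, 0.0], **params)[1]
```

With a positive gamma, `after_bad` exceeds `after_good`; with a negative gamma the ordering reverses, which corresponds to the case where positive shocks have the larger effect on next-period volatility.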
Next, this paper predicted the variance and, after standardization, compared and correlated it with the investor sentiment index measures proposed by Baker and Wurgler (2006) and Reis and Pinho (2020b). Table 2 shows the joint estimation of the mean equation and the volatility equation with ARCH, GARCH, and TGARCH effects, and the effect of stdf on the two equations. Only significant variables are shown.
This study's stdf factor closely tracks the main events causing instability in stock markets. After 2001, the bursting of the dot-com (technology) bubble and the terrorist attacks in the US caused great pessimism and led to the lowest stock market levels. The period from 2007 to 2009 corresponded to the subprime crisis, followed by the sovereign debt crisis, and this work's index captured the lowest sentiment level seen in the years covered by the sample, as well as the following period of recovered optimism until 2019. Table 2 presents the TGARCH, GARCH, ARCH, and sentiment (stdf) effects on volatility and on the log STOXX600 return. The mean equation (3) produced a P(chi2) = 0.3925, and thus this work cannot reject the null hypothesis of no serial correlation in the Breusch-Godfrey test. The residuals of the mean equation are also heteroskedastic, as the Lagrange multiplier test produces a P(chi2) < 0.0001, thereby demonstrating the presence of ARCH effects. Accordingly, the paper applies TGARCH to model STOXX600 variance.
Note: Table 3 reports the out-of-sample forecasts, from one to four months ahead, of the monthly excess European market return using this paper's sentiment index. The out-of-sample predictions are projected on a 60-month rolling recursive window with forecast formation at time t. $R^2_{oos}$ is calculated according to Campbell and Thompson (2008) for horizons 0 to 4, with Newey-West t-statistics.
Table 2 shows that this paper's sentiment measure stdf has a negative correlation with returns (with causality power according to the modified Wald test, Prob > chi2 = 0.0048), confirming previous analyses by Baker […]. Investors pay less attention to fundamentals in periods of euphoria and may buy at higher prices. Conversely, during periods of fear and negativity, investors are more cautious and build their portfolios on a more rational basis, underlining the importance of the risk-return principle.

RESULTS
A value of $\vartheta_i + \omega_j$ close to unity, represented by the ARCH (l.arch) and GARCH (l.garch) coefficients in Table 2, demonstrates both short-run and long-run persistent volatility (Sudha, 2015). This confirms, respectively, that past information and historical volatility have a substantial effect on contemporary volatility, and highlights the presence of volatility clustering in European markets.
The TGARCH-M mean equation, shaped as an ARMA(1,1), resolves any potential issues related to autocorrelation in the mean equation. The outcome shows that the ARMA(1,1) coefficient is statistically significant (P < 0.001, t-value = 3.67), which implies that STOXX600 returns are explained by their historical values, indicative of momentum and mean reversion in share trading, as well as by past unanticipated shocks, confirming the analyses by Rupande et al. (2019). This result may indicate that the effect of historical returns and historical shocks on the conditional mean disappears after a brief period.
Positive and negative news have different degrees of influence on stock market volatility. The evidence for such unequal effects (the leverage effect captured by the TARCH coefficient) suggests that negative news has a much stronger effect on volatility than positive news of the same size. Ho et al. (2013) applied two GARCH-family models to demonstrate that specific news sentiments have a daily impact on volatility and volatility persistence, and showed that negative news has a more substantial impact on volatility than positive news. Kumari and Mahakud (2015) also applied a VAR-GARCH model and concluded that negative investor sentiment, predominantly from retail investors, causes high volatility in US markets. Sayim and Rahman (2015) argued that, in general, investors have optimistic expectations about the economy and market fundamentals, which results in positive expectations, reduced uncertainty, and lower volatility of stock returns. Baker and Stein (2004) also concluded that pessimistic investor sentiment accelerates the evaporation of equity liquidity and strengthens selling pressure during a financial crisis, suggesting a negative psychological bias on liquidity and investor trading behavior. Furthermore, the model developed in this paper revealed a noteworthy positive association between sentiment and volatility (with causality power according to the modified Wald test, Prob > chi2 = 0.0048), with a negative TARCH coefficient, implying that the leverage effect is more noticeable in periods of positive news, thus resulting in periods of high volatility. This paper's results disagree with some published studies. However, they agree with many others, including Sudha (2015), Lee et al. (2002), and Aydogan (2017), with the latter showing a positive relationship between sentiment and variance in Italy and Turkey based on consumer confidence as the single proxy measure for sentiment.
Moreover, Kling and Gao (2008) and Kumari and Mahakud (2015) estimated a positive sentiment parameter in the variance equation. Bahloul and Bouri (2016) also obtained positive correlations between sentiment and volatility. Periods of festive investor mood may drive larger stock price fluctuations characteristic of volatility clustering, thereby attracting retail investors, who are more prone to sentiment shifts. Retail investors are willing to pay more, and high fluctuation in prices can therefore promote stock rally periods.
In Europe, institutional investors (Louche & Lydenberg, 2006) are more rational, more patient, and more prone to making investment decisions based on fundamentals rather than sentiment. This view is confirmed by Labidi and Yaakoubi's (2016) suggestion that aggregate volatility risk is an autonomous risk factor during negative sentiment periods. During optimistic periods, the association between aggregate volatility risk and predicted returns is weaker due to the greater involvement of sentiment-led agents. In times of pessimism and fear, when more rational traders prevail, prices include a premium that reflects market volatility.
Subdividing the analysis period (Figure 2), this work observes a strong positive Pearson correlation of r = 0.61 between sentiment and variance from 1999 to 2006. During the subprime and sovereign crisis (2007 to 2009), there is a clear negative correlation of r = -0.53, and from 2010 onwards, a positive correlation of r = 0.28. Therefore, the sentiment index developed in this study can capture the strength of the market variance in returns during the various crises and rank the severity of stock market crises. As in the 2007-2009 period, intense world crises associated with generalized panic generate greater market variance and instability. High sentiment and market recoveries also stimulate higher variance and related volatility, creating uncertainty in markets and demonstrating the asymmetric sentiment effect.
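The sub-period comparison above amounts to computing a Pearson correlation within each date range. A minimal sketch follows, using toy numbers rather than the paper's series; `pearson` and `correlation_by_period` are hypothetical helper names.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_by_period(years, sentiment, variance, periods):
    """Correlation between sentiment and variance within each (start, end) span."""
    out = {}
    for start, end in periods:
        idx = [i for i, yr in enumerate(years) if start <= yr <= end]
        out[(start, end)] = pearson([sentiment[i] for i in idx],
                                    [variance[i] for i in idx])
    return out

# Toy data (not the paper's series): the sign of the relation flips mid-sample.
years = [1999, 2000, 2001, 2002, 2007, 2008, 2009, 2010]
sentiment = [0.1, 0.4, 0.2, 0.6, 0.5, -1.0, -2.0, -0.5]
variance = [1.0, 1.6, 1.2, 2.0, 1.0, 2.5, 3.5, 2.0]
periods = [(1999, 2006), (2007, 2009)]
result = correlation_by_period(years, sentiment, variance, periods)
```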
During sentiment extremes, low sentiment increases uncertainty, reduces liquidity, induces sentiment-led retail traders to leave the market, and strongly attracts rational investors. In contrast, during periods of average sentiment, high volatility is caused by high sentiment, mainly in rally periods, when good news supports higher volatility as irrational retail investors join the market and broaden price fluctuations. Differences between this paper's results and studies conducted in other countries and regions may result from using an index instead of single sentiment proxy measures, from using past autoregressive values of sentiment based on an index built upon market data instead of sentiment measures based on text collection or surveys, and from using a stock index to study European markets as a whole.

Forecasting European excess market returns 1999-2018, robustness test, and out-of-sample analysis
The sentiment indicator is a precise predictor of returns in the European environment over different forecast horizons and under a rolling window of 50 months in an in-sample analysis. With substantial explanatory power for returns, as shown by the relevant $R^2$ and the significant coefficient (Figures 3 and 4), this paper's sentiment index captures the latent factor of irrational behavior. To evaluate the forecast accuracy of this work's sentiment index on returns, this paper uses the out-of-sample $R^2$ statistic, $R^2_{oos}$, defined by Campbell and Thompson (2008) as:
$$R^2_{oos} = 1 - \frac{\sum_t (r_t - \hat{r}_t)^2}{\sum_t (r_t - \bar{r}_t)^2}$$
where $r_t$ is the realized excess return, $\hat{r}_t$ is the forecast from the predictive model, and $\bar{r}_t$ is the benchmark forecast (the historical average return in Campbell and Thompson, 2008), both formed with information available before t.
When sentiment is low, forthcoming stock returns increase, which holds for many upcoming monthly periods. When investors are pessimistic and frightened, they are more disposed to rationality, and thus institutional investors enter the market when retail investors leave. These investors buy at low prices, select high-growth stocks and stocks that pay high dividends, and determine upcoming positive stock market earnings. This conclusion is also underlined by Baker and Wurgler (2006, 2007) for the American market, and by Zouaoui et al. (2011) for some European countries using consumer sentiment as the sentiment proxy. Also, Papapostolou et al. (2016) support the significance of supertanker industry sentiment as a contrarian global predictor of financial assets in both in-sample and out-of-sample frameworks. Han and Li (2017) argue that China's investor sentiment is a regular momentum signal at a monthly frequency both in and out of sample. They use a sentiment index based on Baker and Wurgler's (2007) method. Other researchers find evidence of intraday S&P 500 index return predictability when using lagged half-hour investor sentiment in both in-sample and out-of-sample analytical metrics (Sun et al., 2016).
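The out-of-sample evaluation can be sketched in a few lines. The sketch below makes several assumptions: a univariate predictive regression of next-month returns on current sentiment, an expanding (recursive) estimation sample starting at 60 months, and the historical mean as the benchmark, as in Campbell and Thompson (2008), rather than the Fama and French three-factor benchmark used in the paper. The function names and the simulated data are hypothetical.

```python
import random

def r2_oos(actual, model_forecast, benchmark_forecast):
    """Out-of-sample R^2: 1 - SSE(model) / SSE(benchmark)."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model_forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark_forecast))
    return 1.0 - sse_model / sse_bench

def recursive_forecasts(returns, sentiment, first_window=60):
    """One-step-ahead forecasts of returns[t] from sentiment[t-1], re-estimating
    the regression on all observations available before t."""
    actual, model, bench = [], [], []
    for t in range(first_window, len(returns)):
        x, y = sentiment[:t - 1], returns[1:t]  # pairs (sentiment[s-1], returns[s])
        mx, my = sum(x) / len(x), sum(y) / len(y)
        slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
                 / sum((a - mx) ** 2 for a in x))
        intercept = my - slope * mx
        model.append(intercept + slope * sentiment[t - 1])
        bench.append(sum(returns[:t]) / t)      # historical-mean benchmark
        actual.append(returns[t])
    return actual, model, bench

# Simulated series (assumed): returns load negatively on lagged sentiment.
random.seed(2)
sentiment = [random.gauss(0, 1) for _ in range(200)]
returns = [0.0] + [-0.5 * s + random.gauss(0, 0.1) for s in sentiment[:-1]]
actual, model, bench = recursive_forecasts(returns, sentiment)
score = r2_oos(actual, model, bench)
```

A positive `score` means the sentiment-based forecast has a lower squared error than the benchmark over the evaluation period; a value at or below zero means it adds nothing beyond the benchmark.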

CONCLUSION
The sentiment index developed in this study, built upon dynamic factor modeling of several individual sentiment proxies in Europe, closely tracks returns and volatility. The measure shows the spillover effect between the US and Europe, mainly during the subprime and sovereign crisis period. Investors' past psychological biases, as measured by this work's sentiment index, are correlated with the expected patterns of clustering, erratic behavior, and leverage effects in the variance of STOXX600 returns. This work concludes that investor sentiment affects volatility asymmetrically across firm peaks, strong troughs, and moderate periods of sentiment. However, periods of festive investor mood may drive larger stock price fluctuations characteristic of volatility clustering, thereby attracting retail investors, who are more prone to sentiment shifts. Retail investors are willing to pay more, so high price fluctuations can contribute to stock rally periods.
Furthermore, this work's sentiment index is a predictor of European returns and a valid indicator of risk aversion. Additionally, the out-of-sample forecasts on a 60-month rolling recursive window show that this sentiment index is significantly more accurate than the Fama and French three-factor model in anticipating excess market returns. Moreover, when controlling for recession and expansion periods, this article's investor sentiment index retains high predictive power. The sentiment index closely tracks the main events causing instability in stock markets and can serve as a robust financial monitoring and forecasting tool for financial service providers, regulators, and investors.