Government debt forecasting based on the ARIMA model

The paper explores theoretical and practical aspects of forecasting government debt in Ukraine. A visual analysis of changes in the amount of government debt was conducted, which led to the conclusion that the debt crisis in the country is deepening. The autoregressive integrated moving average (ARIMA) model is taken as the basic forecasting model, and its performance and diagnostics are assessed. The procedure for forecasting Ukrainian government debt with the ARIMA model is illustrated in the EViews software package: the series was tested for stationarity, the monthly government debt series was made stationary through a number of transformations, the model parameters were determined, and the optimal ARIMA specification was chosen. Based on the simulated time series, it is concluded that ARIMA tools can be used to predict government debt values.


INTRODUCTION
Government debt is not only a means of raising funds to finance public needs, but also an effective tool for stabilizing the country's economic development. Its predicted values support effective management decisions at the state level and the development of measures to improve the economic and debt situation.
Despite the advanced mathematical tools for data forecast, choosing the appropriate method that provides adequate forecasts is one of the main tasks that arise in forecasting the amount of government debt.
There are many different methods of forecasting economic information in modern statistical theory. Most of them relate to time series forecasting, without additional information, i.e. without analyzing the impact of other factors. Of course, such analyses are incomplete, but their results are often more accurate than other forecasting techniques. Constructing an ARIMA model is one of such methods. Its main idea is that some time series are a set of random variables that depend on time, but changes in the entire time series have certain rules that can be represented by the corresponding mathematical model. When analyzing the mathematical model, one can understand the structure and characteristics of the time series more deeply and achieve optimal predictive values.
Different factors, such as GDP, inflation, industrial production index, exchange rate, as well as other factors of the country's economic life, may influence the change in the value of the government debt indicator. Collecting, processing, and analyzing selected factors to build a multivariate regression model can be time consuming and require significant resources that do not correspond to the end result. For this reason, it is more appropriate to use time series-based forecasting methods, such as an ARIMA time series model.

LITERATURE REVIEW
Time series analysis, in particular ARIMA analysis, is widely used in the scientific literature to predict macroeconomic variables. The ARIMA (autoregressive integrated moving average) model, a time series prediction method, was developed and popularized in the 1970s by the statisticians G. Box and G. Jenkins (Box, Jenkins, & Reinsel, 1994). The use of Box-Jenkins processes makes it possible to build an accurate and adequate model for the short-term forecast, but because of data non-stationarity, the method needs improvement to produce a more accurate long-term forecast.
The government debt dynamics for the United States of America and the United Kingdom are examined by applying non-linearity tests and threshold autoregressive models (Gnegne & Jawadi, 2013). The research gives understanding that the dynamics of government debt have different threshold effects (due to economic downturns, oil shocks, and debt crises); their consideration can improve the modeling and forecasting of the government debt development.
In the context of analyzing the government borrowing forecast, it is important to focus on Ericsson (2017), who looks at different approaches to testing for potential bias and proposes the use of impulse indicator saturation (IIS) to find out how biased the forecasts are. This indicator can also be applied to the government debt in Ukraine.
Despite significant scientific contributions to ARIMA research, the development of such models for government debt and key debt sustainability indicators is in its infancy.
Goswami and Hossain (2013) use a time series forecasting tool (ARIMA model) to analyze debt sustainability in Bangladesh for the period 2013-2033. The objects of the study are public debt, external debt and debt sustainability indicators; the predicted values are compared with the thresholds and stress tests applied to the external debt.
It should be noted that carrying out such a study for some debt sustainability indicators in Ukraine (for example, the government debt to GDP ratio, government external debt to annual exports of goods and services) can be problematic (because annual data are not enough to make adequate forecasts).
Nikoloski and Nedanovski (2017) and Navapan and Boonyakunakorn (2017) consider the varieties of ARIMA models that simultaneously take into account lag variables of the studied indicator and exogenous factors. In particular, Nikoloski and Nedanovski provide a detailed analysis of the dynamics of government debt and its structure in the Republic of Macedonia over 16 years. The study carefully examines how certain factors affect government debt to project its future trend.
Researchers usually forecast Ukraine's government debt together with other socio-economic indicators of the country. For this purpose, one-factor and multi-factor regression models are used to extrapolate future values. Yashchenko (2014) uses a four-equation system to predict external and internal debt on the basis of revenues and expenditures of the state budget of Ukraine and to determine the correlation between the variables.
ARIMA models are rarely used to predict Ukraine's public debt.
When predicting the direct government debt of Ukraine and considering the cyclicality factor, Caruk (2007) found that the use of taxonomies of forecasting methods can significantly simplify the choice of the optimal forecast model given the existing limitations, resources, and tasks set by a researcher. A multicriteria approach to selecting the optimal ARIMA model specification is proposed. Nevertheless, many issues concerning a holistic view of ARIMA model construction for forecasting the amount of public debt, taking into account the specifics of current economic transformations, have not been fully resolved. In addition, the analysis of government debt and its impact on the economy requires continuous improvement of research methods, especially in terms of dynamic comparisons and forecasting of data.
The purpose of the paper is to analyze the dynamics of government debt and to predict its future values using the ARIMA model.

METHODS
The main research methods are scientific abstraction, generalization and systematization; they are used to consider scientific literature and substantiate basic theoretical propositions. Besides, the time series analysis method (ARIMA model) is employed to analyze the dynamics and trends of changing public debt and predict its future values.
To build the ARIMA time series model, EViews software is used, which has a considerable range of capabilities for primary data analysis, their graphical display, etc.
The factual base of the research consists of reporting and analytical information of the Ministry of Finance of Ukraine, as well as articles of Ukrainian and foreign scientists, materials of international organizations and their scientific calculations.
ARIMA modeling is a procedure for determining the parameters p, d, and q, where p is the order of the AR component, d is the order of integration of the series, and q is the order of the MA component. The process of analyzing data and building an ARIMA model can be presented in several steps:
1. Time series analysis (plotting the dynamics of the indicators and analyzing the appearance of the graph: the presence of a trend, cycle, seasonality, and zero values).
2. Checking the time series for stationarity and determining the integration parameter d. If the time series is non-stationary, the order of integration, namely the form in which it behaves as a stationary series (e.g., first or second differences, logarithms, seasonal adjustments, etc.), should be defined and justified (Lukianenko & Zhuk, 2013).
3. Model estimation: regression methods are used to obtain estimates of the parameters included in the model. The autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the series are calculated and used to determine the autoregressive order p and the moving average order q of the ARMA model.
4. Testing the basic prerequisites of regression analysis and checking the model for adequacy.
5. Using the model for forecasting.
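For reference, the model that these steps identify can be written in the usual lag-operator notation. The symbols below (the lag operator B, the constant c, the coefficients φ and θ, and the white-noise error ε) are conventional textbook notation assumed here, not taken from the paper's own formulas.

```latex
\phi(B)\,(1 - B)^{d}\, y_t = c + \theta(B)\,\varepsilon_t,
\qquad
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p},
\qquad
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q},
```

where B y_t = y_{t-1}, so that (1 - B)^d y_t is the series differenced d times.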
In practice, varieties of ARIMA models, such as ARMAX models, are also used; they simultaneously consider lag variables of the studied indicator and exogenous factors. This model is a kind of hybrid of linear multivariate regression and ARIMA models. Being specified correctly, the ARMAX models make it possible to reduce forecasting errors and prove more effective than multivariate regression models or even pure ARIMA models (Lukianenko & Zhuk, 2013).

RESULTS
Let's build the ARIMA model for the government debt of Ukraine and determine the predictive value of this indicator for the next few months. There is a time series of monthly data on the amount of the government debt of Ukraine from December 2011 to October 2019, in UAH million.
The first stage is a visual analysis of changes in the volume of the government debt of Ukraine (a simple line graph of the raw data was selected, with an additional histogram of the distribution along the axis). Figure 1 shows the results.
It should be noted that over the last few years, there has been a steady tendency to increase the government debt of Ukraine; it is determined by the irrational pursuit of debt policy, high cost of attracting new loans, unstable debt refinancing in previous years, currency risks of the debt, etc.
During 2011-2013, there was a slight increase in government debt, with a minimal growth rate for this period. Figure 1 shows that since 2014, there has been a rapid aggravation of the debt stability of Ukraine.
The period from January 2014 to the beginning of 2015 can be called the extensive growth phase, with a peak value in February 2015. The distribution histogram, which is further represented on the left, has three peaks.
The next step is to check the time series for stationarity. The Dickey-Fuller test is the most common stationarity test for time series. Figure 2 shows the results of testing the series in its original form (in levels). The calculated value of the τ-statistic is -1.163521, which is greater than (lies to the right of) the critical values at the 1, 5, and 10% significance levels. Besides, the p-value is 91% (greater than 10%), so the null hypothesis that the series has a unit root (is non-stationary) cannot be rejected at any conventional significance level.
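The logic of the test can be illustrated with a minimal sketch. It assumes the simplest Dickey-Fuller regression with a constant and no lagged differences, Δy_t = a + g·y_{t-1} + e_t, whereas EViews typically runs the augmented version with automatic lag selection. The function names, the synthetic series, and the rounded asymptotic critical values are illustrative assumptions, not the paper's data or output.

```python
import math
import random


def dickey_fuller_tau(series):
    """Return the tau-statistic from the regression dy_t = a + g * y_{t-1} + e_t."""
    x = series[:-1]                                   # lagged levels y_{t-1}
    dy = [b - a for a, b in zip(series, series[1:])]  # first differences
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx                                 # OLS slope on y_{t-1}
    alpha = mean_dy - gamma * mean_x                  # OLS intercept
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    sigma2 = rss / (n - 2)                            # residual variance
    return gamma / math.sqrt(sigma2 / sxx)            # slope / standard error


# Rounded asymptotic critical values for the constant-only case (assumption).
CRITICAL_VALUES = {"1%": -3.43, "5%": -2.86, "10%": -2.57}


def has_unit_root(series, level="5%"):
    """True when the unit-root null cannot be rejected at the given level."""
    return dickey_fuller_tau(series) > CRITICAL_VALUES[level]


# Synthetic illustration (not the debt data): a strongly mean-reverting series.
rng = random.Random(0)
stationary = [0.0]
for _ in range(200):
    stationary.append(0.2 * stationary[-1] + rng.gauss(0.0, 1.0))

print(round(dickey_fuller_tau(stationary), 2), has_unit_root(stationary))
```

A τ-statistic to the right of the critical value, as for the debt series in levels, leaves the unit-root null in place; a value far to the left rejects it.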
The observed series was transformed to bring it to the stationary process. Since there is an anomalous value in this series (February 2015) that is not typical for the whole series, it was decided to replace it with an average value (between the previous and the next) to prevent the results from being distorted.
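The replacement of the anomalous observation can be sketched as follows. The function name and the sample values are hypothetical; only the rule itself (the mean of the previous and the next observation) comes from the text.

```python
def replace_with_neighbour_mean(series, index):
    """Return a copy of `series` with the value at `index` replaced by the
    mean of the observations immediately before and after it."""
    if not 0 < index < len(series) - 1:
        raise ValueError("index must have a neighbour on both sides")
    cleaned = list(series)
    cleaned[index] = (series[index - 1] + series[index + 1]) / 2
    return cleaned


# Hypothetical monthly values (not the actual debt figures).
debt = [100.0, 104.0, 180.0, 110.0, 113.0]
cleaned = replace_with_neighbour_mean(debt, 2)
print(cleaned)  # [100.0, 104.0, 107.0, 110.0, 113.0]
```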
The results of the Dickey-Fuller test on the first differences indicate that the transformed series is stationary (Figure 3). The calculated τ-statistic (-8.89) is less than (lies to the left of) the MacKinnon critical values at the 1, 5, and 10% levels; therefore, the null hypothesis of a unit root (non-stationarity) in the series of first differences is rejected, with a reported p-value of zero. Thus, the model will be built for the time series in first differences, with order of integration 1; at this stage, the specification is ARIMA(p, 1, q), with p and q still to be determined.
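Working in first differences means that forecasts of the differenced series have to be cumulated back to levels. A minimal sketch of both directions, with hypothetical function names and values:

```python
from itertools import accumulate


def difference(series):
    """First differences: y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def integrate(last_level, diff_forecasts):
    """Turn forecasts of first differences back into level forecasts,
    starting from the last observed level."""
    return list(accumulate(diff_forecasts, initial=last_level))[1:]


# Hypothetical levels (not the actual debt figures).
levels = [100.0, 104.0, 107.0, 110.0]
print(difference(levels))             # [4.0, 3.0, 3.0]
print(integrate(110.0, [2.0, 1.5]))   # [112.0, 113.5]
```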
Consider the graphs of the autocorrelation and partial autocorrelation functions of the Ukrainian government debt series (see Figure 4) to determine the general specification of the future ARIMA model and the number of lags for each component.
The correlogram is analyzed based on the key properties of the ACF/PACF graphs for MA, AR, and ARMA processes (Figure 5) (Lukianenko & Zhuk, 2013).
The visual analysis of the correlogram (ACF/PACF functions) makes it possible to determine whether the selected data set is a pure AR or MA process or a mixed ARMA process. A conclusion on the maximum number of lags can be drawn only for pure processes. For a mixed process, special identification procedures (for example, the Hannan-Rissanen procedure) should be used. The ACF and PACF graphs are examined to determine the type of the process and the number of lags to be included in the model. Figure 4 indicates that the 11th lag, the 3rd lag, and probably the 1st lag are significant. The process is mixed, as evidenced by the visual analysis of the ACF/PACF charts.
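What the correlogram displays can be reproduced with a short sketch: sample autocorrelations, partial autocorrelations obtained with the Durbin-Levinson recursion, and the usual approximate 95% band of ±1.96/√n used to flag significant lags. The function names and the sample series are hypothetical, and EViews may compute the PACF slightly differently.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []                      # phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_curr = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)]
        phi_curr.append(phi_kk)
        result.append(phi_kk)
        phi_prev = phi_curr
    return result


def significant_lags(values, n_obs):
    """Lags k >= 1 whose coefficient lies outside the +/- 1.96 / sqrt(n) band."""
    bound = 1.96 / math.sqrt(n_obs)
    return [k for k, v in enumerate(values) if k >= 1 and abs(v) > bound]


# Hypothetical differenced series (not the debt data).
diffs = [1.2, -0.4, 0.9, -1.1, 0.3, 0.8, -0.6, 1.0, -0.9, 0.2, 0.7, -0.5]
print(significant_lags(acf(diffs, 4), len(diffs)))
print(significant_lags(pacf(diffs, 4), len(diffs)))
```

For a pure AR(p) process the PACF cuts off after lag p, and for a pure MA(q) process the ACF cuts off after lag q; when neither cuts off cleanly, the process is treated as mixed.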
Given that judgment is very subjective, special procedures must be used to establish a more accurate model and identify the order of AR and MA components.
Since the process is not a pure autoregressive or moving average process, the order of the AR and MA components was determined using the Hannan-Rissanen procedure. According to this procedure, the AR component was determined using the least-squares method (the optimal lags for inclusion in the model are those for which the value of the AIC information criterion is minimal) (Greene, 2012).
After determining the optimal AR component, it is necessary to form the set of model residuals for later use in determining the optimal order of the MA component of the ARMA/ARIMA model.
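The first stage described above can be sketched as follows: AR(p) models are fitted by ordinary least squares for several p on a common sample, the order with the smallest AIC is kept, and its residuals are returned for the later MA stage. This is only an illustration under stated assumptions: the Gaussian form AIC = n·ln(RSS/n) + 2k, the helper names, and the synthetic series are not taken from the paper, and the MA stage is not shown.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ar(series, p, start):
    """OLS fit of y_t = c + sum_i phi_i * y_{t-i} + e_t for t >= start."""
    rows = [[1.0] + [series[t - i] for i in range(1, p + 1)]
            for t in range(start, len(series))]
    y = series[start:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yt - sum(b * v for b, v in zip(beta, r)) for r, yt in zip(rows, y)]
    return beta, resid


def select_ar_order(series, max_p):
    """Return (p, coefficients, residuals) minimising the Gaussian AIC."""
    best = None
    for p in range(1, max_p + 1):
        beta, resid = fit_ar(series, p, start=max_p)  # same sample for every p
        n = len(resid)
        aic = n * math.log(sum(e * e for e in resid) / n) + 2 * (p + 1)
        if best is None or aic < best[0]:
            best = (aic, p, beta, resid)
    return best[1], best[2], best[3]


# Synthetic AR(1) series for illustration (not the debt data).
rng = random.Random(42)
series = [0.0]
for _ in range(300):
    series.append(0.6 * series[-1] + rng.gauss(0.0, 1.0))

order, coefficients, residuals = select_ar_order(series, max_p=4)
print(order, [round(c, 3) for c in coefficients])
```

Fitting every candidate on the same sample (starting at `max_p`) keeps the AIC values comparable across orders.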
Numerical criteria yield only a single value on which the conclusion about model adequacy is based. It should also be noted that the most accurate estimation is given by the Schwarz-Rissanen information criterion (SIC) and the Hannan-Quinn criterion (HQ), since they are consistent: as the sample size grows (T → ∞), they identify the true model. Using the Akaike criterion (AIC) alone tends to produce an unjustified increase in the number of model parameters.
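The difference between the three criteria lies only in the penalty on the number of parameters. The sketch below uses the per-observation normalisation commonly reported by econometric packages; this normalisation, the function name, and the input values are assumptions for illustration, not figures from the paper.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Per-observation AIC, SIC and HQ for a fitted model."""
    base = -2.0 * log_likelihood / n_obs
    return {
        "AIC": base + 2.0 * n_params / n_obs,
        "SIC": base + n_params * math.log(n_obs) / n_obs,
        "HQ": base + 2.0 * n_params * math.log(math.log(n_obs)) / n_obs,
    }


# Hypothetical log-likelihood, parameter count and sample size.
ic = information_criteria(log_likelihood=-950.0, n_params=5, n_obs=94)
print({name: round(value, 4) for name, value in ic.items()})
```

For samples of more than about 15 observations, the per-parameter penalties are ordered 2 < 2·ln(ln T) < ln T, so AIC penalises extra parameters least and SIC most, which is why AIC tends to favor larger models.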
To construct the predicted values of the transformed time series of the government debt of Ukraine, parameter estimates for 96 specifications of the ARIMA model with the number of parameters p, q, d from 1 to 12 were determined. Accordingly, the models with the lowest SIC and AIC are selected for further analysis. Table 1 gives the most significant model specifications.
The proposed model makes it possible to obtain forecast values of government debt in Ukraine (the forecasting results for the next five months are shown in Table 2). As Table 5 shows, the projected growth of government debt slows from 2020 onward over the forecast horizon.
The mean absolute percentage error (MAPE) is one of the most widely used indicators of a model's predictive quality. In this case, MAPE = 12.9%, which indicates a high predictive quality of the model. The ARIMA model for forecasting government debt thus confirmed its positive predictive properties. Therefore, it is advisable to recommend the implementation of this technique.
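For clarity, MAPE averages the absolute forecast errors expressed as a percentage of the actual values. A minimal sketch with hypothetical numbers:

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical actual and forecast values (not the paper's data).
print(round(mape([100.0, 200.0], [110.0, 180.0]), 2))  # 10.0
```

The measure is undefined when an actual value is zero, which is not a concern for a debt series in levels.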

CONCLUSION
Forecasting trends in economic variables is a very complex task. To produce such forecasts, economics relies on econometric modeling. The obtained forecasts are subject to additional practical and theoretical processing to get the most relevant and accurate results. Quite often, operational decisions have to rely on predicted values of indicators without a detailed analysis of the factors that influence their change. The ARIMA model works on this principle.
The model consists of AR, I, and MA components, where AR is the autoregressive part, I denotes integration and indicates the order of differencing needed to make the series stationary, and MA is the moving average part.
Summing up the ARIMA simulation of Ukraine's government debt, one can state that the government debt indicators for the period studied formed a non-stationary time series that can be made stationary by performing a series of transformations (in particular, replacing an anomalous value with the average of its neighboring values) and taking the first difference of the resulting series. To forecast the value of the government debt of Ukraine, parameter estimates for 96 specifications of the ARIMA model were obtained. The optimal specification, ARIMA(1, 1, 3), was selected. The model can generally be used in practice, but it requires further improvements and calculations.
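Written out, the selected ARIMA(1, 1, 3) specification has the following general form. The symbols (the constant c, the coefficients φ_1 and θ_1 to θ_3, and the white-noise error ε_t) are conventional notation assumed here; the estimated coefficient values are those reported in the paper's tables and are not reproduced.

```latex
\Delta y_t = c + \phi_1\,\Delta y_{t-1}
           + \varepsilon_t + \theta_1\,\varepsilon_{t-1}
           + \theta_2\,\varepsilon_{t-2} + \theta_3\,\varepsilon_{t-3},
\qquad
\Delta y_t = y_t - y_{t-1}.
```

Level forecasts follow by cumulating the forecast differences from the last observed value of the series.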