Інтерпретація моделі ARIMA

У мене питання про моделі ARIMA. Скажімо, у мене є часовий ряд $Y_t$ який я хотів би прогнозувати, і модель $\text{ARIMA}(2,2)$ здається хорошим способом проведення прогнозування.

Δ Y_{t} = α_{1} Δ Y_{t - 1} + α_{2} Δ Y_{т - 2} + ν_{т} + θ_{1} ν_{т - 1} + θ_{2} ν_{т - 2}

$\Delta Y_t = \alpha_1 \Delta Y_{t-1} + \alpha_2 \Delta Y_{t-2} + \nu_{t} + \theta_1 \nu_{t-1} + \theta_2 \nu_{t-2}$ Тепер відставання

Y

$Y$ означає, що сьогодні на мою серію впливають попередні події. Це має сенс. Але яке тлумачення помилок? Мій попередній залишок (як я був у моєму розрахунку) впливає на цінність моєї серії сьогодні? Як обчислюються відсталі залишки в цій регресії, оскільки це продукт / залишок регресії?

regression time-series interpretation

— габріель
джерело

Я думаю, що вам потрібно пам’ятати, що моделі ARIMA є атеоретичними моделями, тому звичайні правила інтерпретації оцінених коефіцієнтів регресії строго не застосовуються однаково. Моделі ARIMA мають певні особливості, які слід пам’ятати. Наприклад, чим нижчі значення

в AR (1), тим швидше швидкість конвергенції. Але візьмемо для прикладу модель AR (2). Не всі моделі AR (2) однакові! Наприклад, якщо умова

задоволена, то AR (2) відображає псевдопериодичну поведінку, і в результаті її прогнози становлять стохастичні цикли.

α_{1}

$\alpha_{1}$

(α_{1}^{2} + 4 α_{2} < 0)

$(\alpha_{1}^{2}+4\alpha_{2}<0)$

— Graeme Walsh

(продовження ...) Дещо подібним чином, коли ми маємо справу з векторними авторегресіями, трактується функції імпульсного реагування (ІЧС), а не оцінені коефіцієнти; Коефіцієнти часто важко інтерпретувати, але сенс зазвичай може бути отриманий з IRF. З цікавості ви знайшли багато робіт, в яких автор (и) приділяли багато уваги інтерпретації коефіцієнтів у моделі ARIMA?

— Graeme Walsh

Здається, проблема із позначенням. "

" не може бути правильним, оскільки моделі ARIMA мають три терміни

для кожного з компонентів AR / I / MA відповідно, тоді як моделі ARMA мають два (наприклад,

) - але, здається, у вас є перше розмежування, яке б підказало вам мати на увазі

. Відредагуйте, щоб відобразити свої наміри.

ARIMA (2, 2)

$\text{ARIMA}(2,2)$

(p, d, q)

$(p,d,q)$

ARMA (2, 2)

$\text{ARMA}(2,2)$

ARIMA (2, 1, 2)

$\text{ARIMA}(2,1,2)$

— Glen_b -Встановити Моніку

@Glen_b Я пригадую, що я задавав те ж саме в іншому питанні . Виявляється, у нас є дублювання різновидів. Це питання та пов'язане з цим питання дуже схожі.

— Graeme Walsh

Я думаю, що вам потрібно пам’ятати, що моделі ARIMA є атеоретичними моделями, тому звичайний підхід до інтерпретації оцінених коефіцієнтів регресії насправді не переносить на моделювання ARIMA.

Для інтерпретації (або розуміння) оцінених моделей ARIMA було б добре усвідомити різні функції, що відображаються у ряді поширених моделей ARIMA.

Ми можемо вивчити деякі з цих особливостей, дослідивши типи прогнозів, що виробляються різними моделями ARIMA. Це основний підхід, який я застосував нижче, але гарною альтернативою було б розглянути функції імпульсного реагування або динамічні часові шляхи, пов'язані з різними моделями ARIMA (або стохастичними рівняннями різниці). Я про це поговорю наприкінці.

AR (1) Моделі

Розглянемо мить модель AR (1). У цій моделі можна сказати, що чим менше значення тим швидше швидкість конвергенції (до середнього). Ми можемо спробувати зрозуміти цей аспект моделей AR (1), дослідивши характер прогнозів для невеликого набору модельованих моделей AR (1) з різними значеннями для $\alpha_{1}$ $\alpha_{1}$ .

Набір з чотирьох моделей AR (1), про які ми говоритимемо, можна записати в алгебраїчні позначення у вигляді: де - константа, а решта позначень випливає з OP. Як видно, кожна модель відрізняється лише відносно значення .

Y_{t} = C + 0.95 Y_{t - 1} + ν_{t} (1) Y_{t} = C + 0.8 Y_{t - 1} + ν_{t} (2) Y_{t} = C + 0.5 Y_{t - 1} + ν_{t} (3) Y_{t} = C + 0.4 Y_{t - 1} + ν_{t} (4)

$\begin{equation} Y_{t} = C + 0.95 Y_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (1)\\ Y_{t} = C + 0.8 Y_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (2)\\ Y_{t} = C + 0.5 Y_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (3)\\ Y_{t} = C + 0.4 Y_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (4) \end{equation}$

C

$C$

α_{1}

$\alpha_{1}$

На графіку нижче я побудував позапробні прогнози для цих чотирьох моделей AR (1). Видно, що прогнози для моделі AR (1) з сходяться з меншою швидкістю щодо інших моделей. Прогнози для моделі AR (1) з збігаються швидше, ніж інші. $\alpha_{1} = 0.95$ $\alpha_{1} = 0.4$

введіть тут опис зображення

Примітка: коли червона лінія горизонтальна, вона досягла середнього значення модельованого ряду.

MA (1) Моделі

Now let's consider four MA(1) models with different values for $\theta_{1}$ . The four models we'll discuss can be written as:

Y_{t} = C + 0.95 ν_{t - 1} + ν_{t} (5) Y_{t} = C + 0.8 ν_{t - 1} + ν_{t} (6) Y_{t} = C + 0.5 ν_{t - 1} + ν_{t} (7) Y_{t} = C + 0.4 ν_{t - 1} + ν_{t} (8)

$\begin{equation} Y_{t} = C + 0.95 \nu_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (5)\\ Y_{t} = C + 0.8 \nu_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (6)\\ Y_{t} = C + 0.5 \nu_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (7)\\ Y_{t} = C + 0.4 \nu_{t-1} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (8) \end{equation}$

In the graph below, I have plotted out-of-sample forecasts for these four different MA(1) models. As the graph shows, the behaviour of the forecasts in all four cases are markedly similar; quick (linear) convergence to the mean. Notice that there is less variety in the dynamics of these forecasts compared to those of the AR(1) models.

enter image description here

Note: when the red line is horizontal, it has reached the mean of the simulated series.

AR(2) Models

Things get a lot more interesting when we start to consider more complex ARIMA models. Take for example AR(2) models. These are just a small step up from the AR(1) model, right? Well, one might like to think that, but the dynamics of AR(2) models are quite rich in variety as we'll see in a moment.

Let's explore four different AR(2) models:

Y_{t} = C + 1.7 Y_{t - 1} - 0.8 Y_{t - 2} + ν_{t} (9) Y_{t} = C + 0.9 Y_{t - 1} - 0.2 Y_{t - 2} + ν_{t} (10) Y_{t} = C + 0.5 Y_{t - 1} - 0.2 Y_{t - 2} + ν_{t} (11) Y_{t} = C + 0.1 Y_{t - 1} - 0.7 Y_{t - 2} + ν_{t} (12)

$\begin{equation} Y_{t} = C + 1.7 Y_{t-1} -0.8 Y_{t-2} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (9)\\ Y_{t} = C + 0.9 Y_{t-1} -0.2 Y_{t-2} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (10)\\ Y_{t} = C + 0.5 Y_{t-1} -0.2 Y_{t-2} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (11)\\ Y_{t} = C + 0.1 Y_{t-1} -0.7 Y_{t-2} + \nu_{t} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (12) \end{equation}$

The out-of-sample forecasts associated with each of these models is shown in the graph below. It is quite clear that they each differ significantly and they are also quite a varied bunch in comparison to the forecasts that we've seen above - except for model 2's forecasts (top right plot) which behave similar to those for an AR(1) model.

enter image description here

Note: when the red line is horizontal, it has reached the mean of the simulated series.

The key point here is that not all AR(2) models have the same dynamics! For example, if the condition,

α_{1}^{2} + 4 α_{2} < 0,

$\begin{equation} \alpha_{1}^{2}+4\alpha_{2} < 0, \end{equation}$ is satisfied then the AR(2) model displays pseudo periodic behaviour and as a result its forecasts will appear as stochastic cycles. On the other hand, if this condition is not satisfied, stochastic cycles will not be present in the forecasts; instead, the forecasts will be more similar to those for an AR(1) model.

It's worth noting that the above condition comes from the general solution to the homogeneous form of the linear, autonomous, second-order difference equation (with complex roots). If this if foreign to you, I recommend both Chapter 1 of Hamilton (1994) and Chapter 20 of Hoy et al. (2001).

Testing the above condition for the four AR(2) models results in the following:

(1.7)^{2} + 4 (- 0.8) = - 0.31 < 0 (13) (0.9)^{2} + 4 (- 0.2) = 0.01 > 0 (14) (0.5)^{2} + 4 (- 0.2) = - 0.55 < 0 (15) (0.1)^{2} + 4 (- 0.7) = - 2.54 < 0 (16)

$\begin{equation} (1.7)^{2} + 4 (-0.8) = -0.31 < 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (13)\\ (0.9)^{2} + 4 (-0.2) = 0.01 > 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (14)\\ (0.5)^{2} + 4 (-0.2) = -0.55 < 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (15)\\ (0.1)^{2} + 4 (-0.7) = -2.54 < 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (16) \end{equation}$

As expected by the appearance of the plotted forecasts, the condition is satisfied for each of the four models except for model 2. Recall from the graph, model 2's forecasts behave ("normally") similar to an AR(1) model's forecasts. The forecasts associated with the other models contain cycles.

Application - Modelling Inflation

Now that we have some background under our feet, let's try to interpret an AR(2) model in an application. Consider the following model for the inflation rate ( $\pi_{t}$ ):

π_{t} = C + α_{1} π_{t - 1} + α_{2} π_{t - 2} + ν_{t} .

$\begin{equation} \pi_{t} = C + \alpha_{1} \pi_{t-1} + \alpha_{2} \pi_{t-2} + \nu_{t}. \end{equation}$ A natural expression to associate with such a model would be something like: "inflation today depends on the level of inflation yesterday and on the level of inflation on the day before yesterday". Now, I wouldn't argue against such an interpretation, but I'd suggest that some caution be drawn and that we ought to dig a bit deeper to devise a proper interpretation. In this case we could ask, in which way is inflation related to previous levels of inflation? Are there cycles? If so, how many cycles are there? Can we say something about the peak and trough? How quickly do the forecasts converge to the mean? And so on.

These are the sorts of questions we can ask when trying to interpret an AR(2) model and as you can see, it's not as straightforward as taking an estimated coefficient and saying "a 1 unit increase in this variable is associated with a so-many unit increase in the dependent variable" - making sure to attach the ceteris paribus condition to that statement, of course.

Bear in mind that in our discussion so far, we have only explored a selection of AR(1), MA(1), and AR(2) models. We haven't even looked at the dynamics of mixed ARMA models and ARIMA models involving higher lags.

To show how difficult it would be to interpret models that fall into that category, imagine another inflation model - an ARMA(3,1) with $\alpha_{2}$ constrained to zero:

π_{t} = C + α_{1} π_{t - 1} + α_{3} π_{t - 3} + θ_{1} ν_{t - 1} + ν_{t} .

$\begin{equation} \pi_{t} = C + \alpha_{1} \pi_{t-1} + \alpha_{3} \pi_{t-3} + \theta_{1}\nu_{t-1} + \nu_{t}. \end{equation}$

Say what you'd like, but here it's better to try to understand the dynamics of the system itself. As before, we can look and see what sort of forecasts the model produces, but the alternative approach that I mentioned at the beginning of this answer was to look at the impulse response function or time path associated with the system.

This brings me to next part of my answer where we'll discuss impulse response functions.

Impulse Response Functions

Those who are familiar with vector autoregressions (VARs) will be aware that one usually tries to understand the estimated VAR model by interpreting the impulse response functions; rather than trying to interpret the estimated coefficients which are often too difficult to interpret anyway.

The same approach can be taken when trying to understand ARIMA models. That is, rather than try to make sense of (complicated) statements like "today's inflation depends on yesterday's inflation and on inflation from two months ago, but not on last week's inflation!", we instead plot the impulse response function and try to make sense of that.

Application - Four Macro Variables

For this example (based on Leamer(2010)), let's consider four ARIMA models based on four macroeconomic variables; GDP growth, inflation, the unemployment rate, and the short-term interest rate. The four models have been estimated and can be written as:

\begin{array}{rcl} Y_{t} & = & 3.20 + 0.22 Y_{t - 1} + 0.15 Y_{t - 2} + ν_{t} \\ π_{t} & = & 4.10 + 0.46 π_{t - 1} + 0.31 π_{t - 2} + 0.16 π_{t - 3} + 0.01 π_{t - 4} + ν_{t} \\ u_{t} & = & 6.2 + 1.58 u_{t - 1} - 0.64 u_{t - 2} + ν_{t} \\ r_{t} & = & 6.0 + 1.18 r_{t - 1} - 0.23 r_{t - 2} + ν_{t} \end{array}

$\begin{eqnarray} Y_{t} &=& 3.20 + 0.22 Y_{t-1} + 0.15 Y_{t-2} + \nu_{t}\\ \pi_{t} &=& 4.10 + 0.46 \pi_{t-1} + 0.31\pi_{t-2} + 0.16\pi_{t-3} + 0.01\pi_{t-4} + \nu_{t}\\ u_{t} &=& 6.2+ 1.58 u_{t-1} - 0.64 u_{t-2} + \nu_{t}\\ r_{t} &=& 6.0 + 1.18 r_{t-1} - 0.23 r_{t-2} + \nu_{t} \end{eqnarray}$ where

Y_{t}

$Y_{t}$ denotes GDP growth at time

t

$t$ ,

π

$\pi$ denotes inflation,

u

$u$ denotes the unemployment rate, and

r

$r$ denotes the short-term interest rate (3-month treasury).

The equations show that GDP growth, the unemployment rate, and the short-term interest rate are modeled as AR(2) processes while inflation is modeled as an AR(4) process.

Rather than try to interpret the coefficients in each equation, let's plot the impulse response functions (IRFs) and interpret them instead. The graph below shows the impulse response functions associated with each of these models.

enter image description here

Don't take this as a masterclass in interpreting IRFs - think of it more like a basic introduction - but anyway, to help us interpret the IRFs we'll need to accustom ourselves with two concepts; momentum and persistence.

These two concepts are defined in Leamer (2010) as follows:

Momentum: Momentum is the tendency to continue moving in the same direction. The momentum effect can offset the force of regression (convergence) toward the mean and can allow a variable to move away from its historical mean, for some time, but not indefinitely.

Persistence: A persistence variable will hang around where it is and converge slowly only to the historical mean.

Equipped with this knowledge, we now ask the question: suppose a variable is at its historical mean and it receives a temporary one unit shock in a single period, how will the variable respond in future periods? This is akin to asking those questions we asked before, such as, do the forecasts contains cycles?, how quickly do the forecasts converge to the mean?, etc.

At last, we can now attempt to interpret the IRFs.

Following a one unit shock, the unemployment rate and short-term interest rate (3-month treasury) are carried further from their historical mean. This is the momentum effect. The IRFs also show that the unemployment rate overshoots to a greater extent than does the short-term interest rate.

We also see that all of the variables return to their historical means (none of them "blow up"), although they each do this at different rates. For example, GDP growth returns to its historical mean after about 6 periods following a shock, the unemployment rate returns to its historical mean after about 18 periods, but inflation and short-term interest take longer than 20 periods to return to their historical means. In this sense, GDP growth is the least persistent of the four variables while inflation can be said to be highly persistent.

I think it's a fair conclusion to say that we've managed (at least partially) to make sense of what the four ARIMA models are telling us about each of the four macro variables.

Conclusion

Rather than try to interpret the estimated coefficients in ARIMA models (difficult for many models), try instead to understand the dynamics of the system. We can attempt this by exploring the forecasts produced by our model and by plotting the impulse response function.

[I'm happy enough to share my R code if anyone wants it.]

References

Hamilton, J. D. (1994). Time series analysis (Vol. 2). Princeton: Princeton university press.
Leamer, E. (2010). Macroeconomic Patterns and Stories - A Guide for MBAs, Springer.
Stengos, T., M. Hoy, J. Livernois, C. McKenna and R. Rees (2001). Mathematics for Economics, 2nd edition, MIT Press: Cambridge, MA.

— Graeme Walsh
джерело

Love the application of IRF to non-VARs. They always seem to be associated and I'd never thought of using IRFs on mere ARIMAs. (That plus, who can really understand what MA terms do?)

— Wayne

What a great answer!

— Richard Hardy

Note that due to Wold's decomposition theorem you can rewrite any stationary ARMA model as a $MA(\infty)$ model, i.e. :

Δ Y_{t} = \sum_{j = 0}^{\infty} ψ_{j} ν_{t - j}

$\Delta Y_t=\sum_{j=0}^{\infty} \psi_j\nu_{t-j}$

In this form there are no lagged variables, so any interpretation involving notion of a lagged variable is not very convincing. However looking at the $MA(1)$ and the $AR(1)$ models separately:

Y_{t} = ν_{t} + θ_{1} ν_{t - 1}

$Y_t=\nu_t+\theta_{1}\nu_{t-1}$

Y_{t} = ρ Y_{t - 1} + ν_{t} = ν_{t} + ρ ν_{t - 1} + ρ^{2} ν_{t - 1} + . . .

$Y_t=\rho Y_{t-1}+\nu_{t}=\nu_t+\rho \nu_{t-1}+ \rho^2 \nu_{t-1}+...$

you can say that error terms in ARMA models explain "short-term" influence of the past, and lagged terms explain "long-term" influence. Having said that I do not think that this helps a lot and usually nobody bothers with the precise interpretation of ARMA coefficients. The goal usually is to get an adequate model and use it for forecasting.

— mpiktas
джерело

+1 This is more or less what I was trying to get at in my comments above.

— Graeme Walsh

Ha, I did not see your comments, when I was writing the answer. I suggest converting them to the answer.

— mpiktas

I totally agree with the sentiment of the previous commentators. I would like to add that all ARIMA model can also be represented as a pure AR model. These weights are referred to as the Pi weights as compared to the pure MA form (Psi weights) . In this way you can view (interpret) an ARIMA model as an optimized weighted average of the past values. In other words rather than assume a pre-specified length and values for a weighted average , an ARIMA model delivers both the length ( $n$ ) of the weights and the actual weights ( $c_1,c_2,...,c_n$ ).

Y (t) = c_{1} Y (t - 1) + c_{2} Y (t - 2) + c_{3} Y (t - 3) + . . . + c_{n} Y (t - n) + a (t)

$Y(t) =c_1 Y(t−1) + c_2 Y(t-2) + c_3 Y(t-3)+ ... + c_n Y(t-n) + a(t)$

In this way an ARIMA model can be explained as the answer to the question

How many historical values should I use to compute a weighted sum of the past?
Precisely what are those values?

— IrishStat
джерело