
This is a mathematical challenge, though it is related to statistical parameter estimation in the context of time series and auto-regressive processes, such as ARMA. No advanced calculus knowledge is necessary: smart high school kids can find the solution, though it's not trivial!


Let's say that we have the model X(t) = a X(t-1) + b X(t-2) + e, where e is a white, independent noise (random variable) with zero mean, and t is the time. In short, a basic auto-regressive process or time series. More complex models are considered below.

The questions are as follows:

  1. What constraints should we put on a and b to guarantee that the model is sound?
  2. What statistical inference techniques offer solutions satisfying the above conditions? 

Example: Let's assume that X(0) = 1, X(1) = 1, and for the sake of simplicity, let's assume that e = 0. Clearly if a=0.5 and b=0.5, then X(t) is constant, always equal to 1 no matter the value of t. If a=1 and b=1, then X(t) is the Fibonacci sequence, which grows without bound (exponentially fast) as t grows.
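The two cases in this example are easy to check numerically. Here is a minimal sketch that iterates the noise-free recurrence (e = 0); the function name `iterate` and its `steps` parameter are illustrative choices, not part of the original challenge.

```python
def iterate(a, b, x0=1.0, x1=1.0, steps=10):
    """Iterate X(t) = a*X(t-1) + b*X(t-2) with e = 0; return [X(0), ..., X(steps)]."""
    xs = [x0, x1]
    for _ in range(2, steps + 1):
        xs.append(a * xs[-1] + b * xs[-2])
    return xs

# a = b = 0.5: every value stays at 1.
print(iterate(0.5, 0.5))
# a = b = 1: the values follow the Fibonacci sequence.
print(iterate(1.0, 1.0))
```

Trying other values of a and b with the same starting points is a quick way to get a feel for the cases listed next.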

We have the following potential cases for X(t), depending on a and b:

  • Polynomial growth (including linear or constant)
  • Exponential growth (with or without wild oscillations)
  • Converging to 0
  • Stable and non-periodic
  • Stable and periodic

Question: what are the parameter sets driving stability?

The model X(t) = a X(t-1) + b X(t-2) + e has the following characteristic equation:

x^2 - a*x - b = 0.
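One way to see where this equation comes from (a sketch, assuming the noise-free case e = 0 and a trial solution of the form X(t) = x^t with x nonzero): substitute the trial solution into the recurrence and divide by x^(t-2).

```latex
% Noise-free recurrence with trial solution X(t) = x^t, x \neq 0
\[
x^{t} = a\,x^{t-1} + b\,x^{t-2}
\;\Longrightarrow\;
x^{2} = a\,x + b
\;\Longrightarrow\;
x^{2} - a\,x - b = 0 .
\]
```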

The solutions to this equation (together with the initial conditions X(0) and X(1)) entirely determine whether X(t) is stable or not. Let's denote as r and s the two solutions of this characteristic equation:

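  • If r = s, then in the noise-free case X(t) = (A + B t) r^t for constants A and B set by X(0) and X(1), so we get linear or no growth for X(t) when |r| = 1.
  • If |r| and |s| are both < 1, then X(t) converges to 0 as t grows (in the noise-free case).
  • If |r| or |s| > 1, we might experience exponential growth.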
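The root-based cases above can be checked for any given a and b by solving the quadratic and comparing the root moduli with 1. The following is a minimal sketch, not a full solution to the challenge: the function names, the tolerance `tol`, and the label strings are illustrative assumptions, and the labels describe the noise-free recurrence only.

```python
import cmath

def characteristic_roots(a, b):
    """Return the two (possibly complex) roots of x^2 - a*x - b = 0."""
    sq = cmath.sqrt(a * a + 4.0 * b)
    return (a + sq) / 2.0, (a - sq) / 2.0

def classify(a, b, tol=1e-9):
    """Label the noise-free behaviour of X(t) = a*X(t-1) + b*X(t-2)."""
    r, s = characteristic_roots(a, b)
    largest = max(abs(r), abs(s))
    if largest < 1.0 - tol:
        return "converges to 0"
    if largest > 1.0 + tol:
        # Whether growth actually occurs also depends on X(0) and X(1).
        return "possible exponential growth"
    # Largest modulus is (numerically) 1: boundary case.
    if abs(r - s) <= tol:
        return "linear growth at most"
    return "bounded, no decay"

for a, b in [(0.5, 0.3), (0.5, 0.5), (2.0, -1.0), (1.0, 1.0)]:
    print(a, b, characteristic_roots(a, b), classify(a, b))
```

The closed-form quadratic formula is used here rather than a numerical root finder so that a repeated root, as with a = 2 and b = -1, is detected exactly.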


  • Formalize conditions to be satisfied by a and b, to guarantee long-term stability
  • Identify statistical techniques (regression, Box-Jenkins) producing estimates that meet the previous conditions. Show that most traditional statistical (econometrics) inference techniques actually fail to meet the condition, and are thus only good for very short-term predictions.
  • Generalize to X(t) = a X(t-1) + b X(t-2) + c X(t-3) + noise
  • Generalize to spatial processes, for instance an image with pixel interactions with neighbor pixels: X(t, u) = a X(t-1, u) + b X(t+1, u) + c X(t, u-1) + d X(t, u+1) + noise
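For the third-order generalization, the same root-modulus check can be done numerically. The sketch below assumes the standard companion-matrix formulation, in which the eigenvalues of the companion matrix are the roots of x^3 - a x^2 - b x - c = 0; the helper name `spectral_radius` is hypothetical. It does not cover the spatial model, which needs a different treatment.

```python
import numpy as np

def spectral_radius(coeffs):
    """Largest root modulus for X(t) = c1*X(t-1) + ... + cp*X(t-p) + noise.

    The roots of x^p - c1*x^(p-1) - ... - cp = 0 are computed as the
    eigenvalues of the p-by-p companion matrix.
    """
    p = len(coeffs)
    companion = np.zeros((p, p))
    companion[0, :] = coeffs             # first row holds the coefficients
    companion[1:, :-1] = np.eye(p - 1)   # sub-diagonal of ones shifts the state
    return float(np.max(np.abs(np.linalg.eigvals(companion))))

print(spectral_radius([0.2, 0.2, 0.2]))  # below 1: noise-free recurrence decays
print(spectral_radius([1.0, 1.0, 1.0]))  # above 1: growth is possible
```

Passing two coefficients instead of three recovers the second-order case, so the same helper can be used to cross-check the quadratic's roots.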

Perform Monte Carlo simulations with various values of a, b, X(0) and X(1) to simulate these auto-regressive time series and confirm your findings (this can be done in Excel, R, Perl, Matlab or Python).
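As a starting point, here is a minimal Python sketch of such a simulation. It assumes Gaussian noise with standard deviation `sigma` (the challenge only requires white, zero-mean noise), and the function names, path counts, and (a, b) pairs are illustrative choices.

```python
import numpy as np

def simulate(a, b, x0=1.0, x1=1.0, n_steps=200, sigma=1.0, rng=None):
    """Simulate X(t) = a*X(t-1) + b*X(t-2) + e(t) with e(t) ~ Normal(0, sigma^2)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n_steps + 1)
    x[0], x[1] = x0, x1
    noise = rng.normal(0.0, sigma, size=n_steps + 1)
    for t in range(2, n_steps + 1):
        x[t] = a * x[t - 1] + b * x[t - 2] + noise[t]
    return x

def mean_final_magnitude(a, b, n_paths=200, seed=0, **kwargs):
    """Average |X(n_steps)| over independent simulated paths."""
    rng = np.random.default_rng(seed)
    finals = [abs(simulate(a, b, rng=rng, **kwargs)[-1]) for _ in range(n_paths)]
    return float(np.mean(finals))

# Compare a few parameter pairs; the last one is the Fibonacci-like case.
for a, b in [(0.5, 0.3), (0.5, 0.5), (1.0, 1.0)]:
    print(a, b, mean_final_magnitude(a, b))
```

Averaging the final magnitude over many paths is only one possible summary; plotting individual paths, or varying X(0) and X(1), may reveal the oscillating and periodic cases more clearly.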

Former weekly challenge
