A Data Science Central Community

This is a mathematical challenge, though it is related to statistical parameter estimation in the context of time series and auto-regressive processes such as ARMA. No advanced calculus is necessary: smart high school kids can find the solution, though it's not trivial!


Let's say that we have the model X(t) = **a** X(t-1) + **b** X(t-2) + e, where e is a white, independent noise (random variable) with zero mean, and t is the time. In short, a basic auto-regressive process or time series. More complex models are considered below.

The questions are as follows:

- What constraints should we put on **a** and **b** to guarantee that the model is sound?
- What statistical inference techniques offer solutions satisfying the above conditions?

Example: Let's assume that X(0) = 1, X(1) = 1, and for the sake of simplicity, that e = 0. Clearly if **a**=0.5 and **b**=0.5, then X(t) is constant, always equal to 1 no matter the value of t. If **a**=1 and **b**=1, then X(t) is the Fibonacci sequence (1, 1, 2, 3, 5, 8, ...), which grows exponentially and without bound as t grows.
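The two cases in this example can be checked with a few lines of Python. This is a minimal sketch with the noise term e set to 0; the function name `simulate` and the `steps` parameter are illustrative choices, not part of the original challenge.

```python
def simulate(a, b, x0=1.0, x1=1.0, steps=20):
    """Iterate X(t) = a*X(t-1) + b*X(t-2) with the noise term e set to 0."""
    xs = [x0, x1]
    for _ in range(2, steps + 1):
        xs.append(a * xs[-1] + b * xs[-2])
    return xs  # xs[t] holds X(t) for t = 0..steps

constant = simulate(0.5, 0.5)   # stays at 1 for every t
fibonacci = simulate(1.0, 1.0)  # 1, 1, 2, 3, 5, 8, ...

print(constant[-1])
print(fibonacci[:8])
```

Changing `a`, `b`, `x0` and `x1` in the calls is a quick way to see the other regimes listed in the article.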

We have the following potential cases for X(t), depending on **a** and **b**:

- Polynomial growth (including linear or constant)
- Exponential growth (with or without wild oscillations)
- Converging to 0
- Stable and non-periodic
- Stable and periodic

Question: what are the parameter sets driving stability?

The model X(t) = **a** X(t-1) + **b** X(t-2) + e has the following characteristic equation:

x^2 - a*x - b = 0.
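For readers wondering where this equation comes from, here is a short derivation, assuming the noise term is dropped and a trial solution of the form X(t) = x^t is substituted into the model:

```latex
\begin{aligned}
x^{t} &= a\,x^{t-1} + b\,x^{t-2} \\
x^{2} &= a\,x + b \qquad \text{(divide by } x^{t-2},\ x \neq 0\text{)} \\
x^{2} - a\,x - b &= 0 \\
x &= \frac{a \pm \sqrt{a^{2} + 4b}}{2}
\end{aligned}
```

When the two roots are distinct, every noise-free solution is a linear combination of their t-th powers, which is why the size of the roots governs the long-term behavior.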

The solutions to this equation, together with the initial conditions X(0) and X(1), entirely determine whether X(t) is stable or not. The solutions may be complex numbers, in which case |.| below denotes the modulus. Let's denote as r and s the two solutions of this characteristic equation:

- If r = s and |r| = 1, then |X(t)| grows at most linearly (linear or no growth).
- If |r| and |s| are both < 1, then X(t) converges to 0 as t grows (in the absence of noise).
- If |r| or |s| is > 1, we might experience exponential growth.
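As a rough illustration, the roots can be computed numerically and their moduli compared with 1. The sketch below uses only the standard library; the function names, the labels returned, and the tolerance `tol` are assumptions made for this example, and it deliberately does not spell out the conditions on **a** and **b** that the challenge asks for.

```python
import cmath

def characteristic_roots(a, b):
    """Roots of x^2 - a*x - b = 0; they may be complex."""
    disc = cmath.sqrt(a * a + 4 * b)
    return (a + disc) / 2, (a - disc) / 2

def classify(a, b, tol=1e-9):
    """Label the noise-free behavior from the largest root modulus."""
    r, s = characteristic_roots(a, b)
    largest = max(abs(r), abs(s))
    if largest < 1 - tol:
        return "converges to 0"
    if largest > 1 + tol:
        return "exponential growth possible"
    # Largest modulus equal to 1: constant, periodic, or linear growth
    return "boundary case"

print(classify(0.5, 0.3))
print(classify(0.5, 0.5))
print(classify(1.0, 1.0))
```

The "boundary case" label covers several distinct behaviors (for instance **a**=0.5, **b**=0.5 gives a constant series, while **a**=2, **b**=-1 gives a double root at 1 and linear growth), so it would need to be refined for a full answer.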

**Challenge**

- Formalize conditions to be satisfied by **a** and **b**, to guarantee long-term stability
- Identify statistical techniques (regression, Box-Jenkins) producing estimates that meet the previous conditions. Show that most traditional statistical (econometrics) inference techniques actually fail to meet the condition, and are thus only good for very short-term predictions.
- Generalize to X(t) = **a** X(t-1) + **b** X(t-2) + **c** X(t-3) + noise
- Generalize to spatial processes, for instance an image with pixel interactions with neighbor pixels: X(t, u) = **a** X(t-1, u) + **b** X(t+1, u) + **c** X(t, u-1) + **d** X(t, u+1) + noise

Perform Monte Carlo simulations with various values of **a**, **b**, X(0) and X(1) to simulate these auto-regressive time series and confirm your findings (this can be done in Excel, R, Perl, Matlab or Python).
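One possible starting point in Python is sketched below. It assumes Gaussian noise with standard deviation `sigma` and initial values drawn uniformly from [-1, 1]; these choices, like the function names and the "largest absolute value" summary, are illustrative assumptions rather than part of the challenge.

```python
import random

def simulate_ar2(a, b, x0, x1, steps, sigma=1.0, rng=None):
    """Simulate X(t) = a*X(t-1) + b*X(t-2) + e with Gaussian noise e."""
    if rng is None:
        rng = random.Random()
    xs = [x0, x1]
    for _ in range(steps):
        xs.append(a * xs[-1] + b * xs[-2] + rng.gauss(0.0, sigma))
    return xs

def max_abs_over_runs(a, b, runs=200, steps=500, seed=0):
    """Largest |X(t)| seen across many runs with random X(0), X(1)."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    worst = 0.0
    for _ in range(runs):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        path = simulate_ar2(a, b, x0, x1, steps, rng=rng)
        worst = max(worst, max(abs(x) for x in path))
    return worst

for a, b in [(0.5, 0.3), (0.5, 0.5), (1.0, 1.0)]:
    print(a, b, max_abs_over_runs(a, b, runs=50, steps=200))
```

Scanning a grid of (**a**, **b**) pairs with this kind of summary gives an empirical picture of the stable region that can be compared with the conditions derived analytically.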


© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
