
Challenge of the week - Continued fractions for predictive modeling

Continued fractions are a fascinating subject, see picture below. They are extremely stable from a numerical point of view, have many useful properties, and are thus more robust against over-fitting than standard linear regression. They have been extensively studied in the context of approximation theory and numerical analysis.

Why is there no statistical theory of continued fractions? Why aren't these beautiful and powerful mathematical objects used in data science?

One could think of a model such as Y = a1 + X1/(a2 + X2/(a3 + X3/(a4 + X4/...))), where Y is the response, a1, a2, and so on are the parameters to be estimated, and X1, X2, X3, etc. are the independent variables, also called features or predictors. In practice, X1 is the predictor with the highest predictive power, X2 the second best predictor, and so on.
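As a rough sketch (not part of the original challenge), a truncated version of this model can be fit with off-the-shelf nonlinear least squares. The function `cf_model` and the toy data below are illustrative assumptions, not an established method:

```python
import numpy as np
from scipy.optimize import least_squares

def cf_model(a, X):
    """Evaluate Y = a1 + X1/(a2 + X2/(... + Xk/a_{k+1})), built inside out.
    X has shape (n_samples, k); a has length k + 1."""
    k = X.shape[1]
    acc = np.full(X.shape[0], a[k])        # innermost constant a_{k+1}
    for j in range(k - 1, -1, -1):         # peel back out toward a1
        acc = a[j] + X[:, j] / acc
    return acc

# Toy data: 100 samples, 3 predictors, known parameters plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(100, 3))
true_a = np.array([1.0, 2.0, 3.0, 4.0])
y = cf_model(true_a, X) + rng.normal(0, 0.01, size=100)

# Estimate a1..a4 by minimizing the sum of squared residuals.
fit = least_squares(lambda a: cf_model(a, X) - y, x0=np.ones(4))
print(fit.x)  # fitted parameters (compare with true_a)
```

Note the truncation: a real statistical theory would still need to decide where to cut the fraction and how the estimates behave as predictors are added.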

How do you develop a statistical theory around this concept?

Related articles


Replies to This Discussion

Hi Vincent,

This concept reminds me of infinite impulse response filters. Perhaps some of the techniques of transfer function analysis could be applied here. I wonder, though, how the data would be represented in a transformed domain.
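To make the analogy concrete (my own illustration, not from the thread): an IIR filter, like a continued fraction, is defined recursively, so each output feeds back into the next and the response never truly terminates. A minimal first-order example:

```python
import numpy as np
from scipy.signal import lfilter

# First-order IIR filter: y[n] = 0.5*x[n] + 0.5*y[n-1].
# In scipy's convention a[0]*y[n] = b[0]*x[n] - a[1]*y[n-1],
# so a = [1.0, -0.5] encodes the +0.5 feedback term.
b, a = [0.5], [1.0, -0.5]

impulse = np.zeros(8)
impulse[0] = 1.0

# The recursive feedback yields a geometrically decaying impulse
# response: 0.5, 0.25, 0.125, ... (infinite in principle).
out = lfilter(b, a, impulse)
print(out)
```

The transfer function of such a filter is a ratio of polynomials, and ratios of polynomials are exactly what finite continued fractions reduce to, which is where the resemblance comes from.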



Hi Fernando,

SVMs (support vector machines), if I understand correctly, use a transformed domain (or mapping) into an augmented space to be able to find hyperplanes that separate classes in classification problems. I'm not sure how these mappings are created, as I haven't studied SVMs in detail, but I don't think they use anything similar to signal processing (such as Fourier or other transforms for deconvolution).



Hi Vincent,

Yes, you are right. SVMs can operate in transformed domains of infinite dimension by using the so-called "kernel trick". In their mathematical formulation there is a term, the kernel function, which implicitly operates in that domain, behind the curtains, without you having to explicitly perform infinitely many calculations.
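As a small illustration of that point (added here as a sketch, not part of the original reply): the RBF kernel k(x, z) = exp(-gamma * ||x - z||^2) corresponds to an inner product in an infinite-dimensional feature space, yet it is evaluated in closed form:

```python
import numpy as np

def rbf_kernel(X, Z, gamma=0.5):
    """Gram matrix of the RBF kernel, computed without ever
    constructing the (infinite-dimensional) feature map."""
    # Pairwise squared distances via ||x-z||^2 = ||x||^2 + ||z||^2 - 2<x,z>
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-gamma * d2)

X = np.array([[0.0, 0.0],
              [1.0, 0.0]])
K = rbf_kernel(X, X)
print(K)  # diagonal is exactly 1; off-diagonal is exp(-0.5)
```

Every algorithm that only needs inner products (SVMs among them) can swap in such a kernel and implicitly work in that richer space.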

So you were thinking something along those lines?

My mention of IIR filters was just a wild thought, but the recursive structure of continued fractions definitely reminded me of them. Not that I can immediately think of an application, though.

Just out of curiosity, is this a problem you are seriously working on, or just a thought-provoking question?



Just a thought-provoking question for now.



© 2019 Data Science Central LLC
