I was reading about the "Automated Reduced Error Predictive Analytics" patent secured by Rice Analytics (see below) and my first question is:
How can you successfully sue competitors for using a mathematical technology? After all, most vendors offer error and variance reduction, as well as dimension reduction and automated model selection (based on optimizing goodness-of-fit), in their software. All statistical and data mining consultants, including myself, also use similar techniques to help solve business problems for their clients. For instance, I have developed a methodology that achieves the same goal, and my methodology (hidden forests, see http://www.analyticbridge.com/forum/topics/hidden-decision-trees-vs) is in the public domain, non-patented, and everybody can use it freely.
Any claim of patent infringement would most likely fail. The defendant's argument would be: "My algorithm is different; the only thing that our technology shares with the plaintiff's system is a methodology, well known and used by analytic professionals for decades, to reduce dimensionality, automate model selection, and reduce error."
What about the recently published algorithm for random number generation based on the decimals of numbers similar to Pi (see http://www.analyticbridge.com/profiles/blogs/new-state-of-the-art-r...)? It is in the public domain and non-patented. Could such a methodology be patented (assuming it had never been published)? I don't think so, but I would like to have your opinion on this.
The Rice Analytics Patent
Rice Analytics Issued Fundamental Patent on RELR Method
This Patent Covers RELR Error Modeling and Related Dimension Reduction
St. Louis, MO (USA), October 4, 2011 – Rice Analytics, the pioneers in automated reduced error regression, announced today that the US Patent Office has issued it a patent for fundamental aspects of its Reduced Error Logistic Regression (RELR) technology. This patent covers important error modeling and dimension reduction aspects of RELR. Dan Rice, the inventor of RELR and President of Rice Analytics, stated the significance of this RELR patent as follows:
“While large numbers of patents are important in many technology applications, it is also clear that just one fundamental patent can lead to the breakthrough commercialization of an entire industry. The MRI patent in the early 1970’s had such an effect and by the 1990’s had resulted in billions of dollars in licensing fees and enormous practical applications in medicine. We believe that this RELR patent could have a similar effect in the field of Big Data analytics because RELR completely avoids the problematic and risky issues related to error and arbitrary model building choices that plague all other Big Data high dimensional regression algorithms. RELR finally allows Big Data machine learning to be completely automated and interpretable. Just as the MRI allowed the physician to work at a much higher level and avoid arbitrary diagnostic choices where two physicians would come to completely different and inaccurate diagnoses, RELR allows analytic professionals to work at a much higher level and completely avoid arbitrary guesses in model building. Thus, different modelers will no longer either build completely different models with the very same data or have to rely upon pre-filled parameters that are the arbitrary choices of others. Most modelers would spend significant time testing arbitrary parameters because they are worried about the large risk associated with such parameters, but then it is very hard for them to find the time to be creative. The complete automation that is the basis of RELR frees analytic professionals to work at a much higher and creative level, so they can pose better modeling problems and develop insightful model interpretations. Most importantly, unlike parsimonious variable selection in all other algorithms, RELR’s Parsed variable selection models actually can be interpreted because these models are not built with arbitrary choices and because they are consistent with maximum probability statistical theory.”
This US patent referenced as number 8,032,473 describes a method of modeling and reducing error in logistic regression that can be applied quite generally in machine learning applications. Logistic regression is one of the more general advanced analytics methods because it can be used to model the probability of outcomes in all classic regression problems without regard to the form of the dependent variable. The most common application of logistic regression is in modeling categorical outcomes, such as binary or ordinal outcomes. Yet, any continuous dependent variable can be categorized into intervals and also modeled with logistic regression, such as in forecasting and survival analysis problems. Logistic regression remains one of the most widely used advanced analytics methods in business, government, medicine, and science applications. The reason for the popularity of logistic regression is that it allows the possibility of insight into the key putative drivers of the predicted regression outcome, but problems related to error and dimensionality are major limiting factors and prevent such insight with non-experimental data. This patented RELR method overcomes these problems.
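To make the point about categorizing a continuous dependent variable concrete, here is a minimal sketch in Python. It is ordinary logistic regression fitted by gradient descent, not RELR (the patented method is not described in this post). The synthetic data, the threshold of 0, and the helper names `sigmoid` and `fit_logistic` are all assumptions made up for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Hypothetical synthetic data: a continuous outcome that rises with x, plus noise.
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(200)]
continuous_y = [2.0 * x + rng.gauss(0, 1) for x in xs]

# Categorize the continuous outcome into two intervals (above / not above 0).
threshold = 0.0
ys = [1 if v > threshold else 0 for v in continuous_y]

w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * -2.0 + b)   # estimated P(outcome > 0) at x = -2
p_high = sigmoid(w * 2.0 + b)   # estimated P(outcome > 0) at x = 2
print(f"w={w:.2f}, b={b:.2f}, p_low={p_low:.3f}, p_high={p_high:.3f}")
```

With more than two intervals the same idea would call for an ordinal or multinomial model; the binary split is used here only to keep the sketch short.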
Read more about this patent at http://www.riceanalytics.com/_wsn/page9.html