A Data Science Central Community
Newcomers to data mining and predictive analytics (whether individuals, academics, government, non-profits, or large commercial businesses) are always asking: "Yes, this predictive analytic model looks good, but what is the 'p-value' or the 'confidence limit' of the ACCURACY score? How do I know this model is reliable?"
For many who have learned (or maybe been "indoctrinated" in) traditional "p-value / Fisherian Frequentist - Central Limit Theorem" statistics, the issue is "resistance" to accepting what I have come to know as the "180 degree turnaround" in the thinking process of PREDICTIVE ANALYTICS. I was trained in "traditional statistics", wrote a Master's thesis using such methods, then wrote a Ph.D. thesis using traditional methods, then did 2 Post-Doctoral Fellowships using such methods, and then continued for 30 years in "medical research" using such methods. SO SHOULD I NOT BE ONE OF THESE "RESISTANT TRADITIONAL STATISTICIANS"? You would think so. But when I learned about "data mining" in the late 1990s, I EMBRACED it! I liked it! I immediately somehow understood that "this is where it is at" ... "this is the future".
About that time we (my wife and I, she also a "traditionally trained statistician") were asked to be FACULTY for a local MEDICAL RESIDENCY PROGRAM, with the job of instructing the medical residents on "HOW" to do a research project and following them through the entire project, including data analysis. We immediately started adding DATA MINING procedures. The ATTENDING MEDICAL FACULTY (M.D.s / D.O.s), who were our age or older, did not really comprehend what we were doing, but accepted us because we were publishing books on DATA MINING and PREDICTIVE ANALYTICS that were having exceptional sales and being readily accepted by the "data analytics community". Many of the MEDICAL RESIDENTS were also "resistant to learn", having been "indoctrinated" by MEDICAL JOURNAL articles using p-value statistics. The most common thing these medical residents wanted to do was "apply NUMEROUS p-value / t-tests", something one does NOT do in "traditional statistics": the ERROR OF MULTIPLE TESTS!
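To make the "error of multiple tests" concrete, here is a minimal sketch. It assumes the tests are independent and that each is run at the same significance level alpha; both are simplifying assumptions, and the function name is ours, chosen only for illustration. Under those assumptions, the chance of at least one false positive across m tests is 1 - (1 - alpha)^m.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests.

    Assumes every test uses the same significance level `alpha` and that
    the tests are independent (a simplification for illustration).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if m < 0:
        raise ValueError("m must be non-negative")
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # Show how quickly the error rate grows as more t-tests are run.
    for m in (1, 5, 20):
        print(m, round(familywise_error_rate(0.05, m), 4))
```

With alpha = 0.05, running twenty such tests gives roughly a 64% chance of at least one spurious "significant" result, which is why piling up t-tests is discouraged.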
We insisted that each medical resident research paper include DATA MINING / PREDICTIVE ANALYTIC modeling; thus the paper was divided into two parts: 1) the traditional statistics presentation, which usually was only descriptive (because the residents usually had lots of variables but few cases), and 2) the DATA MINING / PA analyses. Over the years the medical residents got behind this and started "opening their arms" to learning the modern DM / PA methods, because they began to see that with these methods they "could do more - get better information from their data". One of the culminating research papers happened last year, where a team of two residents developed a model that was 100% Sensitive and 100% Specific (using 2 different DM algorithms, one for each parameter) for determining whether a person presenting at an ER (Emergency Room) with "heart attack symptoms" could be released or needed to stay overnight for more expensive tests (which was the usual procedure previously). The model was based on 5 questions or simple tests done with the patient, thus eliminating a very expensive overnight stay with additional, very expensive tests. All the ATTENDING MEDICAL DOCTORS immediately "caught" how important this was. But it has taken 10 years of working with these M.D.s / Residents to fully gain this acceptance of modern DM / Predictive Analytic methods. (This EMERGENCY ROOM / Heart Attack symptom study will be included as a CASE STUDY in our upcoming book:
CITATION FORMAT: L.A. Miner; P.S. Bolding; M. Goldstein; J.M. Hilbe; T. Hill; R. Nisbet; N. Walton; and G.D. Miner; Practical Predictive Analytics and Decisioning Systems for Medicine: Informatics Accuracy and Cost-Effectiveness for Healthcare Administration and Delivery Including Medical Research, (2014) Waltham, MA: Elsevier/Academic Press; ISBN: 9780124116436.)
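For readers new to the terms used for the Emergency Room model above, sensitivity and specificity come straight from the counts in a 2x2 table of predictions against actual outcomes. The sketch below uses made-up counts purely for illustration (they are NOT data from the residents' study), and the function names are ours.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual positives that the model correctly flags."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no actual positives to evaluate")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of actual negatives that the model correctly clears."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no actual negatives to evaluate")
    return true_neg / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(sensitivity(true_pos=40, false_neg=0))
    print(specificity(true_neg=55, false_pos=5))
```

A model that is "100% Sensitive" misses no true cases (zero false negatives), and one that is "100% Specific" raises no false alarms (zero false positives).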
Also discussed in this upcoming book (and also cited in our 2009 book, HANDBOOK OF STATISTICAL ANALYSIS & DATA MINING APPLICATIONS) is the fact that up to 85% of medical research journal articles are incorrect in their use of “TRADITIONAL P-VALUE STATISTICS”. (This “misuse” of traditional statistics does not seem to be apparent in other, more pure science fields, like physics and astronomy, where the level of “retraction of published articles” is essentially zero.) The use of DATA MINING / PREDICTIVE ANALYTICS in medical research and healthcare delivery is much more easily understandable, and accurate; thus this “resistance to its use” needs to be rectified if we as a society are to make headway in providing ACCURATE DIAGNOSES and ACCURATE TREATMENT PLANS for the INDIVIDUAL patient. I think we all realize that accomplishing these goals is essential to bringing our medical health care costs down to reasonable levels (eliminating all the unneeded tests that are now done because we do not have this needed accuracy).
NOW TO THE PUNCHLINE: One aspect of this “resistance to use modern data analytic methods” is expressed by the question: “But what is the CONFIDENCE LIMIT, or other statistical significance test, for this ACCURACY SCORE of the PA model?” It is heard so often from those who have not yet made the “180 degree turnaround” in their thinking about how data analysis is done in this 21st century. Today I ran into the following BLOG / INTERVIEW conducted with three of my closest colleagues in the PA field: KARL REXER, JOHN ELDER, and DEAN ABBOTT. Instead of CONFIDENCE LIMITS or other “traditional tests of significance”, they point out three methods that they use to determine whether or not a model is an accurate, meaningful representation that will prove valuable to a business, organization, or individual, and also to determine which model is the “best” of several that may have been created. The methods Karl, John, and Dean use to determine the “significance” of their models fall into these 3 categories:
These are all different from what is used in “traditional frequentist statistical” significance testing.
To read the full, very detailed interview, please go to the following site: