A Data Science Central Community
High schools face some of the biggest challenges in determining the individual paths of the students who enter the school yard each day. Both institutions and industries would like to know, for example, which students will graduate from a particular course program and which will need assistance to graduate on time. Which graduates are most receptive to what kind of higher education institution? Which non-graduates are most receptive to what kind of intervention to avoid dropping out? Furthermore, traditional issues such as teacher-to-student ratio, enrollment management, cultural competency mismatch, and curriculum engagement continue to motivate educators, administrators, and policy makers to search for better ways to graduate all students on time with a certain level of intellectual capacity.
One way to address these challenges efficiently, effectively, and equitably is through information and communication technology (ICT), particularly its predictive analytics components. Education stakeholders can leverage advances in ICT-enabled analytics to gain intelligence (i.e., hindsight, insight, and foresight) from the vast content accumulated daily from the classroom to the community. These advances (e.g., data mining, Big Data, data warehousing, and cube analysis) enable stakeholders to uncover and understand hidden patterns in vast content resources, including databases, files, and unstructured repositories (e.g., the Internet).
Traditionally, these patterns have been ignored, with dire consequences for the country’s economic and educational competitiveness (e.g., high unemployment rates, high rates of reliance on public assistance, high incarceration rates, lower higher education completion rates, and lower entrepreneurship rates). By contrast, in other industries and disciplines (e.g., social science, political science, and econometrics), patterns are being uncovered with data mining models and used to predict individual academic and behavioral choices with high accuracy. For example, local police departments are using them to plan drunk-driving checkpoints during the holiday season. In some cities, insurance companies are using them to determine who should receive a notice of a premium increase based on changes in the city’s crime rate. Some credit card companies are using them to determine whose credit limit should be adjusted because of the current government shutdown. In each case, information is needed to take action or to allocate resources with an accurate estimate of how many stakeholders will take a particular path. As a result, the hindsight, insight, or foresight obtained in these examples shapes decisions that continually affect outputs and outcomes.
The intent of this short article is to raise awareness of the capabilities of predictive analytics and its potential application in urban education systems like that of the District of Columbia. It is also meant to spur demand for the additional time and resources needed to produce a paper that uses meta-analysis and predictive analytics techniques to compare public and charter school graduation rates in the District of Columbia and Minnesota, both of which have a high concentration of public and charter schools. Ideally, such a paper would follow the guidelines and steps of the CRoss-Industry Standard Process for Data Mining (CRISP-DM). Doing so would have future implications as demand for predictive analytics increases and more algorithms are created, in part because CRISP-DM ensures good practices that other education institutions facing inequitable student outcomes can follow.
In this context, predictive analytics and data mining are based on a few fundamental concepts. Both rely on a combination of essential methods: classification, categorization, estimation, and visualization. Classification identifies clusters and separates the individuals under study. Categorization uses rule induction algorithms to handle categorical outcomes, such as “dropout,” “transfer,” “persist,” or “stay.” Estimation includes predictive functions or likelihood and deals with continuous output variables, such as GPA and household income. Finally, visualization uses interactive graphs to present mathematically induced rules and scores; in other words, it offers more sophistication than standard Microsoft Excel pie or bar charts. Incidentally, visualization is used primarily to depict three-dimensional geographic locations of mathematical coordinates.
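To make two of these methods concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method taken from any of the sources: the records, the attendance bands, the function names, and the choice of a single-feature rule and a one-predictor least-squares line are all hypothetical. The first function induces a simple rule for a categorical outcome (categorization); the second fits a line for a continuous outcome such as GPA (estimation).

```python
from collections import Counter, defaultdict

# Hypothetical, made-up records for illustration only (not real student data).
# Each record: (attendance band, credits on track?, outcome)
RECORDS = [
    ("high", "yes", "persist"),
    ("high", "yes", "persist"),
    ("high", "no", "persist"),
    ("medium", "yes", "persist"),
    ("medium", "yes", "persist"),
    ("medium", "no", "transfer"),
    ("low", "no", "dropout"),
    ("low", "no", "dropout"),
    ("low", "yes", "transfer"),
]


def induce_one_rule(records, feature_index):
    """Categorization: map each value of one feature to its most common outcome."""
    outcomes_by_value = defaultdict(Counter)
    for record in records:
        outcomes_by_value[record[feature_index]][record[-1]] += 1
    return {
        value: counts.most_common(1)[0][0]
        for value, counts in outcomes_by_value.items()
    }


def fit_line(xs, ys):
    """Estimation: ordinary least squares with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical attendance rates (fraction of days present) and GPAs.
ATTENDANCE = [0.60, 0.70, 0.80, 0.90, 1.00]
GPA = [2.0, 2.5, 3.0, 3.5, 4.0]

# Rule keyed on the attendance band (feature 0).
rule = induce_one_rule(RECORDS, feature_index=0)

# Fitted line, then an estimated GPA for a student with 85% attendance.
slope, intercept = fit_line(ATTENDANCE, GPA)
predicted_gpa = slope * 0.85 + intercept

print(rule)
print(round(predicted_gpa, 2))
```

A real project would use far more records and features, hold out data to check accuracy, and follow a process such as CRISP-DM rather than fitting directly to a handful of rows.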
For more in-depth knowledge of predictive analytics, CRISP-DM, or “Analyticship” (analytics-driven leadership to improve the performance and productivity of an enterprise), refer to my top 15 sources below.
Personally, the challenge was not which book to read but how to maximize the intelligence from each one.