What is the difference between statistical computing and data mining? - AnalyticBridge
2021-06-13T16:25:28Z
https://www.analyticbridge.datasciencecentral.com/forum/topic/show?id=2004291%3ATopic%3A8561&commentId=2004291%3AComment%3A22631&x=1&feed=yes&xn_auth=no

Edith Ohri (2015-03-14T14:01:55.712Z, https://www.analyticbridge.datasciencecentral.com/profile/EdithOhri):
<p>Hello Ilya.</p>
<p>Could it be that you are talking about text mining? I was referring to the alpha-numeric type of data mining. Anyway, the data mining task is the same in both, i.e. to search for hidden patterns of behavior or underlying connections. In other words: DM means searching for new hypotheses and should come BEFORE the statistical test. Therefore DM uses different methods than statistics.</p>
<p>Edith</p>
06av5ms9n1391 (2015-03-13T16:33:44.161Z, https://www.analyticbridge.datasciencecentral.com/xn/detail/u_06av5ms9n1391):
<p>Not exactly: to get Internal statistics you first need to process the texts: parse them to obtain phrases, clause by clause, for sentences and paragraphs. Only after that do you calculate weights, where a weight refers to the frequency with which a context phrase occurs in relation to other context phrases.</p>
<p>Therefore, statistics is one of two components and the result.</p>
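The two steps described above (parse first, weigh second) can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the comment itself: clauses are approximated by splitting on punctuation, and a weight is taken to be a phrase's relative frequency among all extracted phrases. The function names `split_phrases` and `phrase_weights` and the sample text are hypothetical.

```python
import re
from collections import Counter


def split_phrases(text):
    """Step 1: parse text into paragraphs, sentences, then clause-level phrases.

    Assumption: clauses are approximated by splitting on , ; and :
    A real parser would use proper linguistic analysis.
    """
    phrases = []
    for paragraph in re.split(r"\n\s*\n", text):
        for sentence in re.split(r"[.!?]+", paragraph):
            for clause in re.split(r"[,;:]", sentence):
                phrase = " ".join(clause.lower().split())  # normalize case and spacing
                if phrase:
                    phrases.append(phrase)
    return phrases


def phrase_weights(text):
    """Step 2: weight = occurrences of a phrase / occurrences of all phrases.

    Assumption: relative frequency is one possible reading of
    "frequency in relation to other context phrases".
    """
    counts = Counter(split_phrases(text))
    total = sum(counts.values())
    return {phrase: n / total for phrase, n in counts.items()}


# Made-up sample text for illustration only.
sample = (
    "Data mining finds patterns, statistics tests them. "
    "Data mining finds patterns; experts review them."
)

for phrase, weight in sorted(phrase_weights(sample).items(), key=lambda kv: -kv[1]):
    print(f"{weight:.2f}  {phrase}")
```

The point of the sketch is only the ordering: the statistics (the weights) cannot be computed until the parsing step has produced the phrases.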
Edith Ohri (2011-06-12T01:42:13.244Z, https://www.analyticbridge.datasciencecentral.com/profile/EdithOhri):
Strange, my answer disappeared; here it is again:<br />
Data mining is everything that Statistics is not, conceptually and practically.<br />
1. Data mining's role is to define hypotheses, to be tested later by Statistics.<br />
2. DM is algorithms, while Statistics is a set of mathematical formulas.<br />
3. DM is made for exploration of actual data; Statistics is made for a controlled "lab" environment.<br />
4. Statistics needs knowledge to be provided. DM generates it.<br />
5. DM, as an exploration tool, does not necessarily require the target to be stated at the start. Statistics requires a target function. The same goes for the interrelations among variables: DM does not assume them upfront, while Statistics requires them to be clearly defined.<br />
6. Statistics (if the required conditions are satisfied) produces an optimum; DM doesn't.<br />
7. DM knows how to handle complexity, incomplete data, dynamics, and an unknown population mix. Statistics is much more restricted in these respects.<br />
8. In addition, the GT type of data mining can observe rare correlations (such as irregularities, mutations, etc.). Statistics is blind by definition to any effect that is not known a priori... That is why data mining was invented.<br />
<br />
<br />
Edith

K.Kalyanaraman (2010-07-19T15:23:01.003Z, https://www.analyticbridge.datasciencecentral.com/profile/KKalyanaraman):
Statistics has a strong set of principles based on the axioms of probability. It is a subject where the random sample is the starting point. The basic difference between statistics and data mining is in the way data is generated for a study. When you have a problem you propose to study, you define the associated population, choose a method of sampling, draw the sample, and then use the various statistical computing methods to infer. In laboratory experiments, data is generated using the principles of experimental design, such as randomization, replication and local control, and appropriate statistical computing is then used to infer. In data mining, however, the data is already available; you try to identify patterns and use them for your requirement. In fact, data mining has been termed 'dirty statistics'. What is required is to use the pattern, propose experiments and generate new data; that then leads to statistical computing.

Tejamoy Ghosh (2010-02-23T09:56:03.968Z, https://www.analyticbridge.datasciencecentral.com/profile/TejamoyGhosh):
I hope computational statistics is on the favored side : )
Tejamoy Ghosh (2010-02-23T09:53:21.329Z, https://www.analyticbridge.datasciencecentral.com/profile/TejamoyGhosh):
did you mean exploratory?
Kesavan Hariharasubramanian (2009-07-29T09:45:06.917Z, https://www.analyticbridge.datasciencecentral.com/profile/KesavanHariharasubramanian):
I feel "data mining" is nothing but "data analytics" aided by "computational statistics". You need both to actually mine for knowledge. For instance, Market Basket Analysis is a type of data analytics that requires computational statistics, such as probability and regression measures, to beget knowledge.
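To make the Market Basket Analysis example concrete, here is a minimal Python sketch of the probability measures commonly used for it: support, confidence and lift. The baskets are made-up data and the function names are hypothetical; the regression side mentioned above is not covered.

```python
# Made-up shopping baskets for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Estimated probability that a basket contains every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0  # rule never applies in this data
    return support(set(antecedent) | set(consequent), baskets) / base


def lift(antecedent, consequent, baskets):
    """Confidence relative to the consequent's baseline support.

    Values above 1 suggest the items co-occur more often than chance.
    """
    base = support(consequent, baskets)
    if base == 0:
        return 0.0
    return confidence(antecedent, consequent, baskets) / base


rule = ({"bread"}, {"milk"})
print("support   ", support(rule[0] | rule[1], transactions))
print("confidence", confidence(*rule, transactions))
print("lift      ", lift(*rule, transactions))
```

The analytics part is choosing which rules to look at; the computational statistics part is the probability estimates that rank them.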
Rana Pratap Singh (2009-07-03T04:59:16.220Z, https://www.analyticbridge.datasciencecentral.com/profile/RanaPratapSingh):
I would say statistical computing is a confirmatory technique and data mining is an explainratory technique.
John A Morrison (2009-07-02T05:45:59.026Z, https://www.analyticbridge.datasciencecentral.com/profile/JohnAMorrison):
I always thought there was a huge difference, but now I'm not so sure. I am assisting with a 'Random Forest' implementation right now. It's a bit off-piste for me, although ironically a long time ago I was involved in early data mining experiments, which were not as methodologically sound as they are today. Anyway, implementing this Random Forests application makes clear to me that, not only from the technology perspective but also in terms of the math and the visualisation potential, data mining and statistical computing are asymptotic, to use an odd metaphor. There are other current trends pointing that way, particularly in 'semantic integration' and optimised search space, in my view, for what it's worth.

Linda Ann Seltzer (2009-06-29T16:34:59.402Z, https://www.analyticbridge.datasciencecentral.com/profile/LindaAnnSeltzer):
Re the idea of meaninglessness: Blind signal processing is the analysis of data in which one doesn't know what components are there, or their meanings. In contrast, recognition techniques such as speech recognition and pattern recognition are used when one is searching for particular features.