<p>Hello Ilya.</p>
<p>Could it be that you are talking about text mining? I was referring to the alphanumeric type of data mining. Anyway, the data mining task is the same in both, i.e. to search for hidden patterns of behavior or underlying connections. In other words: DM means searching for new hypotheses and should come BEFORE the statistical test. Therefore DM uses different methods than statistics.</p>
<p>Edith</p>
<p>Not exactly: to get Internal statistics you first need to manipulate the texts: parse them into phrases, clause by clause, for each sentence and paragraph. Only after that do you calculate weights, where a weight refers to the frequency with which a context phrase occurs in relation to other context phrases.</p>
<p>Therefore, statistics is one of two components and the result.</p>
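<p>The two steps described above, parsing into phrases and then weighting by relative frequency, can be sketched roughly as follows. This is only an illustration: the punctuation-based splitting rule, the sample sentence, and the names <code>split_phrases</code> and <code>phrase_weights</code> are assumptions, not the method of any particular text mining tool.</p>

```python
import re
from collections import Counter


def split_phrases(text):
    """Split text into lowercase phrases at sentence and clause punctuation.

    Assumption: clause boundaries are approximated by . ! ? ; : and ,
    A real parser would use grammar, not punctuation alone.
    """
    parts = re.split(r"[.!?;:,]+", text)
    # Normalize whitespace and drop empty fragments.
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def phrase_weights(text):
    """Weight of a phrase = its count divided by the total number of phrases."""
    phrases = split_phrases(text)
    counts = Counter(phrases)
    total = len(phrases)
    return {phrase: n / total for phrase, n in counts.items()}


# Made-up sample text, for illustration only.
sample = (
    "Data mining finds patterns; statistics tests them. "
    "Data mining finds patterns, again and again."
)
weights = phrase_weights(sample)
for phrase, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{weight:.2f}  {phrase}")
```

<p>The parsing step comes first and the frequencies are computed only on its output, which is the ordering the comment above insists on.</p>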
Strange, my answer disappeared; here it is again:<br />
Data mining is everything that Statistics is not, conceptually and practically.<br />
1. Data mining's role is to define hypotheses, to be run later by Statistics.<br />
2. DM is algorithms while Statistics is a set of mathematical formulas.<br />
3. DM is made for exploration of actual data; Statistics is made for a controlled "lab" environment.<br />
4. Statistics needs knowledge to be provided. DM generates it.<br />
5. DM as an exploration tool does not necessarily require the target…
K.Kalyanaraman, 2010-07-19 (https://www.analyticbridge.datasciencecentral.com/profile/KKalyanaraman):
Edith,
Statistics has a strong set of principles based on the axioms of probability. It is a subject where the random sample is the starting point. The basic difference between statistics and data mining is in the way data is generated for a study. In fact, when you have a problem you propose to study, you define the associated population, choose a method of sampling, draw the sample, and then use the different statistical methods of computing to infer. In laboratory experiments data is generated using…
I hope computational statistics is on the favored side : )
Kesavan Hariharasubramanian, 2009-07-29 (https://www.analyticbridge.datasciencecentral.com/profile/KesavanHariharasubramanian):
Did you mean exploratory?
I feel "data mining" is nothing but "data analytics" aided by "computational statistics". You need both to actually mine for knowledge. For instance, Market Basket Analysis is a type of data analytics which requires computational statistics, such as probability and regression measures, to beget knowledge.
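To make the Market Basket Analysis point concrete, here is a minimal sketch of the probability measures usually computed for an association rule: support, confidence and lift. The baskets are made-up toy data and the function names are illustrative assumptions, not taken from any particular library.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated P(consequent | antecedent).

    Assumes the antecedent occurs in at least one basket,
    otherwise this divides by zero.
    """
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


def lift(baskets, antecedent, consequent):
    """Confidence relative to the consequent's baseline frequency."""
    return confidence(baskets, antecedent, consequent) / support(baskets, consequent)


# Toy data, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
print(lift(baskets, {"bread"}, {"milk"}))
```

Counting the co-occurrences is the "data analytics" part; turning the counts into conditional probabilities is where the statistics comes in.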
I would say statistical computing is a confirmatory technique and data mining is an explainratory technique.
I always thought there was a huge difference, you know, but now I'm not so sure. I am assisting with a 'Random Forest' implementation right now; it's a bit off-piste for me, although ironically a long time ago I was involved in early data mining experiments which were not as methodologically sound as they are today. Anyway, implementing this Random Forests application makes clear to me that not only from the technology perspective but also in terms of the math and the visualisation potential,…
Re the idea of meaninglessness: blind signal processing is the analysis of data in which one doesn't know what components are there, or their meanings. In contrast, recognition techniques such as speech recognition and pattern recognition are used when one is searching for particular features.
