
Hi, I've been to a data mining seminar and a webinar, but I have yet to become someone who actually mines data. Can anyone suggest the best books for learning what data mining is and isn't? For anyone in the field, how did you actually get started?

Tags: Data, Mining, and, books, entry, field

Views: 8763


Replies to This Discussion

I would also recommend "Data Mining: Concepts and Techniques" by Jiawei Han and Micheline Kamber in addition to the list.
It is one of the books that could help you get a general picture of essential topics in the data mining field. It was in the course literature of an introductory Data Mining course in my university.
Best regards,
I'm fairly new to data mining myself, but I've been reading a book called "Data Mining: Practical Machine Learning Tools and Techniques" by Witten and Frank. The book is a little light on theory, but it has a pretty good overview of many different techniques for data mining, and it has enough detail that you can start applying some of the techniques on your data immediately. I'm using the book in conjunction with a course on decision trees that I'm taking at, which I would recommend.

"Elements of Statistical Learning" seems to be the closest thing to an authoritative text on data mining that I've found, and it has more theory than the book I'm using for my class, so that may be of use to you. I'm planning on studying it more after I've finished taking my class.

Keep in mind that I am also a novice, so please take my advice with a grain of salt :).
I've checked the table of contents. It sounds like a very interesting book. I just bought it. I will also add it in my data mining directory (book section) on DataShaping.
I started in data mining with a PhD (I'm currently finishing it). Regarding books, I recommend, like Tom, the Witten and Frank one for practitioners and "Introduction to Data Mining" (Tan et al.) for researchers.
The 2 best books I've read in the last few years are:
Hastie, Tibshirani & Friedman's "Elements of Statistical Learning" (mentioned elsewhere).
Soumen Chakrabarti's "Mining the Web"

To the other question, I got started as a (bio)statistician and drifted over to the 'dark side' of data mining in the mid-to-late '90s. The books I recommend are fairly technical. The second one is also more focused/specific.
Hi Diahn,

You might take a look at this: as well as the other material on the model & mine site. You may also check out the (predominantly positive) Amazon reviews:

The author, Dorian Pyle, recently joined Analytic Bridge.
Data mining cannot be concretely defined, so there is no book that can cover everything that it can be.
Most books cover the most popular algorithms and techniques.
The best way to learn it is not to read about it but to do it. If you are a Java programmer, then consider:
Java Data Mining: Strategy, Standard, and Practice: A Practical Guide for architecture, design, and implementation (The Morgan Kaufmann Series in Data Management Systems) by Mark F. Hornick, Erik Marcadé, Sunil Venkayala
For books, The Data Miners has some good ones.

However, after reading a couple: the best way to start is to start. Find things to work on. Always ask yourself what you are doing and why it works. Talk to people about what you are doing. Take your time, especially on your first few projects. Find ways to double-check your results.

Every book, class, and paper in the world tells you how things work. About the only way of finding out how things fail is by having projects blow up on you :-).

How I got started: a classmate from my stats classes got me hired into a data mining company (NeoVista) and I started doing projects.
Data Mining: Practical Machine Learning Tools and Techniques (Second Edition)

Ian H. Witten, Eibe Frank
Right now I'm reading "Data Mining Techniques" by Michael J.A. Berry & Gordon S. Linoff, and I'm finding it the best book on data mining so far. Most of the other DM books are written by statisticians with limited experience of business data and requirements. This book is written by two of the best-known DM practitioners, and I just love it!

Another book I would recommend is "Multivariate Data Analysis" by Hair, Black, Babin, Anderson & Tatham.


