
The new generation of data miners, statisticians and scientists

  1. They don't spend much time developing traditional algorithms (e.g. a search engine) anymore. They don't reinvent the wheel: they re-use existing algorithms, even though they could do a better job by re-inventing these old algorithms. But they do think about significantly improving existing algorithms (e.g. search engines), both from a business and a technical point of view. Sometimes they even do a bit of hacking and reverse engineering.
  2. They spend a lot of time identifying useful data sources (most of the time, external sources), and checking how they could be exploited and leveraged.
  3. They develop meta algorithms: that is, algorithms that rely on lower level techniques such as taxonomy creation, spell check, keyword associations, news feed aggregation etc. They blend these low level algorithms to produce automated high end products.
  4. Modern data mining is to old data mining what coding in Python is to coding in assembler, Perl, or C.
  5. These data miners don't call themselves data miners anymore: they are statisticians, scientists and data miners all at the same time, combining all these roles in one person. Indeed, they are even business people and sales guys (selling internally - to their boss and to the employees reporting to them, and externally - being invited to important sales calls by their company).
  6. Many times, they succeed not by analyzing data, but by applying analytic thinking to business problems. Example: we've increased signups on our network by a factor of 3 thanks to analytic thinking - however, no data was leveraged to arrive at the action that improved signups.
  7. They sometimes develop great products without having to produce one line of code: our news feed optimizer is a good example of a technology that was developed without writing any code. Instead, it relies on widgets to distribute the news, manual selection of good feeds, and intelligent use of feed aggregators such as twitterfeed or feedburner.
  8. They find the information they are looking for on the web (and more and more from social networks as opposed to Google), rather than in books or university training.
  9. They are working on algorithm speed optimization and scalability, even (and especially!) when developing prototypes.
  10. They think so far outside the box that they are generally not a good fit for the corporate world. Instead, they become successful entrepreneurs.
  11. They are domain experts more so than coding experts. But they are also generalists across many fields and many tech skills (e.g. all of them are very familiar with SQL and R). At the same time they have deep expertise in a few (one or two) specialized domains and programming languages.
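
To make point 3 more concrete, here is a minimal sketch of what a "meta algorithm" might look like: three low-level steps (spell check, keyword associations, taxonomy lookup) blended into one automated pipeline. The vocabulary, the taxonomy, the 0.8 similarity cutoff and all function names are assumptions invented for this illustration; they do not describe any actual product mentioned in this post.

```python
from collections import Counter
from difflib import get_close_matches
from itertools import combinations

# Hypothetical vocabulary and taxonomy, invented for illustration only.
VOCABULARY = ["python", "regression", "clustering", "sql", "hadoop"]
TAXONOMY = {
    "python": "programming",
    "sql": "programming",
    "hadoop": "infrastructure",
    "regression": "statistics",
    "clustering": "statistics",
}


def spell_check(word):
    """Low-level step 1: map a word to its closest vocabulary entry, or None."""
    matches = get_close_matches(word.lower(), VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else None


def keyword_associations(docs):
    """Low-level step 2: count how often two keywords share a document."""
    pairs = Counter()
    for keywords in docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


def categorize(keywords):
    """Low-level step 3: look up each keyword's category in the taxonomy."""
    return sorted({TAXONOMY[k] for k in keywords})


def meta_algorithm(raw_docs):
    """Blend the three steps: clean the keywords, then categorize and associate."""
    cleaned = []
    for text in raw_docs:
        words = (spell_check(w) for w in text.split())
        cleaned.append([w for w in words if w is not None])
    categories = [categorize(doc) for doc in cleaned]
    return cleaned, categories, keyword_associations(cleaned)
```

Calling `meta_algorithm(["pyhton and sql", "regresion with python"])` would correct the misspelled keywords, drop the words that are not in the vocabulary, assign each document its categories, and count which keywords appear together. None of the individual steps is new; the value comes from how they are combined.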

 


Comment by Vincent Granville on May 26, 2011 at 2:47pm
Are Universities prepared to adapt curriculum accordingly?
Comment by Karl Rexer on April 25, 2011 at 1:40pm
Nice post. I think this is also related to new trends in doing science these days. I see more cross-disciplinary and results-oriented thinking. E.g., it's not unusual for a young person working in biological research to look online to find pieces of what others have done, and then use those pieces in novel ways to achieve a result -- results that would have taken much, much longer if they had had to do all the pieces themselves.
Comment by Sandeep Raut on April 24, 2011 at 7:42pm
excellent !!
Comment by Talbot Katz on April 23, 2011 at 8:49pm
Very nice!  But luckily us old-school types still have a couple of tricks up our musty dusty sleeves...


© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
