A Data Science Central Community
This question was posted on our LinkedIn group. Here's my answer:
How do you define successful? Is it based on income (as an employee), revenue (as an entrepreneur or investor), or based on how much ROI you provide to your clients / employers?
In my opinion, success = consistently providing incremental lift, regardless of model used, in at least one specific field (e.g. fraud detection) across multiple applications and varied data sets (mostly big data) over a period of 2+ years, using mostly simple, scalable, replicable and stable models and implementations. Part of the success is also in seeking the right data (including external data sources), the right compound or base metrics, understanding and cleaning the data, and quickly detecting the core of the problems in a way similar to a lean six sigma approach. Part of the success is also being able to guess what executives want to address or accomplish (requires good communication and listening skills).
Now the difficult part is how to measure incremental lift: for instance, in fraud detection, it could be a reduction in false positives large enough to justify paying your salary, provided that natural causes for the decrease in false positives have been ruled out. The value of a (successful) data scientist is easier to measure in some sort of A/B testing or DOE (design of experiments) framework. Ironically, the people most qualified to correctly assess the value of a data scientist (or any employee) are... data scientists.
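As a rough illustration of measuring lift in an A/B framework, here is a minimal sketch that compares false-positive rates between a control group (existing fraud model) and a treatment group (new model) with a one-sided two-proportion z-test. The function name, the counts, and the choice of test are all assumptions made for illustration, not a method taken from the discussion. It also assumes cases were randomly assigned to the two groups, which is what helps rule out natural causes for the decrease.

```python
import math

def false_positive_lift(fp_control, n_control, fp_treatment, n_treatment):
    """Compare false-positive rates between control and treatment groups.

    Hypothetical helper: a one-sided two-proportion z-test, assuming
    random assignment of cases to the two groups.
    """
    if n_control <= 0 or n_treatment <= 0:
        raise ValueError("group sizes must be positive")

    p_c = fp_control / n_control        # false-positive rate, existing model
    p_t = fp_treatment / n_treatment    # false-positive rate, new model

    # Pooled rate under the null hypothesis that both models perform the same
    p_pool = (fp_control + fp_treatment) / (n_control + n_treatment)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_treatment))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")

    z = (p_c - p_t) / se
    # One-sided p-value: probability of a reduction at least this large by chance
    p_value = 0.5 * math.erfc(z / math.sqrt(2))

    return {
        "control_rate": p_c,
        "treatment_rate": p_t,
        "absolute_reduction": p_c - p_t,
        "relative_reduction": (p_c - p_t) / p_c if p_c > 0 else float("nan"),
        "z": z,
        "p_value": p_value,
    }

# Made-up counts: 500 false positives out of 10,000 cases under the old model,
# 400 out of 10,000 under the new one.
result = false_positive_lift(500, 10_000, 400, 10_000)
print(result)
```

A small p-value only says the reduction is unlikely to be noise; whether it is worth a salary still depends on what each avoided false positive is worth in money.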
What is your answer?
Related topic: What is a Data Scientist?
Interesting question, Vincent, in that the question is more ambiguous than we would likely phrase a topic for data investigation. On the one hand, it could be "what is success in a data scientist," which is the question you responded to. On the other hand, it could be "what are the attributes of a successful data scientist?" To that second question, I would respond:
1. Ability to discern patterns in data
2. Creativity to develop new variables
3. Willingness to follow the data toward truth, even when it denies a desired outcome, such as a statistically significant intervention or change
4. Basic statistical and number knowledge--the tools don't need to have the sophistication of a graduate statistician, although that can be helpful
5. A logical sense of which dependent variable to focus on
6. Willingness to research and learn new tools as needed
Yes, we are in the odd position of being pay-for-performance types who are also the people best suited to measure that performance. I always use marketing margin in marketing applications: incremental revenue minus fully loaded costs, including the cost of me.
This doesn't leave a lot of headroom for R&D on new techniques on the firm's clock. I am reemerging from a sabbatical taken to up my game. The only conclusion I have from reemerging is that it will always need upping, and that the next generation of data scientists (twenty-somethings) is on track to blow the lid off our current techniques. I am convinced of that. My only mentoring opportunity is to listen, learn, and offer up that social skills and business acumen go a long way.
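The marketing-margin arithmetic mentioned in this comment (incremental revenue minus fully loaded costs, including the analyst's own cost) can be sketched in a few lines. The function name, the split of costs into two arguments, and the dollar figures are hypothetical and purely illustrative.

```python
def marketing_margin(incremental_revenue, program_costs, analyst_cost):
    """Marketing margin = incremental revenue - fully loaded costs.

    Hypothetical helper: fully loaded costs are taken here to be the
    program costs plus the cost of the data scientist doing the work.
    """
    fully_loaded_costs = program_costs + analyst_cost
    return incremental_revenue - fully_loaded_costs

# Made-up figures: $500k incremental revenue, $120k program costs,
# $150k fully loaded cost of the analyst.
margin = marketing_margin(500_000, 120_000, 150_000)
print(margin)  # 230000
```

A positive margin means the work paid for itself, analyst included; anything spent on R&D time would have to come out of that remainder.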
Everything originally stated about making sure you consistently quantify how results and recommendations impact business performance, plus what Sam said, plus:
7. The ability to interpret and share complex results in applied mathematics / IT using language and examples that non-math/non-IT people understand and trust -- without talking "down" to people.
8. An understanding of business issues, and the ability to connect the work with the business to create recommendations for action rather than just facts. Anybody can make a report full of numbers - a good data scientist can find the solution to business problems.
9. Attention to detail - never delivering a wrong number.
10. Stick to the facts - do not interpolate or quote from memory - make sure every number and fact you quote is from a source you all can refer to and verify.