
After posting my article about 6000 companies hiring data scientists, I looked at the top ones in the list and compared them with the invitations for screening interviews that I receive almost daily.

There are some interesting facts. A few of the top companies never contact me, even though I have the perfect experience and live a few miles away (I've heard they prefer to relocate people rather than hire locals, as they believe that relocated people make for more committed employees). Others, like GE, IBM, British Petroleum, Starbucks, Capital One, and Deloitte, do contact me, which you might find a bit surprising if you look at my resume (accessible from the resume section below). This may also have to do with corporate culture, with some companies having an aversion to hiring disruptive people.

Evidently there are some biases: I blog a lot and occasionally criticize some companies or publish proprietary material (from my own data science research lab), and some companies just don't like that. I also look more like a start-up guy, so I receive far more interest from start-ups. Still, a minority of the biggest companies have never shown interest in me (despite my numerous LinkedIn connections with them). Maybe they have never heard of me (I don't think so), or they believe I can't bring value to them, or that my experience is not a good fit, or that I am too expensive.

Solution to data scientists being too expensive

Regarding "too expensive": with income now above $400k/year (since I changed from "employee" to full-time "co-founder/owner"), that's true, I'm too expensive, but they don't know about that. Actually, in my last "employee" positions back in 2012, I was making $150k-$170k, which is not high by Bay Area standards. I worked mostly in California (it has a 10% income tax on such salaries) although I was physically in Washington state (Washington has no income tax). So I could ask for a salary $15,000 below what local candidates would ask. I mention this because for California, New Jersey, or New York employers, this is one way to reduce your costs: hire telecommuting employees located in states with no income tax (plenty of great talent can be found in Austin, Texas, as well as near Seattle, Washington - both places with no income tax). Real estate in these locations is also priced well below California or the East Coast. A guy like me probably has a $5,000 monthly mortgage in the Bay Area; mine is $2,500, with a beautiful house, though I was lucky: I sold my house in the Bay Area and then moved to Seattle at the right time (2006). All in all, I could ask for a salary $50,000 below what a Bay Area applicant would ask, and still have a better quality of life with no financial stress.
This was true until 2012. Now this window of opportunity (for employers) is closed, especially since not only was my income boosted by a factor of 3, but my job security is also much better now. (Note: I don't recommend that you switch from employee to founder; most fail. You should test the waters first, and make sure that you don't have financial stress; that you love "doing business" more than you love doing statistics; that you are good at managing finances, finding the right partners, outsourcing, delegating, business hacking, "lean start-ups", marketing, product, vendor selection, market understanding, and a bunch of other things; and, most importantly, that you are passion-driven and focused on generating both value and profits.)
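
To make the arithmetic behind these salary figures explicit, here is a small sketch. It uses only the numbers quoted above (a 10% California income tax, a $150k-$170k salary range, and $5,000 versus $2,500 monthly mortgages). Treating the tax as a flat rate and simply adding the two gaps are simplifying assumptions, and the function names are hypothetical.

```python
# Figures quoted in the text; the flat tax rate is a simplification.
CA_INCOME_TAX_PERCENT = 10
BAY_AREA_MONTHLY_MORTGAGE = 5_000
SEATTLE_MONTHLY_MORTGAGE = 2_500


def annual_tax_savings(salary, tax_percent=CA_INCOME_TAX_PERCENT):
    """State income tax avoided per year by living in a no-income-tax state."""
    return salary * tax_percent // 100


def annual_housing_savings(expensive=BAY_AREA_MONTHLY_MORTGAGE,
                           cheaper=SEATTLE_MONTHLY_MORTGAGE):
    """Yearly gap between the two monthly mortgage payments."""
    return (expensive - cheaper) * 12


for salary in (150_000, 170_000):
    tax = annual_tax_savings(salary)
    housing = annual_housing_savings()
    print(f"salary ${salary:,}: tax ${tax:,} + housing ${housing:,} "
          f"= ${tax + housing:,} per year")
```

Under these assumptions the tax gap alone is about $15,000 to $17,000 a year, and the combined gap is roughly $45,000 to $47,000, in the neighborhood of the $50,000 figure. Mortgage payments come out of after-tax income, so the equivalent difference in gross salary would be somewhat larger.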

But I'm sure there are plenty of other people who made the same move as I did (probably located in Austin, Texas, or Seattle now). Identifying them might be worthwhile, as they need lower salaries. People who benefited from the spectacular stock market recovery might also be happy with lower salaries - a weapon that they can use to compete with other data scientists. Of course, if you need a top gun, it might be worth paying him $500k/year, but then you might need to provide him with the right job title and responsibilities (possibly external consultant or chief scientist, if you don't already have one), otherwise it could create jealousy and bad team spirit.

Now don't get me wrong, I'm not looking for a job and probably never will again, so not contacting me is a smart decision, but employers don't know that either. Also, I don't fit well as an employee, but again these companies don't know that until I've worked 2 years for them. Also, some of the companies interested in discussing a job opportunity with me might just want to spend a day with me, at no cost, and steal as much IP (intellectual property, ideas) as they can from me, with no intention of hiring me in the first place.

Some companies - including one that I regularly criticize, publicly posting solutions to improve their revenue using better data science - keep contacting me even though I'm not a good fit (for instance, they are consistently looking for developers with some statistical knowledge, but I am not a developer). I'm sure that the company I'm talking about has boosted its revenue by many millions of dollars thanks to reading and applying my solutions, so at least they benefit from me (as do all their competitors who also read my postings).

Hiring process needs to be improved

When hiring a data scientist, sometimes the hiring company does not have data scientists on staff to interview you. They use statisticians (sometimes called director of analytics or market research analyst) to assess your data science technical knowledge. But a data scientist might not know all the modern flavors of logistic regression (for instance) and might be perceived as incompetent. Yet the data scientist knows lots of things - APIs, computational complexity, ROI optimization, KPI design and data collection, relevancy engines - that the business statistician might not know or might not value. The data scientist applicant ends up being labeled as incompetent and not hired. This raises an interesting question:

Who should interview data scientist applicants? Should it be developers, or statisticians, or business people? And how can a data scientist (interviewee) convince a statistician (interviewer) that his knowledge is critical? Answer: By discussing success stories, where your mix of business acumen / engineering / big data (non traditional) statistics helped a project succeed. 

Sometimes statisticians or business analysts are afraid of data scientists and, out of fear, will write a bad review after the interview, with a recommendation not to hire. It would make sense to find and hire an external (third party, neutral) consultant (a data scientist him/herself) to interview the candidate.

Mismatch between resumes and job ads

The general problem behind the lack of analytic talent, and behind employers as well as employees wasting tons of time in fruitless job interviews, is well illustrated when you compare the resumes and job ads below. One of our readers (look at the comments below) mentioned that the skill R (one of the two most popular programming languages used by data scientists, the other one being Python) is never picked up by the automated search tools that recruiters use to parse resumes, because it's just one letter. So it does not matter whether you have R in your resume if the hiring company uses poor automated filtering tools to narrow down candidates with the desired skills, such as (especially and ironically) R. This is another reason why companies should not rely exclusively on these tools if they are looking for someone with R (the programming language) in their resume. They should also manually check profiles on LinkedIn and on our network (on AnalyticBridge.com and on DataScienceCentral.com), as some people choose not to be on LinkedIn. Or they can find a skill frequently associated with R (maybe Python, SAS, Perl, Matlab) and use it as a proxy to find R programmers. Interestingly, a search for R on LinkedIn returned the following profiles: International Financier, Account Executive R+L Carriers, Recovery at Citizen's Acting Together Can Help, R M at DHFL, Critical Care Transport R.N, R.R.T at LifeLine Ambulance Service, and so on. None have the R skill you are looking for; indeed, none of them are analytic people, not even remotely. Eventually, search technology will address this issue. It's a very easy fix for search technology engineers: use a white-list of keywords, and when processing a user query, do not remove 1-letter or 2-letter words found in the white-list (including R for R programming, IT for Information Technology, and so on).
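
The white-list fix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 3-character minimum length, and the tokenizing regular expression are all hypothetical, and the white-list contains just the two examples from the text (R and IT). It does not describe how any actual recruiting tool works.

```python
import re

# Short tokens that must survive query cleaning (the two examples from the text).
SHORT_TERM_WHITELIST = {"r", "it"}

# Assumed rule: a naive search tool drops every token shorter than this.
MIN_TOKEN_LENGTH = 3


def tokenize_query(query, whitelist=SHORT_TERM_WHITELIST,
                   min_length=MIN_TOKEN_LENGTH):
    """Split a query into lowercase tokens, dropping short tokens
    unless they appear in the white-list."""
    tokens = re.findall(r"[a-z0-9+#]+", query.lower())
    return [t for t in tokens if len(t) >= min_length or t in whitelist]


# With the white-list, "r" is kept while "or" is still dropped.
print(tokenize_query("R programming or SAS"))
# Without it, the one-letter skill silently disappears from the query.
print(tokenize_query("R programming or SAS", whitelist=set()))
```

Because the sketch lowercases everything, it cannot tell IT (Information Technology) from the pronoun "it". A real tool would need case or context information to separate the two, and matching the token R is only the first step, since the LinkedIn results above show that the letter also appears in names such as R+L Carriers.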

Resumes

The following sample resume extracts are from actual data science practitioners who agreed to be featured in my book. In order to allow these professionals to delete or update their resume, I have made the resumes accessible on the web, at http://bit.ly/1j4PNuP. You can find many more resumes and profiles by doing a search on LinkedIn with the keyword data science or related keywords, or by browsing Data Science Central member profiles.

Included here in the list are people from different locales and backgrounds in an attempt to cover various aspects of data science. The emphasis is on providing a well-balanced mix of professional analytic people — both junior and senior, people with big company or startup experience or both, top stars and people with average resumes (sometimes the most faithful employees), and corporate or consultant or academia-related people. These resumes have been shortened and reformatted. I also added mine, to provide an example with patents and classical big data science such as credit card fraud detection or digital analytics.

By comparing these resumes to the job ads below, it seems that Human Resources departments are sometimes looking for a unicorn: a professional with a skill mix that does not exist. Sometimes they hesitate between hiring a data engineer, a business analyst, or a data scientist. I encourage employers to seek out and hire people with strong potential and train them, rather than looking for the rare and expensive unicorn who often turns out not to be the best fit (and may only be happy running their own business). It's sometimes easier to hire a software engineer or business guy (MBA) and have him learn statistics than the other way around, especially at the beginning of a big project. Long term, not hiring data scientists means losing against competitors that are better equipped to extract value out of data.

Typical skills mentioned in these six resumes are: R, Python, Matlab, MongoDB, SQL, MySQL, statistics/machine learning (KNN, Decision Trees, Neural Networks, linear/logistic regression) and finally Java, JavaScript, Tableau, Excel, Recommendation Engines, Google Analytics. Of course, no one has all of them listed: 50% have R, and 50% have Python (the two most common ones).

You should check these resumes to see career progression (lateral or vertical), and the degrees, ongoing training and certifications these people have gained. 

Job Ads

Consider the sample (yet actual and recent) job ads found at http://bit.ly/1hVAmr7. The skills most frequently listed are: Python, Linux, UNIX, MySQL, Map-Reduce, Hadoop, Matlab, SAS, Java, R, SPSS, Hive, Pig, Scala, Ruby, Cassandra, SQL Server, and NoSQL. Many times, several of these skills are listed (5-6), while applicants only have a few (2-3) in their resume.
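
The gap between what an ad lists and what a resume offers can be measured with a simple set comparison. The sketch below is illustrative only: the function name is hypothetical, and the sample resume and ad are made up from skill names mentioned in this article, not taken from the actual resumes or ads.

```python
def skill_coverage(resume_skills, ad_skills):
    """Return the ad's skills found in the resume, the ones missing,
    and the fraction of the ad's skills that the resume covers."""
    resume = {s.strip().lower() for s in resume_skills}
    ad = {s.strip().lower() for s in ad_skills}
    matched = ad & resume
    missing = ad - resume
    coverage = len(matched) / len(ad) if ad else 0.0
    return sorted(matched), sorted(missing), coverage


# Made-up example: an applicant with 3 skills against an ad listing 6.
resume = ["R", "SQL", "Python"]
job_ad = ["Python", "Hadoop", "Hive", "SQL", "Java", "Scala"]

matched, missing, coverage = skill_coverage(resume, job_ad)
print("matched:", matched)
print("missing:", missing)
print(f"coverage: {coverage:.0%}")
```

A comparison like this treats skills as exact keywords, which is precisely the weakness discussed earlier: it gives no credit for a closely related skill and says nothing about how quickly a strong candidate could learn a missing one.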


Comment by Vincent Granville on January 19, 2014 at 6:40am

@Thomas: There are two types of statisticians: Those who associate themselves with ASA (American Statistical Association), and those who don't. The latter constitute a bigger segment, but you seem to ignore them. The ones that ASA promotes need considerable training to become data scientists. I started my career as a business statistician (after a PhD in computational statistics) so I am indeed a business statistician. They are far more numerous than you think, but usually don't care to call themselves statisticians.

Comment by Thomas Speidel on January 18, 2014 at 6:22pm

Vincent, what qualifies those people as statisticians? What education, career path and certification make them (disguised) statisticians?

You appear to have a very contorted idea of what statistics is and what statisticians do.  I have been following your posts and this is the conclusion I came to.  I can't fathom what horrible experiences you must have had with statistics, but I can guarantee they are not a genuine reflection of who we are as a group. We think we have a lot to offer to data science.  After all, we have been doing this successfully for at least 100 years.

You seem to want to lay blame on statisticians for not being a "hot commodity".  In truth, statisticians are rare and the people you listed are not statisticians. I don't know what the graduation numbers are, but until 2 years ago they were very low for the US and Canada, and would not support your claim. The lists of certified statisticians are usually public, so you can take a look for yourself.   You can mostly find us in health research, government and universities.  In fact, in another post you complained that the professional organizations are biased towards clinical trials and biostatisticians. 

If you could, for just one minute remove your deep hatred for statisticians that is perceived from your posts, and look at the progress we contributed to in fields such as health research, genetics, econometrics, actuarial science, maybe, just maybe, you can see how we can share our experience to play a crucial role in data science. 

Comment by Theresa Doyon on January 18, 2014 at 12:10pm

@ThomasSpeidel - I agree and would add that we forget to ask who is doing the hiring.  With all the big data hype, a lot of IT departments have been doing the hiring and put all the emphasis on data and programming skills, but not analysis. 

@VincentGranville - Don't take the list of programming languages in some of these postings literally. Last year I interviewed with a company that advertised for SAS and R when they in fact were using KXEN. It wasn't clear if it was an intentional attempt to confuse competitors or if the problem was with an HR-drafted job description. 

Comment by Vincent Granville on January 18, 2014 at 11:53am

@Thomas: The business statisticians are usually not called statisticians, but instead market research analyst, director of analytics, data scientist (erroneously), fraud detection expert and so on. Anyway, despite the job title, they are indeed traditional business statisticians, and therefore, statisticians.

Comment by Thomas Speidel on January 18, 2014 at 11:48am

"They use statisticians to assess your data science technical knowledge."

This is absolutely not true.  How many statisticians are employed by private companies? There are very very few of them.  It might be the case for pharma and government and a few other niches, but that's where it ends.  Truth is, most companies don't even know how to spell the word statistician, let alone have one. 

But you raise a good point in asking who should interview a data scientist. 

Comment by Scott Eilerts on January 18, 2014 at 9:51am

Chances are, the big places have too many resumes to sift through already. The overall job market is still on the soft side.

How much of this problem is due to the fact that a key skill, R, is a difficult search term for both job seekers and recruiters? Search for R, or even R programming on LinkedIn, and you find R&D, Toys R Us, DR, but maybe only 10-20% of jobs have the R skill listed explicitly. Likewise, "R Programming" turns up almost no results since recruiters usually just put "programming language such as SAS, R, Python" in listings. For R programmers, this is one of the more cynical reasons to know enough python to list it as a skill as well, since a search for R python returns almost 100% data science postings. Presumably, good recruiters search profiles in the same way.

Comment by Thomas Speidel on January 18, 2014 at 9:42am

What you are describing is a trend that has many of us concerned.  It can be described by way of psychology: we tend to gravitate around the things we understand and eschew what we don't understand. 

What do most companies understand? Data.  How to capture massive amounts of it (quantity is easier than quality). How to store it.  How to retrieve it.  How to manage it.

What do most companies not understand? Analysis. What to measure, how to measure it, what impacts those measures, how to properly analyze those measures, how to turn results into actions.

By thinking that analysis is simply an extension or a side effect of data, there is a bias and an inclination to look for the skills that are typically associated with data, instead of analysis. This can have devastating effects, especially for sensitive analysis.  As an example, think of a medical diagnostic manufacturer hiring the wrong talent to analyze the effectiveness of their machines.

The idea of training other professionals on stats is an illogical one and reflects how distorted and poorly understood the objectives have become.  It's as if a hospital administrator hired an accountant and trained him to be a surgeon on the grounds that they are both doing work for the same patient. What companies can do is train professionals on the subject matter and the field so as to permit cross-culture hiring.
