After posting my article about 6000 companies hiring data scientists, I looked at the top ones in the list and compared them with the invitations to screening interviews that I receive almost daily.

There are some interesting facts. A few of the top companies never contact me, even though I have the perfect experience and live a few miles away. (I've heard that they prefer to relocate people rather than hire locals, as they believe that relocated people make for more committed employees.) Others, such as GE, IBM, British Petroleum, Starbucks, Capital One, and Deloitte, do contact me, which might seem a bit surprising if you look at my resume (accessible from the resume section below). This may also have to do with corporate culture, with some companies having an aversion to hiring disruptive people.

Evidently there are some biases: I blog a lot and occasionally criticize some companies or publish proprietary material (from my own data science research lab), and some companies just don't like that. I also look more like a start-up guy, so I receive far more interest from start-ups. Still, a minority of the biggest companies have never shown interest in me (despite my numerous LinkedIn connections with them). Maybe they have never heard of me (I don't think so), or they believe that I can't bring value to them, that my experience is not a good fit, or that I am too expensive.

Solution to data scientists being too expensive

Regarding "too expensive": with income now above $400k/year (since I changed from "employee" to full-time "co-founder/owner"), that's true, I'm too expensive, but they don't know about that. Actually, in my last "employee" positions back in 2012, I was making $150k-$170k, which is not high by Bay Area standards. I worked mostly in California (it has a 10% income tax on such salaries), although I was physically in Washington state (Washington has no income tax). So I could ask for a salary $15,000 below what local candidates would ask. I mention this because for California, New Jersey, or New York employers, this is one way to reduce your costs: hire telecommuting employees located in states with no income tax (plenty of great talent can be found in Austin, Texas, as well as near Seattle, Washington, both places with no income tax). Real estate in these locations is also well below California or East Coast prices. A guy like me probably has a $5,000 monthly mortgage in the Bay Area; mine is $2,500, with a beautiful house, though I was lucky: I sold my house in the Bay Area and then moved to Seattle at the right time (2006). All in all, I could ask for a salary $50,000 below what a Bay Area applicant would ask, and still have a better quality of life with no financial stress.
This was true until 2012. Now this window of opportunity (for employers) is closed, especially since not only has my income been boosted by a factor of 3, but my job security is also much better. (Note: I don't recommend that you switch from employee to founder; most fail. You should test the waters first, and make sure that you don't have financial stress; that you love "doing business" more than you love doing statistics; that you are good at managing finances, finding the right partners, outsourcing, delegating, business hacking, "lean start-ups", marketing, product, vendor selection, market understanding, and a bunch of other things; and, most importantly, that you are passion-driven and focused on generating both value and profits.)
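
The salary arithmetic above can be roughly sketched as follows. This is only an illustration under stated assumptions: the function name and the way the two savings are simply added are mine, not a formula from the article, and the figures are the ones quoted in the text (a 10% state income tax, $150k-$170k salaries, $5,000 versus $2,500 monthly mortgages).

```python
def annual_salary_discount(gross_salary, state_tax_percent,
                           local_mortgage, remote_mortgage):
    """Rough yearly amount a remote candidate could give up (hypothetical helper).

    Returns (state tax saving, housing saving, total), all in dollars per year.
    Integer arithmetic is used on purpose to avoid floating-point rounding.
    """
    # State income tax the remote employee does not pay.
    state_tax_saving = gross_salary * state_tax_percent // 100
    # Difference in monthly mortgage payments, over 12 months.
    housing_saving = (local_mortgage - remote_mortgage) * 12
    return state_tax_saving, housing_saving, state_tax_saving + housing_saving


# Figures quoted in the text above.
for salary in (150_000, 170_000):
    tax, housing, total = annual_salary_discount(salary, 10, 5_000, 2_500)
    print(f"salary ${salary:,}: tax ${tax:,} + housing ${housing:,} = ${total:,}")
```

With these inputs the total comes to $45,000-$47,000 per year. The mortgage is paid out of after-tax income, so the pre-tax equivalent is somewhat higher, which is in the neighborhood of the $50,000 figure mentioned above.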

But I'm sure there are plenty of other people who made the same move as I did (probably located in Austin, Texas, or Seattle now). Identifying them might be worthwhile, as they need lower salaries. People who benefited from the spectacular stock market recovery might also be happy with lower salaries, a weapon that they can use to compete with other data scientists. Of course, if you need a top gun, it might be worth paying him $500k/year, but then you might need to provide him with the right job title and responsibilities (possibly external consultant or chief scientist, if you don't already have one); otherwise it could create jealousy and bad team spirit.

Now don't get me wrong, I'm not looking for a job and probably never will again, so not contacting me is a smart decision, but employers don't know that either. Also, I don't fit well as an employee, but again these companies don't know that until I've worked 2 years for them. Also, some of the companies interested in discussing a job opportunity with me might just want to spend a day with me, at no cost, and steal as much IP (intellectual property, ideas) as they can from me, with no intention of hiring me in the first place.

Some companies - including one that I regularly criticize, publicly posting solutions to improve their revenue using better data science - keep contacting me regularly though I'm not a good fit (for instance, they are consistently looking for developers with some statistical knowledge, but I am not a developer). I'm sure that this company has boosted its revenue by many millions of dollars thanks to reading and applying my solutions, so at least they benefit from me (as do all their competitors who also read my postings).

Hiring process needs to be improved

When hiring a data scientist, sometimes the hiring company does not have data scientists on staff to interview you. They use statisticians (sometimes called director of analytics or market research analyst) to assess your data science technical knowledge. But a data scientist might not know all the modern flavors of logistic regression (for instance) and might be perceived as incompetent. Yet the data scientist knows lots of things - APIs, computational complexity, ROI optimization, KPI design and data collection, relevancy engines - that the business statistician might not know or does not value. The data scientist applicant ends up being labeled as incompetent and is not hired. This raises an interesting question:

Who should interview data scientist applicants? Should it be developers, or statisticians, or business people? And how can a data scientist (interviewee) convince a statistician (interviewer) that his knowledge is critical? Answer: By discussing success stories, where your mix of business acumen, engineering, and big data (non-traditional) statistics helped a project succeed.

Sometimes statisticians or business analysts are afraid of data scientists and, out of fear, will write a bad review after the interview, with a recommendation not to hire. It would make sense to find and hire an external (third party, neutral) consultant, himself or herself a data scientist, to interview the candidate.

Mismatch between resumes and job ads

The general problem behind the lack of analytic talent, and behind employers as well as employees wasting tons of time in fruitless job interviews, is well illustrated when you compare the resumes and job ads below. One of our readers (look at the comments below) mentioned that the skill R (one of the two most popular programming languages used by data scientists, the other one being Python) is never picked up by the automated search tools that recruiters use to parse resumes, because it's just one letter. So it does not matter whether you have R in your resume if the hiring company uses poor automated filtering tools to narrow down candidates with desired skills, such as (especially and ironically) R. This is another reason why companies should not rely exclusively on these tools if they are looking for someone with R (the programming language) in their resume. They should also manually check profiles on LinkedIn and on our network (on AnalyticBridge.com and on DataScienceCentral.com), as some people choose not to be on LinkedIn. Or they can find a skill most frequently associated with R (maybe Python, SAS, Perl, or Matlab) and use it as a proxy to find R programmers. Interestingly, a search for R on LinkedIn returned the following profiles: International Financier, Account Executive R+L Carriers, Recovery at Citizen's Acting Together Can Help, R M at DHFL, Critical Care Transport R.N, R.R.T at LifeLine Ambulance Service, and so on. None have the R skill you are looking for; indeed, none of them are analytic people, not even remotely. Eventually, search technology will address this issue. It's a very easy fix for search technology engineers: use a white list of keywords, and when processing a user query, do not remove 1-letter or 2-letter words found in your white list (including R for R programming, IT for Information Technology, and so on).
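
The white-list fix described above can be sketched in a few lines. This is a minimal illustration, not how any particular recruiting tool or LinkedIn works: the minimum token length, the white list contents, and the function name are all assumptions made for the example.

```python
import re

# Assumption: a naive query processor drops tokens shorter than 3 characters.
MIN_TOKEN_LENGTH = 3

# Hypothetical white list of short tokens that carry meaning for recruiters.
SHORT_TOKEN_WHITELIST = {"r", "it"}


def tokenize_query(query, whitelist=SHORT_TOKEN_WHITELIST,
                   min_length=MIN_TOKEN_LENGTH):
    """Lowercase and split a query, keeping short tokens only if white-listed."""
    tokens = re.findall(r"[a-z0-9+#]+", query.lower())
    return [t for t in tokens if len(t) >= min_length or t in whitelist]


# Without a white list, "R" and "IT" disappear from the query.
print(tokenize_query("R programmer in IT", whitelist=set()))
# With the white list, they survive, while the filler word "in" is still dropped.
print(tokenize_query("R programmer in IT"))
```

One limitation of this sketch is that lowercasing makes "IT" (Information Technology) indistinguishable from the pronoun "it"; a real system would likely need case-sensitive or context-aware handling.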

Resumes

The following sample resume extracts are from actual data science practitioners who agreed to be featured in my book. In order to allow these professionals to delete or update their resume, I have made the resumes accessible on the web, at http://bit.ly/1j4PNuP. You can find many more resumes and profiles by doing a search on LinkedIn with the keyword data science or related keywords, or by browsing Data Science Central member profiles.

Included here in the list are people from different locales and backgrounds in an attempt to cover various aspects of data science. The emphasis is on providing a well-balanced mix of professional analytic people — both junior and senior, people with big company or startup experience or both, top stars and people with average resumes (sometimes the most faithful employees), and corporate or consultant or academia-related people. These resumes have been shortened and reformatted. I also added mine, to provide an example with patents and classical big data science such as credit card fraud detection or digital analytics.

By comparing these resumes to the job ads below, it seems that Human Resources departments are sometimes looking for a unicorn: a professional with a skill mix that does not exist. Sometimes they hesitate between hiring a data engineer, a business analyst, or a data scientist. I encourage employers to seek out and hire people with strong potential and train them, rather than looking for the rare and expensive unicorn who often turns out not to be the best fit (and may only be happy running their own business). It's sometimes easier to hire a software engineer or business guy (MBA) and have him learn statistics than the other way around, especially at the beginning of a big project. Long term, not hiring data scientists means losing against competitors that are better equipped to extract value out of data.

Typical skills mentioned in these six resumes are: R, Python, Matlab, MongoDB, SQL, MySQL; statistics/machine learning (KNN, Decision Trees, Neural Networks, linear/logistic regression); and finally Java, JavaScript, Tableau, Excel, Recommendation Engines, and Google Analytics. Of course, no one has all of them listed: 50% have R, and 50% have Python (the two most common ones).

You should check these resumes to see career progression (lateral or vertical), and the degrees, ongoing training and certifications these people have gained. 

Job Ads

Consider the sample (yet actual and recent) job ads found at http://bit.ly/1hVAmr7. The skills most frequently listed are: Python, Linux, UNIX, MySQL, Map-Reduce, Hadoop, Matlab, SAS, Java, R, SPSS, Hive, Pig, Scala, Ruby, Cassandra, SQL Server, and NoSQL. Many times, several of these skills are listed (5-6), while applicants only have a few (2-3) in their resume.
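
To make the mismatch concrete, the two skill lists quoted in this article can be compared with simple set operations. This is a rough sketch: the function name is hypothetical, the lists are copied from the text above, and I left out the non-tool items from the resume list (statistics/machine learning, Recommendation Engines, Google Analytics).

```python
# Tool skills listed in the six sample resumes (from the text above).
RESUME_SKILLS = {"R", "Python", "Matlab", "MongoDB", "SQL", "MySQL",
                 "Java", "JavaScript", "Tableau", "Excel"}

# Skills most frequently listed in the sample job ads (from the text above).
JOB_AD_SKILLS = {"Python", "Linux", "UNIX", "MySQL", "Map-Reduce", "Hadoop",
                 "Matlab", "SAS", "Java", "R", "SPSS", "Hive", "Pig", "Scala",
                 "Ruby", "Cassandra", "SQL Server", "NoSQL"}


def compare_skills(resume_skills, job_ad_skills):
    """Return (skills found in both lists, skills requested only in job ads)."""
    shared = resume_skills & job_ad_skills
    ad_only = job_ad_skills - resume_skills
    return shared, ad_only


shared, ad_only = compare_skills(RESUME_SKILLS, JOB_AD_SKILLS)
print("In both lists:", sorted(shared))
print("Requested but absent from the resumes:", sorted(ad_only))
print(f"{len(shared)} of {len(JOB_AD_SKILLS)} job-ad skills appear in the resumes")
```

Note that this exact keyword match is itself crude, which is the weakness of automated filtering discussed earlier: "SQL" in a resume does not match "SQL Server" in an ad, and MongoDB experience does not match "NoSQL".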

Views: 18161

Comment

Comment by Thomas Speidel on February 8, 2014 at 10:04am

@Keith: "My thought is that analysts are not wed to a technology."

That's a good point. However, I think the technology needs to be assessed on par with all other aspects of competency, suitability and fit. For roles like these, technology is crucial and in some cases takes many years to master. In some ways, I see technology as a proxy for several other aspects of a candidate that do not emerge otherwise, like flexibility, seriousness, as well as his or her problem solving abilities.

Comment by Charles Rein on February 8, 2014 at 9:49am

One of the biggest reasons that I see, when we have a recovery after a recession, is that every hiring company is looking to fill a salary-constrained position. They are looking for an experienced professional with 3-5 years or 5-7 years. These are typically standard internal company rankings based on hiring structure and breakdowns. The companies will ask for 5 years of experience as a norm, with exact experience in software and application skills and industry vertical.

For example: a Data Engineer with 5+ years of experience in the Oil & Gas industry, doing sales and marketing applications in Java and Teradata.

All those skills will show up on LinkedIn, but what is missing is the true 5 years. Very few companies were hiring and training engineers in any field in 2008, 2009, 2010, and 2011.

Look at your own staff and see how many employees you have that graduated in those years to create a candidate base.

The dilemma then arises: if you truly find a 2009 grad with all the right experience, you had better move with lightning speed and tact, because he/she becomes a number 1 draft pick and is bombarded by recruiters, managers, promises, lies, innuendos, and offers.

This is where the hiring companies err: they recruit, chase, and make offers to these high-value recruits just as they would for any other opening. They take 1-2 months to go through a recruiting and interview process, then 2-3 weeks to formalize an offer, then 2 more weeks to wait on acceptance. At this point the candidate is not talking to anyone and has 5-10 competitive offers, and you have used all the arrows in your quiver. I have spent 30 years recruiting technical professionals: if you are using agencies, you still think their fees are too high, and when you beat them up over a fee, you blame them for not delivering quality. A players go to the A teams.

Comment by Keith Osbon on February 7, 2014 at 2:06pm

My thought is that analysts are not wed to a technology.  Other than making sure a person is grounded in very basic technology, I would never hire someone based on his experience with a certain programming language or enterprise software platform.  I only look for problem solvers; I let the technology guys handle the technology.  When you start stipulating that someone be technically proficient in a specific advanced technology, then you immediately limit yourself to 10% or less of the available talent pool.

Comment by Roger Fried on January 27, 2014 at 1:15pm

I find the Data Scientist vs. Statistician debate below amusingly similar to the difference between financial professionals and accountants.  All financial professionals use accounting, but many of the elements of accounting are rather arcane and distant from the business of management.  Most financial professionals are not "true accountants", but we are far more useful (i.e., flexible, global, and practical) than accountants within many business situations (accountants are far more useful in other situations.)  

Comment by Ira Gershkoff on January 22, 2014 at 9:22am

I'll add another reason why companies can't find analytic talent: their recruiting practices are simply awful.  They drag out the due diligence process far longer than necessary; they keep candidates in the dark about where they stand; and they request many more meetings and phone calls than are necessary.  I've also heard more than a few stories about hiring managers going through the process and selecting a candidate to make an offer to, and then the hiring manager's boss or the CFO refuses to sign the requisition approving the new hire.  If the company needs analytic talent and they had permission to post the job, why won't they allow the manager to fill it?

I've seen these sorts of shoddy, self-defeating practices at companies of all sizes, from startups to the Fortune 50. It's not always a big company problem.

Good people won't put up with this nonsense.  Either they go somewhere else, or they do what Vincent did and go out on their own.  I agree that the people coming from other countries where math education is given more emphasis do very well in this environment, and they tend to be good entrepreneurs as well.

So what is best practice for recruiting?  It's moving the process along at a brisk pace, where each communication between the candidate and the company adds value for both.  It's where the time from the initial resume submission by the candidate to the formal offer letter is 2-3 weeks, during which time there was a phone screen, an on-site interview, submission of HR paperwork, a follow-up phone call from the hiring manager, reference checking, a verbal offer, a negotiation phone call, and the offer letter.  (If it takes 4-6 weeks to do those things, that's still a good performance.)  If the company can't keep things going at a reasonable pace, it should tell the candidate what the hangup is, and when to expect it to clear up.  Stonewalling over some internal problem just wastes everyone's time.

If hiring companies would pay a bit more attention to their recruiting and hiring processes, I'm sure they would find out that there's more analytic talent out there than they think there is.

Comment by Venkatesh Chellappa on January 21, 2014 at 3:50am

@ Vincent, a very good read indeed, and it kindles my thought process. But I must say I concur with Chandrasekhara! I did Veterinary Medicine for my Bachelor's degree and Informatics for my Master's. I have been analyzing data (statistically) for the past 3 years. I did have Statistics in my High School, College and my Vet degree (Biostatistics), which was strengthened with my Master's, and I worked as a Drug Discovery Scientist, which involved heavy mathematical modelling of chemical, biological and pharmaceutical data. I never had a formal certification! Having read the article and the comments, a question arises - does being a certified or chartered Statistician override the merits of scientific domain knowledge?

@Thomas - "having statistical experience does not equate to being a statistician!" - Agreed! But what does? A university degree/formal education, or innate prowess? Some people are just not analysis material. What would be your take on this? What makes anyone a statistician? If today I want to pursue that path with my current background, what directions would you give me?

Comment by Chandrasekhara S. "C.S." Ganti on January 20, 2014 at 10:12pm

@ Vincent, an excellent summary. Notwithstanding the unpleasant experience of some folks, I don't want to be labeled, and I have to take exception to the claim about countries not having proper majors:

In fact, most of our degrees follow the continental education model (from the former British rule), with a three-subject group at the undergraduate level. I am from India, with a B.S. in Math, Physics and Chemistry (as majors) and an M.Sc. in Statistics (a full-fledged 2 years, with a long academic year), in which I did Probability, Mathematical Statistics, Statistical Inference, SQC, DoE, Analysis of Variance, Measure Theory, and Operations Research.

So, while we don't know about other countries, we have (or had) strong undergraduate and graduate backgrounds, with majors such as Botany, Zoology, and Chemistry, which can lead to medicine and the sciences. That is how most B.S. graduates can go on to an M.S. curriculum in the US, and M.S. folks can go on to doctoral programs.

Commerce and Business concentrations can go on to business, and there are many other groups. The academic year is long and so is the day: classes run from morning to late afternoon, with only a 6-week or so summer break.

Comment by Thomas Speidel on January 19, 2014 at 10:49am

Vincent, I too am one of those people that came from somewhere else.  I can assure you that the concept of major is not as homogeneous as you paint it (it changes from country to country even w/in EU).  Also, let's not confuse mathematics with statistics and let's not confuse statistical literacy with specialization in statistics. 

Finally, if you are talking about the usual suspects (Google, MS, Facebook, Amazon, Paypal, Ebay etc), I suspect those organizations know what a data scientist should look like.  Let's not forget though that most American and Canadian corporations (and elsewhere, of course) are not Google and will hardly have a statistician working with them. 

Having statistical experience does not equate to being a statistician.  I know plenty of people that can run a simple linear regression in Excel.  Does that make them statisticians?  I'm afraid it takes much, much more than that. So, I think the disagreement stems from a wildly different view of what a statistician is and does.

Comment by Vincent Granville on January 19, 2014 at 8:58am

@Thomas: I run a job board for analytic people, indeed the largest one. I see thousands of resumes and job ads. So saying that my conclusions are based only on anecdotal data is a lie. But it is true that I see fewer of the "traditional" (ASA category) statisticians and might underestimate their numbers. About 0.5% of the workforce is business statisticians - one for every 200 employees, one in every company with 200 employees, dozens in a company like Google, a few hundreds at Microsoft (I'm connected to 100 at Microsoft). About 500,000 total in US. Many, if not most, come from places like India, China, Russia, France, Germany, UK, have first studied in their countries where the concept of "major" (in education) does not even exist. In these countries, math and stats education is far more popular than in US. Indeed, these countries dominate the analytic market, though a bunch of them eventually become US immigrants. You might see the tree, but not the forest.

The vast majority are not certified or chartered statisticians, and are not members of any statistical association. I believe ASA has 20,000 members. On this network, we have far more than that. Our main LinkedIn group alone has more than 100,000 members (not all of them are statisticians per se, though most have statistical expertise, and participate as interviewers when hiring data scientists). This is public information, you can check it out yourself.

Comment by Thomas Speidel on January 19, 2014 at 8:26am

I may well underestimate the numbers. But to go from there to saying that graduates of one of the least popular majors are responsible for the hiring of data science folks is misinformation. Your experience is just that, anecdotal.

© 2017   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC   Powered by
