Monday 18 February 2013

What is a "Data Scientist" and how do I become one?

I read an interesting post yesterday by Daniel Tunkelang of LinkedIn called "Data Science: What's in a name".

Essentially, a Data Scientist is someone who applies the scientific method - explore, hypothesize, test, repeat - to data.  This is probably what a lot of "QlikViewers" and other "data discovery" (Tableau, Spotfire, etc.) experts think they do.  But there are potential gaps that need to be considered.

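Daniel introduces the Data Science Venn Diagram from Drew Conway in his article. It shows three overlapping circles - hacking skills, math and statistics knowledge, and substantive expertise - with data science at the intersection of all three.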


My experience is that a lot of data discovery experts come from an IT background - databases, reporting, etc.  Very much a "computer science" type of person.  They may have built up substantive expertise in many areas of business and probably have a great set of "hacker" skills developed over the years, but perhaps lack the core statistical skills and understanding.

This is where I found myself over the last number of years.  I have a great range of "hacking" skills that let me get at data and get it into a form that I can use to answer business questions, where I can apply my built-up business expertise.  But Drew Conway identifies this as the "Danger Zone!" on his Venn diagram - someone who knows enough to be dangerous.

I think that I recognized this in myself a while ago, so I decided to take action.  I started with just reading - I highly recommend How to Lie with Statistics by Darrell Huff - and then moved on to taking Statistics One on Coursera.  I even find that regularly listening to More Or Less from BBC Radio 4 is an education in itself.

Massive Open Online Courses (MOOCs) are a great way for people to fill the gaps in their knowledge.  They do require a commitment of time, but I think that it is worth it.  One of the great things about them is the ability to interact with others from right around the world.

I see a new one from Coursera called Introduction to Data Science is scheduled to start in April 2013.  Time to sign yourself up?


Stephen Redmond is CTO of CapricornVentis, a QlikView Elite Partner. We are always looking for the right people to join our team.
Follow me on Twitter: @stephencredmond