Deconstructing sex appeal


30 March 2014
Tan Ee Sze

Math was never my strong point. And now they tell me that to get a shot at anything near to the sexiest job on the planet this century, I will need math.

Some time back, no less august a publication than the Harvard Business Review described the data scientist as “the sexiest job of the 21st century”. If HBR says so, it must be so. (Although some of you may prefer to trust People magazine on this.) HBR pointed to the LinkedIn story, a success achieved more on the back of statistical analysis than the technological ingenuity of social networking. Then of course there is Google, with its unnerving ability to serve up ads about products that you thought you had just thought of. And we have Amazon shipping products before you actually order them…

While much of the buzz surrounding all these developments has had a distinctly IT hum to it – IT vendors presumably having a bigger marketing and PR budget than statisticians – there seems to be no running away from the fact that much of it is underpinned by mathematical acumen.

Speaking at a data analytics seminar earlier this year, Greg Whelan, Director of Data Science at Pivotal Asia Pacific and Japan, made no bones about this. “You can’t do data science without the maths,” he said.

So, coming from an IT perspective, how does IT expertise fit in with data science?

I would say, the same way it fits into almost every other area where tech is being applied.

We know that tech is an “enabler” of many things. Data scientists need data in order to do their job. As data volumes grow at stupendous rates, these data scientists will need technology platforms and tools that will enable them to dive through the deluge and get at the data they need in double-quick time, not just to analyse current trends but also to predict the future.

But we also know the addendum to this: for tech to do a good job as an enabler, the people involved in developing tech tools and platforms need to have a good understanding of the domain in which the tech is being applied. This means knowing enough maths to know what the data scientists need to do, and knowing enough to build them the tools they need to build their models and test their hypotheses, and the engines to help them deliver insights and recommendations that are relevant, timely and easy for the business to understand and act upon.

It looks like there is no running away from some math. But the clarion call today seems to be to morph more IT professionals into data scientists in order to meet the worldwide shortfall of expertise in this area.  Are we clambering up the wrong tree?

Would it be possible instead to deconstruct the amalgamation of expertise that makes up the data scientist, gain better clarity on the different skillsets needed for different roles within the data science ecosystem, and build the knowledge equivalent of APIs for these skillsets to “talk” to each other?

Just as the software development lifecycle provided IT with a methodology for engaging business in the software development cycle, could we develop a methodology for IT and the mathematicians to engage with each other, perhaps a more fluid, agile and iterative methodology in keeping with today’s trends, but a methodology nonetheless?

This way, our data scientists could still be (for the most part) statisticians, and our technologists could still be (for the most part) technologists. Both roles will continue to be important to the data ecosystem, but talent development efforts in this space could be focused not so much on pushing technologists into statistics as on pushing new student intakes into the mathematics departments of universities and training institutions, whilst ensuring sufficient cross-pollination of courses for each to know what the other is doing.

There will still be huge opportunities for IT professionals in the data science space, just not necessarily as data scientists. And for those technologists who are not totally seduced by the allure of regression models and extremely randomised trees, it would be good to know that there are other branches to climb.