Big Data creates demand for analytics skills

OVUM VIEW

Summary

The emergence of Big Data promises to create IT jobs, with business intelligence (BI) and analytics expertise the key drivers. As Big Data increasingly becomes part of corporate IT strategies and infrastructure, organizations will be on the lookout for specialist analytics skills to unlock business value from it. Technical skills around Hadoop, MapReduce, and proprietary commercial Big Data frameworks are scarce and in high demand. However, Ovum believes the role of “data scientist” will be instrumental in turning Big Data into a valuable business asset. While a broadening and maturing set of tooling might lower the bar, widespread adoption of Big Data will depend on having a vibrant, interconnected, and highly skilled data-science community.

The supply of Big Data analytics skills is expected to lag behind demand

The era of Big Data, fueled by web logs, sensor systems, social media, and transaction data, is upon us, bringing with it an unprecedented opportunity to transform business. The technology is in place thanks to the convergence of scale-out storage and sophisticated analytics. As organizations attempt to harness the huge volumes of data they collect, it is becoming clear that there is a shortage of skills to make Big Data work for business, with Deloitte estimating a shortage of nearly 180,000 skilled Big Data professionals in the US over the next five years.

Analytics skills in particular have always come at a premium, which perhaps explains the recent push to deployment models that shield business users from the complexities of BI development. Research from McKinsey & Co suggests that US organizations are facing a shortage of 200,000 IT staffers with deep analytics skills. The natural intersection of Big Data and analytics will widen this skills gap even further.

This shortage of skills, coupled with a slow shift in attitude away from traditional relational analysis methods, needs to be addressed if organizations are to grasp the full benefits of Big Data.

Get ready for the decade of the data scientist, but at a healthy premium

Skills around installing, managing, provisioning, and scaling out Hadoop clusters are of course important. These skills relate specifically to managing, securing, and optimizing large, scaled-out clusters and storage infrastructures, and they typically inform decisions on whether Hadoop is located in the cloud or on-premises, which vendor and Hadoop distribution is used, and how large the cluster should be. Hadoop engineers, in turn, create and build the distributed MapReduce data-processing algorithms that run on the cluster.
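For readers unfamiliar with the MapReduce pattern mentioned above, the following is a minimal, single-machine sketch in Python. It is not Hadoop code and does not use any Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the two-field "path status" log format are assumptions made purely for illustration. A real Hadoop job distributes the same three steps across many nodes.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Emit (status_code, 1) for one log line; skip malformed lines.

    Assumes a hypothetical log format of "<path> <status>".
    """
    fields = line.split()
    if len(fields) >= 2:
        yield fields[1], 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["/home 200", "/cart 500", "/home 200", "malformed"]
    print(run_job(sample))
```

The map and reduce functions hold no shared state, which is what allows a framework to run them in parallel on separate slices of the data.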

These data plumbing skills are essential for data input and management, but they only address a part of the Big Data puzzle. Analytics skills are also needed to uncover business insights in the data that drive better decision-making. It is these analytics skills that are most in demand, and the hardest to find. They relate to the “science” of Big Data, using statistical and mathematical techniques to interrogate data and turn it into positive business outcomes. This requires a unique blend of technical, analytic, and business skills to align structured and unstructured data to the business.

Job descriptions for data scientists are starting to gel. While statistical and mathematical skills form a core part, creative thinking about how Big Data can improve aspects of the business, such as new operating policies or customer engagement models, is equally important for success. Data scientists should have an aptitude not only for hard programming skills in SAS, SPSS, and R, but also for understanding how to display or visualize information in a business context. Data science is therefore a business practice, rather than a defined set of statistical or technology competencies.

These skills will not come cheap. Ovum expects data scientists to command skyscraper salaries in the same way that SAP ABAP programmers did in the late 1990s. However, the emergence of tooling and vertical industry or process-specific solutions will take some of the edge off soaring demand.

It is still unclear whether Big Data will create new jobs or necessitate retraining

The skills requirements for Big Data follow a familiar script: a need for IT professionals proficient in technology platforms, infrastructure, and data management, along with analytics and business process/vertical domain expertise. While a shortage of Big Data skills might be imminent, what is less clear is whether Big Data will create brand-new jobs or force organizations to retrain existing IT staff.

Hadoop is threatening to become the new data warehouse for many organizations, with many of the basic data management tasks around Hadoop similar to those required for traditional relational database and data warehouse environments. DBAs charged with administering Oracle, IBM, or Teradata warehouses might now have to redefine their roles, loosen their outlook on traditional data modeling, and refresh their skills to administer Hadoop clusters, and in many cases to integrate Hadoop environments with existing relational database technologies.

Experienced Java or C++ programmers could also find greater opportunities to extend their skills (and salaries) with MapReduce. BI analysts tackling Big Data are likely to gravitate toward scripting languages such as Python, Perl, Bash, and AWK, which are emerging as the staple tools for data scientists.

Big Data will also attract a wave of demand for analytics skills around predictive modeling, data mining, natural-language processing, content analysis, social network analysis, and sentiment analysis. This is already leading to productized Big Data offerings such as R for advanced predictive and statistical analysis.

For now, service providers are plugging the skills gap. Indications are that revenue generated by consulting firms and systems integrators organizing formal Big Data practices focused on Hadoop will initially be much greater than that generated by the NoSQL and Hadoop products hitting the market. Vendors are starting to redress the skills imbalance through training, tooling, solutions, and services, with IBM, Cloudera, Hortonworks, and MapR all providing training courses in Hadoop.

Analyzing Big Data is an emerging opportunity in data science, and the race is on. Organizations that want to be at the front of the pack will need to equip themselves with the right mix of technical and business skills and competencies required to harness its potential benefits. This will be the biggest challenge to Big Data adoption in the next couple of years.

APPENDIX

Further reading

For an in-depth analysis of the Big Data skills shortage, see Ovum’s forthcoming report: “Where are the Big Data Skills?”

Disclaimer

All Rights Reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the publisher, Ovum (an Informa business).

The facts of this report are believed to be correct at the time of publication but cannot be guaranteed. Please note that the findings, conclusions, and recommendations that Ovum delivers will be based on information gathered in good faith from both primary and secondary sources, whose accuracy we are not always in a position to guarantee. As such Ovum can accept no liability whatever for actions taken based on any information that may subsequently prove to be incorrect.