Big Data: Big Opportunity
Every once in a while, the IT industry invents a new buzzphrase and everyone rushes towards the latest wave. Often this creates a bubble that bursts, but sometimes it leads to something genuinely sustainable, such as cloud computing. Now we have the latest buzzphrase to emerge: Big Data.
Firstly, let's look at the Wikipedia definition of Big Data:
"In IT, Big Data consists of datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analytics and visualising. This trend continues because of the benefits of working with larger and larger datasets allowing analysts to spot business trends, prevent diseases, combat crime. Though a moving target, current limits are on the order of terabytes, exabytes and zettabytes of data.
Scientists regularly encounter this problem in meteorology, genomics, connectomics, complex physics simulations, biological and environmental research, Internet search, finance and business informatics. Data sets also grow in size because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers, wireless sensor networks and so on.
Every day, 2.5 quintillion bytes of data are created and 90% of the data in the world today was created within the past two years. One current feature of Big Data is the difficulty working with it using relational databases and desktop statistics/visualization packages, requiring instead massively parallel software running on tens, hundreds, or even thousands of servers.
The size of Big Data varies depending on the capabilities of the organisation managing the set. For some organisations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
From a technology perspective, Big Data lies at the heart of an Open Source initiative called Hadoop, which was developed in large part at Yahoo! and is now being exploited by a number of commercial organisations, including new players such as Cloudera and Hortonworks. This also relates to innovation from the more familiar cloud computing players, in the form of Amazon Web Services (AWS) and Google App Engine (GAE) with MapReduce: a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers.
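For readers who have not met MapReduce before, the idea can be illustrated with a word count, the customary first example. What follows is only a minimal single-process sketch in Python, not Hadoop's or Google's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are hypothetical, chosen purely for illustration. In a real cluster, the framework would run the map and reduce steps in parallel across many servers and handle the grouping step itself.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence, in a single process."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, for illustration only.
documents = ["big data big opportunity", "big data in the cloud"]
counts = word_count(documents)
print(counts)
```

The point of the split is that each document can be mapped independently and each word can be reduced independently, which is what allows the work to be spread across tens, hundreds or thousands of servers.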
So now, in simple terms, we have something that conceptually looks like what went before in IT: a never-ending quest to turn timely insights from data into business value, where, at the extreme, new business models and ventures may be created and competitive advantage generated. Of course, we have seen this before: data warehousing, massively parallel processing (MPP) computers and so forth.
Whilst Big Data purists might not like this comparison with data warehousing, what we now face is a significantly larger amount of data being produced and consumed, thanks to the explosion of the Internet, social media and so forth. Big Data looks to me like data warehousing externalised and socialised - in the cloud.
If Going Non-Linear is all about knowledge-intensive firms moving beyond the constraints of billable people-time, then Going Non-Linear with Big Data can become a foundation for growing new cloud ventures. These ventures can leverage the new commercialisations of Hadoop and the like and create a new Polymath IT ecosystem: effectively, software engineers plus a new breed of talent, Data Scientists.
According to Forbes:
"To get the greatest business value from Big Data, companies are looking for multi-skilled experts who understand programming, large-scale mathematics, statistics and business. They call this new role a Data Scientist. Like Big Data, the term Data Scientist is both catching on and generating skepticism."
As someone engaged in building technology teams in nearshore locations to combat the shortage of software engineering skills, I see an interesting possibility here: to build a talent pool of Data Scientists who can help enterprises engage in Going Non-Linear with Big Data, helping to create new Non-Linear Revenues, intellectual property (IP), cloud apps and services.
From a personal perspective, I have a commercial interest in growing a nearshore network of high-value software engineers (and now Data Scientists too) in Ukraine. But this relies on inherent demand for software coding, testing and support. In effect, there has to be clear pull demand for skilled people who are not easy to find onshore.
In the nearshore, there is a price advantage for skilled labour, but this is becoming less important as the driver for building Own Software Development Teams in Ukraine or Belarus at Ciklum. The shortage of talent, the ability to scale and, more importantly, the measurement of best practices are much greater needs to serve.
Just as we have to source, train, motivate and manage the best Java, Ruby or Apex software developers, so too we now need to build an effective way of scaling this new breed of Data Scientists, if Big Data is to scale effectively and Going Non-Linear is to become a reality for Next Generation ISVs and other new Big Data cloud ventures.
To quote McKinsey:
“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”
This, of course, may be pro-rated across Europe and the emerging economies too. Hence, if Big Data is going to become a success in the form of new ventures and services, and contribute to growth from Going Non-Linear, then this skills crunch (rather than money) could become the ultimate bottleneck for all players entering this new marketspace.


