In May, McKinsey published a report entitled “Big Data: The next frontier for innovation, competition, and productivity” that focuses on the disruption caused by Big Data. It states: “The use of data has long been part
We’ve seen this movie before: big data sets used to fuel decision support, now known as business intelligence (BI). In the past it may have been called VLDB (Very Large Database). So, what’s different here? It really comes down to two key factors: first, the commoditization of the ability to process large data sets; second, the use of cloud computing platforms.
The reality is that processing huge data sets for any business purpose has typically meant paying millions of dollars to a database vendor for the enabling technology. With the advent of Hadoop, which applies a divide-and-conquer approach called MapReduce to data processing, the big data problem can now be addressed with an open source solution. Hadoop, when married with a cluster of commodity hardware, is able to process terabytes of data in a matter of seconds or minutes. This once took hours or days with traditional database technology and models.
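To make the divide-and-conquer idea concrete, here is a minimal sketch of the MapReduce pattern in plain Python. It does not use Hadoop or its API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample data are assumptions chosen purely for illustration, and everything runs in a single process rather than across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # On a real cluster each chunk could be mapped on a different machine.
    # Here the chunks are simply processed one after another (illustration only).
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input, already split into three independent chunks.
chunks = ["big data big", "data cloud", "big cloud"]
counts = word_count(chunks)
print(counts)
```

The point of the pattern is that each chunk is mapped independently, so adding more machines lets more chunks be processed at the same time; only the grouping and reduction steps need to bring results back together.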
However, it’s not until technology such as Hadoop is married with the cloud that this development becomes interesting. Big data systems need to distribute their processing across many machines, and cloud providers offer access to hundreds of servers that may be provisioned as needed to support it. Cloud computing platforms such as Amazon Web Services (AWS) and Rackspace make that process quick and cheap.
The power of Big Data is perhaps less understood than the technology required to achieve it. McKinsey talks about five key advantages:
- Making big data more accessible in a timely manner;
- Using data and experimentation to expose variability and improve performance;
- Segmenting populations to customize actions;
- Replacing and supporting human decision-making with automated algorithms; and
- Innovating new business models, products, and services.
In other words, the true power of this technology lies in using big data to discover information about the health of the business and to make informed and timely strategic decisions.
Instead of just generating report after report, Big Data systems can automatically trigger changes to core business processes based on analysis of most of the data sets in the business. Moreover, they make it possible to see all operational data directly, instead of dealing with small, specialized databases set up specifically to support BI.
The technology is indeed a game changer, but the path to big data for most enterprises is not that well understood. My suggestion is to leverage what’s best in the world of SOA as the core approach to transforming your enterprise in this direction. I have to agree with McKinsey: Big Data is indeed the next frontier for IT.
This was first published in June 2011