‘Big Data’ was probably one of the most used and abused terms in the IT industry throughout 2012. Conferences, publications, vendors, and analysts alike talked tirelessly about the opportunities and challenges created by Big Data. However, what is really amazing is the speed at which organizations are trying to learn and assimilate radically new architectures to process their data. For instance, recent estimates from IDC and GigaOM predict the Big Data market will be $26B or more by 2016. It seems like 2012 was all about experimentation and setting expectations. Now, in 2013, it’s time to walk the talk! So, what can organizations expect as they embark on their Big Data journey? Well, for starters, it’s important to recognize some of the key challenges they will face.
The Big Data skills gap. Hadoop has emerged as the de facto platform to process Big Data. However, as my colleague Steve Totman pointed out in a recent ITWorks blog, technical skills in Hadoop, MapReduce and all things Big Data are becoming more and more expensive and difficult to find. Therefore, it is critical for organizations to find tools that let them apply the skills they already have in-house (writing Java, designing data processing flows with a GUI) to take advantage of this highly scalable framework.
Weed out the noise. Early last year, Gartner pointed out the relationship between noise and Big Data. I couldn’t agree more. As companies start to collect, store and process Big Data, they need to be very careful to filter out the bad data. As noise grows, the value of Big Data goes down exponentially. Therefore, organizations will need the right tools not only to connect to all relevant sources of data, but also to pre-process and cleanse that data before loading it into their data processing framework, which will most likely be a Hadoop environment.
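To make the idea of cleansing before loading a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a recommendation of any particular tool: the record layout (`user_id`, `event`, `timestamp`) and the function names `cleanse` and `filter_noise` are assumptions made up for this example. It drops records with missing or blank fields, normalizes the rest, and discards exact duplicates before anything reaches the processing framework.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical schema: the fields every usable record is assumed to carry.
REQUIRED_FIELDS = ("user_id", "event", "timestamp")


def cleanse(record: dict) -> Optional[dict]:
    """Return a normalized copy of the record, or None if it is noise."""
    cleaned = {}
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None:
            return None  # field missing entirely
        text = str(value).strip()
        if not text:
            return None  # field present but blank
        cleaned[field] = text
    # Normalize event names so "Click" and "click" count as the same event.
    cleaned["event"] = cleaned["event"].lower()
    return cleaned


def filter_noise(records: Iterable[dict]) -> Iterator[dict]:
    """Yield cleansed records, skipping bad rows and exact duplicates."""
    seen = set()
    for record in records:
        cleaned = cleanse(record)
        if cleaned is None:
            continue
        key = tuple(cleaned[field] for field in REQUIRED_FIELDS)
        if key in seen:
            continue  # duplicate of a record already emitted
        seen.add(key)
        yield cleaned


if __name__ == "__main__":
    raw = [
        {"user_id": " 42 ", "event": "Click", "timestamp": "2013-01-15T10:00:00"},
        {"user_id": "42", "event": "click", "timestamp": "2013-01-15T10:00:00"},
        {"user_id": "", "event": "click", "timestamp": "2013-01-15T10:01:00"},
    ]
    for row in filter_noise(raw):
        print(row)
```

In a real deployment the duplicate check would have to be distributed rather than held in one in-memory set, but the principle is the same: the cheaper it is to reject a bad record up front, the less noise ends up in the cluster.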
It’s the economy, stupid. OK, OK, I knew that line would catch some attention, but it’s actually true. What we’re seeing is a shift from big, heavy architectures whose costs grow exponentially just to keep up (that is, to scale to meet the demand for more data) to seemingly low-cost, highly scalable approaches like Hadoop. While Hadoop can scale much more cost-effectively by adding commodity hardware (nodes), organizations will hit a wall at some point. Think about maintenance, cooling, power, and even real estate costs. As Hadoop implementations grow, tools that can maximize the performance of each node will become a critical factor for success.
2013 can be the year that Big Data technology gains traction, and this will only ramp up Hadoop adoption. Organizations looking to reap the benefits of Big Data have to be smart about their Big Data strategy, not only as it pertains to Hadoop (yes, Hadoop is not the holy grail for everything) but also as they move along a path that should eventually lead them to answer the big questions. After all, isn’t that what Big Data is all about?