Takeaways from GigaOM’s Structure:Data 2012 Conference
Last week, I attended two days of sessions at GigaOM’s Structure:Data Conference in New York City, where over 700 attendees came together to discuss the business- and industry-transformative nature of Big Data, along with the latest technologies and approaches for managing it.
What struck me this year is how the conversation has evolved. Big Data is no longer treated as an infrastructure-only issue; the realization now is that the Big Data stack requires contributions from everyone, from the bottom infrastructure layer up through the top application layer.
The following key themes emerged from the onsite discussions and will be the focus as the community continues to develop the Big Data stack:
1) It’s all about high-performance computing and speeding up analytics as data volumes grow exponentially. The pain points for unstructured and structured data are different: unstructured data requires better visualization, while structured data requires more cleansing, making filtering and grouping much more critical. One of the speakers referenced a quote from Clay Shirky: “Information overload is not the problem. It’s filter failure.”
2) The line between personal and business behavior is blurring as analytics moves out of the IT realm and into the hands of business users. As a result, there is an expectation that data will be easier to consume, for example through visualization and collaboration capabilities.
3) Real-time decision making through predictive analytics and machine learning is becoming essential, driven by sensor data, digital exhaust, and the need to gain insight into consumer behavior.
As such, there’s a realization that the Big Data market is fragmented, and there is plenty of opportunity to contribute to building the Big Data stack. Software packages and tools need to be built on top of Hadoop, for example, to increase enterprise adoption. Currently, most of the available enterprise software is proprietary. Offering applications layered on top of Hadoop will spur the Big Data market, leading to more open source contributions and additional opportunities for startups.
Syncsort has a lot to offer in the areas of performance, data integration, and processing – all critical components of the Big Data stack. We can deliver and run ETL over Hadoop without requiring a brand-new development team and skill set. One of the speakers suggested that businesses should consider adopting Hadoop only if they are willing to dedicate a separate team; Syncsort’s offering eliminates this requirement for the enterprise. We can also efficiently move data in and out of Hadoop, which, as John Webster points out in his CNET post, continues to be an issue.
To reach the holy grail of Big Data management, the focus needs to be on building a top-to-bottom Big Data stack, which will require different segments of the market to come together.