AWS re:Invent 2013 — Big Data in the Cloud
I was in Las Vegas for the second annual Amazon Web Services re:Invent conference this week, where we were launching our exciting new product, Syncsort Ironcluster™, the first product of its kind available on the Amazon Web Services Marketplace that runs natively on Amazon Elastic MapReduce (EMR). The energy level at the event was impressive, with over 9,000 attendees and an equal number tuning in via social media.
The conference had a particular focus on deploying Big Data technology in the cloud, and in my discussions with analysts, partners, and customers at re:Invent there were a few recurring themes that I wanted to relay:
– A huge percentage of the customers who are deploying Hadoop clusters as their initial foray into Big Data technology are funding the buildout of these clusters by offloading expensive data processing workloads and storage from legacy systems. This kind of offload is viewed as a no-brainer “Phase 1” Hadoop project because it saves so much more money than it costs to implement the cluster.
– Even the most optimistic organizations are not fully prepared for how much money they end up saving by using Hadoop to offload processing workloads and storage from legacy systems.
– Everyone wants this offload process to be as seamless and turnkey as possible. Even though the offload saves staggering amounts of money, organizations want to conserve the money they free up for use in higher-value “Phase 2” Big Data advanced analytics projects that leverage the long-term active archive data in the cluster. This is one reason there’s so much interest in turnkey offload technologies like Ironcluster that also make it much easier to build these next-generation analytical systems, make them run faster, and make them more secure.
– Organizations view cloud-based Hadoop platforms, like Ironcluster on Amazon EMR, as a great way to get started with these projects, and an excellent complement to on-premises Hadoop deployments. They’re particularly attracted to the Amazon platform’s virtually infinite scalability, high performance, and continuously decreasing prices.
I’m not sure I have any great advice for the legacy data warehouse or ETL vendors who are seeing customers delay purchases or shut down existing spending. It’s a tough situation when a new platform emerges that’s this much less expensive than the product you’re selling. The Hadoop and NoSQL platforms just have so much venture capital funding flowing into them, and so many companies building on the same core infrastructure, that it’s really hard to keep up with how much better and cheaper they’re getting every day.
I was interviewed for a live broadcast on theCUBE yesterday at the event, where I discussed these topics as well as our acquisition strategy. You can watch the SiliconANGLE video here: