Big Data: Bridging the Old and the New
2014 was a pivotal year for many Big Data open source platforms such as Apache Hadoop. We saw customer activity transition from early-stage experimentation to major production deployments at some of the largest enterprises in the world. Billions of dollars of investment capital flowed from the venture community into the companies, such as Cloudera, MapR, and MongoDB, focused on commercializing the leading Big Data open source platforms. And with the 2014 Hortonworks initial public offering, the public equity markets got their first pure-play Hadoop investment vehicle.
After its IPO, Hortonworks is a $1 billion Hadoop company (again). Credit: Nasdaq
This flood of investment has started to have a material impact on the pace of improvement of these open source platforms and is affecting customer decision-making. Many customers, even traditional enterprises, want to focus their data software investments on the platforms that will improve the most quickly, even if those platforms are not currently quite as mature as others. Older software vendors, bogged down by their more conservative investor bases, are having trouble spending as much on their own offerings as the fleet of well-resourced, venture-backed companies is investing in open source projects like Hadoop. Because the newer platforms are open source, customers also have a specific emotional attraction to them: when building on Hadoop, customers avoid the anxiety of becoming dependent on a specific vendor for years or decades. Analyst firms such as Gartner, Forrester, and Wikibon are advising customers to use open source whenever possible when choosing software for the data infrastructure layer. Many customer decision-makers are taking this guidance very seriously, as they vividly remember the challenges of vendor lock-in from prior eras. The industry appears to be more tolerant of proprietary software at the data application layer than at the data infrastructure layer, at least for the time being.
Syncsort is in a fortunate position. Like a startup, we don’t have a large legacy data platform business to defend. Unlike a startup, we already have a global presence, thousands of existing customers across 85 countries, growing revenue and profits that we can use to acquire high-value software companies with complementary technology, and the exceptional, highly relevant intellectual property our R&D teams have built up over the years.
We came into 2014 focused on a specific transformation taking place in the software industry: fast-growing, young companies were offering powerful new ways to manage, store, and analyze data, upending large existing markets with improved performance, dramatically lower price points, and cutting-edge capabilities. Many of these open source data platforms were effectively the commercialization of battle-tested software developed initially for in-house purposes by Google, Facebook, Amazon, Yahoo, LinkedIn, Twitter, and others.
Our strategy was to partner with the fastest-growing of these companies and to deliver high-value software that could run natively on the new data platforms that were experiencing the most explosive growth. In 2014, we continued to be an active contributor to the Apache Hadoop community, and extended technology partnerships with Amazon Web Services, Cloudera, Docker, Hortonworks, MapR, Splunk and Waterline Data Sciences.
This strategy did not go unnoticed. Our Apache Hadoop-based product line is now running in production on some of the largest industrial-scale Hadoop clusters in the world, as well as on Amazon Web Services. Many of these deployments are “offload” scenarios, where customers are moving data and workloads from more expensive legacy systems into Hadoop. Forbes featured Syncsort in its “Top 10 Big Data Pure-Plays 2014” report, and Database Trends and Applications magazine selected Syncsort’s Hadoop solution as a “Trend-Setting Product for 2014” and named Syncsort to its 2014 DBTA 100 “Companies That Matter Most in Big Data” list.
Towards the end of last year, our prolific mainframe software research lab launched the Syncsort Ironstream product, which allows customers to perform advanced analytics on critical mainframe data from within Splunk Enterprise and Splunk Cloud. In 2015, we expect to see more turnkey data-oriented applications emerge that are built natively on the fastest-growing platforms, such as Splunk. These turnkey applications will be able to take advantage of the new sources of data that are coming online every day, such as the mainframe data we’ve just made available to Splunk customers. Expect much more from us in this space in the coming year.
Lonne Jaffe at Hadoop Summit 2014
In 2015, I also expect to see a continued acceleration of customers moving data and data workloads off legacy systems and into newer Big Data systems, deepening the offload trend from last year. Syncsort will continue to invest in Project SILQ, which makes it easier for customers to move legacy data warehouse and mainframe computing workloads into Hadoop. In 2015, a new layer of open source projects built around Hadoop will continue to gain traction, including Apache Spark, Kafka, Parquet, Avro, Sqoop, and the various Google Dremel clones, such as Impala and Drill. Within the next couple of months, expect some important announcements out of our research labs around these projects and many others.
2015 should be an exciting year!