In part 1 of Paige Roberts’ candid interview with MapR Chief Application Architect Ted Dunning, he discussed how customers are using Apache Flink and Apache Apex, and why they’re a step ahead of older distributed compute frameworks. In this post, he contrasts the advantages of new platforms like Flink versus the old streaming standby, Apache Storm.
Paige: Does Flink performance compare favorably to Storm?
Ted: No, no, no. It’s inconceivably faster.
It’s much faster than Storm?
Much faster. Like 40X faster on common benchmarks.
It has to do with the fact that it’s more modern. Storm is beginning to backport some of these changes into their code, though. The Storm community is a good community.
Taylor Goetz is one of the leaders in that community. He’s one of the best people who gets the Apache projects’ philosophy just out of the box. He’s amazing. Taylor says that he learned those skills of how to work with people and build a community while working as a volunteer fireman. If you’re going to die if you don’t do the community well, I think that is slightly more stressful.
It’s a good motivation [laughter].
He brings those skills based on something with that level of importance to something less important like software development. I hope nothing dies if the software doesn’t work, but it could happen. [Laughs] It means that he’s good at the community aspect.
I was talking to a friend of mine who works on Metron, and they use Storm. I asked, “Why did you choose Storm?” He said it was partly because he had Storm skills on his team, and he had some pre-existing code that was already built in Storm. But the other reason was that a lot of the scaling issues and problems Storm had earlier on have been fixed. The community has kept it up and kept modernizing.
And it’s gotten faster. A lot faster over time. But it still suffers from the inherent original architecture.
The original architecture is that packets are acknowledged, and entire sets of packets are retransmitted if, after a time, not all of those packets have been acknowledged. It’s a very clever system for keeping track of whether or not every packet in a batch has been acknowledged. But that doesn’t let you do this Chandy-Lamport streaming checkpoint. And it doesn’t support proper exactly-once processing. When you do add that, at least in the original way it was added to Storm, it was called Trident.
Trident involved waiting until a certain phase of computation was complete and then writing all of those at once in Storm. That can be really slow, and it can be pretty memory intensive to store all that. And it was all driven from that first architectural decision. It’s taken a lot of time, a lot of thought to move to the newer style of architecture.
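[Editor’s note: The acknowledgement tracking Ted describes can be sketched in a few lines. Storm’s acker is generally described as keeping one XOR checksum per batch (a “tuple tree”): every tuple ID is XORed in once when the tuple is created and once when it is acknowledged, so the checksum returns to zero only when everything has been acknowledged. The Python below is a minimal illustration of that idea, not Storm’s actual code. The names `TupleTreeTracker`, `report`, `needs_replay`, and `new_tuple_id`, and the 30-second timeout, are assumptions made for this example.]

```python
import secrets


def new_tuple_id() -> int:
    """Return a random nonzero 64-bit ID (hypothetical helper)."""
    return secrets.randbits(64) or 1


class TupleTreeTracker:
    """Tracks one batch of tuples with a single XOR checksum.

    Illustrative only: the class name, method names, and default
    timeout are assumptions, not Storm's API.
    """

    def __init__(self, started_at: float, timeout_s: float = 30.0):
        self.checksum = 0
        self.started_at = started_at
        self.timeout_s = timeout_s

    def report(self, xor_of_ids: int) -> bool:
        """Fold in the XOR of IDs just created and/or acknowledged.

        Returns True once every ID has been seen exactly twice,
        i.e. the whole batch is acknowledged.
        """
        self.checksum ^= xor_of_ids
        return self.checksum == 0

    def needs_replay(self, now: float) -> bool:
        """True if the batch is still incomplete after the timeout."""
        return self.checksum != 0 and (now - self.started_at) > self.timeout_s


def demo() -> bool:
    """Walk one root tuple and two child tuples through the tracker."""
    root, a, b = new_tuple_id(), new_tuple_id(), new_tuple_id()
    tracker = TupleTreeTracker(started_at=0.0)
    tracker.report(root)          # source emits the root tuple
    tracker.report(root ^ a ^ b)  # worker acks root, emits children a and b
    tracker.report(a)             # child a acknowledged
    return tracker.report(b)      # child b acknowledged, checksum is zero
```

[The design point is that memory per batch stays constant no matter how many tuples it fans out into. The cost is the one Ted names: when the timeout fires, the tracker only knows that something is missing, so the whole batch is replayed from the source.]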
Flink, for instance, came out of the Stratosphere project, which was researched at a number of universities in Europe for years. As many as a hundred doctoral researchers, at the time, were working on that. There are probably 40 to 50 academics working on Flink-related topics now. It takes a lot of time and academic and intellectual effort to improve things on a fundamental level.
To redevelop it from scratch almost.
Absolutely, from scratch. But we had to have something working before that was done. So, all of these different tradeoffs are important for the community to move forward, the bigger meta-community.
In the third part of this conversation, Ted Dunning and Paige Roberts discuss the secrets of building a good open source community, and some fascinating new Apache projects that you may not have heard of yet.