Expert Interview with Kevin Deierling of Mellanox on the State of OpenStack
Business has been strong for Mellanox in the last quarter. What sectors are driving this growth?
Previously, our business was tied primarily to the high-performance computing (HPC) market; but over the past several years, we’ve seen significant diversification such that the HPC segment now represents less than half our revenues. HPC continues to grow; however, other segments of our business are growing even faster and our revenue base continues to diversify. Aside from HPC, the other three major growth drivers for our business are Cloud/Web 2.0, enterprise, and storage. In addition, we also acquired Kotura and IPtronics, allowing us to enter the optical cable and transceiver market, which is a new source of growth for Mellanox.
The hyper-scale markets (cloud and Web 2.0) are the most obvious areas that explain our growth. We have a very strong position here, with four of the top five cloud providers adopting our networking solutions. It is also interesting to note that we continue to grow in enterprise and storage, areas where others are no longer experiencing growth. This is largely attributed to the fact that we are no longer selling legacy technology, such as Fibre Channel. Instead, we are enabling the parts of the enterprise that are growing, including the new class of private/hybrid cloud, Hyper-Converged Infrastructure, Big Data analytics and software-defined scale-out storage.
Can you talk about the role you see for major players like Cisco in the growth of both your business and standards in the space where Mellanox operates?
We actually don’t compete directly with Cisco because they focus primarily on traditional enterprise environments. That said, cloud and hyper-scale technologies have begun to trickle down into Fortune 500 enterprises in the form of private cloud and software-defined data centers – two areas of great strength for Mellanox. In this arena, Cisco can be a strong partner as well as a potential competitor.
On the standards front: Mellanox, along with companies like Microsoft and Google, was one of the founders of the 25 Gigabit Ethernet Consortium that pioneered the next speed generation of 25Gb/s Ethernet. Cisco has actually joined this effort as a key promoter of the technology. There is also now an effort within the IEEE to standardize the technology, making Cisco an important partner in developing this standard and ensuring that our 25Gb/s Ethernet adapters, switches, and cables work seamlessly with Cisco solutions.
Many people are unaware of Remote Direct Memory Access (RDMA). In fact, Microsoft reportedly supports RDMA for Windows Server 2012 via SMB Direct. What shifts in technology adoption would make this technology more widespread?
Indeed, Microsoft is one of the big vendors that have publicly talked about the way they use RDMA over Converged Ethernet (RoCE) to streamline their data center, but they are not the only hyper-scale user. Most people have no idea that as they surf the web and communicate with friends, RDMA is being used inside the data center to move and process the massive amount of data being shared.
RoCE technology will become more visible as it continues to penetrate the enterprise data center. One driver for this is Microsoft’s new Azure Pack Cloud Platform System for enterprise private cloud deployments. The idea here is to take all the great technology developed for Microsoft’s Azure public cloud platform and make it available for enterprise users. Obviously, to compete, enterprises need to adopt technologies like RoCE that have enabled cloud providers to achieve agility, efficiency, and a lower total cost of ownership for their data centers.
Nearly all of the big storage vendors are already using RDMA inside their big storage arrays. And once Microsoft Storage Spaces exposes enterprise customers to the kind of performance improvements possible with RDMA, the other vendors will follow suit. It’s a pretty straightforward transition to expose RDMA on the front side connection, and we’ve already seen this from Microsoft, IBM (GPFS/ESS), and NetApp. Finally, new protocols like Ceph and iSER/Cinder are starting to adopt RDMA too, so it is becoming more and more common.
What is the sweet spot for folks most likely to be interested in your OpenStack collaboration with OEM hardware vendor SuperMicro?
Our CloudX reference platform is a really fast and easy way for companies to get started with OpenStack. The work we are doing with SuperMicro and others on OpenStack is a great example of an open source initiative that enables building the most efficient and cost-effective clouds. Our CloudX OpenStack platform pulls together all of the pieces needed to build the same sort of cloud infrastructure used by the big hyper-scale cloud providers. Naturally, we think it makes sense to run this on the same hardware these guys do; and we’ve seen the big cloud providers move beyond 10GbE to 40GbE and now 25, 50, and 100GbE.
Mellanox has dominant adapter share here with around 90 percent of the market for speeds greater than 10Gb/s. Most of the revenue today is at 40Gb/s, but analysts are predicting very fast growth for 25GbE adapters and 100GbE for both adapters and switch links. These network speeds are really less about achieving high performance and more about achieving better efficiency and scalability. Our customers use our network gear to achieve higher virtual machine density and better Ceph/Cinder storage efficiency.
We have built our CloudX reference architecture with SuperMicro and the OpenStack distributions, such as Canonical/Ubuntu, Red Hat, and Mirantis. This CloudX reference architecture allows customers to get started using OpenStack quickly – with properly sized, integrated, and pre-validated platforms that scale efficiently.
What do you say to critics who argue that despite the coolness factor for InfiniBand, most of the market action is with Ethernet?
InfiniBand is more than just cool. InfiniBand is still the highest performance networking technology available. InfiniBand is at the heart of the highest performance and most scalable en- critical applications at the largest government and business enterprises in the world. These platforms are not only being deployed for enterprises, but also in the new generation of public cloud offerings. For example, Salesforce.com, the largest Software-as-a-Service provider, is hosted on Oracle Exadata platforms running on an InfiniBand backbone. And workloads like Hadoop, SAP, and Parallel Data Warehouse are being hosted on these appliances as well.
So while some data centers may not move to InfiniBand, for many Web 2.0 workloads and converged infrastructure appliances, there is simply no better technology. After all, clients may connect to these appliances with Ethernet; but at the end of the day, all you care about is how quickly and reliably these solutions respond to storage or transaction requests. And with InfiniBand inside, it’s faster, more efficient, and more reliable.
Some experts are concerned with the overhead associated with OpenStack, which seems to become significant as the node count grows to 100,000. They point to recent moves by Rackspace to step back from OpenStack. How is this concern being addressed?
First of all, scalability is a problem that Mellanox is really good at solving. We are at the heart of the majority of the largest data centers and supercomputers in the world and have shown scalability to hundreds of thousands of nodes. To achieve this level of scalability, hardware and software need to work together. A good example within OpenStack is Ceph. Here, as the node count grows, both the software and the hardware become stressed. This is where our hardware offloads remove much of the processing burden from the CPU. So by performing much of the network processing normally performed by the CPU, we free up compute resources to run the other parts of the Ceph software – making it higher performance and improving scalability. Frankly, these sorts of scalability issues are good problems to have as it means there are nice business opportunities available, and we will work with our software partners to provide the efficient hardware platforms needed to solve them.
Some are concerned that many network topologies being designed today rely on implicitly centralized rather than decentralized networks – with associated dependencies on high reliability, high throughput centers. Do you agree?
I don’t entirely agree. I actually think network topologies and management are orthogonal. It turns out that both centralized and decentralized management can work.
Today, most of the Ethernet deployed in the enterprise relies on decentralized network management, with each switch running independent routing algorithms. Of course, if you try to build a giant Fat-Tree topology with simple layer 2 protocols, then a spanning tree algorithm is going to prune away all those expensive links you bought for performance and reliability reasons. Instead, you need to confine the L2 domain to a rack and run advanced layer 3 protocols like BGP and ECMP to take advantage of a robust topology. This will work if you use the right network management architecture to take advantage of your topologies.
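To make the ECMP point concrete, here is a minimal Python sketch of how a leaf switch might spread flows across several equal-cost uplinks by hashing each flow's 5-tuple. The function name, switch names, and choice of hash are illustrative assumptions, not taken from the interview or from any vendor's implementation; real switches do this in hardware with their own hash functions.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Pick one equal-cost next hop for a flow by hashing its 5-tuple.

    flow is (src_ip, dst_ip, protocol, src_port, dst_port). Hashing keeps
    every packet of a flow on the same path, which avoids reordering,
    while different flows spread across all available links.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical leaf switch with four equal-cost uplinks to spine switches.
spines = ["spine-1", "spine-2", "spine-3", "spine-4"]
flow = ("10.0.1.5", "10.0.9.7", 6, 49152, 443)
chosen = ecmp_next_hop(flow, spines)
```

Unlike a spanning tree, which would block all but one of these uplinks, every link stays active and carries a share of the flows.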
The alternative to this is a centralized network management scheme such as software-defined networks (SDN). The key to achieving reliability here is a distributed-centralized management SDN architecture. Now “distributed-centralized” sounds like an oxymoron (an oxymouthful?); but when you really dig into what’s going on, it makes perfect sense. A good example in the storage world is Ceph’s CRUSH object mapping algorithm. It is centralized in the sense that there is a single, uniform view of the cluster; and it is distributed in the sense that the mapping of objects to nodes is not stored in a centralized metadata server, but rather can be generated by distributed entities using consistent hashing algorithms.
The same thing is happening with SDN, where you have a centralized view of the network with distributed virtual routers actually doing the heavy lifting. This hybrid implementation will likely become the norm. So you get the benefit of both fault tolerance and scalability.
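The distributed-centralized idea from the Ceph example can be sketched in a few lines of Python. This is not CRUSH itself; it uses rendezvous (highest-random-weight) hashing, a simpler technique from the same consistent-hashing family, and all names here are illustrative assumptions. The shared, "centralized" part is only the list of nodes; the placement of each object is computed independently by whoever needs it.

```python
import hashlib

def _score(obj_name, node):
    """Deterministic pseudo-random weight for an (object, node) pair."""
    digest = hashlib.sha256(f"{node}:{obj_name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def place(obj_name, nodes, replicas=2):
    """Return the nodes that should hold obj_name.

    Any client holding the same node list computes the same answer,
    so no central metadata server has to be consulted per object.
    """
    ranked = sorted(nodes, key=lambda n: _score(obj_name, n), reverse=True)
    return ranked[:replicas]

# Hypothetical four-node storage cluster.
cluster = ["node-a", "node-b", "node-c", "node-d"]
owners = place("volume-42/chunk-7", cluster)
```

A useful property of this scheme is that removing a node only moves the objects that lived on that node; every other object keeps its placement.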
Where do you think we are in the adoption curve for Software Defined Networks?
It’s still early days for SDN, but we already know that the concept works at scale. For example, InfiniBand implements a centralized network management scheme that was developed before it was even called SDN. And there are many large-scale deployments of InfiniBand for both HPC and Web 2.0 properties, so we know this works at scale. New SDN suppliers – VMware (Nicira), Nuage, PLUMgrid, and Midokura – are also building highly scalable centralized network management. These guys are really pioneering the distributed-centralized management mentioned above.
Some experts have convinced some folks that the future is clusters of “shared nothing” – local attached storage and cloud servers united in running a Lambda architecture. Agree or disagree?
I absolutely believe this is the right way to build scalable systems. However, locally-attached storage is sometimes confused with DAS in the narrow sense of dedicated storage. I definitely believe that scale-out storage is the future with hyper-converged “compu-storage” clusters, with locally-attached storage being the way to build out truly massive data stores. But this doesn’t mean that the locally-attached storage only supports the local compute node. It needs to be accessible by the entire cluster using a distributed architecture with low latency and low overhead. This is where technologies like ours come in to support Hadoop over GPFS and HDFS over RDMA, or iSCSI over RDMA and NVMe over Fabrics.
There are some great new consistent hashes like Ceph’s CRUSH algorithm that eliminate the requirement for a centralized meta-data store. Also, as stateless micro-services become the norm, it will become even more important to access the relevant state quickly without burdening the CPU. So it really relies on the type of networking solutions we provide.
What standards, current or pending, are on your radar?
The PCI-Express Gen4 standard being developed in the PCI-SIG is important as it allows us to continue providing faster network access to compute and memory in large clusters. PCI-Express Gen3 has run out of bandwidth, so we need to double or even triple the signaling rate. Everything we need is ready, but the question is: how fast can we bring products to market? Traditionally it’s been Mellanox and Intel leading the way here. We are confident we’ll have solutions next year, but we’re not sure who else will be there with us. It may be that along with us, it is other CPU, switch and FPGA vendors that take the lead here.
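As a rough illustration of why Gen3 runs out of bandwidth, the following Python sketch works through the arithmetic. The figures are assumptions drawn from the published PCI-Express per-lane rates (8 GT/s for Gen3, 16 GT/s for Gen4, both with 128b/130b encoding), not from the interview, and they ignore packet and protocol overhead, so usable throughput is somewhat lower.

```python
def pcie_gbps(gt_per_s, lanes, encoding=128 / 130):
    """Raw one-direction link bandwidth in Gb/s, before protocol overhead."""
    return gt_per_s * lanes * encoding

gen3_x16 = pcie_gbps(8, 16)    # roughly 126 Gb/s
gen4_x16 = pcie_gbps(16, 16)   # roughly 252 Gb/s

# A hypothetical dual-port 100GbE adapter needs 200 Gb/s at line rate.
dual_port_100gbe = 2 * 100
fits_gen3 = gen3_x16 >= dual_port_100gbe
fits_gen4 = gen4_x16 >= dual_port_100gbe
```

Under these assumptions a Gen3 x16 slot can feed a single 100Gb/s port but not two, while doubling the signaling rate in Gen4 restores the headroom.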
We are also working with partners to drive InfiniBand forward within the IBTA to faster speeds and to support new accelerators and virtualization functionality. In addition, we’ve seen an influx of new companies into the IBTA (Avago/Emulex, Broadcom, Cisco, Microsoft and QLogic) to support the RoCE specification and ecosystem. RoCE is now fully routable and the congestion management mechanisms are standardized. In addition, the congestion management algorithms are being more broadly understood and exposed. These congestion management developments are actually quite important for networking at large, because at 25 and 100 Gb/s, you simply can’t afford to drop packets as a standard way of doing business.
Furthermore, I already mentioned the IEEE 25GbE study group, but we are also already working on 400G with IEEE 802.3bs.
And on the optical side, we are shipping PSM4-1550 and are one of the founding members of the OpenOptics WDM single mode fiber standard (OpenOpticsMSA.org).
What are you looking for in prospective employees?
Mellanox is a unique company, and we look for really smart, passionate people who can recognize and execute on opportunities that fundamentally change technology in ways that will improve the world. It sounds a bit Panglossian, but our customers really are building things with our gear that are changing the world by making people’s lives better, eliminating traffic, curing diseases and more.
These big, important life-changing technologies are not easy to accomplish and take a huge commitment and effort. Some really good people just don’t fit or do well here; but others appreciate the challenges and level of commitment demanded and thrive in this environment. It also doesn’t hurt if we have some fun, develop careers and talent, and make some real money along the way.