Interview with Ioana Hreninciuc: Full Metal Power in the Bigstep High Performance Cloud
There are a number of cloud computing models out there. How did the founders at Bigstep settle on IaaS?
We actually started Bigstep not because we wanted to build just any cloud, but because we had the idea that we could build the highest performance cloud in the world by eliminating the hypervisor and completely reengineering the architecture of the cloud. So it was always IaaS for us; we never wanted to start a PaaS or SaaS. This is because we had excellent experience with IaaS from our previous work with Hostway Corporation – it was what we knew how to do best. We might move to PaaS in the future, not entirely, but with some parts of the service – maybe provide Hadoop as a service, for instance. But our core will stay IaaS.
In the era of software defined networks and the market success of VMware, why did Bigstep decide to sell physical isolation, or “Full Metal Power”?
Virtualization is a great tool for some tasks, but in big data it definitely does more harm than good. Even with the virtualization technologies embedded in CPUs, hypervisors still eat up a lot of performance. It’s not so much the CPU; hypervisors dramatically increase network latency and disk access times, and they don’t allow direct access to memory for in-memory applications.
For big data applications, like NoSQL DBs and Hadoop, our Full Metal Cloud can achieve performance improvements of 20-400% compared to virtualization. Yes, in some cases we’ve seen query times drop from 200 to 40 milliseconds on databases of 10 million records.
Before our Full Metal Cloud, there was no high-performance cloud offering to answer the need for computing power and flexibility in the big data market. We decided to answer that need.
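Editor's note: the percentage range and the millisecond figures quoted above can be related with some simple arithmetic. The sketch below is only an illustration, assuming that "improvement" means the relative gain in work done per unit of time; the interview does not say how the 20-400% figures were measured, and the function names are hypothetical.

```python
def improvement_percent(before_ms: float, after_ms: float) -> float:
    """Relative gain in work per unit time when a query gets faster.

    Assumption: "improvement" is measured as (old time / new time - 1).
    """
    return (before_ms / after_ms - 1.0) * 100.0


def latency_reduction_percent(before_ms: float, after_ms: float) -> float:
    """Share of the original query time that was eliminated."""
    return (before_ms - after_ms) / before_ms * 100.0


# Figures quoted in the interview: query time drops from 200 ms to 40 ms.
before_ms, after_ms = 200.0, 40.0

print(f"improvement: {improvement_percent(before_ms, after_ms):.0f}%")
print(f"latency reduction: {latency_reduction_percent(before_ms, after_ms):.0f}%")
```

Under that assumption, a drop from 200 to 40 milliseconds is a fivefold speedup, which matches the upper end of the quoted range (400%), while the same change expressed as a latency reduction is about 80%.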
Your website mentions “data location transparency.” The EU has developed increasingly distinct guidelines for data privacy, protection and ownership. To what extent is this a driver for business at Bigstep and others in your space?
Well, we’re one of the few cloud vendors that are incorporated and based in the UK. Our infrastructure is based in the UK as well, of course. Since the Snowden case, we’ve found people are much more sensitive concerning data location. Some do choose us because we’re a UK company and guarantee the physical location of their data. Other public cloud providers mostly rely on regions and usually have contract terms that allow them to move your data at any time, without even notifying the client.
What role does the DevOps movement play in the type of features requested and consumed by Bigstep customers?
DevOps are actually a very important audience for us. They’re the ones working day to day with our Full Metal Cloud, even if other departments benefit from the results. They’re great because they’re very technical and can get things done immediately. But that also means they’re not impressed by bells and whistles; the features they need are very advanced.
For instance, we had planned to have one-click deployments of major big data apps, such as CDH, Elasticsearch, Couchbase and the like. But we found out that DevOps don’t need help installing the apps; they’ve got that covered. It’s the integration and synchronization of these applications that takes the most time and that they’d like the provider to take over. But of course, automating to that level is much more of a challenge. We’re working on it as we speak.
There is speculation that Bigstep’s elastic SSD offering is driven by strong movement toward in-memory processing. Are you seeing this? Which big data or business intelligence tools are driving this?
Spark and Cloudera’s Impala are the most popular uses of in-memory we’re seeing. Redis is also often used as an in-memory caching layer. In the future, we expect to see more of Hortonworks’ Stinger as well.
However, we work with SSDs because they provide increased performance overall, not just in the case of in-memory apps, and our goal was to be the highest performance cloud in the world.
To what extent is streaming big data (Storm, Spark, etc.) driving demand for your infrastructure?
People are working with Spark and Storm, but we’re not seeing them as part of many production environments just yet; deployments are relatively small.
Our Full Metal Cloud gives our users direct access to hardware and, in this particular case, to memory, and we expect the rise of Spark and Storm to have a significant impact on our growth once they’re mature enough.
What role would ETL, such as Syncsort’s offerings, play in transitioning to big data uses in your infrastructure?
I think Syncsort could empower clients that are currently dealing with legacy on-premise infrastructures, which have become ticking time bombs, to move to the cloud. Most enterprises are actually keeping quite sensitive data in these systems, and their concerns are usually related to how secure their data would be in the cloud. In our Full Metal Cloud, we physically isolate instances and network traffic so security and privacy risks are minimal, and we provide all the elasticity of the cloud. So it’s a way to get the best of both worlds.
What tools are your customers using to access your infrastructure API? How does Bigstep manage the resources, including security, dedicated to supporting the API?
The API is compatible with most programming languages – I’m not sure which is most popular; it really depends on what language each user is most comfortable with. We also provide a CLI and a user interface built on top of the API, which can be used for simplified management. The interface is an easy-to-use, wireframe-like infrastructure builder, based on drag-and-drop actions.
What issues, whether technical or business, prove to be the biggest obstacles when explaining your offerings to prospective customers?
Data security is a major concern. No one wants to take data outside of their firewall, so we have to work to extend networks into our data center. It’s not a major problem, but it introduces delays. Also, there’s still a lingering perception in the market that the cloud is not suited for data crunching and that big data is best done on premise, especially for cost reasons. But we’re seeing more and more people getting over that, because corporate procurement and budget approval processes make scaling on-premise infrastructure so difficult.
How does Bigstep deal with the internal operations effort associated with pay-per-use and on-demand provisioning? Are you using products like PuppetLabs?
We built our own software specifically to deal with all that. It’s the magic behind our Full Metal Cloud. It allows us to make bare metal elastic and also to re-engineer our core architecture compared to other clouds. There are some alternatives in the market; we find other providers are choosing OpenStack more than something like PuppetLabs. But we needed something flexible, that we could build on, that we’d have control over. So OpenStack wasn’t really an alternative for us; it would have just slowed us down.
Your pricing scheme includes a number of IaaS parameters (e.g., CPU type, number of cores, processing speed, RAM, cycle time). How did you settle on these particular parameters? How might these parameters change over the next five to 10 years?
They’re quite typical specs for infrastructure – the difference is that we provide more information than virtualized cloud providers do. We share the exact CPU series, core count and frequency. Many infrastructure providers don’t do that, so they can get away with using older hardware – we’ve seen CPUs from 2009 being sold in what was supposed to be a “new” server for the client.
I think in the future we might have memory frequency up there as well, because it plays a big role in performance – even if people don’t really think about it right now. It’s hard to say what will happen in five to 10 years though. Technology changes so quickly.
Syncsort customers are interested in hybrid solutions powered by its DMX product. How would you see DMX being used in your full metal IaaS?
Clients that are just moving to the cloud are typically still working with at least two systems – an on-premise, legacy infrastructure, and their new cloud setup. We haven’t used it ourselves, but clearly a tool like DMX could be helpful in managing big data integration.