Live Migration and Reboots – or Why Innovation in the Cloud Matters

Live migration is still very rare in cloud computing, and now, sadly for some, IaaS providers are once again forcing users into cloud reboots and downtime because of security issues and poor architectural decisions. Users of Amazon, Rackspace and other Xen-based IaaS platforms are being told to prepare for pending cloud reboots. It’s déjà vu: the same providers forced their customers to reboot last September.

But why? Wasn’t cloud supposed to enable users to avoid downtime due to “web scale” architectures?

I’d say in this case it’s brains over brawn.

Without Live Migration and other modern implementations of innovative cloud architectures, Amazon customers will experience downtime.

When we first came together as a team to design and architect a modern Cloud Computing Infrastructure as a Service (IaaS) platform back in 2010, I had a list of “must haves”. No, make that “without us addressing these requirements, we’ll never deliver on the vision of IaaS: reliable, flexible, abstracted compute, storage and networking” haves. Without them we would simply be a “me too”, and in a market that would no doubt become a commodity, we would never achieve our business goals.

I’ve been there – as the former CTO for an Internet hosting firm with 100,000 servers and 10 million customers, I knew that the decisions made early on would either elevate us or haunt us for years to come… just as Amazon, Rackspace and others are now haunted by, and trapped in the web of, their early architecture decisions.

The ProfitBricks team and I made tough decisions.

  • We spent months just researching, designing and testing various fundamental building blocks of the cloud: virtualization, networking, storage, hardware and the provisioning engine we would need to build in order to offer the best possible IaaS service available in the market.
  • We took the path less travelled, but we knew it was the only way to solve the tough problems of the cloud platforms at the time – poor operational reliability, poor performance, costly designs and a lack of flexibility for the end user.

How does ProfitBricks deliver on “Zero Downtime” due to physical infrastructure maintenance and hypervisor updates?

  • We selected KVM as our virtualization layer. Simply put, KVM is the best virtualization platform available. We knew it then. A few new offerings know this now. But Amazon, Rackspace and others are haunted by their inability to offer “Live Migration”, or, ultimately, deliver on the promise of the cloud – virtual machines that are abstracted from hardware and virtualization layers. We have it, they don’t.
  • We selected InfiniBand as our network interconnect layer. Some might think we simply picked the fastest and most expensive technology to solve the poor and inconsistent performance problems that plague IaaS platforms such as Amazon AWS. But we knew that with InfiniBand we could truly deliver – not just 80 Gbps high-performance networks connecting servers to each other and to storage, but also the ability to live migrate users’ virtual machines from one physical machine to another without causing any interruptions.
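
To make the link between network bandwidth and live migration concrete, here is a toy model of pre-copy migration, the iterative memory-copy approach that QEMU/KVM uses by default. It is only a sketch under stated assumptions, not a description of how ProfitBricks implements migration: the function name, the 16 GB guest, the 2 Gbit/s dirty rate and the 50 ms downtime budget are all made up for illustration. Only the 80 Gbps figure comes from the text above.

```python
def precopy_migration(memory_gb, dirty_rate_gbps, link_gbps,
                      max_downtime_ms=50.0, max_rounds=30):
    """Toy model of pre-copy live migration (hypothetical numbers throughout).

    memory_gb        guest RAM in gigabytes
    dirty_rate_gbps  rate at which the running guest rewrites memory, in Gbit/s
    link_gbps        usable migration bandwidth, in Gbit/s

    Returns (live_rounds, downtime_ms) if the migration converges,
    or None if the guest dirties memory faster than the link can keep up.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")

    total_gbit = memory_gb * 8.0
    remaining_gbit = total_gbit  # the first round has to copy all guest memory
    live_rounds = 0

    while live_rounds <= max_rounds:
        # Time needed to send everything that is still outstanding.
        copy_seconds = remaining_gbit / link_gbps
        pause_ms = copy_seconds * 1000.0

        # If the rest fits in the downtime budget, pause the guest briefly,
        # send the remainder, and resume it on the destination host.
        if pause_ms <= max_downtime_ms:
            return live_rounds, pause_ms

        # Otherwise copy while the guest keeps running; whatever it dirties
        # in the meantime has to be sent again in the next round.
        remaining_gbit = min(dirty_rate_gbps * copy_seconds, total_gbit)
        live_rounds += 1

    return None  # never got under the downtime budget


if __name__ == "__main__":
    for link in (1, 10, 80):
        print(link, "Gbit/s ->", precopy_migration(16, 2, link))
```

With these invented numbers the 80 Gbit/s link gets under the pause budget after a single live round, while the 1 Gbit/s link never converges because the guest dirties memory faster than it can be copied. Real migrations depend on many factors this model ignores, such as compression, page-dirtying patterns and storage placement.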

By combining these technologies, ProfitBricks has been winning over customers and racking up awards worldwide, while Amazon and the other early architecture clouds continue to stumble and grow ever more entangled in the web of their early decisions.

Brains over brawn makes for innovation and agility. All of us here at ProfitBricks are committed to delivering the best possible Cloud Computing Infrastructure to developers and operations teams. Live migration is just one of the unique features our customers benefit from.

Why not give ProfitBricks a spin and see how painless cloud computing infrastructure can be?