Live migration (also known as online migration) is the process of moving running virtual machines from one physical machine to another without perceived downtime. It is one of the most fascinating and complex aspects of platform virtualization. Below, I examine its pros and cons in order to answer the question: is live migration worth using in a large and still-growing data center?
Real-world usage scenarios
Let's start by evaluating the adaptive data center scenario. This is a vision of a data center that can adapt its computing power (the number of active physical machines) to match its users' current requirements. When users run more computation-intensive applications and the overall data center load peaks, additional machines are provisioned immediately to match the surge in demand. Conversely, when the overall load is low, idle machines are deactivated. Apart from power savings (see the green IT scenario), there may be other benefits, such as lower costs for proprietary software licenses.
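To make the adaptive idea concrete, here is a minimal sketch of how a controller might decide how many physical machines to keep active. The function name, the per-machine capacity, the 80% target utilization, and the load units are all assumptions made for illustration; they are not taken from any particular product.

```python
import math


def machines_needed(total_load: float,
                    machine_capacity: float,
                    total_machines: int,
                    target_utilization: float = 0.8,
                    min_active: int = 1) -> int:
    """Return how many machines should be active for the given load.

    All parameters are hypothetical: loads and capacities share an
    arbitrary unit (for example, CPU cores), and target_utilization
    leaves headroom so a sudden surge does not saturate active machines.
    """
    if machine_capacity <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be positive and utilization in (0, 1]")
    usable_per_machine = machine_capacity * target_utilization
    wanted = math.ceil(total_load / usable_per_machine)
    # Never go below the minimum or above what the data center owns.
    return max(min_active, min(wanted, total_machines))


# Peak load: 730 units on machines that offer 100 units each.
print(machines_needed(730, 100, total_machines=20))
# Quiet period: most machines can be deactivated.
print(machines_needed(50, 100, total_machines=20))
```

A real controller would also need hysteresis, so that machines are not powered on and off repeatedly when the load hovers around a threshold.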
Next, let's move to the second scenario. Physical machine maintenance concerns data centers with hundreds of inexpensive physical machines. At present, the majority of such centers perform maintenance operations that require server and related database downtime, such as an OS kernel upgrade, within precisely defined maintenance windows (usually at night or during weekends). If such a data center hosts banking services, any downtime is unacceptable to customers who are accustomed to 24/7 banking. Such situations can be avoided with anytime physical maintenance, a new approach that allows administrative chores to be performed at any time of day without any interruption noticeable to end users.
Last but not least, green computing (green IT) is a refreshing idea for organizations that own or operate huge data centers and have to cope with rising power and cooling costs. If we look at the nodes of a big data center, we may observe that at any given moment some nodes are either inactive (i.e. no user jobs are being executed on them) or under-utilized. With the help of live migration, a virtualized environment can be configured to consolidate load onto as few machines as possible. In other words, running virtual machines are automatically redistributed so that as many nodes as possible are fully utilized. In most cases, some nodes then become inactive and can be powered off. As soon as the load increases, the previously deactivated machines are activated again.
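The consolidation step described above can be sketched as a bin-packing problem. The snippet below uses the first-fit decreasing heuristic, which is my own choice for illustration and not necessarily what any virtualization platform implements. The VM names, the load figures (percent of one node's CPU), and the function name are hypothetical.

```python
from typing import Dict, List


def consolidate(vm_loads: Dict[str, int],
                node_capacity: int,
                node_count: int) -> List[List[str]]:
    """Pack VMs onto as few nodes as the heuristic can manage.

    vm_loads maps a (hypothetical) VM name to its load, expressed in
    the same unit as node_capacity. Returns one list of VM names per
    node that stays powered on.
    """
    placements: List[List[str]] = []  # VMs assigned to each active node
    free: List[int] = []              # remaining capacity of each active node

    # First-fit decreasing: place the largest VMs first.
    for vm, load in sorted(vm_loads.items(), key=lambda kv: kv[1], reverse=True):
        if load > node_capacity:
            raise ValueError(f"{vm} does not fit on any single node")
        for i, remaining in enumerate(free):
            if load <= remaining:
                placements[i].append(vm)
                free[i] -= load
                break
        else:
            # No active node has room, so another node must stay on.
            if len(placements) == node_count:
                raise ValueError("not enough nodes for the current load")
            placements.append([vm])
            free.append(node_capacity - load)
    return placements


loads = {"vm-a": 50, "vm-b": 30, "vm-c": 20, "vm-d": 40, "vm-e": 10}
active = consolidate(loads, node_capacity=100, node_count=4)
print(active)
print("nodes that can be powered off:", 4 - len(active))
```

In practice, each change in placement corresponds to a live migration, so a scheduler would also weigh the cost of moving a virtual machine against the power saved.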
Moreover, there are rigid requirements, such as minimum network bandwidth or a specific cluster configuration, that must be met in order to migrate quickly. Only one vendor describes them precisely (see Brian's Power Windows Blog post). On overloaded servers and over-saturated networks, these requirements may well go unmet.
Nevertheless, I think that the importance of live migration should not be underestimated. The data centers of tomorrow will likely continue to grow, and in order to facilitate maintenance and reduce power consumption they will put more emphasis on virtualization and live migration.