
Enrique Castro-Leon

Biography

Enrique Castro-Leon is an enterprise architect and technology strategist with Intel Corporation, working on technology integration ranging from highly efficient virtualized cloud data centers to emerging usage models for cloud computing.

He is the lead author of two books, The Business Value of Virtual Service Grids: Strategic Insights for Enterprise Decision Makers and Creating the Infrastructure for Cloud Computing: An Essential Handbook for IT Professionals.

He holds a BSEE degree from the University of Costa Rica, and M.S. degrees in Electrical Engineering and Computer Science, and a Ph.D. in Electrical Engineering from Purdue University.




Virtualized Cloud Power Management

Published: February 16, 2012 • Service Technology Magazine Issue LIX

Contributors include: Miguel Gómez, Jim Blakley, Aamir Yunus, Herman Wong, and Rekha Raghu.

Abstract: In this article we conduct a deep analysis of a number of topics in virtualized cloud data centers, providing analytical tools for readers interested in architecting power-related solutions for their own environments. We start with power considerations for virtualized environments, refine the distinction between power management and energy management, define concepts of efficiency applicable to virtualized pools, and introduce the approach of composite usage models as an optimizing analytic tool to tailor power-related technical solutions to a specific business goal.


Power Management in Virtualized Cloud Environments

Given the recent intense focus in the industry around data center power management and the furious pace of the adoption of virtualization, it is remarkable that the subject of power management in virtualized cloud environments has received relatively little attention.

It is fair to say that, at the time of writing, power management technology has not caught up with the needs of the virtualized data center. For historical reasons the power management technology available today had its inception in the physical world, where watts consumed in a server can be traced to the watts that came through the power utility infrastructure. Unfortunately, the semantics of power in virtual machines have yet to be comprehensively defined by industry consensus.

For instance, assume that the operating system running in a virtual image decides to transition the system to the ACPI S3 state, sleep to memory. What we have now is the state of the virtual image preserved in the image's memory with the virtual CPU turned off.

Assuming that the system is not paravirtualized, the operating system won't be able to tell whether it's running in a physical or a virtual instance. The effect of transitioning to S3 will be purely local to the virtual machine. If the intent of transitioning the virtual machine to S3 was to save power, it does not work this way: the virtual machine still draws resources from the host machine and requires hypervisor attention.

Transitioning the physical host proper to S3 may not be practical because other virtual machines might still be running, not ready to go to sleep.

Server consolidation is another technology for reducing data center power consumption by driving up server utilization rates. Consolidation for power management is a blunt tool: applications that used to run in dedicated physical servers are virtualized and compressed into a single physical host.

The applications are sometimes strange bedfellows. Application profiling might have been done, as an a priori, static exercise, to make sure the applications can coexist, with the virtual machine instances treated as black boxes.

Server consolidation technology makes no attempt to look at the workload profiles inside each virtualized instance in real time. Power savings come as an almost wishful side effect of repackaging applications formerly running in dedicated servers into virtualized instances.

A capability to map power to virtual machines in both directions, from physical to virtual and from virtual to physical, would be useful from an operational perspective. The challenge is twofold: first, from a monitoring perspective, because no method has yet been commonly agreed upon to prorate host power consumption to the virtual instances running within; and second, from a control perspective. It would be useful to schedule or assign power consumption to virtual machines, allowing end users to make a tradeoff between power and performance. Fine-grained power monitoring would allow prorating power costs to application instances, introducing useful pricing checks and balances that encourage rational energy consumption, instead of the more common practice today of hiding energy costs in the facility costs.
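As an illustration of the monitoring half of this challenge, the sketch below prorates a host's measured power among its virtual machines in proportion to the CPU time each consumed during a sampling interval. This is only one plausible allocation scheme, not an agreed-upon industry method; the function name and inputs are hypothetical.

```python
def prorate_host_power(host_watts, idle_watts, vm_cpu_seconds):
    """Split one host power sample among virtual machines.

    host_watts: measured host power draw for the interval.
    idle_watts: host power at idle, charged evenly to all VMs as overhead.
    vm_cpu_seconds: dict mapping VM name -> CPU time used in the interval.

    The fixed (idle) portion is split evenly; the dynamic portion is split in
    proportion to CPU time. This is a deliberately simplistic model, shown only
    to illustrate the idea of prorating host power to virtual instances.
    """
    vms = list(vm_cpu_seconds)
    total_cpu = sum(vm_cpu_seconds.values())
    dynamic = max(host_watts - idle_watts, 0.0)
    shares = {}
    for vm in vms:
        fixed_share = idle_watts / len(vms)
        dyn_share = dynamic * vm_cpu_seconds[vm] / total_cpu if total_cpu else 0.0
        shares[vm] = fixed_share + dyn_share
    return shares

# Example: a 300 W host that idles at 150 W, running three VMs.
print(prorate_host_power(300.0, 150.0, {"vm1": 40.0, "vm2": 30.0, "vm3": 10.0}))
```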

Let's look at a similar dynamic in a different context: In some regions in the globe water used to be so inexpensive that residential use was not metered.

The water company would charge a fixed amount every month and that was it. Hence, tenants in an apartment would never see a water bill. The water bill was a predictable cost component in the total cost of the building and included in the rent. Water was essentially an infinite resource and reflecting this fact, there were absolutely no incentives in the system for residents to rein in water use.

As population increased, water became an increasingly precious and expensive resource. The water company started installing residential water meters but, bowing to tradition, landlords continued to pay the bills, which were still a very small portion of the overall operating costs. Tenants still had no incentive to save water because they did not see the water bill.

Today there are very few regions in the world where water can be treated as an infinite resource. In our initial example, the cost of water increased so much faster than other cost components that landlords decided to expose this cost to tenants. Hence the practice of tenants paying for the specific consumption of the unit they occupy is common today. Also, because this consumption is exposed at the individual unit level, the historical data can be used as the basis for implementing water conservation policies, for instance charging penalty rates for use beyond a certain threshold. The use of power in the data center has been following a similar trajectory.

For many years the cost of power had been a noise-level item in the cost of operating a data center. It was practical to fold the cost of electricity into the facilities bill. Hence IT managers would never see the energy costs. This situation is changing as we speak; see for instance this recent article in Computerworld [REF-1].

Server platforms enabled with Intel® Intelligent Power Node Manager allow compiling a historical record of power usage. The historical information is useful for data center planning purposes by delivering a much tighter forecast, beneficial in two ways: by reducing the need to over-specify the power designed into the facility, or by maximizing the amount of equipment that can be deployed for a fixed amount of available power.

From an operational perspective we can expect ever more aggressive implementations of power proportional computing in servers where we see large variations between idle power consumption and power draw at full load.

As mentioned earlier, this variation used to be less than 10 percent. Today 50 percent is not unusual. Data center operators can expect wider swings in data center daily and seasonal power demand cycles. These swings bring additional management challenges. Server power management technology provides the means to manage these swings, stay within a data center's power envelope, and yet maintain existing service level agreements with customers.

There is still one more complication: with the steep adoption of virtualization in the data center in the past two years starting with server consolidation, an increasing portion of business is being transacted using virtualized and possibly cloud resources. Under this new environment, using a physical host as the locus for billing power may not be sufficient anymore, especially in multi-tenant environments, where the cost centers for virtual machines running in a host may reside in different departments or even in different companies.

It is reasonable to expect that this mode of fine-grained power management at the virtual machine level will take root in cloud computing and hosted environments, where resources are typically deployed as virtualized resources. Fine-grained power monitoring and management makes sense in an environment where energy and carbon footprint are a major TCO component. To the extent that energy costs are exposed to users just as the MIPS consumed are, this information provides the checks and balances and the data to implement rational policies to manage energy consumption.

Based on the considerations above, we envision power monitoring and control practices to evolve under the following three scenarios.

  • Undifferentiated, one bill for the whole facility. Power hogs and energy-efficient equipment are thrown in the same pile. Metrics to weed out inefficient equipment are hard to come by.
  • Power monitoring implemented at the physical host level. This exposes inefficient equipment. Many installations are feeling the pain of increasing energy costs, but organizational inertia prevents passing costs to IT operations. Power monitoring at this level may be too coarse grained, and too little, too late for environments that are rapidly transitioning to virtualization with inadequate support for multi-tenancy.
  • Fine-grained power monitoring. Power monitoring encompasses virtualized environments. This capability would align power monitoring with the unit of delivery of value to customers.

We expect these scenarios to be correlated with practices under the Data Center Power Management Capability Maturity Model. The first scenario would prevail in Stage 1 and 2 shops; the second scenario would be typical of Stage 3. The technology to support the most advanced scenario is still being perfected today. Combined with the process maturity needed to implement it, we don't expect this scenario to be implemented earlier than Stage 5 shops.

Before we continue it will be useful to formally define two concepts that we touched on earlier in this article, namely the power capping range and the power dynamic range, followed by a discussion on the subtle differences between power management and energy management.


Power Capping and Power Dynamic Range

Up to this point we have been using power capping and power dynamic range informally. These numbers constitute important indicators for power management performance applicable to a single server or to a server pool. Let's use some mathematical rigor for a more precise definition of what these figures of merit mean.

The power capping range is the ratio of power consumed under full load over power consumed when the machine or group is capped to the lowest possible power consumption:

ρcapping = Pload / Pcapped,load

Where Pload represents power consumption under workload and Pcapped,load represents the same, but under maximum capping action. The reciprocal, 1/ρcapping, represents how much power consumption is lowered when the machine or group is capped for lowest power consumption.

The capping range applied to a group of machines is the average for all machines in the group:

ρcapping,group = (ρcapping,1 + ρcapping,2 + … + ρcapping,n) / n

Where n is the number of machines in the group.
Likewise, the power dynamic range is the ratio of power consumed under full load over power consumed when the machine or group is idling. It is similar to ρcapping, except that the denominator used is the idle power, not the capped power:

ρdynamic = Pload / Pidle

Where Pidle represents the idle power consumption. A machine with a ρdynamic number of 2:1 means that its idle power consumption is one half of its peak power consumption. The group version is defined similarly:

ρdynamic,group = (ρdynamic,1 + ρdynamic,2 + … + ρdynamic,n) / n
For instance, a machine whose ρdynamic ratio is 12:1 with respect to the S5 state consumes, using the reciprocal metric, only about 8 percent of its peak consumption during S5.

Here are some examples from the Intel Cloud Builder [REF-2] engagement with Microsoft. The measurements were taken from a pool of two machines running Microsoft Windows Server 2008 with the Hyper-V role enabled. The synthetic workload MaxPower was loaded on the virtual machines until the host reached peak power consumption. Runs were made on an Intel Server Board S5500WB, a low-power baseboard mounted in a 1U chassis, code named Willowbrook. Each server is provisioned with 24 GB of memory and one internal hard drive.

Figure 1 shows the calibration power run trace for the machine named ComputeNode3. After the machine reached a peak consumption of 309 watts, an aggressive cap beyond the current control range was applied.



Figure 1 – Power Run Trace for Compute Node 3.


The capping ratio for this machine is


The numbers for ComputeNode2 were 252 watts and 191 watts, respectively. Hence the collective capping ratio for the two nodes would be


Note the significant variation in power consumption even though the two machines were configured identically.
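A minimal sketch of how these figures of merit might be computed from calibration data follows. The ComputeNode2 readings are the ones quoted above; the capped reading for ComputeNode3 is not quoted in this excerpt, so it is left as a parameter rather than guessed.

```python
def capping_ratio(p_load, p_capped_load):
    """rho_capping: full-load power over power under maximum capping."""
    return p_load / p_capped_load

def dynamic_ratio(p_load, p_idle):
    """rho_dynamic: full-load power over idle power."""
    return p_load / p_idle

def group_ratio(ratios):
    """Group figure of merit: the average of the per-machine ratios."""
    return sum(ratios) / len(ratios)

# ComputeNode2 from the Intel Cloud Builder runs: 252 W at peak, 191 W capped.
node2 = capping_ratio(252.0, 191.0)          # about 1.32
print(f"ComputeNode2 capping ratio: {node2:.2f}")

# ComputeNode3 peaked at 309 W; its capped reading would be supplied here.
# node3 = capping_ratio(309.0, p_capped_node3)
# print(f"Pool capping ratio: {group_ratio([node2, node3]):.2f}")
```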


Power Management versus Energy Management

Discussions about power management actually involve two main aspects: power management proper and energy management. A power management metric allows us to track operational "goodness", making sure that power draw never exceeds limits imposed by the infrastructure. An energy management metric tracks power saved over time, which is energy saved. Energy not consumed goes directly to the bottom line of the data center operator.

Energy represents a capability to deliver a certain amount of work, whereas power represents the instantaneous rate at which that work can be carried out. In electrical terms power is measured in watts. Energy represents power applied over a time interval; hence energy is measured in watt-hours.

Power consumed by a server can vary instantaneously and continuously.

The language of calculus is useful for expressing this time-varying relationship: the energy consumed by a server over an interval t1 through t2 is the integral of its power consumption over that interval:

E = ∫t1→t2 P(t) dt

Where P(t) is the instantaneous power drawn by the server at time t.
A power saving mechanism can also yield energy savings. To understand the dynamic between power and energy management, look at Figure 2 and imagine a server without any power management mechanisms whatsoever. The power consumed by that server would be Punmanaged regardless of operating conditions. Most servers today have a number of mechanisms operating concurrently, and hence the actual power consumed at any given time t is Pactual(t). The difference Punmanaged – Pactual is the power saved. The power saved, accumulated over the interval t1 through t2, yields the energy saved during that interval.

Mathematically, the energy saved is represented by the equation

Esaved = ∫t1→t2 [Punmanaged – Pactual(t)] dt
The graphical representation of this equation is shown in Figure 2. From this analysis it becomes clear that in order for a power saving mechanism to yield meaningful energy savings, power savings need to be maintained for a long time and the difference between Punmanaged and Pactual needs to be as large as possible.



Figure 2 – Power and Energy Saved from Application of Management Technology.
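The integral above is easy to approximate from sampled power traces. The sketch below uses simple trapezoidal integration over two hypothetical traces to turn a power saving into a watt-hour figure; the traces and sampling interval are invented for illustration.

```python
def energy_wh(samples, interval_s):
    """Trapezoidal integral of a power trace (watts) sampled every
    interval_s seconds, returned in watt-hours."""
    if len(samples) < 2:
        return 0.0
    area_ws = sum((a + b) / 2.0 * interval_s for a, b in zip(samples, samples[1:]))
    return area_ws / 3600.0

# Hypothetical one-hour traces sampled once a minute (61 samples).
unmanaged = [300.0] * 61                       # no power management at all
actual = [220.0 if 15 <= i < 45 else 300.0     # a cap engaged for 30 minutes
          for i in range(61)]

saved = energy_wh(unmanaged, 60) - energy_wh(actual, 60)
print(f"Energy saved: {saved:.1f} Wh")         # roughly 40 Wh
```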


Note that a mechanism that yields significant power savings may not necessarily yield high energy savings. For instance, as previously mentioned, the application of Intel Intelligent Power Node Manager can bring down power consumption by about 100 watts, from 300 watts at full load to 200 watts, in a dual-socket 2U Intel® Xeon® processor 5500 series server through the use of voltage and frequency scaling.

However, if Intel Intelligent Power Node Manager is used as a guard-rail mechanism, to limit power consumption only when a certain threshold is violated, it may never kick in, and hence energy savings will be zero for practical purposes. Node Manager is used this way because it works best only under certain operating conditions, namely high loading factors, and because it works through frequency and voltage scaling, it brings a power consumption versus performance tradeoff.
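Conceptually, a guard-rail policy is a simple control loop: do nothing while consumption stays under the threshold, and apply a cap only when the threshold is violated. The sketch below outlines that idea; read_host_power, set_power_cap, and clear_power_cap are placeholders for whatever management interface is actually available (for example, a Node Manager or IPMI client) and are not real API calls, and the watt figures are illustrative.

```python
import time

POWER_LIMIT_W = 280        # threshold imposed by the rack or branch circuit
CAP_TARGET_W = 250         # cap to apply while in violation
POLL_SECONDS = 10

def guard_rail_loop(read_host_power, set_power_cap, clear_power_cap):
    """Apply a power cap only when the limit is violated; remove it otherwise.

    The three callables stand in for a real management interface. Under normal
    operation the cap never engages, so energy savings from this policy are
    essentially zero; its value is protection, not efficiency.
    """
    capped = False
    while True:
        watts = read_host_power()
        if watts > POWER_LIMIT_W and not capped:
            set_power_cap(CAP_TARGET_W)
            capped = True
        elif watts < CAP_TARGET_W and capped:
            clear_power_cap()
            capped = False
        time.sleep(POLL_SECONDS)
```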


Power Proportional Computing

Another useful figure of merit for power management is the dynamic range for power proportional computing. Energy proportional designs have been proposed to achieve a significant saving in energy consumption in the data center.

The relationship between a server's power consumption and the workload it carries is not necessarily linear, but assume it is to simplify the discussion. Hence, the power consumption of a server can be represented by this model:

P(w) = Pbaseline + w × Pspread,  0 ≤ w ≤ 1


Figure 3 – Power Proportional Computing.


This model is represented graphically in Figure 3. The x-axis represents the workload w, which can range from 0 to 1, that is, 0 to 100 percent. Pbaseline is the power consumption at idle, and Pspread is the spread between Pbaseline and the power consumption at 100 percent workload. A low Pbaseline is better because it means low power consumption at idle.

As previously mentioned, for an Intel Xeon processor 5500 series-based server, Pbaseline is roughly 50 percent of power consumption at full utilization, which is remarkable, considering that it represents a 20 percent improvement over the number we observed for prior-generation servers using the Intel Xeon processor 5400 series [REF-3]. The 50 percent figure is a number we have observed in our lab for a whole server, not just the CPU alone.

With the typical 5 to 10 percent loading, the actual power consumption of servers will also be less than the peak. However, even when loading factors are low, power consumption remains a significant portion of peak.

As mentioned above, a low Pbaseline is better, but current technology imposes a limit on how low this number can be. Progress has been considerable: just a few years ago Pbaseline was close to 90 percent of full-load power, taking the server as our unit of analysis, and today it stands at about 50 percent.
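A minimal sketch of the linear model above is shown below. The efficiency curve of Figure 4 is approximated here as useful output per unit of power, normalized to the full-load value; that definition is an assumption made for illustration, not a formula given in this article.

```python
def power(workload, p_baseline, p_full):
    """Linear power-proportional model: P(w) = Pbaseline + w * Pspread."""
    p_spread = p_full - p_baseline
    return p_baseline + workload * p_spread

def efficiency(workload, p_baseline, p_full):
    """Assumed definition: work delivered per watt, relative to full load."""
    if workload == 0.0:
        return 0.0
    return (workload / power(workload, p_baseline, p_full)) * p_full

# Current-generation server: idle power about 50 percent of full load.
for w in (0.15, 0.40, 0.80, 1.00):
    print(f"load {w:.0%}: power {power(w, 0.5, 1.0):.2f} of peak, "
          f"efficiency {efficiency(w, 0.5, 1.0):.0%}")
```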

If a 50 percent Pbaseline looks outstanding, we can do even better for certain application environments, such as load-balanced front-end Web server pools and the implementation of cloud services through clustered, virtualized servers. We can achieve this effect by shutting down, or parking, idle servers. For instance, consider a pool of 16 servers. If the pool is idle, all the servers except one can be parked. The single idle server consumes only half the power of a fully loaded server, that is, one half of one sixteenth of the cluster's peak power. The dormant servers still draw about 8 percent of full power. Hence, after doing the math, the total power consumption for the cluster at idle will be about 8 percent of the full cluster power consumption.
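Working through the arithmetic, with the cluster's peak taken as 16 times a single server's full-load power: the one idle server contributes 0.5/16 ≈ 3 percent of cluster peak, and the 15 parked servers contribute 15 × s/16, where s is the parked draw as a fraction of full power. With s toward the lower end of the 5 to 8 percent range reported later in this article for S5, the cluster idles at roughly (0.5 + 15 × 0.05)/16 ≈ 8 percent of peak; with s = 0.08 the figure is closer to (0.5 + 15 × 0.08)/16 ≈ 11 percent, consistent with the roughly 10:1 dynamic range quoted below.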

For a clustered deployment, the power dynamic range has been increased from 2:1 for a single server to about 10:1 for the cluster as a whole.

Luiz Barroso in his classic 2007 paper [REF-4] posited that given that servers in data centers were loaded between 10 and 50 percent of peak, it would be beneficial from an energy perspective to have servers with a large power dynamic ratio, the ratio of power consumed at full workload to power at idle. Figure 4 represents the state of the art today with a dynamic ratio of about 2:1.

Let's assume these servers are deployed in a traditional data center, that is, a nonvirtualized data center with the operating system still running on bare metal. For these centers, common utilization rates hover around 15 percent and efficiency runs at about 20 percent. The operating band depicted is more conservative than what Barroso indicated, with a CPU utilization that rarely surpasses 40 percent.

A 20 percent efficiency is rather low compared with the efficiency obtained at higher load factors toward the right side of the graph.



Figure 4 – Efficiency as Function of Workload Demand.


Figure 5 shows what happens if we improve the dynamic ratio to 5:1.

A 5:1 dynamic ratio means Pbaseline is only 20 percent of peak power. This is not possible today for single servers, but it is attainable for cloud data centers and, as a matter of fact, for any environment that allows servers to be managed collectively as pools of fungible resources and where server parking is in effect: a lower Pbaseline means efficiency ramps up much faster with workload.

The improved dynamic ratio also dramatically improves the operating efficiency in the operating band of the data centers, but it gets even better: the servers in the active pool are kept in the utilization sweet spot of 60 to 80 percent. If the CPU utilization in the active pool drops below 60 percent, the management application starts removing servers from the active pool to the parked pool until utilization starts inching up. If the CPU utilization gets close to the upper end of the range, the management application starts bringing servers back from the parked pool into the active pool to provide relief and bring the utilization numbers down.



Figure 5 – Power Proportional Computing with Cloud Clusters.
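A sketch of the pool-resizing logic just described, keeping the active pool in the 60 to 80 percent utilization band, is shown below. The park and unpark callables are placeholders for whatever virtualization manager performs the actual consolidation and wake-up; they are not real API calls.

```python
LOW_WATERMARK = 0.60
HIGH_WATERMARK = 0.80

def rebalance(active_utilizations, parked_count, park_one_server, unpark_one_server):
    """One pass of a pool manager keeping the active pool at 60-80 percent CPU.

    active_utilizations: CPU utilization (0..1) of each server in the active pool.
    park_one_server / unpark_one_server: placeholder callables standing in for
    the real management actions (consolidate VMs off a host and park it, or
    wake a parked host and spread VMs onto it).
    """
    avg = sum(active_utilizations) / len(active_utilizations)
    if avg < LOW_WATERMARK and len(active_utilizations) > 1:
        park_one_server()      # consolidate, then park the vacated host
    elif avg > HIGH_WATERMARK and parked_count > 0:
        unpark_one_server()    # wake a parked host to provide relief
    return avg
```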


Optimizing Power Performance in Virtualized Cloud Data Centers

Two approaches are commonly applied to reduce lighting energy use in residential or commercial buildings: turning lights off and using dimming mechanisms.

Turning lights off yields the greatest power savings, assuming the room is not to be used. A small amount of residual power is still being drawn to power pilot lights or motion sensors to turn on the illumination if someone enters the room.

Dimming the lights reduces power consumption when a room is in use: it is possible to reduce the illumination level while still allowing people to occupy the room for the intended purpose. For instance, illumination in certain areas may not be needed because daylight is being mixed in, because zonal lighting on work areas is sufficient, or because the application calls for reduced lighting, such as in a restaurant or dining room. Power saved through dimming will be less than that saved by turning lights off.

Similar mechanisms are available in servers deployed in data centers. Applying server parking would be the equivalent of turning lights off in a room. The capability for "dimming lights" in a server is embodied by the Enhanced Intel® SpeedStep® technology and Intel Intelligent Power Node Manager technology. Enhanced Intel SpeedStep technology reduces power consumption during periods of low workload and Intel Intelligent Power Node Manager can cap power, that is, reduce power consumption at high workload levels under application control.

There is also a richer set of options for turning off servers than there are for turning lights off. The ACPI standard defines at least three states suitable for server parking: S3 (sleep to memory), S4 (hibernation where the server state is saved in a file) and S5 (soft off, where the server is powered down except for the circuitry to turn it on remotely under application control). The specific choice depends on hardware support; not all states are supported by a specific implementation. It also depends on application requirements. A restart from S3, if supported by the hardware, can take place much faster than a restart from S5. The tradeoff is that S3 consumes more energy than S5 because of the need to keep the DIMMs charged.
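The choice among these states is a tradeoff between residual power draw and recovery time. Below is a small sketch of how a management application might encode that choice, using the rough figures quoted elsewhere in this article (S5 around 8 percent of peak power with recovery on the order of 15 minutes, S3 around 10 percent with recovery of about a minute); S4 figures are not quoted here, so it is omitted, and the numbers should be treated as illustrative rather than measured.

```python
# Rough characteristics of the parking states, as discussed in the text.
# Power is a fraction of the server's peak draw; recovery is in seconds.
PARKING_STATES = {
    "S3": {"power_fraction": 0.10, "recovery_s": 60},      # sleep to memory
    "S5": {"power_fraction": 0.08, "recovery_s": 15 * 60},  # soft off
}

def pick_parking_state(max_recovery_s, supported=PARKING_STATES):
    """Pick the lowest-power state whose recovery time meets the requirement."""
    candidates = [(v["power_fraction"], name)
                  for name, v in supported.items()
                  if v["recovery_s"] <= max_recovery_s]
    return min(candidates)[1] if candidates else None

print(pick_parking_state(120))        # S3: only state that recovers in 2 minutes
print(pick_parking_state(30 * 60))    # S5: lowest power when time permits
```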

Widespread use of server parking is not feasible in traditional deployments, where a hard binding exists between the application components and the hardware host, because bringing any of the hosts offline could cripple the application.

This binding gets relaxed for virtualized cloud environments that support dynamic consolidation of virtual machines into a subset of active hosts.

With the binding between hosts and virtual machines relaxed, it is feasible to move virtual machines around, effectively defining a sub-pool of active hosts that is grown or shrunk to optimize utilization levels, as shown in Figure 6. The active pool contains the servers running application virtual machines. During periods of decreasing demand, when server utilization goes below a predefined threshold, the management application consolidates virtual machines into a smaller pool of active servers. Vacated hosts are parked, the equivalent of turning lights off in a room, and as in the lighting example, once a server is in the parked state it can't run applications.

Conversely, during periods of increasing demand, if the load average increases above a predefined threshold, the management application restarts one of the dormant servers in the parked pool and spreads out the virtual machines to provide relief to machines in the active pool.



Figure 6 – Dynamic Reconfiguration plus Server Parking.


Composite Usage Models

Combining the application of Intel Intelligent Power Node Manager power capping with ACPI S3 or S5 server parking is an example of a composite usage.

A composite power usage is analogous to approaches used in other knowledge domains, such as the making of blended whiskies (as opposed to single malt or single grain whiskies) or blended wines (as opposed to varietals), where different products are combined to achieve a specific flavor or effect. In medicine, it is not uncommon for physicians to combine several drugs to treat serious conditions and produce certain desired therapeutic effects.

The motivating use case for the composite approach in this section is to extend the power dynamic range to a degree comparable to the daily cycle dynamic range typical of common workloads. From our observations, a dynamic range of at least 5:1 is desirable because it can bring power consumption down to the same ballpark as the utilization level of traditional data centers, around 20 percent.

Placing servers in a low-energy state essentially changes the configuration of a pool of servers servicing a cloud workload, and hence we name this approach dynamic reconfiguration.

We have measured a number of current-generation servers in the lab. We found S5 power consumption in the range of 5 to 8 percent of peak server power consumption, while S3 power consumption hovers around 10 percent of peak power. The asymptotic value for the achievable dynamic range is the reciprocal of these numbers, or about 12:1 for S5 and 10:1 for S3. In practice it will be less than that, because at least one of the servers in the pool must remain active to be able to wake up the rest of the pool, unless a separate console is in use to perform this action.

Under the TANSTAAFL principle ("there ain't no such thing as a free lunch"), application of a composite usage is not free. It introduces complexity, which requires a certain process maturity. In addition, each technology element brought into the mix brings side effects.

These side effects need to be evaluated to ensure they won't interfere with the application, and when interference exists, measures are needed to neutralize the side effects.

The main benefits of dynamic reconfiguration are twofold:

  • First, dynamic reconfiguration can potentially bring the average power consumption of a cluster of servers down to a level commensurate with workload demand. If the average workload demand is about 20 percent, we would expect the average power draw to also be about 20 percent of peak usage for the cluster, even though servers individually can't go below 50 percent of peak power, even at idle.
  • Second, dynamic reconfiguration can bring significant reductions in energy use by lowering the average power demand.

The main observed side effect of dynamic reconfiguration is a slower response to demand spikes. The reason is that when an uptick in demand takes place and a server in the pool needs to be restarted, it might take as long as 15 minutes to bring a server from S5 to an operating state.

Let's run the following scenario as an example. Imagine a pool of servers supporting an application. Furthermore, assume each server has a performance yield of 1,000 transactions per second (TPS) and that recovery time from S5 is 15 minutes.

The tuning of this installation would start with the examination of a daily cycle looking for the fastest occurrence of a demand spike. Let's say that the trace shows a bump in demand from 8,000 to 13,000 TPS in 2 minutes. The goal here is to have just enough servers running to meet the demand at any given time to save as much energy as possible.

Before the bump occurs, the system operator is happy running with 8 servers. As soon as the bump hits, there is trouble: two minutes after the bump, there is a demand for 13,000 TPS but the extra servers do not come online until 13 minutes later. Meanwhile, with 8 servers servicing a demand meant for 13, customers start experiencing longer and longer wait times until the SLA goes down the drain.

A solution for this situation is to have extra servers powered on, in sufficient numbers to ride out the worst bump. In this case we'd need at least 5 extra servers online at all times, even though they are needed for only one or two instances during the day. If demand is met with only 3 or 4 servers for most of the day, an extra 5 seems too steep a price to pay just to ride out the spikes.

Now assume that by using S3 instead of S5 the recovery time is reduced to 1 minute. Since recovery is so much faster, it may be possible to have only one or two servers in reserve and still meet the SLA.
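A back-of-the-envelope sizing rule for the reserve, under the assumptions that demand rises linearly during the bump and that restarts are issued the moment the ramp begins, is sketched below: the reserve must absorb whatever demand growth arrives before restarted servers come online. Applied to the S3 case it gives a slightly more conservative answer (three servers) than the one or two suggested above; the exact number depends on how early restarts are triggered and on how much headroom the SLA allows.

```python
import math

def reserve_servers(ramp_tps, ramp_minutes, recovery_minutes, tps_per_server):
    """Extra servers that must already be online to cover a demand ramp,
    assuming a linear ramp and restarts issued at the onset of the ramp."""
    tps_per_minute = ramp_tps / ramp_minutes
    # Demand growth that arrives before any restarted server can help:
    uncovered_tps = tps_per_minute * min(recovery_minutes, ramp_minutes)
    return math.ceil(uncovered_tps / tps_per_server)

# The article's scenario: an 8,000 to 13,000 TPS bump over 2 minutes,
# with each server good for 1,000 TPS.
print(reserve_servers(5000, 2, 15, 1000))  # S5 recovery (15 min): 5 servers
print(reserve_servers(5000, 2, 1, 1000))   # S3 recovery (1 min): 3 servers
```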

What we have done is essentially create a third sub-pool of servers in standby to meet SLA requirements. Another approach to minimize the impact of maintaining the reserve sub-pool is to use these servers for low priority workloads and run them under an aggressive power cap to minimize their power draw. The management of the reserve pool is not static; if it looks like the reserve pool is getting exhausted due to a demand spike, the management application will start replenishing it to maintain a safety margin, with less urgency, because the servers in the reserve pool can be brought to work very quickly.

The pattern that emerges from this example is the notion of platoons: different contingents or sub-pools of servers backing each other to implement a global power management policy. For this reason we call this scheme for distributed power management platooning.

Figure 7 captures a generalized platooning framework, with parked states toward the right and active states toward the left and some in-between states. There is a choice of parking states offering different tradeoffs between power consumption and recovery time. Power capping can be applied to define other sub-pools and to trim power consumption as needed.



Figure 7 – Generalized Platooning Framework.


Platooning requires actively managing the server pools in each partition, either by transitioning the servers across different states, and hence moving them across platoons, or by moving workloads across platoons through the use of virtual machine migration technology.
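A compact way to think about this framework is as a set of named sub-pools, each with a characteristic power draw and readiness, plus the two kinds of moves just mentioned: changing a server's state (which moves it between platoons) and migrating virtual machines among servers in awake platoons. The sketch below encodes that structure; the platoon names and figures are illustrative, loosely based on numbers used elsewhere in this article, not a prescribed configuration.

```python
from dataclasses import dataclass, field

@dataclass
class Platoon:
    name: str
    power_fraction: float   # typical draw as a fraction of peak server power
    ready: bool             # can this platoon run workloads right now?
    servers: list = field(default_factory=list)

# Illustrative platoons, ordered from most to least ready.
platoons = [
    Platoon("active", 1.00, True),             # running application VMs
    Platoon("reserve (capped)", 0.65, True),   # online under an aggressive cap (illustrative figure)
    Platoon("parked S3", 0.10, False),         # about a minute to recover
    Platoon("parked S5", 0.08, False),         # on the order of 15 minutes to recover
]

def move_server(server, src: Platoon, dst: Platoon):
    """Transition a server between platoons (a power-state or capping change)."""
    src.servers.remove(server)
    dst.servers.append(server)

# VM migration is the other kind of move: workloads shift between servers that
# belong to ready platoons, which is how the active pool is grown or shrunk.
```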

In theory a platooning scheme can be made as complex as needed to meet most any power performance behavior. In practice a lot of weight needs to be given to simplicity. A scheme with three or more platoons can become difficult to manage: as platoons are added to achieve certain behaviors, additional platoons may need to be defined to counteract possible secondary effects.

Figure 8 summarizes the results of a platooning experiment performed during the Telefónica proof of concept. For a more detailed account, please refer to Chapter 22 of the book. The setup for dynamic reconfiguration is the simplest possible: the application runs on two servers.

Power demand in such an environment follows the daily ups and downs of many workloads. Machines with a high capping ratio perform well in this environment, but this may not be sufficient. We introduce the notion of composite usages, where we combine one or more usages, paired with their respective technologies, to amplify the power proportional computing effect, for instance combining Intel Intelligent Power Node Manager power capping with server parking, putting a server to sleep. An extra benefit from this approach is that it is possible to attain significant energy savings. The Telefónica proof of concept indicated an energy reduction of 27 percent with the smallest possible pool size of two. Gains are likely larger with larger pool sizes.

Composite usages have side effects; they bring operational complexity, and hence composite schemes need to be architected with care. In the example mentioned, the composite scheme reduces the system's capability to follow workload spikes. Possible solutions are to add reserve machines that can come online fast, which requires more power, or to use a shallower, more energetic parking state with faster recovery, which also requires more power.

For more information about cloud computing, please refer to the book Creating the Infrastructure for Cloud Computing: An Essential Handbook for IT Professionals by Enrique Castro-Leon, Bernard Golden, Miguel Gomez, Raghu Yeluri and Charles G Sheridan.


References

[REF-1] http://www.computerworld.com/s/article/9126920/Power_struggle_What_role_should_IT_play...

[REF-2] http://www.intel.com/content/www/us/en/cloud-computing/cloud-builders-provide-proven-advice.html

[REF-3] This particular experiment was conducted with an S5000SL platform provisioned with two Intel® Xeon® E5440 processors and configured with 8 GB of memory and one SATA hard disk drive.

[REF-4] http://www.barroso.org/


About the Contributors

Bernard Golden has been called "a renowned open source expert" (IT Business Edge) and "an open source guru" (SearchCRM.com) and is regularly featured in magazines like Computerworld, InformationWeek, and Inc. His blog "The Open Source" is one of the most popular features of CIO Magazine's web site. Bernard is a frequent speaker at industry conferences. He is the author of Succeeding with Open Source, (Addison-Wesley, 2005, published in four languages), which is used in over a dozen university open source programs throughout the world. Bernard is the CEO of Navica, a Silicon Valley IT management consulting firm.

Miguel Gómez is a Technology Specialist for the Networks and Service Platforms Unit of Telefónica Investigación y Desarrollo, working on innovation and technological consultancy projects related to the transformation and evolution of telecommunications services infrastructure. He has published over 20 articles and conference papers on next-generation service provisioning infrastructures and service infrastructure management. He holds PhD and MS degrees in Telecommunications Engineering from Universidad Politécnica de Madrid.

Raghu Yeluri is a lead architect with the Intel Architecture Group at Intel, with focus on virtualization and cloud architectures. He is responsible for understanding enterprise and data center needs, developing reference architectures and implementations aligned with Intel virtualization and cloud related platforms and technologies. Prior to this role, he has worked in various engineering management positions in systems development, focusing on service-oriented architectures in information technology.

Charles G. Sheridan leads Intel Labs Europe Sustainability and Energy research program focused on the application of information and communication technologies (ICT) to drive and enable the shift to a more sustainable economy and society. He has participated in several European Commission research projects in FP7 and consulted with the European Union (EU) Commission. He co-chairs the leading industrial consortia focused on sustainable computing with key companies. Charlie has worked with Intel for 17 years with roles in both automation and IT innovation before joining Intel Labs earlier this year.

Copyright © 2011 Intel Corporation. All rights reserved.

This article is based on material found in the book Creating the Infrastructure for Cloud Computing: An Essential Handbook for IT Professionals by Enrique Castro-Leon, Bernard Golden, Miguel Gomez, Raghu Yeluri and Charles G Sheridan.