Pets vs Cattle (and ITSM)

With the advent of cloud computing (whether private, public such as AWS or Rackspace, or hybrid), a popular meme has arisen: "treat your servers like cattle, not as pets".

This meme suggests that IT organisations should change their views of (and therefore their behaviours towards) servers in the cloud: rather than treating servers as favourite pets, they should act like farmers and view their servers as cattle. There are already several blog posts on this topic by authors such as Mark Needham, Greg Ferro, Massino and Simon Sharwood.

The slide below from Gavin McCance of CERN captures the meme in a single image. His presentation, titled “CERN Data Centre Evolution”, detailed the scientific organisation's 12,000-odd servers and its plans to manage them more efficiently.

From this slide, you probably now understand the meme:
  • If pets are sick, we nurse them back to health. If cattle are sick, we destroy them (sounds harsh, but we can spin up new servers and drop sick servers in minutes);
  • Pets sleep in or near the farmhouse (in our data centre, under our protection); cattle sleep out in the distant barn or field (in the provider's data centre, with little or no protection from us);
  • If you want a server that scales up, deploy a pet. If you want to scale out your servers, deploy cattle (public cloud providers prefer to provide many 'small' servers rather than a few 'big' servers);
  • Because their service models differ, the tools you use to manage them should differ as well (as recommended by Massino).

For a more practical explanation of this meme, Mark Needham provides a real-world example in his blog post.

So what does this mean for IT service management? As I mentioned earlier, a change in perceptions should lead to a change in behaviours, and I suggest some of the changes may include:

Less focus on incidents, more on problems (especially on applications).

Troubleshooting cattle servers becomes redundant, since new servers can be spun up and underperforming servers dropped, ideally using automation. IT staff should instead focus on event/incident trend analysis to identify problematic server builds, configurations and applications. Important pet servers will likely retain their need for attention, and reviews should be conducted to convert these pets into cattle where feasible.
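As a rough illustration of what trend analysis by server build could look like, here is a minimal Python sketch. The record fields, build names and threshold are all hypothetical; in practice the data would come from whatever your monitoring or ITSM tool exports.

```python
from collections import Counter


def incident_rate_by_build(incidents, fleet_sizes):
    """Return incidents per server for each build image.

    Dividing by fleet size stops a widely deployed build from
    looking bad simply because there are more servers running it.
    """
    counts = Counter(record["build"] for record in incidents)
    return {
        build: counts.get(build, 0) / size
        for build, size in fleet_sizes.items()
        if size > 0
    }


def problem_candidates(rates, threshold):
    """List builds whose incident rate exceeds the threshold, worst first."""
    return sorted(
        (build for build, rate in rates.items() if rate > threshold),
        key=lambda build: rates[build],
        reverse=True,
    )


# Hypothetical sample data: incident records and servers per build.
incidents = [
    {"server": "web-014", "build": "web-image-v7"},
    {"server": "web-022", "build": "web-image-v7"},
    {"server": "web-031", "build": "web-image-v7"},
    {"server": "web-102", "build": "web-image-v8"},
]
fleet_sizes = {"web-image-v7": 10, "web-image-v8": 40}

rates = incident_rate_by_build(incidents, fleet_sizes)
candidates = problem_candidates(rates, threshold=0.1)
print(candidates)
```

A build that shows up here is a candidate for a problem record against the image or configuration, rather than a series of incidents against individual servers.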

Lighter service asset and configuration management.

If you've struggled to provide a good configuration management service with accurate data from a CMDB, then you may want to reconsider the feasibility of the service. Public cloud providers provision servers and infrastructure in a 'black box' delivery style. You may know the server name and some configuration information, but you may not know where it is. I'd recommend moving your focus more towards users (demand management), applications, APIs and interfaces (that is, the relationships between users, applications and middleware, or the top end of the application stack).

Simplify change management

With the ability to safely and automatically spin up and shut down (cattle) servers/infrastructure, IT organisations should review these operations and their change models. As the risk profile of these operations reduces, lodging change requests should be as simple as either automating standard change requests (via your deployment system) or even reclassifying these changes as service requests. Reviewing and streamlining your change management process in this fashion will also support your organisation's ability to implement continuous delivery (and DevOps).
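One way to picture such a change model is as a small routing rule in the deployment system. The sketch below is only an assumption about how this might be expressed: the operation names, the "cattle"/"pet" labels and the returned categories are invented for illustration and would need to match your own change policy.

```python
# Hypothetical set of low-risk operations covered by a pre-approved change model.
PRE_APPROVED_OPERATIONS = {"spin_up", "shut_down", "replace"}


def classify_change(operation, server_kind, automated):
    """Route a change to a lightweight or a full change process.

    Returns "standard" for automated, pre-approved operations on
    cattle servers, and "normal" for everything else.
    """
    if (
        server_kind == "cattle"
        and automated
        and operation in PRE_APPROVED_OPERATIONS
    ):
        return "standard"
    return "normal"


print(classify_change("spin_up", "cattle", automated=True))
print(classify_change("spin_up", "pet", automated=True))
print(classify_change("resize_disk", "cattle", automated=True))
```

Only the first call is treated as a standard change; the pet server and the operation outside the pre-approved list both fall back to normal review.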

Check your bills regularly

As IT services move to the cloud, budgeting tends to focus less on capital expenditure and more on operating expenditure. Public cloud providers often have very simple pay-as-you-go charging schemes, which provide easy charge-back mechanisms for your customers. As your 'herd' of servers begins to grow (through the easy and rapid deployment of new servers), it is wise to regularly monitor the size and utilisation of the servers to ensure that you're not paying for servers that are no longer required. When managing new business and project demands with automated provisioning systems, it is all too easy to forget to clean up and decommission redundant servers.
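A regular review of this kind can be as simple as a script run against the provider's usage export. The following is a minimal sketch; the field names, thresholds and sample figures are assumptions rather than any provider's actual billing format.

```python
from dataclasses import dataclass


@dataclass
class ServerUsage:
    name: str
    avg_cpu_percent: float      # average CPU utilisation over the review period
    days_since_last_login: int  # rough proxy for "is anyone still using this?"
    monthly_cost: float


def decommission_candidates(servers, max_cpu=5.0, min_idle_days=30):
    """Return servers that look idle: low CPU and nobody logging in."""
    return [
        s for s in servers
        if s.avg_cpu_percent < max_cpu and s.days_since_last_login >= min_idle_days
    ]


def potential_monthly_saving(candidates):
    """Sum the monthly cost of the servers flagged for review."""
    return sum(s.monthly_cost for s in candidates)


# Hypothetical herd.
herd = [
    ServerUsage("web-001", 42.0, 1, 70.0),
    ServerUsage("poc-demo-03", 1.2, 95, 70.0),
    ServerUsage("test-batch-11", 0.4, 60, 35.0),
]

idle = decommission_candidates(herd)
print([s.name for s in idle], potential_monthly_saving(idle))
```

The output is a list to review with the server owners, not a list to delete automatically; a quiet server may still be required.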

Focus on single points of failure

With public cloud computing providers offering high availability services through virtualised technologies, Availability Management changes focus from component-level availability to the identification and elimination of single points of failure in IT services. IT services and applications could be redesigned to take advantage of the benefits that virtualised, 'cattle-like' servers offer.
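To make the idea concrete, a service can be described as a set of roles, each backed by one or more servers, and then checked for roles with no redundancy. This is a simplified sketch: the service layout is hypothetical, and the "zone" label is an assumption standing in for whatever failure domain your provider exposes.

```python
def single_points_of_failure(service):
    """Return the roles in a service that have no redundancy.

    `service` maps a role name to a list of (server_name, zone) pairs.
    A role is flagged if it has fewer than two instances, or if all of
    its instances sit in the same zone.
    """
    findings = {}
    for role, instances in service.items():
        zones = {zone for _, zone in instances}
        if len(instances) < 2:
            findings[role] = "fewer than two instances"
        elif len(zones) < 2:
            findings[role] = "all instances in one zone"
    return findings


# Hypothetical service layout.
storefront = {
    "web": [("web-001", "zone-a"), ("web-002", "zone-b")],
    "app": [("app-001", "zone-a"), ("app-002", "zone-a")],
    "database": [("db-001", "zone-a")],
}

spofs = single_points_of_failure(storefront)
print(spofs)
```

Here the web tier passes, while the app tier and the database are flagged for redesign.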

In summary, the meme to "treat your servers like cattle, not as pets" will probably introduce new perceptions and behaviours in our IT teams. In conjunction with this, I recommend that ITSM practitioners review their organisation's IT service management policies and procedures to ensure that they enable IT teams to gain maximum benefit from these investments in cloud computing (in other words, don't let your outdated procedures become delivery blockers). ITSM teams with limited resources and budgets often have to prioritise their work and concentrate on a few key areas. This meme will generally shift our focus from component-level, operational areas to service-level, tactical areas of IT service management.

