IT process automation and its impacts on IT service management

Introduction

IT organisations are under pressure to reduce idea-to-product cycle times while improving the service availability of the diverse range of systems and technologies under their charge. IT service management (from IT strategy to operations) cannot support this change if it continues to be underpinned by manually executed processes and activities. Contemporary IT service management (ITSM) incorporates concepts such as cloud, infrastructure as code and Continuous Delivery (CD), and must be able to manage the complexities of numerous elastic and dynamic IT environments that can change in size and location at short notice. Manually executing the underpinning activities and tasks will lead to a higher probability of errors and longer service provisioning times. Automation is a solution since it imposes consistency and reduces manual work that is tedious and error prone.

As an example, an Australian financial institution was embarking on a digital transformation program to maintain its competitive advantage and exceed customer expectations. As part of this program, the institution required assistance to identify the uplifts needed for its ITSM capability to support both mode 1 (the traditional mode of IT, emphasising safety and accuracy) and mode 2 (the exploratory mode, emphasising agility and speed with Continuous Delivery) as a transition state. This is commonly referred to as bi-modal or multi-modal IT (Nielsen, 2015). During the future state analysis of this organisation, it became apparent that tooling and automation uplifts would be a significant enabler to achieve this goal. As an example of automation applied successfully, Google manages its environments with a strong preference for automation wherever possible. Beyer, Jones, Petoff & Murphy (2016) stated that "part of our (Google's) preference for automation springs from our particular business challenges: the products and services we look after are planet-spanning in scale, and we don’t typically have time to engage in the same kind of machine or service hand-holding common in other organizations". While most IT organisations are not of the same scale as Google, this reference highlights valuable considerations for most IT enterprises.

Automation is a key enabler for agility since it "enforces consistency and reduces tedious, error prone manual work (in managing IT infrastructure). Furthermore, application development benefits by being able to reproduce all the necessary parts of the production environment to build, run and test their code on their workstations" (Kim, Humble, Debois & Willis, 2016, p. 114). So while automation has been identified as a key enabler for digital transformation and improving the agility of ITSM, this article seeks to provide clarity on the classes of IT process automation (ITPA), how ITPA impacts ITSM and considerations for an effective implementation of ITPA with ITSM.

IT Process Automation

Justice (2017) and KPMG (2017) have identified three classes of IT process automation:

Class 1: Basic process automation
The simplest form of IT process automation is where the tools tend to follow a series of simple tasks and pre-defined rules. This is also commonly known as run book automation. The inputs to the process are well known and defined. The tasks are primarily repetitive in nature with little to no use of artificial intelligence. Examples include data entry, IT event, incident and problem management, IT request fulfilment and transaction processing where limited subjective thinking or judgement is required. A rules library is used and requires management.
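
As a minimal sketch of class 1 (run book) automation, the Python below models a rules library as a list of condition/action pairs and applies the first rule that matches an incoming event. The event fields, rule names and actions are hypothetical and chosen for illustration only; they are not taken from any particular tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """One entry in the rules library: a condition and the task to run."""
    name: str
    matches: Callable[[dict], bool]
    action: Callable[[dict], str]


# Hypothetical rules library; event fields and actions are illustrative only.
RULES = [
    Rule(
        name="disk-space-cleanup",
        matches=lambda e: e.get("type") == "disk_usage" and e.get("percent", 0) >= 90,
        action=lambda e: f"purge temp files on {e['host']}",
    ),
    Rule(
        name="password-reset",
        matches=lambda e: e.get("type") == "password_reset_request",
        action=lambda e: f"send reset link to {e['user']}",
    ),
]


def run_book(event: dict, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the action of the first matching rule, or None to hand off to a human."""
    for rule in rules:
        if rule.matches(event):
            return rule.action(event)
    return None


if __name__ == "__main__":
    print(run_book({"type": "disk_usage", "host": "app01", "percent": 95}))
    print(run_book({"type": "unknown"}))
```

Because every input and outcome is pre-defined, an event that matches no rule returns None and falls through to a person. The rules list itself is the library that requires ongoing management.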

Class 2: Enhanced process automation
A more complex form of IT process automation that involves the interpretation of media such as video, images and audio (interpretation of natural language and voice conversation). The inputs are less structured and harder to interpret, and as a result require some machine learning and reasoning. This class is often seen in IT service desks and employs capabilities like chat bots and digital personal assistants. Digital personal assistants can be employed to provide first line support to customers, and to support (human) second line support agents by retrieving relevant knowledge base articles/procedures for escalated calls. Some examples from the IT industry cited by Justice (2017) included Google Now, Google Assistant, Amazon Alexa, Microsoft Cortana and IPsoft Amelia.

Class 3: Cognitive automation
This is the most complex form of automation where inputs are less defined (e.g. social media) and reasoning requires more judgement as it relates to an output/outcome. Some activities include data discovery, data analytics, financial accounting, providing tax advice and fraud detection. Justice (2017) provided an example where IBM Watson was configured to support auditors in performing a credit audit. This in turn meant less reliance on human data sampling and correlation. A key characteristic is the ability to use inputs and apply probabilistic thinking, as we see in self-driving cars. Cognitive automation (CA) can visualise and interpret video imagery, and make deterministic judgements on what is on the road in front of the car (e.g. interpreting an animal on the road or reading a street sign). The automation determines the current traffic patterns and then executes one or more commands to the vehicle to take some course of action.

While class 1 and some elements of class 2 ITPA are commonly employed in today's ITSM, this article aims to outline the potential impacts and opportunities to employ CA to improve the effectiveness and efficiency of IT service delivery.

How does ITPA impact ITSM?

To illustrate the potential impacts and opportunities in ITSM, a series of use cases involving CA and Continuous Delivery is presented in alignment with the four IT4IT value streams: Plan, Build, Deliver and Run (The Open Group, 2017). The phases of IT4IT were chosen due to the framework's ability to "support real world use cases driven by the digital economy" and to clearly present the fundamental patterns of ITSM.

1. Plan: Strategy to Portfolio
CA enables decision support of IT investments by leveraging information from external sources (e.g. vendor pricing) and internal usage trends from an operational data warehouse/lake. Policy compliance is supported by continuous security assessments being delivered against the latest standards and guidelines. Configuration items with detected anomalies are automatically quarantined from the remainder of the production environment.
Portfolio demand is bolstered by CA monitoring and responding to dynamic fluctuations in IT service consumption in line with policy compliance (e.g. provisioning extra software licences). CA continuously rationalises the service portfolio based on consumption and utilization patterns and trends detected across various data sources (e.g. dynamically shifting workloads across various cloud providers). CA can provide real time accounting records to show the ongoing “investment & spend” in each service/application.

2. Build: Requirement to Deploy
Requests for change (RFCs) are raised by commands issued to chat bots or voice commands with CA. RFCs are auto-created by the Continuous Delivery tooling as part of the application development release cycle and copied/transferred into an operational data warehouse/data lake for centralised record keeping, event correlation, monitoring and reporting purposes.
Changes are continuous and incremental in nature. Depending on the nature of the change, it can be deployed at any time (24x7) as CA leverages infrastructure as code to deploy and destroy new environments. Build & test activities are performed early in the release cycle through the CI/CD tools with oversight by CA. If required, CA will interact with peers or leadership of the developer to seek approval for higher risk changes. Conference calls/bridges can be initiated to join respective parties into the review discussion.
CA inspects change records for potential impacts against other change records either scheduled in the Forward Schedule of Change or currently being deployed to avoid conflicts. CA adjusts the deployment windows of the requested change or other changes to reduce the risk of an outage or service degradation. Chat bots or voice commands with CA can be leveraged to learn what changes are currently being released or scheduled for a particular time period.
Automated or manual post deployment validation is carried out to ensure the success of the change. For customer facing applications, CA monitors social media channels for potential customer complaints, diagnoses the responsible change and commences rollback. CA monitors the success of changes to determine if new candidates for standard changes exist and provides recommendations. Finally, CA decommissions any test environments used prior to releasing the change into production.
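
The conflict check against the Forward Schedule of Change described in this value stream can be sketched as follows. This is a minimal illustration that assumes each change record carries a deployment window and a set of affected configuration items. The Change class, its field names and the change IDs are hypothetical, and a real CA engine would weigh far more signals than a simple overlap test.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Change:
    """Hypothetical change record: a deployment window plus affected configuration items."""
    change_id: str
    start: datetime
    end: datetime
    config_items: frozenset


def conflicts(requested: Change, schedule: List[Change]) -> List[str]:
    """Return IDs of scheduled changes that overlap in time and share a configuration item."""
    clashing = []
    for other in schedule:
        # Two windows overlap when each one starts before the other ends.
        overlaps = requested.start < other.end and other.start < requested.end
        shared = requested.config_items & other.config_items
        if overlaps and shared:
            clashing.append(other.change_id)
    return clashing


if __name__ == "__main__":
    schedule = [
        Change("CHG1", datetime(2017, 6, 1, 22), datetime(2017, 6, 2, 0), frozenset({"db01"})),
        Change("CHG2", datetime(2017, 6, 1, 22), datetime(2017, 6, 1, 23), frozenset({"web01"})),
    ]
    requested = Change("CHG9", datetime(2017, 6, 1, 23), datetime(2017, 6, 2, 1),
                       frozenset({"db01", "app01"}))
    print(conflicts(requested, schedule))
```

A non-empty result is the trigger for adjusting the deployment window of the requested change or of the changes it clashes with.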

3. Deliver: Request to Fulfil
The requestor is initially authenticated through standard identity and access management or by natural language processing. Services can be requested from a dynamic self-service tool. The list of requestable items is tailored based on the requestor’s interaction with CA or the most popular request items for their job role. The requestor is presented with a series of related requestable items based on past buying trends or insights on current projects in their portfolio. When the service is requested, the requestor receives an estimated time of delivery with pricing information. The request is approved based on information provided in the form of an actioned hyperlink, scanned fingerprint or voice signature from the approver’s mobile device. Approval of the request triggers CA to send a fulfilment command to the orchestration engine to commence fulfilment.
The orchestration tool initiates service / pattern provisioning / modification / deletion. Should the new request endanger pre-configured capacity limits, CA will initiate calls to other orchestration engines to increase capacity using predefined patterns. Any issues with fulfilment are alerted to CA, which determines if the issue can be contained and resolved leveraging a knowledge base. CA may interact with the requestor to provide alternative options if the requested item is not available.
The IT finance tool is updated with the final request fulfilment information including the billed amount. The need for a traditional CMDB will reduce & be superseded by an operational data warehouse capturing real-time request events from various sources including cloud providers and other CA. CA interacts with the requestor to gauge their level of customer satisfaction with the fulfilment service. Exceptional feedback is escalated to customer relationship managers.
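
The capacity check that precedes fulfilment in this value stream can be sketched as below. It is a minimal illustration only: it assumes capacity is counted in whole units and that the pre-configured limit is expressed as a percentage of total capacity. The function name plan_capacity and the default 80% limit are hypothetical.

```python
def plan_capacity(requested: int, used: int, capacity: int, limit_percent: int = 80) -> int:
    """Return the extra capacity units to provision before fulfilling the request.

    Zero means the request fits within the pre-configured limit.
    """
    projected = used + requested
    # Compare in integers to avoid floating point rounding:
    # projected / capacity <= limit_percent / 100
    if projected * 100 <= capacity * limit_percent:
        return 0
    # Smallest capacity that keeps projected usage at or below the limit (ceiling division).
    needed = -(-projected * 100 // limit_percent)
    return needed - capacity


if __name__ == "__main__":
    # 70 units in use, 15 requested, 100 available, limit at 80% of capacity.
    print(plan_capacity(requested=15, used=70, capacity=100))
```

A non-zero result is the point at which CA would call another orchestration engine to add capacity before the original request proceeds.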

4. Run: Detect to Correct
Incidents are identified by the monitoring tools, user interactions, social media or other trusted CA engines. The incident records are copied/transferred into an operational data warehouse/lake for centralised record keeping, event correlation, monitoring and reporting purposes.
P1/P2 incidents are handled with high priority, with CA assisting in limiting business impact and establishing conference calls between on-call personnel and major incident managers. Incident communications are either developed and disseminated automatically by CA or drafted for human review and publishing. CA interrogates internal knowledge bases and supplier/vendor knowledge bases to leverage appropriate incident resolution articles. If required, CA communicates with other known, trusted CA engines to coordinate resolution activities. Support personnel leverage ChatOps, chatbots and/or CA to conduct detailed investigation and diagnosis with information within the operational data warehouse/lake.
Incidents are auto-escalated by CA if they appear unlikely to be resolved within required service targets. In resolving low impact incidents, CA deploys and destroys IT environments rather than conducting diagnosis and fixing issues. Clear incident management guidelines are required in case manual (human) intervention is required to modify a production environment that is normally accessed through secured CI/CD tooling.
CA conducts regular scans across the operational data warehouse/lake used to store and report trends on all ITSM records, metrics, events and unstructured machine data. The data warehouse/lake enables machine learning, trend analysis and anomaly detection. As a consequence, the role of a traditional network operations centre changes significantly: it will need to become a data-focused technical command centre concentrating on real time BI analytics and response.
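
To make the anomaly detection step concrete, the sketch below flags unusual values in a series of daily incident counts by comparing each value with the mean and standard deviation of a trailing window. It stands in for the far richer machine learning a CA engine would apply. The function name, window size, threshold and sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(counts: Sequence[float], threshold: float = 3.0, window: int = 7) -> List[int]:
    """Return indexes whose value lies more than `threshold` standard deviations
    from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history: any different value is treated as anomalous.
            if counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical daily incident counts with one spike on day index 7.
    daily_incidents = [10, 12, 11, 9, 10, 11, 12, 40, 11]
    print(find_anomalies(daily_incidents))
```

The trailing window keeps the baseline current as service consumption shifts, at the cost of a spike briefly inflating the baseline for the days that follow it.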

Considerations for employing ITPA with ITSM

In his research on ITPA, Fung (2014) concluded that ITPA would deliver the following benefits:

IT service repeatability and consistency
IT service predictability
Better IT services integration
Increased IT service productivity
Increased IT services satisfaction
Reduced IT risk
IT service cost effectiveness
Improved business performance

Further to this, KPMG (2017) stated that automation “increases flexibility to accommodate volume fluctuations and boosts productivity and reduces effort spent on repetitive tasks”. While there may appear to be clear benefits to implementing ITPA, the implementation of new ITPA or the enhancement of existing ITPA within an organisation raises a series of considerations across the IT operating model. Some of these considerations include:

Integrated vision on how ITSM needs to change to leverage ITPA
The scale of impact across ITSM means an integrated, customer centric vision and strategy is needed for the effective adoption of ITPA. While an IT organisation could possess a clear understanding of traditional delivery methods and ITIL based ITSM, its understanding of Continuous Delivery and ITPA (and their implications) could be low. To establish this vision and strategy, it is important to secure executive sponsorship, establish shared goals, integrate with existing governance structures and programs, collaborate with various IT and governance, risk and compliance communities, and develop a change strategy.

Changes for some IT roles

Service Desks
Service desks will employ digital service agents/bots to manage the majority of calls and enquiries. The human service agents provide administration and governance over the digital service agents, tracking for potential problems and new feature requests for product managers. Digital service agents may not only provide first line support but also assist human 2nd and 3rd level support engineers in completing repetitive IT tasks or assisting in research and investigations.
Managed Service Providers
For a number of years, IT organisations have leveraged managed service providers and outsourcers to provide IT services commonly to either reduce IT operating costs or to free up expensive in-house resources to focus on high value work. The rise of ITPA threatens the value proposition for managed service providers as ITPA promises to provide the same service without the need to manage the supplier relationship with an external entity where services may be provided from a distant locality. Justice (2015) considers ITPA to bring about “the Death of Outsourcing.” However in his research, Justice also found that "not only are companies like IPsoft, IBM Watson and Automation Anywhere providing the core technologies for cognitive automation, many of the world’s largest outsourcers like Wipro and TCS have introduced their own cognitive platforms which are poised to transform the labour-centric offshore services landscape to one that resides in the cloud."

IT staff need to be re-skilled

Once established, the new vision of multi-modal ITSM incorporating ITPA will need to be communicated to staff to build awareness. It is important to secure executive buy-in that there will be short term pain (cost, time and schedule) for the long term gain, and that the executives' role will be to support teams in the appropriate application of ITPA. People should be trained on contemporary practices such as Agile, Continuous Delivery and Lean to empower them with the skills to enable a multi-modal IT environment (even if it is a temporary IT delivery model). Supplying practice coaches can help clarify the required shift in roles and responsibilities as teams move towards Continuous Delivery. Furthermore, people should be educated on the three classes of IT process automation, including the strengths and weaknesses of each class, so teams can make informed decisions on the employment of ITPA.

Beware of complacency

Fung (2014) quoted Singh et al. (2009) who found that “the extensive use of automated systems has created a new concern whereby IT staff are not capable to detect an automation malfunction (termed as ‘automation-induced complacency’)”. The study conducted by Singh et al. (2009) showed that high static automation reliability increased automation-induced complacency, and that perceived mental workload reduced from pre-test to post-test sessions under the high static reliability condition. Singh et al. (2009) went on to suggest applying adaptive automation to remediate this challenge. Adaptive automation is the "modification of functionality, level or number of automated systems in real time to enable the resetting of the task environment based on evolving situational demands."
Further to this, the increasing exposure to IT process automation and gradual decline of knowledge and experience in executing manual tasks may expose an IT organisation to prolonged outages in the event of disaster. In such events, disaster recovery plans typically employ manual tasks to fulfil processes which may now be automated. As more automation is employed and experience in the corresponding manual tasks declines, the time to restore service may extend to unacceptable levels as tacit knowledge has been lost.

Customer service

ITPA has delivered both negative and positive consequences on the level of personal touch from IT staff to their customers. Fung (2014) observed the negative effect was visible "especially in large call centres or service desks whereby some of the IT works have been automated e.g. passwords reset automation". Contrary to this, Fung (2014) quoted Adapt One (2010) who stated that "when tasks are accomplished automatically, IT staff have more time and energy devoted toward customer satisfaction. Moreover, with fewer mistakes made this will enhance the relationships with customers."

Revised processes, roles and responsibilities

To effectively adopt ITPA across an organisation, Enterprise Architecture should be engaged to provide guidance on the process, information and tooling architecture to achieve IT investment goals. Considerable engagement between ITSM process owners with automation/tooling owners would be required to ensure alignment, clear ownership and articulated roles and responsibilities between the various IT teams and other business units (e.g. Governance, Risk and Compliance (GRC), Finance and Procurement). The parties should collaborate to develop guidance on a single set of ITSM processes and ITPA tooling to support agile IT service delivery, and in turn, reduce confusion and increase service efficiency and effectiveness.

The adoption of Continuous Delivery at scale needs to coincide with application teams taking accountability for service stability of infrastructure or platform services they have consumed and on which their applications are deployed (with I&O accountable for common/shared capabilities and infrastructure). It will be critical for application team KPIs, performance management and incentives to align to this new construct as this represents a significant shift/change. This change may present challenges from a sourcing perspective where the current application run model in an organisation has a reasonably high degree of separation between onshore internal development and offshore application support teams.

Governance, risk and compliance

Change controls need to reflect a move to a highly automated and standardised delivery model, reducing approval authority levels & reliance on pre-prod centralised control and governance. As an example, the majority of change requests move from normal to standard; the focus of GRC bodies shifts towards assurance and confidence in patterns, automation scripts and APIs; and non-functional requirements are considered in the infrastructure-as-code application templates. The mix of Change Management KPIs shifts (e.g. number of failed changes becomes less critical as speed and frequency of roll-backs improve; change reviews rely on automated testing with peer reviews and post implementation verification is automated). The shift towards a more automated change control process and its benefits are supported by data presented in the 2017 State of DevOps Report (Forsgren et al., 2017), which shows that high performing IT organisations operate a change approval process with the lowest amount of manual activities (see Table 1 below).
Table 1: Percentage of work that is done manually, by performance group (Forsgren et al., 2017)

Real time IT decision making

Continuous Delivery pipelines supported by ITPA facilitate the rapid creation and improved data quality of ITSM records. Underpinned by an operational data warehouse/lake, ITPA is able to easily identify key trends and detect anomalies in security controls, fluctuations in IT service demand and consumption and customer satisfaction. The increased volume and quality of information enables better decision making in service strategy and portfolio management and is leveraged by ITPA to then dynamically tune the customer facing service catalogue and offerings. Further to this, the majority of IT budget expenditure should shift from CAPEX to OPEX and therefore facilitate the rapid approval of IT spend when transferring dynamic workloads across multiple cloud providers.

Conclusion

There is increasing interest in the adoption of Continuous Delivery, particularly in enterprises for enabling digital transformation. On this topic, there is considerable information and recommendations from various sources for organisations to carefully manage the cultural aspects of implementing Continuous Delivery. This is due to the requirement for traditionally siloed IT teams to unite and collaborate on a new and agreed value stream (including a set of processes, new roles and responsibilities, reporting and tooling to support the Continuous Delivery pipeline). Acknowledging that Continuous Delivery can provide a significantly positive change to ITSM, it is the author's opinion that an equal if not greater impact on ITSM will come from the requirement to adopt ITPA, particularly the most advanced form - cognitive automation.

This article has introduced ITPA, highlighted some implications across ITSM and presented some considerations for implementation. Contemporary IT service management can no longer tolerate the errors and delays associated with manually executed activities in order to meet today’s customer demands. Therefore it is recommended that IT organisations seek and address those opportunities to enhance IT’s value chain, and reduce operational risks with greater employment of automation.

References

Adapt One. (2010). Automate Processes for a Competitive Edge. White Paper, Adapt One Company, London.
Beyer, B., Jones, C., Petoff, J. & Murphy, N. (2016). Site Reliability Engineering. O’Reilly Media, Inc. ISBN 978-1-491-92912-4
Cognitive Computing. (2017). Retrieved February 19, 2017 from https://en.wikipedia.org/wiki/Cognitive_computing
Forsgren, N., Humble, J., Kim, G., Brown, A. & Kersten, N. (2017). 2017 State of DevOps Report. Puppet and DevOps Research & Assessment. Retrieved June 10, 2017 from https://puppet.com/resources/whitepaper/state-of-devops-report
Kim, G., Humble, J., Debois, P. & Willis, J. (2016). The DevOps Handbook. IT Revolution. ISBN 978-1942788003
KPMG. (2016). The Creative CIO's Agenda: Six big bets for digital transformation. Retrieved March 13, 2017 from http://www.kpmg-institutes.com/institutes/advisory-institute/articles/2016/08/digital-transformation-six-big-bets.html
KPMG. (2017). Robotics and Cognitive Automation. Retrieved February 19, 2017 from https://home.kpmg.com/in/en/home/services/advisory/management-consulting/robotic-process-automation.html
Justice, C. (2015). Cognitive technology and the automation of everything. Retrieved February 21, 2017 from http://www.cio.com/article/2977565/robotics/cognitive-technology-and-the-automation-of-everything.html
Justice, C. (2017). What type of cognitive automation is best for your enterprise? Podcast retrieved February 19, 2017 from https://www.kpmg-institutes.com/institutes/advisory-institute/events/2017/02/podcast-best-cognitive-automation-type.html
Nielsen, K. (2015). Bimodal IT: How to be digitally agile – without making a mess! Gartner Research ID:G00268866.
Singh, A.L., Tiwari, T. & Singh, I.L. (2009). Effects of Automation Reliability and Training on Automation-Induced Complacency and Perceived Mental Workload. Journal of the Indian Academy of Applied Psychology 3: 9-22.
The Open Group. (2017). IT4IT Overview. Retrieved February 21, 2017 from http://www.opengroup.org/IT4IT/overview

