Let automation take away the tedium

Computer Weekly looks at how far the suppliers of systems management tools and associated products have progressed with automation

This article can also be found in the Premium Editorial Download: Computer Weekly: Future-proofing Gatwick Airport’s networks

The degree to which information technology increases automation, often now aided by machine learning and artificial intelligence (AI), is high on the list of benefits put forward by IT suppliers for the applications they promote to various industry sectors. However, how good is the IT sector at eating its own dog food? And how well is IT management being automated?

The management of IT systems often involves repetitive and tedious tasks. The challenge increases with the growing number of servers, user endpoints, appliances and other devices, exacerbated by the capability to virtualise almost all of these and deploy on resource-rich public cloud platforms. Automation, or digital labour, carried out by bots (software robots), has a part to play in streamlining all stages of the systems management process.

Traditional systems management suppliers are adapting their tools to support automation: BMC has just revamped its product range as BMC Helix Cognitive Service Management; IBM has its Enterprise IT Automation Services; Microsoft supports automation via its System Center Orchestrator as well as via PowerShell (its task automation and configuration management framework) with Service Management Automation; Quest’s Kace has some automation; and Spiceworks, the free, ad-funded toolset, supports automation of ticket routing and some other basic tasks. Relative newcomers, such as IPsoft, have built in automation from the outset.

Claims as to what can be achieved are impressive. IPsoft says, typically, it can automate 35% of a customer’s operations when it first goes live. This is achieved by applying machine learning from the outset, looking at historic data – for example the past 12 months’ helpdesk tickets. Over time, says IPsoft, automation can be increased to 70-80%, and in the best cases over 95%.

Two things are instrumental to early success with automation: well-documented standard operating procedures, and a comprehensive understanding of IT infrastructure recorded via an up-to-date configuration management database (CMDB). The lower the quality of either of these, the more effort automation will require.

The dynamic modelling and documentation of infrastructure is made possible by a range of tools, which are often included with systems management suites. There are also specialists in the area, such as RedSeal, SolarWinds and Ipswitch, which provide dynamic modelling of IT infrastructure and a feed into CMDBs. BMC has announced Helix Discovery, which will be available later this year.

One issue that is often raised with automation is the impact on human operators. In some industries, this can be a problem. With IT systems management, however, the problem is usually one of scale: human operators can be overwhelmed as reliance on IT increases. Automation can deal with routine and mundane tasks, highlighting anomalies and exceptions that require human attention via communications channels such as Slack or Skype. IPsoft claims virtual engineers can carry out 75% or more of the most basic tasks, and 30-40% of the more complex ones – impressive, but leaving plenty of the less tedious stuff for humans to do.

Another challenge is that both digital and human operators require privileged access to systems. Hackers also seek privilege, so any mismanagement of privileged credentials, such as their embedding in scripts to support automation, can be a gift to the bad guys. Suppliers such as Osirium ensure privilege can be delegated safely, both for their own automated IT management products and for the third-party products they interface with.
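To illustrate the point about credentials embedded in scripts, here is a minimal Python sketch that reads a secret from the process environment at run time instead. The function name get_credential and the variable name ROUTER_ADMIN_PASSWORD are assumptions made for this example. It is not how Osirium or any other supplier brokers access, only the simplest way of keeping a secret out of the script itself.

```python
import os


def get_credential(name: str) -> str:
    """Fetch a secret from the environment rather than hard-coding it."""
    value = os.environ.get(name)
    if not value:
        # Fail loudly: a missing secret should stop the automation,
        # not fall back to a default baked into the script.
        raise RuntimeError(f"credential {name!r} is not set")
    return value


# Hypothetical usage inside an automation script:
#   password = get_credential("ROUTER_ADMIN_PASSWORD")
```

The script can then be stored, shared and version-controlled without exposing the credential, which is supplied separately by whatever launches the job.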

Problem discovery

Early alerts to arising problems can reduce their severity and impact, and reduce the mean time to repair (MTTR). The usual way of finding out, and one which will not disappear, is to wait for users to report issues. In many cases, these will be specific to a certain user, for example a request for access to an application or a password reset. In other cases, a user report may presage a more serious problem that will affect many other users if not addressed.
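MTTR is simply the average time between a problem being reported and its resolution. As a rough worked example in Python, using three made-up incidents (the timestamps and the function name mean_time_to_repair are assumptions for illustration):

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (reported, resolved) timestamps.
incidents = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 11, 0)),      # 2 hours
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 0)),     # 1 hour
    (datetime(2024, 1, 10, 8, 30), datetime(2024, 1, 10, 11, 30)),  # 3 hours
]


def mean_time_to_repair(records):
    """Average elapsed time between report and resolution."""
    if not records:
        raise ValueError("no incidents to average")
    total = sum(
        (resolved - reported for reported, resolved in records),
        timedelta(),
    )
    return total / len(records)


print(mean_time_to_repair(incidents))  # 2:00:00
```

An earlier alert moves the start of the clock forward, which is why early detection can improve the figure even when the fix itself takes just as long.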

Initial interaction with users is generally via an IT helpdesk, many of which now support automation, at least for initial interactions. IPsoft’s 1Desk interacts with users via its cognitive agent, Amelia, which carries out routine fixes by teaming with virtual engineers. BMC has recently announced its Helix Digital Workplace for user self-service, and the bots involved are trained using historic service desk data.

Automation can go further and discover problems before users are even aware of them. IT devices are constantly generating machine data, dumped into log files, about their status and activity, and the volume can be many gigabytes per day. Most of it is meaningless background noise, but products such as Splunk IT Service Intelligence (ITSI) can trawl through the data and identify anomalies that indicate emerging problems and raise alerts.
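The idea of picking anomalies out of background noise can be sketched in a few lines of Python. This is a deliberately simple illustration that flags hours whose error count sits far from the average. It is not the method Splunk ITSI uses, and the sample counts, function name and threshold are assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(counts, threshold=2.5):
    """Return the indices of counts more than `threshold` standard
    deviations away from the mean of the series."""
    if len(counts) < 2:
        return []
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # a perfectly flat series has no outliers
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


# Hypothetical number of error lines per hour in one device's log.
hourly_errors = [4, 5, 3, 6, 4, 5, 4, 48, 5, 4, 3, 5]

print(find_anomalies(hourly_errors))  # [7]
```

A real product would have to cope with seasonality, many correlated data sources and far larger volumes, but the principle of learning what normal looks like and alerting on departures from it is the same.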

When an arising problem is recognised, action is required, generally via a request for change (RFC), a ticket logged in an IT service management (ITSM) system such as ServiceNow or BMC Helix Remedy.

Here, there is more scope for automation, through auto-classification, assignment, prioritisation and routing of incidents. Splunk’s ITSI interfaces directly with ITSM tools to raise RFCs.
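A minimal sketch of rule-based classification, prioritisation and routing might look like the following Python. The keywords, team names and priority thresholds are all hypothetical, and commercial ITSM tools are likely to use classifiers trained on historic tickets rather than a fixed keyword list.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    summary: str
    affected_users: int


# Hypothetical keyword -> (category, team) rules, checked in order.
RULES = [
    ("password", ("access", "service-desk")),
    ("vpn", ("network", "network-ops")),
    ("disk", ("storage", "server-ops")),
]


def route(ticket):
    """Classify a ticket, assign a team and set a priority."""
    text = ticket.summary.lower()
    category, team = "general", "service-desk"  # default when no rule matches
    for keyword, target in RULES:
        if keyword in text:
            category, team = target
            break

    # Priority rises with the number of users affected.
    if ticket.affected_users >= 50:
        priority = "P1"
    elif ticket.affected_users >= 5:
        priority = "P2"
    else:
        priority = "P3"
    return {"category": category, "team": team, "priority": priority}


print(route(Ticket("VPN down for the Leeds office", 120)))
# {'category': 'network', 'team': 'network-ops', 'priority': 'P1'}
```

Tickets that match no rule fall through to the service desk, which mirrors the handover of exceptions to human operators.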

Resolving incidents

IPsoft’s IPcentre has automations for more than 20,000 standard operating procedures carried out by virtual engineers. Administrators will often need to access several devices or systems to fix a problem. Automated tools will often be driving scripts, for example those developed using configuration management tools such as Puppet, Ansible or Chef. These open source tools are themselves evolving. For example, Puppet can automate various stages of systems management, including provisioning, configuration and patching.

Osirium can broker access to devices and systems without the administrators needing to know all the necessary privileged access credentials. This can either be via its own robotic process automation and task language, or by supporting other suppliers’ products. Its product is also used for high-volume requirements, such as automating the provisioning and update of mobile devices, where hundreds of parameters may need changing on thousands of devices.

Closing the loop

Once a problem has been resolved, effective root cause analysis (RCA) provides an understanding of why the problem occurred, and thus how to prevent recurrence. Splunk trawls through machine data to find causes. A recent Splunk report – Damage control: the impact of critical IT incidents – showed that more than 13% of all incidents happen due to failure to learn through RCA from previous incidents. IPsoft supports what it calls IPsoft probabilistic RCA. Osirium uses machine learning to trawl through previous tickets, aiming to prevent future failures.

Why artificial intelligence is gaining momentum in IT management

• Huge numbers of physical and virtual IT devices require the automation of routine IT management tasks, both for discovering and resolving problems.

• Successful automation requires a dynamic model of an organisation’s internal and cloud-based infrastructure, and well-documented standard operating procedures.

• Artificial intelligence uses machine learning to improve automated responses, interacts with users via automated helpdesks and helps seek out the root causes of problems.

• Human operators are not going to be replaced by automation, but will be focused on the most challenging tasks, working with automated tools via shared workflows.

• Traditional systems management suppliers are delivering automation, and specialists have also emerged, sometimes forming partnerships with the incumbents.

Furthermore, unlike humans, automated systems do not forget how previous problems were fixed, readily recalling previously used procedures. All actions are tracked and logged for later analysis, making fixes repeatable and driving further insights for continuous process improvement.

The systems management automation dividend

Automation in systems management makes use of the huge volumes of data produced by IT systems, far more than human operators could ever review. This data has considerable potential to drive machine learning, which many suppliers are only beginning to exploit. Automation in systems management is also essential to support the scale of IT operations, be it a multitude of virtual servers or thousands of user smartphones.

The automated processes that are being developed include workflow links that enable tasks to be passed to humans. The need for skilled human operators in IT management remains – automation serves to make their jobs less tedious and more rewarding.
