Have service assurance tools finally come of age?

Vendors have long promised that service assurance tools could monitor across IT systems, but they've always fallen short. Are these tools finally growing up?


Here’s a concept: Whether or not a specific link on your network is healthy is the least of your worries. In fact, service assurance (SA) vendors warn that networking is only a tiny fraction of what can go wrong behind an application, yet poor user experience still falls at least partially on the shoulders of the networking team. That’s where SA tools come in.

SA tools monitor across IT infrastructure and report into a single console where the information can be analysed to track both the root cause behind poor application performance and troubled end-user experience. The idea is to take monitoring completely away from specific infrastructure elements, such as networks, storage, servers, virtual machines and databases, and instead examine the interdependencies among these systems.

“The problem is not just the network or the servers; it’s that and everything in between,” said Steve Shalita, vice president of marketing at management and monitoring company NetScout. Multi-tiered applications, for example, depend on a collection of middleware, servers and databases that can all cause problems, he said.

What’s more, the emergence of virtualisation and converged storage/data centre networks has only increased the need for correlated event analysis.

“Ten years ago you had Fibre Channel SAN, and if there was a problem, you knew what HBA (host bus adapter) was attached to a specific server, and you knew which application was affected,” said Bob Laliberte, senior analyst at Enterprise Strategy Group. “Now servers, networks and storage are all interdependent, and with virtualisation, you need new tools that are able to accommodate a dynamic infrastructure.”

But tossing aside elemental monitoring for an integrated approach may not be so easy. For one thing, users question whether there is really any one tool good enough to handle the job.

“I am not aware that anyone has come up with a magic bullet,” said Carl Mazzanti, vice president of network strategies at systems integrator eMazzanti Technologies. “The number of vendors you have to be able to interoperate with in order to make this work is so high. Think about how many firewall companies and disk manufacturers [in addition to switch, router, server and storage vendors] you would need to work with.”

Even if you could build a tool that talked to every system, in many cases, individual monitoring tools fall short or can be difficult to manage, so some users question the point of integrating their information.

“All network monitoring tools are flawed from the perspective that unless there is a custom signature and you have the resources to create a solid baseline, you can’t get much done,” said a network engineer at a multinational consulting firm, explaining that most reporting from these tools can be so overwhelming that it is never read or analysed. “A simple collection of data requires a little tuning and a lot of massaging, so you need a tool that can do this now across all of that reporting. I haven’t seen one that exists yet.”

Even more troubling to users is the suspicion that the term service assurance is simply a rebranding of technology that has been tried in the industry for decades but has always ended up as a shelved project in the IT shop.

“Nearly 20 years ago, CA was selling Business Process Views as part of Unicenter,” said Rob England, an IT consultant and creator of the IT Skeptic blog. “Nowadays all the vendors promise a service-level view of status in their monitoring tools, and a service entity-type in their CMDB. It runs well in a simple demo, but it is either too expensive to set up or manage for the majority of organisations. In general, I think it is a tech geek fantasy of a magic tool solution to a very difficult problem.”

With so much skepticism, why bother with SA tools?

IT managers might be more willing to invest in SA tools if they could prove return on investment. And that’s not impossible if these tools actually work and user-facing applications suddenly begin to perform better.

What makes SA tools different from basic monitoring tools is that they provide information about IT functions to the business side of an organisation as well as the IT shop, aiming to better support mission-critical applications, or the applications that would most hurt business productivity if they went down.

SA tool users start by identifying these applications and then creating service models and baselines to measure them by.

So, for example, in supporting a customer relationship management (CRM) application, the SA tool would take into account Oracle on the back end, WebSphere for the front end, tools for security and network identity, as well as all of the servers and network links that support these.
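A service model of this kind can be thought of as a simple dependency map from business services to the infrastructure underneath them. The sketch below is illustrative only, with hypothetical component names; it shows the basic idea of answering "which services are affected if this component fails?":

```python
# Minimal sketch of a service model: map each business service to the
# infrastructure components it depends on, then answer "which services
# are affected if component X fails?" All names are hypothetical.

SERVICE_MODEL = {
    "CRM": {
        "database": "oracle-db-01",
        "app-server": "websphere-01",
        "identity": "ldap-01",
        "network": ["core-switch-01", "edge-router-02"],
    },
    "Billing": {
        "database": "oracle-db-01",
        "app-server": "jboss-03",
        "network": ["core-switch-01"],
    },
}

def components_of(service):
    """Flatten a service's dependency map into a set of component names."""
    comps = set()
    for dep in SERVICE_MODEL[service].values():
        comps.update(dep if isinstance(dep, list) else [dep])
    return comps

def impacted_services(failed_component):
    """Return the business services that depend on a failed component."""
    return sorted(s for s in SERVICE_MODEL if failed_component in components_of(s))

print(impacted_services("oracle-db-01"))   # shared database: both services hit
print(impacted_services("edge-router-02")) # only CRM depends on this link
```

In a real SA product the model would be discovered and kept current automatically, but the impact-analysis question it answers is the same.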

CA’s service assurance tools, like those from most vendors, set “an intelligent baseline that understands what performance looks like at 8 a.m. Monday and how that’s different than Friday at 4 p.m.,” said Patrick Ancipink, vice president of marketing at CA. The tools then use that information to seek out anomalies.
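One common way to implement that kind of time-of-week baselining, sketched here as an assumption rather than any vendor's actual method, is to keep per-slot statistics for a metric and flag readings that fall far outside the learned range for that slot:

```python
from collections import defaultdict
from statistics import mean, stdev

# Sketch of time-of-week baselining: learn per-(weekday, hour) statistics
# for a metric (e.g. response time in ms), then flag readings that sit
# far outside the normal range for that slot. Threshold is illustrative.

class Baseline:
    def __init__(self, z_threshold=3.0):
        self.samples = defaultdict(list)   # (weekday, hour) -> past readings
        self.z_threshold = z_threshold

    def learn(self, weekday, hour, value):
        self.samples[(weekday, hour)].append(value)

    def is_anomaly(self, weekday, hour, value):
        history = self.samples[(weekday, hour)]
        if len(history) < 2:
            return False                   # not enough history to judge
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            return value != mu
        return abs(value - mu) / sigma > self.z_threshold

b = Baseline()
for v in (100, 105, 95, 102, 98):          # typical Monday 8 a.m. readings
    b.learn(0, 8, v)
print(b.is_anomaly(0, 8, 101))             # within the learned range -> False
print(b.is_anomaly(0, 8, 400))             # far outside the baseline -> True
```

The point of slotting by time of week is that 400 ms might be perfectly normal at Monday 8 a.m. peak but anomalous on a quiet Friday afternoon; a single global threshold cannot capture that.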

Different companies, different approaches, but which is right?

No one takes issue with the idea of seeking out anomalies. Users are more concerned with how these tools will reach across systems. Some vendors offer SA tools that include home-grown monitoring applications, while others seek to funnel information from existing monitoring tools into a joint console for analysis. Which tool to choose depends on the existing monitoring investment.

“If [users] just made an investment in individual domain tools, it’s going to be hard to justify replacing all of that and bringing in something new,” said Laliberte. On the other hand, tools that are built to work together for application support may be more effective.

CA’s SA strategy is made up of a patchwork of monitoring tools the company has either acquired or developed over the years, including network monitoring from its NetQoS acquisition and application performance management from its Wily acquisition. Those tools work alongside the company’s Spectrum network infrastructure management tool, which looks at everything from NetFlow and packet data to line code and application response time. The SA console then pulls the information into a series of maps and impact graphs for both root cause analysis and predictive modelling.

For Zenoss, an open source monitoring and management software provider, the ability to adapt to working with any existing system and domain-based monitoring tool is its biggest advantage.

“We can talk to any system out there, whether it’s via data protocol like SSH or application protocols like Apache consoles or JBoss. On the virtualisation front, we’ve gone further to manage Cisco UCS; we talk to VMware vCenter, to Puppet and to OpenStack,” said Floyd Strimling, a cloud technical evangelist for Zenoss. “We can monitor the application stack, the server stack, the storage stack, the virtualisation stack, network components and speciality components, like environmental systems and power.”

In fact, Zenoss purposely uses existing network monitoring so as to avoid “recreating the wheel,” said Strimling. “We haven’t gone into wanting to become Cflow or Jflow—or any flow. We can gather that data and bring it into the system via partnerships with Infoblox or Plixer. There are certain things in networking that are well defined,” he said.

While visibility across the IT spectrum is the defining factor of an SA tool, network monitoring itself plays a crucial role—specifically packet sniffing, or deep packet inspection (DPI).

NetScout—which specialises in packet sniffing—places its tools across the IT spectrum. One monitoring tool sits physically in the data centre, looking at transactions in real time. Virtual appliances sit in each virtual server, and another virtual appliance can live in a Cisco Integrated Services Router (ISR). But NetScout tools look at the packet as it travels through all of these areas. “We see the packet as the source of intelligence. It is the one thing that touches every aspect of service delivery; it touches every single piece of technology that makes an application work,” said Shalita.

Virtualisation and the cloud make SA monitoring even more complex

When it comes to virtualisation and the cloud, following data paths isn’t so easy. The biggest complaint among networking professionals is the lack of traffic visibility in a virtual environment. In fact, even systems teams have a problem with visibility.

“We put a monitoring agent on every virtual machine—on the host and the client application—but it can only tell you so much,” said Mazzanti.

And virtualisation doesn’t stop at the server. As companies build out private clouds, they are moving toward using what is basically a network hypervisor in which the control plane of the network is decoupled from the physical components so that network managers have more granular control over resources. These so-called network hypervisors will also have to provide visibility for SA tools in order to ensure application performance, said Strimling.

Without that kind of visibility, moving applications into both private and public clouds will be impossible. At this point, though, most cloud providers are not focused on the level of end-user experience that enterprises and even smaller companies need. So in addition to making very complex internal reporting systems work, SA users will have to place their monitoring tools in the cloud and integrate this information into their management consoles—and that’s a long way off.

“What’s going to have to happen is that cloud providers will have to make investments in SA just like enterprises,” said Shalita. In the meantime, companies may have to place their own monitors on their portion of the cloud.

A new IT job: Service assurance manager

Breaking down silos within IT organisations has been a running theme in the industry over the past couple of years as IT professionals grapple with managing virtualised environments and converged networks. Yet even as IT pros realise that working together might help in designing and managing complex environments, there is still resistance to unification, as well as finger pointing between groups when something goes wrong.

While SA tools aim to eliminate the blame game when it comes to performance issues, they also require cooperation between IT groups. What’s more, to make SA effective, implementation and reporting need to be shared with the business side of the house. That’s why many SA vendors foresee the emergence of an SA manager who can interface among internal parties.

“This person would look across domains and say, ’I have five problems that are affecting business services, which is the highest impact?’” explained Ancipink.

NetScout’s Shalita sees the new role of service assurance manager as being a “manager of managers,” or someone who collects the useful information and presents it to each group so that no one is swimming in data. That’s meant to address the issues of complexity that many users see related to SA tools.

Until there are tools that can be depended upon across virtual environments and the public cloud, it is highly unlikely that SA managers will become a dime a dozen. “These [vendors] are in the right spot. We need [SA tools], customers are asking for them and solution providers are waiting to see who delivers the best first,” Mazzanti said.
