Organisations should look to emulate the best practices of web giants Facebook, Google, eBay and Amazon to support scalable IT.
Trends such as the internet of things and the integration of web and traditional business processes mean that CIOs will increasingly be asked to deliver scalability that was previously the sole domain of internet giants such as Google.
In the Gartner report Capacity and Performance Management Form the Basis of Web-Scale IT, analyst Ian Head noted that practices that are, at this point, rarely seen outside the web-scale community are likely to become more prevalent. He wrote: "As digital business requirements become more ubiquitous, the drivers for this will be from the business."
With the economy finally moving in the right direction, businesses are starting to reinvest in their IT capabilities.
A recent Gartner survey of CEOs found that senior executives now have an appetite to invest in IT and use technology to gain a competitive edge.
Speaking to Computer Weekly about these findings, Gartner fellow Mark Raskino said: "Business leaders tell us they recognise the need to invest in e-commerce, mobile, cloud, social and other major technology categories, and the capabilities they enable. That can't be done from within existing IT budgets alone."
Web companies such as Amazon and Google have pioneered the use of technology to run their businesses differently. With CEOs' renewed enthusiasm for exploiting technology to gain a competitive edge, now may be the right time for IT to look at how Amazon and its peers manage their sophisticated IT architectures and plan capacity.
Gartner's recommendations for web-scale capacity management:
- Use stateless application architectures and horizontally scaling infrastructure architectures to deliver web-scale IT capacity management.
- Categorise and standardise workloads to balance the capacity of the IT infrastructure across services.
- Make application product teams responsible for application self-instrumentation (such as identifying scaling triggers) and analytics that empower near-real-time horizontal resource reallocation.
- Use demand-shaping techniques, such as canary, limited and dark launches, to limit the unexpected capacity and performance impacts of releases.
- Grow skills in the use of advanced analytics tools and techniques, directed at gaining a deep understanding of application performance demands and constraints.
Source: Gartner – Capacity and Performance Management Form the Basis of Web-Scale IT
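To make the idea of a "scaling trigger" in the recommendations above more concrete, here is a minimal Python sketch. It assumes a stateless service whose replicas are interchangeable, so capacity can be added or removed simply by changing the replica count. The function name, the load metric (requests per second) and the limits are illustrative assumptions, not something taken from the Gartner report.

```python
import math


def desired_replicas(observed_load: float,
                     target_load_per_replica: float,
                     min_replicas: int = 1,
                     max_replicas: int = 100) -> int:
    """Return how many identical, stateless replicas are needed.

    Hypothetical scaling trigger: the load figures are assumed to be
    requests per second, but any consistent unit would work.
    """
    if target_load_per_replica <= 0:
        raise ValueError("target_load_per_replica must be positive")
    # Round up so that no replica is pushed above its target load.
    needed = math.ceil(observed_load / target_load_per_replica)
    # Clamp to the configured floor and ceiling.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # 900 requests/s at a target of 200 requests/s per replica.
    print(desired_replicas(900, 200))
```

Because the replicas hold no session state in this sketch, the result can be applied in either direction without migrating user data, which is what makes near-real-time horizontal reallocation practical.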
Managing web scale
Gartner describes the big companies' IT architecture as "web scale". Although most businesses are not fortunate enough to have the millions of customers of Amazon or eBay, the flexible approach behind these architectures can be applied in many organisations.
Many of the tools that web companies use were developed in-house and are often available as open source. One example is Netflix’s Hystrix, a latency and fault-tolerance library for distributed services.
As an example of what the big web businesses do to keep their systems running, engineers from the eBay PaaS team wrote in a blog post that, while it is not too difficult or time-consuming to make changes on one, two, or even a dozen servers, making changes to hundreds or thousands of servers becomes a non-trivial task. The auction site uses Ansible, a distributed systems management tool, to make changes across hundreds of servers in a consistent manner and so reduce the chance of errors.
In the post, the eBay engineers said: "Ansible can be used as a configuration management, software deployment, and do-anything-you-want kind of a tool. It employs a plug-and-play concept, where existing modules have already been written for many functions. For example, there are modules for connecting to hosts with a shell, for AWS EC2 automation, for networking, for user management."
According to Gartner, web businesses also use deep analytics to facilitate proactive, real-time and near-real-time capacity planning.
Commenting on web-scale computing, one Computer Weekly reader agreed that ready web-scale services and elastic provisioning of resources will necessitate many changes: "The importance of applying 'analytics' to real-time monitoring and performance data cannot be overstated. Hard-coded thresholds simply won't do the job in an economically feasible manner. Figuring out what 'normal' looks like and giving the analytics hints about what 'normal' is expected to be will create challenges of their own."
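The reader's point about hard-coded thresholds can be illustrated with a small Python sketch that learns what "normal" looks like from recent samples instead of using a fixed limit. This is only one simple way to do it, using a rolling mean and standard deviation. The class name, window size and three-sigma rule are assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


class RollingBaseline:
    """Flag values that deviate strongly from a recent rolling baseline."""

    def __init__(self, window: int = 60, sigmas: float = 3.0,
                 min_samples: int = 10):
        if min_samples < 2:
            raise ValueError("min_samples must be at least 2")
        self.samples = deque(maxlen=window)  # oldest samples drop out
        self.sigmas = sigmas
        self.min_samples = min_samples

    def is_anomalous(self, value: float) -> bool:
        anomalous = False
        # Only judge once there is enough history to define "normal".
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sd = max(stdev(self.samples), 1e-9)  # avoid a zero-width band
            anomalous = abs(value - mu) > self.sigmas * sd
        # Simplification: every value, anomalous or not, joins the baseline.
        self.samples.append(value)
        return anomalous


if __name__ == "__main__":
    baseline = RollingBaseline(window=20, min_samples=5)
    for latency_ms in [100, 102, 98, 101, 99, 100, 103, 97, 500]:
        print(latency_ms, baseline.is_anomalous(latency_ms))
```

The "hints" the reader mentions could be supplied in a design like this by pre-loading the window with expected values, for example ahead of a known seasonal peak.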
Traditional off-the-shelf capacity planning tools are unsuitable for web-scale applications. Gartner's Head said: "By 2016, the availability of capacity and performance management skills for horizontally scaled architectures will be a major constraint or risk to growth for 80% of major businesses. To take advantage of web-scale IT approaches to capacity and performance management, IT architects need to fully embrace stateless application architectures and horizontally scaling infrastructure architectures."
Testing web scale
One of Gartner's recommendations in the capacity planning report came from a five-year-old post on Facebook, in which an engineer described how his company approached adding usernames to the social media site: "During the two weeks prior to launch, we began what we call a 'dark launch' of all the functionality on the back end. Essentially, a subset of user queries are routed to help us test, by making 'silent' queries to the code that, on launch night, will have to absorb the traffic. This exposes pain points and areas of our infrastructure that need attention prior to the actual launch."
In effect, "dark launching" allows the engineering team to stress-test parts of Facebook while simulating launching the new functionality to real users.
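The mechanism can be sketched in a few lines of Python. This is not Facebook's implementation, only an illustration of the idea under simple assumptions: each real request is served by the existing code as usual, and a configurable fraction of requests also triggers a "silent" call to the new code, whose result is discarded. All names here (`handle_request`, `live_path`, `dark_path`, `dark_fraction`) are hypothetical.

```python
import random
from typing import Callable, List


def handle_request(query: str,
                   live_path: Callable[[str], str],
                   dark_path: Callable[[str], object],
                   dark_fraction: float,
                   rng: random.Random,
                   dark_errors: List[str]) -> str:
    """Serve the query from the live path; sometimes exercise the dark path."""
    response = live_path(query)
    # Send a subset of real traffic to the unreleased code as well.
    if rng.random() < dark_fraction:
        try:
            dark_path(query)  # result is deliberately thrown away
        except Exception as exc:
            # Failures are recorded for engineers, never shown to users.
            dark_errors.append(f"{query}: {exc}")
    return response


if __name__ == "__main__":
    def old_lookup(q: str) -> str:
        return f"profile for id {q}"

    def new_username_lookup(q: str) -> str:
        raise RuntimeError("back end not ready")

    errors: List[str] = []
    rng = random.Random(0)
    for user_id in ["1", "2", "3"]:
        print(handle_request(user_id, old_lookup, new_username_lookup,
                             dark_fraction=1.0, rng=rng, dark_errors=errors))
    print(len(errors), "dark-path failures recorded")
```

Raising `dark_fraction` gradually lets a team increase load on the new code in steps while users continue to see only the existing behaviour.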
Gartner’s research has found that the web giants definitely do IT differently. Their approach to IT infrastructure, management and capacity planning has not only built the foundation on which they have established themselves as consumer brand leaders, but it has also shown the IT industry how to build for web-scale businesses.