Google has attempted to shine a light on Application Performance Management (APM) technologies built in what the company calls ‘a developer-first mindset’ to monitor and tune the performance of applications.
The end-game suggestion here is that we don’t ‘just’ turn cloud on, we also need to tune and monitor what happens inside live applications.
The foundation of Google’s APM tooling lies in two products: Stackdriver Trace and Debugger.
Stackdriver Trace is a distributed tracing system that collects latency data from applications and displays it in the Google Cloud Platform Console.
Stackdriver Debugger is a feature of the Google Cloud Platform that lets developers inspect the state of a running application in real time without stopping it or slowing it down.
There’s also Stackdriver Profiler as a new addition to the Google APM toolkit. This tools allows developers to profile and explore how code actually executes in production, to optimise performance and reduce cost of computation.
Google product manager Morgan McLean notes that the company is also announcing integrations between Stackdriver Debugger and GitHub Enterprise and GitLab.
“All of these tools work with code and applications that run on any cloud or even on-premises infrastructure, so no matter where you run your application, you now have a consistent, accessible APM toolkit to monitor and manage the performance of your applications,” said McLean.
When is an app not an app? When it’s unexpectedly resource-intensive says McLean.
He points to the use of production profiling and says that this allows developers to gauge the impact of any function or line of code on an application’s overall performance. If we don’t analyse code execution in production, unexpectedly resource-intensive functions can increase the latency and cost of web services.
Stackdriver Profiler collects data via sampling-based instrumentation that runs across all of an application’s instances. It then displays this data on a flame chart to present the selected metric (CPU time, wall time**, RAM used, contention, etc.) for each function on the horizontal axis, with the function call hierarchy on the vertical axis.
NOTE**: Wall time refers to real world elapsed time as determined by a chronometer such as a wristwatch or wall clock. Wall time differs from time as measured by counting microprocessor clock pulses or cycles.
Don’t ‘just’ turn cloud on
Not (arguably) always known to be the most altruistic, philanthropic and benevolent source of corporate muscle in the world, Google here appears to keen to ‘give back’ to the developer community with a set of tooling designed to really look inside large and complex batch processes to see where different data sets and client-specific configurations do indeed cause cloud applications to run in a less-than-optimal state.
You don’t ‘just’ turn cloud on and expect it to work perfectly – well, somebody had to say it.