AI In Code Series: Rapid7 - Plotting the AI graph, one node at a time

We use Artificial Intelligence (AI) almost every day, often without even realising it. A large number of the apps and online services we all connect with have a degree of Machine Learning (ML) and AI in them in order to provide predictive intelligence, autonomous internal controls and smart data analytics designed to make the User Interface (UI) experience more fluid and intuitive for the end user.

That’s great. We’re glad the users are happy and getting some AI-goodness. But what about the developers?

What has AI ever done for the programming toolsets and coding environments that developers use every day? How can we expect developers to develop AI-enriched applications if they don’t have the AI advantage at hand at the command line, inside their Integrated Development Environments (IDEs) and across the Software Development Kits (SDKs) that they use on a daily basis?

What can AI do for code logic, function direction, query structure and even for basic read/write functions… what tools are in development? In this age of components, microservices and API connectivity, how should AI work inside coding tools to direct programmers to more efficient streams of development so that they don’t have to ‘reinvent the wheel’ every time?

This Computer Weekly Developer Network series features a set of guest authors who will examine this subject — this post comes from Erick Galinkin in his role as principal Artificial Intelligence researcher at Rapid7 — a company known for its SecOps practices which it says work to deliver shared visibility, analytics and automation to unite security, IT and DevOps teams.

Galinkin notes that his mission at Rapid7 revolves around finding applications for (and, crucially, vulnerabilities in) Artificial Intelligence systems… and this is insight and research that he takes forward to publicise in order to inform the public and dispel FUD.

Galinkin writes as follows… and says that we already have pretty good predictive models for intelligent code completion.

“IntelliSense, Codota, and other types of code completion software are already helping programmers remember particular bits of syntax and fill in functions or methods automatically. When we couple this with the demonstrated capacity of things like GPT-3 to write code from scratch in a completely language-agnostic way, we end up with a really fruitful opportunity for tools that can offload a lot of writing and rewriting for software engineers,” he said.

GPT-3 explained

Developed by OpenAI and exclusively licensed to Microsoft, Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model (i.e. it generates text one token at a time, each predicted from the tokens that precede it) that uses deep learning to produce text that is (almost, in many cases) indistinguishable from human-generated text. It is a product of the natural language processing sub-field of AI, trained on massive volumes of text data, and it can also generate computer code… it is, essentially, computer-assisted authorship (which is in itself the latest development in a long line of spellcheck technologies).
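To make the term "autoregressive" concrete, here is a minimal toy sketch in Python. It is not how GPT-3 works internally: it swaps the deep neural network for simple counts of which token follows which, and the function names and the one-line training corpus are invented for illustration. What it does share with the real thing is the generation loop, where every new token is chosen by looking at what has already been produced.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training data."""
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_tokens=8):
    """Autoregressive loop: each new token depends on the output so far."""
    output = [start]
    for _ in range(max_tokens - 1):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # nothing ever followed this token in training
        # Greedy choice keeps the toy deterministic; a real language model
        # samples from a learned probability distribution instead.
        output.append(candidates.most_common(1)[0][0])
    return output

# Hypothetical one-line "training set", already split into tokens
corpus = "for i in range ( n ) : total += i".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "for", max_tokens=6)))
```

A model of this size can only replay what it has seen. The gap between this and GPT-3 lies in the scale of the training data and in a network that conditions on a long window of preceding tokens rather than on the last one alone.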

Galinkin continues by saying that there’s a real possibility for AI to generate the sorts of utility functions programmers write all the time: given a description of what the function needs to do, the AI writes it, so the programmer’s flow is not interrupted.

“That said, I do not expect AI to entirely replace programmers any time soon – while extremely powerful models can generate code and can even generate AI code that writes AI code, the code quality is quite poor… and the further you get from the initial output of the model – given a correct description – the higher your probability of something unusable being created,” he said.

Plotting the AI graph, one node at a time

Insisting that debugging, reverse engineering and even vulnerability discovery within programs are still wide-open problems, Galinkin notes that the difficulty becomes clear if we think of a program as a graph: you need to explore the possible failure conditions at every node and find every path to every node.

“For simple programs, this is no problem, but once we start importing functions from other libraries, we end up with monstrosities that are difficult to navigate programmatically. There is still a ‘feel’ that humans have not yet been able to codify in such a way that a machine can learn it. Broadly, AI is still narrow and writing tools to help people do their jobs better is a far more fruitful and useful direction than aiming for something like artificial general intelligence,” concluded Galinkin.
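The graph view Galinkin describes can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a tool Rapid7 is known to use: the control-flow graph, its node names and the all_paths helper are all hypothetical. It enumerates every cycle-free path from the program's entry point to each node, which is the exploration step the quote refers to.

```python
def all_paths(graph, start, target, path=None):
    """Return every simple (cycle-free) path from start to target."""
    path = (path or []) + [start]
    if start == target:
        return [path]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # skip nodes already visited on this path
            found.extend(all_paths(graph, nxt, target, path))
    return found

# Hypothetical control-flow graph: nodes are code blocks, edges are jumps.
# "lib_call" stands in for a function imported from another library.
cfg = {
    "entry": ["parse", "usage_error"],
    "parse": ["validate", "lib_call"],
    "lib_call": ["validate"],
    "validate": ["write", "usage_error"],
    "write": ["exit"],
    "usage_error": ["exit"],
}

for node in ["parse", "validate", "usage_error", "exit"]:
    print(node, len(all_paths(cfg, "entry", node)))
```

Even in this six-block toy, the single extra branch through lib_call doubles the number of routes into validate, and every node downstream inherits that doubling. Real imported functions bring their own internal graphs, so the number of paths to check grows very quickly.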

Smarter secure cloud-natives, left-to-right

In related developer (ops-centric) news, earlier this year, Rapid7 and Snyk partnered with the goal of securing cloud-native apps across the software development lifecycle (SDLC).

The companies explain that as modern development teams continue to adopt new technology that helps them accelerate their efforts, security teams are tasked with making sure they can advance their security strategies in similar ways. The Rapid7 and Snyk partnership allows security teams to embed security from the farthest “left” of the SDLC to the farthest “right” of the SDLC with a holistic approach to testing and monitoring across the application layer.