Knowledge Management 2.0: have we found a way to make KM work?

This is a blog post by Jim Webber, Chief Scientist at Neo4j, on how tech has finally caught up with ideas.

To a computer, classifying things in the real world can be a bit of a puzzle. We’ve taught a system that there are cars, for example, and that some cars run on liquid fuels and some run on batteries. But then you ask it to classify a battery EV that’s not from a major brand but is a kit car—should it go in the list of electric cars like a Nissan Leaf or to a new EV subset of kit cars?

This might seem like a trivial problem: just add another category for electric kit cars, right? But it’s you who solved the computer’s problem, not the computer. It could not handle the ambiguity that a car can sometimes be both electric and a kit car, and that this is perfectly fine.

The real world is full of these issues. A baby can throw a ball and even throw up, but they can’t throw a party. In some states of the US, you can’t turn on a red light—apart from in the states where you can (the level of honking behind you is a clue). And these are all good examples of why traditional knowledge management breaks down.

Finally, technology can acknowledge that knowledge is messy

Knowledge management is the collection of methods related to creating, sharing, and managing the knowledge and information an organisation contains. It leverages what we already know to find new things that can help us. But one of the peculiarities of knowledge is that some of it is regular and some of it isn’t. Knowledge management 1.0 hasn’t progressed that far because of these exceptions to the norm. Facts and their patterns of knowledge exhibit uniformity in some areas, irregularity in others, density in certain places, and sparsity in certain others—knowledge is patchy.

And because you have to use data representation and tooling to codify an internal knowledge management system, that patchiness is a challenge. Initially, relational technology was used, but relational cannot manage irregularity.

In a relational database, information is stored in tables, which we can combine through joins. However, the situation can become chaotic quickly if we have an excessive number of tables categorising types of cars. This issue was addressed by making everything regularised, because that’s the way the database wanted to work. Nevertheless, the actual world lacks regularity, so the results were limited. First-generation knowledge management systems struggled to generate anything noteworthy because they could not accommodate these irregularities. Yet the irregularities are the very factors that lead to fresh discoveries, novel product possibilities, and new customer segments.

I would argue that a real breakthrough came with the semantic web, but that only advanced us to knowledge management 1.5. The semantic web innovation started with the acknowledgment that exceptions are inherent in the data model, not mere regrettable bugs. The proponents of the semantic web said that the web is a graph of knowledge comprising reliable and less reliable information. By creating taxonomies and ontologies and meta-level descriptions of the data, we can more easily discern the difference instead of attempting to regularise everything.

Represent your data however heterogeneous

The idea was to make the web machine-readable, enabling software to explore the graph on our behalf. That allowed us to govern the data better but also let it flourish, with all kinds of connections between facts at any level. Such flexibility was ruled out by the old, more rigid approach to knowledge management.

However, this initiative was primarily driven by idealism rather than data expertise. Despite these efforts, not all websites became machine-readable. There were attempts to load the data into RDF triplestores, but this approach gained little traction as it diverged from the essence of the web.

Now the database community are operationalising knowledge management 1.0 and 1.5 into knowledge management 2.0. This involves knowledge graphs, high-fidelity querying and graph data science that leads into machine learning and GenAI.

The good news is that a knowledge graph doesn’t need all the ontology the semantic web people employed. You get a graph but without any of the clever but academic baggage. Now, you can build a graph, or a knowledge graph, that represents your data however heterogeneous it might be.
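To make that concrete, here is a minimal sketch of a graph holding heterogeneous data. It is plain Python, not Neo4j's API: the `TinyGraph` class, its methods, and the car names and properties are all made up for illustration. The point is only that a node can carry several labels at once, so an electric kit car never has to be squeezed into a single category.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with any number of labels and free-form properties."""
    name: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


class TinyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}   # name -> Node
        self.edges = []   # (source, relationship, target) tuples

    def add_node(self, name, *labels, **properties):
        node = Node(name, set(labels), dict(properties))
        self.nodes[name] = node
        return node

    def relate(self, source, relationship, target):
        self.edges.append((source, relationship, target))

    def with_labels(self, *labels):
        """Names of nodes that carry every one of the given labels."""
        wanted = set(labels)
        return sorted(n.name for n in self.nodes.values() if wanted <= n.labels)


graph = TinyGraph()
# Nodes need not share the same labels or the same properties.
graph.add_node("Nissan Leaf", "Car", "Electric", maker="Nissan")
graph.add_node("Garage-built EV", "Car", "Electric", "KitCar", seats=2)
graph.add_node("Battery pack", "Component")
graph.relate("Garage-built EV", "POWERED_BY", "Battery pack")

electric = graph.with_labels("Car", "Electric")
electric_kits = graph.with_labels("Electric", "KitCar")
print(electric)
print(electric_kits)
```

The kit car turns up in both queries because it is both things at once, and nothing about the Leaf or the battery pack had to change to make room for it.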

Finally, we have got to where we want to be with knowledge management: you can add a property like ‘kit car’ to your ‘electric cars’ node, and you’re finally off to the knowledge management races. Sounds good to me.
