The challenge of metadata at the edge

Nelson Petracek, chief technology officer at Tibco, says that one of the issues in deploying and managing edge computing devices is how device metadata will be managed and governed. Such metadata may include the device's location, manufacturer, installation date and last maintenance date.

In this guest blog post, Petracek discusses how device topology and relationships can be managed and governed, and how this representation can be kept in sync with the physical layout:

With respect to device metadata, there will usually be a catalogue or metadata repository included as part of the overall architecture. It will most likely reside in the datacentre or in the cloud and act as a centralised function. Not only will this catalogue provide a picture of what is deployed and where, but different areas of the IoT processing pipeline may also need data from it during device data processing.

When running logic, say at the gateway level, it may be necessary to draw upon reference data to make an educated decision. For example, various pieces of device metadata, such as the manufacturer, when the device was put into service, or its maintenance history, might be required to complete the decision-making process.
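As an illustration of this pattern, the sketch below shows gateway-side logic consulting device metadata through a small time-to-live cache, so that not every decision triggers a round trip to the central repository. All names (the catalogue contents, device IDs, field names) are invented for the example and do not reflect any particular product.

```python
import time

# Stand-in for the central metadata repository; in practice this
# lookup would be a remote call to the catalogue service.
CENTRAL_CATALOGUE = {
    "meter-0042": {
        "manufacturer": "Acme",
        "installed": "2019-06-01",
        "last_maintenance": "2023-11-12",
    },
}


class MetadataCache:
    """Gateway-local cache so rules can read device metadata cheaply."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._entries = {}  # device_id -> (fetched_at, metadata)

    def get(self, device_id):
        entry = self._entries.get(device_id)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]  # fresh cached copy, no remote call
        metadata = CENTRAL_CATALOGUE.get(device_id)  # remote fetch in practice
        self._entries[device_id] = (time.time(), metadata)
        return metadata


cache = MetadataCache()
meta = cache.get("meter-0042")
# Decision logic using reference data: flag devices not serviced recently.
# ISO-format dates compare correctly as strings.
needs_service = meta["last_maintenance"] < "2023-01-01"
```

The TTL keeps the edge copy reasonably fresh while bounding traffic to the central store; how stale a gateway can tolerate its metadata being is a per-deployment decision.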

It is unlikely that organisations will want 100,000 locations hitting this central store for metadata every time a device activates. Instead, it is more likely that metadata will be applied closer to the datacentre or the cloud, especially when it is used as part of a model or rules.

Maintaining and managing the overall relationships between devices – the topology – is also critical. It is essential to understand where a device is, what it is attached to, and how everything is linked together. This information is key to understanding the behaviour of an IoT network, and can help ensure that decisions are made and optimised in the correct context and state.


One well-known example is the power grid. If you are a utility company responsible for distributing power to consumers, you are concerned with how power gets from the source to a meter attached to a house. Many pieces of equipment, and thus devices, are involved in this process, including meters, transformers and substations. When looking at a distribution network for electricity, you have a vast network of linked devices, and for a variety of reasons (safety, accurate maintenance, capacity, thresholds) it is important to have an accurate picture of not only the devices themselves, but also their relationships. If a new meter or line is added, or a transformer is changed, the blueprints and recorded topology must also reflect this change. Changes must make their way back to the metadata management environment so that proper decisions can be made, both in batch (future infrastructure planning) and in real time (power delivery and restoration in the event of a failure).
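A topology like the one described above is naturally represented as a directed graph of "feeds" relationships. The following is a minimal sketch, with invented device names, of how such a structure might answer the question "which devices sit downstream of this substation?":

```python
from collections import defaultdict


class Topology:
    """Directed graph of device relationships: parent feeds child."""

    def __init__(self):
        self._feeds = defaultdict(set)  # parent -> set of children

    def attach(self, parent, child):
        self._feeds[parent].add(child)

    def downstream(self, device):
        """All devices reachable from `device`, e.g. every meter
        served by a given substation. Simple depth-first traversal."""
        seen, stack = set(), [device]
        while stack:
            for child in self._feeds[stack.pop()]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Hypothetical slice of a distribution network.
topo = Topology()
topo.attach("substation-1", "transformer-7")
topo.attach("transformer-7", "meter-0042")
topo.attach("transformer-7", "meter-0043")
```

With this in place, restoration logic after a transformer failure can ask for `topo.downstream("transformer-7")` to know exactly which meters are affected; keeping the graph in sync with the physical network is the hard part the article describes.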

There are complete systems for managing this, but they can be quite complicated. That level of complexity is not always required to achieve the capability, but some mechanism is necessary for capturing changes, whether through automated processes, periodic introspection, or a combination of the two.

What is actually out and deployed must be reflected in the device and metadata catalogue. We are not yet at the point where we can dynamically go out and automatically discover every running device, along with its relationships and metadata. Some of this can be done automatically, but some aspect will, unfortunately, still be manual.
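The partly automated, partly manual reconciliation described above often amounts to a periodic diff between what the catalogue claims and what discovery actually finds. A hedged sketch, with invented device IDs:

```python
def reconcile(catalogued, discovered):
    """Compare the catalogue's view with an automated discovery sweep.

    Returns the two discrepancy sets that typically need manual
    follow-up: devices the catalogue lists but discovery did not see,
    and devices seen in the field but absent from the catalogue.
    """
    catalogued, discovered = set(catalogued), set(discovered)
    return {
        "missing_from_field": catalogued - discovered,  # catalogued, not seen
        "unregistered": discovered - catalogued,        # seen, not catalogued
    }


report = reconcile(
    catalogued={"meter-0042", "meter-0043", "transformer-7"},
    discovered={"meter-0042", "transformer-7", "meter-0099"},
)
```

Here the sweep would flag `meter-0043` as possibly decommissioned or unreachable, and `meter-0099` as deployed but never registered; resolving each case is the residual manual step.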
