Are we ready for the era of the ‘sentient’ document?

This year is set to bring the most important advance in the automation of documents since the invention of the printing press. Content control is key

We now have the technology for documents and digital content to develop levels of interactive responsiveness and competence that will make them appear super-intelligent. The question now becomes, how should we approach and harness this advancement?

As deep learning algorithms distil key points with increasing accuracy and present them to users via the medium and format of their choosing, ChatGPT and other examples of Generative AI tools have already made it much more intuitive to interact with large banks of knowledge.

The next step is for enterprise documents to carry an inherent understanding of what they are and what they contain, so they can ‘speak’ directly to recipients or processing teams, or to automated proxies acting on their behalf.

It’s a world where business documents tell both humans and other software systems about themselves, like non-player characters (NPCs) in computer games.

The advent of self-aware, communicative enterprise documents could arguably become the most important advance in the automation of documents since the invention of the printing press by Johannes Gutenberg. Because they could revolutionise enterprises’ comprehension of content in all of its available forms, and of the knowledge it contains, the potential is enormous:

  • As content becomes more ‘conscious’ and able to relay information about itself, the need for overstretched professionals to visually scan, respond to or address email messages will go away
  • Language differences will become invisible, so that teams can ask English questions of a German document and vice versa
  • Mass distillation of key facts and the extraction of insights from entire company information libraries on an automated basis will become commonplace

Requests for information or actions, or attached invoices, contracts, or applications, will simply announce and identify their presence and file themselves or trigger automated processing, according to their type and priority level. 

Faced with a very complex legal contract in an unfamiliar language, you will be able to ask ‘What is this about?’, ‘Who are the contracting parties?’, ‘What is the expiry date?’ or ‘What are the penalty clauses for breach?’ and receive a full answer in your own language.

Where does the machine end and the human begin?

Document ‘sentience’ can apply to orders, invoices, job applications, change of address forms, customer onboarding forms: essentially any form of document. Sentience will also extend into domain specialisms, e.g., legal, pharmaceutical, medical or other areas, with field-specific questions and answers.

The real game-changing potential of sentient documents is their ability to help manage the enormous amounts of content coming into every business yearly—estimated to be approximately 150 zettabytes (150 billion terabytes) of content this year alone. And sentient interaction won’t be limited to text documents. Knowledge workers will be able to interrogate video, audio, and any form of business content just as seamlessly through natural conversation.

We need that even more since the pandemic, when we ushered in remote collaboration tools like Teams and Zoom, which led to enterprises accumulating more ‘dark’ content—valuable data trapped in recorded meetings, chat logs, and presentations that is effectively sealed off from the main corporate knowledge base. An intelligent content layer will illuminate this dark data, making the insights buried within those digital interactions accessible and actionable. 

All this will happen because sentient documents represent the evolution of what we’ve been doing for decades in ECM (enterprise content management). Document sentience is the next step for content services platforms, where organisations store, manage, search, or archive documents and other useful content.

Sentience is likely to see off robotic process automation (RPA) too. Although RPA tools can accomplish routine tasks very well, this value is now being surpassed by that of real, adaptive intelligent automation, which actually understands the content it is analysing and can thus make the whole workflow intelligent (rather than a robotic, hardwired sequence).

As with all artificial intelligence developments, clearly a line needs to be drawn as to the extent of the control we hand over to this new generation of intelligent content. But while the notion of ‘document manager overlords’ is unsettling, I suspect these sentient textual entities will only ever operate as collaborative assistants, providing us with useful support, not autonomous overreach. And given current economic pressures, this support may feel very welcome.

Dr. John Bates is author of Thingalytics and CEO of SER Group. He is also a non-executive director at Sage.
