Governance of data using software tools: The evaluation checklist

Governance of data can be optimized using a data governance tool. Here’s a detailed evaluation checklist to help you identify the right solution.

Organizations deploy data governance tools to support financial stewardship: the accurate and timely aggregation and reporting of financial results to meet enterprise needs and regulatory requirements. These tools focus on reducing costs and improving compliance. Here are some of the capabilities to look for in a solution that seeks to streamline governance of data.

Versioning capabilities: A data governance tool should use version information to monitor the results of updates and, if necessary, restore the original values. When configuring and developing versioning in a data governance tool, source-code control systems with full branching and merging capabilities are sought after. If the Master Data Management (MDM) main or hub-and-spoke connectivity model requires specific types of versioning, these are mainly implemented at the main link, or through rows in a version table that link to a particular version of the same MDM record.
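As a rough illustration of the version-table idea, the following sketch keeps every value of a record as a separate row and restores an earlier value by re-saving it as the newest version. The table name `record_version`, the helper functions and the sample record ID are assumptions made for this example, and an in-memory SQLite database stands in for whatever store a real MDM tool uses.

```python
import sqlite3


def open_store():
    """Create an in-memory store with a hypothetical version table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record_version ("
        " record_id TEXT NOT NULL,"
        " version INTEGER NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (record_id, version))"
    )
    return conn


def save_version(conn, record_id, value):
    """Append a new version instead of overwriting the old value."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM record_version"
        " WHERE record_id = ?",
        (record_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO record_version VALUES (?, ?, ?)",
        (record_id, latest + 1, value),
    )
    conn.commit()
    return latest + 1


def get_value(conn, record_id, version=None):
    """Return the value at a given version, or the latest one if None."""
    if version is None:
        row = conn.execute(
            "SELECT value FROM record_version WHERE record_id = ?"
            " ORDER BY version DESC LIMIT 1",
            (record_id,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT value FROM record_version"
            " WHERE record_id = ? AND version = ?",
            (record_id, version),
        ).fetchone()
    return row[0] if row else None


def restore(conn, record_id, version):
    """Restore an earlier value by saving it again as the newest version."""
    old_value = get_value(conn, record_id, version)
    if old_value is None:
        raise KeyError(f"no version {version} for record {record_id}")
    return save_version(conn, record_id, old_value)


conn = open_store()
save_version(conn, "CUST-1", "Acme Ltd")
save_version(conn, "CUST-1", "Acme Limited")
restore(conn, "CUST-1", 1)  # the original value becomes version 3
```

Restoring by appending, rather than deleting later rows, keeps the full trail of what was changed and when it was undone.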

History reports with reference to data modification: These are essential to track data modifications and the source of those modifications. Having command over data modifications helps deal with data mishaps and maintains good governance of data. It also lets you set parameters for future reference, based on the major issues faced over the years.
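One simple way to picture a modification history is a store that records every change together with its source. The sketch below is an assumption-laden illustration, not a description of any particular product: the class `AuditedStore`, the `Change` fields and the source names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """One entry in the modification history (hypothetical layout)."""
    record_id: str
    field_name: str
    old_value: object
    new_value: object
    source: str          # system or user that made the change
    changed_at: datetime


class AuditedStore:
    """Key-value store that logs every modification and its source."""

    def __init__(self):
        self._data = {}
        self.history = []

    def update(self, record_id, field_name, new_value, source):
        record = self._data.setdefault(record_id, {})
        old_value = record.get(field_name)
        record[field_name] = new_value
        self.history.append(
            Change(record_id, field_name, old_value, new_value,
                   source, datetime.now(timezone.utc))
        )

    def history_report(self, record_id):
        """All changes to one record, oldest first."""
        return [c for c in self.history if c.record_id == record_id]

    def changes_by_source(self):
        """Count modifications per source, to see where changes come from."""
        counts = {}
        for change in self.history:
            counts[change.source] = counts.get(change.source, 0) + 1
        return counts


store = AuditedStore()
store.update("L-1", "branch", "Mumbai", source="crm")
store.update("L-1", "branch", "Pune", source="batch_import")
store.update("L-2", "branch", "Nashik", source="crm")
```

A report such as `store.history_report("L-1")` then shows both the old and new values and which source introduced each change.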

Rollback capabilities: A data governance tool should enable a running application to conveniently revert to the most recent saved state, for example by specifying the transaction name in the ROLLBACK statement. A well-designed data system must have rollback capability so that it can recover to a known state if the execution process fails. A partial rollback can also be accomplished by specifying a savepoint name in lieu of the transaction name.
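The difference between a full and a partial rollback can be sketched with Python's built-in `sqlite3` module. Note the assumptions: SQLite does not name transactions in ROLLBACK as some other SQL dialects do, so the full rollback here is a plain `ROLLBACK` and the partial one uses `ROLLBACK TO` with a savepoint name. The `account` table and the savepoint name `before_risky_step` are invented for the example.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")


def balances(connection):
    """Return the current balances as a dict keyed by account id."""
    return dict(
        connection.execute("SELECT id, balance FROM account ORDER BY id")
    )


# Full rollback: the failed update is discarded entirely.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 'A'")
conn.execute("ROLLBACK")
after_full = balances(conn)

# Partial rollback: keep the first update, undo only the step after
# the savepoint, then commit what remains.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance + 10 WHERE id = 'B'")
conn.execute("SAVEPOINT before_risky_step")
conn.execute("UPDATE account SET balance = 0 WHERE id = 'A'")
conn.execute("ROLLBACK TO before_risky_step")
conn.execute("COMMIT")
after_partial = balances(conn)
```

After the full rollback both balances are unchanged; after the partial rollback only the update made before the savepoint survives.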

Metadata management: Capabilities for metadata management are crucial for effective governance of data, especially for monitoring the use of centralized data storage with distributed data capture, and of archived data kept for further use. For example, an intranet web page may include metadata specifying the language it's written in, tools used for creation, where to go for more on the subject, and so on. But how regularly is this metadata updated and monitored? The data governance tool should have the capacity to organize this as well.

Managing metadata in an organization comes down to three main areas of work: designing the metadata repository, populating the repository, and using information in the repository. Another approach for governance of data is to leverage a metadata repository that comes with a toolset that’s already in use. For instance, ETL vendors offer metadata management applications that serve to catalog and manage the ETL metadata. In some cases, they also provide the tools to catalog metadata associated with source and target systems. If not, the repositories that underlie these tools can be extended to serve broader metadata management applications.
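The three areas of work (design, populate, use) can be pictured with a very small in-memory repository. This is only a sketch under stated assumptions: the `DatasetMetadata` fields, the `MetadataRepository` class and the dataset names are hypothetical and do not reflect any vendor's repository model.

```python
from dataclasses import dataclass, field


# Design: decide which attributes the repository records per dataset.
@dataclass
class DatasetMetadata:
    name: str
    owner: str
    source_system: str
    upstream: list = field(default_factory=list)  # datasets this one is built from


class MetadataRepository:
    def __init__(self):
        self._entries = {}

    # Populate: register or refresh one entry.
    def register(self, entry):
        self._entries[entry.name] = entry

    # Use: answer questions from the catalogued metadata.
    def owned_by(self, owner):
        return sorted(e.name for e in self._entries.values() if e.owner == owner)

    def lineage(self, name):
        """Return all transitive upstream dataset names, nearest first."""
        seen = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue  # guard against cycles and repeats
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


repo = MetadataRepository()
repo.register(DatasetMetadata("core_banking_loans", "ops", "core_banking"))
repo.register(DatasetMetadata("staging_loans", "etl_team", "etl",
                              upstream=["core_banking_loans"]))
repo.register(DatasetMetadata("loan_report", "finance", "warehouse",
                              upstream=["staging_loans"]))
```

Here `repo.lineage("loan_report")` traces the report back through the staging table to the source system extract, which is the kind of source-to-target question ETL metadata is typically used to answer.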

Data quality capabilities to look out for

Duplication: Perhaps one of the most challenging data quality issues is duplication of data. Duplication arises when there are conflicting representations of the same entity across source systems. To deal with this, the data governance tool needs capabilities to profile data and attribute it to a particular entity. It should also be able to match records against one another and cleanse the database of duplicates.
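Matching and cleansing can be sketched as normalizing each value and then grouping near-identical ones. The example below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; the 0.9 threshold, the function names and the sample names are assumptions, and real tools generally use richer matching rules across several attributes.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def is_match(a, b, threshold=0.9):
    """Treat two names as the same entity if they are similar enough."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def deduplicate(names, threshold=0.9):
    """Group names into clusters; the first member of each is the survivor."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if is_match(name, cluster[0], threshold):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


clusters = deduplicate(["Jon Smith", "John Smith", "JON  SMITH.", "Asha Rao"])
survivors = [cluster[0] for cluster in clusters]
```

Keeping the whole cluster, rather than only the survivor, lets a data steward review which source records were judged to be duplicates before anything is deleted.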

Profiling: Leveraging data profiling during metadata modeling efforts ensures that your data models accurately represent the data content, as well as the business and data requirements. Profiling data helps proactively assess whether a source data extract meets the data governance solution’s baseline quality standards.

Profiling rules are set according to the internal guidelines laid down by the governing body. These should be fixed dimensions to empower the governance of data. For example, there could be a profiling condition that no data will be captured after 6:00 pm. The data governance tool should accommodate such validation and trigger these profiles accurately. A good tool also accommodates the capture of data after 6:00 pm, and shows these records as exceptions. Profiling makes it easy to manage this data.
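The 6:00 pm rule above can be sketched as a check that flags late records as exceptions instead of rejecting them. The record layout (a dict with a `captured_at` timestamp), the function name and the choice to treat exactly 6:00 pm as on time are assumptions for illustration.

```python
from datetime import datetime, time

CUTOFF = time(18, 0)  # 6:00 pm, taken from the example rule


def profile_capture_times(records, cutoff=CUTOFF):
    """Split records into (accepted, exceptions) by time of capture.

    Late records are kept and flagged rather than dropped, so they can
    be reviewed as exceptions.
    """
    accepted, exceptions = [], []
    for record in records:
        if record["captured_at"].time() > cutoff:
            exceptions.append(record)
        else:
            accepted.append(record)
    return accepted, exceptions


records = [
    {"id": 1, "captured_at": datetime(2024, 1, 5, 17, 59)},
    {"id": 2, "captured_at": datetime(2024, 1, 5, 18, 0)},
    {"id": 3, "captured_at": datetime(2024, 1, 5, 18, 1)},
]
accepted, exceptions = profile_capture_times(records)
```

With this sample, records 1 and 2 are accepted and record 3 is reported as an exception.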

Another example would be a profit analysis by transaction, or a case where monitoring is required, such as when a previously defined deviation should be monitored as part of a business rule. Control engineering and sustainable engineering would be some terms that may be used to discuss this. After a data set successfully satisfies profiling standards, it still requires data cleansing to ensure that all business and schema rules are properly met.

About the author:  Suresh A. Shan is the Head of Business Information Technology Solutions (BITS) at Mahindra & Mahindra Financial Services Limited, Mumbai.

(As told to Sharon D’Souza)
