Test data management (TDM) plays a critical role in application development and IT maintenance activities in organizations.
It involves providing right-sized, privacy-safe test data to the various non-production IT environments in an organization so that developers and testers have quality data for application testing. Done correctly, TDM can add immense value to an organization’s application development program by mitigating the risk of exposing sensitive information in development and testing environments. While TDM’s value is well acknowledged, TDM software needs to evolve to reflect changes in how applications are built and tested. This is imperative to make IT processes more efficient and intelligent.
By utilizing a framework to architect a modern TDM solution, organizations can meet the contemporary demands of application development and testing and overcome current challenges.
TDM, as an integral component of an organization’s larger data management framework, is generally accepted as a mature function. It delivers substantial value as part of an organization’s application development operations.
TDM encompasses the entire gamut of IT processes required for delivering privacy-safe test data. The market has seen a proliferation of dedicated TDM software that digitizes and executes these processes and enables automation. In fact, depending on an organization’s TDM maturity, there may be dedicated groups governing these processes.
Accordingly, demand for TDM continues to be strong. However, like other data processing technologies, TDM has not remained immune to the unprecedented changes in business and IT landscapes over the last decade. From a business perspective, growth-hungry organizations want rapid application upgrades to gain a competitive edge. From an IT standpoint, they are dealing with a surfeit of data and application technologies, both on-premises and in the cloud. There is also marked variation in TDM requirements due to business size and differing application release philosophies. A small company or startup with a small development team releasing frequent application enhancements has very different TDM requirements from a large, multinational company with multiple lines of business delivering applications through global teams and defined delivery processes. All these complexities have placed tremendous strain on TDM and are challenging it in unforeseen ways. Clearly, today’s expectations of TDM technology are quite different from those of a few years ago.
The features outlined below lay the foundation for modern TDM software.
The features expected in contemporary TDM software are:
Support for data and application heterogeneity: TDM software must handle an ever-growing diversity of data sources, formats, structures, and locations, and cater to diverse application technologies. This is a must-have feature.
Agility in test data: Business applications are being developed and upgraded rapidly, so the software must deliver appropriate test data in the right testing environments with minimum turnaround time.
Capability to work with application delivery software: TDM software must work in tandem with application delivery or DevOps software so that test data creation and delivery can be orchestrated seamlessly as a part of the overall application build cycle.
Availability of close-to-business test data: Some business data attributes derive their values from other business data attributes. An attribute such as ‘customer discount percentage’ may depend on a customer’s age, gender, category, and so on, so the ‘customer discount percentage’ values generated by the software must stay consistent with the values generated for those related attributes. The software must be able to maintain such interdependencies (see the sketch after this list).
Retention of statistical trends: When generating voluminous test data for testing new analytical data models, it is imperative that the generated test data retains the aggregated statistical properties of the source dataset. Otherwise, analytical data models will be erroneously tested against data that differs from real-world scenarios.
Adequacy of test data: The software should generate privacy-safe test data that accurately reflects production data, so that the business application is tested against all the business rules coded within it.
Reuse of test data: The software should provide a mechanism to store a copy of the existing test data and a way to retrieve it for future requirements.
Compliance with data privacy norms: The software must enable quick adaptation to the changing data protection regulatory landscape. This will allow organizations to effectively handle data privacy compliance needs across multiple data protection regulations.
Capability for robust metadata management: The software should have a mature underlying metadata management capability that can discover and inventory business metadata (such as business units, applications, owners, or business data attributes) and technical metadata (data centers, data stores, application programs, column names, data types, data sizes, and so on). It must be able to depict relationships between metadata elements, perform impact analysis, and retain the version history of all changes.
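To illustrate the interdependency requirement described above, the short Python sketch below derives a ‘customer discount percentage’ from other generated attributes so that related values always stay consistent. The attribute names, thresholds, and discount rates are assumptions chosen purely for illustration.

```python
import random

# Hypothetical business rule: discount depends on age and customer category.
# These categories, thresholds, and rates are illustrative assumptions, not real policy.
DISCOUNT_BY_CATEGORY = {"standard": 0.0, "silver": 5.0, "gold": 10.0}
SENIOR_AGE = 60
SENIOR_BONUS = 2.5

def derive_discount(age: int, category: str) -> float:
    """Derive a discount that stays consistent with the customer's other attributes."""
    discount = DISCOUNT_BY_CATEGORY[category]
    if age >= SENIOR_AGE:
        discount += SENIOR_BONUS
    return discount

def generate_customer(rng: random.Random) -> dict:
    """Generate one synthetic customer whose dependent attribute is derived, not random."""
    age = rng.randint(18, 90)
    category = rng.choice(list(DISCOUNT_BY_CATEGORY))
    return {
        "age": age,
        "category": category,
        # The dependent attribute is computed from the independent ones,
        # so generated rows never violate the business rule.
        "discount_pct": derive_discount(age, category),
    }

if __name__ == "__main__":
    rng = random.Random(42)
    for row in (generate_customer(rng) for _ in range(3)):
        print(row)
```

Because the dependent attribute is computed rather than sampled independently, generated records never contradict the underlying business rule.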
A next-gen framework
The next-generation TDM software design is systematic, nimble, workable, and powerful.
This TDM framework enables organizations to do the following:
Enhance software support for various structured, semi-structured, and unstructured forms of data. Design the software in such a way that it can work with commercial providers of data source connectors or connectivity application programming interface (API) software.
Treat metadata management as a primary feature of the software and not as an ancillary capability. Enable a powerful interconnected system of metadata inventory, business glossary, data dictionary, and masking policy framework that can work in unison.
Provide a policy-based data masking framework so that the software can handle the privacy compliance needs of multiple data protection regulations. Ensure this framework is nimble enough that policies can be reconfigured easily whenever data protection regulations are updated (a minimal policy sketch follows this list).
Provide a test data archive within the software to store existing test datasets for a defined duration. Create a user-friendly interface that can quickly locate a desired test dataset in the archive through filters on metadata attributes such as creation date, creator, business application, database name, or the test case connected to the dataset; this saves time and eliminates the recreation of similar test data. Create a systematic purge mechanism to delete older, unwanted test datasets that have been stored beyond the defined duration and are not marked for long-term use (see the archive sketch after this list).
Facilitate faster test data delivery through self-service test data provisioning. Design the software to work with data virtualization technology for swift test data delivery and to refresh testing environments. Set up a mechanism to store interim versions of test data between the execution of test cases. Provide an interface for users to compare data in consecutive versions and allow them to roll back the data in a datastore to an earlier version if needed.
Enable hooks for working with application delivery or DevOps software so that test data generation services can be directly invoked as part of the application build cycle. Architect the software to interpret test case details and trigger test data creation jobs automatically for required applications.
Create an interface between reverse engineering software (RE) and TDM. Allow RE to feed intelligence of application business rules into the TDM software so that the latter can derive and apply consistent values to various related business data attributes. Power the TDM software with machine learning (ML), artificial intelligence (AI), and natural language processing (NLP) capabilities.
Implement modern data generation techniques, such as generative adversarial networks (GANs) and differential privacy, that help generate test data resembling original production data in its statistical properties (see the synthesis sketch after this list).
Enable basic data profiling capabilities within the software. Before actual testing, profile the generated test data to ensure that it is similar to production data in appearance. Use data profiling to verify that the generated test data has the required ‘failure values’.
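As a minimal illustration of such a policy-based masking framework, the Python sketch below maps regulations to per-attribute masking techniques. The policy entries, regulation names, and masking functions are hypothetical; an actual implementation would load policies from configuration and align them with the obligations of each applicable regulation.

```python
import hashlib

# Hypothetical, reconfigurable masking policy: regulation -> attribute -> technique.
# Updating a policy entry changes behaviour without touching the masking code.
MASKING_POLICY = {
    "gdpr": {"email": "hash", "phone": "redact", "birth_date": "null_out"},
    "ccpa": {"email": "hash", "phone": "redact"},
}

def _hash(value: str) -> str:
    # Deterministic, irreversible token that preserves referential consistency.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]

TECHNIQUES = {
    "hash": _hash,
    "redact": lambda value: "*" * len(value),
    "null_out": lambda value: None,
}

def mask_record(record: dict, regulation: str) -> dict:
    """Apply the masking policy of the given regulation to one record."""
    policy = MASKING_POLICY[regulation]
    return {
        field: TECHNIQUES[policy[field]](value) if field in policy else value
        for field, value in record.items()
    }

if __name__ == "__main__":
    customer = {"name": "A. Sample", "email": "a.sample@example.com",
                "phone": "5551234567", "birth_date": "1980-01-01"}
    print(mask_record(customer, "gdpr"))
```

Keeping the policy as data rather than code is what makes reconfiguration quick when regulations change.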
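The archive-and-purge behavior described above could look like the following in-memory Python sketch. The metadata fields, dataset names, and the 90-day retention window are assumptions for illustration; a real archive would sit on persistent storage behind the user interface.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

RETENTION_DAYS = 90  # Assumed retention window for datasets not marked for long-term use.

@dataclass
class ArchivedDataset:
    name: str
    created_on: date
    creator: str
    application: str
    long_term: bool = False
    rows: list = field(default_factory=list)

class TestDataArchive:
    """Minimal in-memory archive: store, filter by metadata, and purge old datasets."""

    def __init__(self):
        self._datasets: list[ArchivedDataset] = []

    def store(self, dataset: ArchivedDataset) -> None:
        self._datasets.append(dataset)

    def find(self, **filters) -> list[ArchivedDataset]:
        # Filter by any metadata attribute, e.g. application="billing", creator="qa-team".
        return [d for d in self._datasets
                if all(getattr(d, k) == v for k, v in filters.items())]

    def purge(self, today: date) -> int:
        # Remove datasets older than the retention window unless marked for long-term use.
        cutoff = today - timedelta(days=RETENTION_DAYS)
        keep = [d for d in self._datasets if d.long_term or d.created_on >= cutoff]
        removed = len(self._datasets) - len(keep)
        self._datasets = keep
        return removed

if __name__ == "__main__":
    archive = TestDataArchive()
    archive.store(ArchivedDataset("billing_smoke", date(2023, 1, 10), "qa-team", "billing"))
    archive.store(ArchivedDataset("billing_regression", date(2023, 6, 1), "qa-team",
                                  "billing", long_term=True))
    print([d.name for d in archive.find(application="billing")])
    print("purged:", archive.purge(today=date(2023, 7, 1)))
```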
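For the statistical-fidelity and profiling points above, the sketch below generates synthetic values from noisy summary statistics in the spirit of differential privacy (Laplace noise added to the mean) and then profiles the synthetic data against the source. It is a toy illustration with assumed parameters, not a calibrated differentially private mechanism or a GAN.

```python
import random
import statistics

def noisy_mean(values: list[float], epsilon: float, value_range: float) -> float:
    """Add Laplace noise to the mean, in the spirit of differential privacy.

    The sensitivity estimate (value_range / n) and epsilon are illustrative assumptions.
    """
    sensitivity = value_range / len(values)
    scale = sensitivity / epsilon
    # A Laplace sample is the difference of two independent exponential draws.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return statistics.mean(values) + noise

def synthesize(values: list[float], n: int, epsilon: float = 1.0) -> list[float]:
    """Generate n synthetic values that roughly preserve the source mean and spread."""
    mu = noisy_mean(values, epsilon, value_range=max(values) - min(values))
    sigma = statistics.stdev(values)
    return [random.gauss(mu, sigma) for _ in range(n)]

def profile(label: str, values: list[float]) -> None:
    """Basic profiling: compare aggregate statistics before releasing the test data."""
    print(f"{label}: mean={statistics.mean(values):.2f} stdev={statistics.stdev(values):.2f}")

if __name__ == "__main__":
    random.seed(7)
    production_amounts = [random.gauss(250.0, 40.0) for _ in range(1_000)]
    synthetic_amounts = synthesize(production_amounts, n=1_000)
    profile("production", production_amounts)
    profile("synthetic ", synthetic_amounts)
```

The profiling step is the check described in the last point of the list: if the aggregate statistics of the generated data drift from those of production, the dataset is rejected before testing begins.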
Organizations of all sizes expect agile delivery of contextual test data for diverse technologies.
Test data management software has come a long way from being used purely for data masking and generation. It now needs to evolve into an intelligent system that operates within an organization’s larger application delivery ecosystem, one that provides privacy-safe, right-sized, on-demand, and close-to-business data with faster turnaround.