Real-world data (RWD) has a key role in clinical research.
Randomized controlled clinical trials are considered a benchmark in clinical research. However, the increased awareness of health equity under public scrutiny demands greater patient outreach and evidence-based assessments. The explosion of real-world data and the advancements in modern data management techniques can accelerate the drug discovery process and help clinical researchers meet their shared goal of health equity. We explore the IT strategy elements that will drive effective use of real-world data in clinical or medicinal research.
Real-world data is primarily an individual's health data generated from various sources. These sources can be classified into three categories (see Figure 1). The first is machine-generated data, which contains information that medical devices generate about an individual's health. Examples include records produced by CT scanners, wearables, and biosensors. The second category is manually generated information and insights. This includes health information that patients provide themselves or that caregivers such as nurses, clinicians, doctors, or pharmacists record. The last category is environmental data that external sources generate about community or public health.
New approaches in drug discovery are becoming increasingly inclusive and diverse.
Real-world data helps generate insights from clinical trials and provides the required evidence for regulatory purposes. Modern data management techniques encourage regulatory bodies to ask pertinent questions related to clinical studies, identify gaps, and request evidence-based assessments. The increased availability of real-world data is helping policymakers understand the effects of drugs more clearly. They can also review and evaluate potential issues related to drug discovery.
With the enhanced use of real-world data, regulatory bodies can set new guidelines and recommend more effective approaches for clinical trials. Real-world data also helps regulatory authorities evaluate and monitor post-market risks. Technological advancements in acquiring, storing, and analyzing real-world data help researchers scale up their analytics and improve the efficacy of medicinal research.
Drug discovery demands greater diversity in patient demographics, as different categories of patients may respond differently to the same medicine or vaccine. While the results of randomized controlled clinical trials may or may not apply widely, the modern approach to clinical research makes clinical trials more inclusive and diversified. The increased awareness of health equity among civil societies, government, and non-government organizations has underscored the importance of real-world data usage in clinical trials.
The data modeling considerations for real-world data differ from those for other types of data.
Figure 2 demonstrates the collection of data from diverse sources. The volume and velocity of the data differ from source to source and are often inconsistent. Each source may produce data at a different level of granularity. Though each source can generate information in a predefined data type, the complete data set is a vast collection of structured, unstructured, and semi-structured data.
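The granularity mismatch described here can be illustrated with a minimal sketch, assuming two hypothetical sources: a wearable that reports heart rate every few minutes and a clinic that records it once per day. The sample values and the function name daily_mean are assumptions made for illustration only. Both sources are reduced to one value per calendar day so they can be compared on a common time axis.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical readings (illustrative values, not real patient data).
# The wearable reports every few minutes; the clinic reports once per day.
wearable = [
    ("2024-01-01T08:00:00", 72.0),
    ("2024-01-01T08:05:00", 76.0),
    ("2024-01-02T09:00:00", 80.0),
]
clinic = [("2024-01-01", 70.0)]


def daily_mean(readings):
    """Group (ISO timestamp, value) pairs by calendar day and average each day."""
    by_day = defaultdict(list)
    for stamp, value in readings:
        day = datetime.fromisoformat(stamp).date().isoformat()
        by_day[day].append(value)
    return {day: mean(values) for day, values in by_day.items()}


wearable_daily = daily_mean(wearable)
clinic_daily = daily_mean(clinic)
```

Averaging is only one possible aggregation; a real pipeline might keep minimum, maximum, or the raw series alongside the daily value, depending on the clinical question.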
Another important aspect is that these data sets contain personally identifiable information, which needs careful consideration from a data privacy perspective. As clinical trials expand the scope and embrace more diversity, the geographical aspect increases the complexity of dealing with real-world data. Each country’s data privacy guidelines must be factored in for dealing with personally identifiable information.
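One common way to reduce exposure of personally identifiable information is pseudonymization: direct identifiers are dropped and the patient identifier is replaced with a keyed hash. The sketch below assumes a hypothetical per-country table of direct identifiers and hypothetical field names; actual requirements come from each jurisdiction's regulations, and pseudonymized data may still count as personal data under some of them.

```python
import hashlib
import hmac

# Hypothetical per-country policy listing fields treated as direct identifiers.
# This table is an assumption for illustration, not a statement of any law.
DIRECT_IDENTIFIERS = {
    "US": {"name", "ssn"},
    "DE": {"name", "address"},
}


def pseudonymize(record, country, secret_key):
    """Drop direct identifiers and replace patient_id with an HMAC-SHA256 digest."""
    blocked = DIRECT_IDENTIFIERS[country]
    cleaned = {k: v for k, v in record.items() if k not in blocked}
    digest = hmac.new(secret_key, str(record["patient_id"]).encode(), hashlib.sha256)
    cleaned["patient_id"] = digest.hexdigest()
    return cleaned


# Made-up record; the key below is a placeholder and must never be used in production.
record = {"patient_id": 1001, "name": "Jane Doe", "ssn": "000-00-0000", "hba1c": 6.1}
safe = pseudonymize(record, "US", secret_key=b"demo-key-not-for-production")
```

Using a keyed hash rather than a plain hash means the same patient maps to the same pseudonym across data sets, while someone without the key cannot recompute the mapping from known identifiers.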
Pharma companies must factor several aspects into their architecture to maximize the benefits of using real-world data in clinical research.
A future-ready data architecture (see Figure 3) can offer researchers extensive advantages for carrying out analytics at scale. The essential elements of building the data strategy are described below:
Data acquisition and harmonization: A comprehensive approach should be taken to acquire data from multiple sources in disparate forms. The system needs to be scalable for exponential data growth.
Data curation: Indexing and cataloging unstructured data, including medical images, are fundamental for traceability. It is critical to maintain audit trails and data linkages.
Resolving data quality issues: Data inconsistencies and gaps occur, especially in electronic health records, and can arise for a variety of technical reasons. Preprocessing and data cleansing methods can help eliminate them.
Building IT controls for data governance: Because this data is categorized as protected health information (PHI), it is crucial to ensure adherence to regulations such as HIPAA in the United States and GDPR in the European Union while implementing the required IT controls.
Extracting value from data: The effectiveness of this layer is critical in extracting meaningful information and providing it to the end users as required. Apart from diagnostic, prescriptive, and predictive analytics, data visualization is also a desired feature of this layer. Continuous inflow and exponential growth of data require a robust real-time high-speed data analytics framework.
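As an illustration of the data quality element above, the following minimal sketch flags missing and implausible values in records shaped like electronic health record entries. The field names, the plausibility ranges in RULES, and the function find_issues are all assumptions for illustration; real limits would be defined by clinical experts.

```python
# Hypothetical plausibility ranges (inclusive) for two vital signs.
RULES = {"heart_rate": (20, 250), "systolic_bp": (50, 300)}


def find_issues(record):
    """Return a list of (field, problem) pairs for one record."""
    issues = []
    for field, (low, high) in RULES.items():
        value = record.get(field)
        if value is None:
            issues.append((field, "missing"))
        elif not low <= value <= high:
            issues.append((field, "out_of_range"))
    return issues


# Made-up records: "a" is clean, "b" has a gap and an implausible value.
records = [
    {"id": "a", "heart_rate": 72, "systolic_bp": 120},
    {"id": "b", "heart_rate": None, "systolic_bp": 900},
]
report = {r["id"]: find_issues(r) for r in records}
```

Reporting issues rather than silently correcting them keeps the audit trail intact, which fits the traceability requirement in the curation step.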
Successful use of real-world data will lay a cornerstone for health equity. Given modern developments in data science, pharmaceutical scientists can speed up the clinical research journey even further. Applying the right technologies and a contextual solution approach can help clinical researchers achieve faster drug discovery.