Analyzing clinical trial data culminates in tables, listings, and figures (TLFs): reports that form part of the clinical study report submitted to regulatory authorities, or that support ongoing review during study conduct (e.g., safety update reviews). Because accuracy of content is paramount, producing them is a time-consuming and effort-intensive but necessary part of clinical study analysis.
There are two aspects of clinical reporting:
Designing the TLFs, and
Generating the TLFs based on study data
Both activities are manual and interrelated, but a single solution can address them. Let us examine this further.
For any clinical study, the reporting package primarily consists of standard reports and study-specific reports. The design and layout of standard reports remain the same across studies and require no additional design work beyond generating and validating them. However, any study-specific update to a standard report is noted in the specification document and must be incorporated by the study programmer. We can therefore conclude that standard report designs can be reused for each study, with minor sponsor-defined modifications at the study level.
Study-specific reports, however, must be designed from scratch. Today they are typically drafted in documents, or even mocked up on slides and pasted into the specification. This lack of automation between specification and deliverable is one reason for the gaps observed in reports at submission. Moreover, such lapses are often discovered only in later stages, once some clinical trial data is available.
As mentioned earlier, the study programmer is the link between the specification and the TLFs that are created. In the current process flow (Figure 1), the trial statistician lists the TLFs (standard or study-specific) in the TLF specification document. The study programmers (including validators) take this as input and write the code needed to generate each TLF; in some cases, this code comes from a standard library. Every update to the specification must therefore be cascaded through the programmer before its intended impact appears in the TLF. This has clear disadvantages: beyond the lapses and errors caused by misinterpreting the specifications, it requires new programmers to understand the complete flow, so new resources cannot easily be brought in during milestone deliveries.
In short, the TLF process can be optimized further to reduce the team's time and effort. The goals can be summarized as follows:
The specification should be consistent and machine-readable.
Based on the defined specifications, the layout of the TLF should be easy to comprehend, with consistent meaning and implications so as to flatten the learning curve.
Lastly, the specification should be in a format that an engine can consume to generate the TLF automatically, as sketched in the example below.
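To make the last goal concrete, the following is a minimal sketch of what a machine-readable TLF specification might look like, written in Python. The format and field names (display_id, population, and so on) are illustrative assumptions, not part of any published standard.

```python
# Illustrative machine-readable TLF specification; the structure and field
# names are hypothetical assumptions, not a published standard.
tlf_spec = {
    "display_id": "T14.1.1",
    "title": "Summary of Demographics",
    "population": "Safety Analysis Set",
    "footnotes": ["N = number of subjects in the analysis set."],
    "analyses": [
        {
            "variable": "AGE",
            "group_by": "TRT01A",  # treatment variable
            "statistics": ["n", "mean", "sd", "median", "min", "max"],
        }
    ],
}

def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec is usable."""
    problems = []
    for key in ("display_id", "title", "population", "analyses"):
        if key not in spec:
            problems.append(f"missing required field: {key}")
    return problems

print(validate_spec(tlf_spec))  # [] -> consistent and machine-checkable
```

A structured specification of this kind can be validated automatically and handed directly to a generation engine, eliminating manual transcription between the specification document and the program.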
Meeting all the above goals will not only reduce the team's effort but also close the gap between the design and the actual report. To achieve this, however, we need robust data models that cover all aspects of clinical trial reporting.
CDISC has developed and announced the Analysis Results Standard (ARS), an exhaustive model covering all aspects of analysis reporting. The model covers the following:
Table of contents: Provides an overview of the reporting package that would be delivered for an analysis. For example, only a subset of reports might be generated for an interim analysis.
Layout: Provides the layout details for each individual report: the title, footnotes, section headers, legends, and all other presentational details the report must maintain.
Report content: Defines the content displayed in each report. For a listing, this is typically the values of selected variables, with or without conditions; for a table, it includes the statistics to be shown (e.g., mean, standard deviation, quartiles). It also specifies the population on which the report is based.
Content derivation: Defines how the content is derived from the dataset. For example, if the statistic to display is the percentage of subjects with adverse events in the safety population, the report needs additional information about the numerator (the count of subjects with adverse events) and the denominator (the count of subjects in that treatment group). A sketch of such derivation metadata follows this list.
Data massaging: Defines any pre-processing of the data that is needed before the report can be derived or the analysis performed.
Analysis result storage: Defines the structure for storing all generated analysis results. Because it is a single data model, results are stored consistently across reports, and a report-generation engine can produce all reports in one go.
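To illustrate the content-derivation component described above, the sketch below computes an adverse-event percentage from numerator and denominator metadata. The metadata structure and field names are hypothetical simplifications and do not reproduce the actual ARS schema.

```python
# Hypothetical, simplified derivation metadata in the spirit of ARS
# (the real ARS schema is richer; these field names are illustrative).
derivation = {
    "statistic": "pct_subjects_with_ae",
    "numerator": {"dataset": "ADAE", "distinct": "USUBJID",
                  "filter": lambda row: row["SAFFL"] == "Y"},
    "denominator": {"dataset": "ADSL", "distinct": "USUBJID",
                    "filter": lambda row: row["SAFFL"] == "Y"},
}

def count_distinct(rows, rule):
    """Count distinct subjects among rows that satisfy the rule's filter."""
    return len({row[rule["distinct"]] for row in rows if rule["filter"](row)})

# Toy data standing in for ADAE (adverse events) and ADSL (subject-level) records
adae = [{"USUBJID": "001", "SAFFL": "Y"}, {"USUBJID": "001", "SAFFL": "Y"},
        {"USUBJID": "002", "SAFFL": "Y"}]
adsl = [{"USUBJID": "001", "SAFFL": "Y"}, {"USUBJID": "002", "SAFFL": "Y"},
        {"USUBJID": "003", "SAFFL": "Y"}, {"USUBJID": "004", "SAFFL": "Y"}]

num = count_distinct(adae, derivation["numerator"])    # 2 subjects with AEs
den = count_distinct(adsl, derivation["denominator"])  # 4 subjects in safety set
print(f"{num}/{den} = {100 * num / den:.1f}%")         # 2/4 = 50.0%
```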
As an example, let us look at the standard demography table as shown in Table 1.
Table 2 below depicts the same table shell annotated with ARS categories (Layout, Report Content, Content Derivation) and primary metadata points (Display, Analysis Group, Analysis Variable, Treatment Variable, Analysis Method, Display Format, Operation). As the annotated shell shows, all aspects of the report can be defined using the ARS model; a sketch of how these metadata points could drive a cell of the table appears below. The same holds for complex shells, where the other categories mentioned above can be referenced.
As seen above, ARS is a complete, structured representation of an analysis, which streamlines the analysis process.
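The sketch below shows how the metadata points annotated in Table 2 could drive the rendering of a single cell of the demography table. The class and field names are assumptions for illustration, not the ARS schema itself.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class CellMetadata:
    """Hypothetical bundle of the metadata points annotated in Table 2."""
    display: str             # display identifier of the table
    analysis_group: str      # population, e.g., "Safety Analysis Set"
    analysis_variable: str   # e.g., "AGE"
    treatment_variable: str  # e.g., "TRT01A"
    analysis_method: str     # e.g., "Descriptive statistics"
    operation: str           # e.g., "mean"
    display_format: str      # e.g., "{:.1f}"

def render_cell(meta: CellMetadata, records: list[dict], arm: str) -> str:
    """Apply the operation to the analysis variable for one treatment arm."""
    values = [r[meta.analysis_variable] for r in records
              if r[meta.treatment_variable] == arm]
    if meta.operation == "mean":
        return meta.display_format.format(mean(values))
    raise NotImplementedError(meta.operation)

meta = CellMetadata("Table 14.1.1", "Safety Analysis Set", "AGE",
                    "TRT01A", "Descriptive statistics", "mean", "{:.1f}")
adsl = [{"TRT01A": "Drug A", "AGE": 54}, {"TRT01A": "Drug A", "AGE": 61},
        {"TRT01A": "Placebo", "AGE": 58}]
print(render_cell(meta, adsl, "Drug A"))  # "57.5"
```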
Given the robust structure of ARS, there could be multiple possibilities for automation, but let us consider the following use cases:
The ARS model, on the framework of a robust MDR and a well-built execution engine, will optimize the study deliverable process significantly. Since the approach is metadata-driven, it not only saves effort but also helps reduce the errors that might otherwise arise. It is also beneficial where the TLF specifications change during the conduct and/or closure of the study: the process around specification development can be designed so that there is a negligible chance of changes not being reflected in the TLF metadata.
Let us look at the benefits of implementing ARS within the framework of an MDR (metadata repository) and a CDR (clinical data repository).
Metadata content
TLF designer: Since each aspect of the report is defined precisely by a specific component within the model, new reports are easier to define in a standard format. An added advantage is that the TLF designer's metadata extract is not only machine-readable but can also be consumed as a mock shell specification document.
Study setup: A standard library of TLF metadata in the MDR not only allows the standards team to maintain the report metadata but also enables setting up a study's standard reports in a single click.
TLF automation: A metadata-driven execution engine that reads the machine-readable TLF specifications can automate report generation. The MDR thus implements the ARS metadata as a single source of truth, ensuring that the downstream activity of creating the report consumes the same metadata (see the sketch after this list).
End-to-end traceability: Since the ARS model also provides a component to define the transformation metadata, complete end-to-end traceability can readily be built on top of the libraries.
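The following is a minimal sketch of such a metadata-driven execution engine: it reads TLF specifications from a store and dispatches each one to a generator keyed by report type. All names here (the spec fields, GENERATORS, run_engine) are hypothetical.

```python
# Minimal sketch of a metadata-driven execution engine (names are hypothetical).
# The engine never hard-codes a report: it reads specifications from the MDR
# and dispatches each one to a generator keyed by report type.

def generate_table(spec: dict) -> str:
    return f"[table] {spec['title']}"

def generate_listing(spec: dict) -> str:
    return f"[listing] {spec['title']}"

GENERATORS = {"table": generate_table, "listing": generate_listing}

def run_engine(specs: list[dict]) -> list[str]:
    """Generate every report in the package from its metadata, in one go."""
    return [GENERATORS[spec["type"]](spec) for spec in specs]

# Stand-in for specifications pulled from the MDR (single source of truth)
specs = [
    {"type": "table", "title": "Summary of Demographics"},
    {"type": "listing", "title": "Listing of Adverse Events"},
]
for output in run_engine(specs):
    print(output)
```

Because the engine is driven entirely by metadata, a change in the specification flows to the output without any programmer intervention, which is the traceability benefit noted above.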
Data content
The analysis result data model can be incorporated into an existing CDR to store results across different studies. This facilitates activities such as data mining and even meta-analysis, drawing insights from results pooled for a single compound, indication, or population, as sketched below.
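As a brief illustration of such pooling, the sketch below filters a flat store of analysis results across studies by compound and aggregates them. It assumes pandas is available, and the column names are illustrative rather than a published results schema.

```python
import pandas as pd

# Illustrative flat store of analysis results pooled across studies;
# column names are hypothetical, not a published results schema.
results = pd.DataFrame([
    {"study": "ABC-301", "compound": "drug-x", "display": "T14.1.1",
     "statistic": "mean_age", "value": 57.5},
    {"study": "ABC-302", "compound": "drug-x", "display": "T14.1.1",
     "statistic": "mean_age", "value": 59.2},
    {"study": "XYZ-101", "compound": "drug-y", "display": "T14.1.1",
     "statistic": "mean_age", "value": 45.0},
])

# Because every study stores results in the same model, pooling for a single
# compound reduces to a filter plus an aggregation.
pooled = results[results["compound"] == "drug-x"]
print(pooled.groupby("statistic")["value"].mean())  # mean_age    58.35
```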
An MDR with robust metadata management helps organizations to reduce cost and effort, promotes reusability, assists with process automation, and enables end-to-end data traceability. Including the robust ARS model not only strengthens the MDR to meet the above objectives but also provides a framework to design new TLFs. The complete framework has immense potential to help organizations attain operational efficiency by enabling process automation.
The authors would like to thank Mayank Bhatia, Head of Product Strategy and Management, Tata Consultancy Services Ltd (TCS) for providing valuable guidance and feedback.
© Copyright 2024, Tata Consultancy Services Limited. All Rights Reserved. Document ID CGTC010555