Data Aggregation: Definition and Importance to Life Sciences Researchers

The advent of big data and the explosion of health data sources theoretically provide a wealth of information for life sciences researchers. However, accessing meaningful data is still problematic, which increases the importance of data aggregation.

Data aggregation is the process of searching for, gathering, and presenting data in a summarized, report-based format to support specific business objectives and processes or to enable human analysis.
Data aggregation is the step between data collection and data analysis. Data analysis is an important component in the identification and development of new drugs, and researchers understand the need for quality data. However, even when a researcher has access to quality data and robust analysis tools, the importance of quality data aggregation is often overlooked.
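As a minimal sketch of the definition above, consider hypothetical trial records gathered from multiple sources and rolled up into a report-ready summary (the field names and figures here are illustrative, not drawn from any real study):

```python
from collections import defaultdict

# Hypothetical raw records gathered from disparate sources
records = [
    {"site": "A", "patients": 40, "adverse_events": 3},
    {"site": "A", "patients": 25, "adverse_events": 1},
    {"site": "B", "patients": 60, "adverse_events": 5},
]

def aggregate_by_site(records):
    """Summarize raw records into a report-ready format, one row per site."""
    totals = defaultdict(lambda: {"patients": 0, "adverse_events": 0})
    for rec in records:
        totals[rec["site"]]["patients"] += rec["patients"]
        totals[rec["site"]]["adverse_events"] += rec["adverse_events"]
    return dict(totals)

report = aggregate_by_site(records)
print(report)  # one summarized row per site, ready for analysis
```

The raw rows are searched, gathered, and condensed into a summary an analyst can act on, which is the essence of the aggregation step that sits between collection and analysis.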

While there are a number of data aggregation tools that can search for and gather data, the task can be onerous for life sciences researchers due to siloed, disparate data sources. At the pharmaceutical research and development (R&D) level, the greatest obstacle to quality data aggregation is the quality of the original data. The lack of consistent data standards and formats often means researchers find themselves acting as “data janitors” to clean up data before it can be used.

By some estimates, up to 80 percent of a Ph.D.-level researcher’s time is spent “mopping up” non-standardized data to create clean, analysis-ready information. Not only is this an inefficient use of a critical employee’s time, it also slows the pace of discovery.

The same challenge exists in all phases of drug development, clinical research, manufacturing and pharmaceutical marketing. Just having access to data is not enough – there must be a way to integrate, aggregate and harmonize data to provide easy access to information across the spectrum of users relying on data for research and business insights.

For clinical and translational research, the challenge is compounded by the increasing use of contract research organizations (CROs) and external healthcare providers in the research and clinical trial processes. As life sciences companies look for ways to decrease costs and bring new products to market more quickly, the need to collaborate becomes more critical. Data aggregation may differ for each researcher, but relying on a platform that integrates and harmonizes data from each disparate source enables researchers to easily access the information they need for data analysis.

While there are a number of integration and data aggregation solutions, the unique needs of scientists in pharmaceutical R&D require a research-oriented, healthcare-specific solution. For example, a platform that features intuitive high-volume clinical and omics data import; robust processes for genomic analysis; comprehensive translational research tools; and connected access to public databases and reference data supports cross-organizational collaboration in the drug discovery process.

Data aggregation can also be hampered by the limitations of research partners’ systems. Although only an estimated 30 percent of pharma organizations currently use the cloud for data applications, a cloud-based integration and aggregation platform is well suited to pharma research: because new research partners require no capital investment to connect, the pool of potential data sources is easily expanded.

Quality and accuracy are also improved by a platform that harmonizes the format of aggregated data. When CROs can hand off their electronic data capture (EDC) data in native form to a platform that translates it into the pharmaceutical partner’s required format, time is saved and errors are reduced for both partners. Likewise, clinicians in physician offices or hospitals reporting results to a pharmaceutical company can submit data from their existing systems to a digital platform that converts it into a format accessible to all researchers.
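The harmonization step described above can be sketched as a mapping from each partner’s native field names to a single standard schema. The source names, mappings, and records below are hypothetical, invented purely to illustrate the idea:

```python
# Hypothetical field mappings: each partner's native schema -> one common format
FIELD_MAPS = {
    "cro_edc": {"subj_id": "subject_id", "ae_count": "adverse_events"},
    "hospital": {"patient_ref": "subject_id", "events": "adverse_events"},
}

def harmonize(record, source):
    """Translate a record from a partner's native field names to the standard schema."""
    mapping = FIELD_MAPS[source]
    return {mapping.get(field, field): value for field, value in record.items()}

cro_row = {"subj_id": "S-001", "ae_count": 2}      # CRO's native EDC format
hosp_row = {"patient_ref": "S-002", "events": 0}   # hospital's native format
print(harmonize(cro_row, "cro_edc"))
print(harmonize(hosp_row, "hospital"))
```

Both records emerge with identical field names, so downstream analysis tools see one consistent format regardless of which partner supplied the data.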
A data integration and aggregation solution should meet the following criteria:

  • Enterprise-wide solution that leverages legacy IT systems and integrates all systems with no disruption to operations
  • Flexible architecture that allows for growth to adapt to changing needs for data aggregation
  • Predictable performance with storage that persists data in a big data repository – providing on-demand, self-service access to clean, quality data
  • Scalability that provides incremental updates in minutes to enable near-real-time processing and ensure researchers have the most up-to-date information
  • Access to data integration and aggregation expertise that includes familiarity with health research specific requirements including privacy and security rules
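The scalability criterion above — incremental updates rather than full recomputation — can be illustrated with a running summary that absorbs each new record as it arrives. This is a simplified sketch of the idea, not any particular platform’s implementation:

```python
class IncrementalAggregate:
    """Maintain a running summary so each new record updates totals in
    constant time, instead of re-scanning the full data repository."""

    def __init__(self):
        self.patients = 0
        self.adverse_events = 0

    def ingest(self, record):
        # Update the existing totals in place as new data arrives
        self.patients += record["patients"]
        self.adverse_events += record["adverse_events"]

agg = IncrementalAggregate()
for rec in ({"patients": 40, "adverse_events": 3},   # hypothetical new batches
            {"patients": 60, "adverse_events": 5}):
    agg.ingest(rec)
print(agg.patients, agg.adverse_events)
```

Because each update touches only the running totals, fresh data can be reflected in minutes rather than waiting for a full batch rebuild, which is what enables near-real-time access.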

Unfortunately, even as pharmaceutical researchers gain access to more data, aggregation and analysis can be hampered by shrinking resources, both technological and human. As the industry becomes more data-centric, internal IT staffs will be challenged to stay abreast of changes in technology, compliance requirements and data-sharing standards.

For this reason, many pharmaceutical companies turn to a cloud-based platform with a predictable, subscription-based pricing model that includes managed services. Relying on a third party to monitor and manage integration, harmonization and data aggregation provides access to highly skilled technical staff who are not only knowledgeable about the technology but also up to date on changing standards and multiple data formats. This frees in-house IT staff to focus on their core business activities.

More importantly for pharmaceutical researchers, use of a digital platform that aggregates and harmonizes the data before depositing it into research-oriented systems makes data available to scientists in a timely manner. When a third-party expert manages the platform, the scientist is free to focus on research – a better use of time and company resources.

While the definition of data aggregation is straightforward, actually achieving quality aggregation of quality data, and the quality insights that follow, is not simple in life sciences research. As pharmaceutical companies evaluate ways to improve time-to-market for new drugs, it is important not to overlook technology that can streamline data aggregation by integrating and harmonizing data from multiple sources into standard formats all researchers can access and use.

Although Thomas Sowell is an American economist rather than a pharmaceutical researcher, his observation that “the same set of statistics can produce opposite conclusions at different levels of aggregation” rings true for scientists. Providing data in a format that optimizes the use of data aggregation tools and data analysis leads to more accurate insights – which leads to faster time-to-market for pharmaceutical products.

OpenText

OpenText is the leader in Enterprise Information Management (EIM). Our EIM products enable businesses to grow faster, lower operational costs, and reduce information governance and security risks by improving business insight, impact and process speed.
