National Cancer Institute

Integrative Data Analysis and Big Data

Funding Opportunity Announcements
  1. Cancer-related Behavioral Research through Integrating Existing Data PAR-16-256 (R01), PAR-16-255 (R21) Expires June 15, 2019
  2. Modeling Social Behavior PAR-13-374 (R01) Expires January 8, 2017
  3. Methodology and Measurement in the Behavioral and Social Sciences PAR-16-260 (R01), PAR-16-261 (R21) Expires September 8, 2019
  4. NCI Small Grants Program for Cancer Research (Omnibus) PAR-16-416 (R03) Expires January 8, 2020

Integrative Data Analysis

Integrative data analysis (IDA) refers to a set of strategies in which two or more independent data sets are pooled or combined into one and then statistically analyzed. IDA approaches differ from and offer advantages over other methodological techniques that also strive to build cumulative knowledge bases, such as meta-analysis.

In meta-analysis, summary statistics from multiple studies are pooled. Because IDA techniques pool the original raw data, individual-level information is not lost as it is in meta-analytic approaches, which allows researchers to find out what works, for whom, and in which contexts. In addition, IDA affords expanded inquiry within many areas of health behavior research. IDA can be used to incorporate big data that were not originally intended for the examination of theoretically relevant measures. For example, searches on Google for health-related topics could be used as an objective measure of information seeking that could supplement what is gleaned from a self-report data source such as the Health Information National Trends Survey (HINTS).
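The contrast between pooling summary statistics and pooling raw data can be illustrated with a minimal Python sketch. The two studies, the field names (`age_group`, `score`), and all values are hypothetical, and the sample-size weighting stands in for the more elaborate weighting schemes used in real meta-analyses.

```python
from statistics import mean

# Hypothetical raw records from two independent studies (illustrative values only).
study_1 = [
    {"age_group": "under_50", "score": 4.0},
    {"age_group": "50_plus", "score": 2.0},
]
study_2 = [
    {"age_group": "under_50", "score": 5.0},
    {"age_group": "50_plus", "score": 3.0},
    {"age_group": "50_plus", "score": 1.0},
]


def summary_level_pool(studies):
    """Meta-analysis style: combine per-study means, weighted by sample size.

    Only one number per study is used, so subgroup detail is unavailable.
    """
    total_n = sum(len(study) for study in studies)
    weighted = sum(mean(r["score"] for r in study) * len(study) for study in studies)
    return weighted / total_n


def raw_level_pool(studies):
    """IDA style: stack the raw records, then summarize within subgroups."""
    pooled = [record for study in studies for record in study]
    groups = {}
    for record in pooled:
        groups.setdefault(record["age_group"], []).append(record["score"])
    return {group: mean(scores) for group, scores in groups.items()}


overall = summary_level_pool([study_1, study_2])
by_group = raw_level_pool([study_1, study_2])
print(overall)
print(by_group)
```

In this toy example the summary-level result is a single overall value, while the raw-level pool retains enough detail to compare subgroups, which is the kind of "for whom" question the paragraph above describes.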

Data integration typically takes one of two forms:

  1. merging data by common data elements (units of information that are shared or widely used across data collection efforts), where these elements are often multi-item scales or indices but can be individual items; or
  2. linking data sets through a common factor at the record level (e.g., linking records through demographic information), as in the Surveillance, Epidemiology, and End Results (SEER)-Medicare data set, or at multiple levels such as the environmental or policy level (e.g., linking state- or county-level information with individual-level data).
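The two forms of integration listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the surveys, the county policy table, and every field name (`info_seeking`, `smoke_free_law`, and so on) are hypothetical and do not come from HINTS, SEER-Medicare, or any other real data set.

```python
# Hypothetical toy records; all field names and values are illustrative only.
survey_a = [
    {"id": "a1", "county": "001", "info_seeking": 3},
    {"id": "a2", "county": "002", "info_seeking": 5},
]
survey_b = [
    {"id": "b1", "county": "002", "info_seeking": 4, "extra_item": 1},
]
# Hypothetical county-level policy information keyed by county code.
county_policy = {
    "001": {"smoke_free_law": True},
    "002": {"smoke_free_law": False},
}


def pool_by_common_elements(datasets):
    """Form 1: stack records, keeping only fields present in every record."""
    common = set.intersection(
        *(set(record) for records in datasets.values() for record in records)
    )
    pooled = []
    for source, records in datasets.items():
        for record in records:
            row = {field: record[field] for field in sorted(common)}
            row["source"] = source  # remember which data set each row came from
            pooled.append(row)
    return pooled


def link_by_key(records, lookup, key):
    """Form 2: attach higher-level information to each record via a shared key."""
    return [{**record, **lookup.get(record[key], {})} for record in records]


pooled = pool_by_common_elements({"A": survey_a, "B": survey_b})
linked = link_by_key(pooled, county_policy, "county")
for row in linked:
    print(row)
```

Keeping a `source` field on every pooled row is a deliberate choice in this sketch: it allows later analyses to account for differences between the contributing data sets.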

The Behavioral Research Program seeks to promote the use of IDA to answer novel cancer control questions and accelerate scientific discovery.

Overview of Big Data

Big Data is a term that captures the opportunities and challenges involved in accessing, managing, analyzing, and integrating information within data sets that are increasingly large, diverse, and complex. These data sets currently exceed the abilities of traditional data management approaches. The value of data from behavioral measures can be significantly amplified by aggregating or integrating them with other data. Adapted from: https://datascience.nih.gov/bd2k/about/what.

Big Data and Theory Advancement

The Behavioral Research Program (BRP) is invested in improving the scientific rigor with which health behavior theories are tested and applied. BRP encourages and supports the use of new data sources and methods for theory testing. Information about behavior and its influences, from both prospective and archival collection methods, is increasingly temporally dense and big (i.e., high in volume, variety, and velocity). These Big Data require advanced analytic approaches, greater access, and more opportunity for training and collaboration.

In September 2013, the program developed the Big D.A.T.A. (Data and Theory Advancement) workshop to complement the NIH Big Data to Knowledge (BD2K) effort.

The Big D.A.T.A. initiative convened experts in data analytics, systems science, and theory development and testing in order to address a fundamental question: “How can behavioral scientists contribute to and leverage Big Data to advance health behavior theory in the context of cancer risk reduction and improved disease outcomes?”

Robust data sets and accompanying models of dynamical systems present opportunities to substantively test, refine, and improve health behavior theories. The goal of the Big D.A.T.A. initiative is to stimulate new directions in theory development, testing, and integration with the use of Big Data, dynamic systems modeling, and novel measurement advances.

September 2013 Big D.A.T.A. Workshop Executive Summary (PDF)

September 2013 Big D.A.T.A. Workshop Executive Presentations

Contact

Richard Moser, Ph.D.
Fellowship Training and Research Methods Coordinator
240-276-6915
richard.moser@nih.gov