EEG-IP: An international infant EEG data integration platform for the study of risk and resilience in autism and related conditions

Background: Establishing reliable predictive and diagnostic biomarkers of autism would enhance early identification and facilitate targeted intervention during periods of greatest plasticity in early brain development. High-impact research on biomarkers is currently limited by relatively small sample sizes and the complexity of the autism phenotype.

Methods: EEG-IP is an International Infant EEG Data Integration Platform developed to advance biomarker discovery by enhancing the large-scale integration of multi-site data. EEG-IP is currently the largest multi-site standardized dataset of infant EEG data.

Results: First, multi-site data from longitudinal cohort studies of infants at risk for autism were pooled in a common repository containing 1382 longitudinal EEG recordings, with linked behavioral data, from 432 infants between 3 and 36 months of age. Second, to address the limited comparability of independent recordings, EEG-IP applied the Brain Imaging Data Structure (BIDS)-EEG standard, resulting in a harmonized, extendable, and integrated data state. Finally, the pooled and harmonized raw data were pre-processed using a common signal processing pipeline that maximizes signal isolation and minimizes data reduction. With EEG-IP, we produced a fully standardized dataset of the pooled, harmonized, and pre-processed EEG data from multiple sites.

Conclusions: Implementing these integrated solutions for the first time with infant data has demonstrated both successes and challenges in generating a standardized multi-site data state. The challenges relate to annotation of signal sources, time, and independent component analysis (ICA) during pre-processing. A number of future opportunities also emerge, including validation of analytic pipelines that can replicate existing findings and/or test novel hypotheses.

Keywords: Autism risk; Biomarkers; EEG; High performance computing; ICA; Pre-processing.