The Harvard Automated Processing Pipeline for Electroencephalography (HAPPE): Standardized Processing Software for Developmental and High-Artifact Data

Electroenchephalography (EEG) recordings collected with developmental populations present particular challenges from a data processing perspective. These EEGs have a high degree of artifact contamination and often short recording lengths. As both sample sizes and EEG channel densities increase, traditional processing approaches like manual data rejection are becoming unsustainable. Moreover, such subjective approaches preclude standardized metrics of data quality, despite the heightened importance of such measures for EEGs with high rates of initial artifact contamination. There is presently a paucity of automated resources for processing these EEG data and no consistent reporting of data quality measures. To address these challenges, we propose the Harvard Automated Processing Pipeline for EEG (HAPPE) as a standardized, automated pipeline compatible with EEG recordings of variable lengths and artifact contamination levels, including high-artifact and short EEG recordings from young children or those with neurodevelopmental disorders. HAPPE processes event-related and resting-state EEG data from raw files through a series of filtering, artifact rejection, and re-referencing steps to processed EEG suitable for time-frequency-domain analyses. HAPPE also includes a post-processing report of data quality metrics to facilitate the evaluation and reporting of data quality in a standardized manner. Here, we describe each processing step in HAPPE, perform an example analysis with EEG files we have made freely available, and show that HAPPE outperforms seven alternative, widely-used processing approaches. HAPPE removes more artifact than all alternative approaches while simultaneously preserving greater or equivalent amounts of EEG signal in almost all instances. We also provide distributions of HAPPE’s data quality metrics in an 867 file dataset as a reference distribution and in support of HAPPE’s performance across EEG data with variable artifact contamination and recording lengths. HAPPE software is freely available under the terms of the GNU General Public License at https://github.com/lcnhappe/happe.

Keywords: EEG; EEG processing; artifact removal; automated; data quality; development; electroencephalography; pipeline.