OBJECTIVES: To evaluate the reliability and generalization of NeoNaid, a fully automated software tool for neonatal EEG analysis, based on functional brain age (FBA) estimation and sleep staging. METHODS: NeoNaid combines a multi-task deep learning model with proposed quality control routines that detect artifacts, out-of-distribution inputs, and uncertain predictions. From a raw EEG input, it outputs one global FBA estimate and a continuous 2-state hypnogram. We validated performance in two independent hospital settings: an internal dataset (33 EEGs, 17 infants, median 900 min/recording) and an external dataset (38 EEGs, 24 infants, median 124 min/recording). RESULTS: Quality control rejected a comparable number of segments in the internal and external datasets, reducing extreme errors in FBA estimation and modestly improving sleep staging accuracy. On the internal and external datasets, respectively, NeoNaid achieved median absolute FBA errors of 0.50 and 0.55 weeks and Cohen's kappa values of 0.89 and 0.87 for quiet sleep detection. DISCUSSION: NeoNaid demonstrated improved reliability through integrated quality control and maintained performance across two independent datasets. By focusing on validation and trustworthiness, this work takes an essential step toward clinical adoption of automated neonatal EEG analysis and supports its utility for both NICU practice and large-scale research.
Journal article
2026-01-01T00:00:00+00:00
20
automated analysis, clinical validation, functional brain age, neonatal EEG, quality control, sleep staging
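
The two summary metrics reported in the abstract, median absolute FBA error and Cohen's kappa for quiet sleep detection, can be illustrated with a minimal sketch. This is not NeoNaid's implementation: the function names `median_absolute_error` and `cohens_kappa` are hypothetical, and the input values are synthetic numbers chosen for illustration, not data from the study.

```python
from collections import Counter
from statistics import median


def median_absolute_error(true_ages, predicted_ages):
    """Median of |prediction - reference|, in the units of the inputs (e.g. weeks)."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(p - t) for t, p in zip(true_ages, predicted_ages))


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of epochs where both raters give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over classes of the product of marginal frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(counts_a) | set(counts_b)
    )
    if expected == 1.0:
        # Both sequences use a single identical class; kappa is undefined,
        # and this sketch returns 1.0 by convention (an assumption).
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Synthetic example (assumed values, not study data).
reference_age_weeks = [30.0, 32.0, 35.0, 38.0, 40.0]
estimated_fba_weeks = [30.5, 31.0, 35.2, 39.0, 40.4]
fba_error = median_absolute_error(reference_age_weeks, estimated_fba_weeks)

# 1 = quiet sleep, 0 = non-quiet sleep, one label per epoch.
expert_hypnogram = [1, 1, 0, 0, 1, 0, 1, 0]
model_hypnogram = [1, 1, 0, 0, 1, 0, 0, 0]
kappa = cohens_kappa(expert_hypnogram, model_hypnogram)
```

The median is used for the FBA error rather than the mean, so a few extreme errors have little influence on the summary. Kappa is preferred over raw accuracy for a 2-state hypnogram because it discounts the agreement expected by chance when one state dominates the recording.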