AttenGW: Gravitational-Wave Detection Pipeline 2025
Resources: AttenGW Paper (Link) | AttenGW GitHub | RADAR Paper (integration example)
Implementation: Python with PyTorch/PyTorch Lightning for HPC and multi-GPU training.
Overview: Low-latency gravitational-wave detection is challenging because true signals are rare and often buried in non-stationary detector noise, while follow-up facilities need reliable triggers within seconds. At the same time, several observatories operate under data locality constraints, making it impractical to stream raw data to a central location for further analysis. AttenGW is a deployable, end-to-end GW detection pipeline for real LIGO data: it includes a GWOSC-based data download + preprocessing workflow, a training/inference stack, and a lightweight multi-detector model designed to run efficiently on HPC hardware. It can also be exposed through interfaces that make it straightforward to plug into federated, multi-messenger workflows (e.g., RADAR).
Model & Training: I developed AttenGW as a standalone, deployable pipeline covering preprocessing/data loading, injected-signal dataset construction, model training, and trigger generation on real interferometer data, with an emphasis on reliable low-latency triggers on non-stationary noise and practical deployment close to the data. An earlier version of this detector was integrated into RADAR as a federated GW component; in that setting, my module served as one component within the broader RADAR stack (with the federated infrastructure and radio-analysis layer led by collaborators).
The core model is a WaveNet-style Hierarchical Dilated Convolutional Network (HDCN) that processes Hanford (H) and Livingston (L) strain independently to capture long-range temporal structure in fixed-length windows. Its outputs are combined by an attention-based aggregator that performs cross-channel communication between detectors to produce a per-timestep GW confidence score (see the model schematics below for a visual overview). I pursued an attention-based design because of my prior work with Transformers for GW forecasting and because attention mechanisms have proven broadly effective in signal processing tasks where long-range dependencies matter. This replaces graph-based multi-detector aggregation and helps reduce false positives in multi-detector data while keeping the model lightweight enough for deployment close to the data.
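To make the per-detector branch concrete, here is a minimal PyTorch sketch of a dilated convolutional stack in the spirit of the HDCN. The layer count, channel widths, dilation schedule, and the class name `DilatedConvStack` are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class DilatedConvStack(nn.Module):
    """WaveNet-style stack of 1D convolutions with exponentially growing
    dilation, applied independently to each detector's strain window.
    Hyperparameters here are illustrative, not the paper's exact values."""
    def __init__(self, in_channels=1, hidden_channels=32, n_layers=8, kernel_size=3):
        super().__init__()
        layers = []
        channels = in_channels
        for i in range(n_layers):
            dilation = 2 ** i  # receptive field roughly doubles per layer
            layers.append(nn.Conv1d(channels, hidden_channels, kernel_size,
                                    padding=dilation, dilation=dilation))
            layers.append(nn.GELU())
            channels = hidden_channels
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, 1, n_samples) whitened strain for one detector
        return self.net(x)  # (batch, hidden_channels, n_samples)
```

Exponentially growing dilations are what give the stack a long receptive field at low parameter cost, which is the property the per-detector branch relies on.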
Figures 1–2. Whitened BBH (left) and BNS (right) injections in real LIGO noise for Hanford and Livingston. Both use labels set to 1 only in the 0.5 s preceding merger and 0 elsewhere; separate models are trained for each source class.
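A minimal sketch of this labeling scheme is shown below; the sample rate, function name, and argument names are assumptions for illustration rather than the pipeline's actual code.

```python
import numpy as np

def make_labels(n_samples, merger_idx, sample_rate=4096, pre_merger_s=0.5):
    """Per-timestep labels: 1 in the 0.5 s immediately preceding the injected
    merger, 0 elsewhere. The 4096 Hz sample rate is an illustrative choice."""
    labels = np.zeros(n_samples, dtype=np.float32)
    start = max(0, merger_idx - int(pre_merger_s * sample_rate))
    labels[start:merger_idx] = 1.0
    return labels
```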
Models are trained on synthetic waveforms injected into real LIGO noise from the Gravitational Wave Open Science Center (GWOSC), using a curriculum that starts from louder injections and decays toward realistic signal-to-noise ratios. The data preprocessing pipeline involves glitch interpolation, whitening, band-passing, windowing, and label construction (with positive labels restricted to the pre-merger segment). The model returns a per-timestep score, and post-processing applies a calibrated, multi-criterion peak-finding stage (height, width, and plateau tests) to convert scores into reliable trigger times for downstream analyses.
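The sketch below illustrates what such a multi-criterion trigger stage can look like, using SciPy's peak finder plus a simple plateau check. The threshold values, the plateau test, and the function name are illustrative placeholders, not the calibrated settings from the paper.

```python
import numpy as np
from scipy.signal import find_peaks

def scores_to_triggers(scores, sample_rate=4096, height=0.9,
                       min_width_s=0.05, plateau_level=0.5, plateau_s=0.1):
    """Convert a per-timestep score array into trigger times (seconds) using
    height, width, and plateau criteria. All thresholds are illustrative."""
    peaks, _ = find_peaks(scores, height=height,
                          width=int(min_width_s * sample_rate))
    triggers = []
    for p in peaks:
        # plateau test: scores must stay above plateau_level for ~plateau_s
        half = int(plateau_s * sample_rate) // 2
        lo, hi = max(0, p - half), min(len(scores), p + half)
        if np.all(scores[lo:hi] >= plateau_level):
            triggers.append(p / sample_rate)
    return triggers
```

Requiring a peak to pass several independent tests, rather than a single threshold crossing, is what keeps the trigger rate manageable on non-stationary backgrounds.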
Figure 3. Example application to the binary neutron star event GW170817. Top: whitened L1 strain. Middle: whitened H1 strain. Bottom: model output and resulting trigger, correctly localizing the merger time in hour-long real interferometer data. A Tukey window is applied to a known glitch in the Livingston detector; the detection is robust to this choice.
Model Schematics:
Figure 4. Hierarchical Dilated Convolutional Network (HDCN) used per detector to capture multi-scale temporal structure in the strain.
Figure 5. On the left, Cross-Attention Network (CAN) submodules that exchange information between detectors via multi-head attention, enabling detector-specific correlations to be modeled explicitly. On the right, final output module that aggregates enriched detector features into a per-timestep GW detection score suitable for low-latency triggering.
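As an illustration of the cross-detector attention step in Figure 5, here is a minimal sketch built on `nn.MultiheadAttention`, where each detector's feature stream attends to the other before the fused features are mapped to a per-timestep score. Dimensions, head counts, and the class name `CrossDetectorAttention` are assumptions, not the CAN's exact design.

```python
import torch
import torch.nn as nn

class CrossDetectorAttention(nn.Module):
    """Minimal cross-attention sketch: Hanford features attend to Livingston
    features and vice versa, then both streams are merged into a per-timestep
    detection score. Sizes are illustrative, not the published configuration."""
    def __init__(self, d_model=32, n_heads=4):
        super().__init__()
        self.h_attends_l = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.l_attends_h = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Sequential(nn.Linear(2 * d_model, d_model),
                                  nn.GELU(),
                                  nn.Linear(d_model, 1))

    def forward(self, feat_h, feat_l):
        # feat_h, feat_l: (batch, n_samples, d_model) per-detector features
        h_enriched, _ = self.h_attends_l(feat_h, feat_l, feat_l)
        l_enriched, _ = self.l_attends_h(feat_l, feat_h, feat_h)
        fused = torch.cat([h_enriched, l_enriched], dim=-1)
        return torch.sigmoid(self.head(fused)).squeeze(-1)  # (batch, n_samples)
```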
Benchmarks:
AttenGW is benchmarked on real O3 data and on injection studies in real LIGO noise. For comparability with prior work, the paper mirrors the February 2020 O3b segment previously used to evaluate a spatiotemporal graph ensemble. On this month of data, a single AttenGW model reduces false positives relative to a single graph-based detector by a factor of a few, and a 3-model AttenGW coincidence matches the background suppression previously reached with a 6-model ensemble (at matched detection efficiency). Injection studies on real noise further support the stability of attention-based aggregation on non-Gaussian backgrounds.
Federated Integration: RADAR is an example of how this kind of detector can be deployed in a federated setting: only compact trigger information and learned representations are shared with a coordinator, preserving data locality while enabling joint GW–EM analysis. I contributed by providing the GW detection module and helping shape the interfaces/requirements so it integrated cleanly, while the federated optimization, event fabric, and radio afterglow modeling were led by the broader RADAR team.
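For intuition, a compact trigger payload in such a setting might look like the hypothetical structure below; the field names and types are illustrative and do not describe the actual RADAR interface.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TriggerMessage:
    """Hypothetical example of the compact information a federated GW node
    could share with a coordinator; fields are illustrative, not RADAR's API."""
    gps_time: float             # candidate trigger time
    confidence: float           # peak per-timestep detection score
    detectors: Tuple[str, ...]  # e.g. ("H1", "L1")
    embedding: List[float]      # short learned representation of the event
```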
Related Work: This detector builds on and extends previous ML-based GW detection approaches, in particular that of Tian et al. (2023), and uses real strain data and catalogued events from GWOSC. Our contribution retrains on injections into real GWOSC noise, redesigns the multi-detector aggregator with cross-attention for improved false-positive control, and provides a production-ready implementation suitable for integration into federated multi-messenger infrastructures.