Aqua DNA Project: Rainfall-Aware Hybrid Rule-Based Framework for Detecting Sewer Blockages
Using a hybrid rule-based detection framework, analysis of CSO (Combined Sewer Overflow) sensor and rainfall data from 2017 to 2020 revealed persistent abnormal sewer level patterns across multiple sites, highlighting opportunities for earlier blockage intervention through automated, data-driven alerting. The framework was developed in Python, with pandas, NumPy, and Matplotlib supporting the core analytical pipeline.
Hybrid Rule-Based Detection
The detection framework combines statistical anomaly detection with engineering rules, requiring three conditions to be satisfied simultaneously before an alert is raised.
The first condition requires that the observed sewer level exceeds the rolling 99th percentile baseline, indicating that current levels are abnormally high relative to recent history at that site. The second condition requires that weather conditions are classified as dry, meaning rainfall accumulation over the past six hours falls below a defined threshold. This step explicitly filters out storm-driven level increases. The third condition requires that the abnormal dry-weather state persists for a minimum of four consecutive hours.
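The first two conditions can be sketched in pandas as follows. This is a minimal illustration rather than the project's actual code: the column names level and rain_mm, the 30-day baseline window, the 0.5 mm rainfall threshold, and the choice to exclude the current reading from its own baseline are all assumptions made for the example.

```python
import pandas as pd


def flag_abnormal_dry(
    df,
    baseline_window=pd.Timedelta(days=30),  # assumed window length
    rain_window=pd.Timedelta(hours=6),      # six hours, as in the text
    rain_threshold_mm=0.5,                  # assumed threshold value
):
    """Flag readings that are abnormally high during dry weather.

    df needs a sorted DatetimeIndex and two hypothetical columns:
    'level' (sewer level) and 'rain_mm' (rainfall per reading).
    """
    out = df.copy()
    # Condition 1: rolling 99th percentile of recent levels at this site.
    # shift(1) keeps the current reading out of its own baseline
    # (an assumption; the source does not say how this was handled).
    out["baseline"] = (
        out["level"].rolling(baseline_window).quantile(0.99).shift(1)
    )
    out["is_high"] = out["level"] > out["baseline"]
    # Condition 2: rainfall accumulated over the past six hours is low.
    out["rain_6h"] = out["rain_mm"].rolling(rain_window).sum()
    out["is_dry"] = out["rain_6h"] < rain_threshold_mm
    # Both conditions must hold at the same reading.
    out["abnormal_dry"] = out["is_high"] & out["is_dry"]
    return out
```

Because the baseline is recomputed per site from that site's own history, the same function could be applied to each CSO without a hand-set level limit.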
Persistence is critical to the framework’s reliability. Short spikes may occur naturally or due to sensor noise, and requiring four hours of sustained elevation substantially reduces false positives. Only when all three conditions are satisfied does the system generate a blockage alert.
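The persistence check can be illustrated by grouping consecutive flagged readings into runs and keeping only the runs that last long enough. The sketch below is one possible approach, not necessarily the one used in the project. The function name find_alerts is hypothetical, and measuring duration from the first to the last flagged reading in a run is an assumption.

```python
import pandas as pd


def find_alerts(flag, min_duration=pd.Timedelta(hours=4)):
    """Return one row per sustained run of True values in `flag`.

    flag: boolean Series on a sorted DatetimeIndex, True where the
    level is abnormally high during dry weather.
    """
    # A new run starts wherever the flag differs from the previous reading.
    run_id = (flag != flag.shift()).cumsum()
    alerts = []
    for _, run in flag[flag].groupby(run_id[flag]):
        start, end = run.index[0], run.index[-1]
        duration = end - start  # assumed definition of run length
        if duration >= min_duration:
            alerts.append(
                {
                    "start": start,
                    "end": end,
                    "duration_h": duration / pd.Timedelta(hours=1),
                }
            )
    return pd.DataFrame(alerts, columns=["start", "end", "duration_h"])
```

Short spikes form runs that fall under min_duration and are dropped, which is how the persistence rule suppresses noise-driven alerts.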
Results and Validation
Visual validation confirmed that the system correctly identifies sustained abnormal behaviour. During normal operation, water levels fluctuate below the dynamic baseline. During detected events, levels rise above the baseline and remain elevated for extended periods, consistent with obstruction behaviour. Rainfall-driven increases were successfully excluded when six-hour accumulation exceeded the rainfall threshold.
A sensitivity analysis was also conducted by varying the persistence threshold from three to six hours. A three-hour threshold produced ten alerts across the test period, while a six-hour threshold produced none. The four-hour threshold returned two alerts, providing a balanced trade-off between responsiveness and false alarm reduction.
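A sweep of this kind can be sketched by counting how many runs of abnormal dry-weather readings survive each persistence threshold. The helper names count_alerts and persistence_sweep are hypothetical, and the sketch assumes the same run-length definition (first to last flagged reading) for every threshold. It does not reproduce the project's reported counts.

```python
import pandas as pd


def count_alerts(flag, hours):
    """Count runs of True in `flag` lasting at least `hours` hours."""
    if not flag.any():
        return 0
    # Label each block of identical consecutive values.
    run_id = (flag != flag.shift()).cumsum()
    # First and last timestamp of every flagged run.
    times = flag[flag].index.to_series()
    spans = times.groupby(run_id[flag]).agg(["min", "max"])
    durations = spans["max"] - spans["min"]
    return int((durations >= pd.Timedelta(hours=hours)).sum())


def persistence_sweep(flag, thresholds=(3, 4, 5, 6)):
    """Map each persistence threshold (hours) to its alert count."""
    return {h: count_alerts(flag, h) for h in thresholds}
```

Alert counts can only fall or stay level as the threshold rises, so the sweep makes the trade-off between responsiveness and false alarms directly visible.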
The final results contained two validated alerts: one at CSO E23608, which began on 28 September 2019 and lasted 5.25 hours, and one at CSO E26824, which began on 17 February 2020 and lasted 4.00 hours.
Strengths of the Framework
The framework offers several notable strengths. It is highly interpretable, providing clear reasoning behind each alert rather than operating as a black box. Thresholds are site-adaptive, removing the need to manually calibrate fixed limits across different locations. Rainfall-aware filtering ensures that storm events do not generate spurious alerts. The approach requires no labelled historical blockage data for training, making it immediately deployable in operational environments. It is also scalable across large sewer networks without significant additional complexity.
Limitations and Future Improvements
The framework relies on parameter choices, including persistence duration and rainfall thresholds, which may require calibration based on operational feedback over time. Additionally, the system detects ongoing blockages but does not predict future failures. Future development could incorporate predictive modelling techniques to anticipate blockages before they become critical, or integrate additional sensor types such as flow rate measurements to strengthen detection confidence.
Conclusion
This project developed a rainfall-aware hybrid rule-based detection framework capable of identifying potential sewer blockages across multiple CSO locations. By combining rolling statistical baselines, rainfall filtering, and persistence logic, the system generates structured and interpretable alerts suitable for operational deployment.
The framework satisfies the client requirement for an automated alerting system that is transparent, scalable, and robust against the confounding effects of rainfall, providing a strong analytical foundation for future enhancements in sewer network monitoring.
Contribution
I contributed to the feature engineering and rule-based detection components of the project, including the development of the rolling percentile baseline and the three-condition alert logic, and supported the sensitivity analysis comparing different persistence thresholds.
Acknowledgement
This work was developed collaboratively with my MSc Data Analytics team members Ebrima Khan, Muhammad Qureshi, Shivangi Sinha, and Jahnavi Potula. I would also like to thank our supervisor and the Jacobs client team for their guidance throughout the project.