
EDS Challenge: Overview and Some Preliminary Results

Preliminary Results – For Official Use Only

Katie Umberg, USEPA

Victoria Berry, CH2M HILL

Steve Allgeier, USEPA

AWWA Water Security Conference

Nashville, TN; September 13, 2011

1. Overview

2. Some Preliminary Results

3. Path Forward


Overview


Event Detection Systems (EDSs)

• Water quality monitoring event detection system (EDS): software that monitors water quality data in real time and produces an alarm if water quality is abnormal. An EDS allows the utility to efficiently monitor the large quantity of data that can be produced by an online water quality monitoring system.

– May use supporting data such as sensor alarms and operations data

– Can be implemented at the actual sensor site or at a central location

• Goal: maximize detection of abnormal water quality events while minimizing false alarms
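
To make the definition above concrete, here is a minimal sketch of a detector that emits a normal / abnormal flag for each timestep. It uses a simple rolling z-score, chosen purely for illustration; it is not the algorithm of any tool in the Challenge, and the function name, `window`, and `threshold` are assumptions.

```python
from collections import deque
from statistics import mean, stdev


def detect_abnormal(values, window=12, threshold=3.0):
    """Flag each reading as abnormal (True) or normal (False).

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    Illustrative only: real EDS tools use more sophisticated methods.
    """
    history = deque(maxlen=window)
    flags = []
    for value in values:
        if len(history) < window:
            flags.append(False)  # not enough history to judge yet
        else:
            mu = mean(history)
            sigma = stdev(history)
            flags.append(sigma > 0 and abs(value - mu) > threshold * sigma)
        history.append(value)
    return flags


# Hypothetical chlorine readings (mg/L) ending in a sudden drop.
readings = [1.0, 1.02, 0.98, 1.01, 0.99, 1.0] * 2 + [0.3]
flags = detect_abnormal(readings)
```

Lowering `threshold` raises detection and false alarms together, which is the trade-off named in the goal above.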


EDS Challenge

• Goals:

– To provide an objective demonstration of available EDSs' performance, measuring both true and false alarms.

– To challenge EDS developers to incorporate innovative approaches for analyzing complex water quality data.

• Factors such as cost, ease of use, and support were not considered. Nor was the level of effort (LOE) for installation and configuration, since each developer trained their own tool.


Testing Data

• One year of data was obtained for a total of 6 monitoring stations from 4 US water utilities.

– 3 months were provided to participants for training; 9 months were used for evaluation.

– Data from sites with variable / complex water quality was requested.

– Corresponding operational data was also provided, where available.

• 2, 5, 10, and 20 minute polling intervals were used. The longer ones are not ideal, but some utilities could not provide data at a shorter interval.

Testing Data

• For each station, each trained EDS was challenged with:

– Baseline utility data to primarily measure false alarms

– 96 simulated contamination events per monitoring station to calculate detection ability

>For each station, an event was created with every combination of 6 contaminants, 2 contaminant concentrations, 4 event start times, and 2 event profiles

>The events were designed to be varied and realistic
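
The count of 96 events per station follows from the full combination 6 × 2 × 4 × 2. As a quick check, the sketch below enumerates those combinations; the labels are placeholders, since the slides do not name the contaminants, concentrations, start times, or profiles.

```python
from itertools import product

# Placeholder labels: the actual values are not given in the slides.
contaminants = [f"contaminant_{i}" for i in range(1, 7)]   # 6 contaminants
concentrations = ["concentration_1", "concentration_2"]    # 2 concentrations
start_times = [f"start_{i}" for i in range(1, 5)]          # 4 event start times
profiles = ["profile_1", "profile_2"]                      # 2 event profiles

# One simulated event per combination of the four characteristics.
events = list(product(contaminants, concentrations, start_times, profiles))
print(len(events))  # 6 * 2 * 4 * 2 = 96
```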


Simulated Events

• Water quality changes consistent with contamination scenarios were superimposed on the baseline data

– Empirically modeled from contaminant reaction studies

[Figure: chlorine (mg/L, 0 to 1.2) versus time of day (4:00 to 16:00). Series: "Baseline Chlorine Data" and "Chlorine Data with Event Superimposed".]
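
The superimposition described above can be sketched as follows. This is a minimal illustration that assumes the event is expressed as additive offsets to the baseline readings; the slides say only that the changes were empirically modeled from contaminant reaction studies, so the additive form, the function name, and the sample values are all assumptions.

```python
def superimpose_event(baseline, start, profile):
    """Return a copy of `baseline` with `profile` offsets added from index `start`.

    `profile` holds one offset per timestep of the event (negative values
    lower the reading). Results are clipped at zero, since a concentration
    cannot be negative.
    """
    modified = list(baseline)
    for offset, delta in enumerate(profile):
        index = start + offset
        if index >= len(modified):
            break  # event runs past the end of the data
        modified[index] = max(0.0, modified[index] + delta)
    return modified


# Hypothetical baseline chlorine readings (mg/L) and a hypothetical drop.
baseline_chlorine = [1.0, 1.0, 0.98, 1.01, 1.0, 0.99, 1.0, 1.0]
chlorine_drop = [-0.2, -0.5, -0.5, -0.2]
with_event = superimpose_event(baseline_chlorine, start=2, profile=chlorine_drop)
```

Because the baseline is copied rather than modified, the same baseline can be reused for every simulated event.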

Baseline Events


• All testing data was methodically analyzed to identify periods where the baseline data was abnormal. For analysis, alarms during these periods were considered true detections, not false alarms.

– 16 total “baseline events” were identified in the testing data

[Figure: TOC and turbidity time series, 11/10 to 11/15.]

Bad Quality Alarms

• The testing data was also processed to identify periods of bad quality data.

• Alarms during these periods were considered false alarms, but were listed separately because a utility could easily identify their cause.


[Figure: TOC time series, 11/10 1:40 to 11/13 9:40.]
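
The alarm categories described in the Baseline Events and Bad Quality Alarms slides can be summarized in a short sketch. It assumes each period is an inclusive (start, end) pair of timestep indices and that event periods take precedence over bad quality periods when they overlap; the function name and the precedence rule are assumptions, not the Challenge's documented scoring procedure.

```python
def classify_alarm(timestep, simulated_events, baseline_events, bad_quality_periods):
    """Assign an alarm at `timestep` to one of three categories.

    Each period list holds inclusive (start, end) timestep pairs.
    """
    def within(periods):
        return any(start <= timestep <= end for start, end in periods)

    # Alarms during simulated or baseline events count as true detections.
    if within(simulated_events) or within(baseline_events):
        return "true detection"
    # Alarms during bad quality data are false alarms, but listed separately.
    if within(bad_quality_periods):
        return "bad quality alarm"
    return "false alarm"


# Hypothetical periods, in timestep indices.
simulated = [(10, 20)]
baseline = [(40, 45)]
bad_quality = [(60, 62)]
categories = [classify_alarm(t, simulated, baseline, bad_quality) for t in (15, 42, 61, 80)]
```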


EDS Challenge Participants

• Open to any team with software capable of analyzing time series data and producing normal / abnormal output for each timestep

• Participating EDS developers had to submit their “trained” software tool to EPA for testing

– Originally 16 teams registered

– 8 withdrew due to limited resources and / or unwillingness to adhere to requirements

– 3 withdrew due to poor performance

EDS Challenge Participants

• 5 EDS tools participated (in alphabetical order by EDS name):

– BlueBox, WhiteWater Security*

– CANARY, Sandia National Labs / USEPA

– Event Monitor, the Hach Company*

– Moni::tool, S::CAN

– OptiEDS, OptiWater (Elad Salomons)

* Due to issues with running the software in off-line mode, BlueBox was only run on the 3 stations with longer polling intervals

* Hach chose to analyze only the 3 sites with 2-minute polling intervals


Preliminary Results


Challenge Results

• This presentation gives a “snapshot” of the results obtained in the EDS Challenge. Formal, more in-depth results will be published in a report and a journal article.

• As these results are preliminary, EDS names will not be used in this presentation of results.


Performance Notes

• This truly was a Challenge, and utilities would likely experience better performance in their own implementations.

– Stations with complex WQ were intentionally chosen to challenge the tools (these are “worst case” stations).

– Some of the simulated events were intentionally hard to detect.

• Participants had to significantly modify their tools to run in off-line mode, as required by the Challenge.

– BlueBox, Event Monitor, and Moni::tool use real-time user feedback to determine future alarming.

• Also, all participants have updated and enhanced their software since these results were generated.


Alarms Summary, Station A

• Polling interval: 5 minutes

• At a distribution system point of entry. The water source and quality differ entirely depending on the status of three co-located pumps. The on/off status of each of these pumps was provided in the data.


Alarms Summary, Station B


• At the connection where a ground storage tank is filled, which provides water to the utility’s large customers. Flow through the station is intermittent based on demand, and there are clear daily patterns.


Alarms Summary, Station D


• At a large reservoir. WQ is affected by operations at a co-located pump station and by two upstream pump stations. A variety of operations data was provided, but there is no exact correlation between these data and the station’s WQ changes.


Alarms Summary, Station E


• At a reservoir. Water sources to the station include two different mains and the reservoir. Many operational data tags were provided, including WQ from the main, reservoir level and flow, and pump status for 3 non-co-located pumps.


Alarms Summary, Station F


• Located at a large elevated tank. Tank levels and co-located pump statuses were provided.


Alarms Summary, Station G


• At a major pumping station, connected to a bi-directional line that runs between the reservoir and pump station. Pumping operations cause “blips” in the data that look like sensor errors. Reservoir flow and pump statuses were provided in the data.


Path Forward


Additional Analyses

• Further analysis to be done by EPA, including:

– Performance analysis using multiple alarm threshold settings

>Production of ROC curves

– Evaluation of how setpoint alarms would perform

– Consideration of additional metrics such as the accuracy of outputted trigger parameters, detection time, alarm length, output variability…

– Analysis of the impact of each event characteristic on detection: breakdown of results by monitoring location, contaminant, contaminant concentration, start time, and event profile
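
As a rough illustration of how multiple alarm threshold settings lead to an ROC curve, the sketch below computes one (false positive rate, true positive rate) point per threshold. It assumes each tool outputs a numeric score per timestep and that each timestep is labeled abnormal (1) or normal (0); scoring per timestep rather than per event is a simplification for illustration, and the function name and sample values are hypothetical.

```python
def roc_points(scores, labels, thresholds):
    """Return one (false_positive_rate, true_positive_rate) pair per threshold.

    An alarm is raised wherever score >= threshold. Assumes `labels`
    contains at least one abnormal (1) and one normal (0) timestep.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in thresholds:
        alarms = [score >= threshold for score in scores]
        true_pos = sum(1 for alarm, label in zip(alarms, labels) if alarm and label)
        false_pos = sum(1 for alarm, label in zip(alarms, labels) if alarm and not label)
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Hypothetical scores and labels for four timesteps.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
curve = roc_points(scores, labels, thresholds=[0.0, 0.5, 1.0])
```

Plotting the pairs in `curve` with false positive rate on the horizontal axis gives the ROC curve; each point corresponds to one alarm threshold setting.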

Acknowledgements

• The patience and cooperation of the Challenge Participants: the Hach Company, OptiWater (Elad Salomons), Sandia National Labs, S::CAN, and WhiteWater Security

• Support from Erin Cummings, Zheng Jie, Reese Johnson, Raja Kadiyala, and Adam Pollak, CH2M HILL


Questions

[email protected]

