Methodology

This section describes the algorithm used to detect 1D anomalies.

Anomalies are identified from the maxima, minima and inflection points of individual data channels, located using the first- and second-order derivatives of each profile. The algorithm relies on the numpy.fft routines to compute the derivatives in the Fourier domain.
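As an illustration of a Fourier-domain derivative, the sketch below differentiates an evenly sampled, periodic profile with numpy.fft. The function name `fourier_derivative` and its arguments are hypothetical and not part of the package; the actual implementation may handle profile ends, padding or resampling differently.

```python
import numpy as np

def fourier_derivative(values, spacing, order=1):
    """Return the `order`-th derivative of an evenly sampled, periodic profile."""
    # Angular wavenumbers, ordered like the output of np.fft.fft.
    wavenumbers = 2.0 * np.pi * np.fft.fftfreq(values.size, d=spacing)
    spectrum = np.fft.fft(values)
    # Differentiation in space is multiplication by (i k)^order in the Fourier domain.
    return np.real(np.fft.ifft((1j * wavenumbers) ** order * spectrum))

x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
profile = np.sin(x)
first = fourier_derivative(profile, x[1] - x[0], order=1)   # approximates cos(x)
second = fourier_derivative(profile, x[1] - x[0], order=2)  # approximates -sin(x)
```

Zero crossings of `first` then mark candidate maxima and minima, and zero crossings of `second` mark candidate inflection points.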

Detection parameters are available for filtering and grouping co-located anomalies. The selection process is done in the following order:

Primary detection

Loop over the selected data channels:

Apply the Minimum Data Value threshold.
For every maximum (peak) found on a profile, look on either side for inflection and minimum points. Together, the peak and its bounding points form an anomaly.
Keep all anomalies larger than the Minimum Amplitude.
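The peak search in the steps above can be sketched as follows. This is a minimal illustration, not the package's implementation: `sign_changes` and `find_anomalies` are hypothetical helpers, `np.gradient` stands in for the Fourier-domain derivatives to keep the example short, and a peak lacking a minimum or inflection point on either side is simply skipped (an assumption).

```python
import numpy as np

def sign_changes(values, rising):
    """Indices i where `values` crosses zero between samples i and i + 1."""
    if rising:
        return np.where((values[:-1] < 0) & (values[1:] >= 0))[0]
    return np.where((values[:-1] > 0) & (values[1:] <= 0))[0]

def find_anomalies(values):
    """Pair each peak with the nearest minimum and inflection point on both sides."""
    d1 = np.gradient(values)   # first derivative (finite differences)
    d2 = np.gradient(d1)       # second derivative
    peaks = sign_changes(d1, rising=False)    # slope turns from + to -
    troughs = sign_changes(d1, rising=True)   # slope turns from - to +
    inflections = np.sort(np.concatenate(
        [sign_changes(d2, rising=True), sign_changes(d2, rising=False)]
    ))

    anomalies = []
    for peak in peaks:
        sides = [
            troughs[troughs < peak], troughs[troughs > peak],
            inflections[inflections < peak], inflections[inflections > peak],
        ]
        if any(side.size == 0 for side in sides):
            continue  # assumption: drop peaks without a full set of bounding points
        anomalies.append({
            "peak": int(peak),
            "left_min": int(sides[0][-1]),
            "right_min": int(sides[1][0]),
            "left_inflection": int(sides[2][-1]),
            "right_inflection": int(sides[3][0]),
        })
    return anomalies

# Synthetic profile with three peaks; only the central one has minima on both sides.
x = np.linspace(-3.0 * np.pi, 3.0 * np.pi, 601)
anomalies = find_anomalies(np.cos(x))
```

The returned indices are sample positions along the profile, accurate to within one sample because each crossing is located between two neighbouring samples.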

Grouping

Anomalies found along individual data channels are grouped based on spatial proximity:

Find all peaks within the Maximum Peak Migration distance. The nearest peak is used if multiple peaks are found on a single channel.
Create an anomaly group that satisfies the following criteria:
- The data channels must form a continuous series (at most one skipped channel is allowed)
- The group must contain at least the Minimum number of channels
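The grouping criteria can be sketched as below. The function `group_channels` and its arguments are hypothetical, and the sketch reads "one channel skip" as a gap of at most one missing channel between consecutive members of the group; the package may interpret the rule differently.

```python
import numpy as np

def group_channels(peaks_by_channel, centre, max_migration, min_channels):
    """Return channel indices grouped around `centre`, or [] if no group qualifies."""
    matched = []
    for channel, positions in enumerate(peaks_by_channel):
        positions = np.asarray(positions, dtype=float)
        if positions.size == 0:
            continue
        # Use the nearest peak when a channel holds several candidates.
        nearest = positions[np.argmin(np.abs(positions - centre))]
        if abs(nearest - centre) <= max_migration:
            matched.append(channel)

    # Keep the longest run in which neighbouring channel indices differ by at most 2,
    # i.e. at most one channel is skipped between consecutive members (assumption).
    best, run = [], []
    for channel in matched:
        if run and channel - run[-1] > 2:
            run = []
        run.append(channel)
        if len(run) > len(best):
            best = list(run)
    return best if len(best) >= min_channels else []

# Peak locations (distance along line) for six channels; channel 2 has no peak.
peaks = [[10.0, 50.0], [11.0], [], [12.5], [40.0], [13.0]]
group = group_channels(peaks, centre=10.0, max_migration=5.0, min_channels=3)
```

Here channels 2 and 4 contribute no peak within the migration distance, so the group is formed from the remaining channels, each separated by at most one skipped channel.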

Minimum Amplitude

property PeakFinderParams.min_amplitude: int: Threshold on the minimum amplitude of the anomaly, expressed as a percentage of the anomaly height scaled by its minimum value.

Threshold value (\(\delta A\)) for filtering small anomalies based on the anomaly minimum (\(d_{min}\)) and maximum (\(d_{max}\)).

\[\delta A = \left|\frac{d_{max} - d_{min}}{d_{min}}\right| \cdot 100\]

See figure for a visual example of the anomaly amplitude.
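As a worked example of the amplitude formula, the sketch below computes \(\delta A\) for a few anomalies and keeps those above a threshold. The helper `amplitude_percent` and the sample values are hypothetical; note that the ratio is undefined when \(d_{min}\) is zero, a case this sketch does not handle.

```python
import numpy as np

def amplitude_percent(d_max, d_min):
    """Anomaly amplitude as a percentage of the anomaly minimum."""
    d_max = np.asarray(d_max, dtype=float)
    d_min = np.asarray(d_min, dtype=float)
    return np.abs((d_max - d_min) / d_min) * 100.0

d_max = np.array([15.0, 10.5, 2.0])
d_min = np.array([10.0, 10.0, -4.0])
amplitudes = amplitude_percent(d_max, d_min)  # roughly 50 %, 5 % and 150 %
keep = amplitudes > 25.0                      # illustrative threshold of 25 %
```

With a 25 % threshold, the second anomaly (a 0.5 rise over a minimum of 10) is filtered out while the other two are kept.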

Todo

Add figure showing the effect on anomaly identification

Minimum Data Value

property PeakFinderParams.min_value: float: Minimum absolute data value to be considered for anomaly detection.

The minimum data threshold (\(\delta_d\)) (see Figure) is applied to each data value \(d_i\) as:

\[d_i = \begin{cases} d_i & \text{for } d_i > \delta_d \\ \text{nan} & \text{for } d_i \leq \delta_d \end{cases}\]
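The thresholding rule above maps directly onto `np.where`. The function name `apply_min_value` is hypothetical and the sketch follows the equation as written, comparing the data values themselves against the threshold.

```python
import numpy as np

def apply_min_value(data, min_value):
    """Replace values at or below `min_value` with nan, leaving the rest unchanged."""
    data = np.asarray(data, dtype=float)
    return np.where(data > min_value, data, np.nan)

masked = apply_min_value([1.0, 5.0, 3.0, 8.0], min_value=3.0)
```

Values equal to the threshold are also set to nan, matching the \(d_i \leq \delta_d\) case.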

Todo

Add figure showing the effect on anomaly identification