RUSSIAN JOURNAL OF EARTH SCIENCES VOL. 10, ES1001, doi:10.2205/2007ES000278, 2008
Recognition of anomalies from time series by fuzzy logic methods
A. D. Gvishiani, S. M. Agayan, Sh. R. Bogoutdinov, and E. M. Graeva
Geophysical Center, Russian Academy of Sciences, Moscow, Russia
J. Zlotnicki
Observatoire de Physique du Globe, Clermont-Ferrand, France
J. Bonnin
L'Institut de Physique du Globe de Strasbourg, Strasbourg, France
Abstract
[1] This paper is devoted to the detection of anomalies by the fuzzy comparison algorithm for recognition of signals (FCARS). The algorithm results from soft modeling (based on fuzzy mathematics) of an interpreter's logic and continues, in this direction, the difference recognition algorithm for signals (DRAS) and the fuzzy logic algorithm for recognition of signals (FLARS) previously developed by the authors. A characteristic feature of FCARS is a more comprehensive use of the so-called fuzzy comparisons introduced by the authors. This makes FCARS more versatile and adaptive than DRAS and FLARS.
1. Introduction
[2] Generally accepted algorithms for identifying anomalies in signal records are mostly based on statistical and frequency-time analyses. Current approaches to this problem involve the use of artificial intelligence, and this direction of research is the subject of the present paper, which addresses the fuzzy comparison algorithm for recognition of signals (FCARS). The algorithm results from soft modeling (based on fuzzy logic) of the logic of an interpreter attempting to detect anomalies in signal records. We use the formulation of such a logic proposed by Neimark [1966] for its "probabilistic" modeling.
2. Detection of Anomalies
[3] Using the monograph [Kedrov, 2005], we review here existing systems of anomaly detection (applied mainly in seismology). The goal of this overview, which is far from exhaustive, is to compare FCARS with the difference recognition algorithm for signals (DRAS) and with the fuzzy logic algorithm for recognition of signals (FLARS) [Gvishiani et al., 2003, 2004], which implement the difference-from-moving average (DMA) approach.
[4] According to [Kedrov, 2005], the complete cycle of detecting anomalies in signal records is divided into three stages: predetection, discovery (detection), and processing of the anomaly discovered.
Algorithms of anomaly detection are mostly based on a combination of the statistical approach and the spectral-time analysis (STA). The latter is a method of statistical analysis designed for the study of frequency characteristics of a stationary random process with discrete time or a time series. The STA is based on a combination of diverse spectral, asymptotic, and functional techniques that is often strongly constrained by the physical essence of the events studied and, for this reason, is fairly illustrative [Prokhorov, 1999]. We briefly characterize some of these systems.
2.1. System SESMO1
[5] SESMO1 [Kedrov, 2005] is intended for real-time detection of short-period seismic anomalies in the time and frequency domains. The algorithm uses eight Butterworth filters that cover, with overlap, the entire frequency response band of an anomaly. This algorithm uses three-component polarization analysis.
2.2. Autoregressive moving average (ARMA) models
[6] [Kedrov et al., 2000]. This type of anomaly detection algorithm is based on the use of adaptive and matching filters. The algorithm of such a detector is constructed in terms of an autoregressive description (ARMA models) of seismic analyses and noise. The related filtering consists in a continuous analysis of noise. Based on this analysis, the type of data that should be recorded at the next moment is predicted. If the prediction based on the noise accumulated up to the current moment fails, it is inferred that a sought anomaly has been recorded.
2.3. Maximum likelihood
[7] [Kushnir and Mostovoi, 1990]. Methods of anomaly detection based on maximum likelihood filters are difficult to use for real-time detection of anomalies. In this case, the detection procedure involves real-time continuous estimation of spectral properties of noise under the condition that parameters of an anomaly are known either completely or partially.
2.4. Recognition with training
[8] [Haries and Joswig, 1985].
For detecting local or regional low-amplitude anomalies in a region of interest, known anomalies are STA-analyzed to construct a set of typical patterns. A new anomaly in the given region is detected by comparing it with the available patterns. This comparison is based on a coherence value specified for several levels of the signal intensity.
2.5. Neural networks
[9] [Romeo, 1994]. This approach consists in modeling a researcher's capabilities by means of a specially constructed neural network. The network is a multilayer perceptron. Its input parameters are absolute values of seismic wave spectral amplitudes in nine frequency bands. At the output, the discovered anomalies are classified as local, regional, or teleseismic anomalies or as noise of two types.
3. Detection of Anomalies by Fuzzy Logic Methods
[10] The DMA-based algorithms DRAS and FLARS [Gvishiani et al., 2003, 2004] are an alternative approach (with respect to the methods presented in section 2) to modeling human reasoning and actions in the search for anomalies. DRAS and FLARS attempt to model the logic of a researcher who recognizes an anomaly by visual inspection of a record, so that this model can be applied automatically to large data sets that cannot be processed manually. These algorithms yield estimates for the boundaries of sought anomalies and subdivide them morphologically into initial, central, and final stages, identifying strong and weak phases in the central stage [Gvishiani et al., 2003]. The algorithms are rather versatile due to a wide set of "rectifications" [Gvishiani et al., 2003, 2004] arising in modeling the interpreter's work.
[11] In a simplified form, the work of an interpreter detecting an anomaly by visual inspection of a record is understood here as follows. The interpreter first looks over the record, estimating the activity of its fragments in terms of positive numbers, and mentally assigns the inferred numerical estimates to the fragments or their centers.
Thus, the interpreter passes from the initial record to a nonnegative function that can be naturally called a "rectification" of the record. Larger values of this function (rectification) correspond to record points that are more active from the standpoint of the sought signals. Further, the interpreter searches for rises in the record rectification that correspond to the most active record fragments. Thus, the interpreter works at two levels, local (rectification of a record) and global (search for rises in the rectification).
[12] Naturally, the proposed simplified model of the interpreter's logic cannot be regarded as unique and/or universal. Moreover, the interpreter's reasoning is largely determined by the concrete type of anomalies (data) in question. However, in our opinion, the rectification process operates, one way or another, in any case.
3.1. Local level: Construction of record rectification.
[13] An anomaly in a record (time series) is an ambiguous notion that changes its form both from one record to another and within one record. As with other intuitively clear mathematical notions (e.g., an element of a set), we do not attempt to give its strict definition. Anomalous nature is clear from examples given by experts. In terms of the DMA approach, a set of rectifications open for updating is applied for adequate modeling of "anomalies" (higher activity zones). Now we pass to exact constructions.
[14] Let a discrete positive semiaxis be
Definition 1.
[15] If J = Dk y is a set of local survey fragments of the record y and if
[16] A rectification determination can be regarded as successful if anomalies identified by an interpreter are mapped onto rises in the rectification. Accordingly, the presence of training data (i.e. results obtained by an interpreter from processing of a sufficiently long record fragment) is beneficial to the construction of a rectification. Examples of rectifications: (1) survey fragment length,
(2) survey fragment energy,
[17] Many other types of rectification were used in [Gvishiani et al., 2003, 2004; Zlotnicki et al., 2005]. At a local level common to algorithms of the DRAS and FLARS families, the rectification Fy is constructed and specified for a record y. This transformation of a record is the first stage of visual analysis performed by an interpreter.
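The local level can be illustrated with a minimal Python sketch. The formulas for the two example rectifications are not reproduced in this text, so the forms used below are assumptions: fragment "length" is taken as the sum of absolute increments inside the local survey window, and fragment "energy" as the sum of squared deviations from the window mean. The function names and the symmetric window of half-width `delta` are hypothetical.

```python
def local_fragment(y, k, delta):
    """Return the local survey fragment of y around index k (clipped at the record ends)."""
    lo = max(0, k - delta)
    hi = min(len(y), k + delta + 1)
    return y[lo:hi]


def rectify_length(y, delta):
    """Assumed 'length' rectification: sum of absolute increments inside each window."""
    out = []
    for k in range(len(y)):
        w = local_fragment(y, k, delta)
        out.append(sum(abs(b - a) for a, b in zip(w, w[1:])))
    return out


def rectify_energy(y, delta):
    """Assumed 'energy' rectification: sum of squared deviations from the window mean."""
    out = []
    for k in range(len(y)):
        w = local_fragment(y, k, delta)
        mean = sum(w) / len(w)
        out.append(sum((v - mean) ** 2 for v in w))
    return out


# A quiet record with a single spike: both rectifications are nonnegative
# and rise only in the neighbourhood of the spike.
record = [0.0] * 20
record[10] = 5.0
length_rect = rectify_length(record, delta=2)
energy_rect = rectify_energy(record, delta=2)
```

Either function maps the record to a nonnegative function on the same recording interval, which is the only property the global level relies on.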
3.2. Global level: Search for rises in a rectification.
[18] Examples show that a rectification topography can be rather complex (Figure 1). The activity of anomalies is not invariably high, and anomalies can be inhomogeneous (there are several activity intervals divided by "quiet" points). The corresponding rectification intervals are oscillating rises. It is natural to seek a "platform", i.e. a connected base of such a rise, and to detect the sought "spikes" against this base. The procedure required for the determination of rises in the curve Fy(k) does not reduce to simple selection of points in accordance with their heights. This procedure should combine a union process (search for platforms) and a subdivision process (extraction of spikes within platforms). The algorithmic implementation of this logic at the global level divides algorithms into the DRAS and FLARS families and enables differentiation between concrete implementations within each family.
3.3. DRAS: The global level
[19] [Gvishiani et al., 2003]. A record is first divided into background (quiet) and potentially anomalous (disturbed) parts. Connected regions in the disturbed part serve as bases (platforms) of rises. Further, DRAS identifies undoubtedly anomalous fragments on the platforms.
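The step from a disturbed part to platforms can be sketched as follows. This is only an illustration of extracting connected regions: the boolean mask `disturbed` is a hypothetical input standing for whatever background/disturbed decision has already been made, and the function name is not taken from the paper.

```python
def connected_intervals(mask):
    """Return inclusive (start, end) index pairs for maximal runs of True values in mask."""
    intervals = []
    start = None
    for k, flag in enumerate(mask):
        if flag and start is None:
            start = k                      # a new connected region begins
        elif not flag and start is not None:
            intervals.append((start, k - 1))  # the region ended at the previous point
            start = None
    if start is not None:                  # a region that runs to the end of the record
        intervals.append((start, len(mask) - 1))
    return intervals


# Hypothetical disturbed/background labelling of a nine-point record.
disturbed = [False, True, True, False, False, True, True, True, False]
platforms = connected_intervals(disturbed)
```

Each returned interval plays the role of a platform on which the undoubtedly anomalous fragments would then be sought.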
[20] To implement this procedure, the algorithm uses one-sided measures LaFy(k) and RaFy(k) that quantify, on the [0, 1] scale, the quietness of the rectification Fy left and right of the point kh, detecting points whose ordinates exceed a level a [Gvishiani et al., 2003]. The latter is a free parameter of the algorithm called the vertical level of background. In other words, the quietness to the left (right) of the point kh in the record y is modeled in DRAS as a fuzzy subset on the recording interval T with the measures LaFy(k) (RaFy(k)). Using the conjunction min(LaFy(k), RaFy(k)) provides for the possibility of versatile treatment of Fy excesses over the level a. With the so-called horizontal level b
[21] The set
P is the union of the connected components
[22] Free parameters of DRAS are the rectifying functional F and the following positive values: the local survey window D
3.4. FLARS: The global level
[23] [Gvishiani et al., 2004]. As distinct from DRAS, the FLARS algorithm first identifies significantly anomalous intervals and then supplements the set of these intervals with potentially anomalous intervals, thereby forming an "aureole" of an identified anomaly. Thus, FLARS divides, in two stages, the recording interval into three subsets (T = A
[24] We remind the reader that the DRAS choice of extreme points is based on analysis of the vertical level a immediately in the rectification Fy. FLARS forms the set of anomalous points A indirectly, using the search for extreme values on the Fy topography with the help of a fuzzy extremality measure m(k) taking values from the interval [-1, 1]. Like DRAS, the FLARS algorithm subdivides the set of nonanomalous points into background and potentially anomalous components with the help of the alternating one-sided measures
Note that, due to the normalization m(k)
[25] Free parameters of FLARS are the rectifying functional F and the following positive values: the local survey window D
4. Fuzzy Comparisons
[26] In many cases, the usual difference measure of the excess of one number over another is overly rough. In particular, DMA algorithms require finer constructions for the comparison of numbers.
Definition 2.
[27] Fuzzy comparison n(a, b) of real numbers a and b quantifies the degree of excess of b over a on a [-1, 1] scale:
[28] Thus, fuzzy comparison can be realized in terms of any function
f(a, b),
[29] Actually, such functions will possess properties that are naturally required for comparison of numbers.
[30] If n(a, b) is a fuzzy comparison and y is a monotonically increasing mapping of the segment [-1, 1] into itself, the superposition ( y
[31] In algorithms of the DRAS and FLARS families, it is sufficient to use fuzzy comparisons defined on positive numbers. Indeed, records are processed by these algorithms through their rectifications, which take solely positive values. We introduce the following family of basic fuzzy comparisons nn(a, b), n > 0 and their variations of a specific type ng,n(a, b).
Definition 3.
[32] If
(ii) we set
ngn(a, b) = yg(nn(a, b)) for any
g
[33] This variation is correct: n0, n(a, b) = y0(nn(a, b)) = nn(a, b), so that nn becomes larger at g > 0 and smaller at g< 0. In what follows, the comparison n(a, b) means a value of ng, n(a, b), n > 0, -1 < g< 1.
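The general notion of a fuzzy comparison on positive numbers, and its variation by a monotone mapping of [-1, 1] into itself, can be illustrated with a short Python sketch. The defining formulas of the basic family and of its variations are not reproduced in this text, so the relative-difference function below is a hypothetical stand-in chosen only because it is defined on positive numbers, takes values in [-1, 1], vanishes at a = b, and grows with b. It is not claimed to be the authors' comparison, and the cubic mapping is likewise only an example of a monotonically increasing map.

```python
def fuzzy_compare(a, b):
    """Illustrative comparison of positive a, b: degree of excess of b over a in [-1, 1]."""
    if a <= 0 or b <= 0:
        raise ValueError("this sketch is defined for positive numbers only")
    return (b - a) / max(a, b)


def vary(comparison, psi):
    """Compose a comparison with a monotonically increasing map psi of [-1, 1] into itself."""
    def varied(a, b):
        return psi(comparison(a, b))
    return varied


# Hypothetical variation: cubing keeps the sign and the range but damps moderate excesses.
damped_compare = vary(fuzzy_compare, lambda t: t ** 3)

excess = fuzzy_compare(1.0, 2.0)          # b exceeds a
deficit = fuzzy_compare(2.0, 1.0)         # b is below a
damped_excess = damped_compare(1.0, 2.0)  # same sign, smaller magnitude
```

Because the varying map is increasing, the varied comparison orders pairs in the same way as the original and only reshapes how strongly an excess is expressed.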
[34] We need to extend
n(a, b) to the concept of fuzzy comparisons
n(a, A) and
n(A, a) of an arbitrary number
a
[35] Binary extension:
[36] Gravitational extension: Let gr A be the center of gravity of the set A, i.e.
then
[37] σ-extension: The left moment
is an argument in favor of the maximality of a compared to A modulus. Accordingly, the right moment is an argument in favor of the minimality of a compared to A modulus. Then,
[38] It is natural to assume that, if the validity of a certain property is expressed on the [-1, 1] scale, then a value from [0.5, 1] ([0, 0.5]) means a strongly (weakly) extremal manifestation of this property. Following these lines, we formalize the notions "large" and "small" with respect to the weighted set A (modulus of A).
Definition 4.
[39] Based on a given fuzzy comparison n (for a given weighted set A) and its extensions n(A, a) and n(a, A), a number a is (I) strongly (weakly) large if
[40] Example. The extremality measure m(k) in FLARS (the FLARS measure) is obtained as a result of comparison (2) of the rectification value Fy(k) with the weighted set
where
[42] Example. The local survey parameter D in DRAS and FLARS quantifies the closeness of the record y in the recording interval T. Using fuzzy comparisons, the choice of this parameter can also be made automatic as follows. Let
5. FCARS: Global Level
[43] Like DRAS and FLARS, FCARS (Fuzzy Comparison Algorithm for Recognition of Signals) uses at the local level the procedure described in section 3.1 and providing FTS rectification. At the global level, the FCARS search for oscillating rises in a rectification can be described as follows. Significant vertical spikes are first detected in the rectification. Their fairly dense clusters and adjacent features are of interest. Points lying inside such clusters are considered anomalous regardless of the values taken by the rectifying function at these points. Small rectification values at these points can imply only a short-term weakening of a signal due to its inhomogeneity. Such points form the central parts of rises in accordance with clusters of the aforementioned vertical spikes.
[44] Points lying to the left and to the right of the dense clusters can be of two types: they are either quiet, background points near a given cluster or disturbed points of the record that are not necessarily extremal in the rectification and form the initial and final stages of the signal.
[45] Thus, we can draw the following conclusion: anomalies in a record y correspond to oscillating rises of the rectification Fy. Bases of the anomalies are connected sets in the initial recording interval that consist of points extremally horizontally close to vertically extremal points of the rectification.
[46] Precisely this definition of a rise serves as a basis for the FCARS global level. FCARS modeling is based on fuzzy comparisons and monolithicity [Bogoutdinov, 2006]. Fuzzy comparisons are instrumental to a correct formulation of the notion of vertically extremal spikes in a rectification. The degree of extremal horizontal proximity to the spikes is described in terms of proximity measures.
Further, likewise with the help of fuzzy comparisons, the shell of the rise (anomaly) base is formed by filling gaps in dense clusters of vertically anomalous spikes with horizontally extremal points.
[47] The properties of proximity measures are used for locating the central part of the rise in this shell. The rise foot is finally extracted after the localization of the side parts of the rise using fuzzy logic and fuzzy comparisons. Below we present an exact description of FCARS.
5.1. FCARS: Vertical subdivision (the first variant).
[48] In this case, the vertical measure of anomalousness m v(k) where n(A, a) is defined by formulas (2)-(4) (i.e. n is the binary, gravitational, or σ-extension of the fuzzy comparison ng, n(a, b)).
[49] Let as (aw) be the strong (weak) level of extremality with respect to the modulus of Im Fy, that is, as (aw) is the solution of the equation
Definition 5
[50] (a) The point k is of the vertically background type if m v(k) < 0 (equivalently, Fy(k) < aw).
[51] (b) The point k is vertically anomalous if m v(k)
[52] (c) The point k is vertically potentially-anomalous if m v(k)
Let v B, v A, and v P denote the respective sets of vertically background, vertically anomalous, and vertically potentially-anomalous points. Then the recording period under consideration can be represented as T = v B + v A + v P.
5.2. FCARS: Vertical subdivision (the second variant).
[53] In this case, the vertical measure of anomalousness m v(k) at the point k is the FLARS measure, defined by (5), and where and
5.3. FCARS: Horizontal subdivision.
[54] We introduce the left and right measures of proximity to the vertically anomalous subset v A in the model of local survey
where
[55] Note. Measures (6) are connected with the DRAS standard background measures Las and Ras [Gvishiani et al., 2003] via fuzzy negation:
[56] The fuzzy disjunction
Definition 6.
[57] (A) The point k is of the horizontally background type if mh(k) < 0.
[58] (B) The point k is horizontally anomalous if mh(k)
[59] (C) The point k is horizontally potentially-anomalous if mh(k)
[60] Let hB, hA, and hP denote the respective sets of horizontally background, horizontally anomalous, and horizontally potentially-anomalous points. This triad provides a horizontal subdivision of the recording interval
[61] The set hB being considered as background, its complement hA
[62] Thus, we have Pm
[63] Statement 7. If Pm = [b, e]
[64] Proof. Let k*
[65] Let bA (eA) be the first (last) point in the intersection Pn
5.4. FCARS: anomaly boundaries.
[66] As regards verticality, two types of points are present in the interval [b, bA]: background points with m v(k) < 0 and nonbackground points with m v(k)
[67] Now we formalize this logic on the basis of fuzzy comparisons. Let
Figure 3 presents results obtained by FCARS processing of a seismic record.
[69] Free parameters of FCARS are the rectifying functional F, local survey window 0 < D
5.5. Comparative analysis of FCARS with DRAS and FLARS.
[70] 1. Constructively, proximity measures in FCARS coincide with background measures in DRAS.
[71] 2. The set of anomalies
hA
[72] 3. As distinct from DRAS and FLARS, the choice of the free parameters a and b in FCARS is fully automated.
[73] 4. FCARS differs basically from DRAS and FLARS by the block determining anomaly boundaries (subsection 5.4): it uses the vertical subdivision T = v B + v A + v P (Definition 5).
References
Bogoutdinov, Sh. R. (2006), Application of fuzzy logic methods (the Monolith algorithm) to interpretation of geomagnetic data, in: Theoretical and Applied Problems in Geological Interpretation of Gravitational, Magnetic, and Electric Fields (in Russian), p. 41, 33rd Session of the Uspenskii International Workshop, Yekaterinburg, Russia.
Gvishiani, A. D., S. Agayan, Sh. Bogoutdinov, A. Ledenev, J. Zlotnicki, and J. Bonnin (2003), Mathematical methods of geoinformatics: II. Fuzzy logic algorithms for the detection of anomalies in time series, Cybernetics and Systems Analysis (in Russian), 4, 103.
Gvishiani, A. D., S. Agayan, Sh. Bogoutdinov, S. Tikhotsky, J. Hinderer, J. Bonnin, and M. Diament (2004), Algorithm FLARS and recognition of time series anomalies, System Research and Information Technologies, 3, 7.
Haries, H. P., and M. Joswig (1985), Signal detection by pattern recognition methods, in: A Twenty-Five Years Review of Basic Research, edited by A. U. Kerr and D. L. Carlson, p. 579, Publisher, USA.
Kedrov, O. K. (2005), Seismic Methods of Monitoring Nuclear Tests (in Russian), 412 pp., OIFZ RAN, Moscow.
Kedrov, O. K., V. E. Permyakova, and G. M. Steblov (2000), Methods of Detecting Weak Seismic Phenomena within Platforms (in Russian), 101 pp., OIFZ RAN, Moscow.
Kushnir, A. F., and S. V. Mostovoi (1990), Statistical Analysis of Geophysical Fields (in Russian), 270 pp., Naukova Dumka, Kiev.
Neimark, B. M. (1966), Algorithm for detecting the seismic signal on the microseism background, Computational Seismology (in Russian), 1, 5.
Prokhorov, Yu. V., and B. Ross, Eds. (1999), Probability and Mathematical Statistics, Encyclopedia (in Russian), 910 pp., Entsiklopedia, Moscow.
Romeo, G. (1994), Seismic signal detection and classification using artificial neural networks, in: Ann. Geofis. XXXVII (Special Issue on the Workshop "Planning and Procedures for GSETT-3", Erice, November 10-14, 1993), Annali di Geofisica, vol. XXXVII, no. 3, p. 343, Springer, Netherlands.
Zlotnicki, J., J.-L. LeMouel, A. Gvishiani, S. Agayan, V. Mikhailov, and Sh. Bogoutdinov (2005), Automatic fuzzy-logic recognition of anomalous activity on long geophysical records. Application to electric signals associated with the volcanic activity of la Fournaise volcano (Reunion Island), Earth Planet. Sci. Lett., 234, 261, doi:10.1016/j.epsl.2005.01.040.
Received 20 December 2007; accepted 28 December 2007; published 24 January 2008.
Keywords: fuzzy logic, anomaly, rectification, recognition.
Index Terms: 8419 Volcanology: Volcano monitoring; 8494 Volcanology: Instruments and techniques; 9805 General or Miscellaneous: Instruments useful in three or more fields.
Citation: Gvishiani, A. D., S. M. Agayan, Sh. R. Bogoutdinov, E. M. Graeva, J. Zlotnicki, and J. Bonnin (2008), Recognition of anomalies from time series by fuzzy logic methods, Russ. J. Earth Sci., 10, ES1001, doi:10.2205/2007ES000278.
Copyright 2008 by the Russian Journal of Earth Sciences.