RUSSIAN JOURNAL OF EARTH SCIENCES VOL. 10, ES4001, doi:10.2205/2007ES000218, 2008

4. Method Description

4.1. Cluster Analysis

[39]  Cluster analysis is a method of multi-dimensional statistical classification. It selects compact groups of measurements (stable combinations of parameters in multi-dimensional space), outlines the geometry of these groups in order to assess the distances between their centers, and establishes the boundaries that divide the space according to membership in one group or another. As a result of the analysis, the original set of points in multi-dimensional space (whose dimension equals the number of parameters used for the classification, 10 in our case) is divided into clusters, i.e., groups of similar objects. Here an object is an elementary 1° × 1° lithospheric cell to which values of the 10 parameters have been assigned. A cluster is usually defined as a group of objects (here, an area of the lithosphere) of high density, i.e., a compact concentration of the parameter values for that area. The density of objects, or the similarity of their properties, is assumed to be higher within the cluster than outside it. A cluster may therefore be characterized by its center, its variance (effective radius) within a hyperspherical outline, and its separation from other clusters. This definition is far from absolute.

[40]  Nevertheless, this definition clearly states the properties of a cluster and the tasks of the analysis, and in that sense it is comprehensive.
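
The three descriptors named in this definition (center, effective radius, separation) can be illustrated with a minimal sketch. The function names and the particular formulas are assumptions made for illustration only: the effective radius is taken as the root-mean-square distance of the objects from the center, and the separation as the center-to-center distance minus both radii. The text does not state which formulas STATISTICA uses.

```python
import math

def center(points):
    """Coordinate-wise mean of the objects in one cluster."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def effective_radius(points):
    """RMS distance of the objects from their center (assumed reading of 'variance')."""
    c = center(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))

def separation(cluster_a, cluster_b):
    """Gap between two hyperspheres: center distance minus both effective radii."""
    d = math.dist(center(cluster_a), center(cluster_b))
    return d - effective_radius(cluster_a) - effective_radius(cluster_b)

# Two toy clusters in 2-D; the study itself works with 10 standardized parameters.
a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
b = [(10.0, 1.0), (12.0, 1.0)]
gap = separation(a, b)  # positive: the two hyperspheres do not overlap
```

A positive gap corresponds to clusters that are separated in the sense of the definition; a negative value would indicate overlapping hyperspheres.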

[41]  In the current study, the calculations were performed in STATISTICA after the prepared data had been loaded. The authors therefore did not examine the details of the algorithms implemented in STATISTICA and knew only the general classification procedure, as far as it is exposed through the parameters available in the program's user menu. The main parameter is the number of clusters N into which all the objects are to be divided. The selection of an optimal number of clusters is discussed below in Section 4.2.

[42]  The original data for the calculations are the standardized parameters (see Section 3) for each lithospheric cell (see Section 6.1), arranged as a table in which each row corresponds to a cell and each column holds the values of one of the 10 parameters. The matrix of distances between each pair of objects in the multi-dimensional space is then calculated. For a given number of required clusters N, the algorithm divides the entire set of objects into N clusters. The general idea of the procedure is as follows. Initially a measure (radius) is chosen that exceeds the extent of the space occupied by all the objects, so that any object can be reached from any other object within this radius. The algorithm then decreases the radius until dense groups separate from the general "cloud", i.e., until mutual access between the groups becomes impossible at the current radius. The method used to estimate group densities and area weights is not discussed here. The procedure can also be carried out in the opposite direction: a minimal measure (the shortest distance between objects) is increased until the objects, which initially form as many clusters as there are objects, aggregate into N groups.
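
The general idea described above can be sketched in a few lines of plain Python. This is one possible reading of the verbal description, not the algorithm implemented in STATISTICA: it builds the matrix of pairwise Euclidean distances and then follows the "opposite direction" variant, increasing the radius from the shortest inter-object distance until the objects merge into at most N groups. The function names (`distance_matrix`, `groups_at_radius`, `grow_radius_until`) and the toy data are hypothetical, and the estimation of group densities and area weights is omitted, as in the text.

```python
import math

def distance_matrix(rows):
    """Euclidean distance between every pair of objects (rows of the table)."""
    return [[math.dist(p, q) for q in rows] for p in rows]

def groups_at_radius(dist, radius):
    """Label objects so that two objects share a label when one can be
    reached from the other through steps no longer than `radius`."""
    n = len(dist)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier group
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i][j] <= radius:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def grow_radius_until(dist, n_clusters):
    """Increase the radius from the shortest pair distance until the objects
    merge into at most `n_clusters` groups; return (radius, labels)."""
    n = len(dist)
    if n_clusters >= n:
        return 0.0, list(range(n))  # every object is its own group
    candidates = sorted({dist[i][j] for i in range(n) for j in range(i + 1, n)})
    for radius in candidates:
        labels = groups_at_radius(dist, radius)
        if max(labels) + 1 <= n_clusters:
            return radius, labels
    raise ValueError("n_clusters must be at least 1")

# Toy table: 5 objects, 2 parameters (the study uses 10 standardized parameters).
rows = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 0.0)]
radius, labels = grow_radius_until(distance_matrix(rows), 3)
```

For this toy table the objects first fall into three groups at a radius of 1.0: the two close pairs and the isolated fifth object. Only candidate radii taken from the distance matrix are tried, because the grouping can change only at those values.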

[43]  The above presents the main idea of clusterization using the physically simplest method, k-means clustering. This method, as implemented in STATISTICA, suits our problem best. STATISTICA offers a variety of parameters and algorithmic options for clusterization, but their description is not essential for our work.
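
For reference, the textbook form of k-means (Lloyd's iteration) can be sketched as follows. This is a generic illustration under stated assumptions, not the STATISTICA implementation, whose details the text does not describe. The function name `k_means`, the random choice of initial centers, and the parameters `n_iter` and `seed` are hypothetical.

```python
import math
import random

def k_means(rows, n_clusters, n_iter=100, seed=0):
    """Generic Lloyd iteration: assign each object to the nearest center,
    then move each center to the mean of its objects, until nothing changes."""
    rng = random.Random(seed)
    centers = rng.sample(rows, n_clusters)  # initial centers: randomly chosen objects
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(n_clusters), key=lambda c: math.dist(p, centers[c]))
            for p in rows
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for c in range(n_clusters):
            members = [p for p, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers

# Toy table with two well-separated groups of three objects each.
rows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = k_means(rows, n_clusters=2)
```

The numbering of the resulting clusters depends on the initial centers, so only the grouping itself is meaningful, not the label values.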

4.2. Approach to Criteria Identification for Attainment of Results

[44]  The brief description of the method shows that our task is to break all the objects into N stable and distinctly isolated statistical groups, with N as large as possible. Each group is characterized by a certain combination of all the parameters. Apparently, groups with distinct extremes of any parameter are the first to be singled out. Division based on less pronounced variations begins only after groups have formed around the maximum values, or around values spanning the main variability range, of each parameter. At this stage it is essential to identify the moment when separation by statistically different mean values in the outlined areas gives way to "forced" separation, i.e., the extraction of clusters whose values differ only slightly, by amounts comparable to the dispersion or to the instrumental error of the parameter within the selected zone. This moment corresponds to the point at which the analysis stops estimating the linear heterogeneity of the medium (see Section 2.1) and starts to analyze scattered heterogeneity. From that point on, geodynamic interpretation of the separate clusters appears to be meaningless, and the analysis should be stopped at the current value of N. The diversity attributed to scattered heterogeneity should instead be estimated statistically, using characteristics of the high-order moment type that are uniform over the entire area. The physical validity and geological meaning of the parameter values of each cluster also serve as a criterion of the acceptability of the result. The set of characteristic values of each parameter for each cluster is the solution of the stated geodynamic zonation problem.
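
One way to make the notion of "forced" separation operational is sketched below. This is a hypothetical heuristic, not a criterion stated in the text. Two clusters are treated as forcibly separated when no parameter has mean values that differ by more than `threshold` times the larger of the pooled standard deviation and an assumed instrumental error `noise_floor`. The function names, the threshold of 1.0, and the pooling formula are all assumptions.

```python
import math
from itertools import combinations

def mean_and_std(values):
    """Mean and population standard deviation of one parameter in one cluster."""
    m = sum(values) / len(values)
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def is_forced_split(cluster_a, cluster_b, noise_floor=0.0, threshold=1.0):
    """True when no parameter separates the two clusters by more than
    `threshold` times max(pooled spread, noise_floor)."""
    for col_a, col_b in zip(zip(*cluster_a), zip(*cluster_b)):
        mean_a, std_a = mean_and_std(col_a)
        mean_b, std_b = mean_and_std(col_b)
        spread = max(math.sqrt((std_a ** 2 + std_b ** 2) / 2), noise_floor)
        if abs(mean_a - mean_b) > threshold * spread:
            return False  # at least one parameter distinguishes the clusters
    return True

def partition_is_meaningful(clusters, **kwargs):
    """A partition is kept only if no pair of clusters is a forced split."""
    return not any(is_forced_split(a, b, **kwargs)
                   for a, b in combinations(clusters, 2))

# Toy partitions with 2 parameters per object (hypothetical values).
distinct = [[(0.0, 5.0), (1.0, 5.4)], [(10.0, 5.1), (11.0, 5.2)]]
forced = [[(0.0, 5.0), (1.0, 5.4)], [(0.4, 5.1), (1.2, 5.2)]]
keep_distinct = partition_is_meaningful(distinct)
keep_forced = partition_is_meaningful(forced)
```

Under these assumptions, N would be increased while the partition remains meaningful and the last such N retained. The check on physical validity and geological meaning described in the text is a matter of interpretation and is not captured by this sketch.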



Citation: Sokolov, S. Yu., N. S. Sokolov, and L. V. Dmitriev (2008), Geodynamic zonation of the Atlantic Ocean lithosphere: Application of cluster analysis procedure and zoning inferred from geophysical data, Russ. J. Earth Sci., 10, ES4001, doi:10.2205/2007ES000218.

Copyright 2008 by the Russian Journal of Earth Sciences
