RUSSIAN JOURNAL OF EARTH SCIENCES VOL. 10, ES4001, doi:10.2205/2007ES000218, 2008
[39] Cluster analysis is a method of multi-dimensional statistical classification. It is based on
identifying compact groups of measurements (stable combinations of parameters in multi-dimensional
space), outlining the geometry of these groups in order to assess the distances between their
centers, and locating the boundaries that divide the space according to assignment to one group or
another. As a result of the analysis, the original set of points in multi-dimensional space (whose
dimension equals the number of parameters used for the classification, 10 in our case) is divided
into clusters, or groups of similar objects. Here an object is an elementary 1° × 1° lithospheric
cell to which a set of 10 parameter values has been assigned. A cluster is usually defined as a
group of objects (here, a lithospheric area) with a high density, i.e. a compact concentration of
the parameter values for that area. The object density, or similarity of properties, is assumed to
be higher within the cluster than outside it. A cluster may therefore be characterized by its
center, its variance (effective radius) within an outline shaped as a hypersphere, and its
separation from other clusters.
This definition is far from absolute.
[40] Nevertheless, it states the properties of a cluster and the task of the analysis clearly, and is in practice sufficiently comprehensive.
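The three descriptors named in this definition (center, effective radius, separation) can be illustrated with a minimal sketch. It assumes that the center is the mean of the member objects and that the effective radius is the root-mean-square distance from that center; the text does not state which estimators were actually used. The two groups and all function names are hypothetical, and two parameters stand in for the 10 used in the study.

```python
import math

def centroid(points):
    """Mean position of a group of objects in parameter space."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sq_dist(p, q):
    """Squared Euclidean distance between two objects."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def effective_radius(points):
    """Root-mean-square distance of the objects from their centroid
    (an assumed reading of 'variance (effective radius)')."""
    c = centroid(points)
    return math.sqrt(sum(sq_dist(p, c) for p in points) / len(points))

def separation(group_a, group_b):
    """Distance between the centroids of two groups."""
    return math.sqrt(sq_dist(centroid(group_a), centroid(group_b)))

# Hypothetical groups in a 2-parameter space.
group_a = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)]
group_b = [(10.0, 1.0), (12.0, 1.0)]

print(centroid(group_a), effective_radius(group_a))
print(centroid(group_b), effective_radius(group_b))
print(separation(group_a, group_b))
```

Under these assumptions two groups are well separated when the distance between their centers is large compared with the sum of their effective radii.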
[41] In the current study the calculations were performed in STATISTICA after loading the prepared data. Consequently, the authors did not examine the details of the algorithms implemented in STATISTICA; they relied on the general classification procedure and on the parameters available in the program's user menu. The main parameter is the number of clusters N into which all the objects are to be divided. The selection of an optimal number of clusters is discussed below in Section 4.2.
[42] The original data for the calculations are the standardized parameters (see Section 3) for each lithospheric cell (see Section 6.1), arranged as a table in which each row corresponds to a cell and each column holds the values of one of the 10 parameters. The matrix of distances between each pair of objects in the multi-dimensional space is then calculated. For a given number of required clusters N, the algorithm divides the entire set of objects into N clusters. The general idea of the procedure is as follows. Initially a measure (radius) is chosen that exceeds the total extent of the space occupied by the objects, so that any object can be reached from any other object within this radius. The algorithm then decreases the radius until dense groups separate from the general "cloud", i.e. until mutual access between groups at the current radius becomes impossible. The method used to estimate group densities and area weights is not discussed here. The procedure can also be carried out in the opposite direction: a minimum measure (the shortest distance between objects) is increased until the objects, which initially form as many clusters as there are objects, aggregate into N groups.
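The steps described in this paragraph (standardized table, distance matrix, shrinking radius) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the STATISTICA implementation: standardization is taken to be zero mean and unit population standard deviation per column, distances are Euclidean, "mutual access" is read as reachability through steps no longer than the current radius, and the density and weight estimation mentioned in the text is left out. The table values and all function names are hypothetical.

```python
import math

def standardize(table):
    """Scale each column (parameter) to zero mean and unit standard deviation."""
    cols = list(zip(*table))
    means = [sum(c) / len(c) for c in cols]
    # A constant column would give zero spread; fall back to 1.0 in that case.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in table]

def distance_matrix(rows):
    """Euclidean distance between every pair of objects (rows)."""
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in rows]
            for p in rows]

def groups_at_radius(dist, radius):
    """Give two objects the same label when one can be reached from the
    other through steps no longer than `radius`."""
    n = len(dist)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i][j] <= radius:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def shrink_radius(dist, n_clusters):
    """Decrease the radius from the largest pairwise distance until the
    objects fall apart into at least `n_clusters` groups."""
    for radius in sorted({d for row in dist for d in row}, reverse=True):
        labels = groups_at_radius(dist, radius)
        if len(set(labels)) >= n_clusters:
            return radius, labels
    raise ValueError("fewer distinct objects than requested clusters")

# Hypothetical table: 6 cells (rows) described by 2 parameters (columns).
table = [[1.0, 10.0], [1.2, 11.0], [0.9, 10.5],
         [5.0, 30.0], [5.2, 31.0], [9.0, 52.0]]
dist = distance_matrix(standardize(table))
radius, labels = shrink_radius(dist, 3)
print(radius, labels)
```

Running the loop in the opposite direction (starting from the smallest distance and increasing the radius until only N groups remain) gives the bottom-up variant mentioned at the end of the paragraph.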
[43] The above presents the main idea of clustering by the k-means method, which is the simplest in physical terms. This method, as implemented in STATISTICA, suits our problem best. STATISTICA offers a variety of parameters and algorithmic options for clustering, but their description is not essential for our work.
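For reference, the textbook form of k-means (the Lloyd iteration) is sketched below. It is a generic illustration and should not be read as the STATISTICA implementation, whose options are not described in the text. The choice of initial centers, the sample points, and the function names are assumptions made only for this example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two objects."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, initial_centers, max_iter=100):
    """Plain Lloyd iteration: assign each object to the nearest center,
    move each center to the mean of its objects, repeat until the
    assignment stops changing."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical objects in a 2-parameter space, with N = 2 clusters.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centers = k_means(points, initial_centers=[points[0], points[2]])
print(labels, centers)
```

The number of clusters N enters only through the number of initial centers, which is why it is the main parameter the user has to choose.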
Citation: 2008), Geodynamic zonation of the Atlantic Ocean lithosphere: Application of cluster analysis procedure and zoning inferred from geophysical data, Russ. J. Earth Sci., 10, ES4001, doi:10.2205/2007ES000218.
Copyright 2008 by the Russian Journal of Earth Sciences.