JMSL™ Numerical Library 4.0

com.imsl.stat
Class Dissimilarities

java.lang.Object
  extended by com.imsl.stat.Dissimilarities
All Implemented Interfaces:
Cloneable, Serializable

public class Dissimilarities
extends Object
implements Serializable, Cloneable

Computes a matrix of dissimilarities (or similarities) between the columns (or rows) of a matrix.

Class Dissimilarities computes an upper triangular matrix (excluding the diagonal) of dissimilarities (or similarities) between the columns or rows of a matrix. Nine different distance measures can be computed. For the first three measures, three different scaling options can be employed. The distance matrix computed is generally used as input to clustering or multidimensional scaling functions.

The following discussion assumes that the distance measure is being computed between the columns of the matrix. If distances between the rows of the matrix are desired, set iRow to 1 when calling the Dissimilarities constructor.

For distanceMethod = 0 to 2, each row of x is first scaled according to the value of distanceScale. The scale parameter s_k for row k is either the standard deviation of the row or the row range; the standard deviation is computed from the unbiased estimate of the variance. If distanceScale is 0, no scaling is performed, and the scale parameters s_k in the following discussion are all 1.0. Once the scale values (if any) have been computed, the distance between column i and column j is computed via the difference vector

  z_k = \frac{x_k - y_k}{s_k}, \quad k = 1, \ldots, ndstm,

where x_k denotes the k-th element in the i-th column, y_k denotes the corresponding element in the j-th column, and ndstm is the number of rows if differencing columns and the number of columns if differencing rows. For given z_k, the metrics 0 to 2 are defined as:

distanceMethod  Metric
0               Euclidean distance (L_2 norm)
1               Sum of the absolute differences (L_1 norm)
2               Maximum difference (L_\infty norm)
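
As an illustration of metrics 0 to 2, the following is a minimal plain-Java sketch that forms the scaled differences z_k and reduces them with each norm. It does not use or reproduce the library's implementation: the class name ScaledMetricsSketch and its methods are hypothetical, and the maximum difference is assumed to mean the largest absolute scaled difference.

```java
public class ScaledMetricsSketch {

    /** Scaled differences z_k = (x_k - y_k) / s_k (hypothetical helper). */
    public static double[] scaledDifferences(double[] x, double[] y, double[] s) {
        double[] z = new double[x.length];
        for (int k = 0; k < x.length; k++) {
            z[k] = (x[k] - y[k]) / s[k];
        }
        return z;
    }

    /** Reduces z with the norm selected by distanceMethod (0, 1 or 2). */
    public static double distance(double[] z, int distanceMethod) {
        if (distanceMethod < 0 || distanceMethod > 2) {
            throw new IllegalArgumentException("distanceMethod must be 0, 1 or 2");
        }
        double result = 0.0;
        for (double zk : z) {
            double a = Math.abs(zk);
            if (distanceMethod == 0) {
                result += zk * zk;              // sum of squares (L_2)
            } else if (distanceMethod == 1) {
                result += a;                    // sum of absolute values (L_1)
            } else {
                result = Math.max(result, a);   // largest absolute value (L_infinity)
            }
        }
        return distanceMethod == 0 ? Math.sqrt(result) : result;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 5.0, 2.0};
        double[] y = {4.0, 1.0, 2.0};
        double[] s = {1.0, 1.0, 1.0};           // distanceScale = 0: all scale values are 1.0
        double[] z = scaledDifferences(x, y, s); // {-3.0, 4.0, 0.0}
        System.out.println(distance(z, 0));     // prints 5.0
        System.out.println(distance(z, 1));     // prints 7.0
        System.out.println(distance(z, 2));     // prints 4.0
    }
}
```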

Distance measures corresponding to distanceMethod = 3 to 8 do not allow for scaling.

distanceMethod  Metric
3               Mahalanobis distance
4               Absolute value of the cosine of the angle between the vectors
5               Angle in radians (0, \pi) between the lines through the origin defined by the vectors
6               Correlation coefficient
7               Absolute value of the correlation coefficient
8               Number of exact matches, where x_i = y_i.
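
Measures 4 to 8 can be illustrated with the short plain-Java sketch below. It is not the library's code: the class SimilaritySketch and its methods are hypothetical, the correlation is assumed to be the Pearson coefficient (the cosine of the mean-centered vectors), and the angle is assumed to be the arccosine of the cosine. The Mahalanobis distance (method 3) is left out because it needs a covariance estimate.

```java
public class SimilaritySketch {

    /** Cosine of the angle between x and y; fails if either vector has zero norm. */
    public static double cosine(double[] x, double[] y) {
        double dot = 0.0, nx = 0.0, ny = 0.0;
        for (int k = 0; k < x.length; k++) {
            dot += x[k] * y[k];
            nx += x[k] * x[k];
            ny += y[k] * y[k];
        }
        if (nx == 0.0 || ny == 0.0) {
            throw new ArithmeticException("zero Euclidean norm");
        }
        return dot / Math.sqrt(nx * ny);
    }

    /** Subtracts the mean from every element. */
    private static double[] center(double[] v) {
        double mean = 0.0;
        for (double e : v) {
            mean += e;
        }
        mean /= v.length;
        double[] c = new double[v.length];
        for (int k = 0; k < v.length; k++) {
            c[k] = v[k] - mean;
        }
        return c;
    }

    /** Pearson correlation, computed as the cosine of the centered vectors. */
    public static double correlation(double[] x, double[] y) {
        return cosine(center(x), center(y));
    }

    /** Dispatches on distanceMethod = 4..8 as listed in the table. */
    public static double measure(double[] x, double[] y, int distanceMethod) {
        switch (distanceMethod) {
            case 4:
                return Math.abs(cosine(x, y));
            case 5:
                // Clamp to [-1, 1] so rounding error cannot push acos out of range.
                return Math.acos(Math.max(-1.0, Math.min(1.0, cosine(x, y))));
            case 6:
                return correlation(x, y);
            case 7:
                return Math.abs(correlation(x, y));
            case 8: {
                int matches = 0;
                for (int k = 0; k < x.length; k++) {
                    if (x[k] == y[k]) {
                        matches++;
                    }
                }
                return matches;
            }
            default:
                throw new IllegalArgumentException("distanceMethod must be 4..8");
        }
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {3.0, 2.0, 1.0};
        System.out.println(measure(x, y, 6)); // prints -1.0
        System.out.println(measure(x, y, 7)); // prints 1.0
        System.out.println(measure(x, y, 8)); // prints 1.0 (only the middle elements match)
    }
}
```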

For the Mahalanobis distance, any variable used in computing the distance measure that is (numerically) linearly dependent upon the previous variables in the indexArray vector is omitted from the distance measure.

See Also:
Example 1, Example 2, Serialized Form

Nested Class Summary
static class Dissimilarities.NoPositiveVarianceException
          No variable has positive variance.
static class Dissimilarities.ScaleFactorZeroException
          The computations cannot continue because a scale factor is zero.
static class Dissimilarities.ZeroNormException
          The computations cannot continue because the Euclidean norm of the column is equal to zero.
 
Constructor Summary
Dissimilarities(double[][] x, int distanceMethod, int distanceScale, int iRow)
          Constructor for Dissimilarities.
Dissimilarities(double[][] x, int distanceMethod, int distanceScale, int iRow, int[] indexArray)
          Constructor for Dissimilarities.
 
Method Summary
 double[][] getDistanceMatrix()
          Returns the distance matrix.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Dissimilarities

public Dissimilarities(double[][] x,
                       int distanceMethod,
                       int distanceScale,
                       int iRow)
                throws Dissimilarities.ScaleFactorZeroException,
                       Dissimilarities.ZeroNormException,
                       Dissimilarities.NoPositiveVarianceException
Constructor for Dissimilarities.

Parameters:
x - A double matrix containing the input data.
distanceMethod - An int identifying the method to be used in computing the dissimilarities or similarities. Acceptable values of distanceMethod are 0, 1, 2, ..., 8. See above for a description of these methods.
distanceScale - An int containing the scaling option.

distanceScale  Method
0              No scaling is performed.
1              Scale each column (row if iRow=1) by the standard deviation of the column (row).
2              Scale each column (row if iRow=1) by the range of the column (row).

iRow - An int identifying whether distances are computed between rows or columns of x. If iRow = 1, distances are computed between the rows of x. Otherwise, distances between the columns of x are computed.
Throws:
IllegalArgumentException - thrown when the row lengths of input matrix x are not equal (i.e., the matrix edges are "jagged")
Dissimilarities.ScaleFactorZeroException - thrown when computations cannot continue because a scale factor is zero
Dissimilarities.NoPositiveVarianceException - thrown when no variable has positive variance
Dissimilarities.ZeroNormException - thrown when the Euclidean norm of a column is equal to zero
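
The distanceScale options and the zero-scale failure described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: ScaleFactorSketch is a hypothetical class, and IllegalStateException stands in for Dissimilarities.ScaleFactorZeroException so that the sketch does not depend on the library.

```java
public class ScaleFactorSketch {

    /**
     * Scale value for one vector of observations:
     * 0 = no scaling (1.0), 1 = standard deviation from the unbiased
     * variance estimate (divisor n - 1), 2 = range (max - min).
     */
    public static double scaleFactor(double[] v, int distanceScale) {
        double scale;
        if (distanceScale == 0) {
            return 1.0;
        } else if (distanceScale == 1) {
            if (v.length < 2) {
                throw new IllegalArgumentException("need at least two values");
            }
            double mean = 0.0;
            for (double e : v) {
                mean += e;
            }
            mean /= v.length;
            double sumSq = 0.0;
            for (double e : v) {
                sumSq += (e - mean) * (e - mean);
            }
            scale = Math.sqrt(sumSq / (v.length - 1));
        } else if (distanceScale == 2) {
            double min = v[0], max = v[0];
            for (double e : v) {
                min = Math.min(min, e);
                max = Math.max(max, e);
            }
            scale = max - min;
        } else {
            throw new IllegalArgumentException("distanceScale must be 0, 1 or 2");
        }
        if (scale == 0.0) {
            // Stand-in for the library's ScaleFactorZeroException (assumption).
            throw new IllegalStateException("scale factor is zero");
        }
        return scale;
    }

    public static void main(String[] args) {
        double[] v = {1.0, 2.0, 3.0, 4.0, 5.0};
        System.out.println(scaleFactor(v, 0)); // prints 1.0
        System.out.println(scaleFactor(v, 1)); // sqrt(10 / 4) = sqrt(2.5)
        System.out.println(scaleFactor(v, 2)); // prints 4.0
    }
}
```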

Dissimilarities

public Dissimilarities(double[][] x,
                       int distanceMethod,
                       int distanceScale,
                       int iRow,
                       int[] indexArray)
                throws Dissimilarities.ScaleFactorZeroException,
                       Dissimilarities.ZeroNormException,
                       Dissimilarities.NoPositiveVarianceException
Constructor for Dissimilarities.

Parameters:
x - A double matrix containing the input data.
distanceMethod - An int identifying the method to be used in computing the dissimilarities or similarities. Acceptable values of distanceMethod are 0, 1, 2, ..., 8. See above for a description of these methods.
distanceScale - An int containing the scaling option.

distanceScale  Method
0              No scaling is performed.
1              Scale each column (row if iRow=1) by the standard deviation of the column (row).
2              Scale each column (row if iRow=1) by the range of the column (row).

iRow - An int identifying whether distances are computed between rows or columns of x. If iRow=1, distances are computed between the rows of x. Otherwise, distances between the columns of x are computed.
indexArray - An int array containing the indices of the rows (columns if iRow is 1) to be used in computing the distance measure.
Throws:
IllegalArgumentException - thrown when the row lengths of input matrix x are not equal (i.e., the matrix edges are "jagged")
Dissimilarities.ScaleFactorZeroException - thrown when computations cannot continue because a scale factor is zero
Dissimilarities.NoPositiveVarianceException - thrown when no variable has positive variance
Dissimilarities.ZeroNormException - thrown when the Euclidean norm of a column is equal to zero

Method Detail

getDistanceMatrix

public final double[][] getDistanceMatrix()
Returns the distance matrix.

Returns:
A double matrix containing the distance matrix.
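
To make the upper triangular layout concrete, here is a minimal plain-Java sketch that fills only the entries above the diagonal with unscaled Euclidean distances between columns (or rows). The class UpperTriangleSketch is hypothetical, and the square n-by-n array with zeros on and below the diagonal is an assumption for illustration, not a statement about the exact array this method returns.

```java
public class UpperTriangleSketch {

    /** Pairwise Euclidean distances between columns (or rows) of x, upper triangle only. */
    public static double[][] pairwise(double[][] x, boolean betweenRows) {
        int nRows = x.length;
        int nCols = x[0].length;
        for (double[] row : x) {
            if (row.length != nCols) {
                throw new IllegalArgumentException("matrix is jagged");
            }
        }
        int n = betweenRows ? nRows : nCols;      // number of vectors being compared
        int ndstm = betweenRows ? nCols : nRows;  // length of each vector
        double[][] d = new double[n][n];          // entries default to 0.0
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {     // strictly above the diagonal
                double sumSq = 0.0;
                for (int k = 0; k < ndstm; k++) {
                    double xi = betweenRows ? x[i][k] : x[k][i];
                    double xj = betweenRows ? x[j][k] : x[k][j];
                    sumSq += (xi - xj) * (xi - xj);
                }
                d[i][j] = Math.sqrt(sumSq);
            }
        }
        return d;
    }

    public static void main(String[] args) {
        double[][] x = {
            {0.0, 3.0, 0.0},
            {4.0, 0.0, 0.0}
        };
        double[][] byColumns = pairwise(x, false); // 3 x 3
        System.out.println(byColumns[0][1]);       // prints 5.0
        System.out.println(byColumns[0][2]);       // prints 4.0
        System.out.println(byColumns[1][2]);       // prints 3.0
        double[][] byRows = pairwise(x, true);     // 2 x 2
        System.out.println(byRows[0][1]);          // prints 5.0
    }
}
```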


Copyright 1970-2006 Visual Numerics, Inc.
Built June 1 2006.