Class StorelessCovariance

java.lang.Object
org.apache.commons.math3.stat.correlation.Covariance
org.apache.commons.math3.stat.correlation.StorelessCovariance

public class StorelessCovariance extends Covariance
Covariance implementation that does not require input data to be stored in memory. The size of the covariance matrix is specified in the constructor. Elements of the matrix are updated incrementally with calls to increment(double[]), or merged from another instance with append(StorelessCovariance).

This class is based on a paper written by Philippe Pébay: Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments, 2008, Technical Report SAND2008-6212, Sandia National Laboratories.
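As a rough illustration of the one-pass idea, the sketch below keeps a running count, two running means and a running co-moment for a single pair of variables, and folds in one observation at a time. It is a minimal sketch under stated assumptions, not the library's code: the class name RunningCovarianceSketch, its field names and the use of IllegalStateException are all hypothetical.

```java
/** Hypothetical one-pass accumulator for the covariance of a single (x, y) pair. */
final class RunningCovarianceSketch {
    private long n;          // number of (x, y) observations seen so far
    private double meanX;    // running mean of x
    private double meanY;    // running mean of y
    private double coMoment; // running sum of (x - meanX) * (y - meanY)

    /** Folds one observation into the running statistics without storing it. */
    void increment(double x, double y) {
        n++;
        double deltaX = x - meanX; // deviation from the mean of the previous n - 1 values
        double deltaY = y - meanY;
        meanX += deltaX / n;
        meanY += deltaY / n;
        coMoment += ((n - 1.0) / n) * deltaX * deltaY;
    }

    /** Divides by n - 1 when biasCorrected is true, by n otherwise. */
    double result(boolean biasCorrected) {
        if (n < 2) {
            throw new IllegalStateException("at least 2 observations are required");
        }
        return biasCorrected ? coMoment / (n - 1) : coMoment / n;
    }
}
```

Only four numbers are retained per pair of variables, regardless of how many observations are processed.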

Note: the underlying covariance matrix is symmetric, so only its upper triangular part is stored and updated on each increment.

Since:
3.0
  • Field Details

    • covMatrix

      private StorelessBivariateCovariance[] covMatrix
      the square covariance matrix (upper triangular part)
    • dimension

      private int dimension
      dimension of the square covariance matrix
  • Constructor Details

    • StorelessCovariance

      public StorelessCovariance(int dim)
      Create a bias-corrected covariance matrix with a given dimension.
      Parameters:
      dim - the dimension of the square covariance matrix
    • StorelessCovariance

      public StorelessCovariance(int dim, boolean biasCorrected)
      Create a covariance matrix with a given number of rows and columns and the indicated bias correction.
      Parameters:
      dim - the dimension of the covariance matrix
      biasCorrected - if true, the covariance estimate is corrected for bias (n-1 in the denominator); otherwise there is no bias correction (n in the denominator).
  • Method Details

    • initializeMatrix

      private void initializeMatrix(boolean biasCorrected)
      Initialize the internal array of StorelessBivariateCovariance instances that holds the upper triangular part of the matrix.
      Parameters:
      biasCorrected - if the covariance estimate shall be corrected for bias
    • indexOf

      private int indexOf(int i, int j)
      Returns the index (i, j) translated into the one-dimensional array used to store the upper triangular part of the symmetric covariance matrix.
      Parameters:
      i - the row index
      j - the column index
      Returns:
      the corresponding index in the matrix array
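The translation from (i, j) to a one-dimensional position can be pictured with one common packing scheme for symmetric matrices. The sketch below assumes that layout for illustration only; the source does not state which layout the class uses, and the names PackedIndexSketch and packedLength are hypothetical.

```java
/** Hypothetical packing of a symmetric dim x dim matrix into dim * (dim + 1) / 2 slots. */
final class PackedIndexSketch {
    private PackedIndexSketch() { }

    /** Maps (i, j) and (j, i) to the same slot of the packed array. */
    static int indexOf(int i, int j) {
        int hi = Math.max(i, j);
        int lo = Math.min(i, j);
        // Pairs whose larger index is hi start after the hi * (hi + 1) / 2 slots
        // already taken by pairs with a smaller larger index.
        return hi * (hi + 1) / 2 + lo;
    }

    /** Number of slots needed for a matrix of the given dimension. */
    static int packedLength(int dim) {
        return dim * (dim + 1) / 2;
    }
}
```

Because the two indices are ordered before the slot is computed, a lookup of (i, j) and a lookup of (j, i) always reach the same stored element.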
    • getElement

      private StorelessBivariateCovariance getElement(int i, int j)
      Gets the element at index (i, j) from the covariance matrix.
      Parameters:
      i - the row index
      j - the column index
      Returns:
      the StorelessBivariateCovariance element at the given index
    • setElement

      private void setElement(int i, int j, StorelessBivariateCovariance cov)
      Sets the covariance element at index (i, j) in the covariance matrix.
      Parameters:
      i - the row index
      j - the column index
      cov - the StorelessBivariateCovariance element to be set
    • getCovariance

      public double getCovariance(int xIndex, int yIndex) throws NumberIsTooSmallException
      Get the covariance for an individual element of the covariance matrix.
      Parameters:
      xIndex - row index in the covariance matrix
      yIndex - column index in the covariance matrix
      Returns:
      the covariance of the given element
      Throws:
      NumberIsTooSmallException - if the number of observations in the cell is < 2
    • increment

      public void increment(double[] data) throws DimensionMismatchException
      Increment the covariance matrix with one row of data.
      Parameters:
      data - array representing one row of data.
      Throws:
      DimensionMismatchException - if the length of data does not match the dimension of the covariance matrix
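To show what incrementing with one row of data can involve, the sketch below updates every cell of a packed upper triangle from a single row. It is a simplified stand-in, not the class itself: RowIncrementSketch, its parallel arrays, the packing formula and the use of IllegalArgumentException in place of DimensionMismatchException are all assumptions made for the example.

```java
/** Hypothetical row-wise update of a packed upper-triangular covariance accumulator. */
final class RowIncrementSketch {
    private final int dim;
    private final long[] n;          // observation count per cell
    private final double[] meanX;    // running mean of the row variable per cell
    private final double[] meanY;    // running mean of the column variable per cell
    private final double[] coMoment; // running co-moment per cell

    RowIncrementSketch(int dim) {
        this.dim = dim;
        int cells = dim * (dim + 1) / 2; // upper triangle including the diagonal
        n = new long[cells];
        meanX = new double[cells];
        meanY = new double[cells];
        coMoment = new double[cells];
    }

    /** Updates every cell (i, j) with i <= j from one row of observations. */
    void increment(double[] row) {
        if (row.length != dim) {
            throw new IllegalArgumentException("expected " + dim + " values, got " + row.length);
        }
        for (int i = 0; i < dim; i++) {
            for (int j = i; j < dim; j++) {
                int k = j * (j + 1) / 2 + i; // packed slot, valid because i <= j
                n[k]++;
                double dx = row[i] - meanX[k];
                double dy = row[j] - meanY[k];
                meanX[k] += dx / n[k];
                meanY[k] += dy / n[k];
                coMoment[k] += ((n[k] - 1.0) / n[k]) * dx * dy;
            }
        }
    }

    /** Bias-corrected covariance of cell (i, j); the index order does not matter. */
    double covariance(int i, int j) {
        int k = i <= j ? j * (j + 1) / 2 + i : i * (i + 1) / 2 + j;
        if (n[k] < 2) {
            throw new IllegalStateException("at least 2 observations are required");
        }
        return coMoment[k] / (n[k] - 1);
    }
}
```

Each call touches dim * (dim + 1) / 2 cells, so the cost per row grows quadratically with the dimension while memory use stays independent of the number of rows.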
    • append

      public void append(StorelessCovariance sc) throws DimensionMismatchException
      Appends sc to this, effectively aggregating the computations in sc with this. After invoking this method, covariances returned should be close to what would have been obtained by performing all of the increment(double[]) operations in sc directly on this.
      Parameters:
      sc - externally computed StorelessCovariance to add to this
      Throws:
      DimensionMismatchException - if the dimension of sc does not match this
      Since:
      3.3
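Aggregating two separately computed results can be sketched for a single pair of variables with a pairwise combination formula of the kind given in the Pébay report cited above. This is a minimal sketch under stated assumptions, not the library's code: MergeableCovarianceSketch and its fields are hypothetical, and each instance is built directly from summary statistics to keep the example short.

```java
/** Hypothetical mergeable summary of one (x, y) pair: count, means and co-moment. */
final class MergeableCovarianceSketch {
    long n;          // number of observations summarized
    double meanX;    // mean of x over those observations
    double meanY;    // mean of y over those observations
    double coMoment; // sum of (x - meanX) * (y - meanY) over those observations

    MergeableCovarianceSketch(long n, double meanX, double meanY, double coMoment) {
        this.n = n;
        this.meanX = meanX;
        this.meanY = meanY;
        this.coMoment = coMoment;
    }

    /** Folds the summary in other into this one. */
    void append(MergeableCovarianceSketch other) {
        if (other.n == 0) {
            return; // nothing to merge, and avoids dividing by a zero total
        }
        long oldN = n;
        n += other.n;
        double deltaX = other.meanX - meanX;
        double deltaY = other.meanY - meanY;
        meanX += deltaX * other.n / n;
        meanY += deltaY * other.n / n;
        // The cross term accounts for the shift between the two groups' means.
        coMoment += other.coMoment + ((double) oldN * other.n / n) * deltaX * deltaY;
    }

    /** Divides by n - 1 when biasCorrected is true, by n otherwise. */
    double result(boolean biasCorrected) {
        if (n < 2) {
            throw new IllegalStateException("at least 2 observations are required");
        }
        return biasCorrected ? coMoment / (n - 1) : coMoment / n;
    }
}
```

The merged result agrees with a single pass over all observations only up to floating-point rounding, which is why the documentation promises results that are close rather than identical.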
    • getCovarianceMatrix

      public RealMatrix getCovarianceMatrix() throws NumberIsTooSmallException
      Returns the covariance matrix.
      Overrides:
      getCovarianceMatrix in class Covariance
      Returns:
      covariance matrix
      Throws:
      NumberIsTooSmallException - if the number of observations in a cell is < 2
    • getData

      public double[][] getData() throws NumberIsTooSmallException
      Return the covariance matrix as a two-dimensional array.
      Returns:
      a two-dimensional double array of covariance values
      Throws:
      NumberIsTooSmallException - if the number of observations for a cell is < 2
    • getN

      public int getN() throws MathUnsupportedOperationException
      This Covariance method is not supported by a StorelessCovariance, since the number of bivariate observations does not have to be the same for different pairs of covariates - i.e., N as defined in Covariance.getN() is undefined.
      Overrides:
      getN in class Covariance
      Returns:
      nothing as this implementation always throws a MathUnsupportedOperationException
      Throws:
      MathUnsupportedOperationException - in all cases