Class KMeansPlusPlusClusterer<T extends Clusterable<T>>

java.lang.Object
org.apache.commons.math3.stat.clustering.KMeansPlusPlusClusterer<T>
Type Parameters:
T - type of the points to cluster

@Deprecated public class KMeansPlusPlusClusterer<T extends Clusterable<T>> extends Object
Deprecated.
As of 3.2 (to be removed in 4.0), use KMeansPlusPlusClusterer instead
Clustering algorithm based on David Arthur and Sergei Vassilvitski k-means++ algorithm.
Since:
2.0
See Also:
  • Field Details

  • Constructor Details

    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(Random random)
      Deprecated.
      Build a clusterer.

      The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with largest distance variance.

      Parameters:
      random - random generator to use for choosing initial centers
    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(Random random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
      Deprecated.
      Build a clusterer.
      Parameters:
      random - random generator to use for choosing initial centers
      emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
      Since:
      2.2
  • Method Details

    • cluster

      public List<Cluster<T>> cluster(Collection<T> points, int k, int numTrials, int maxIterationsPerTrial) throws MathIllegalArgumentException, ConvergenceException
      Deprecated.
      Runs the K-means++ clustering algorithm.
      Parameters:
      points - the points to cluster
      k - the number of clusters to split the data into
      numTrials - number of trial runs
      maxIterationsPerTrial - the maximum number of iterations to run the algorithm for at each trial run. If negative, no maximum will be used
      Returns:
      a list of clusters containing the points
      Throws:
      MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
      ConvergenceException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
    • cluster

      public List<Cluster<T>> cluster(Collection<T> points, int k, int maxIterations) throws MathIllegalArgumentException, ConvergenceException
      Deprecated.
      Runs the K-means++ clustering algorithm.
      Parameters:
      points - the points to cluster
      k - the number of clusters to split the data into
      maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used
      Returns:
      a list of clusters containing the points
      Throws:
      MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
      ConvergenceException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
    • assignPointsToClusters

      private static <T extends Clusterable<T>> int assignPointsToClusters(List<Cluster<T>> clusters, Collection<T> points, int[] assignments)
      Deprecated.
      Adds the given points to the closest Cluster.
      Type Parameters:
      T - type of the points to cluster
      Parameters:
      clusters - the Clusters to add the points to
      points - the points to add to the given Clusters
      assignments - points assignments to clusters
      Returns:
      the number of points assigned to different clusters as the iteration before
    • chooseInitialCenters

      private static <T extends Clusterable<T>> List<Cluster<T>> chooseInitialCenters(Collection<T> points, int k, Random random)
      Deprecated.
      Use K-means++ to choose the initial centers.
      Type Parameters:
      T - type of the points to cluster
      Parameters:
      points - the points to choose the initial centers from
      k - the number of centers to choose
      random - random generator to use
      Returns:
      the initial centers
    • getPointFromLargestVarianceCluster

      private T getPointFromLargestVarianceCluster(Collection<Cluster<T>> clusters) throws ConvergenceException
      Deprecated.
      Get a random point from the Cluster with the largest distance variance.
      Parameters:
      clusters - the Clusters to search
      Returns:
      a random point from the selected cluster
      Throws:
      ConvergenceException - if clusters are all empty
    • getPointFromLargestNumberCluster

      private T getPointFromLargestNumberCluster(Collection<Cluster<T>> clusters) throws ConvergenceException
      Deprecated.
      Get a random point from the Cluster with the largest number of points
      Parameters:
      clusters - the Clusters to search
      Returns:
      a random point from the selected cluster
      Throws:
      ConvergenceException - if clusters are all empty
    • getFarthestPoint

      private T getFarthestPoint(Collection<Cluster<T>> clusters) throws ConvergenceException
      Deprecated.
      Get the point farthest to its cluster center
      Parameters:
      clusters - the Clusters to search
      Returns:
      point farthest to its cluster center
      Throws:
      ConvergenceException - if clusters are all empty
    • getNearestCluster

      private static <T extends Clusterable<T>> int getNearestCluster(Collection<Cluster<T>> clusters, T point)
      Deprecated.
      Returns the nearest Cluster to the given point
      Type Parameters:
      T - type of the points to cluster
      Parameters:
      clusters - the Clusters to search
      point - the point to find the nearest Cluster for
      Returns:
      the index of the nearest Cluster to the given point