damina::CCLSVClustering Class Reference

An implementation of the Support Vector Clustering originally proposed by Ben-Hur et al (2001), with a variation in the Cluster Labeling part.

#include <CCLSVClustering.h>

Public Member Functions

 CCLSVClustering (double, double, struct svm_problem *)
 Constructor.
 CCLSVClustering (double, double, DataSet *)
 Constructor.
 CCLSVClustering (double, struct svm_problem *)
 Constructor.
 CCLSVClustering (double, DataSet *)
 Constructor.
 CCLSVClustering (struct svm_problem *)
 Constructor.
 CCLSVClustering (DataSet *)
 Constructor.
virtual void clusterize ()
 Run the clustering process.
virtual std::vector< int > & getClustersAssignment ()
 Returns the clustering assignments.
virtual int getKernelType ()
 Returns the current kernel type.
virtual double getKernelWidth ()
 Returns the current value of the Kernel Width.
virtual struct svm_model * getModel ()
 Returns the trained model.
virtual unsigned long int getNumberOfClusters ()
 Returns the number of clusters found.
virtual unsigned long int getNumberOfNonSingletonClusters ()
 Returns the number of clusters containing more than one point.
virtual unsigned long int getNumberOfValidClusters (unsigned long int)
 Returns the number of clusters with a number of elements greater than or equal to a certain threshold.
virtual std::vector< int > & getOriginalClasses (int)
 Returns the original classes assignments for the points in input.
virtual struct svm_parameter * getParameters ()
 Returns the SVM parameters.
virtual std::vector< unsigned long int > & getPointPerCluster ()
 Returns the number of points per cluster.
virtual struct svm_problem * getProblem ()
 Returns the current input data set in the libsvm format.
virtual double getSoftConstraint ()
 Returns the current value of the Soft Margin Constraint.
virtual double getSoftConstraintEstimate ()
 Return the estimate of a more appropriate Soft Margin parameter.
virtual double getSphereRadius ()
 Returns the current value of the sphere radius.
virtual DataSet * getTrainingSet ()
 Returns the current input data set.
virtual double initialKernelWidth ()
 Computes the initial gaussian kernel width, as proposed in.
virtual void learn (struct svm_problem *)
 The learning process.
virtual void learn (DataSet *)
 The learning process.
virtual void learn ()
 The learning process.
virtual void setEuclideanDistance (EuclideanDistance *)
 Set a vector-distance measure valid for the Euclidean Space.
virtual void setKernelType (int)
 Set the kernel type for the SVM.
virtual void setKernelWidth (double)
 Set the new width for Gaussian Kernel.
virtual void setProblem (struct svm_problem *)
 Set the input data set in libsvm format.
virtual void setSoftConstraint (double)
 Set the new value for the Soft Margin Constant.
virtual void setTrainingSet (DataSet *)
 Set the input data set.
virtual void useClassicClusteringPolicyForBSV ()
 Use the classic clustering policy to clusterize BSVs.
virtual void useSpecialClusteringPolicyForBSV ()
 Use a special clustering policy to clusterize BSVs.
virtual bool usingSpecialClusteringPolicyForBSV ()
 Return true if the SVC is using the special policy to clusterize BSV.
virtual ~CCLSVClustering ()
 Destructor.

Protected Member Functions

virtual double beta (int, bool)
 Returns the i-th Lagrangian multiplier.
virtual double beta (int)
 Returns the i-th Lagrangian multiplier.
virtual double calculateDistanceFromCenter (struct svm_node *)
 Calculate the distance of a point from the center of the sphere.
virtual void calculateSphereRadius ()
 Calculate the sphere radius.
virtual void experimentalSeparateClusters ()
virtual double rho (bool)
 Returns the rho value of the decision function computed by One Class SVM.
virtual double rho ()
 Returns the rho value of the decision function computed by One Class SVM.
virtual void separateClusters ()
 Run the clustering process.

Protected Attributes

double C_estimation
 An estimation of a more appropriate C value, based on the heuristics proposed in the last paragraph of.
EuclideanDistance * eucDist
 Euclidean measure used to compute the distance between vectors.
std::vector< int > labels
 Clustering assignments.
OneClassSVM * trainer
 The One Class SVM for the first step of the SV Clustering process.

Private Member Functions

void calculateZeta ()
 Calculate the Zeta, i.e. the radius of the SV-centered spheres in data space.
void clusterizeBSVs ()
svm_node * pointOnThePath (struct svm_node *, struct svm_node *, double)
 Samples a point on the path between two points.

Private Attributes

double zeta

Detailed Description

An implementation of the Support Vector Clustering originally proposed by Ben-Hur et al (2001), with a variation in the Cluster Labeling part.

This class implements the Support Vector Clustering originally proposed in

A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.

Support Vector Clustering is among the few attempts to use SVMs in an unsupervised way.

It consists mainly of two steps:

1. Cluster description
=======================

Find the Minimum Enclosing Sphere in the feature space, exactly as in the SVDD described in

D. M. J. Tax and R. P. W. Duin, "Data Domain Description using Support Vectors," in European Symposium on Artificial Neural Network, Bruges (Belgium), 1999.

D. M. J. Tax, "One-class classification: concept learning in the absence of counter-examples," PhD Thesis, 2001.

This leads to a Quadratic Programming problem, as in the classic SVM training problem.

This sphere, mapped back to the data space, splits into several contours, each representing a cluster.

Here, we use the One-class classification by Schoelkopf et al (2001) instead of the SVDD by Tax.

The two methods have been shown to be equivalent under the following conditions:
a. The kernel must be the Gaussian Kernel
b. The kernel width must be the same
c. The C parameter in SVDD must be equal to 1/(nu*N), where nu is the parameter in One-class SVM and N is the number of input elements.

Condition a. is satisfied because the Support Vector Clustering formulation is based on the Gaussian Kernel (other kernels, such as the polynomial one, do not describe the cluster boundaries well, as shown by Tax & Duin 1999).

2. Cluster labeling
===================

The cluster description algorithm does not differentiate between points that belong to different clusters.

Here we replace the originally proposed cluster labeling algorithm with a novel and faster method proposed in

S. Lee and K. M. Daniels, "Cone Cluster Labeling for Support Vector Clustering," in Proceedings of 6th SIAM Conference on Data Mining, 2006, pp. 484-488.

See also:
OneClassSVM
Author:
Vincenzo Russo - vincenzo.russo@neminis.org

Definition at line 70 of file CCLSVClustering.h.


Constructor & Destructor Documentation

damina::CCLSVClustering::CCLSVClustering ( DataSet *  train  ) 

Constructor.

Parameters:
train The input data set

Definition at line 12 of file CCLSVClustering.cpp.

00012                                                       : SVClustering(train) {
00013         }

damina::CCLSVClustering::CCLSVClustering ( struct svm_problem *  p  ) 

Constructor.

Parameters:
p The input data in libsvm format

Definition at line 20 of file CCLSVClustering.cpp.

00020                                                              : SVClustering(p) {
00021                 
00022         }

damina::CCLSVClustering::CCLSVClustering ( double  C,
DataSet *  train 
)

Constructor.

Parameters:
C The soft margin constant
train The input data set

Definition at line 30 of file CCLSVClustering.cpp.

00030                                                                 : SVClustering(C, train) {
00031                 
00032         }

damina::CCLSVClustering::CCLSVClustering ( double  C,
struct svm_problem *  p 
)

Constructor.

Parameters:
C The soft margin constant
p The input data set in libsvm format

Definition at line 40 of file CCLSVClustering.cpp.

00040                                                                        : SVClustering(C, p) {
00041                 
00042         }

damina::CCLSVClustering::CCLSVClustering ( double  C,
double  w,
DataSet *  train 
)

Constructor.

Parameters:
C The soft margin constant
w The Gaussian kernel width
train The input data set

Definition at line 51 of file CCLSVClustering.cpp.

00051                                                                           : SVClustering(C, w, train) {
00052                 
00053         }

damina::CCLSVClustering::CCLSVClustering ( double  C,
double  w,
struct svm_problem *  p 
)

Constructor.

Parameters:
C The soft margin constant
w The Gaussian kernel width
p The input data set in libsvm format

Definition at line 62 of file CCLSVClustering.cpp.

00062                                                                                  : SVClustering(C, w, p) {
00063                 
00064         }

damina::CCLSVClustering::~CCLSVClustering (  )  [virtual]

Destructor.

Definition at line 71 of file CCLSVClustering.cpp.

00071                                                 {
00072                  
00073         }


Member Function Documentation

double damina::SVClustering::beta ( int  i,
bool  scaled 
) [protected, virtual, inherited]

Returns the i-th Lagrangian multiplier.

The One Class SVM solves the Quadratic Programming problem and computes the Lagrangian multipliers "beta". LibSVM solves a scaled version of the One Class SVM problem, so this method returns either the scaled value of a beta or the unscaled one, depending on the 'scaled' parameter.

To obtain the unscaled value of a multiplier, we multiply the scaled value by C.

Betas equal to zero (those of non-support vectors) are not part of the solution.

Parameters:
i The index of the Lagrangian multiplier
scaled True if you want the scaled version, false otherwise
Returns:
The value of the i-th beta (scaled or not, depending on 'scaled' param)

Definition at line 590 of file SVClustering.cpp.

References damina::OneClassSVM::getModel(), damina::SVClustering::getSoftConstraint(), and damina::SVClustering::trainer.

00590                                                     {
00591                 if (scaled) {
00592                         return this->trainer->getModel()->sv_coef[0][i];
00593                 }
00594                 else {
00595                         return this->trainer->getModel()->sv_coef[0][i] * getSoftConstraint();
00596                 }
00597         }

double damina::SVClustering::beta ( int  i  )  [protected, virtual, inherited]

Returns the i-th Lagrangian multiplier.

The One Class SVM solves the Quadratic Programming problem and computes the Lagrangian multipliers "beta". This method returns the i-th beta multiplied by C, because LibSVM solves a scaled version of the One Class SVM problem. Betas equal to zero (those of non-support vectors) are not part of the solution.

Parameters:
i The index of the Lagrangian multiplier
Returns:
The value of the i-th beta

Definition at line 569 of file SVClustering.cpp.

References damina::SVClustering::C, damina::OneClassSVM::getModel(), and damina::SVClustering::trainer.

Referenced by damina::SVClustering::calculateDistanceFromCenter(), damina::SVClustering::calculateQuadraticPartOfDistanceFromCenter(), experimentalSeparateClusters(), and separateClusters().

00569                                         {
00570                 return this->trainer->getModel()->sv_coef[0][i] * C; 
00571          }

double damina::SVClustering::calculateDistanceFromCenter ( struct svm_node *  x  )  [protected, virtual, inherited]

Calculate the distance of a point from the center of the sphere.

Reference:

A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.

Equation 13 in the paper above gives the distance of a point from the center of the sphere. Note that the implementation listed here returns the squared distance: the line that takes the square root is commented out.

We sum over support vectors only (including Bounded SVs), because the betas (Lagrangian multipliers) are zero for non-SVs.

Furthermore, we replace K(x,x) with 1, because with the Gaussian Kernel, K(x,x) = 1 for all x.
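
The computation described above can be sketched without libsvm. This is a minimal, self-contained illustration and not the library code: the Point alias and the functions gaussianKernel and squaredDistanceFromCenter are hypothetical names, and the kernel form exp(-q * ||a - b||^2) is the usual Gaussian kernel with width q.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;

// Gaussian kernel K(a, b) = exp(-q * ||a - b||^2), q being the kernel width.
double gaussianKernel(const Point& a, const Point& b, double q) {
    double sq = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sq += diff * diff;
    }
    return std::exp(-q * sq);
}

// Squared feature-space distance of x from the sphere center:
// 1 - 2 * sum_j beta_j K(sv_j, x) + sum_ij beta_i beta_j K(sv_i, sv_j).
// Only support vectors are passed in, since the other betas are zero.
double squaredDistanceFromCenter(const Point& x, const std::vector<Point>& sv,
                                 const std::vector<double>& beta, double q) {
    double linear = 0.0;
    double quadratic = 0.0;
    for (std::size_t i = 0; i < sv.size(); ++i) {
        linear += beta[i] * gaussianKernel(sv[i], x, q);
        for (std::size_t j = 0; j < sv.size(); ++j) {
            quadratic += beta[i] * beta[j] * gaussianKernel(sv[i], sv[j], q);
        }
    }
    // The leading 1 is K(x, x) for the Gaussian kernel.
    return 1.0 - 2.0 * linear + quadratic;
}
```

The quadratic term does not depend on x, which is presumably why the library computes it separately in calculateQuadraticPartOfDistanceFromCenter(). In a symmetric toy case with two support vectors and equal betas, both support vectors obtain the same value.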

See also:
calculateQuadraticPartOfDistanceFromCenter()
Parameters:
x The point

Definition at line 695 of file SVClustering.cpp.

References damina::SVClustering::beta(), damina::SVClustering::calculateQuadraticPartOfDistanceFromCenter(), damina::OneClassSVM::getModel(), and damina::SVClustering::trainer.

Referenced by damina::SVClustering::calculateSphereRadius(), and damina::SVClustering::separateClusters().

00695                                                                            {
00696                 double sum = 0;
00697                 
00698                 struct svm_node **SV = this->trainer->getModel()->SV;
00699                 int nSV = this->trainer->getModel()->l;
00700                 
00701                 
00702                 for (int i = 0; i < nSV; i++) {
00703                         sum += beta(i) * kernelfunction(SV[i], x, this->trainer->getModel()->param);
00704                 }
00705                 
00706                 
00707                 calculateQuadraticPartOfDistanceFromCenter();
00708                 
00709                 // 1 is because (in case of RBF kernel used here) K(x,x) is equal to 1 
00710                 // (theoretically provable and practically tested with libsvm) 
00711                 //return sqrt(1 - (2 * sum) + *(this->quadratic));
00712                 return 1 - (2 * sum) + *(this->quadratic);
00713         }

void damina::SVClustering::calculateSphereRadius (  )  [protected, virtual, inherited]

Calculate the sphere radius.

Reference:

A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.

Equation 14 in the paper above gives the radius of the sphere.

As suggested in

J. Yang, V. Estivill-Castro, and S. K. Chalup, "Support vector clustering through proximity graph modelling," in Neural Information Processing, 2002. ICONIP ‘02. Proceedings of the 9th International Conference on, 2002, pp. 898-903.

we can use the average of the SVs' distances from the center to calculate the radius.

However, in theory all the SVs lie at the same distance from the center, and our tests found that the distance of a single SV approximates the radius as well as the aforementioned average does.

This avoids computing the distance for every SV.

Definition at line 741 of file SVClustering.cpp.

References damina::SVClustering::calculateDistanceFromCenter(), and damina::SVClustering::sphereRadius.

Referenced by damina::SVClustering::getSphereRadius().

00741                                                  {
00742                 
00743                 // first technique
00744                 //
00745                 // assuming SVs have all the same distance from center (theoretically a correct assumption)
00746                 // to avoid a check on the distance of all SVs
00747                 //
00748                 this->sphereRadius = this->calculateDistanceFromCenter(this->trainer->getModel()->SV[0]);
00749                 
00750                 //if (this->sphereRadius >= 1) {
00751                 //      free(quadratic);
00752                 //      this->sphereRadius = this->calculateDistanceFromCenter(this->trainer->getModel()->SV[0]);
00753                 //}
00754 
00755                 // second technique
00756                 //
00757                 // we can use the average among SVs' distances to calculate the radius
00758                 
00759                 //int nSV = this->trainer->getModel()->l;
00760                 //this->sphereRadius = 0;
00761                 //for (int i = 0; i < nSV; i++) {
00762                 //      this->sphereRadius += this->calculateDistanceFromCenter(this->trainer->getModel()->SV[i]);                      
00763                 //}
00764                 //this->sphereRadius = this->sphereRadius / nSV;
00765 
00766                 //      third technique
00767                 //      Using One Class SVM, Radius = sqrt(1 - rho/2)
00768                 //sphereRadius = sqrt(1 - (rho()/2));
00769         }

void damina::CCLSVClustering::calculateZeta (  )  [private]

Calculate the Zeta, i.e. the radius of the SV-centered spheres in data space, corresponding to Support Vector Cones in feature space, as described in

S. Lee and K. M. Daniels, "Cone Cluster Labeling for Support Vector Clustering," in Proceedings of 6th SIAM Conference on Data Mining, 2006, pp. 484-488.
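
The listing of calculateZeta() computes sqrt(-ln(sqrt(1 - R)) / q), where R is the value returned by getSphereRadius() and q is the kernel width. The sketch below restates that arithmetic as a free function. The name zetaFromSphereRadius is hypothetical, and reading R as an already squared radius is an assumption based on the commented-out 1 - R*R variant in the listing.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical restatement of the arithmetic in calculateZeta():
// zeta = sqrt(-ln(sqrt(1 - r)) / q).
// r is assumed to be the squared sphere radius, q the Gaussian kernel width.
double zetaFromSphereRadius(double r, double q) {
    if (r < 0.0 || r >= 1.0 || q <= 0.0) {
        throw std::invalid_argument("need 0 <= r < 1 and q > 0");
    }
    const double root = std::sqrt(1.0 - r);
    return std::sqrt(-std::log(root) / q);
}
```

The guard matters in practice: for r >= 1 the logarithm is undefined or the outer square root receives a negative argument, so a caller would get NaN instead of a usable radius.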

Definition at line 108 of file CCLSVClustering.cpp.

References damina::SVClustering::getKernelWidth(), damina::SVClustering::getSphereRadius(), and zeta.

Referenced by experimentalSeparateClusters(), and separateClusters().

00108                                             {
00109                 double R = getSphereRadius();
00110                 double q = getKernelWidth();
00111                 
00112                 //double sqrt_1_R = sqrt(1 - R*R);
00113                 double sqrt_1_R = sqrt(1 - R);
00114                 double ln = log(sqrt_1_R);
00115 
00116                 this->zeta = sqrt(-(ln/q));
00117                 //this->zeta = -(ln/q);
00118         }

void damina::CCLSVClustering::clusterize (  )  [virtual]

Run the clustering process.

In this case it runs the Step 2 of Support Vector Clustering, called "Cluster Labeling" by its authors.

The implementation reflects the solution proposed in

S. Lee and K. M. Daniels, "Cone Cluster Labeling for Support Vector Clustering," in Proceedings of 6th SIAM Conference on Data Mining, 2006, pp. 484-488.

S. Lee and K. M. Daniels, "Gaussian Kernel Width Selection and Fast Cluster Labeling for Support Vector Clustering," Department of Computer Science, University of Massachusetts Lowell, 2005.

namely, Cone Cluster Labeling.
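
The labeling idea can be sketched independently of the library. The code below is an illustrative sketch, not the body of separateClusters(): it follows the comments in the experimentalSeparateClusters() listing (two SVs are connected when their radius-zeta spheres overlap, and every other point takes the label of its nearest SV), but uses plain Euclidean distances and a union-find structure in place of the graph library. All names (Point, euclidean, findRoot, coneClusterLabels) are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

using Point = std::vector<double>;

double euclidean(const Point& a, const Point& b) {
    double sq = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        sq += (a[d] - b[d]) * (a[d] - b[d]);
    }
    return std::sqrt(sq);
}

// Union-find root lookup with path halving.
std::size_t findRoot(std::vector<std::size_t>& parent, std::size_t i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Labels every point with the connected component of its nearest SV.
// svIndex holds the positions of the (non-bounded) SVs inside points.
std::vector<int> coneClusterLabels(const std::vector<Point>& points,
                                   const std::vector<std::size_t>& svIndex,
                                   double zeta) {
    const std::size_t nSV = svIndex.size();
    std::vector<std::size_t> parent(nSV);
    std::iota(parent.begin(), parent.end(), std::size_t{0});

    // Two SVs are connected when their radius-zeta spheres overlap.
    for (std::size_t i = 0; i < nSV; ++i) {
        for (std::size_t j = i + 1; j < nSV; ++j) {
            if (euclidean(points[svIndex[i]], points[svIndex[j]]) <= 2.0 * zeta) {
                const std::size_t ri = findRoot(parent, i);
                const std::size_t rj = findRoot(parent, j);
                parent[ri] = rj;
            }
        }
    }

    // Turn component roots into consecutive labels 0, 1, 2, ...
    std::vector<int> rootLabel(nSV, -1);
    std::vector<int> svLabel(nSV);
    int next = 0;
    for (std::size_t i = 0; i < nSV; ++i) {
        const std::size_t r = findRoot(parent, i);
        if (rootLabel[r] < 0) rootLabel[r] = next++;
        svLabel[i] = rootLabel[r];
    }

    // Every point inherits the label of its nearest SV (-1 if there is no SV).
    std::vector<int> labels(points.size(), -1);
    for (std::size_t p = 0; p < points.size(); ++p) {
        double best = std::numeric_limits<double>::max();
        for (std::size_t j = 0; j < nSV; ++j) {
            const double d = euclidean(points[svIndex[j]], points[p]);
            if (d < best) {
                best = d;
                labels[p] = svLabel[j];
            }
        }
    }
    return labels;
}
```

Union-find keeps the sketch free of external dependencies. The listings on this page use an adjacency graph and connected_components instead, which yields the same partition of the SVs.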

Reimplemented from damina::SVClustering.

Definition at line 92 of file CCLSVClustering.cpp.

References separateClusters().

00092                                          {
00093                 this->separateClusters();
00094                 //this->experimentalSeparateClusters();
00095         }

void damina::CCLSVClustering::clusterizeBSVs (  )  [inline, private]

Definition at line 81 of file CCLSVClustering.h.

00081 {};

void damina::CCLSVClustering::experimentalSeparateClusters (  )  [protected, virtual]

Definition at line 277 of file CCLSVClustering.cpp.

References damina::SVClustering::beta(), damina::SVClustering::C_estimation, damina::EuclideanDistance::calculateDistance(), calculateZeta(), damina::SVClustering::eucDist, damina::SVClustering::getKernelWidth(), damina::OneClassSVM::getModel(), damina::SVClustering::getParameters(), damina::OneClassSVM::getProblem(), damina::SVClustering::getSoftConstraint(), damina::SVClustering::labels, damina::SVClustering::trainer, damina::SVClustering::usingSpecialClusteringPolicyForBSV(), and zeta.

00277                                                            {
00278                 // compute Zeta
00279                 calculateZeta();
00280 
00281                 double ker_zeta = exp(-1 * getKernelWidth() * zeta);
00282                 
00283                 int i,j;                //      to run over the SVs 
00284                         
00285                 int nSV = this->trainer->getModel()->l - this->trainer->getModel()->lbounded;
00286                 int *SV_index = this->trainer->getModel()->SV_index;
00287                 
00288                 
00289                 /*      original dataset */
00290                 struct svm_node **x = this->trainer->getProblem()->x;
00291                 double dist;
00292                 
00293                 
00294                 /* arithmetic mean of all betas */
00295                 double betas_average = 0;
00296                 double beta_max = -1;
00297                 
00298                 /*      adj matrix only for real SVs */
00299                 UndirectedGraph adj(nSV);
00300                 for (i = 0; i < nSV; i++) {
00301                         betas_average += beta(i, false);                                                        //      sum all betas
00302                         if (beta(i, false) > beta_max) beta_max = beta(i, false);       //      max beta
00303 
00304                         for (j = 0; j < nSV; j++) {
00305                                 
00306                                 //      trivial: each node is connected to itself
00307                                 if (i == j) {
00308                                         add_edge(i, j, adj);
00309                                         continue;
00310                                 }
00311                                 
00312                                 // avoid setting the same edge more than once
00313                                 if (edge(i, j, adj).second) continue;
00314                                 
00315                                 
00316                                 //dist = eucDist->calculateDistance(x[SV_index[i]], x[SV_index[j]]);
00317                                 dist = 2 - 2 * kernelfunction(x[SV_index[i]], x[SV_index[j]], *(getParameters()));
00318 
00319                                 //      zeta is the radius of the hyperspheres in data space centered in the SVs
00320                                 //      (all hyperspheres have the same radius)
00321                                 //      Two SVs are connected if their hyperspheres overlap, i.e. if the 
00322                                 //      distance between two SVs is at most 2*zeta (overlap in 1 point)
00323                                 if (dist <=  2 * ker_zeta) {              
00324                                         add_edge(i, j, adj);
00325                                         add_edge(j, i, adj);
00326                                 }
00327                                 else {
00328                                         remove_edge(i, j, adj);
00329                                         remove_edge(j, i, adj);
00330                                 }
00331                         }
00332                 }
00333                 betas_average = betas_average / nSV;    //      divide the sum of all betas by number of betas => arithmetic mean
00334                 
00335                 
00336                 // clustering SVs finding connected components of the adj matrix just built
00337                 std::vector<int> svLabels(nSV);
00338                 connected_components(adj, &svLabels[0]);
00339                 
00340                 
00341                 // total points
00342                 int N = this->trainer->getProblem()->l;
00343                 
00344                 // total labels
00345                 labels.resize(N);
00346 
00347                 double mindist;
00348                 int minsvindex;
00349                 
00350                 C_estimation = 10.0 * getSoftConstraint() / N;  
00351 
00352                 // clustering points other than SVs
00353                 // for each point in the input data set 
00354                 for (i = 0; i < N; i++) {
00355                         mindist = MAXFLOAT; 
00356                         minsvindex = -1;
00357                         for (j = 0; j < nSV; j++) {     // we find the nearest SV
00358                         
00359                                 if (i == SV_index[j]) { //      trivial if the point is the current SV
00360                                         minsvindex = j;
00361                                         break;
00362                                 }
00363                                 
00364                                 //dist = eucDist->calculateDistance(x[SV_index[j]], x[i]);
00365 
00366                                 dist = 2 - 2 * kernelfunction(x[SV_index[j]], x[i], *(getParameters()));
00367 
00368                                 //if (dist < mindist) { //      if the distance between a point and an SV is less or equal than zeta
00369                                                                                 //      and it is also less than the last distance found 
00370                                 if ((dist < mindist) && (dist <= ker_zeta)) {
00371                                         mindist = dist;
00372                                         minsvindex = j;
00373                                 }  
00374                         }
00375 
00376                         //labels[i] = svLabels[minsvindex];
00377                         if (minsvindex == -1) {
00378                                 mindist = MAXFLOAT; 
00379                                 minsvindex = -1;
00380                                 for (j = 0; j < N; j++) {
00381                                         if (i != j) {
00382                                                 dist = 2 - 2 * kernelfunction(x[i], x[j], *(getParameters()));
00383                                                 //dist = eucDist->calculateDistance(x[j], x[i]);
00384                                                 if (dist < mindist) {
00385                                                         mindist = dist;
00386                                                         minsvindex = j;
00387                                                 }
00388                                         }       
00389                                 }
00390                                 labels[i] = labels[minsvindex];
00391                         }
00392                         else {
00393                                 labels[i] = svLabels[minsvindex];
00394                         }
00395                 }
00396 
00397                 if (usingSpecialClusteringPolicyForBSV()) {
00398                                 // try to cluster BSV points
00399                                 int nBSV = this->trainer->getModel()->lbounded;
00400                                 int *BSV_index = this->trainer->getModel()->BSV_index;
00401                                 
00402                                 for (i = 0; i < nBSV; i++) {
00403                                         mindist = MAXFLOAT; 
00404                                         minsvindex = 0;
00405                                         for (j = 0; j < N; j++) {
00406                                                 if (j == BSV_index[i]) continue;        //      ignore if the point is the current BSV
00407                                                 if (beta(j, true) == 1) continue;       //      ignore all BSVs
00408                                                 dist = eucDist->calculateDistance(x[BSV_index[i]],x[j]);
00409                                                 if (dist < mindist) {
00410                                                         mindist = dist;
00411                                                         minsvindex = j;
00412                                                 }
00413                                         }
00414                                         labels[BSV_index[i]] = labels[minsvindex];
00415                                 }
00416                 }
00417         
00418         }

std::vector< int > & damina::SVClustering::getClustersAssignment (  )  [virtual, inherited]

Returns the clustering assignments.

The i-th position holds the cluster which contains the i-th input data point.

Returns:
A vector encapsulating the clustering assignments

Definition at line 387 of file SVClustering.cpp.

References damina::SVClustering::labels.

Referenced by countRightAndWrongClassificationsPerClass(), and printAllPoints().

00387                                                             {
00388                 return labels;
00389         }

int damina::SVClustering::getKernelType (  )  [virtual, inherited]

Returns the current kernel type.

Returns:
The kernel type

Definition at line 326 of file SVClustering.cpp.

References damina::AbstractSVM::getKernel(), and damina::SVClustering::trainer.

00326                                         {
00327                 return this->trainer->getKernel();
00328         }

double damina::SVClustering::getKernelWidth (  )  [virtual, inherited]

Returns the current value of the Kernel Width.

Returns:
The current value of the Gaussian Kernel Width

Definition at line 302 of file SVClustering.cpp.

References damina::AbstractSVM::getKernelWidth(), and damina::SVClustering::trainer.

Referenced by calculateZeta(), and experimentalSeparateClusters().

00302                                             {
00303                 return this->trainer->getKernelWidth();
00304         }

struct svm_model * damina::SVClustering::getModel (  )  [read, virtual, inherited]

Returns the trained model.

Returns:
The trained model, with objective function, lagrangians, etc.

Definition at line 241 of file SVClustering.cpp.

References damina::OneClassSVM::getModel(), and damina::SVClustering::trainer.

Referenced by damina::GKWGenerator::getNextKernelWidthValue().

00241                                                  {
00242                 return this->trainer->getModel();
00243         }

unsigned long int damina::SVClustering::getNumberOfClusters (  )  [virtual, inherited]

Returns the number of clusters found.

Returns:
An integer value which is the number of clusters found

Definition at line 434 of file SVClustering.cpp.

References damina::SVClustering::labels, damina::SVClustering::numberOfClusters, and damina::SVClustering::pointPerCluster.

00434                                                             {
00435                 
00436                 if (numberOfClusters > -1)
00437                         return numberOfClusters;
00438                 
00439                 // calculating points per cluster and number of clusters detected
00440                 
00441                 // theoretically impossible
00442                 // here to detect some programming errors
00443                 if (labels.size() == 0) {
00444                         this->numberOfClusters = 0;
00445                         return numberOfClusters;
00446                 }
00447                 
00448                 
00449                 std::vector<int>::size_type idx;
00450                 this->pointPerCluster.push_back(0);
00451                 this->numberOfClusters = 1;
00452         for (idx = 0; idx < labels.size(); idx++) {
00453                         if ((int)(pointPerCluster.size() - 1) < labels[idx]) {
00454                         this->numberOfClusters++;
00455                         this->pointPerCluster.push_back(1);
00456                 }
00457                 else {
00458                         this->pointPerCluster[labels[idx]]++;
00459                 }
00460         }
00461 
00462                 return numberOfClusters;
00463         }

unsigned long int damina::SVClustering::getNumberOfNonSingletonClusters (  )  [virtual, inherited]

Returns the number of clusters containing more than one point.

Returns:
An integer value which is the number of non-singleton clusters

Definition at line 470 of file SVClustering.cpp.

References damina::SVClustering::getNumberOfValidClusters().

00470                                                                         {
00471                 return getNumberOfValidClusters(2);
00472         }

unsigned long int damina::SVClustering::getNumberOfValidClusters ( unsigned long int  threshold  )  [virtual, inherited]

Returns the number of clusters with a number of elements greater than or equal to a certain threshold.

Parameters:
threshold An integer value representing the minimum number of elements for a valid cluster
Returns:
An integer value which is the number of clusters requested
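
The counting can be illustrated on a bare label vector. This is a hedged sketch, not the library implementation: pointsPerCluster and countValidClusters are hypothetical free functions, and the labels are assumed to be non-negative cluster indices such as those returned by getClustersAssignment().

```cpp
#include <cstddef>
#include <vector>

// Number of points carrying each label; labels are assumed to be
// non-negative cluster indices, and negative entries are skipped.
std::vector<unsigned long> pointsPerCluster(const std::vector<int>& labels) {
    std::vector<unsigned long> counts;
    for (int label : labels) {
        if (label < 0) continue;
        const std::size_t idx = static_cast<std::size_t>(label);
        if (idx >= counts.size()) {
            counts.resize(idx + 1, 0);
        }
        ++counts[idx];
    }
    return counts;
}

// Number of clusters holding at least `threshold` points.
unsigned long countValidClusters(const std::vector<int>& labels,
                                 unsigned long threshold) {
    unsigned long valid = 0;
    for (unsigned long c : pointsPerCluster(labels)) {
        if (c >= threshold) ++valid;
    }
    return valid;
}
```

For the labels {0, 0, 1, 2, 2, 2}, a threshold of 2 counts two clusters, which corresponds to what getNumberOfNonSingletonClusters() reports by calling getNumberOfValidClusters(2).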

Definition at line 480 of file SVClustering.cpp.

References damina::SVClustering::getPointPerCluster(), and damina::SVClustering::pointPerCluster.

Referenced by damina::SVClustering::getNumberOfNonSingletonClusters().

00480                                                                                             {
00481                 getPointPerCluster();
00482                 unsigned long int counter = 0;
00483                 for (unsigned long int i = 0; i < pointPerCluster.size(); i++) {
00484                         if (pointPerCluster[i] >= threshold) {
00485                                 counter++;
00486