Thesis writing – Eighth draft 24/12

Eighth draft of the thesis. Contents are (new contents in bold):

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 5: Minimum Bregman Information principle for Co-clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 8: Support Vector Clustering software development
  • Chapter 9: Experiments (Incomplete, only two experimental stages out of 5)
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Resources usage of the algorithms
  • Appendix C: Thesis Web Log
  • Bibliography

Downloads

Changelog download
Thesis download

Thesis writing – Seventh draft 20/12

Seventh draft of the thesis. Contents are (new contents in bold):

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 5: Minimum Bregman Information principle for Co-clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 8: Support Vector Clustering software development
  • Chapter 9: Experiments (Incomplete, only two experimental stages out of 5)
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Time and space consumption (to be completed)
  • Appendix C: Thesis Web Log
  • Bibliography

Downloads

Changelog download
Thesis download

Co-clustering software

The first co-clustering software is Co-cluster, developed at the University of Texas at Austin. The version you can download here is 1.1, which is also available from the original web page.

The package hosted here includes a patch that allows the software to compile with gcc 4.0 as well, and therefore on modern Linux and Mac OS X systems. It also contains some bash scripts (*.sh) that analyze co-clustering results and produce clustering quality measures with respect to labeled datasets.

The original software is released under the GPL license, and so is this package.

Download

Co-clustering code


The original version of the second co-clustering software is available here. It implements all six approximation schemes for co-clustering, both for the Euclidean distance and for the I-divergence.

The package hosted here also includes the same bash scripts that ship with the Co-cluster package described above.

No license information was included in the original Bregman co-clustering package, but it appears to be a fork of the Co-cluster software v. 1.0. The latter was released under the GPL license, so the Bregman co-clustering code should be under the same license.

Download

Bregman Co-clustering code

Support Vector Clustering Code

UPDATE 18th of Feb, 2008: the official page of this software is now located at my official website.


Here is the preliminary alpha source code for Support Vector Clustering. It implements Cone Cluster Labeling for the cluster assignment part.

It also implements the Secant-like kernel width generator.

SVM training is performed by means of the LIBSVM library, whereas the graph utilities are provided by the Boost Graph Library. Both libraries allow redistribution of their source code under certain license terms, so the package you download contains everything you need to compile the code: just type “make” in the source root directory.

For more information, take a look at the README directory you will find once you have unpacked the tarball.

Download

SVC Source Code
SVC Doxygen documentation



Thesis writing – Sixth draft 14/11

Sixth draft of the thesis. Contents are (new contents in bold):

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 5: Minimum Bregman Information principle for Co-clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 9: Support Vector Clustering software development
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Thesis Web Log
  • Bibliography

Downloads

Changelog download
Thesis download

Thesis writing – Fifth draft 06/11

Fifth draft of the thesis. Contents are:

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 9: Support Vector Clustering software development
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Thesis Web Log
  • Bibliography

Downloads

Changelog download
Thesis download

Bregman divergences, SVMs and possible implications

In order to find a connection between the works studied (Bregman Co-clustering and Support Vector Clustering), we did some research. One interesting result is the following paper:

The above paper generalizes the Minimum Enclosing Ball (MEB) problem to Bregman divergences and also provides a generalization of the Bâdoiu-Clarkson (BC) approximation algorithm. This is the same algorithm used in practice by Core Vector Machines (CVMs).

CVMs reformulate SVM training as an MEB problem. Since they use the BC algorithm, and that algorithm has been generalized to Bregman divergences, this line of research could have interesting implications for vector machines.
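
To make the idea concrete, here is a minimal sketch of the simple Bâdoiu-Clarkson iteration for the ordinary Euclidean MEB: start from any point and repeatedly move the center a shrinking step toward the current farthest point. This is only an illustration under my own assumptions. The function name, the iteration count and the toy points are hypothetical, and it covers neither the Bregman generalization from the paper nor the core-set machinery used by CVMs.

```python
import math


def badoiu_clarkson_meb(points, iterations=1000):
    """Approximate the Euclidean minimum enclosing ball of `points`.

    Simple Badoiu-Clarkson update: at step t, move the center toward
    the farthest point by a fraction 1 / (t + 1) of the gap.
    """
    center = list(points[0])  # any input point is a valid start
    for t in range(1, iterations + 1):
        # Farthest point from the current center.
        farthest = max(points, key=lambda p: math.dist(p, center))
        step = 1.0 / (t + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    # Radius needed so that the ball around `center` encloses every point.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


# Toy data (hypothetical): the exact MEB has center (1, 0) and radius 1.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
center, radius = badoiu_clarkson_meb(points)
```

The accuracy improves with the number of iterations: roughly 1/ε² steps give a ball whose radius is within a factor (1 + ε) of the optimum, independently of the dimension, which is what makes the scheme attractive for kernel methods.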

[OT] Star/galaxy separation via SVM/CVM classification – Part 2

This is a modification of the experiments in this post.

I quickly built a new training set, and this time I use only this set to train the SVM/CVM. Then I test the newly trained classifier on all three datasets from the previous post.

The training set contains 500 points and was built using stars and galaxies from another portion of the sky.

New accuracy results (SVM)

Longo 01: 95.96%
Longo 02: 98.08%
Longo 03: 97.956%

New accuracy results (CVM)

Longo 01: 96.31%
Longo 02: 97.67%
Longo 03: 97.138%

Let us consider Longo 02 tested with the CVM. We have:

Completeness for Stars: 98.4%
Contamination for Stars: 4.7%

Completeness for Galaxies: 95.4%
Contamination for Galaxies: 1.5%
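
For readers unfamiliar with these two measures, here is a small sketch of how they can be computed from true and predicted labels. It assumes the usual definitions: completeness is the fraction of objects of a class that the classifier recovers, and contamination is the fraction of objects assigned to a class that actually belong to another one. The function name and the toy labels are hypothetical and are not the data behind the figures above.

```python
def completeness_contamination(true_labels, predicted_labels, target):
    """Return (completeness, contamination) for the class `target`."""
    pairs = list(zip(true_labels, predicted_labels))
    # Objects of the target class that were also predicted as such.
    correct = sum(1 for t, p in pairs if t == target and p == target)
    actual = sum(1 for t, _ in pairs if t == target)     # truly in the class
    assigned = sum(1 for _, p in pairs if p == target)   # predicted in the class
    if actual == 0 or assigned == 0:
        raise ValueError("class is absent from the true or predicted labels")
    completeness = correct / actual
    contamination = (assigned - correct) / assigned
    return completeness, contamination


# Toy labels (hypothetical), only to show the call.
true_labels = ["star", "star", "star", "galaxy", "galaxy"]
predicted_labels = ["star", "star", "galaxy", "star", "galaxy"]
star_completeness, star_contamination = completeness_contamination(
    true_labels, predicted_labels, "star"
)
```

In this toy case two of the three true stars are recovered, and one of the three objects labeled as stars is really a galaxy.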

Thesis writing – Fourth draft 01/11

UPDATED: Chapter 10 added

Fourth draft of the thesis. Contents are:

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Thesis Web Log
  • Bibliography

Downloads

Changelog download
Thesis download