Thesis and Talk

by Vincenzo Russo on April 3, 2008

This post arrives a few months after the previous one, to highlight the download links for both the thesis and the final talk.

IMPORTANT: this is the last post I will publish on this blog, because the thesis is complete. You can reach me on my professional blog.

{ 0 comments }

Final Mark

by Vincenzo Russo on February 6, 2008

Master's Degree Thesis

Master's degree received.
Final mark: 110/110 Cum Laude.

{ 3 comments }

Final talk

by Vincenzo Russo on February 3, 2008

The definitive draft of the final talk (in Italian) is available for download.

Download Compressed (bzip2) PDF (4.7MB)

Download PDF (6.8MB)

{ 0 comments }

Thesis - Final Draft

by Vincenzo Russo on January 28, 2008

The final draft of the thesis is available for download here.

Final contents are:

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 5: Minimum Bregman Information principle for Co-clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 8: Support Vector Clustering software development
  • Chapter 9: Experiments
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Resource usage of the algorithms
  • Appendix C: Star/Galaxy separation via Support Vector Machines
  • Appendix D: Thesis Web Log
  • Bibliography

IMPORTANT: the file name has changed, so the links in the previous posts are broken. Download the thesis from this post or from the Documents page.

Downloads

Changelog download - Thesis download

{ 0 comments }

Thesis Writing - RC2 Draft 19/01

by Vincenzo Russo on January 19, 2008

RC2 draft of the thesis. The contents are complete and have been read by the supervisor, Prof. Anna Corazza.

Downloads

Changelog download - Thesis download

{ 0 comments }

Thesis Writing - RC1 Draft 16/01

by Vincenzo Russo on January 17, 2008

RC1 draft of the thesis. The contents are:

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 5: Minimum Bregman Information principle for Co-clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 8: Support Vector Clustering software development
  • Chapter 9: Experiments (only overall conclusion missing)
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Resources usage of the algorithms
  • Appendix C: Star/Galaxy separation via Support Vector Machines
  • Appendix D: Thesis Web Log
  • Bibliography

Downloads

Changelog download - Thesis download

{ 0 comments }

SVC and different kernels

by Vincenzo Russo on January 10, 2008

Our experiments showed more than once that using kernels other than the Gaussian one can significantly improve the results in certain circumstances.

From our experiments we know that:

  • The Laplacian Kernel works well on some scaled/normalized data.
  • The Exponential Kernel generally behaves the same as the Gaussian one, but in some situations it makes a difference, as happened in the experiments with the IRIS data (multivariate) or the CLASSIC3 data (text documents in a bag-of-words (BOW) model with TF-IDF encoding).

These results suggest going deeper into the matter and exploring other kernels that can be useful in clustering with SVC.
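
As a rough illustration of how the three kernels mentioned above differ, here is a minimal Python sketch. The post does not state the formulas, so the definitions below are assumptions based on common conventions: the Gaussian kernel uses the squared Euclidean distance, the Exponential kernel uses the plain Euclidean distance, and the Laplacian kernel uses the L1 (Manhattan) distance. The width parameter `q` and all function names are hypothetical, chosen only for this example.

```python
import math

def sq_euclidean(x, y):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((a - b) ** 2 for a, b in zip(x, y))

def gaussian_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||^2)
    return math.exp(-q * sq_euclidean(x, y))

def exponential_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||), Euclidean norm, not squared.
    return math.exp(-q * math.sqrt(sq_euclidean(x, y)))

def laplacian_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||_1), L1 norm.
    return math.exp(-q * sum(abs(a - b) for a, b in zip(x, y)))

if __name__ == "__main__":
    x, y = [0.0, 0.0], [0.3, 0.4]
    for kernel in (gaussian_kernel, exponential_kernel, laplacian_kernel):
        print(kernel.__name__, kernel(x, y))
```

Under these assumed definitions, the Exponential and Laplacian kernels decay more slowly than the Gaussian one for distant points, which is one plausible reason they can behave differently on some data sets.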

{ 0 comments }

Knowledge Discovery portal

by Vincenzo Russo on January 9, 2008

KDnuggets.com (KD stands for Knowledge Discovery) is the leading source of information on Data Mining, Web Mining, Knowledge Discovery, and Decision Support Topics, including News, Software, Solutions, Companies, Jobs, Courses, Meetings, Publications, and more.

Go to KDnuggets.com

{ 0 comments }

Creating Vector Models from Text Documents

by Vincenzo Russo on January 7, 2008

MC is a C++ program that creates vector-space models from text documents that can be used for text mining applications. MC provides an efficient multi-threaded implementation that can process very large document collections. For example, MC took 1,189 seconds using only 17.5 MBytes of main memory to process a sample collection of about 114,000 documents (the experiment was run on a Sun Ultra10 workstation). More details on MC and its use in a fast clustering algorithm are available in this paper.
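
To give an idea of what a vector-space model of text looks like, here is a small Python sketch that turns a few documents into TF-IDF vectors. It is not MC's implementation and does not reproduce its options or weighting scheme: the whitespace tokenizer, the raw term frequency, the `log(N / df)` inverse document frequency, and the function names are all assumptions made for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()

def tfidf_vectors(docs):
    # Build one TF-IDF vector per document over a shared, sorted vocabulary.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)  # raw term counts; missing terms count as 0
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors

if __name__ == "__main__":
    vocab, vectors = tfidf_vectors([
        "support vector clustering",
        "support vector machines",
        "text clustering",
    ])
    print(vocab)
    for vec in vectors:
        print(vec)
```

Each document becomes a row of weights, one per vocabulary term; terms that occur in every document get weight zero, and a clustering algorithm can then operate on these rows.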

Download

{ 0 comments }

Thesis Writing - Pre-final Draft 06/01

by Vincenzo Russo on January 7, 2008

Pre-final draft of the thesis. The contents are (new and updated contents in bold):

  • Chapter 1: Introduction
  • Chapter 2: Machine learning essentials
  • Chapter 3: Clustering and related issues
  • Chapter 4: Previous works on clustering
  • Chapter 5: Minimum Bregman Information principle for Co-clustering
  • Chapter 6: Support Vector Clustering
  • Chapter 7: Alternative Support Vector Methods for Clustering
  • Chapter 8: Support Vector Clustering software development
  • Chapter 9: Experiments (Incomplete, Text Clustering results missing)
  • Chapter 10: Conclusion and Future Work
  • Appendix A: One Class classification via Support Vector Machines
  • Appendix B: Resources usage of the algorithms
  • Appendix C: Star/Galaxy separation via Support Vector Machines
  • Appendix D: Thesis Web Log
  • Bibliography

Downloads

Changelog download - Thesis download

{ 0 comments }