28 Jan, 2008
The final draft of the thesis is available for download here.
Final contents are:
- Chapter 1: Introduction
- Chapter 2: Machine learning essentials
- Chapter 3: Clustering and related issues
- Chapter 4: Previous works on clustering
- Chapter 5: Minimum Bregman Information principle for Co-clustering
- Chapter 6: Support Vector Clustering
- Chapter 7: Alternative Support Vector Methods for Clustering
- Chapter 8: Support Vector Clustering software development
- Chapter 9: Experiments
- Chapter 10: Conclusion and Future Work
- Appendix A: One Class classification via Support Vector Machines
- Appendix B: Resource usage of the algorithms
- Appendix C: Star/Galaxy separation via Support Vector Machines
- Appendix D: Thesis Web Log
- Bibliography
IMPORTANT: the file name has changed, so the links in the previous posts are broken. Download the thesis from this post or from the Documents page.
Downloads
Changelog download - Thesis download
19 Jan, 2008
RC2 draft of the thesis. The contents are complete and have been read by the supervisor, Prof. Anna Corazza.
Downloads
Changelog download - Thesis download
17 Jan, 2008
RC1 draft of the thesis. Contents are:
- Chapter 1: Introduction
- Chapter 2: Machine learning essentials
- Chapter 3: Clustering and related issues
- Chapter 4: Previous works on clustering
- Chapter 5: Minimum Bregman Information principle for Co-clustering
- Chapter 6: Support Vector Clustering
- Chapter 7: Alternative Support Vector Methods for Clustering
- Chapter 8: Support Vector Clustering software development
- Chapter 9: Experiments (only overall conclusion missing)
- Chapter 10: Conclusion and Future Work
- Appendix A: One Class classification via Support Vector Machines
- Appendix B: Resource usage of the algorithms
- Appendix C: Star/Galaxy separation via Support Vector Machines
- Appendix D: Thesis Web Log
- Bibliography
Downloads
Changelog download - Thesis download
10 Jan, 2008
Our experiments showed more than once that using kernels other than the Gaussian one can significantly improve the results in certain circumstances.
From our experiments we know that:
- The Laplacian Kernel works well on some scaled/normalized data.
- The Exponential Kernel generally behaves the same as the Gaussian one, but in some situations it makes a difference, as happened in the experiments with the IRIS data (multivariate) and the CLASSIC3 data (text documents in the BOW model with TF-IDF encoding).
These results suggest going deeper into the matter and exploring other kernels that could be useful for clustering with SVC.
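To make the comparison concrete, here is a minimal sketch of the three kernels in Python. The exact definitions are an assumption, since the names vary across the literature: here the Gaussian kernel uses the squared Euclidean distance, the Exponential kernel the plain Euclidean distance, and the Laplacian kernel the L1 distance. The function names and the width parameter `q` are illustrative and are not taken from the thesis software.

```python
import math


def sq_euclidean(x, y):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gaussian_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||^2)
    return math.exp(-q * sq_euclidean(x, y))


def exponential_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||), Euclidean norm without the square
    return math.exp(-q * math.sqrt(sq_euclidean(x, y)))


def laplacian_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||_1), L1 (Manhattan) norm
    return math.exp(-q * sum(abs(a - b) for a, b in zip(x, y)))


def kernel_matrix(points, kernel, q=1.0):
    """Pairwise kernel values for a list of points, as a list of rows."""
    return [[kernel(p, r, q) for r in points] for p in points]


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
    for k in (gaussian_kernel, exponential_kernel, laplacian_kernel):
        print(k.__name__)
        for row in kernel_matrix(data, k, q=0.1):
            print("  ", ["%.4f" % v for v in row])
```

Under these assumed definitions, all three kernels equal 1 when the two points coincide and decay with distance; they differ only in how fast they decay, which is one plausible reason why they behave differently on scaled or sparse data.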
9 Jan, 2008
KDnuggets.com (KD stands for Knowledge Discovery) is the leading source of information on Data Mining, Web Mining, Knowledge Discovery, and Decision Support Topics, including News, Software, Solutions, Companies, Jobs, Courses, Meetings, Publications, and more.
Go to KDnuggets.com
7 Jan, 2008
MC is a C++ program that creates vector-space models from text documents that can be used for text mining applications. MC provides an efficient multi-threaded implementation that can process very large document collections. For example, MC took 1,189 seconds using only 17.5 MBytes of main memory to process a sample collection of about 114,000 documents (the experiment was run on a Sun Ultra10 workstation). More details on MC and its use in a fast clustering algorithm are available in this paper.
Download
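For readers unfamiliar with vector-space models, the following is a small Python sketch of the general idea: each document becomes a vector of term weights over a shared vocabulary. It is not MC's implementation or its weighting scheme. The whitespace tokenizer, the `tf * log(N / df)` weighting, the unit-length normalization, and the name `tfidf_vectors` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of document strings.

    Assumed weighting: term frequency times log(N / document frequency),
    followed by normalization of each document vector to unit length.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}

    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        # Leave all-zero vectors untouched to avoid dividing by zero
        vectors.append([v / norm for v in vec] if norm > 0 else vec)
    return vocab, vectors


if __name__ == "__main__":
    vocab, vectors = tfidf_vectors([
        "support vector clustering",
        "support vector machines",
        "text clustering",
    ])
    print(vocab)
    for vec in vectors:
        print(["%.3f" % v for v in vec])
```

A term that occurs in every document gets weight zero under this scheme, so it does not influence distances between documents.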
7 Jan, 2008
Pre-final draft of the thesis. Contents are (new and updated contents in bold):
- Chapter 1: Introduction
- Chapter 2: Machine learning essentials
- Chapter 3: Clustering and related issues
- Chapter 4: Previous works on clustering
- Chapter 5: Minimum Bregman Information principle for Co-clustering
- Chapter 6: Support Vector Clustering
- Chapter 7: Alternative Support Vector Methods for Clustering
- Chapter 8: Support Vector Clustering software development
- Chapter 9: Experiments (Incomplete, Text Clustering results missing)
- Chapter 10: Conclusion and Future Work
- Appendix A: One Class classification via Support Vector Machines
- Appendix B: Resource usage of the algorithms
- Appendix C: Star/Galaxy separation via Support Vector Machines
- Appendix D: Thesis Web Log
- Bibliography
Downloads
Changelog download - Thesis download