May 19 2007

Support Vector Clustering and One-class classification

One-class classification underlies SVM training in the clustering setting. There are several ways to perform one-class classification (also known as Distribution Estimation, Outlier Detection, Novelty Detection, Concept Learning) with SVMs, such as Schölkopf's nu-SVM or Tax's SVDD:

  • B. Schölkopf, R. C. Williamson, A. J. Smola, J. Shawe-Taylor, and J. Platt, "Support Vector Method for Novelty Detection," in Advances in Neural Information Processing Systems 12: Proceedings of the 1999 Conference, 2000.
    @inproceedings{scholkopf2000,
      author = {B. Sch{\"o}lkopf and R. C. Williamson and A. J. Smola and J. Shawe-Taylor and J. Platt},
      Booktitle = {Advances in Neural Information Processing Systems 12: Proceedings of the 1999 Conference},
      Date-Added = {2007-04-29 16:39:57 +0200},
      Date-Modified = {2007-08-10 14:18:50 +0200},
      Keywords = {SVM, clustering, SMO, one-class, novelty detection},
      Title = {Support Vector Method for Novelty Detection},
      Url = {http://axiom.anu.edu.au/~williams/papers/P126.pdf},
      Year = {2000},
      Bdsk-Url-1 = {http://axiom.anu.edu.au/~williams/papers/P126.pdf}
    }
  • D. M. J. Tax and R. P. W. Duin, "Data Domain Description using Support Vectors," in European Symposium on Artificial Neural Networks, Bruges (Belgium), 1999, pp. 251-256.
    @inproceedings{es1999,
      Address = {Bruges (Belgium)},
      Author = {David M. J. Tax and Robert P. W. Duin},
      Booktitle = {European Symposium on Artificial Neural Networks},
      Date-Added = {2007-05-07 12:55:54 +0200},
      Date-Modified = {2007-06-23 08:24:18 +0200},
      Keywords = {SVM, domain description, SVDD, novelty detection, one-class},
      Month = {April},
      Pages = {251--256},
      Title = {Data Domain Description using Support Vectors},
      Url = {http://www.dice.ucl.ac.be/Proceedings/esann/esannpdf/es1999-458.pdf},
      Year = {1999},
      Bdsk-Url-1 = {http://www.dice.ucl.ac.be/Proceedings/esann/esannpdf/es1999-458.pdf}
    }

As formulated by Ben-Hur, Horn, Siegelmann, and Vapnik in

  • A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.
    @article{svc,
      author = {A. Ben-Hur and D. Horn and H. T. Siegelmann and V. Vapnik},
      Date-Modified = {2007-06-19 14:44:40 +0200},
      Journal = {Journal of Machine Learning Research},
      Keywords = {clustering, SVM, gaussian kernel},
      Pages = {125--137},
      Title = {Support Vector Clustering},
      Url = {http://citeseer.ist.psu.edu/hur01support.html},
      Volume = {2},
      Year = {2001},
      Bdsk-Url-1 = {http://citeseer.ist.psu.edu/hur01support.html}
    }

the SVM training for clustering is performed via SVDD.

In

  • D. M. J. Tax, "One-class classification: concept learning in the absence of counter-examples," PhD thesis, Technische Universiteit Delft, 2001.
    @phdthesis{taxsvdd05,
      author = {David Martinus Johannes Tax},
      Date-Added = {2007-05-19 14:29:53 +0200},
      Date-Modified = {2007-08-10 14:16:46 +0200},
      Keywords = {SVM, domain description, SVDD, novelty detection, one-class},
      School = {Technische Universiteit Delft},
      Title = {One-class classification: concept learning in the absence of counter-examples},
      Url = {http://www.ist.tudelft.nl/live/binaries/468f0bec-405d-4918-aae0-cada843d27f7/doc/thesis_dtax.pdf},
      Year = {2001},
      Bdsk-Url-1 = {http://www.ist.tudelft.nl/live/binaries/468f0bec-405d-4918-aae0-cada843d27f7/doc/thesis_dtax.pdf}
    }

it is shown, however, that with the Gaussian kernel SVDD and nu-SVM yield the same solutions (the same decision surface), provided that both use the same kernel width and that C = 1/(nu*N), where C is the soft-constraint parameter in SVDD, nu is the parameter introduced by Schölkopf for nu-SVMs, and N is the number of objects.
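The relation C = 1/(nu*N) can be written down as a small conversion helper. This is only a sketch: the function names `svdd_c_from_nu` and `nu_from_svdd_c` are hypothetical and do not come from libSVM or from the cited works, and the range check assumes nu lies in (0, 1].

```python
def svdd_c_from_nu(nu: float, n: int) -> float:
    """Return the SVDD soft-constraint parameter C = 1 / (nu * N)."""
    if not 0.0 < nu <= 1.0:
        raise ValueError("nu is assumed to lie in (0, 1]")
    if n <= 0:
        raise ValueError("the number of objects N must be positive")
    return 1.0 / (nu * n)


def nu_from_svdd_c(c: float, n: int) -> float:
    """Invert the same relation: nu = 1 / (C * N)."""
    if c <= 0.0 or n <= 0:
        raise ValueError("C and N must be positive")
    return 1.0 / (c * n)


# Illustrative values only: 200 objects, nu = 0.1.
c = svdd_c_from_nu(0.1, 200)
print(c, nu_from_svdd_c(c, 200))
```

For a fixed nu, the matching C shrinks as the number of objects grows, so a C chosen on one data set would probably need recomputing for a data set of a different size.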

This is why the one-class classification implemented in libSVM

  • C. Chang and C. Lin, LIBSVM: A Library for Support Vector Machines, 2007.
    @misc{libsvm,
      author = {Chih-Chung Chang and Chih-Jen Lin},
      Date-Added = {2007-04-29 15:47:28 +0200},
      Date-Modified = {2007-11-04 17:28:20 +0100},
      Keywords = {svm, software},
      Note = {Manual available at \url{http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf}},
      Title = {LIBSVM: A Library for Support Vector Machines},
      Url = {http://www.csie.ntu.edu.tw/~cjlin/libsvm/},
      Year = {2007},
      Bdsk-Url-1 = {http://www.csie.ntu.edu.tw/~cjlin/libsvm/}
    }

can be used as the training algorithm for clustering with SVMs, since Support Vector Clustering assumes a Gaussian kernel.
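One way to see why the Gaussian kernel matters is that K(x, x) = 1 for every point. The squared feature-space distance from the sphere center then depends on x only through the kernel expansion sum_j beta_j K(x_j, x), which is the same kind of expansion that a nu-SVM decision function thresholds. The sketch below evaluates that distance. The names `gaussian_kernel`, `squared_distance_to_center`, `q` and `beta` are hypothetical, and the weights are hand-picked for illustration rather than obtained from training.

```python
import math
from typing import Sequence

Point = Sequence[float]


def gaussian_kernel(x: Point, y: Point, q: float) -> float:
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2); q sets the kernel width."""
    squared_norm = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-q * squared_norm)


def squared_distance_to_center(
    x: Point, points: Sequence[Point], beta: Sequence[float], q: float
) -> float:
    """R^2(x) = K(x,x) - 2 sum_j beta_j K(x_j,x) + sum_ij beta_i beta_j K(x_i,x_j)."""
    self_term = gaussian_kernel(x, x, q)  # always 1.0 for the Gaussian kernel
    cross_term = sum(b * gaussian_kernel(p, x, q) for p, b in zip(points, beta))
    center_term = sum(  # independent of x: a constant once the weights are fixed
        bi * bj * gaussian_kernel(pi, pj, q)
        for pi, bi in zip(points, beta)
        for pj, bj in zip(points, beta)
    )
    return self_term - 2.0 * cross_term + center_term


# Toy data: two points with equal, hand-picked weights that sum to 1.
points = [(0.0, 0.0), (1.0, 0.0)]
beta = [0.5, 0.5]
near = squared_distance_to_center((0.5, 0.0), points, beta, q=1.0)
far = squared_distance_to_center((5.0, 0.0), points, beta, q=1.0)
print(near < far)
```

Since `self_term` and `center_term` do not vary with x, thresholding the distance is equivalent to thresholding `cross_term`, which is the intuition behind the two methods sharing a decision surface.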

The libSVM library nevertheless provides the tools needed to implement SVDD easily as well, should one wish to.
