August 20 2007

Contributi al Support Vector Clustering

I contributi portati al SVC, seppur non di grandissima entità, sono diversi. Nel seguito vengono divisi in Contributi all’efficienza del clustering e Contributi alle prestazioni computazionali.

Contributi all’efficienza del clustering

Cluster assignment. Ricordiamo che tra tutti gli algortimi di cluster assignment presentati in letteratura, quello che è stato ritenuto il miglior compromesso tra qualità e complessità computazionale è il Cone Cluster Labeling (CCL)

  • S. Lee and K. M. Daniels, "Cone Cluster Labeling for Support Vector Clustering," in Proceedings of 6th SIAM Conference on Data Mining, 2006, pp. 484-488.
    @inproceedings{cone2006,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Booktitle = {Proceedings of 6th SIAM Conference on Data Mining},
      Date-Added = {2007-04-29 16:58:13 +0200},
      Date-Modified = {2007-06-19 18:52:22 +0200},
      Keywords = {SVM, clustering},
      Month = {May},
      Pages = {484–488},
      Title = {Cone Cluster Labeling for Support Vector Clustering},
      Url = {http://www.siam.org/meetings/sdm06/proceedings/046lees.pdf},
      Year = {2006},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEsuLi8uLi8uLi9QYXBlcnMvTGVlL0NvbmUgQ2×1c3RlciBMYWJlbGluZyBmb3IgU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGbSGw8cHVdOUy5kYXRhTxECLgAAAAACLgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANyVBH0NvbmUgQ2×1c3RlciBMYWJlbGluIzJGMDk0My5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAvCUPCWn72AAAAAAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAANMZWUAABAACAAAvs5cjgAAABEACAAAwlpi1gAAAAEAFAA3JUEANxuAAACy8gAAEsYAABKtAAIATkRvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpMZWU6Q29uZSBDbHVzdGVyIExhYmVsaW4jMkYwOTQzLnBkZgAOAHAANwBDAG8AbgBlACAAQwBsAHUAcwB0AGUAcgAgAEwAYQBiAGUAbABpAG4AZwAgAGYAbwByACAAUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAAQwBsAHUAcwB0AGUAcgBpAG4AZwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAXS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9MZWUvQ29uZSBDbHVzdGVyIExhYmVsaW5nIGZvciBTdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPgA/QEFAzcDOQM+A0cDUgNWA2QDawN0A3kDfAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOJ},
      Bdsk-Url-1 = {http://www.siam.org/meetings/sdm06/proceedings/046lees.pdf}
    }

Il primo contributo in questo senso è stato una esplicita politica per trattare i BSV. Un alto regime di BSV si rende necessario soprattutto per separare cluster fortemente sovrapposti. I BSV però finiscono fuori dalla descrizione del dominio, risultando outliers.

La scelta è stata quella di applicare il CCL così com’era escludendo i BSV dal processo e poi una volta trovato i cluster, assegnare i BSV a questi ultimi con una semplice politica di prossimità: un BSV è assegnato al cluster al quale appartiene il punto più prossimo.

Questo ci ha permesso di classificare i BSV in maniera mediamente corretta, migliorando l’efficienza.

In

  • S. Lee and K. M. Daniels, "Gaussian Kernel Width Selection and Fast Cluster Labeling for Support Vector Clustering," Department of Computer Science, University of Massachussets Lowell2005.
    @techreport{kernwidthsvc2005,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Date-Added = {2007-05-18 10:44:22 +0200},
      Date-Modified = {2007-06-20 08:28:06 +0200},
      Institution = {Department of Computer Science, University of Massachussets Lowell},
      Keywords = {svm, clustering, kernel machines},
      Title = {Gaussian Kernel Width Selection and Fast Cluster Labeling for Support Vector Clustering},
      Url = {http://www.cs.uml.edu/~kdaniels/papers/SeiTechReport2005.pdf},
      Year = {2005},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEG8uLi8uLi8uLi9QYXBlcnMvTGVlL0dhdXNzaWFuIEtlcm5lbCBXaWR0aCBTZWxlY3Rpb24gYW5kIEZhc3QgQ2×1c3RlciBMYWJlbGluZyBmb3IgU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGbSGw8cHVdOUy5kYXRhTxECmgAAAAACmgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANyVBH0dhdXNzaWFuIEtlcm5lbCBXaWR0IzMxQ0FDQS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAxysrCcyn+UERGIAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAANMZWUAABAACAAAvs5cjgAAABEACAAAwnMN3gAAAAEAFAA3JUEANxuAAACy8gAAEsYAABKtAAIATkRvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpMZWU6R2F1c3NpYW4gS2VybmVsIFdpZHQjMzFDQUNBLnBkZgAOALgAWwBHAGEAdQBzAHMAaQBhAG4AIABLAGUAcgBuAGUAbAAgAFcAaQBkAHQAaAAgAFMAZQBsAGUAYwB0AGkAbwBuACAAYQBuAGQAIABGAGEAcwB0ACAAQwBsAHUAcwB0AGUAcgAgAEwAYQBiAGUAbABpAG4AZwAgAGYAbwByACAAUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAAQwBsAHUAcwB0AGUAcgBpAG4AZwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAgS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9MZWUvR2F1c3NpYW4gS2VybmVsIFdpZHRoIFNlbGVjdGlvbiBhbmQgRmFzdCBDbHVzdGVyIExhYmVsaW5nIGZvciBTdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqARwBIQEpA8cDyQPOA9cD4gPmA/QD+wQEBAkEDAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAQZ},
      Bdsk-Url-1 = {http://www.cs.uml.edu/~kdaniels/papers/SeiTechReport2005.pdf}
    }

si affronta il problema col dataset Breast Cancer Wisconsin. In quell’occasione i BSV vengono prima identificati, poi rimossi e infine l’intero processo SVC viene ripetuto, per poi classificare solo allora i BSV. Così facendo ottengono un’accuratezza di circa il 92%.

Il nostro approccio evita di ripetere l’intero SVC due volte per lo stesso dataset e sul Breast Cancer Wisconsin ottiene un’accuratezza di oltre il 96%.

Il secondo piccolo contributo alla procedura CCL è stato l’intuizione di utilizzare la distanza L1 in luogo della distanza L2. In svariati test l’uso di questa distanza ha portato a guadagni nell’acuratezza di classificazione, specialmente nei casi in cui si utilizza il parametro C=1. Nel classico caso dell’IRIS dataset, dopo le migliorie passate, abbiamo ottenuto un ulteriore incremento di oltre mezzo punto percentuale (da 92.6667 a 93.3333).

Generazione dei valori di larghezza per il kernel gaussiano. Ricordiamo che l’unica tecnica proposta per la generazione di tali valori è quella in

  • S. Lee and K. M. Daniels, "Gaussian Kernel Width Generator for Support Vector Clustering," in Advances in Bioinformatics and Its Applications, 2005, pp. 151-162.
    @inproceedings{gauskergenerator2004,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Booktitle = {Advances in Bioinformatics and Its Applications},
      Date-Added = {2007-10-23 17:21:17 +0200},
      Date-Modified = {2007-10-23 17:21:17 +0200},
      Editor = {Matthew He and Giri Narasimhan and Sergei Petoukhov},
      Keywords = {SVM, clustering, gaussian kernel},
      Pages = {151–162},
      Title = {Gaussian Kernel Width Generator for Support Vector Clustering},
      Url = {http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf},
      Volume = {8},
      Year = {2005},
      Bdsk-Url-1 = {http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf}
    }

Per un fortunato errore di implementazione, ho scoperto che tale procedura genera sì dei valori di Kernel Width monotoni crescenti, ma spesso “salta” dei valori che darebbero un ottimo risultato. Ovvero, la differenza tra due valori consecutivi è spesso troppo alta. Corretto l’errore di implementazione, ho pensato a come gestire esplicitamente il comportamento avvenuto fortuitamente. Così ho introdotto un nuovo parametro per il generatore di valori, che ho chiamato costante di smorzamento o, in inglese, softening constant. Questa costante può variare nell’intervallo aperto (0,1]. Quando assume valore 1, nessun smorzamento viene effettuato. Per valori minori di 1 invece il valore ogni volta generato dal Kernel Width Generator viene “smorzato” di una percentuale. Per esempio, una costante di smorzamento di 0.5 dimezza ogni volta il valore generato. Con tale valore di smorzamento sono stati constatati due comportamenti che, a seconda del dataset, possono verificarsi al contempo o separatamente (o per niente):

  • viene trovato un valore di kernel width che assicura una maggiore accuratezza
  • il valore di kernel width giusto viene trovato più velocemente (quindi eseguendo meno volte il SVC)

Nel caso dell’IRIS ha permesso di abbattere il muro del 90% di accuratezza nello stesso numero di cicli; nel caso di un dataset sintetico (finora ancora non presentato nel blog) ha permesso di trovare un buon valore di kernel width in 9 cicli invece dei 19 necessari senza smorzamento (con un aumento di accuratezza quasi insignificante 0.1%).

Stima del valore di C (soft constraint). L’altro parametro fondamentale del SVC è C, che controlla il numero di BSV. Per generare un insieme di possibili valori per C non è stato ancora presentato nulla in letteratura. Un paio di suggerimenti per una stima di tale valore sono presenti in

  • S. Lee and K. M. Daniels, "Gaussian Kernel Width Selection and Fast Cluster Labeling for Support Vector Clustering," Department of Computer Science, University of Massachussets Lowell2005.
    @techreport{kernwidthsvc2005,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Date-Added = {2007-05-18 10:44:22 +0200},
      Date-Modified = {2007-06-20 08:28:06 +0200},
      Institution = {Department of Computer Science, University of Massachussets Lowell},
      Keywords = {svm, clustering, kernel machines},
      Title = {Gaussian Kernel Width Selection and Fast Cluster Labeling for Support Vector Clustering},
      Url = {http://www.cs.uml.edu/~kdaniels/papers/SeiTechReport2005.pdf},
      Year = {2005},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEG8uLi8uLi8uLi9QYXBlcnMvTGVlL0dhdXNzaWFuIEtlcm5lbCBXaWR0aCBTZWxlY3Rpb24gYW5kIEZhc3QgQ2×1c3RlciBMYWJlbGluZyBmb3IgU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGbSGw8cHVdOUy5kYXRhTxECmgAAAAACmgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANyVBH0dhdXNzaWFuIEtlcm5lbCBXaWR0IzMxQ0FDQS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAxysrCcyn+UERGIAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAANMZWUAABAACAAAvs5cjgAAABEACAAAwnMN3gAAAAEAFAA3JUEANxuAAACy8gAAEsYAABKtAAIATkRvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpMZWU6R2F1c3NpYW4gS2VybmVsIFdpZHQjMzFDQUNBLnBkZgAOALgAWwBHAGEAdQBzAHMAaQBhAG4AIABLAGUAcgBuAGUAbAAgAFcAaQBkAHQAaAAgAFMAZQBsAGUAYwB0AGkAbwBuACAAYQBuAGQAIABGAGEAcwB0ACAAQwBsAHUAcwB0AGUAcgAgAEwAYQBiAGUAbABpAG4AZwAgAGYAbwByACAAUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAAQwBsAHUAcwB0AGUAcgBpAG4AZwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAgS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9MZWUvR2F1c3NpYW4gS2VybmVsIFdpZHRoIFNlbGVjdGlvbiBhbmQgRmFzdCBDbHVzdGVyIExhYmVsaW5nIGZvciBTdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqARwBIQEpA8cDyQPOA9cD4gPmA/QD+wQEBAkEDAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAQZ},
      Bdsk-Url-1 = {http://www.cs.uml.edu/~kdaniels/papers/SeiTechReport2005.pdf}
    }

ma nei miei esperimenti pratici non hanno prodotto un buon risultato. Probabilmente ciò è dovuto anche al fatto che tale stima veniva fatta assumendo di usare come Cluster Assignment l’algoritmo classico Complete Graph Cluster Labeling.

Pertanto inizialmente ho variato manualmente il valore di C fino a trovarne uno che desse buoni risultati. Col susseguirsi degli esperimenti, ho notato che i valori di C che davano buoni risultati avevano una certa sistematicità ed era possibile ottenerli grazie alla legge

C = 10/N

con N il numero di oggetti nel dataset.

Ho successivamente raffinato la legge come segue, al fine di poter produrre ulteriori valori di C, in base al valore dato in input per tale parametro.

C_new = 10 * C_old / N

dove C_old è il valore di C dato in input all’SVC e C_new è il nuovo valore stimato che in teoria dovrebbe dare un miglior risultato.

Dagli esperimenti effettuati in genere il primo valore generato (C_new = 10 * 1 / N, poiché all’inizio C=1) è quello ottimale.

Contributi alle prestazioni computazionali

Tali contributi sono tutti legati alla procedura di generazione di valori per la larghezza del kernel e sono stati già descritti in questo post.

August 14 2007

High dimensionality: Co-clustering + SVC

Uno degli obiettivi di tesi è avere buone prestazioni su high dimensional dataset. L’altro è quello di poter combinare le due tecniche studiate in maniera concreta e con buoni risultati.

Il test illustrato di seguito coinvolte entrambi i punti.

Il dataset coinvolto è stato l’Internet Advertisement dall’UCI repository of Machine Learning. Ne cito la descrizione

This dataset represents a set of possible advertisements on Internet pages. The features encode the geometry of the image (if available) as well as phrases occuring in the URL, the image’s URL and alt text, the anchor text, and words occuring near the anchor text. The task is to predict whether an image is an advertisement (”ad”) or not (”nonad”).

Number of Instances: 3279 (2821 nonads, 458 ads)
Number of Attributes: 1558

One or more of the first three features are missing in 28% of the instances.

Trattiamo dunque con un dataset in 1558D che possiede tra l’altro dei missing values.

Co-clustering

Il test effettuato col solo Co-clustering mette in evidenza i vantaggi dell’implicita capacità di questa tecnica di ridurre la dimensionalità effettuando feature clustering contestuale al data clustering.

Effettuando il test con lo schema 5 e la distanza euclidea, senza richiedere feature clustering, si ottiene quanto segue

Class 1
TP: 223 FP: 98
FN: 223 TN: 2735
Class 2
TP: 98 FP: 223
FN: 98 TN: 2860

Class 1 - Precision: 69.47% Recall: 50% F1: 58.148%
Class 2 - Precision: 30.53% Recall: 50% F1: 37.912%

Accuracy: 9.79%

Stesso schema, ma con feature clustering (10 cluster di feature richiesti), dà luogo a quanto segue

Class 1
TP: 223 FP: 98
FN: 236 TN: 2722
Class 2
TP: 2722 FP: 236
FN: 98 TN: 223

Class 1 - Precision: 69.47% Recall: 48.584% F1: 57.18%
Class 2 - Precision: 92.022% Recall: 96.525% F1: 94.22%

Accuracy: 89.814%

La riduzione della dimensionalità ha dato i suoi frutti (ricordiamo che il dataset possiede anche missing values)

SVC

Le risorse computazionali a disposizione insieme con la natura ancora non stabile del software sviluppato per il SVC non ha permesso di effettuare il test del SVC direttamente in 1558D. Ad ogni modo la letteratura riporta un forte degrado di prestazioni all’aumentare delle dimensioni dello spazio vettoriale, poiché allo stesso tempo si rende necessario un numero elevato di SV per la descrizione del contorno dei cluster.

Inoltre la letteratura mostra anche come la PCA perda di efficacia quando si necessita di un certo numero di componenti.

Co-clustering + SVC

Sappiamo che il Co-clustering, durante il processo di clustering, calcola una approssimazione della matrice di dati originale.

Pertanto, abbiamo usato il Co-clustering richiedendo un co-clustering di dimensioni 3279 x 3, ovvero abbiamo richiesto un numero di cluster di riga pari al numero di righe (quindi il numero di oggetti) e un numero di cluster di colonna considerevolmente inferiore al numero di colonne originale (3 feature clusters invece di 1558 features).

Così facendo il Co-clustering calcola una matrice di approssimazione dove ogni riga rappresenta ancora un solo oggetto della matrice originale, ma a differenza dell’originale, questa matrice è in 3D.

Ricordiamo inoltre che il processo di approssimazione genera una matrice priva di missing values.

Abbiamo così preso tale matrice approssimata e usata come input del SVC. Il risultato è stato addirittura superiore (anche se non di molto) al massimo ottenuto col Co-clustering

Class 0
TP: 218 FP: 70
FN: 241 TN: 2750

Precision: 75.6944 - Recall: 47.4946 - F1: 58.3668

Class 1
TP: 2750 FP: 241
FN: 70 TN: 218

Precision: 91.9425 - Recall: 97.5177 - F1: 94.6481

Accuracy: 90.5154

Tempo macchina

È importante notare che SVC ha impiegato soltanto 0.06 secondi in totale per concludere il processo, trovando il giusto valore di Kernel Width al secondo ciclo. Il Co-clustering ha invece impiegato 117 secondi a concludere il processo (ricordiamo completo anche della sua parte di clustering e non solo del calcolo della matrice approssimata, poiché allo stato attuale il software non permette di scindere le due operazioni).

Abbiamo dunque un totale di 117.06 secondi.

L’SVC su un dataset di 3000 elementi in 20 dimensioni ha impiegato circa 400 secondi per compiere due cicli. E il tempo impiegato cresce all’aumentare delle dimensioni e anche coll’avanzare dei cicli, poiché valori più alti di Kernel Width rallentano il processo di domain description. È facile dunque capire come si potrebbe comportare il solo SVC con un dataset di 1558D.

Conclusioni

Altri test simili erano stati effettuati su dataset troppo piccoli per avere dei risultati significativi.
Questo test mostra come è possibile combinare le due tecniche, sfruttando la capacità di ridurre la dimensionalità e di trattare i missing values dell’uno, e l’accuratezza, la gestione degli outliers e dei cluster di forma arbitraria dell’altro.

Altri test saranno successivamente condotti.

August 02 2007

SVC: Gaussian Kernel Width Generator

Per il Support Vector Clustering esiste un solo metodo per generare valori pertinenti della larghezza del kernel gaussiano, ed è mostrato in

  • S. Lee and K. M. Daniels, "Gaussian Kernel Width Generator for Support Vector Clustering," in Advances in Bioinformatics and Its Applications, 2005, pp. 151-162.
    @inproceedings{gauskergenerator2004,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Booktitle = {Advances in Bioinformatics and Its Applications},
      Date-Added = {2007-10-23 17:21:17 +0200},
      Date-Modified = {2007-10-23 17:21:17 +0200},
      Editor = {Matthew He and Giri Narasimhan and Sergei Petoukhov},
      Keywords = {SVM, clustering, gaussian kernel},
      Pages = {151–162},
      Title = {Gaussian Kernel Width Generator for Support Vector Clustering},
      Url = {http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf},
      Volume = {8},
      Year = {2005},
      Bdsk-Url-1 = {http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf}
    }

che di seguito indicheremo con GKWG.

Il metodo si basa sullo stesso problema di programmazione quadratica su cui si basa l’SVC stesso, unito a un algoritmo secante.

Gli autori usano questo metodo per generare una lista di valori per la larghezza del kernel prima di eseguire il Support Vector Clustering. Una volta ottenuta la lista, si esegue SVC per ogni valore di quella lista, finché non si raggiunge il criterio di stop.

Gli svantaggi di questo approccio sono due:

  • Si rischia di generare una lista di valori più lunga del necessario
  • Si aggiunge una complessità non indifferente all’intero processo di clustering, poiché il processo di risoluzione del problema di programmazione quadratica ha una notevole complessità (teoricamente O(N^3), praticamente, tramite SMO Algorithm, O(m*N^2)) e questo viene eseguito una volta per ogni valore di larghezza del kernel generato.

Nel nostro caso, si è riuscito a fondere il processo GKWG con il processo di clustering, sfruttando i calcoli che vengono eseguiti necessariamente per la fase di cluster description dell’SVC. Abbiamo dunque eliminato i due svantaggi precedenti.

Risultati

La nostra implementazione del GKWG porta, oltre a un vantaggio in termini di complessità computazionale totale, anche a risultati migliori di quelli presenti in letteratura, per ora in riferimento soltanto all’IRIS Data Set (vedere esperimenti preliminari SVC).

Si è infatti raggiunta, sull’IRIS completo di tutte le feature, un’accuratezza del 92.6667% (precedentemente ci si era fermati al 90%), grazie al valore di larghezza del kernel ottenuto dal GKWG. Risultati migliori sono stati raggiunti in letteratura soltanto riducendo lo spazio delle feature da 4D a 3D o 2D, tramite PCA o Sammon non linear mapping. Restando invece in 4D, tutti i testi in letteratura ottengono un’accuratezza inferiore alla nostra.

Dettagli del test su IRIS

Total time taken: 0.01 s

Kernel Width: 0.0917017
Number of clusters: 3
Number of non-singleton clusters: 3
Number of singleton clusters: 0
Points per cluster:
Cluster 0: 50
Cluster 1: 55
Cluster 2: 45

Class 0
TP: 50 FP: 0
FN: 0 TN: 100

Precision: 100 - Recall: 100 - F1: 100

Class 1
TP: 47 FP: 8
FN: 3 TN: 92

Precision: 85.4545 - Recall: 94 - F1: 89.5238

Class 2
TP: 42 FP: 3
FN: 8 TN: 97

Precision: 93.3333 - Recall: 84 - F1: 88.4211

Macroaveraging: 92.6483
Accuracy: 92.6667

July 10 2007

SVC: politica per classificazione BSV

L’algoritmo di Cluster Assignment usato

  • S. Lee and K. M. Daniels, "Cone Cluster Labeling for Support Vector Clustering," in Proceedings of 6th SIAM Conference on Data Mining, 2006, pp. 484-488.
    @inproceedings{cone2006,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Booktitle = {Proceedings of 6th SIAM Conference on Data Mining},
      Date-Added = {2007-04-29 16:58:13 +0200},
      Date-Modified = {2007-06-19 18:52:22 +0200},
      Keywords = {SVM, clustering},
      Month = {May},
      Pages = {484–488},
      Title = {Cone Cluster Labeling for Support Vector Clustering},
      Url = {http://www.siam.org/meetings/sdm06/proceedings/046lees.pdf},
      Year = {2006},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEsuLi8uLi8uLi9QYXBlcnMvTGVlL0NvbmUgQ2×1c3RlciBMYWJlbGluZyBmb3IgU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGbSGw8cHVdOUy5kYXRhTxECLgAAAAACLgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANyVBH0NvbmUgQ2×1c3RlciBMYWJlbGluIzJGMDk0My5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAvCUPCWn72AAAAAAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAANMZWUAABAACAAAvs5cjgAAABEACAAAwlpi1gAAAAEAFAA3JUEANxuAAACy8gAAEsYAABKtAAIATkRvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpMZWU6Q29uZSBDbHVzdGVyIExhYmVsaW4jMkYwOTQzLnBkZgAOAHAANwBDAG8AbgBlACAAQwBsAHUAcwB0AGUAcgAgAEwAYQBiAGUAbABpAG4AZwAgAGYAbwByACAAUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAAQwBsAHUAcwB0AGUAcgBpAG4AZwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAXS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9MZWUvQ29uZSBDbHVzdGVyIExhYmVsaW5nIGZvciBTdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPgA/QEFAzcDOQM+A0cDUgNWA2QDawN0A3kDfAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOJ},
      Bdsk-Url-1 = {http://www.siam.org/meetings/sdm06/proceedings/046lees.pdf}
    }

come tutti gli altri proposti in letteratura non tratta esplicitamente la classificaizione dei Bounded Support Vector, ovvero di quei punti che, per effetto del valore della costante di margine morbido, finiscono fuori dalla sfera di descrizione del dominio anche se in realtà fanno parte di una delle classi del problema.

Il Cone Cluster Labeling prevede due passi:

  • classificazione dei SV
  • classificazione di tutti gli altri punti in relazione ai SV

che di fatto comprende anche i BSV in “tutti gli altri punti”.

Si è scelto di modificare in questo modo l’algoritmo:

  • classificazione dei SV
  • classificazione di tutti gli altri punti (tranne i BSV) in relazione ai SV
  • classificazione dei BSV in relazione a tutti gli altri punti già classificati

Nel caso dell’IRIS data set, questa modifica ha portato l’accuratezza da un valore di 89,333% a un valore del 90%.

July 09 2007

SVC: prestazioni del cluster assignment

With regard of this document, recently we have performed the test described at the paragraph 5 with all of two cluster assignment algorithms implemented: Complete Graph Cluster Labeling (CGCL, the classic one) and the Cone Cluster Labeling, respectively presented in

  • A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.
    @article{svc,
      author = {A. Ben-Hur and D. Horn and H. T. Siegelmann and V. Vapnik},
      Date-Modified = {2007-06-19 14:44:40 +0200},
      Journal = {Journal of Machine Learning Research},
      Keywords = {clustering, SVM, gaussian kernel},
      Pages = {125-137},
      Title = {Support Vector Clustering},
      Url = {http://citeseer.ist.psu.edu/hur01support.html},
      Volume = 2, Year = 2001, Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEDUuLi8uLi8uLi9QYXBlcnMvQmVuLUh1ci9TdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZtIbDxwdV05TLmRhdGFPEQHqAAAAAAHqAAIAAAlEb2N1bWVudHMAAAAAAAAAAAAAAAAAAAAAAAC+zniuSCsAAAA3IEUdU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACIbMsH5WY9QREYgAAAAAAADAAMAAAkAAAAAAAAAAAAAAAAAAAAAB0Jlbi1IdXIAABAACAAAvs5cjgAAABEACAAAwflLfwAAAAEAFAA3IEUANxuAAACy8gAAEsYAABKtAAIAUERvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpCZW4tSHVyOlN1cHBvcnQgVmVjdG9yIENsdXN0ZXJpbmcucGRmAA4APAAdAFMAdQBwAHAAbwByAHQAIABWAGUAYwB0AG8AcgAgAEMAbAB1AHMAdABlAHIAaQBuAGcALgBwAGQAZgAPABQACQBEAG8AYwB1AG0AZQBuAHQAcwASAEcvbmVtby9Eb2N1bWVudHMvVW5pdmVyc2l0YS9QYXBlcnMvQmVuLUh1ci9TdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAOIA5wDvAt0C3wLkAu0C+AL8AwoDEQMaAx8DIgAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAMv},
      Bdsk-Url-1 = {http://citeseer.ist.psu.edu/hur01support.html}
    }

and

  • S. Lee and K. M. Daniels, "Cone Cluster Labeling for Support Vector Clustering," in Proceedings of 6th SIAM Conference on Data Mining, 2006, pp. 484-488.
    @inproceedings{cone2006,
      author = {Sei-Hyung Lee and Karen M. Daniels},
      Booktitle = {Proceedings of 6th SIAM Conference on Data Mining},
      Date-Added = {2007-04-29 16:58:13 +0200},
      Date-Modified = {2007-06-19 18:52:22 +0200},
      Keywords = {SVM, clustering},
      Month = {May},
      Pages = {484–488},
      Title = {Cone Cluster Labeling for Support Vector Clustering},
      Url = {http://www.siam.org/meetings/sdm06/proceedings/046lees.pdf},
      Year = {2006},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEsuLi8uLi8uLi9QYXBlcnMvTGVlL0NvbmUgQ2×1c3RlciBMYWJlbGluZyBmb3IgU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGbSGw8cHVdOUy5kYXRhTxECLgAAAAACLgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANyVBH0NvbmUgQ2×1c3RlciBMYWJlbGluIzJGMDk0My5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAvCUPCWn72AAAAAAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAANMZWUAABAACAAAvs5cjgAAABEACAAAwlpi1gAAAAEAFAA3JUEANxuAAACy8gAAEsYAABKtAAIATkRvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpMZWU6Q29uZSBDbHVzdGVyIExhYmVsaW4jMkYwOTQzLnBkZgAOAHAANwBDAG8AbgBlACAAQwBsAHUAcwB0AGUAcgAgAEwAYQBiAGUAbABpAG4AZwAgAGYAbwByACAAUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAAQwBsAHUAcwB0AGUAcgBpAG4AZwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAXS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9MZWUvQ29uZSBDbHVzdGVyIExhYmVsaW5nIGZvciBTdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPgA/QEFAzcDOQM+A0cDUgNWA2QDawN0A3kDfAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOJ},
      Bdsk-Url-1 = {http://www.siam.org/meetings/sdm06/proceedings/046lees.pdf}
    }

In the first case we have a total execution time of 280.87 seconds; in the second case only 0.47 seconds was taken. In both cases, 0.25 seconds was taken in the domain description by the SVM. So, we can say that the classic algorithm was slower than CCL about of 99.92%.

The clustering results are the same and they are reported in the document mentioned above.

July 06 2007

Support Vector Methods and MBI Principle

In the Documents section are available the slides entitled: “Data Clustering: High dimensionality, missing values and noise. Support Vector Methods and Minimum Bregman Information Principle

July 04 2007

SVC Preliminary Experiments

In the section Documents is available for download the PDF with the configurations used for tests and related results; is also available the ZIP archive containing the data-sets used for the experiments.

May 19 2007

Inizio sviluppo SVC software

Basandomi sulla libSVM, ho iniziato lo sviluppo del software per SVC.

Devo dire che la disponibilità dei vari ricercatori nel fornire il codice da loro usato per effettuare i test è stata abbastanza scarna, a partire da Vapnik et al. e passando tra tutti coloro che negli ultimi anni hanno lavorato per eliminare il collo di bottiglia di questo approccio (cluster labeling).

Soltanto gli autori di

  • J. Lee and D. Lee, "An Improved Cluster Labeling Method for Support Vector Clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, iss. 3, pp. 461-464, 2005.
    @article{svcimproved, Address = {Washington, DC, USA},
      Author = {Jaewook Lee and Daewon Lee},
      Date-Added = {2007-04-28 18:28:29 +0200},
      Date-Modified = {2007-06-19 15:19:40 +0200},
      Doi = {http://dx.doi.org/10.1109/TPAMI.2005.47},
      Issn = {0162-8828},
      Journal = {IEEE Trans. Pattern Anal. Mach. Intell.},
      Keywords = {clustering, SVM},
      Note = {Member-Jaewook Lee},
      Number = {3},
      Pages = {461–464},
      Publisher = {IEEE Computer Society},
      Title = {An Improved Cluster Labeling Method for Support Vector Clustering},
      Url = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2005.47},
      Volume = {27},
      Year = {2005},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEFkuLi8uLi8uLi9QYXBlcnMvTGVlL0FuIEltcHJvdmVkIENsdXN0ZXIgTGFiZWxpbmcgTWV0aG9kIGZvciBTdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZtIbDxwdV05TLmRhdGFPEQJYAAAAAAJYAAIAAAlEb2N1bWVudHMAAAAAAAAAAAAAAAAAAAAAAAC+zniuSCsAAAA3JUEfQW4gSW1wcm92ZWQgQ2×1c3RlciAjMjdDMDVELnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACfAXcIke4QAAAAAAAAAAAADAAMAAAkAAAAAAAAAAAAAAAAAAAAAA0xlZQAAEAAIAAC+zlyOAAAAEQAIAADCJG10AAAAAQAUADclQQA3G4AAALLyAAASxgAAEq0AAgBORG9jdW1lbnRzOm5lbW86RG9jdW1lbnRzOlVuaXZlcnNpdGE6UGFwZXJzOkxlZTpBbiBJbXByb3ZlZCBDbHVzdGVyICMyN0MwNUQucGRmAA4AjABFAEEAbgAgAEkAbQBwAHIAbwB2AGUAZAAgAEMAbAB1AHMAdABlAHIAIABMAGEAYgBlAGwAaQBuAGcAIABNAGUAdABoAG8AZAAgAGYAbwByACAAUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAAQwBsAHUAcwB0AGUAcgBpAG4AZwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAay9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9MZWUvQW4gSW1wcm92ZWQgQ2×1c3RlciBMYWJlbGluZyBNZXRob2QgZm9yIFN1cHBvcnQgVmVjdG9yIENsdXN0ZXJpbmcucGRmAAATABIvVm9sdW1lcy9Eb2N1bWVudHMAFQACABf//wAAgAbSHyAhIlgkY2xhc3Nlc1okY2xhc3NuYW1loyIjJF1OU011dGFibGVEYXRhVk5TRGF0YVhOU09iamVjdNIfICYnoickXE5TRGljdGlvbmFyeQAIABEAGwAkACkAMgBEAEkATABRAFMAXABiAGkAdAB8AIMAhgCIAIoAjQCPAJEAkwCgAKoBBgELARMDbwNxA3YDfwOKA44DnAOjA6wDsQO0AAAAAAAAAgEAAAAAAAAAKAAAAAAAAAAAAAAAAAAAA8E=},
      Bdsk-Url-1 = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2005.47},
      Bdsk-Url-2 = {http://dx.doi.org/10.1109/TPAMI.2005.47}
    }

mi hanno gentilmente risposto inviandomi il codice (putroppo Matlab) da loro usato per i test.

Sto perciò sviluppando questo software (che avrei dovuto fare comunque) con una visione più ampia in mente. Per ora la struttura è essenziale e fortemente legata a libSVM:

Classi fondamentali

  • una classe astratta AbstractTrainer, che fornisce la fornisce l’interfaccia per l’implementazione della parte di training per il Support Vector Clustering.
  • una classe AbstractLabeler, che fornisce la fornisce l’interfaccia per l’implementazione della parte di cluster labeling per il SVC

Prime concretizzazioni

  • una classe OneClassTrainer, che implementa il training per il SVC utilizzando la One-class classification della libSVM
  • [da terminare] una classe CGLabeler, che implementa nel modo più banale (e inefficiente) l’algoritmo cluster labeling proposto in
    • A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.
      @article{svc,
        author = {A. Ben-Hur and D. Horn and H. T. Siegelmann and V. Vapnik},
        Date-Modified = {2007-06-19 14:44:40 +0200},
        Journal = {Journal of Machine Learning Research},
        Keywords = {clustering, SVM, gaussian kernel},
        Pages = {125-137},
        Title = {Support Vector Clustering},
        Url = {http://citeseer.ist.psu.edu/hur01support.html},
        Volume = 2, Year = 2001, Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEDUuLi8uLi8uLi9QYXBlcnMvQmVuLUh1ci9TdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZtIbDxwdV05TLmRhdGFPEQHqAAAAAAHqAAIAAAlEb2N1bWVudHMAAAAAAAAAAAAAAAAAAAAAAAC+zniuSCsAAAA3IEUdU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACIbMsH5WY9QREYgAAAAAAADAAMAAAkAAAAAAAAAAAAAAAAAAAAAB0Jlbi1IdXIAABAACAAAvs5cjgAAABEACAAAwflLfwAAAAEAFAA3IEUANxuAAACy8gAAEsYAABKtAAIAUERvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpCZW4tSHVyOlN1cHBvcnQgVmVjdG9yIENsdXN0ZXJpbmcucGRmAA4APAAdAFMAdQBwAHAAbwByAHQAIABWAGUAYwB0AG8AcgAgAEMAbAB1AHMAdABlAHIAaQBuAGcALgBwAGQAZgAPABQACQBEAG8AYwB1AG0AZQBuAHQAcwASAEcvbmVtby9Eb2N1bWVudHMvVW5pdmVyc2l0YS9QYXBlcnMvQmVuLUh1ci9TdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAOIA5wDvAt0C3wLkAu0C+AL8AwoDEQMaAx8DIgAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAMv},
        Bdsk-Url-1 = {http://citeseer.ist.psu.edu/hur01support.html}
      }

Questa struttura permette di implementare agevolmente le diverse versioni di cluster labeling proposte in letteratura e al contempo di sviluppare diversi trainer, compreso quello per la formulazione Least Squares delle SVM, magari implementando l’algoritmo proposto in

  • S. S. Keerthi and S. K. Shevade, "SMO algorithm for Least Squares SVM formulations," Neural Computation, vol. 15, pp. 487-507, 2003.
    @article{smo4lssvm03,
      author = {S.S. Keerthi and S.K. Shevade},
      Date-Added = {2007-05-19 19:11:38 +0200},
      Date-Modified = {2007-06-19 14:48:12 +0200},
      Journal = {Neural Computation},
      Keywords = {svm, ls-svm},
      Month = {February},
      Pages = {487–507},
      Title = {SMO algorithm for Least Squares SVM formulations},
      Url = {http://ieeexplore.ieee.org/iel5/8672/27485/01223730.pdf},
      Volume = {15},
      Year = {2003},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEwuLi8uLi8uLi9QYXBlcnMvS2VlcnRoaS9TTU8gYWxnb3JpdGhtIGZvciBMZWFzdCBTcXVhcmVzIFNWTSBmb3JtdWxhdGlvbnMucGRm0hsPHB1XTlMuZGF0YU8RAjAAAAAAAjAAAgAACURvY3VtZW50cwAAAAAAAAAAAAAAAAAAAAAAAL7OeK5IKwAAADcjpB9TTU8gYWxnb3JpdGhtIGZvciBMZSMzMThDOTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMYyWwm+fUwAAAAAAAAAAAAMAAwAACQAAAAAAAAAAAAAAAAAAAAAHS2VlcnRoaQAAEAAIAAC+zlyOAAAAEQAIAADCb4MzAAAAAQAUADcjpAA3G4AAALLyAAASxgAAEq0AAgBSRG9jdW1lbnRzOm5lbW86RG9jdW1lbnRzOlVuaXZlcnNpdGE6UGFwZXJzOktlZXJ0aGk6U01PIGFsZ29yaXRobSBmb3IgTGUjMzE4Qzk2LnBkZgAOAGoANABTAE0ATwAgAGEAbABnAG8AcgBpAHQAaABtACAAZgBvAHIAIABMAGUAYQBzAHQAIABTAHEAdQBhAHIAZQBzACAAUwBWAE0AIABmAG8AcgBtAHUAbABhAHQAaQBvAG4AcwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAXi9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9LZWVydGhpL1NNTyBhbGdvcml0aG0gZm9yIExlYXN0IFNxdWFyZXMgU1ZNIGZvcm11bGF0aW9ucy5wZGYAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPkA/gEGAzoDPANBA0oDVQNZA2cDbgN3A3wDfwAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOM},
      Bdsk-Url-1 = {http://ieeexplore.ieee.org/iel5/8672/27485/01223730.pdf}
    }

May 19 2007

Support Vector Clustering e One-class classification

Alla base del training delle SVM nel caso di clustering troviamo la One-class classification. Ci sono vari metodi per effettuare la one-class classification (anche conosciuta come Distribution Estimation, Outlier Detection, Novelty Detection, Concept Learning) con le SVM, come la nu-SVM di Schölkopf o il SVDD di Tax

  • B. Schölkopf, R. C. Williamson, A. J. Smola, J. Shawe-Taylor, and J. Platt, "Support Vector Method for Novelty Detection," in Advances in Neural Information Processing Systems 12: Proceedings of the 1999 Conference, 2000.
    @inproceedings{scholkopf2000,
      author = {B. Sch”olkopf and R.C. Williamson and A.J. Smola and J. Shawe-Taylor and J. Platt},
      Booktitle = {Advances in Neural Information Processing Systems 12: Proceedings of the 1999 Conference},
      Date-Added = {2007-04-29 16:39:57 +0200},
      Date-Modified = {2007-08-10 14:18:50 +0200},
      Keywords = {SVM, clustering, SMO, one-class, novelty detection},
      Title = {Support Vector Method for Novelty Detection},
      Url = {http://axiom.anu.edu.au/~williams/papers/P126.pdf},
      Year = {2000},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEsuLi8uLi8uLi9QYXBlcnMvU2NoXCJvbGtvcGYvU3VwcG9ydCBWZWN0b3IgTWV0aG9kIGZvciBOb3ZlbHR5IERldGVjdGlvbi5wZGbSGw8cHVdOUy5kYXRhTxECLgAAAAACLgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANxuNH1N1cHBvcnQgVmVjdG9yIE1ldGhvIzJGMDcxRC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAvBx3CWmzZAAAAAAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAAtTY2hcIm9sa29wZgAAEAAIAAC+zlyOAAAAEQAIAADCWlC5AAAAAQAUADcbjQA3G4AAALLyAAASxgAAEq0AAgBWRG9jdW1lbnRzOm5lbW86RG9jdW1lbnRzOlVuaXZlcnNpdGE6UGFwZXJzOlNjaFwib2xrb3BmOlN1cHBvcnQgVmVjdG9yIE1ldGhvIzJGMDcxRC5wZGYADgBgAC8AUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAATQBlAHQAaABvAGQAIABmAG8AcgAgAE4AbwB2AGUAbAB0AHkAIABEAGUAdABlAGMAdABpAG8AbgAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAXS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9TY2hcIm9sa29wZi9TdXBwb3J0IFZlY3RvciBNZXRob2QgZm9yIE5vdmVsdHkgRGV0ZWN0aW9uLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPgA/QEFAzcDOQM+A0cDUgNWA2QDawN0A3kDfAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOJ},
      Bdsk-Url-1 = {http://axiom.anu.edu.au/~williams/papers/P126.pdf}
    }
  • D. M. J. Tax and R. P. W. Duin, "Data Domain Description using Support Vectors," in European Symposium on Artificial Neural Network, Bruges (Belgium), 1999, pp. 251-256.
    @inproceedings{es1999, Address = {Bruges (Belgium)},
      Author = {David M. J. Tax and Robert P. W. Duin},
      Booktitle = {European Symposium on Artificial Neural Network},
      Date-Added = {2007-05-07 12:55:54 +0200},
      Date-Modified = {2007-06-23 08:24:18 +0200},
      Keywords = {SVM, domain description, SVDD, novelty detection, one-class},
      Month = {April},
      Pages = {251–256},
      Title = {Data Domain Description using Support Vectors},
      Url = {http://www.dice.ucl.ac.be/Proceedings/esann/esannpdf/es1999-458.pdf},
      Year = {1999},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEUuLi8uLi8uLi9QYXBlcnMvVGF4L0RhdGEgRG9tYWluIERlc2NyaXB0aW9uIHVzaW5nIFN1cHBvcnQgVmVjdG9ycy5wZGbSGw8cHVdOUy5kYXRhTxECHAAAAAACHAACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANxuEH0RhdGEgRG9tYWluIERlc2NyaXB0IzM2RkFCRS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2+r7CnHJEAAAAAAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAANUYXgAABAACAAAvs5cjgAAABEACAAAwpxWJAAAAAEAFAA3G4QANxuAAACy8gAAEsYAABKtAAIATkRvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpUYXg6RGF0YSBEb21haW4gRGVzY3JpcHQjMzZGQUJFLnBkZgAOAGQAMQBEAGEAdABhACAARABvAG0AYQBpAG4AIABEAGUAcwBjAHIAaQBwAHQAaQBvAG4AIAB1AHMAaQBuAGcAIABTAHUAcABwAG8AcgB0ACAAVgBlAGMAdABvAHIAcwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAVy9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9UYXgvRGF0YSBEb21haW4gRGVzY3JpcHRpb24gdXNpbmcgU3VwcG9ydCBWZWN0b3JzLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPIA9wD/Ax8DIQMmAy8DOgM+A0wDUwNcA2EDZAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAANx},
      Bdsk-Url-1 = {http://www.dice.ucl.ac.be/Proceedings/esann/esannpdf/es1999-458.pdf}
    }

Così come formulato da Vapnik et al. in

  • A. Ben-Hur, D. Horn, H. T. Siegelmann, and V. Vapnik, "Support Vector Clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.
    @article{svc,
      author = {A. Ben-Hur and D. Horn and H. T. Siegelmann and V. Vapnik},
      Date-Modified = {2007-06-19 14:44:40 +0200},
      Journal = {Journal of Machine Learning Research},
      Keywords = {clustering, SVM, gaussian kernel},
      Pages = {125-137},
      Title = {Support Vector Clustering},
      Url = {http://citeseer.ist.psu.edu/hur01support.html},
      Volume = 2, Year = 2001, Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEDUuLi8uLi8uLi9QYXBlcnMvQmVuLUh1ci9TdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZtIbDxwdV05TLmRhdGFPEQHqAAAAAAHqAAIAAAlEb2N1bWVudHMAAAAAAAAAAAAAAAAAAAAAAAC+zniuSCsAAAA3IEUdU3VwcG9ydCBWZWN0b3IgQ2×1c3RlcmluZy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACIbMsH5WY9QREYgAAAAAAADAAMAAAkAAAAAAAAAAAAAAAAAAAAAB0Jlbi1IdXIAABAACAAAvs5cjgAAABEACAAAwflLfwAAAAEAFAA3IEUANxuAAACy8gAAEsYAABKtAAIAUERvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpCZW4tSHVyOlN1cHBvcnQgVmVjdG9yIENsdXN0ZXJpbmcucGRmAA4APAAdAFMAdQBwAHAAbwByAHQAIABWAGUAYwB0AG8AcgAgAEMAbAB1AHMAdABlAHIAaQBuAGcALgBwAGQAZgAPABQACQBEAG8AYwB1AG0AZQBuAHQAcwASAEcvbmVtby9Eb2N1bWVudHMvVW5pdmVyc2l0YS9QYXBlcnMvQmVuLUh1ci9TdXBwb3J0IFZlY3RvciBDbHVzdGVyaW5nLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAOIA5wDvAt0C3wLkAu0C+AL8AwoDEQMaAx8DIgAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAMv},
      Bdsk-Url-1 = {http://citeseer.ist.psu.edu/hur01support.html}
    }

il training delle SVM per il clustering viene fatto tramite SVDD.

In

  • D. M. J. Tax, "One-class classification: concept learning in the absence of counter-examples," PhD Thesis , 2001.
    @phdthesis{taxsvdd05,
      author = {David Martinus Johannes Tax},
      Date-Added = {2007-05-19 14:29:53 +0200},
      Date-Modified = {2007-08-10 14:16:46 +0200},
      Keywords = {SVM, domain description, SVDD, novelty detection, one-class},
      School = {Technische Universiteit Delft},
      Title = {One-class classification: concept learning in the absence of counter-examples},
      Url = {http://www.ist.tudelft.nl/live/binaries/468f0bec-405d-4918-aae0-cada843d27f7/doc/thesis_dtax.pdf},
      Year = {2001},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEGQuLi8uLi8uLi9QYXBlcnMvVEFYL09uZS1jbGFzcyBjbGFzc2lmaWNhdGlvbiBjb25jZXB0IGxlYXJuaW5nIGluIHRoZSBhYnNlbmNlIG9mIGNvdW50ZXItZXhhbXBsZXMucGRm0hsPHB1XTlMuZGF0YU8RAngAAAAAAngAAgAACURvY3VtZW50cwAAAAAAAAAAAAAAAAAAAAAAAL7OeK5IKwAAADcjbR9PbmUtY2xhc3MgY2xhc3NpZmljYSMzMUREMUEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMd0awnR9xAAAAAAAAAAAAAMAAwAACQAAAAAAAAAAAAAAAAAAAAADVEFYAAAQAAgAAL7OXI4AAAARAAgAAMJ0YaQAAAABABQANyNtADcbgAAAsvIAABLGAAASrQACAE5Eb2N1bWVudHM6bmVtbzpEb2N1bWVudHM6VW5pdmVyc2l0YTpQYXBlcnM6VEFYOk9uZS1jbGFzcyBjbGFzc2lmaWNhIzMxREQxQS5wZGYADgCiAFAATwBuAGUALQBjAGwAYQBzAHMAIABjAGwAYQBzAHMAaQBmAGkAYwBhAHQAaQBvAG4AIABjAG8AbgBjAGUAcAB0ACAAbABlAGEAcgBuAGkAbgBnACAAaQBuACAAdABoAGUAIABhAGIAcwBlAG4AYwBlACAAbwBmACAAYwBvAHUAbgB0AGUAcgAtAGUAeABhAG0AcABsAGUAcwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAdi9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9UQVgvT25lLWNsYXNzIGNsYXNzaWZpY2F0aW9uIGNvbmNlcHQgbGVhcm5pbmcgaW4gdGhlIGFic2VuY2Ugb2YgY291bnRlci1leGFtcGxlcy5wZGYAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAREBFgEeA5oDnAOhA6oDtQO5A8cDzgPXA9wD3wAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAPs},
      Bdsk-Url-1 = {http://www.ist.tudelft.nl/live/binaries/468f0bec-405d-4918-aae0-cada843d27f7/doc/thesis_dtax.pdf}
    }

è però dimostrato che, usando il kernel Gaussiano, SVVD e nu-SVM danno luogo alle stesse soluzioni (stessa superficie di decisione), laddove si abbia la medesima larghezza del kernel e C=1/nu*N, dove C è il parametro di soft-constraint in SVDD, nu è il parametro introdotto da Schölkopf per le nu-SVM, N è il numero di oggetti.

È per questo motivo che la One-class classification implementata in libSVM

  • C. Chang and C. Lin, LIBSVM: A Library for Support Vector Machines, 2007.
    @misc{libsvm,
      author = {Chih-Chung Chang and Chih-Jen Lin},
      Date-Added = {2007-04-29 15:47:28 +0200},
      Date-Modified = {2007-11-04 17:28:20 +0100},
      Keywords = {svm, software},
      Note = {Manual available at url{http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf}},
      Title = {LIBSVM: A Library for Support Vector Machines},
      Url = {http://www.csie.ntu.edu.tw/~cjlin/libsvm/},
      Year = {2007},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEYuLi8uLi8uLi9QYXBlcnMvQ2hhbmcvTElCU1ZNIEEgTGlicmFyeSBmb3IgU3VwcG9ydCBWZWN0b3IgTWFjaGluZXMucGRm0hsPHB1XTlMuZGF0YU8RAh4AAAAAAh4AAgAACURvY3VtZW50cwAAAAAAAAAAAAAAAAAAAAAAAL7OeK5IKwAAADcjZh9MSUJTVk0gQSBMaWJyYXJ5IGZvciMzNThDRDEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANYzRwmc1OQAAAAAAAAAAAAMAAwAACQAAAAAAAAAAAAAAAAAAAAAFQ2hhbmcAABAACAAAvs5cjgAAABEACAAAwmcZGQAAAAEAFAA3I2YANxuAAACy8gAAEsYAABKtAAIAUERvY3VtZW50czpuZW1vOkRvY3VtZW50czpVbml2ZXJzaXRhOlBhcGVyczpDaGFuZzpMSUJTVk0gQSBMaWJyYXJ5IGZvciMzNThDRDEucGRmAA4AYgAwAEwASQBCAFMAVgBNACAAQQAgAEwAaQBiAHIAYQByAHkAIABmAG8AcgAgAFMAdQBwAHAAbwByAHQAIABWAGUAYwB0AG8AcgAgAE0AYQBjAGgAaQBuAGUAcwAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAWC9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9DaGFuZy9MSUJTVk0gQSBMaWJyYXJ5IGZvciBTdXBwb3J0IFZlY3RvciBNYWNoaW5lcy5wZGYAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPMA+AEAAyIDJAMpAzIDPQNBA08DVgNfA2QDZwAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAN0},
      Bdsk-Url-1 = {http://www.csie.ntu.edu.tw/~cjlin/libsvm/}
    }

può essere usata come algoritmo di training nel caso di clustering con SVM, poiché il Support Vector Clustering presuppone l’utilizzo di un kernel Gaussiano.

La libreria libSVM fornisce tuttavia gli strumenti necessari per implementare agevolmente anche SVDD, qualora lo si desideri.

May 10 2007

SMO per Unsupervised Learning

Sequential Minimal Optimization è l’algoritmo per la risoluzione del problema di programmazione quadratica per l’addestramento di una SVM. Esiste una variante di questo algoritmo per il caso non supervisionato.

Riferimenti:

  • B. Schölkopf, J. Platt, J. Shawe-Taylor, A. Smola, and R. Williamson, "Estimating the support of a high-dimensional distribution," Microsoft Research, Redmond, WA, 99–87, 1999.
    @techreport{sch99estimating, Address = {Redmond, WA},
      Author = {B. Sch”olkopf and J. Platt and J. Shawe-Taylor and A. Smola and R. Williamson},
      Date-Added = {2007-05-07 12:48:36 +0200},
      Date-Modified = {2007-06-19 13:05:29 +0200},
      Institution = {Microsoft Research},
      Keywords = {svm, SMO, one-class},
      Number = {99–87},
      Title = {Estimating the support of a high-dimensional distribution},
      Url = {http://citeseer.ist.psu.edu/251593.html},
      Year = {1999},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEFkuLi8uLi8uLi9QYXBlcnMvU2NoXCJvbGtvcGYvRXN0aW1hdGluZyB0aGUgc3VwcG9ydCBvZiBhIGhpZ2gtZGltZW5zaW9uYWwgZGlzdHJpYnV0aW9uLnBkZtIbDxwdV05TLmRhdGFPEQJYAAAAAAJYAAIAAAlEb2N1bWVudHMAAAAAAAAAAAAAAAAAAAAAAAC+zniuSCsAAAA3G40fRXN0aW1hdGluZyB0aGUgc3VwcG8jMzAzMDlGLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAwn8JkzQEAAAAAAAAAAAADAAMAAAkAAAAAAAAAAAAAAAAAAAAAC1NjaFwib2xrb3BmAAAQAAgAAL7OXI4AAAARAAgAAMJksOEAAAABABQANxuNADcbgAAAsvIAABLGAAASrQACAFZEb2N1bWVudHM6bmVtbzpEb2N1bWVudHM6VW5pdmVyc2l0YTpQYXBlcnM6U2NoXCJvbGtvcGY6RXN0aW1hdGluZyB0aGUgc3VwcG8jMzAzMDlGLnBkZgAOAHwAPQBFAHMAdABpAG0AYQB0AGkAbgBnACAAdABoAGUAIABzAHUAcABwAG8AcgB0ACAAbwBmACAAYQAgAGgAaQBnAGgALQBkAGkAbQBlAG4AcwBpAG8AbgBhAGwAIABkAGkAcwB0AHIAaQBiAHUAdABpAG8AbgAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAay9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9TY2hcIm9sa29wZi9Fc3RpbWF0aW5nIHRoZSBzdXBwb3J0IG9mIGEgaGlnaC1kaW1lbnNpb25hbCBkaXN0cmlidXRpb24ucGRmAAATABIvVm9sdW1lcy9Eb2N1bWVudHMAFQACABf//wAAgAbSHyAhIlgkY2xhc3Nlc1okY2xhc3NuYW1loyIjJF1OU011dGFibGVEYXRhVk5TRGF0YVhOU09iamVjdNIfICYnoickXE5TRGljdGlvbmFyeQAIABEAGwAkACkAMgBEAEkATABRAFMAXABiAGkAdAB8AIMAhgCIAIoAjQCPAJEAkwCgAKoBBgELARMDbwNxA3YDfwOKA44DnAOjA6wDsQO0AAAAAAAAAgEAAAAAAAAAKAAAAAAAAAAAAAAAAAAAA8E=},
      Bdsk-Url-1 = {http://citeseer.ist.psu.edu/251593.html}
    }
  • B. Schölkopf, R. C. Williamson, A. J. Smola, J. Shawe-Taylor, and J. Platt, "Support Vector Method for Novelty Detection," in Advances in Neural Information Processing Systems 12: Proceedings of the 1999 Conference, 2000.
    @inproceedings{scholkopf2000,
      author = {B. Sch”olkopf and R.C. Williamson and A.J. Smola and J. Shawe-Taylor and J. Platt},
      Booktitle = {Advances in Neural Information Processing Systems 12: Proceedings of the 1999 Conference},
      Date-Added = {2007-04-29 16:39:57 +0200},
      Date-Modified = {2007-08-10 14:18:50 +0200},
      Keywords = {SVM, clustering, SMO, one-class, novelty detection},
      Title = {Support Vector Method for Novelty Detection},
      Url = {http://axiom.anu.edu.au/~williams/papers/P126.pdf},
      Year = {2000},
      Bdsk-File-1 = {YnBsaXN0MDDUAQIDBAUGBwpZJGFyY2hpdmVyWCR2ZXJzaW9uVCR0b3BYJG9iamVjdHNfEA9OU0tleWVkQXJjaGl2ZXISAAGGoNEICVRyb290gAGoCwwXGBkaHiVVJG51bGzTDQ4PEBMWWk5TLm9iamVjdHNXTlMua2V5c1YkY2xhc3OiERKABIAFohQVgAKAA4AHXHJlbGF0aXZlUGF0aFlhbGlhc0RhdGFfEEsuLi8uLi8uLi9QYXBlcnMvU2NoXCJvbGtvcGYvU3VwcG9ydCBWZWN0b3IgTWV0aG9kIGZvciBOb3ZlbHR5IERldGVjdGlvbi5wZGbSGw8cHVdOUy5kYXRhTxECLgAAAAACLgACAAAJRG9jdW1lbnRzAAAAAAAAAAAAAAAAAAAAAAAAvs54rkgrAAAANxuNH1N1cHBvcnQgVmVjdG9yIE1ldGhvIzJGMDcxRC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAvBx3CWmzZAAAAAAAAAAAAAwADAAAJAAAAAAAAAAAAAAAAAAAAAAtTY2hcIm9sa29wZgAAEAAIAAC+zlyOAAAAEQAIAADCWlC5AAAAAQAUADcbjQA3G4AAALLyAAASxgAAEq0AAgBWRG9jdW1lbnRzOm5lbW86RG9jdW1lbnRzOlVuaXZlcnNpdGE6UGFwZXJzOlNjaFwib2xrb3BmOlN1cHBvcnQgVmVjdG9yIE1ldGhvIzJGMDcxRC5wZGYADgBgAC8AUwB1AHAAcABvAHIAdAAgAFYAZQBjAHQAbwByACAATQBlAHQAaABvAGQAIABmAG8AcgAgAE4AbwB2AGUAbAB0AHkAIABEAGUAdABlAGMAdABpAG8AbgAuAHAAZABmAA8AFAAJAEQAbwBjAHUAbQBlAG4AdABzABIAXS9uZW1vL0RvY3VtZW50cy9Vbml2ZXJzaXRhL1BhcGVycy9TY2hcIm9sa29wZi9TdXBwb3J0IFZlY3RvciBNZXRob2QgZm9yIE5vdmVsdHkgRGV0ZWN0aW9uLnBkZgAAEwASL1ZvbHVtZXMvRG9jdW1lbnRzABUAAgAX//8AAIAG0h8gISJYJGNsYXNzZXNaJGNsYXNzbmFtZaMiIyRdTlNNdXRhYmxlRGF0YVZOU0RhdGFYTlNPYmplY3TSHyAmJ6InJFxOU0RpY3Rpb25hcnkACAARABsAJAApADIARABJAEwAUQBTAFwAYgBpAHQAfACDAIYAiACKAI0AjwCRAJMAoACqAPgA/QEFAzcDOQM+A0cDUgNWA2QDawN0A3kDfAAAAAAAAAIBAAAAAAAAACgAAAAAAAAAAAAAAAAAAAOJ},
      Bdsk-Url-1 = {http://axiom.anu.edu.au/~williams/papers/P126.pdf}
    }

Molto probabilmente libSVM implementa già tale variante; infatti libSVM supporta la one-class classification (distribution estimation) e per tale tipo di problema è necessaria la stessa variante di SMO.

This blog is multi language by p.osting.it's Babel