October 14, 2007

[OT] Star/galaxy separation via SVM/CVM classification

We have used some astrophysics star/galaxy datasets for our clustering problems because their clusters overlap heavily.

Here we present some results of SVM classification on the same datasets. In fact, star/galaxy (S/G) separation is usually tackled as a supervised problem.

We have used a simple SVM/CVM classifier with a linear kernel (K(x,y) = x' * y), so the decision boundary is linear in the input features.
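For reference, the linear kernel is simply the dot product of the two feature vectors. Here is a minimal sketch in plain Python; the names `linear_kernel` and `gram_matrix` are ours for illustration and do not come from any particular SVM/CVM package.

```python
def linear_kernel(x, y):
    """Linear kernel K(x, y) = x' * y, i.e. the dot product."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a * b for a, b in zip(x, y))


def gram_matrix(points, kernel=linear_kernel):
    """Kernel matrix G[i][j] = K(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


if __name__ == "__main__":
    # Two toy 2-D feature vectors (made-up values, not from the datasets).
    pts = [[1.0, 2.0], [3.0, 4.0]]
    print(gram_matrix(pts))
```

A kernel classifier only ever sees the data through this matrix, which is why swapping the kernel function is enough to change the model.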

For each dataset, we have used 5% of the items as the training set; the remaining 95% form the test set.
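A 5%/95% split of this kind can be sketched as follows. This is only an illustration: the use of a random shuffle, the fixed seed and the function name `split_train_test` are our assumptions here, not a description of the exact procedure behind the numbers in this post.

```python
import random


def split_train_test(items, train_fraction=0.05, seed=0):
    """Shuffle the items and return (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    # Assumption: a plain random shuffle with a fixed seed for repeatability.
    random.Random(seed).shuffle(shuffled)
    n_train = max(1, round(len(shuffled) * train_fraction))
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Item indices stand in for the 2500 objects of the smallest dataset.
    train, test = split_train_test(range(2500))
    print(len(train), len(test))
```

With 2500 items this gives 125 training items and 2375 test items.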

Datasets:

Longo 01, 2500 items, 2000 stars, 500 galaxies
Longo 02, 9816 items, 2935 stars, 6883 galaxies
Longo 03, 10940 items, 2978 stars, 7964 galaxies

Accuracy results with SVM:

Longo 01: 95%
Longo 02: 98.0746%
Longo 03: 97.925%

Accuracy results with CVM:

Longo 01: 94.98%
Longo 02: 97.5%
Longo 03: 95.2%

Other kernels could probably lead to better results, but first it is necessary to understand how to tune the hyperparameters, such as the kernel width and the soft margin constant.
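One common way to approach such tuning is a plain grid search over candidate values. The sketch below takes a Gaussian (RBF) kernel as one example of a kernel with a width parameter and leaves the scoring function to the caller. The names `rbf_kernel`, `grid_search` and `toy_score` are hypothetical, and the toy score is a placeholder, not a trained classifier; nothing here was run on the datasets above.

```python
import itertools
import math


def rbf_kernel(x, y, width):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * width^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def grid_search(score, widths, soft_margins):
    """Return (best_score, width, C) over all (width, C) pairs.

    `score(width, C)` is supplied by the caller and should return a
    validation accuracy, e.g. from cross-validation on the training set.
    Returns None if either list of candidates is empty.
    """
    best = None
    for width, c in itertools.product(widths, soft_margins):
        s = score(width, c)
        if best is None or s > best[0]:
            best = (s, width, c)
    return best


if __name__ == "__main__":
    # Placeholder score with a known peak at width=1, C=10. It only
    # exercises the search loop and is not a real classifier.
    def toy_score(width, c):
        return -(math.log10(width) ** 2 + (math.log10(c) - 1.0) ** 2)

    grid = [10.0 ** k for k in range(-2, 3)]  # 0.01, 0.1, 1, 10, 100
    print(grid_search(toy_score, grid, grid))
```

In practice `score` would train the classifier with the given width and soft margin constant on part of the training set and measure accuracy on the held-out part, so that the test set is never used for tuning.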
