<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Thesis Neminis &#187; Software</title>
	<atom:link href="http://thesis.neminis.org/category/software/feed/" rel="self" type="application/rss+xml" />
	<link>http://thesis.neminis.org</link>
	<description>Work log of Vincenzo Russo’s thesis</description>
	<lastBuildDate>Mon, 04 Apr 2011 09:06:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Knowledge Discovery portal</title>
		<link>http://thesis.neminis.org/2008/01/09/knowledge-discovery-portal/</link>
		<comments>http://thesis.neminis.org/2008/01/09/knowledge-discovery-portal/#comments</comments>
		<pubDate>Wed, 09 Jan 2008 13:27:55 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Classification]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Dataset]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Text Mining]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2008/01/09/knowledge-discovery-portal/</guid>
		<description><![CDATA[KDnuggets.com (KD stands for Knowledge Discovery) is the leading source of information on Data Mining, Web Mining, Knowledge Discovery, and Decision Support Topics, including News, Software, Solutions, Companies, Jobs, Courses, Meetings, Publications, and more. Go to KDnuggets.com]]></description>
			<content:encoded><![CDATA[<p>KDnuggets.com (KD stands for Knowledge Discovery) is the leading source of information on Data Mining, Web Mining, Knowledge Discovery, and Decision Support Topics, including News, Software, Solutions, Companies, Jobs, Courses, Meetings, Publications, and more.</p>
<p><a href="http://www.kdnuggets.com/index.html">Go to KDnuggets.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2008/01/09/knowledge-discovery-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating Vector Models from Text Documents</title>
		<link>http://thesis.neminis.org/2008/01/07/creating-vector-models-from-text-documents/</link>
		<comments>http://thesis.neminis.org/2008/01/07/creating-vector-models-from-text-documents/#comments</comments>
		<pubDate>Mon, 07 Jan 2008 11:59:46 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Text Mining]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2008/01/07/creating-vector-models-from-text-documents/</guid>
		<description><![CDATA[MC is a C++ program that creates vector-space models from text documents that can be used for text mining applications. MC provides an efficient multi-threaded implementation that can process very large document collections. For example, MC took 1,189 seconds using &#8230; <a href="http://thesis.neminis.org/2008/01/07/creating-vector-models-from-text-documents/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cs.utexas.edu/users/dml/software/mc/index.html">MC is a C++ program</a> that creates vector-space models from<br />
text documents that can be used for text mining applications. MC provides<br />
an efficient multi-threaded implementation that can process very<br />
large document collections. For example, MC took 1,189 seconds using<br />
only 17.5 MBytes of main memory to process a sample collection of<br />
about 114,000 documents (the experiment was run on a Sun Ultra10<br />
workstation). More details on MC and its use in a fast clustering<br />
algorithm are available in<br />
<a href="http://www.cs.utexas.edu/users/inderjit/public_papers/effclus.ps.gz">this paper</a>.</p>
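<p>To make the term &#8220;vector-space model&#8221; concrete, here is a minimal sketch in Python. It is not MC's implementation or interface: the function name <code>build_vector_model</code>, the whitespace tokenization and the raw term counts are all assumptions chosen for illustration.</p>

```python
from collections import Counter


def build_vector_model(documents):
    """Turn raw text documents into term-count vectors over a shared vocabulary.

    Hypothetical helper for illustration only; MC itself is a C++ program
    and its actual processing is not reproduced here.
    """
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in documents]

    # The vocabulary fixes the meaning of each vector dimension.
    vocabulary = sorted({word for words in tokenized for word in words})

    # One vector per document: how often each vocabulary word occurs in it.
    vectors = []
    for words in tokenized:
        counts = Counter(words)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


vocabulary, vectors = build_vector_model(["text mining text", "vector models"])
print(vocabulary)
print(vectors)
```

<p>Each document becomes one row whose columns follow the sorted vocabulary, which is the kind of matrix a clustering algorithm can then consume.</p>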
<p><a href="http://www.cs.utexas.edu/users/dml/software/mc/index.html">Download</a></p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2008/01/07/creating-vector-models-from-text-documents/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Co-clustering software</title>
		<link>http://thesis.neminis.org/2007/12/03/co-clustering-softwares/</link>
		<comments>http://thesis.neminis.org/2007/12/03/co-clustering-softwares/#comments</comments>
		<pubDate>Mon, 03 Dec 2007 11:45:13 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Co-clustering]]></category>
		<category><![CDATA[Software]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/12/03/co-clustering-softwares/</guid>
		<description><![CDATA[The first co-clustering package is Co-cluster, developed at the University of Texas at Austin. The version you can download here is 1.1, which is also available at the original web page. The package hosted here includes a patch to &#8230; <a href="http://thesis.neminis.org/2007/12/03/co-clustering-softwares/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The first co-clustering package is <a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/Software_cocluster.tar.bz2">Co-cluster</a>, developed at the University of Texas at Austin. The version you can download here is 1.1, which is also available <a href="http://www.cs.utexas.edu/users/dml/Software/cocluster.html">at the original web page</a>.</p>
<p>The package hosted here includes a patch that lets the software compile with gcc 4.0, and therefore on modern Linux and Mac OS X systems. It also contains some bash scripts (*.sh) that analyze co-clustering results and produce clustering quality measures with respect to labeled datasets.</p>
<p>The original software is released under the GPL, and so is this package.</p>
<p><strong>Download</strong></p>
<p><a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/Software_cocluster.tar.bz2">Co-clustering code</a></p>
<hr />
<p>The original version of the second co-clustering package is available <a href="http://www.cs.utexas.edu/~hntuyen/projects/dm/">here</a>. It implements all six co-clustering approximation schemes, for both Euclidean distance and I-divergence.</p>
<p>The package hosted here also includes the same bash scripts as the Co-cluster package above.</p>
<p>No license information was included in the original Bregman co-clustering package, but it appears to be a fork of the <a href="http://www.cs.utexas.edu/users/dml/Software/coclusterOld.html">Co-cluster software v. 1.0</a>. The latter was released under the GPL, so the Bregman co-clustering code should be under the same license.</p>
<p><strong>Download</strong></p>
<p><a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/bregmanCocluster.tar.bz2">Bregman Co-clustering code</a></p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/12/03/co-clustering-softwares/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Support Vector Clustering Code</title>
		<link>http://thesis.neminis.org/2007/12/03/support-vector-clustering-code/</link>
		<comments>http://thesis.neminis.org/2007/12/03/support-vector-clustering-code/#comments</comments>
		<pubDate>Mon, 03 Dec 2007 11:11:55 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[SVC]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/12/03/support-vector-clustering-code/</guid>
		<description><![CDATA[UPDATE 18th of Feb, 2008: the official page of this software is now located at my official website. Here is the preliminary alpha source code for Support Vector Clustering. It implements Cone Cluster Labeling for the cluster &#8230; <a href="http://thesis.neminis.org/2007/12/03/support-vector-clustering-code/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><strong>UPDATE 18th of Feb, 2008</strong>: the official page of this software is now <a href="http://neminis.org/software/support-vector-clustering/">located at my official website</a>.</p>
<hr />
<p><a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/2007-12-03-svc.tar.bz2">Here</a> is the preliminary alpha source code for Support Vector Clustering. It implements Cone Cluster Labeling for the cluster assignment step.</p>
<p>It also implements the secant-like kernel width generator.</p>
<p>SVM training is performed by the <a href="http://www.csie.ntu.edu.tw/~cjlin/libsvm/">LIBSVM</a> library, whereas the graph utilities are provided by the <a href="http://www.boost.org/">Boost</a> <a href="http://www.boost.org/libs/graph/doc/table_of_contents.html">Graph Library</a>. The license terms of both libraries allow their source code to be redistributed, so the package you download contains everything you need to compile the code: just type &#8220;make&#8221; in the source root directory.</p>
<p>For more information, take a look at the README directory you find once you have unpacked the tarball.</p>
<p><strong>Download</strong></p>
<p><a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/2007-12-03-svc.tar.bz2">SVC Source Code</a> &#8211; <a href="http://thesis.neminis.org/svcdoc/html/">SVC Doxygen documentation</a></p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/12/03/support-vector-clustering-code/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Multivariate Data Analysis Software and Resources</title>
		<link>http://thesis.neminis.org/2007/11/18/multivariate-data-analysis-software-and-resources/</link>
		<comments>http://thesis.neminis.org/2007/11/18/multivariate-data-analysis-software-and-resources/#comments</comments>
		<pubDate>Sun, 18 Nov 2007 15:54:03 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Software]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/11/18/multivariate-data-analysis-software-and-resources/</guid>
		<description><![CDATA[A collection of software for multivariate data analysis is available here.]]></description>
			<content:encoded><![CDATA[<p>A collection of software for multivariate data analysis is available <a href="http://astro.u-strasbg.fr/~fmurtagh/mda-sw/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/11/18/multivariate-data-analysis-software-and-resources/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SVC Software Documentation</title>
		<link>http://thesis.neminis.org/2007/09/24/svc-software-documentation/</link>
		<comments>http://thesis.neminis.org/2007/09/24/svc-software-documentation/#comments</comments>
		<pubDate>Mon, 24 Sep 2007 21:47:59 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[SVC]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/09/24/svc-software-documentation/</guid>
		<description><![CDATA[The documentation (generated with Doxygen) of the C++ implementation of Support Vector Clustering (SVC) has been published in the &#8220;Documents&#8221; section. View Documentation]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://thesis.neminis.org/svcdoc/html/">documentation</a> (generated with Doxygen) of the C++ implementation of Support Vector Clustering (SVC) has been published in the &#8220;<a href="http://thesis.neminis.org/documenti/">Documents</a>&#8221; section.</p>
<p><a href="http://thesis.neminis.org/svcdoc/html/">View Documentation</a></p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/09/24/svc-software-documentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SVC: Gaussian Kernel Width Generator</title>
		<link>http://thesis.neminis.org/2007/08/02/svc-gaussian-kernel-width-generator/</link>
		<comments>http://thesis.neminis.org/2007/08/02/svc-gaussian-kernel-width-generator/#comments</comments>
		<pubDate>Thu, 02 Aug 2007 12:55:51 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Kernel Width Estimation]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[SVC]]></category>
		<category><![CDATA[Test]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/08/02/svc-gaussian-kernel-width-generator/</guid>
		<description><![CDATA[For Support Vector Clustering there is only one method for generating suitable values of the Gaussian kernel width, and it is presented in S. Lee and K. M. Daniels, &#34;Gaussian Kernel Width Generator for Support Vector Clustering,&#34; in &#8230; <a href="http://thesis.neminis.org/2007/08/02/svc-gaussian-kernel-width-generator/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>For Support Vector Clustering there is only one method for generating suitable values of the Gaussian kernel width, and it is presented in</p>
<ul>
<li>
<div><a href="#gauskergenerator2004" class="toggle">bibtex</a>  | <a href='http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf' title='Go to document'>link</a></div>
<div>S. Lee and K. M. Daniels, &quot;Gaussian Kernel Width Generator for Support Vector Clustering,&quot; in <em>Advances in Bioinformatics and Its Applications</em>,  2005, pp. 151-162.</div>
<div class="bibtex" id="gauskergenerator2004">
         <code>@inproceedings{gauskergenerator2004, <br />
 &nbsp;&nbsp;author = {Sei-Hyung Lee and Karen M. Daniels}, <br />
 &nbsp; Booktitle = {Advances in Bioinformatics and Its Applications}, <br />
 &nbsp; Date-Added = {2007-10-23 17:21:17 +0200}, <br />
 &nbsp; Date-Modified = {2007-10-23 17:21:17 +0200}, <br />
 &nbsp; Editor = {Matthew He and Giri Narasimhan and Sergei Petoukhov}, <br />
 &nbsp; Keywords = {SVM, clustering, gaussian kernel}, <br />
 &nbsp; Pages = {151--162}, <br />
 &nbsp; Title = {Gaussian Kernel Width Generator for Support Vector Clustering}, <br />
 &nbsp; Url = {http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf}, <br />
 &nbsp; Volume = {8}, <br />
 &nbsp; Year = {2005}, <br />
 &nbsp; Bdsk-Url-1 = {http://www.cs.uml.edu/~kdaniels/papers/ICBA.pdf}<br />
}</code>
    </div>
</li>
</ul>
<p>which we will refer to as GKWG from here on.</p>
<p>The method relies on the same quadratic programming problem that SVC itself is based on, combined with a secant algorithm.</p>
<p>The authors use this method to generate a list of kernel width values before running Support Vector Clustering. Once the list is available, SVC is run for each value in the list until the stopping criterion is met.</p>
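<p>The list-then-sweep procedure described above can be sketched as follows. This is only an illustration under stated assumptions: <code>sweep_kernel_widths</code>, <code>run_svc</code> and <code>should_stop</code> are hypothetical names, and the toy stand-ins at the bottom replace the real QP-based clustering and the real stopping criterion, neither of which is reproduced here.</p>

```python
def sweep_kernel_widths(widths, run_svc, should_stop):
    """Run SVC once per pre-generated width until the stopping criterion holds.

    widths      -- list of kernel width values produced beforehand
    run_svc     -- callable(width) returning a clustering result (hypothetical)
    should_stop -- callable(result) returning True when the result is accepted
    Returns (accepted width, its result, number of SVC runs performed).
    """
    runs = 0
    for width in widths:
        result = run_svc(width)  # in the real procedure: one full QP solve
        runs += 1
        if should_stop(result):
            return width, result, runs
    return None, None, runs


# Toy stand-ins (assumptions): the "result" is just a cluster count looked up
# in a table, and we stop as soon as at least three clusters are found.
toy_cluster_counts = {0.01: 1, 0.05: 2, 0.1: 3, 0.5: 7}
outcome = sweep_kernel_widths(
    [0.01, 0.05, 0.1, 0.5],
    run_svc=toy_cluster_counts.get,
    should_stop=lambda clusters: clusters == 3 or clusters == 7,
)
print(outcome)
```

<p>Every iteration calls <code>run_svc</code> once, so each additional width in the list costs one more complete clustering run.</p>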
<p>This approach has two drawbacks:</p>
<ul>
<li>The generated list of values may be longer than necessary</li>
<li>It adds considerable complexity to the whole clustering process, because solving the quadratic programming problem is expensive (O(N^3) in theory; O(m*N^2) in practice with the SMO algorithm) and it is solved once for every kernel width value generated.</li>
</ul>
<p>In our case, we managed to merge the GKWG process with the clustering process, reusing the computations that the cluster description stage of SVC has to perform anyway. We have therefore eliminated both drawbacks.</p>
<p><strong>Results</strong></p>
<p>Besides lowering the total computational complexity, our GKWG implementation also yields better results than those reported in the literature, so far only on the IRIS Data Set (see <a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/svc-experiments-results.pdf">preliminary SVC experiments</a>).</p>
<p>On the full IRIS data set with all its features we reached an accuracy of <strong>92.6667%</strong> (<a href="http://thesis.neminis.org/2007/07/10/svc-politica-per-classificazione-bsv/">previously</a> we had stopped at 90%), thanks to the kernel width value obtained from the GKWG. Better results have been reached in the literature only by reducing the feature space from 4D to 3D or 2D, via PCA or Sammon's nonlinear mapping. Staying in 4D, all the works in the literature obtain a lower accuracy than ours.</p>
<p><strong>IRIS test details</strong></p>
<p>Total time taken: 0.01 s</p>
<p>Kernel Width: 0.0917017<br />
Number of clusters: 3<br />
Number of non-singleton clusters: 3<br />
Number of singleton clusters: 0<br />
Points per cluster:<br />
        Cluster 0: 50<br />
        Cluster 1: 55<br />
        Cluster 2: 45</p>
<p>Class 0<br />
        TP: 50  FP: 0<br />
        FN: 0   TN: 100</p>
<p>Precision: 100 &#8211; Recall: 100 &#8211; F1: 100</p>
<p>Class 1<br />
        TP: 47  FP: 8<br />
        FN: 3   TN: 92</p>
<p>Precision: 85.4545 &#8211; Recall: 94 &#8211; F1: 89.5238</p>
<p>Class 2<br />
        TP: 42  FP: 3<br />
        FN: 8   TN: 97</p>
<p>Precision: 93.3333 &#8211; Recall: 84 &#8211; F1: 88.4211</p>
<p>Macroaveraging: 92.6483<br />
Accuracy: 92.6667</p>
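<p>The figures above can be re-derived from the per-class confusion counts. The following Python sketch does so; the helper name <code>class_metrics</code> is hypothetical, and reading &#8220;Macroaveraging&#8221; as the unweighted mean of the three F1 scores is an inference from the numbers, not something the program output states.</p>

```python
def class_metrics(tp, fp, fn):
    """Return (precision, recall, F1) in percent for one class."""
    precision = 100.0 * tp / (tp + fp)
    recall = 100.0 * tp / (tp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# (TP, FP, FN) per class, copied from the test output above.
counts = {0: (50, 0, 0), 1: (47, 8, 3), 2: (42, 3, 8)}
total_points = 150  # size of the IRIS data set (50 + 55 + 45 clustered points)

metrics = {cls: class_metrics(*c) for cls, c in counts.items()}

# Assumption: the macro-average is the plain mean of the per-class F1 scores.
macro_f1 = sum(m[2] for m in metrics.values()) / len(metrics)

# Accuracy: correctly assigned points (the true positives) over all points.
accuracy = 100.0 * sum(c[0] for c in counts.values()) / total_points

for cls, (precision, recall, f1) in metrics.items():
    print(f"Class {cls}: precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
print(f"Macro-averaged F1: {macro_f1:.4f}")
print(f"Accuracy: {accuracy:.4f}")
```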
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/08/02/svc-gaussian-kernel-width-generator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SVC: policy for BSV classification</title>
		<link>http://thesis.neminis.org/2007/07/10/svc-politica-per-classificazione-bsv/</link>
		<comments>http://thesis.neminis.org/2007/07/10/svc-politica-per-classificazione-bsv/#comments</comments>
		<pubDate>Mon, 09 Jul 2007 23:09:30 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[SVC]]></category>
		<category><![CDATA[SVM]]></category>
		<category><![CDATA[Test]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/07/10/svc-politica-per-classificazione-bsv/</guid>
		<description><![CDATA[The Cluster Assignment algorithm used, like all the others proposed in the literature, does not explicitly handle the classification of Bounded Support Vectors, i.e. those points that, because of the value of the soft margin constant, end up outside the sphere of &#8230; <a href="http://thesis.neminis.org/2007/07/10/svc-politica-per-classificazione-bsv/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The Cluster Assignment algorithm used, like all the others proposed in the literature, does not explicitly handle the classification of Bounded Support Vectors (BSVs), i.e. those points that, because of the value of the soft margin constant, end up outside the domain description sphere even though they actually belong to one of the classes of the problem.</p>
<p>Cone Cluster Labeling consists of two steps:</p>
<ul>
<li>classification of the SVs</li>
<li>classification of all the other points relative to the SVs</li>
</ul>
<p>which in effect includes the BSVs among &#8220;all the other points&#8221;.</p>
<p>We chose to modify the algorithm as follows:</p>
<ul>
<li>classification of the SVs</li>
<li>classification of all the other points (except the BSVs) relative to the SVs</li>
<li>classification of the BSVs relative to all the other points already classified</li>
</ul>
<p>On the IRIS data set, <strong>this modification raised the accuracy from 89.333% to 90%</strong>.</p>
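<p>The modified ordering can be sketched in a few lines of Python. This is a simplified illustration, not the actual implementation: a plain nearest-neighbour rule stands in for the cone-based test of Cone Cluster Labeling, and the names <code>nearest_label</code> and <code>assign_clusters</code> are hypothetical.</p>

```python
import math


def nearest_label(point, labeled):
    """Label of the closest (point, label) pair, by Euclidean distance."""
    closest = min(labeled, key=lambda item: math.dist(point, item[0]))
    return closest[1]


def assign_clusters(points, sv_labels, bsv_indices):
    """Three-step labeling: SVs, then ordinary points, then BSVs.

    points      -- list of coordinate tuples
    sv_labels   -- dict {index: label} for the support vectors (step 1 result)
    bsv_indices -- set of indices of the bounded support vectors
    Assumption: nearest-neighbour assignment replaces the real cone test.
    """
    labels = dict(sv_labels)  # step 1: the SVs are already classified
    sv_refs = [(points[i], label) for i, label in sv_labels.items()]

    # Step 2: every point that is neither an SV nor a BSV follows the SVs.
    for i, point in enumerate(points):
        if i not in labels and i not in bsv_indices:
            labels[i] = nearest_label(point, sv_refs)

    # Step 3: BSVs follow all points classified so far (SVs and ordinary
    # points); BSVs do not influence one another.
    refs = [(points[i], label) for i, label in labels.items()]
    for i in sorted(bsv_indices):
        labels[i] = nearest_label(points[i], refs)
    return labels


points = [(0.0, 0.0), (10.0, 0.0), (4.8, 0.0), (9.0, 0.0), (5.4, 0.0)]
labels = assign_clusters(points, sv_labels={0: "A", 1: "B"}, bsv_indices={4})
print(labels)
```

<p>In this toy data the BSV at (5.4, 0) is closer to the SV of cluster B than to the SV of cluster A, yet it ends up in A because its nearest already-classified neighbour, (4.8, 0), belongs to A. That is the kind of case in which classifying BSVs last can change the outcome.</p>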
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/07/10/svc-politica-per-classificazione-bsv/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SVC Preliminary Experiments</title>
		<link>http://thesis.neminis.org/2007/07/04/svc-preliminary-experiments/</link>
		<comments>http://thesis.neminis.org/2007/07/04/svc-preliminary-experiments/#comments</comments>
		<pubDate>Wed, 04 Jul 2007 07:42:58 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Dataset]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[SVC]]></category>
		<category><![CDATA[SVM]]></category>
		<category><![CDATA[Test]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/07/04/svc-preliminary-experiments/</guid>
		<description><![CDATA[The PDF with the configurations used for the tests and the related results is available for download in the Documents section, together with the ZIP archive containing the data sets used for the experiments.]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/svc-experiments-results.pdf">PDF</a> with the configurations used for the tests and the related results is available for download in the <a href="http://thesis.neminis.org/documenti/"><em>Documents</em></a> section, together with the <a href="http://thesis.neminis.org/wp-content/plugins/downloads-manager/upload/datasets-preliminary-coclustering.zip">ZIP archive</a> containing the data sets used for the experiments.</p>
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/07/04/svc-preliminary-experiments/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Co-clustering &#8211; Synthetic Dataset Test #1</title>
		<link>http://thesis.neminis.org/2007/06/22/co-clustering-synthetic-dataset-test-1/</link>
		<comments>http://thesis.neminis.org/2007/06/22/co-clustering-synthetic-dataset-test-1/#comments</comments>
		<pubDate>Fri, 22 Jun 2007 15:40:56 +0000</pubDate>
		<dc:creator>vincenzo russo</dc:creator>
				<category><![CDATA[Bregman]]></category>
		<category><![CDATA[Co-clustering]]></category>
		<category><![CDATA[Dataset]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Test]]></category>

		<guid isPermaLink="false">http://thesis.neminis.org/2007/06/22/co-clustering-synthetic-dataset-test-1/</guid>
		<description><![CDATA[Machine used: PowerPC G4, 1.5GHz, 768MB RAM, Mac OS X Software used: Dataset used: The dataset used in this test is a synthetic dataset, generated with The dataset is composed as follows: Objects: 1000 Attributes: 10 Classes: 5, for a &#8230; <a href="http://thesis.neminis.org/2007/06/22/co-clustering-synthetic-dataset-test-1/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><strong>Machine used:</strong><br />
PowerPC G4, 1.5GHz, 768MB RAM, Mac OS X</p>
<p><strong>Software used:</strong></p>
<ul>
</ul>
<p><strong>Dataset used:</strong><br />
The dataset used in this test is a synthetic dataset, generated with</p>
<ul>
</ul>
<p>The dataset is composed as follows:<br />
Objects: 1000<br />
Attributes: 10<br />
Classes: 5, for a total of 888 points (Cluster 0: 327, Cluster 1: 134, Cluster 2: 162, Cluster 3: 132, Cluster 4: 133)<br />
<strong>Noise points: 112 (points that cannot be classified)</strong></p>
<p><strong>Co-clustering algorithm used:</strong> Euclidean Distance Based, Minimum Sum Squared, Information Theoretic</p>
<p><strong>Problems:</strong> In this first test, run on a <em>noisy</em> dataset, the co-clustering scheme does not seem designed to identify noise and separate it from the rest of the classification. As a result, every co-clustering instance tends to assign the noise points to one of the five requested classes, skewing the results.</p>
<p><strong>Noise point removal:</strong> After removing the noise points we obtained a dataset of 888 points, and the algorithm (Euclidean Distance Based, with 5 co-clusters requested) separated the 5 classes perfectly, with no errors, in the following time:<br />
  User     = 0 second(s) 138552 ms<br />
  System   = 0 second(s) 6630 ms<br />
  Time/Run = 0.138552 second(s)</p>
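<p>The noise-removal step can be illustrated with a short Python sketch that also checks the dataset composition listed above. How the noise points are marked in the real dataset is not stated here, so the marker <code>NOISE = -1</code> and the helper <code>drop_noise</code> are assumptions made purely for illustration.</p>

```python
NOISE = -1  # hypothetical marker for points that belong to no class


def drop_noise(rows, labels, noise_label=NOISE):
    """Return the rows and labels with every noise-labeled point removed."""
    kept = [(row, label) for row, label in zip(rows, labels) if label != noise_label]
    return [row for row, _ in kept], [label for _, label in kept]


# Class sizes and noise count as listed in the dataset description.
cluster_sizes = {0: 327, 1: 134, 2: 162, 3: 132, 4: 133}
noise_points = 112

# Build a label list with that composition; rows are placeholders (indices).
labels = [cls for cls, size in cluster_sizes.items() for _ in range(size)]
labels += [NOISE] * noise_points
rows = list(range(len(labels)))

clean_rows, clean_labels = drop_noise(rows, labels)
print(len(rows), len(clean_rows))
```

<p>With these counts the full dataset has 1000 objects and the cleaned one has 888, matching the figures reported in the post.</p>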
]]></content:encoded>
			<wfw:commentRss>http://thesis.neminis.org/2007/06/22/co-clustering-synthetic-dataset-test-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

