Run Run Shaw Library, City University of Hong Kong

Please use this identifier to cite or link to this item: http://dspace.cityu.edu.hk/handle/2031/619
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chow, Man Chung | en_US
dc.date.accessioned | 2006-01-26T02:28:20Z | en_US
dc.date.accessioned | 2007-05-14T06:53:44Z |
dc.date.accessioned | 2017-09-19T09:10:48Z |
dc.date.accessioned | 2019-02-12T07:32:47Z | -
dc.date.available | 2006-01-26T02:28:20Z | en_US
dc.date.available | 2007-05-14T06:53:44Z |
dc.date.available | 2017-09-19T09:10:48Z |
dc.date.available | 2019-02-12T07:32:47Z | -
dc.date.issued | 2005 | en_US
dc.identifier.other | 2005eecmc759 | en_US
dc.identifier.uri | http://144.214.8.231/handle/2031/619 | -
dc.description.abstract | Real-world classification tasks involve different types of attributes, such as categorical, nominal and numerical data. Classifiers can handle categorical and nominal values, but not all classifiers can handle numerical data. If a classifier can handle numerical data, it performs a discretization before running any classification task; decision trees are a representative example. Decision trees play an important role in classification tasks and run efficiently. In this project, I implemented three different heuristic discretization methods that aim to increase the classification accuracy of decision trees. I empirically evaluated them on more than twenty datasets. All experiments were conducted under the same computational environment. “Weka”, a popular and efficient machine learning tool, was used as a benchmark to measure the classification accuracy of different algorithms, including on the non-discretized numerical datasets. The results show that classification accuracy can be retained or improved after discretization is applied. In addition, the proposed algorithms were found not only to enhance the efficiency of decision tree classifiers but also to increase clustering accuracy. This corroborates my argument that, with the aid of an appropriate discretization method, classification accuracy can be increased in both supervised and unsupervised classification. To demonstrate the benefits of the proposed algorithms, an “MP3 players” survey, designed to identify and study interesting data such as customer behavior, was conducted. The classification results on the survey data indicate that improved accuracy was achieved after applying the developed discretization method. Thus, it is believed that the proposed methods are applicable to many real-life problems. | en_US
dc.format.extent | 164 bytes |
dc.format.mimetype | text/html |
dc.language.iso | en_US |
dc.rights | This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner. | en_US
dc.rights | Access is restricted to CityU users. | en_US
dc.title | Data conversion from numerical to nominal data for classification and clustering | en_US
dc.contributor.department | Department of Electronic Engineering | en_US
dc.description.supervisor | Prof. Chow, Tommy W S. Assessor: Dr. Tang, K S | en_US
Appears in Collections: Electrical Engineering - Undergraduate Final Year Projects
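
The abstract describes converting numerical attributes into nominal ones before classification, but this record does not specify the three heuristic methods the project implemented. As a generic illustration of the idea only, the sketch below uses plain equal-width binning, which is a standard baseline and is not claimed to be one of the author's methods. The function names, the sample `ages` values, and the labels are all hypothetical.

```python
from bisect import bisect_right


def equal_width_cut_points(values, n_bins):
    """Return the n_bins - 1 interior cut points that split
    [min(values), max(values)] into intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(values, cut_points, labels):
    """Map each numeric value to the nominal label of its interval.

    A value equal to a cut point is assigned to the upper interval.
    """
    return [labels[bisect_right(cut_points, v)] for v in values]


# Hypothetical numerical attribute (not data from the project).
ages = [18, 22, 25, 31, 38, 47, 52, 60]
cuts = equal_width_cut_points(ages, 3)
nominal = discretize(ages, cuts, ["low", "mid", "high"])
print(cuts)
print(nominal)
```

Equal-width binning ignores class labels, so it is an unsupervised baseline; a supervised heuristic would instead choose cut points using the class distribution.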

Files in This Item:
File | Size | Format
fulltext.html | 164 B | HTML


Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.
