| Title: | A FRAMEWORK FOR TOPIC CATEGORIZATION OF XML DOCUMENTS USING SUPPORT VECTOR MACHINES | |
| DOI No: | 10.1142/9781860948534_0057 | |
| Source: | INNOVATIVE APPLICATIONS OF INFORMATION TECHNOLOGY FOR THE DEVELOPING WORLD (pp 367-371) | |
| Author(s): | K. G. SRINIVASA (Department of CSE, Bangalore University, Bangalore, India); S. SHARATH (Department of CSE, Bangalore University, Bangalore, India); K. R. VENUGOPAL (Department of CSE, Bangalore University, Bangalore, India); L. M. PATNAIK (Microprocessor Applications Laboratory, IISc, Bangalore, India) | |
| Abstract: | Extensible Markup Language (XML) has emerged as a medium for interoperability over the Internet. As the number of documents published as XML grows, there is a need to categorize XML documents into specific user interest categories. Manual categorization is not feasible given the sheer number of XML documents available on the Internet. In this paper, we present a machine learning approach to topic categorization that uses a multi-class Support Vector Machine (SVM) to exploit the semantic content of XML documents. The SVM is supplemented by a feature selection technique that extracts the useful features. Experimental evaluations over a wide range of XML documents indicate that the proposed approach significantly improves the accuracy and efficiency of the topic categorization task. | |
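
The pipeline summarized in the abstract (extract content from XML documents, select useful features, classify with a multi-class SVM) can be illustrated with a minimal, standard-library sketch. The abstract does not name the feature selection technique or the SVM training procedure, so everything here is an assumption made for illustration: chi-square scoring for feature selection, a one-vs-rest linear SVM trained by hinge-loss subgradient descent, and the hypothetical helper names `xml_tokens`, `select_features`, `vectorize`, `train_ovr_svm`, and `predict`. The toy documents are invented and are not the paper's data.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def xml_tokens(xml_text):
    """Return lowercased word tokens from every text node of one XML document."""
    root = ET.fromstring(xml_text)
    return re.findall(r"[a-z]+", " ".join(root.itertext()).lower())


def select_features(token_lists, labels, k):
    """Assumed technique: keep the k terms with the highest chi-square score."""
    n = len(token_lists)
    doc_sets = [set(tokens) for tokens in token_lists]
    vocab = sorted(set().union(*doc_sets))
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in set(labels):
            # 2x2 contingency table: term present/absent vs. in class/not in class.
            a = sum(term in s and y == cls for s, y in zip(doc_sets, labels))
            b = sum(term in s and y != cls for s, y in zip(doc_sets, labels))
            c = sum(term not in s and y == cls for s, y in zip(doc_sets, labels))
            d = n - a - b - c
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    # Ties are broken alphabetically so the selection is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


def vectorize(tokens, features):
    """Term counts over the selected features, plus a constant bias input."""
    counts = Counter(tokens)
    return [float(counts[f]) for f in features] + [1.0]


def train_ovr_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """One linear SVM per class (one-vs-rest), hinge-loss subgradient descent."""
    dim = len(X[0])
    weights = {}
    for cls in sorted(set(y)):
        w = [0.0] * dim
        for _ in range(epochs):
            for x, label in zip(X, y):
                t = 1.0 if label == cls else -1.0
                margin = t * sum(wi * xi for wi, xi in zip(w, x))
                for i in range(dim):
                    hinge_grad = -t * x[i] if margin < 1 else 0.0
                    w[i] -= lr * (lam * w[i] + hinge_grad)
        weights[cls] = w
    return weights


def predict(weights, x):
    """Pick the class whose linear score is highest."""
    return max(sorted(weights),
               key=lambda cls: sum(wi * xi for wi, xi in zip(weights[cls], x)))


# Invented toy corpus, for illustration only.
TRAIN = [
    ("sports", "<doc><title>Football match</title><body>The team won the football match</body></doc>"),
    ("sports", "<doc><body>The football team scored a goal</body></doc>"),
    ("finance", "<doc><body>The bank raised the interest rate</body></doc>"),
    ("finance", "<doc><title>Bank report</title><body>The bank cut the interest rate</body></doc>"),
    ("science", "<doc><body>The telescope observed a distant galaxy</body></doc>"),
    ("science", "<doc><body>A new galaxy was seen by the telescope</body></doc>"),
]

labels = [label for label, _ in TRAIN]
token_lists = [xml_tokens(xml) for _, xml in TRAIN]
features = select_features(token_lists, labels, k=7)
X = [vectorize(tokens, features) for tokens in token_lists]
model = train_ovr_svm(X, labels)

query = "<doc><body>The bank changed its rate</body></doc>"
predicted = predict(model, vectorize(xml_tokens(query), features))
print(features, predicted)
```

In this sketch the feature selection step discards words such as "the" that occur in every category, so the classifier only sees terms that discriminate between topics. A real system would likely also use structural information (tag names and paths) and a tuned SVM library rather than this hand-rolled trainer.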