Ifrim, Georgiana and Theobald, Martin and Weikum, Gerhard (2005) Learning Word-to-Concept Mappings for Automatic Text Classification. In: 22nd International Conference on Machine Learning.
Full text not available from this repository.
Abstract
For both classification and retrieval of natural language text documents, the standard document representation is a term vector, where a term is simply a morphological normal form of the corresponding word. A potentially better approach would be to map every word onto a concept (its proper word sense) and use this additional information in the learning process. In this paper we address the problem of automatically classifying natural language text documents. We investigate the effect of word-to-concept mappings and word sense disambiguation techniques on classification accuracy. We use the WordNet thesaurus as a background knowledge base and propose a generative language model approach to document classification. We show experimental results comparing the performance of our model with Naive Bayes and SVM classifiers.
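The contrast between a plain term vector and a concept-level representation can be illustrated with a small sketch. This is not the generative model proposed in the paper; it is a toy example under stated assumptions: the `TOY_SENSES` table, the cue-word disambiguation rule, and the concept labels are all hypothetical stand-ins for what a thesaurus such as WordNet and a real word sense disambiguation method would provide.

```python
from collections import Counter

# Hypothetical sense inventory: word -> {context cue word: concept label}.
# A real system would draw senses from a thesaurus such as WordNet.
TOY_SENSES = {
    "bank": {
        "money": "bank#financial_institution",
        "river": "bank#river_side",
    },
}


def term_vector(tokens):
    """Standard representation: counts of the (normalized) words themselves."""
    return Counter(tokens)


def concept_vector(tokens):
    """Concept representation: ambiguous words are replaced by a concept label.

    Disambiguation here is a deliberately naive assumption: pick the first
    sense whose cue word occurs anywhere in the document, otherwise keep
    the word unchanged.
    """
    context = set(tokens)
    concepts = []
    for tok in tokens:
        senses = TOY_SENSES.get(tok)
        if senses is None:
            concepts.append(tok)
            continue
        chosen = next((c for cue, c in senses.items() if cue in context), tok)
        concepts.append(chosen)
    return Counter(concepts)


doc_a = "deposit money at the bank".split()
doc_b = "fish from the river bank".split()

# Both documents share the term "bank", but map to different concepts.
print(term_vector(doc_a)["bank"], term_vector(doc_b)["bank"])
print(sorted(concept_vector(doc_a)))
print(sorted(concept_vector(doc_b)))
```

In the term vectors both documents share the feature `bank`, whereas the concept vectors separate the two senses, which is the kind of additional information the abstract suggests feeding into the learning process.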
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | DBIS Research > Publications |
Divisions: | Faculty of Engineering, Electronics and Computer Science > Institute of Databases and Information Systems > DBIS Research and Teaching > DBIS Research > Publications |
Depositing User: | Prof. Dr. Martin Theobald |
Date Deposited: | 09 Sep 2015 19:59 |
Last Modified: | 09 Sep 2015 19:59 |
URI: | http://dbis.eprints.uni-ulm.de/id/eprint/1277 |