Learning Word-to-Concept Mappings for Automatic Text Classification

Ifrim, Georgiana and Theobald, Martin and Weikum, Gerhard (2005) Learning Word-to-Concept Mappings for Automatic Text Classification. In: 22nd International Conference on Machine Learning.

Full text not available from this repository.


For both classification and retrieval of natural language text documents, the standard document representation is a term vector, where a term is simply a morphological normal form of the corresponding word. A potentially better approach would be to map every word onto a concept, that is, its proper word sense, and to use this additional information in the learning process. In this paper we address the problem of automatically classifying natural language text documents. We investigate the effect of word-to-concept mappings and word sense disambiguation techniques on classification accuracy. We use the WordNet thesaurus as a background knowledge base and propose a generative language model approach to document classification. We show experimental results comparing the performance of our model with Naive Bayes and SVM classifiers.

Item Type: Conference or Workshop Item (Paper)
Subjects: DBIS Research > Publications
Divisions: Faculty of Engineering, Electronics and Computer Science > Institute of Databases and Information Systems > DBIS Research and Teaching > DBIS Research > Publications
Depositing User: Prof. Dr. Martin Theobald
Date Deposited: 09 Sep 2015 19:59
Last Modified: 09 Sep 2015 19:59
URI: http://dbis.eprints.uni-ulm.de/id/eprint/1277
