Theobald, Martin and Bast, Holger and Majumdar, Debapriyo and Schenkel, Ralf and Weikum, Gerhard (2008) TopX: Efficient and Versatile Top-k Query Processing for Semistructured Data. The VLDB Journal, 17 (2). pp. 81-115.
Full text not available from this repository.Abstract
Recent IR extensions to XML query languages such as Xpath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions for dy\-namic query relaxation with controllable influence on the result ranking. The main contributions of this paper unfold into four main points: 1) fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, 2) efficient and effective top-k query processing for semistructured data, 3) support for integrating thesauri and ontologies with statistically quantified relationships among concepts, leveraged for word-sense disambiguation and łinebreak query expansion, and 4) a comprehensive description of the TopX system, with performance experiments on large-scale corpora like TREC Terabyte and INEX Wikipedia.
Item Type: | Article |
---|---|
Subjects: | DBIS Research > Publications |
Divisions: | Faculty of Engineering, Electronics and Computer Science > Institute of Databases and Informations Systems > DBIS Research and Teaching > DBIS Research > Publications |
Depositing User: | Prof. Dr. Martin Theobald |
Date Deposited: | 28 Aug 2015 19:49 |
Last Modified: | 28 Aug 2015 19:49 |
URI: | http://dbis.eprints.uni-ulm.de/id/eprint/1256 |