Top-k Query Evaluation with Probabilistic Guarantees

Theobald, Martin and Weikum, Gerhard and Schenkel, Ralf (2004) Top-k Query Evaluation with Probabilistic Guarantees. In: 30th International Conference on Very Large Databases (VLDB 2004).

Full text not available from this repository.


Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating top-k queries is Fagin’s threshold algorithm (TA). Since the user’s goal behind top-k queries is to identify one or a few relevant and novel data items, it is intriguing to use approximative variants of TA to reduce run-time costs. This paper introduces a family of approximative top-k algorithms based on probabilistic arguments. When scanning index lists of the underlying multidimensional data space in descending order of local scores, various forms of convolution and derived bounds are employed to predict when it is safe, with high probability, to drop candidate items and to prune the index scans. The precision and the efficiency of the developed methods are experimentally evaluated based on a large Web corpus and a structured data collection.
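
For orientation, the exact baseline that the abstract refers to, Fagin's threshold algorithm, can be sketched roughly as follows. This is a minimal illustration and not the paper's probabilistic method: it implements none of the convolution-based pruning. The function name `threshold_algorithm`, the use of summation as the aggregation function, and the treatment of items missing from a list as having score 0 are all assumptions made for this example.

```python
import heapq


def threshold_algorithm(index_lists, k):
    """Exact top-k under summed scores (baseline TA, illustrative sketch).

    index_lists: one list per dimension, each holding (item, local_score)
    pairs sorted by local_score in descending order.
    Assumption: an item absent from a list scores 0 in that dimension.
    """
    if k <= 0:
        return []
    # Dictionaries stand in for random access to each index list.
    lookups = [dict(lst) for lst in index_lists]
    seen = set()
    top = []  # min-heap of (total_score, item), at most k entries
    max_depth = max((len(lst) for lst in index_lists), default=0)
    for depth in range(max_depth):
        threshold = 0.0  # upper bound on the total of any unseen item
        for lst in index_lists:
            if depth >= len(lst):
                continue  # exhausted list contributes 0 to the bound
            item, score = lst[depth]
            threshold += score
            if item in seen:
                continue
            seen.add(item)
            # Random access: fetch the item's score in every dimension.
            total = sum(lk.get(item, 0.0) for lk in lookups)
            if len(top) < k:
                heapq.heappush(top, (total, item))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, item))
        # Stop once the k-th best total is at least the threshold.
        if len(top) == k and top[0][0] >= threshold:
            break
    ranked = sorted(top, key=lambda pair: (-pair[0], pair[1]))
    return [(item, total) for total, item in ranked]


if __name__ == "__main__":
    lists = [
        [("a", 0.9), ("b", 0.8), ("c", 0.1)],
        [("b", 0.7), ("c", 0.6), ("a", 0.2)],
    ]
    for item, total in threshold_algorithm(lists, 2):
        print(item, round(total, 3))
```

On the sample lists the sketch ranks item "b" ahead of item "a". The approximative variants described in the abstract would replace the conservative stopping test with a probabilistic prediction of whether a candidate can still reach the top k, at the price of a small chance of missing a true result.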

Item Type: Conference or Workshop Item (Paper)
Subjects: DBIS Research > Publications
Divisions: Faculty of Engineering, Electronics and Computer Science > Institute of Databases and Informations Systems > DBIS Research and Teaching > DBIS Research > Publications
Depositing User: Prof. Dr. Martin Theobald
Date Deposited: 09 Sep 2015 20:03
Last Modified: 09 Sep 2015 20:03
