Abstract
Top-k dominating queries combine the natural idea of selecting the k best items with a comprehensive “goodness” criterion based on dominance. A point p1 dominates p2 if p1 is at least as good as p2 in all attributes and strictly better in at least one. Existing works address the problem in settings where data objects are multidimensional points. However, there are domains where we only have access to the distance between two objects. In such cases, attributes reflect distances from a set of input objects and are dynamically generated as the input objects change. Consequently, prior work cannot be applied, even though the dominance relation remains meaningful and valid. We therefore present the first study of processing top-k dominating queries over distance-based dynamic attribute vectors, defined over a metric space. We propose four progressive algorithms that utilize the properties of the underlying metric space to solve the problem efficiently, and present an extensive comparative evaluation on both synthetic and real-world datasets.
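To make the problem statement concrete, the following is a minimal brute-force sketch of a top-k dominating query over distance-based attribute vectors. It is not one of the four progressive algorithms proposed in the paper, which are not described in the abstract; it is only a naive quadratic baseline. The names `dominates`, `top_k_dominating`, `objects`, `queries`, and `dist` are hypothetical, and the sketch assumes that a smaller distance is "better" and that an object's score is the number of other objects it dominates.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no larger than b in every attribute and strictly smaller in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def top_k_dominating(
    objects: Sequence[T],
    queries: Sequence[T],
    dist: Callable[[T, T], float],
    k: int,
) -> List[Tuple[T, int]]:
    """Naive O(n^2) baseline (assumption: smaller distances are preferred)."""
    # Attribute vector of an object = its distances to the query objects.
    # The vectors depend on `queries`, so they are rebuilt for each query set.
    vectors = [tuple(dist(o, q) for q in queries) for o in objects]
    # Dominance score = number of objects whose vector is dominated.
    scores = [sum(dominates(v, w) for w in vectors) for v in vectors]
    # Highest score first; ties broken by input order.
    ranked = sorted(range(len(objects)), key=lambda i: (-scores[i], i))
    return [(objects[i], scores[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Toy metric space: points on a line with absolute difference as distance.
    result = top_k_dominating([0, 2, 5, 9], [1, 4], lambda a, b: abs(a - b), k=2)
    print(result)  # [(2, 2), (0, 1)]
```

The sketch only ever calls `dist`, never object coordinates, which mirrors the setting where nothing but pairwise distances is available. It evaluates every distance and every pair of vectors, so it serves only as a reference point for correctness.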