Advancing the Discovery of Unique Column Combinations
Author | : Ziawasch Abedjan |
Publisher | : Universitätsverlag Potsdam |
Total Pages | : 30 |
Release | : 2011 |
ISBN-10 | : 3869561483 |
ISBN-13 | : 9783869561486 |
Advancing the Discovery of Unique Column Combinations was written by Ziawasch Abedjan and published by Universitätsverlag Potsdam. It was released in 2011 and has 30 pages. Book excerpt: Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem and has many different data management and knowledge discovery applications. Existing discovery algorithms are either brute force or have a high memory load and can thus be applied only to small datasets or samples. In this paper, the well-known GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution HCAGORDIAN combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations.
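To make the definition in the excerpt concrete, the following is a minimal Python sketch of a level-wise, bottom-up search for minimal unique column combinations. It is not GORDIAN, HCA, or HCAGORDIAN as described in the paper. It only illustrates the general idea of checking small column sets first and skipping supersets of combinations already known to be unique. The function names `is_unique` and `minimal_uniques` and the toy table are assumptions made for illustration.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if projecting rows onto cols yields no duplicate tuples."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_uniques(rows, num_cols):
    """Level-wise search for minimal unique column combinations.

    Illustrative only: candidates are enumerated exhaustively per level,
    without the candidate generation or statistics-based pruning that the
    paper proposes.
    """
    found = []
    for size in range(1, num_cols + 1):
        for cols in combinations(range(num_cols), size):
            # A superset of a known unique is unique but not minimal: skip it.
            if any(set(u) <= set(cols) for u in found):
                continue
            if is_unique(rows, cols):
                found.append(cols)
    return found


# Hypothetical toy table with columns (first_name, last_name, city).
rows = [
    ("Ann", "Lee", "Berlin"),
    ("Ann", "Kim", "Berlin"),
    ("Bob", "Lee", "Potsdam"),
    ("Bob", "Kim", "Berlin"),
]
result = minimal_uniques(rows, 3)
print(result)  # [(0, 1)]
```

In this toy table no single column is unique, and the only minimal unique combination is (first_name, last_name). The exhaustive enumeration grows exponentially with the number of columns, which is the kind of cost the excerpt attributes to brute-force approaches.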