Set Predicates in SQL: Enabling Set- Level Comparisons for Dynamically Formed Groups
In data warehousing and OLAP applications, scalar level predicates in SQL become increasingly inadequate to support a class of operations that require set-level comparison semantics, i.e., comparing a group of tuples with multiple values. Currently, complex SQL queries composed by scalar-level operations are often formed to obtain even very simple set-level semantics. Such queries are not only difficult to write but also challenging for a database engine to optimize, thus can result in costly evaluation. This paper proposes to augment SQL with set predicate, to bring out otherwise obscured set-level semantics. We studied two approaches to processing set predicates—an aggregate function-based approach and a bitmap index-based approach. Moreover, we designed a histogram-based probabilistic method of set predicate selectivity estimation, for optimizing queries with multiple predicates. The experiments verified its accuracy and effectiveness in optimizing queries.