maria-developers team mailing list archive

DBT-3/TPC-H RQG tests are now available

 

Hello,

In response to popular demand, the DBT-3 dataset will be used for testing along with a new set of grammars that generate queries against that dataset.

While I personally doubt the realism of the DBT-3 dataset, it being 99% random, here is what we have so far:

1. DBT-3 datasets for scales 0.1, 0.01 and 0.001

2. RQG grammars that implement the following:

- a grammar that exercises the range optimizer via single-table queries against the lineitem table. The WHERE clause consists of nested AND and OR expressions of varying depth, where each individual expression involves an indexed column and is generated to be realistic. E.g. for a date column, we generate expressions that filter records for a specific month or a specific year.

- a grammar for general join tests - realistic multi-table joins are generated by following the star structure of the dataset. The WHERE and HAVING conditions are generated to be realistic with respect to the columns being queried or filtered on. GROUP BY is used in a way that complies with the ONLY_FULL_GROUP_BY mode.

- (forthcoming) - subquery tests where subqueries that return a currency value are used in the various locations and expressions within the query where a currency value would generally be expected.
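
To illustrate the range optimizer grammar described above, here is a minimal Python sketch of how nested AND/OR conditions over date columns could be generated. It is not the RQG grammar itself, which is written in RQG's own grammar format. The function names, the choice of date columns, the assumption that those columns are indexed, and the 1992-1998 year range are all assumptions made for the example.

```python
import random

# Assumed set of indexed date columns on lineitem; the real grammar may differ.
DATE_COLUMNS = ["l_shipdate", "l_commitdate", "l_receiptdate"]


def date_predicate(rng):
    """Return a filter that selects one whole year or one whole month."""
    col = rng.choice(DATE_COLUMNS)
    year = rng.randint(1992, 1998)  # assumed date range of the dataset
    if rng.random() < 0.5:
        return f"({col} BETWEEN '{year}-01-01' AND '{year}-12-31')"
    month = rng.randint(1, 12)
    # A half-open range avoids having to compute the last day of the month.
    if month == 12:
        next_year, next_month = year + 1, 1
    else:
        next_year, next_month = year, month + 1
    return (f"({col} >= '{year}-{month:02d}-01' AND "
            f"{col} < '{next_year}-{next_month:02d}-01')")


def where_expr(rng, depth):
    """Build a nested AND/OR tree whose leaves are date predicates."""
    if depth == 0:
        return date_predicate(rng)
    op = rng.choice(["AND", "OR"])
    # Each side picks its own depth, so the tree is of varying depth.
    left = where_expr(rng, rng.randint(0, depth - 1))
    right = where_expr(rng, rng.randint(0, depth - 1))
    return f"({left} {op} {right})"


def range_query(seed, max_depth=3):
    """Return one single-table query; the same seed gives the same query."""
    rng = random.Random(seed)
    return f"SELECT * FROM lineitem WHERE {where_expr(rng, max_depth)}"


if __name__ == "__main__":
    for seed in range(3):
        print(range_query(seed))
```

Seeding the generator makes each query reproducible, which helps when a failing query needs to be regenerated for a bug report.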

In addition, I have reviewed the actual queries from the benchmark and I think the RQG grammars more or less cover the scenarios described in the specification.

From the test runs performed so far, no new bugs have been discovered compared to the existing RQG grammars that use purely random queries against purely random data.

Philip Stoev