Hi,
It is probably known to those of you using the Transformer validator
that some query transforms are not comparable to the original in every
case. One such case is the ExecuteAsUnion transforms when using
TRANSFORM_OUTCOME_DISTINCT against queries using varchar columns.
With this transform, results from the original query (e.g. SELECT
`col_varchar_10_latin1` FROM L) are compared to results of transformed
queries using UNION. For example,
( SELECT `col_varchar_10_latin1` FROM L )
UNION ALL ( SELECT `col_varchar_10_latin1` FROM L )
/* TRANSFORM_OUTCOME_SUPERSET */;
where the /* TRANFORM_OUTCOME_SUPERSET */ signals to the validator that
results are OK if results from the transformed query is a superset of
the original query. This is all well.
On the other hand, one other type of transformed UNION query is:
( SELECT `col_varchar_10_latin1` FROM L )
UNION ( SELECT `col_varchar_10_latin1` FROM L )
/* TRANSFORM_OUTCOME_DISTINCT */;
where the /* TRANFORM_OUTCOME_DISTINCT */ signals to the validator that
results are OK if results from the transformed query after removing
duplicates is equal to the results of the original query.
This works OK in most cases, but when dealing with text values in a
column with a case-insensitive collation (default in MySQL) the
validator reports false positives. This is because the validator's idea
of "DISTINCT" is not equal to the the database server's idea of
DISTINCT. The issue is visible with transforms such as "ExecuteAsUnion"
and "Distinct".