randgen team mailing list archive
Message #00086
ExecutionTimeComparator is made more statistics-aware
To: randgen@xxxxxxxxxxxxxxxxxxx
From: "John H. Embretsen" <johnemb@xxxxxxxxx>
Date: Thu, 14 Apr 2011 16:16:32 +0200
User-agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:1.9.1.9) Gecko/20100318 Lightning/1.0b1 Thunderbird/3.0.4
Hi,
To those of you who may be using the ExecutionTimeComparator validator
in RQG testing:
I have just pushed a patch to the randgen repository which changes parts
of how this validator works. The default case should be more or less as
before, however, some of the tunable settings are slightly different,
and more advanced statistical measurements are now possible.
To summarize:
The default is still that the validator compares the execution times
from each of two servers for each query that is generated, and reports
those that have a difference (or ratio) above or below a given threshold.
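The default comparison could be sketched roughly as follows. This is an illustrative Python sketch of the idea only (RQG and the validator itself are written in Perl); the function name and the 1.5x threshold value are assumptions, not the validator's actual internals or defaults:

```python
# Illustrative sketch: flag a query whose execution-time ratio between
# two servers exceeds a threshold in either direction.
THRESHOLD_RATIO = 1.5  # assumed value; the real default may differ

def is_notable(time_server1, time_server2, threshold=THRESHOLD_RATIO):
    """Return True if the two timings differ by more than the
    given ratio threshold, in either direction."""
    if min(time_server1, time_server2) == 0:
        return False  # avoid division by zero for sub-resolution timings
    ratio = time_server1 / time_server2
    return ratio > threshold or ratio < 1 / threshold

# Example: 0.30 s vs 0.10 s is a ratio of about 3, above a 1.5x threshold.
print(is_notable(0.30, 0.10))  # True
```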
Previous extensions (pushed 2011-01-21) enabled the tester to tell the
validator to repeat each query a number of times and calculate average
numbers, e.g. to reduce the chance of false positives. This was tunable
via the QUERY_REPEATS setting, which has now been replaced by the
MIN_SAMPLES and MAX_SAMPLES settings.
If MIN_SAMPLES and/or MAX_SAMPLES is 0, only the original results will
be used. If MIN_SAMPLES and MAX_SAMPLES are both larger than 0, the
query will be repeated a fixed MAX_SAMPLES times if the
Statistics::Descriptive module is not available (read more below for details).
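The way the settings select a repeat count could be summarized like this. Again a Python sketch of the logic only (the validator is Perl, and the function name here is made up for illustration):

```python
def planned_repeats(min_samples, max_samples, stats_available):
    """Return a fixed number of repeats, or None when the adaptive
    (Statistics::Descriptive) path applies instead."""
    if min_samples == 0 or max_samples == 0:
        return 1  # use only the original result
    if not stats_available:
        return max_samples  # fixed repetition, no deviation check
    return None  # adaptive sampling between MIN_SAMPLES and MAX_SAMPLES
```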
NEW: Statistics::Descriptive:
=============================
A big change with this patch is in the cases where the module
Statistics::Descriptive is available in the Perl runtime.
In that case, if MIN_SAMPLES and MAX_SAMPLES are both set (greater than 0),
the query will be repeated at least MIN_SAMPLES times (or at least
twice if MIN_SAMPLES = 1) and at most MAX_SAMPLES times. The mean value
of these samples will be used as the result for each server. However, if
the standard deviation of the samples for a query is above a given
threshold (MAX_DEVIATION) after MAX_SAMPLES samples, the result is
discarded.
The MAX_DEVIATION threshold is relative, and is given in terms of a
percentage of the mean value. The higher the threshold, the more
unstable the results that will be accepted.
The idea behind the MIN/MAX samples approach is that the standard
deviation and the statistical significance of the result may improve if
we have more samples.
The standard deviation is used as an indication of how widely dispersed
the measurements are. If the standard deviation is below the threshold
after MIN_SAMPLES samples or more, the result is deemed stable enough
and returned for further validation (comparison).
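The adaptive sampling loop described above could be sketched as follows. This is a Python illustration of the algorithm, not the validator's Perl code; `run_query`, the function name, and the use of Python's `statistics` module (in place of Statistics::Descriptive) are all stand-ins for illustration. It assumes timings are positive:

```python
import statistics

def measure_with_deviation_check(run_query, min_samples, max_samples,
                                 max_deviation_pct):
    """Repeat a timing measurement until the relative standard deviation
    (stddev as a percentage of the mean) drops below max_deviation_pct.

    Returns the mean of the samples, or None if the result is still too
    unstable after max_samples samples (the result is then discarded).
    """
    # At least two samples are needed to compute a standard deviation,
    # matching the "at least twice if MIN_SAMPLES = 1" rule.
    floor = max(min_samples, 2)
    samples = []
    for _ in range(max_samples):
        samples.append(run_query())
        if len(samples) >= floor:
            mean = statistics.mean(samples)
            rel_dev = 100 * statistics.stdev(samples) / mean
            if rel_dev <= max_deviation_pct:
                return mean  # stable enough; use the mean as the result
    return None  # too unstable after max_samples samples: discard
```

A perfectly stable timer returns after the minimum number of samples, while a noisy one keeps sampling up to MAX_SAMPLES before the result is discarded.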
Relative standard deviations for each notable query will be included in
the output file that is generated, if applicable.
If the --debug option is given to the RQG, more statistical details are
written to the output when each query is validated (Warning: Output can
be huge if the number of queries is large).
I hope this will be of value, and further improvements are of course
welcome.
--
John