And if, let's say, we decide that N=100 (or N=10%) is the best cutoff
value, and then find out that by not filling the queue completely we
lose even 1% in recall, we might want to stay with the full queue. What
is the time difference between running 50 tests and 100 tests? Almost
nothing, especially compared to what we spend on preparing the
tests.
I don't think this project will determine what the "best" value is. It
can only find the "best set of model parameters" for recall as a
function of cutoff (here "best set" means that, for any given value of
cutoff in the target recall range, every other set of parameters yields
lower recall).
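
To illustrate that dominance criterion, here is a minimal Python
sketch. The parameter-set names and the recall numbers are invented
purely for illustration; nothing here is measured data:

    # Sketch: pick the parameter set whose recall(cutoff) curve dominates
    # the others over the target recall range. All numbers are invented.
    # cutoff = fraction of the test queue that is actually run.

    cutoffs = [0.05, 0.10, 0.25, 0.50]

    recall_curves = {
        "params_A": {0.05: 0.80, 0.10: 0.90, 0.25: 0.93, 0.50: 0.95},
        "params_B": {0.05: 0.75, 0.10: 0.85, 0.25: 0.92, 0.50: 0.95},
    }

    def dominates(a, b, target=(0.80, 0.95)):
        """a dominates b if, at every cutoff where either curve's recall
        falls inside the target range, a's recall is at least b's."""
        lo, hi = target
        relevant = [c for c in cutoffs
                    if lo <= a[c] <= hi or lo <= b[c] <= hi]
        return all(a[c] >= b[c] for c in relevant)

    best = [name for name, curve in recall_curves.items()
            if all(dominates(curve, other)
                   for other_name, other in recall_curves.items()
                   if other_name != name)]
    print(best)  # -> ['params_A']: never below params_B in the range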
In other words, this project will deliver a function recall(cutoff).
And then we can decide what cutoff we want and how many failures we can
afford to miss. For example, 80% recall might be achieved in 5% of the
time, 90% recall - in 10% of the time, 95% recall - in 50% of the time,
etc.
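
Once recall(cutoff) is delivered, choosing a cutoff is mechanical. A
small sketch, reusing the example numbers above (which are
hypothetical, not measurements):

    # Sketch: given the delivered recall(cutoff) data, find the smallest
    # cutoff (fraction of time spent) that reaches a chosen recall target.

    recall_at = {0.05: 0.80, 0.10: 0.90, 0.50: 0.95, 1.00: 1.00}

    def min_cutoff(target_recall):
        """Smallest fraction of the queue to run to hit target_recall."""
        feasible = [c for c, r in sorted(recall_at.items())
                    if r >= target_recall]
        return feasible[0] if feasible else None

    print(min_cutoff(0.90))  # 0.10 -> run 10% of the queue
    print(min_cutoff(0.95))  # 0.50 -> the "only 50% speedup" case below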
And then you can say, "no, we cannot miss more than 5% of failures, so
we'll have to live with only a 50% speedup". But no experiment will
tell you how many failures are acceptable for us to miss.
Regards,
Sergei