
maria-developers team mailing list archive

Re: [GSoC] Optimize mysql-test-runs - Results of new strategy

 

Hi Elena,

On Thu, Jul 24, 2014 at 8:06 PM, Elena Stepanova <elenst@xxxxxxxxxxxxxxxx>
wrote:

> Hi Pablo,
>
> Okay, thanks for the update.
>
> As I understand, the last two graphs were for the new strategy taking into
> account all edited files, no branch/platform, no time factor?
>

- Yes, this is the new strategy. It uses the 'co-occurrence' of code file
edits and test failures, together with a weighted average of failures.
- No time factor.
- No per-branch or per-platform scores are kept. The data for each test is
the same regardless of platform.
- However, when calculating relevance, we use the failures that occurred in
the last run as a parameter, and the last run does depend on branch and
platform.
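
To make the scoring described above more concrete, here is a rough sketch of
how such a relevance value could be computed. It is not the code from the
tree: the names (RelevanceModel, record_run, relevance), the weight used for
the weighted average, and the way the three signals are added together are
all assumptions made for illustration.

```python
from collections import defaultdict


class RelevanceModel:
    """Illustrative sketch only; names, weight and formula are assumptions."""

    def __init__(self, weight=0.5):
        # Assumed weight given to older results in the weighted average.
        self.weight = weight
        # cooccurrence[file][test] = number of runs in which `file` was
        # edited and `test` failed. Kept globally, not per branch/platform.
        self.cooccurrence = defaultdict(lambda: defaultdict(int))
        # Weighted average of failures per test, shared across platforms.
        self.failure_avg = defaultdict(float)

    def record_run(self, edited_files, executed_tests, failed_tests):
        """Update the model with the outcome of one test run."""
        failed = set(failed_tests)
        for test in executed_tests:
            outcome = 1.0 if test in failed else 0.0
            self.failure_avg[test] = (
                self.weight * self.failure_avg[test]
                + (1.0 - self.weight) * outcome
            )
        for path in edited_files:
            for test in failed:
                self.cooccurrence[path][test] += 1

    def relevance(self, test, edited_files, last_run_failures):
        """Score a test for the next run (assumed additive combination).

        `last_run_failures` is the set of tests that failed in the previous
        run on the same branch and platform, which is the only place where
        branch/platform enters the calculation.
        """
        score = self.failure_avg.get(test, 0.0)
        for path in edited_files:
            score += self.cooccurrence.get(path, {}).get(test, 0)
        if test in last_run_failures:
            score += 1.0
        return score
```

With something like this, the tests for a run would be sorted by relevance()
and only the highest-scoring ones executed; how many to keep is a separate
parameter not shown here.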


> Also, if it's not too long and if it's possible with your current code,
> can you run the old strategy on the same exact data, learning/running set,
> and input files, so that we could clearly see the difference?
>

I have not yet incorporated the input file list logic into the old strategy,
but I will work on it; hopefully it will be ready by tomorrow.


> I suppose your new tree does not include the input lists? Are you using
> the raw log files, or have you pre-processed them and made clean lists? If
> you are using the raw files, did you rename them?
>

It does not include them.

I am using the raw files. I included a tiny shell script (downlaod_files.sh)
that you can execute to download and decompress the files into the directory
where the program looks by default.
Also, I forgot to change this before uploading: in basic_testcase.py, you
need to remove the file_dir parameter passed to s.wrapper(), so that the
program falls back to looking for the files in the default directory.

Regards
Pablo
