maria-developers team mailing list archive

Re: For Google Summer of Code 2014, Interested in the task of "statistically optimize mysql-test runs by running less tests"

To: Sergei Golubchik <serg@xxxxxxxxxxx>
From: 胡仲义 <sunnyddhzy@xxxxxxxxx>
Date: Mon, 17 Mar 2014 01:38:19 +0800
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20140313183037.GA12787@meddwl.fritz.box>

Hi, Sergei Golubchik!

I am afraid I don't understand the following items very well. Could you
explain them to me?

   - average over different combinations or builders
   - or don't average and treat triplets (test,combination,builder) as
   individual "tests"
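
One possible reading of the two options, as a minimal sketch. The record layout, the test, combination and builder names, and the helper `failure_rates` are all hypothetical and only illustrate the difference between pooling runs per test and keeping each (test, combination, builder) triplet separate; this is not taken from the buildbot data or from MDEV-5776.

```python
from collections import defaultdict

# Hypothetical past runs: (test, combination, builder, failed)
records = [
    ("t1", "innodb", "linux-x86", True),
    ("t1", "innodb", "win-x64", False),
    ("t1", "myisam", "linux-x86", False),
    ("t1", "myisam", "win-x64", False),
    ("t2", "innodb", "linux-x86", False),
    ("t2", "innodb", "win-x64", False),
]

def failure_rates(records, key):
    """Return the failure rate per group, where key(record) names the group."""
    runs = defaultdict(int)
    fails = defaultdict(int)
    for rec in records:
        group = key(rec)
        runs[group] += 1
        fails[group] += int(rec[3])
    return {group: fails[group] / runs[group] for group in runs}

# Option 1: average over combinations and builders (one rate per test).
averaged = failure_rates(records, key=lambda r: r[0])

# Option 2: treat each (test, combination, builder) triplet as its own "test".
per_triplet = failure_rates(records, key=lambda r: r[:3])
```

In this toy data the pooled rate for `t1` hides the fact that all of its failures come from a single combination on a single builder, which the per-triplet view keeps visible.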


Regards,
Zhongyi Hu


2014-03-14 2:30 GMT+08:00 Sergei Golubchik <serg@xxxxxxxxxxx>:

> Hi, Zhongyi Hu!
>
> On Mar 14, Zhongyi Hu wrote:
> > Dear Sergei Golubchik,
> >
> > I am a postgraduate student at the Institute of Software, Chinese
> > Academy of Sciences, and my name is Zhongyi Hu.
> > I major in computer science and my research field is data stream
> > mining.  Because I already have enough papers and work for
> > graduation, I want to do something interesting, meaningful and
> > valuable in my remaining time as a student.
>
> I see. That's very nice :)
>
> > I have participated in two database projects: one on a main-memory
> > database and the other on a database cluster.  I gained some
> > experience of database system design and implementation from them.
> > Although I am just a beginner in this area, I really like it and
> > hope to make it my career.  I often use MySQL in research and work,
> > but I am not very familiar with MariaDB.  I am tremendously
> > optimistic about its future because of all of you.
> >
> > Well, let's come to the point.  I am interested in the task of
> > "statistically optimize mysql-test runs by running less tests".  I
> > chose this task because I have written a few tools for automated
> > testing.  I know that performance is very important when there is a
> > large amount of data or many cases to test.
>
> This task won't make you familiar with database system design or
> implementation. For this task it doesn't matter whether tests are
> database tests, unit tests, or something completely different. As far as
> this task is concerned, they're abstract units of work that can be
> executed in arbitrary order and they can "succeed" or "fail", and the
> goal is to execute as few of these "tests" as possible, while detecting
> as many "failures" as possible.
>
> > I read MDEV-5776 and I think the major job is as follows.
> >
> > When the code is changed, mysql-test is used to run the requisite
> > tests.  We need to combine the information about the changes and the
> > scenarios to predict the probability of failure for each test and to
> > find the relationships between the tests.
> > Then we decide what to test and which test cases to use.  The
> > purpose is to make testing more efficient.  All of this should
> > be done by an algorithm and a program.
>
> Yes. But it's also useful to take into account the historical data -
> what tests failed before and where.
>
> In my experiments historical data were most important (I've got good
> results purely from statistical analysis of historical data), and the
> information about what files were changed didn't improve the results
> much. But perhaps I was doing it wrong?
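
A minimal sketch of selecting tests purely from historical data, assuming the history is available as (test name, passed) pairs. The test names, the `budget` parameter and both function names are hypothetical, and ranking by raw failure rate is only one simple choice, not necessarily the analysis used in the experiments mentioned above.

```python
from collections import Counter

def rank_tests_by_failure_rate(history):
    """history: iterable of (test_name, passed) pairs from past runs."""
    runs = Counter()
    failures = Counter()
    for test_name, passed in history:
        runs[test_name] += 1
        if not passed:
            failures[test_name] += 1
    rates = {name: failures[name] / runs[name] for name in runs}
    # Highest failure rate first; ties are broken by name for stable output.
    return sorted(rates, key=lambda name: (-rates[name], name))

def select_tests(history, budget):
    """Pick the `budget` tests that failed most often in the past."""
    return rank_tests_by_failure_rate(history)[:budget]

# Hypothetical history: each test was run twice.
history = [
    ("main.join", True), ("main.join", False),
    ("main.select", True), ("main.select", True),
    ("rpl.basic", False), ("rpl.basic", False),
]
selected = select_tests(history, 2)
```

With a budget of two, the test that never failed is skipped and the two that failed before are run.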
>
> > In addition, I think that the job is in some ways like data stream
> > mining: a lot of data needs to be statistically analyzed, and the
> > hidden patterns change over time.
>
> Yes, exactly.
>
> > At last, I have two basic questions.
> > 1) What exactly are the builder and the combination?
> > I thought they refer to compiler and runtime environment.
>
> Kind of, yes. See this reply of mine:
> https://lists.launchpad.net/maria-developers/msg06972.html
>
> it contains links to our buildbot (the tool that automatically builds
> and tests mariadb on different platforms - "builders").
>
> There you will see what builders are, what combinations are, and so on.
>
> > 2) What does the "individual tests within a big test file" mean?
>
> Most tests use the "mysqltest" tool. It is conceptually very simple:
> execute a set of commands, record the output, and compare it with the
> correct pre-recorded output.
>
> A test file contains SQL statements (and sometimes mysqltest
> directives). Technically, one can have many logical tests in one test
> file.
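
The record-and-compare idea can be sketched as a toy model. This is not the mysqltest tool or its file format: `run_commands`, `check_test` and `fake_execute` are hypothetical names, and the transcript layout (each command followed by its output) is an assumption made for illustration.

```python
def run_commands(commands, execute):
    """Run each command and collect commands and outputs as one transcript."""
    lines = []
    for command in commands:
        lines.append(command)
        lines.append(str(execute(command)))
    return "\n".join(lines)

def check_test(commands, execute, expected_output):
    """A test passes when the fresh transcript equals the recorded one."""
    return run_commands(commands, execute) == expected_output

def fake_execute(command):
    # Stand-in for a real server: a fixed lookup table of results.
    return {"SELECT 1": "1", "SELECT 1+1": "2"}[command]

commands = ["SELECT 1", "SELECT 1+1"]

# "Record" the correct output once, then compare later runs against it.
recorded = run_commands(commands, fake_execute)
passed = check_test(commands, fake_execute, recorded)

# A server that returns different results makes the comparison fail.
failed = not check_test(commands, lambda command: "0", recorded)
```

Because the comparison covers the whole transcript, a single file holding many logical tests still passes or fails as one unit.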
>
> > Maybe I am completely wrong, but I still look forward to your reply.
> > I hope to have the opportunity to learn from you in work and discussion.
>
> If you want to participate in Google Summer of Code, don't forget to
> submit a proposal before the deadline:
> http://www.google-melange.com/gsoc/events/google/gsoc2014
>
> Regards,
> Sergei
>
>
