
randgen team mailing list archive

Concurrency problem in IPC::Channel

 

Hello RQG people,

I've got a puzzle for you to solve.

For a long while errors of this kind have been bugging me:

Number found where operator expected at (eval 9331) line 1, near "* PREP 2208"
        (Missing operator before  2208?)
Bareword found where operator expected at (eval 9331) line 1, near "2208 CON_ID"
        (Missing operator before CON_ID?)
Bareword found where operator expected at (eval 9331) line 1, near "*/ PREPARE"
        (Missing operator before PREPARE?)
String found where operator expected at (eval 9331) line 1, at end of line
        (Do you need to predeclare FROM?)
Bareword found where operator expected at (eval 9331) line 1, near "',1146,'Table"
        (Missing operator before Table?)
Backslash found where operator expected at (eval 9331) line 1, near "Table \"
        (Do you need to predeclare Table?)
Backslash found where operator expected at (eval 9332) line 1, near "jiyeqaugwbntudxxgkjuzttcekqknyayysiulqgtppkezwdvoija
hshqoxqpyfmsygluddnyhaktihdinuhqogvfnmk\"

etc.

They appear every now and then and don't cause any obvious problems with tests, but I've now decided it's time to get rid of them.

And got stuck.

Here is a tiny grammar which was specifically crafted to trigger them in large numbers:

----------------------------------
query:
PREPARE st1 FROM " UPDATE `t1` SET `col_varchar_nokey` = _text(8000) " ; EXECUTE st1 ;

----------------------------------

It can be run this way on the current randgen tree:

perl ./runall.pl --threads=2 --duration=300 --queries=100M --grammar=1.yy --basedir1=<basedir> --vardir1=<vardir>


(Yes, this way all queries from the grammar will fail with either 'unknown table' or 'unknown prepared statement'; that's intentional.)


My main suspect is ErrorFilter, which uses IPC::Channel to receive error messages from executors. When there are lots of them, and they come concurrently from different threads, they get mixed up somewhere in the channels. The actual errors are generated by the Channel::recv method, which at some point evals what it received from the pipe.

Don't take any of the above for granted. That's what my digging shows, but I might be wrong.
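
One mechanism that could produce this kind of mix-up is worth illustrating. It is an assumption, not something confirmed for IPC::Channel. POSIX only guarantees that a write to a pipe is atomic when it is no larger than PIPE_BUF bytes (4096 on Linux), and the grammar above generates strings of 8000 characters. The sketch below is in Python rather than Perl and does not use any randgen code; the function names and message format are made up for the demonstration. Several processes write newline-terminated messages to one shared pipe, and the reader counts how many lines arrive mixed with another writer's data.

```python
import os
import select


def write_all(fd, data):
    """Write the whole buffer, looping in case of a short write."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def count_mixed_messages(payload_size, writers=2, messages=200):
    """Return (lines_received, lines_mixed) for concurrent pipe writers.

    Writer i sends lines made of one repeated letter, so any line with a
    wrong length or more than one distinct byte was interleaved.
    """
    read_fd, write_fd = os.pipe()
    pids = []
    for i in range(writers):
        pid = os.fork()
        if pid == 0:
            # Child: write all messages, then exit without running parent code.
            try:
                os.close(read_fd)
                line = bytes([ord("a") + i]) * payload_size + b"\n"
                for _ in range(messages):
                    write_all(write_fd, line)
            finally:
                os._exit(0)
        pids.append(pid)

    # Parent: close its copy of the write end so that EOF can be seen.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    for pid in pids:
        os.waitpid(pid, 0)

    lines = b"".join(chunks).split(b"\n")[:-1]  # drop the empty tail
    mixed = sum(
        1 for line in lines
        if len(line) != payload_size or len(set(line)) != 1
    )
    return len(lines), mixed


if __name__ == "__main__":
    print("PIPE_BUF:", select.PIPE_BUF)
    print("small messages:", count_mixed_messages(100))
    print("large messages:", count_mixed_messages(8000))
```

With messages that fit within PIPE_BUF, no line should come out mixed. With 8000-byte messages some lines may be mixed, though the count varies from run to run and can be zero. If something similar happens in the channel, a reader that evals each received line would see fragments like the ones quoted above.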

So, here the actual puzzle starts.

I tried all I could think of to get rid of the conflicts -- locking, flocking, blocking, and whatever derivative of 'lock' there is; nothing helped. Again, I might have done it wrong and missed something obvious.

Does anyone want to show their brilliance and solve the problem?

Ideally, if you want to jump on it, I hope you'll check your solutions locally (assuming you can reproduce it), and then tell us "this resolves the problem", instead of just suggesting them in the form of "did you try this?". As I said, I tried a lot of stuff, all in vain, but I could have made a mistake and missed something.


If the problem doesn't happen for you, please do tell; it might also be important somehow.

Regards,
Elena