
c2c-oerpscenario team mailing list archive

[Bug 501617] Re: tiny_socket does not handle EINTR and may fail if SIGCHLD signal received

 

Borja, thanks a lot for the report and for the patch.

** Changed in: openobject-server
   Importance: Undecided => Wishlist

** Changed in: openobject-server
       Status: New => Confirmed

** Changed in: openobject-server
     Assignee: (unassigned) => OpenERP's Framework R&D (openerp-dev-framework)

-- 
tiny_socket does not handle EINTR and may fail if SIGCHLD signal received
https://bugs.launchpad.net/bugs/501617
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.

Status in OpenObject Server: Confirmed

Bug description:
The basic network functions used by OpenERP for receiving and sending data (tiny_socket.py) don't handle EINTR ("interrupted system call") errors, and that may cause weird race conditions.

EINTR errors happen when the process receives a signal while doing low-level I/O (such as receiving or sending data over a socket): the kernel interrupts the I/O operation (it is simply not performed*) so the process can handle the signal. The operation should then be retried, because the I/O did not really fail; it was only interrupted to wake up the process.

   (*) EINTR for socket functions means "The recv() function was interrupted by a signal that was caught, before any data was available." / "A signal interrupted send() before any data was transmitted." (http://www.opengroup.org/onlinepubs/000095399/functions/recv.html) so the calls can be safely retried (http://www.wlug.org.nz/EINTR)

Python does not handle EINTR errors by itself (there have been discussions about this: http://bugs.python.org/issue1628205), so it is the Python programmer doing the I/O who must handle them and retry the operation.

***

This bug was first detected in the Koo client, which uses a copy of the tiny_socket.py file for NetRPC communication, but it may affect all code that depends on tiny_socket.py (such as the server itself, the GTK client and the Web client). The bug showed up on computers running Linux Mint 7 (kernel 2.6.28-16, 32-bit) and Ubuntu 9.10 64-bit (kernel 2.6.31-14); see https://bugs.launchpad.net/openobject-client-kde/+bug/484651.


In mysocket.myreceive (tiny_socket.py), some data may already have been received (in earlier calls to recv) when the EINTR error happens; because EINTR is not handled, mysocket.myreceive simply propagates the exception and the current operation fails. That means that OpenERP is susceptible to weird race conditions (it fails only when SIGCHLD, or another non-ignored signal, arrives while performing I/O) or to denial of service attacks (sending lots of signals to OpenERP).

For example, on the OpenERP server, some addons use spawnlp or similar functions to create sub-processes. Some of them, like jasper_reports, need to run those sub-processes without waiting for the spawned process to end (os.P_NOWAIT). In that context, OpenERP will receive a SIGCHLD signal whenever a spawned sub-process ends. If OpenERP receives one of those signals while it is performing a socket I/O operation (mainly through the socket.recv or socket.send calls in tiny_socket.py), the call may fail with an EINTR error (errno 4) and data may be lost.

***

A possible fix is to patch tiny_socket.py so it handles EINTR errors, retrying the recv/send operations. This would make sure that no signal breaks mysocket.mysend or mysocket.myreceive.
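
The retry approach can be sketched as follows. This is only an illustration, not the patch attached to this bug: the names retry_on_eintr, send_all and receive_exactly are hypothetical and do not come from tiny_socket.py, and the "connection broken" handling is an assumption. Python 3.5 and later already retry most interrupted system calls on their own (PEP 475), so an explicit loop like this matters mainly on older interpreters such as Python 2.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as exc:
            # Only "interrupted system call" is retried; any other
            # error is a real failure and is re-raised unchanged.
            if exc.errno != errno.EINTR:
                raise


def send_all(sock, data):
    """Send all of data (bytes), retrying interrupted send() calls (hypothetical helper)."""
    total = 0
    while total < len(data):
        sent = retry_on_eintr(sock.send, data[total:])
        if sent == 0:
            raise RuntimeError("socket connection broken")
        total += sent
    return total


def receive_exactly(sock, size):
    """Receive exactly size bytes, retrying interrupted recv() calls (hypothetical helper)."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = retry_on_eintr(sock.recv, remaining)
        if not chunk:
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Because each individual recv/send call is retried, data already collected in earlier iterations is kept when a signal arrives in the middle of a transfer.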

As an optional workaround, if no fix is applied to tiny_socket.py, SIGCHLD signals could be ignored ("signal.signal(signal.SIGCHLD, signal.SIG_IGN)"), so that no EINTR error is raised when a sub-process ends. This would avoid the spawn* with os.P_NOWAIT problem.
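
A minimal sketch of that workaround, assuming it runs once in the main thread at startup; the helper name ignore_sigchld is hypothetical. Note that it only covers SIGCHLD: any other caught signal can still interrupt socket I/O.

```python
import signal


def ignore_sigchld():
    """Ignore SIGCHLD and return the previous handler so it can be restored."""
    return signal.signal(signal.SIGCHLD, signal.SIG_IGN)
```

One trade-off to keep in mind: on Linux and other POSIX systems, while SIGCHLD is ignored the exit status of child processes is discarded, so code that later waits for a child (for example with os.waitpid) will not get a status back.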