c2c-oerpscenario team mailing list archive
Message #17194
[Bug 715418] Re: [5.x] ir.cron - simultaneous start of cron jobs
May we give our ideas:
********** Mr. Könighofer's input ****
The OpenERP-ir-cron behavior is - despite its similarity of names -
totally different from a Unix-cron behavior.
This is neither documented nor expected by the user.
Short explanation:
Unix-cron:
A job is
* started at a given point in time,
* executed with a given (OS-)priority (which indicates which portion
of CPU-time it will receive) and
* rerun after the specified interval
This means that jobs run concurrently if their execution intervals
overlap.
OpenERP-ir-cron:
After a schedule-event, which occurs at least once per hour, a list of
jobs will be executed.
The list contains all jobs whose starting time ("nextcall") has
expired.
The list of jobs is sorted according to their priority (which has
nothing to do with their CPU-time portion).
After the list of jobs has finished, a new schedule-event is calculated
(V5 and V6 differ in this calculation).
The design appears intended to avoid concurrency (a current bug report
shows that this is not achieved:
https://bugs.launchpad.net/openobject-server/+bug/715418?comments=all).
This means that the starting time and the interval of a single job
depend on the other jobs (their priority and runtime).
In the presence of other jobs, you can never know when a certain job
will be executed.
This is unusable for many applications that we can imagine.
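The dependency described above can be illustrated with a small
simulation. This is only a sketch of the behaviour as described in this
message, not the actual ir_cron code: the job names, priorities and
runtimes are made up, and it assumes that a lower priority number runs
first.

```python
from collections import namedtuple

# Hypothetical job record: priority and runtime (in minutes) are made up.
Job = namedtuple("Job", "name priority runtime")


def sequential_start_times(jobs, event_time=0):
    """Return {job name: actual start time} when all due jobs are run
    one after another, sorted by priority (assumed: lower runs first)."""
    starts = {}
    clock = event_time
    for job in sorted(jobs, key=lambda j: j.priority):
        starts[job.name] = clock
        clock += job.runtime  # the next job waits until this one ends
    return starts


# All three jobs are due at minute 0 of the same schedule-event.
due = [Job("report", 5, 10), Job("mail", 1, 30), Job("backup", 3, 45)]
starts = sequential_start_times(due)
for name in sorted(starts, key=starts.get):
    print(name, "starts at minute", starts[name])
```

Although all three jobs are due at the same moment, only the first one
starts on time; the others are delayed by the runtimes of the jobs ahead
of them.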
Before entering into a discussion about design- and
implementation-details, the requirements (=the behavior) shall be
discussed and specified.
1. If the current OpenERP-ir-cron behavior shall be kept (for
whatever reason), the name shall be changed, as it is maliciously
misleading.
2. There shall be a module that behaves similarly to a Unix-cron
(concurrent execution at a precise interval and point in time).
There may be additional constraints on concurrency, but they can only be
discussed after 1) and 2) are decided.
Design ideas for a Unix-cron-like module (ignore the following
paragraphs before the requirements are fixed):
- concurrent jobs may be started using threads (Python module:
"threading").
- threads allow a simple communication with the "main-thread" (e.g. a
GUI-refresh after a job has finished)
- threads shall be "daemons" (i.e. their life-span depends on the
"main-thread").
- there is no "priority", as threads do not support OS-priority (i.e.
CPU-time portion) and the start-time overrules any priority (i.e.
sequence) rules.
- a time-granularity is needed for starting the "actual" job. This
time-granularity shall be smaller than the shortest cron interval (i.e.
1 minute).
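The design ideas above could be sketched roughly as follows. This is a
minimal illustration, not a proposal for the actual implementation: the
job dictionaries with the keys "name", "func", "nextcall" and "interval"
and the functions "run_due_jobs" and "scheduler_loop" are hypothetical
names chosen for this example.

```python
import threading
import time


def run_due_jobs(jobs, now):
    """Start every job whose 'nextcall' has passed in its own daemon
    thread and reschedule it from the planned time, not the end time."""
    started = []
    for job in jobs:
        if job["nextcall"] <= now:
            t = threading.Thread(target=job["func"], name=job["name"])
            t.daemon = True  # life-span depends on the main thread
            t.start()
            job["nextcall"] += job["interval"]
            started.append(t)
    return started


def scheduler_loop(jobs, granularity=1.0, ticks=3):
    """Wake up every 'granularity' seconds and start the due jobs."""
    for _ in range(ticks):
        run_due_jobs(jobs, time.time())
        time.sleep(granularity)


# Small demonstration with two jobs that are both due.
results = []
lock = threading.Lock()


def make_job(name):
    def func():
        with lock:
            results.append(name)
    return func


jobs = [{"name": n, "func": make_job(n), "nextcall": 0.0, "interval": 60.0}
        for n in ("a", "b")]
threads = run_due_jobs(jobs, now=1.0)
for t in threads:
    t.join()
print(sorted(results))
```

Because each job gets its own thread, a long-running job does not delay
the start of the others, and no priority field is needed.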
Alternatively, instead of "light-weight" threads, "heavy-weight" tasks
may be designed (e.g. start a new OpenERP-server for each job that
executes exactly one function).
Such an approach has the advantage that CPU-priority may be utilized
and the jobs could even run on different CPUs in a network.
The disadvantage is that the management of this architecture is more
complex, the communication with the "main-server" is more difficult, and
it is doubtful that this saves CPU-resources.
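The "heavy-weight" variant could look roughly like the sketch below. It
is only an illustration under strong assumptions: a plain Python
interpreter stands in for a dedicated OpenERP-server process, and the
helper "start_job_process" is a hypothetical name.

```python
import subprocess
import sys


def start_job_process(code):
    """Start one job in a separate interpreter process (stand-in for a
    dedicated server process that executes exactly one function)."""
    return subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )


# Three independent jobs, each in its own OS process.
procs = [start_job_process("print(%d * 2)" % n) for n in (1, 2, 3)]
# Results have to be collected explicitly, here via a pipe.
outputs = [p.communicate()[0].strip() for p in procs]
print(outputs)
```

The sketch also shows the disadvantage mentioned above: the result of
each job has to be passed back explicitly, whereas threads can share
data with the "main-thread" directly.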
**********
See also PEP 3148 ("futures - execute computations asynchronously"),
which might offer useful features for distributed accounting.
--
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.
https://bugs.launchpad.net/bugs/715418
Title:
[5.x] ir.cron - simultaneous start of cron jobs
Status in OpenERP Server:
Triaged
Bug description:
We have not checked whether this is fixed in v6.
Batch-jobs defined via ir_cron may be started simultaneously/concurrently.
(We have seen up to 17 identical executions of the same job.)
/base/ir/ir_cron.py starts a timer (via netsvc.startTimer) without
checking whether a timer has already been started for the same point in
time (or: without removing such simultaneous timers).
Hence, ir_cron._poolJobs will be invoked several times simultaneously.
I suggest maintaining a list of timers in /base/ir/ir_cron.py which
inhibits duplicate timers.
Alternatively, "duplicate" timers could be removed, although this might
be difficult to implement.
Such a mechanism could also be implemented within netsvc.startTimer; on
the other hand, simultaneous jobs can also be useful (outside ir_cron).
When are jobs "simultaneous"?
I suggest implementing a "time-granularity" which is less than or equal
to the smallest granularity of ir_cron jobs, i.e. "minutes".
To be on the safe side for long-running jobs, the time-granularity may
also be the shortest interval currently defined via ir_cron (e.g. 10
minutes).
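The suggested list of timers could be sketched as follows. This is not
the code of ir_cron.py or netsvc.startTimer; the class "TimerRegistry"
and its methods are hypothetical and use the standard "threading.Timer"
only to illustrate the idea of one pending timer per time slot.

```python
import threading


class TimerRegistry(object):
    """Hypothetical helper: at most one pending timer per time slot."""

    def __init__(self, granularity=60):
        self.granularity = granularity  # seconds; 60 = "minutes"
        self._pending = {}
        self._lock = threading.Lock()

    def slot(self, when):
        """Round a timestamp down to the start of its time slot."""
        return int(when // self.granularity) * self.granularity

    def start_timer(self, when, now, func):
        """Start a timer for 'when' unless its slot is already taken.
        Returns True if a timer was started, False for a duplicate."""
        key = self.slot(when)
        with self._lock:
            if key in self._pending:
                return False
            delay = max(0, when - now)
            timer = threading.Timer(delay, self._fire, (key, func))
            timer.daemon = True
            self._pending[key] = timer
        timer.start()
        return True

    def _fire(self, key, func):
        with self._lock:
            self._pending.pop(key, None)  # free the slot, then run
        func()


calls = []
done = threading.Event()


def job():
    calls.append("run")
    done.set()


reg = TimerRegistry(granularity=60)
first = reg.start_timer(when=120.5, now=120.0, func=job)
second = reg.start_timer(when=130.0, now=120.0, func=job)  # same minute
done.wait(5)
print(first, second, calls)
```

Two requests that fall into the same minute result in a single timer, so
the job function is invoked only once for that slot.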
Any ideas?