instant team mailing list archive

Re: [DOLFIN-dev] Release and buildbot status

 

On Mon, April 6, 2009 15:14, Anders Logg wrote:
> On Mon, Apr 06, 2009 at 01:41:59PM +0200, Johannes Ring wrote:
>> On Mon, April 6, 2009 13:35, Anders Logg wrote:
>> > The buildbot is still not happy on all platforms. The MPI fixes by
>> > Johan seem to have helped some but there are still problems.
>> >
>> > For mac-osx, the tests seem to hang when running the submesh demo:
>> >
>> >   command timed out: 1200 seconds without output, killing pid 12857
>> >   process killed by signal 9
>> >   program finished with exit code -1
>>
>> I think the macbot has crashed. I will try to restart it.

Restarting did not help. It hung on the submesh demo again.
The output contains a PETSc error. I guess it's related to the MPI fix:

Updating mesh coordinates using transfinite mean value interpolation
(Hermite).
Plotting mesh (DOLFIN mesh), press 'q' to continue...
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly illegal
memory access
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see
http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to
find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: ---------------------  Stack Frames
------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function
[0]PETSC ERROR:       is given.
[0]PETSC ERROR: --------------------- Error Message
------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 15, Tue Sep 23 10:02:49
CDT 2008 HG revision: 31306062cd1a6f6a2496fccb4878f485c9b91760
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Unknown Name on a darwin9.2 named Amira-iMac.local by
fenicsslave Mon Apr  6 15:13:16 2009
[0]PETSC ERROR: Libraries linked from
/usr/local/src/petsc-2.3.3-p15/lib/darwin9.2.2-cxx-debug
[0]PETSC ERROR: Configure run at Thu Mar  5 10:40:39 2009
[0]PETSC ERROR: Configure options --with-clanguage=cxx --with-shared=1
--with-x=0 --with-x11=0 --with-fortran=0
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: User provided function() line 0 in unknown directory
unknown file
[Amira-iMac.local:09565] MPI_ABORT invoked on rank 0 in communicator
MPI_COMM_WORLD with errorcode 59

^C
Program received signal SIGINT, Interrupt.
0x963b54ba in wait4 ()
(gdb) where
#0  0x963b54ba in wait4 ()
#1  0x963b3007 in system$UNIX2003 ()
#2  0x00eb3fd2 in plot_object<dolfin::Mesh> (t=@0xbfffeb7c,
mode=@0xbfffea58) at dolfin/plot/plot.cpp:35
#3  0x00eb2d30 in std::string::_M_rep () at basic_string.h:48
#4  0x00eb2d30 in ~basic_string [inlined] () at dolfin/plot/plot.cpp:472
#5  ~basic_string [inlined] () at basic_string.h:472
#6  dolfin::plot (mesh=@0xbfffeb7c) at dolfin/plot/plot.cpp:48
#7  0x00003323 in main () at demo/mesh/submesh/cpp/main.cpp:53
(gdb)


>> > For linux64-exp, most tests seem to fail.
>>
>> We thought that our fixes would make this slave green but now we got a
>> strange error:
>>
>>  ./../../demo/fem/simple/python (Python)
>>
>> Traceback (most recent call last):
>>   File "./demo.py", line 19, in <module>
>>     V = FunctionSpace(mesh, "CG", 1)
>>   File
>> "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/functionspace.py",
>> line 184, in __init__
>>     FunctionSpaceBase.__init__(self, mesh, element)
>>   File
>> "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/functionspace.py",
>> line 46, in __init__
>>     ufc_element, ufc_dofmap = jit(self._element)
>>   File
>> "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/jit.py",
>> line 30, in jit
>>     if not check_swig_version(cpp.__swigversion__,same=True):
>>   File
>> "/work/jhbuildbot/fenics/lib/python2.5/site-packages/instant/config.py",
>> line 35, in check_swig_version
>>     installed_version = map(int, get_swig_version().split('.'))
>>   File
>> "/work/jhbuildbot/fenics/lib/python2.5/site-packages/instant/config.py",
>> line 14, in get_swig_version
>>     r = re.search(pattern, output)
>>   File "/usr/lib/python2.5/re.py", line 142, in search
>>     return _compile(pattern, flags).search(string)
>> TypeError: expected string or buffer
>>
>> Johannes
>
> This error appears when the second argument to re.search is wrong.
>
> So the output variable returned from Instant's get_status_output
> is wrong.

Yes, the output variable was None. Strangely, it is related to today's MPI
fixes. When I reverted the changes, the output variable contained the
correct string.
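
To illustrate the failure mode, here is a minimal sketch of a version parser
that checks its input before calling re.search, so that a None output gives a
readable error instead of a TypeError from inside re.py. The pattern and the
name parse_swig_version are assumptions made for this example and are not
taken from Instant's config.py; the sketch is written for current Python.

```python
import re

# Hypothetical pattern: Instant's actual pattern is not shown in this thread.
SWIG_VERSION_PATTERN = r"SWIG Version (\d+(?:\.\d+)*)"


def parse_swig_version(output):
    """Return the SWIG version as a list of ints, or raise a clear error."""
    # re.search raises TypeError when its second argument is None, so
    # reject anything that is not a string before searching.
    if not isinstance(output, str):
        raise RuntimeError("no output from 'swig -version' (got %r)" % (output,))
    match = re.search(SWIG_VERSION_PATTERN, output)
    if match is None:
        raise RuntimeError("could not find a SWIG version in: %r" % output)
    return [int(part) for part in match.group(1).split(".")]
```

With a check like this, the traceback above would end in a message saying that
the command produced no output, which points more directly at
get_status_output.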

> Why does Instant require its own get_status_output instead of using
> commands.getstatusoutput?

Because commands.getstatusoutput is not available on Windows.
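
For reference, a portable replacement can be sketched with the subprocess
module, which is available on Windows as well. This is only an illustration of
the idea under that assumption, not Instant's actual get_status_output; the
name is reused here for readability.

```python
import subprocess


def get_status_output(cmd):
    """Run cmd through the shell and return (exit_status, combined_output)."""
    # stderr is merged into stdout so the caller gets a single string.
    proc = subprocess.Popen(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # communicate() waits for the process to finish and returns its output.
    output, _ = proc.communicate()
    return proc.returncode, output
```

Since stdout is always piped, output is a string (possibly empty) and never
None, so a caller such as get_swig_version can pass it to re.search safely.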

Johannes



