yade-users team mailing list archive
Message #27967
[Question #702331]: CHOLMOD test with GPU acceleration
New question #702331 on Yade:
https://answers.launchpad.net/yade/+question/702331
Hi,
I have run into problems similar to those in [1] while trying to use GPU acceleration. As in that question, I ran "sh gpu.sh" to test CHOLMOD’s GPU functionality and got the result below. In both runs every "GPU calls" count is 0, so the GPU does not appear to be used.
My versions: Ubuntu 18.04.6 LTS, CUDA 11.7, NVIDIA driver 515.43.04, SuiteSparse 5.12.0.
Thanks for any help!
---------------------------------- cholmod_l_demo:
cholmod version 3.0.14
SuiteSparse version 5.12.0
norm (A,inf) = 203.333
norm (A,1) = 203.333
CHOLMOD sparse: A: 18000-by-18000, nz 3457658, upper. OK
CHOLMOD dense: B: 18000-by-1, OK
bnorm 1.99994
Analyze: flop 1.15165e+11 lnz 4.07336e+07
Factorizing A
CHOLMOD factor: L: 18000-by-18000 supernodal, LL'. nz 41793167 OK
nmethods: 1
Ordering: AMD fl/lnz 3911.5 lnz/anz 14.8
Ordering: METIS fl/lnz 2827.3 lnz/anz 11.8
ints in L: 212740, doubles in L: 55587325
factor flops 1.15165e+11 nnz(L) 40733584 (w/no amalgamation)
nnz(A*A'): 3457658
flops / nnz(L): 2827.3
nnz(L) / nnz(A): 11.8
analyze cputime: 0.9524
factor cputime: 7.9463 mflop: 14492.9
solve cputime: 0.0529 mflop: 3078.8
overall cputime: 8.9516 mflop: 12883.5
solve cputime: 0.0437 mflop: 3724.8 (100 trials)
solve2 cputime: 0.0000 mflop: 0.0 (100 trials)
peak memory usage: 631 (MB)
residual (|Ax-b|/(|A||x|+|b|)): 1.60e-15 2.58e-15
residual 1.5e-16 (|Ax-b|/(|A||x|+|b|)) after iterative refinement
rcond 4.9e-04
CHOLMOD GPU/CPU statistics:
SYRK  CPU calls 799 time 2.1152e+00
      GPU calls   0 time 0.0000e+00
GEMM  CPU calls 628 time 5.8567e-01
      GPU calls   0 time 0.0000e+00
POTRF CPU calls 172 time 7.2829e-01
      GPU calls   0 time 0.0000e+00
TRSM  CPU calls 171 time 3.3862e-01
      GPU calls   0 time 0.0000e+00
time in the BLAS: CPU 3.7678e+00 GPU 0.0000e+00 total: 3.7678e+00
assembly time 0.0000e+00 0.0000e+00
---------------------------------- cholmod_l_demo:
cholmod version 3.0.14
SuiteSparse version 5.12.0
norm (A,inf) = 203.333
norm (A,1) = 203.333
CHOLMOD sparse: A: 18000-by-18000, nz 3457658, upper. OK
CHOLMOD dense: B: 18000-by-1, OK
bnorm 1.99994
Analyze: flop 1.15165e+11 lnz 4.07336e+07
Factorizing A
CHOLMOD factor: L: 18000-by-18000 supernodal, LL'. nz 41793167 OK
nmethods: 1
Ordering: AMD fl/lnz 3911.5 lnz/anz 14.8
Ordering: METIS fl/lnz 2827.3 lnz/anz 11.8
ints in L: 212740, doubles in L: 55587325
factor flops 1.15165e+11 nnz(L) 40733584 (w/no amalgamation)
nnz(A*A'): 3457658
flops / nnz(L): 2827.3
nnz(L) / nnz(A): 11.8
analyze cputime: 0.9847
factor cputime: 8.2289 mflop: 13995.3
solve cputime: 0.0376 mflop: 4335.6
overall cputime: 9.2511 mflop: 12466.4
solve cputime: 0.0410 mflop: 3969.8 (100 trials)
solve2 cputime: 0.0000 mflop: 0.0 (100 trials)
peak memory usage: 631 (MB)
residual (|Ax-b|/(|A||x|+|b|)): 1.60e-15 2.58e-15
residual 1.5e-16 (|Ax-b|/(|A||x|+|b|)) after iterative refinement
rcond 4.9e-04
CHOLMOD GPU/CPU statistics:
SYRK  CPU calls 799 time 1.9403e+00
      GPU calls   0 time 0.0000e+00
GEMM  CPU calls 628 time 6.2919e-01
      GPU calls   0 time 0.0000e+00
POTRF CPU calls 172 time 7.3685e-01
      GPU calls   0 time 0.0000e+00
TRSM  CPU calls 171 time 2.8382e-01
      GPU calls   0 time 0.0000e+00
time in the BLAS: CPU 3.5902e+00 GPU 0.0000e+00 total: 3.5902e+00
assembly time 0.0000e+00 0.0000e+00
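
In case it helps anyone reproduce the check, here is a minimal sketch of how the "CHOLMOD GPU/CPU statistics" blocks above could be scanned for non-zero GPU call counts. It is not part of SuiteSparse or Yade. The function names gpu_call_counts and gpu_was_used are made up for this illustration, and the line format is assumed to match the demo output pasted above.

```python
import re

# Matches lines such as "SYRK CPU calls 799 time ..." and "GPU calls 0 time ...".
# The kernel name is optional because the GPU line does not repeat it.
STAT_LINE = re.compile(r"\s*(?:(SYRK|GEMM|POTRF|TRSM)\s+)?(CPU|GPU) calls\s+(\d+)")


def gpu_call_counts(log_text):
    """Return {kernel: (cpu_calls, gpu_calls)}, summed over all runs in log_text."""
    counts = {}
    kernel = None
    for line in log_text.splitlines():
        m = STAT_LINE.match(line)
        if not m:
            continue
        if m.group(1):
            kernel = m.group(1)  # remember the kernel for the following GPU line
        if kernel is None:
            continue  # a calls line before any kernel name: ignore it
        cpu, gpu = counts.get(kernel, (0, 0))
        n = int(m.group(3))
        if m.group(2) == "CPU":
            cpu += n
        else:
            gpu += n
        counts[kernel] = (cpu, gpu)
    return counts


def gpu_was_used(counts):
    """True if any kernel reports at least one GPU call."""
    return any(gpu > 0 for _, gpu in counts.values())


if __name__ == "__main__":
    # Two kernels copied from the first run above.
    sample = """CHOLMOD GPU/CPU statistics:
SYRK CPU calls 799 time 2.1152e+00
GPU calls 0 time 0.0000e+00
GEMM CPU calls 628 time 5.8567e-01
GPU calls 0 time 0.0000e+00
"""
    counts = gpu_call_counts(sample)
    print(counts)
    print("GPU used:", gpu_was_used(counts))
```

With the output of either run above, gpu_was_used returns False, which matches what I see by eye.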
[1] https://answers.launchpad.net/yade/+question/685647