
yade-users team mailing list archive

Re: [Question #685647]: Problem in CHOLMOD test with GPU acceleration

 

Question #685647 on Yade changed:
https://answers.launchpad.net/yade/+question/685647

    Status: Needs information => Open

Chu gave more information on the question:
Dear Robert,

Thank you for the information. The output of make config and of deviceQuery follows.

>make config
----------------------------------------------------------------
SuiteSparse package compilation options:
----------------------------------------------------------------
 
SuiteSparse Version:      5.6.0
SuiteSparse top folder:   /usr/local/SuiteSparse
Package:                  LIBRARY=         PackageNameWillGoHere
Version:                  VERSION=         x.y.z
SO version:               SO_VERSION=      x
System:                   UNAME=           Linux
Install directory:        INSTALL=         /usr/local/SuiteSparse
Install libraries in:     INSTALL_LIB=     /usr/local/SuiteSparse/lib
Install include files in: INSTALL_INCLUDE= /usr/local/SuiteSparse/include
Install documentation in: INSTALL_DOC=     /usr/local/SuiteSparse/share/doc/suitesparse-5.6.0
Optimization level:       OPTIMIZATION=    -O3
parallel make jobs:       JOBS=            8
BLAS library:             BLAS=            -lopenblas
LAPACK library:           LAPACK=          -llapack
Intel TBB library:        TBB=             
Other libraries:          LDLIBS=          -lm -lrt -Wl,-rpath=/usr/local/SuiteSparse/lib
static library:           AR_TARGET=       PackageNameWillGoHere.a
shared library (full):    SO_TARGET=       PackageNameWillGoHere.so.x.y.z
shared library (main):    SO_MAIN=         PackageNameWillGoHere.so.x
shared library (short):   SO_PLAIN=        PackageNameWillGoHere.so
shared library options:   SO_OPTS=         -L/usr/local/SuiteSparse/lib -shared -Wl,-soname -Wl,PackageNameWillGoHere.so.x -Wl,--no-undefined
shared library name tool: SO_INSTALL_NAME= echo
ranlib, for static libs:  RANLIB=          ranlib
static library command:   ARCHIVE=         ar rv
copy file:                CP=              cp -f
move file:                MV=              mv -f
remove file:              RM=              rm -f
pretty (for Tcov tests):  PRETTY=          grep -v "^#" | indent -bl -nce -bli0 -i4 -sob -l120
C compiler:               CC=              cc
C++ compiler:             CXX=             g++
CUDA compiler:            NVCC=            /usr/local/cuda/bin/nvcc
CUDA root directory:      CUDA_PATH=       /usr/local/cuda
OpenMP flags:             CFOPENMP=        -fopenmp
C/C++ compiler flags:     CF=                 -O3 -fexceptions -fPIC -fopenmp
LD flags:                 LDFLAGS=         -L/usr/local/SuiteSparse/lib
Fortran compiler:         F77=             f77
Fortran flags:            F77FLAGS=        
Intel MKL root:           MKLROOT=         
Auto detect Intel icc:    AUTOCC=          no
UMFPACK config:           UMFPACK_CONFIG=  
CHOLMOD config:           CHOLMOD_CONFIG=  -DGPU_BLAS
SuiteSparseQR config:     SPQR_CONFIG=     -DGPU_BLAS
CUDA library:             CUDART_LIB=      /usr/local/cuda/lib64/libcudart.so
CUBLAS library:           CUBLAS_LIB=      /usr/local/cuda/lib64/libcublas.so
METIS and CHOLMOD/Partition configuration:
Your METIS library:       MY_METIS_LIB=    
Your metis.h is in:       MY_METIS_INC=    
METIS is used via the CHOLMOD/Partition module, configured as follows.
If the next line has -DNPARTITION then METIS will not be used:
CHOLMOD Partition config:  
CHOLMOD Partition libs:    -lccolamd -lcamd -lmetis
CHOLMOD Partition include: -I/usr/local/SuiteSparse/CCOLAMD/Include -I/usr/local/SuiteSparse/CAMD/Include -I/usr/local/SuiteSparse/metis-5.1.0/include
MAKE:  make
CMake options:  -DCMAKE_INSTALL_PREFIX=/usr/local/SuiteSparse -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=cc
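
The GPU-related entries in the listing above are easy to miss among the other settings. The following is a minimal sketch, not part of SuiteSparse, that pulls them out of saved make config output. The helper names gpu_settings and gpu_looks_enabled are hypothetical, and the check assumes that -DGPU_BLAS in CHOLMOD_CONFIG is what marks a GPU-enabled build, as the listing suggests.

```python
import re

# A few lines copied from the make config listing above.
SAMPLE = """\
Intel TBB library:        TBB=
CUDA compiler:            NVCC=            /usr/local/cuda/bin/nvcc
CUDA root directory:      CUDA_PATH=       /usr/local/cuda
CHOLMOD config:           CHOLMOD_CONFIG=  -DGPU_BLAS
SuiteSparseQR config:     SPQR_CONFIG=     -DGPU_BLAS
CUDA library:             CUDART_LIB=      /usr/local/cuda/lib64/libcudart.so
CUBLAS library:           CUBLAS_LIB=      /usr/local/cuda/lib64/libcublas.so
"""

# Keys of interest (a selection made for this sketch).
WANTED = {"NVCC", "CUDA_PATH", "CHOLMOD_CONFIG", "SPQR_CONFIG",
          "CUDART_LIB", "CUBLAS_LIB"}


def gpu_settings(config_text):
    """Return the GPU-related KEY=value pairs found in make config output."""
    found = {}
    for line in config_text.splitlines():
        # Each line looks like "description:   KEY=   value"; take the first KEY=.
        m = re.search(r"\b([A-Z_0-9]+)=\s*(.*)$", line)
        if m and m.group(1) in WANTED:
            found[m.group(1)] = m.group(2).strip()
    return found


def gpu_looks_enabled(settings):
    """Heuristic: nvcc is set and CHOLMOD is compiled with -DGPU_BLAS."""
    return bool(settings.get("NVCC")) and \
        "-DGPU_BLAS" in settings.get("CHOLMOD_CONFIG", "")


if __name__ == "__main__":
    settings = gpu_settings(SAMPLE)
    for key in sorted(settings):
        print(f"{key:15s} {settings[key]}")
    print("GPU build flags present:", gpu_looks_enabled(settings))
```

To use it on a real build, the SAMPLE string would be replaced by the saved text of make config. This only confirms the compile-time flags. It does not show whether CHOLMOD actually uses the GPU at run time.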

################################################
>./deviceQuery

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Quadro P1000"
  CUDA Driver Version / Runtime Version          10.1 / 10.1
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 4040 MBytes (4236312576 bytes)
  ( 4) Multiprocessors, (128) CUDA Cores/MP:     512 CUDA Cores
  GPU Max Clock rate:                            1519 MHz (1.52 GHz)
  Memory Clock rate:                             3004 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
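
For comparing this report with other machines, the key facts can be extracted from the deviceQuery text. This is a rough sketch only: the function name parse_device_query and the choice of fields are assumptions made for illustration, and the patterns are written against the output format shown above, which may differ between CUDA versions.

```python
import re

# Lines copied from the deviceQuery output above.
SAMPLE = """\
Device 0: "Quadro P1000"
  CUDA Driver Version / Runtime Version          10.1 / 10.1
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 4040 MBytes (4236312576 bytes)
Result = PASS
"""


def parse_device_query(text):
    """Extract device name, compute capability, memory and result, if present."""
    info = {}
    m = re.search(r'Device \d+: "([^"]+)"', text)
    if m:
        info["name"] = m.group(1)
    m = re.search(r"CUDA Capability Major/Minor version number:\s*(\d+)\.(\d+)", text)
    if m:
        info["capability"] = (int(m.group(1)), int(m.group(2)))
    m = re.search(r"Total amount of global memory:\s*(\d+) MBytes", text)
    if m:
        info["global_mem_mb"] = int(m.group(1))
    m = re.search(r"^Result = (\w+)", text, re.MULTILINE)
    if m:
        info["result"] = m.group(1)
    return info


if __name__ == "__main__":
    print(parse_device_query(SAMPLE))
```

Fields that are missing from the input are simply left out of the returned dictionary, so a caller should check for a key before relying on it.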
