yade-users team mailing list archive
Message #21190
Re: [Question #685647]: Problem in CHOLMOD test with GPU acceleration
Question #685647 on Yade changed:
https://answers.launchpad.net/yade/+question/685647
Status: Needs information => Open
Chu gave more information on the question:
Dear Robert,
Thanks for the information. The output of make config and deviceQuery follows.
>make config
----------------------------------------------------------------
SuiteSparse package compilation options:
----------------------------------------------------------------
SuiteSparse Version: 5.6.0
SuiteSparse top folder: /usr/local/SuiteSparse
Package: LIBRARY= PackageNameWillGoHere
Version: VERSION= x.y.z
SO version: SO_VERSION= x
System: UNAME= Linux
Install directory: INSTALL= /usr/local/SuiteSparse
Install libraries in: INSTALL_LIB= /usr/local/SuiteSparse/lib
Install include files in: INSTALL_INCLUDE= /usr/local/SuiteSparse/include
Install documentation in: INSTALL_DOC= /usr/local/SuiteSparse/share/doc/suitesparse-5.6.0
Optimization level: OPTIMIZATION= -O3
parallel make jobs: JOBS= 8
BLAS library: BLAS= -lopenblas
LAPACK library: LAPACK= -llapack
Intel TBB library: TBB=
Other libraries: LDLIBS= -lm -lrt -Wl,-rpath=/usr/local/SuiteSparse/lib
static library: AR_TARGET= PackageNameWillGoHere.a
shared library (full): SO_TARGET= PackageNameWillGoHere.so.x.y.z
shared library (main): SO_MAIN= PackageNameWillGoHere.so.x
shared library (short): SO_PLAIN= PackageNameWillGoHere.so
shared library options: SO_OPTS= -L/usr/local/SuiteSparse/lib -shared -Wl,-soname -Wl,PackageNameWillGoHere.so.x -Wl,--no-undefined
shared library name tool: SO_INSTALL_NAME= echo
ranlib, for static libs: RANLIB= ranlib
static library command: ARCHIVE= ar rv
copy file: CP= cp -f
move file: MV= mv -f
remove file: RM= rm -f
pretty (for Tcov tests): PRETTY= grep -v "^#" | indent -bl -nce -bli0 -i4 -sob -l120
C compiler: CC= cc
C++ compiler: CXX= g++
CUDA compiler: NVCC= /usr/local/cuda/bin/nvcc
CUDA root directory: CUDA_PATH= /usr/local/cuda
OpenMP flags: CFOPENMP= -fopenmp
C/C++ compiler flags: CF= -O3 -fexceptions -fPIC -fopenmp
LD flags: LDFLAGS= -L/usr/local/SuiteSparse/lib
Fortran compiler: F77= f77
Fortran flags: F77FLAGS=
Intel MKL root: MKLROOT=
Auto detect Intel icc: AUTOCC= no
UMFPACK config: UMFPACK_CONFIG=
CHOLMOD config: CHOLMOD_CONFIG= -DGPU_BLAS
SuiteSparseQR config: SPQR_CONFIG= -DGPU_BLAS
CUDA library: CUDART_LIB= /usr/local/cuda/lib64/libcudart.so
CUBLAS library: CUBLAS_LIB= /usr/local/cuda/lib64/libcublas.so
METIS and CHOLMOD/Partition configuration:
Your METIS library: MY_METIS_LIB=
Your metis.h is in: MY_METIS_INC=
METIS is used via the CHOLMOD/Partition module, configured as follows.
If the next line has -DNPARTITION then METIS will not be used:
CHOLMOD Partition config:
CHOLMOD Partition libs: -lccolamd -lcamd -lmetis
CHOLMOD Partition include: -I/usr/local/SuiteSparse/CCOLAMD/Include -I/usr/local/SuiteSparse/CAMD/Include -I/usr/local/SuiteSparse/metis-5.1.0/include
MAKE: make
CMake options: -DCMAKE_INSTALL_PREFIX=/usr/local/SuiteSparse -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=cc
################################################
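A quick way to read the GPU-related settings out of a dump like the one above is sketched below. This is only an illustration: the function names parse_make_config and gpu_build_issues are hypothetical, the sample text is a short excerpt of the output above, and the check only confirms that the flags and paths are present in the configuration, not that CHOLMOD will actually use the GPU at run time.

```python
import re

# Excerpt of the `make config` output shown above.
CONFIG_EXCERPT = """\
CUDA compiler: NVCC= /usr/local/cuda/bin/nvcc
Intel TBB library: TBB=
CHOLMOD config: CHOLMOD_CONFIG= -DGPU_BLAS
SuiteSparseQR config: SPQR_CONFIG= -DGPU_BLAS
CUDA library: CUDART_LIB= /usr/local/cuda/lib64/libcudart.so
CUBLAS library: CUBLAS_LIB= /usr/local/cuda/lib64/libcublas.so
CMake options: -DCMAKE_INSTALL_PREFIX=/usr/local/SuiteSparse
"""


def parse_make_config(text):
    """Collect 'description: NAME= value' lines into a {NAME: value} dict."""
    settings = {}
    for line in text.splitlines():
        # The variable name is upper case and directly follows the description's colon.
        match = re.search(r":\s*([A-Z][A-Z0-9_]*)=\s*(.*)$", line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


def gpu_build_issues(settings):
    """Return a list of human-readable problems with the GPU-related settings."""
    issues = []
    for key in ("CHOLMOD_CONFIG", "SPQR_CONFIG"):
        if "-DGPU_BLAS" not in settings.get(key, ""):
            issues.append(f"{key} lacks -DGPU_BLAS")
    for key in ("NVCC", "CUDART_LIB", "CUBLAS_LIB"):
        if not settings.get(key):
            issues.append(f"{key} is empty")
    return issues


settings = parse_make_config(CONFIG_EXCERPT)
issues = gpu_build_issues(settings)
print(issues if issues else "GPU-related settings are present")
```

For the excerpt above the list of issues is empty, which matches what can be read directly from the dump: both CHOLMOD_CONFIG and SPQR_CONFIG carry -DGPU_BLAS, and the CUDA paths are filled in.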
>./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Quadro P1000"
CUDA Driver Version / Runtime Version 10.1 / 10.1
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 4040 MBytes (4236312576 bytes)
( 4) Multiprocessors, (128) CUDA Cores/MP: 512 CUDA Cores
GPU Max Clock rate: 1519 MHz (1.52 GHz)
Memory Clock rate: 3004 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
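As a sanity check on the figures reported above, the memory size and core count can be recomputed from the raw numbers in the deviceQuery output. The sketch below is only illustrative: the function name summarize_device is hypothetical, and the regular expressions assume the two lines keep the layout shown in this output.

```python
import re

# Two lines copied from the deviceQuery output above.
DEVICE_QUERY_EXCERPT = """\
Total amount of global memory: 4040 MBytes (4236312576 bytes)
( 4) Multiprocessors, (128) CUDA Cores/MP: 512 CUDA Cores
"""


def summarize_device(text):
    """Recompute memory (in MiB) and CUDA core count from deviceQuery text."""
    mem = re.search(r"\((\d+) bytes\)", text)
    sm = re.search(r"\(\s*(\d+)\) Multiprocessors, \(\s*(\d+)\) CUDA Cores/MP", text)
    if mem is None or sm is None:
        raise ValueError("expected deviceQuery lines not found")
    total_bytes = int(mem.group(1))
    multiprocessors = int(sm.group(1))
    cores_per_mp = int(sm.group(2))
    return {
        # deviceQuery reports "MBytes" in units of 2**20 bytes.
        "memory_mib": total_bytes // 2**20,
        "cuda_cores": multiprocessors * cores_per_mp,
    }


summary = summarize_device(DEVICE_QUERY_EXCERPT)
print(summary)
```

Both values agree with the report: 4236312576 bytes is 4040 MBytes, and 4 multiprocessors with 128 cores each give 512 CUDA cores.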