batuman - 6 months ago
Linux Question

Could not insert 'nvidia_352': No such device

I am trying to run Caffe on Ubuntu Linux. After installation, running Caffe in GPU mode fails with this error:

I0910 13:28:13.606891 10629 caffe.cpp:296] Use GPU with device ID 0
modprobe: ERROR: could not insert 'nvidia_352': No such device
F0910 13:28:13.728612 10629 common.cpp:142] Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected
*** Check failure stack trace: ***
@ 0x7ffd3b9a7daa (unknown)
@ 0x7ffd3b9a7ce4 (unknown)
@ 0x7ffd3b9a76e6 (unknown)
@ 0x7ffd3b9aa687 (unknown)
@ 0x7ffd3bf91cb5 caffe::Caffe::SetDevice()
@ 0x40a5a7 time()
@ 0x4080f8 main
@ 0x7ffd3aeb9ec5 (unknown)
@ 0x408618 (unknown)
@ (nil) (unknown)
Aborted (core dumped)


My NVIDIA driver is 352.41. I installed nvidia-352, and apt-get reports that it is already the newest version:

sudo apt-get install nvidia-352
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-352 is already the newest version.
The following packages were automatically installed and are no longer required:
account-plugin-windows-live libupstart1
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 31 not upgraded.


Ubuntu has NVIDIA driver 352 installed, so why do I still get this error?

I0910 13:28:13.606891 10629 caffe.cpp:296] Use GPU with device ID 0
modprobe: ERROR: could not insert 'nvidia_352': No such device
F0910 13:28:13.728612 10629 common.cpp:142] Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected


I checked that I have a CUDA-capable device:

lspci | grep -i nvidia
05:00.0 VGA compatible controller: NVIDIA Corporation GK107GL [Quadro K2000] (rev a1)
05:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)


I do have a CUDA-capable device, so why do I get this error?

EDIT 1:
Yeah my test with ./deviceQuery failed.

../NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL


I checked /dev, and nvidia0 is there:

crwxrwxrwx 1 root root 195, 0 Sep 10 16:51 nvidia0
crw-rw-rw- 1 root root 195, 255 Sep 10 16:51 nvidiactl


My nvcc -V check gave me

li@li-HP-Z420-Workstation:/dev$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17


Then the driver version check gave me

li@li-HP-Z420-Workstation:/dev$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.41 Fri Aug 21 23:09:52 PDT 2015
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)


What could be wrong?

Answer

The problem is solved now. I ran sudo dpkg --list | grep nvidia and found that the kernel module was at 352.41 while the client packages were at 304.12. So I ran sudo apt-get remove --purge nvidia-*, which removed all the NVIDIA packages. Then I installed 352.41 again:

$ sudo add-apt-repository ppa:xorg-edgers/ppa -y
$ sudo apt-get update
$ sudo apt-get install nvidia-352

After that, the package list looks like this:

$ sudo dpkg --list | grep nvidia
rc nvidia-304 304.128-0ubuntu0~gpu14.04.2 amd64 NVIDIA legacy binary driver - version 304.128
rc nvidia-304-updates 304.125-0ubuntu0.0.2 amd64 NVIDIA legacy binary driver - version 304.125
ii nvidia-352 352.41-0ubuntu0~gpu14.04.1 amd64 NVIDIA binary driver - version 352.41
rc nvidia-opencl-icd-304 304.128-0ubuntu0~gpu14.04.2 amd64 NVIDIA OpenCL ICD
rc nvidia-opencl-icd-304-updates 304.125-0ubuntu0.0.2 amd64 NVIDIA OpenCL ICD
ii nvidia-opencl-icd-352 352.41-0ubuntu0~gpu14.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.6.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 355.11-0ubuntu0~gpu14.04.1 amd64 Tool for configuring the NVIDIA graphics driver

Now the versions match, and ./deviceQuery and everything else work as expected. Thanks.
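For anyone who wants to repeat the check, here is a rough sketch of the comparison I did by eye. It reads the kernel module version from /proc/driver/nvidia/version and compares it with the nvidia-NNN driver packages that dpkg reports as fully installed ("ii"); entries marked "rc" are removed packages with leftover config files and are skipped. The function names are my own, and the pattern for driver package names (nvidia-NNN, optionally with -updates) is an assumption based on the listing above, so adjust it if your packages are named differently.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of any NVIDIA or Ubuntu tool.

# Read the text of /proc/driver/nvidia/version on stdin and print the
# kernel module version, e.g. 352.41.
kernel_module_version() {
    sed -n 's/.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Read `dpkg --list` output on stdin and print "name version" for driver
# packages in state "ii". Assumes driver packages are named nvidia-NNN or
# nvidia-NNN-updates, as in the listing above.
installed_driver_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia-[0-9]+(-updates)?$/ { print $2, $3 }'
}

# Succeed if package version $2 equals kernel version $1 or starts with "$1-".
versions_match() {
    case "$2" in
        "$1"|"$1"-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only run the comparison when the NVIDIA kernel module is actually loaded.
if [ -r /proc/driver/nvidia/version ]; then
    kver=$(kernel_module_version < /proc/driver/nvidia/version)
    echo "kernel module: $kver"
    dpkg --list | installed_driver_packages | while read -r name pver; do
        if versions_match "$kver" "$pver"; then
            echo "OK       $name $pver"
        else
            echo "MISMATCH $name $pver"
        fi
    done
fi
```

A MISMATCH line for an older package such as nvidia-304 would point to the same situation I had. Packages like nvidia-settings are left out on purpose, since their version does not have to equal the driver version.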