Torch 7 fails to build with CUDA 10, producing the following error:
/build/torch7-cutorch-git/src/torch7-cutorch-git/lib/THC/THCAtomics.cuh(97): error: cannot overload functions distinguished by return type alone
1 error detected in the compilation of "/run/user/1000/tmpxft_00007438_00000000-4_THCTensorIndex.cpp4.ii".
CMake Error at THC_generated_THCTensorIndex.cu.o.Release.cmake:279 (message):
Error generating file
/build/torch7-cutorch-git/src/torch7-cutorch-git/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorIndex.cu.o
There are two issues:
- cmake/3.6/Modules/FindCUDA.cmake is outdated.
- atomicAdd(__half *address, __half val) is already defined in /usr/local/cuda/include/cuda_fp16.h, so the definition in THCAtomics.cuh is a duplicate.
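To confirm the second issue on your own machine, a quick check along these lines may help. This is only a sketch: the helper name has_half_atomic_add is made up for illustration, and the header path is the one quoted above, so adjust it if CUDA is installed elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given header mentions an atomicAdd
# that takes a __half pointer, i.e. the overload that clashes with cutorch's.
has_half_atomic_add() {
    grep -Eq 'atomicAdd\(.*__half \*' "$1"
}

# Path taken from the error description above; change it for other installs.
header=/usr/local/cuda/include/cuda_fp16.h
if [ -f "$header" ] && has_half_atomic_add "$header"; then
    echo "$header already declares atomicAdd for __half"
else
    echo "no __half atomicAdd found in $header"
fi
```

If the first message is printed, the patch in Step 3 is likely needed for your CUDA version.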
Step 1: Install the latest CMake from the GitHub repo.
$ sudo apt-get purge cmake
$ git clone https://github.com/Kitware/CMake.git
$ cd CMake
$ ./bootstrap; make; sudo make install
Note: the master branch of cudnn.torch does not support cuDNN v7. Installing from the R7 branch will probably work fine:
$ git clone https://github.com/soumith/cudnn.torch.git -b R7
$ cd cudnn.torch
$ luarocks make cudnn-scm-1.rockspec
Step 2: Remove FindCUDA.cmake.
$ cd ~/torch
$ rm -fr cmake/3.6/Modules/FindCUDA*
Step 3: Apply the following patch to cutorch
diff --git a/lib/THC/THCAtomics.cuh b/lib/THC/THCAtomics.cuh
index 400875c..ccb7a1c 100644
--- a/lib/THC/THCAtomics.cuh
+++ b/lib/THC/THCAtomics.cuh
@@ -94,6 +94,7 @@ static inline __device__ void atomicAdd(long *address, long val) {
 }
 #ifdef CUDA_HALF_TENSOR
+#if !(__CUDA_ARCH__ >= 700 || !defined(__CUDA_ARCH__) )
 static inline __device__ void atomicAdd(half *address, half val) {
   unsigned int * address_as_ui = (unsigned int *) ((char *)address - ((size_t)address & 2));
@@ -117,6 +118,7 @@ static inline __device__ void atomicAdd(half *address, half val) {
   } while (assumed != old);
 }
 #endif
+#endif
$ cd extra/cutorch
$ cat > atomic.patch <copy and paste the patch>
$ patch -p1 < atomic.patch
Step 4: Build it.
$ ./clean.sh
$ export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
$ ./install.sh
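Note that the export above overwrites any TORCH_NVCC_FLAGS you may already have set. If you want to keep existing flags, something like the following sketch could be used instead; the helper name add_nvcc_flag is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: add a flag to TORCH_NVCC_FLAGS unless it is already
# there, keeping whatever flags were set before.
add_nvcc_flag() {
    case " ${TORCH_NVCC_FLAGS:-} " in
        *" $1 "*) ;;  # already present, nothing to do
        *) TORCH_NVCC_FLAGS="${TORCH_NVCC_FLAGS:+$TORCH_NVCC_FLAGS }$1" ;;
    esac
    export TORCH_NVCC_FLAGS
}

add_nvcc_flag "-D__CUDA_NO_HALF_OPERATORS__"
echo "TORCH_NVCC_FLAGS=$TORCH_NVCC_FLAGS"
```

Source this in the same shell session before running ./install.sh so the build picks up the exported variable.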
Let me know if this helps!