Gary Sieling

Fixing an error where jCuda and Nd4j/Nd4s interact

If you use jCuda and Nd4s in the same process, you can hit the exception below. (In my case, I was using jCuda to read GPU memory information while Nd4s did the actual work.)

[error] Exception in thread "main" org.nd4j.linalg.exception.ND4JException: CUDA exception happened. Terminating. Last op: [null]
[error]         at org.nd4j.jita.allocator.pointers.cuda.cudaEvent_t.register(cudaEvent_t.java:63)
[error]         at org.nd4j.jita.flow.impl.SynchronousFlowController.registerAction(SynchronousFlowController.java:215)
[error]         at org.nd4j.jita.handler.impl.CudaZeroHandler.memcpyAsync(CudaZeroHandler.java:585)
[error]         at org.nd4j.jita.allocator.impl.AtomicAllocator.memcpyAsync(AtomicAllocator.java:920)
[error]         at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.set(BaseCudaDataBuffer.java:457)
[error]         at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.setData(BaseCudaDataBuffer.java:581)
[error]         at org.nd4j.linalg.jcublas.buffer.CudaIntDataBuffer.<init>(CudaIntDataBuffer.java:82)
[error]         at org.nd4j.linalg.jcublas.buffer.factory.CudaDataBufferFactory.createInt(CudaDataBufferFactory.java:356)
[error]         at org.nd4j.linalg.factory.Nd4j.createBufferDetached(Nd4j.java:1430)
[error]         at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:2045)
[error]         at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:47)
[error]         at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:64)
[error]         at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:26)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:163)
[error]         at org.nd4j.linalg.jcublas.JCublasNDArray.<init>(JCublasNDArray.java:335)
[error]         at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.create(JCublasNDArrayFactory.java:257)
[error]         at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4231)
[error]         at org.nd4j.linalg.api.shape.Shape.newShapeNoCopy(Shape.java:1230)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3741)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3796)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:4100)
[error]         at indexer.GpuConcepts$$anonfun$score$1.apply(GpuConcepts.scala:260)
[error]         at indexer.GpuConcepts$$anonfun$score$1.apply(GpuConcepts.scala:259)
[error]         at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]         at indexer.GpuConcepts$.score(GpuConcepts.scala:258)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20$$anonfun$apply$21.apply(GpuConcepts.scala:222)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20$$anonfun$apply$21.apply(GpuConcepts.scala:222)
[error]         at scala.collection.immutable.List.map(List.scala:284)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20.apply(GpuConcepts.scala:221)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20.apply(GpuConcepts.scala:221)
[error]         at scala.util.Try$.apply(Try.scala:192)
[error]         at indexer.GpuConcepts$$anonfun$exec$1.apply(GpuConcepts.scala:220)
[error]         at indexer.GpuConcepts$$anonfun$exec$1.apply(GpuConcepts.scala:224)
[error]         at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]         at indexer.GpuConcepts$.exec(GpuConcepts.scala:218)
[error]         at indexer.GpuConcepts$$anonfun$main$1.apply$mcV$sp(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts$$anonfun$main$1.apply(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts$$anonfun$main$1.apply(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]         at indexer.GpuConcepts$.main(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts.main(GpuConcepts.scala)
[error] CUDA error at D:/jenkins/workspace/dl4j/all-multiplatform_windows-x86_64/libnd4j/stream3/libnd4j/blas/cuda/NativeOps.cu:4866 code=33(cudaErrorInvalidResourceHandle) "result"
[error] CUDA error at D:/jenkins/workspace/dl4j/all-multiplatform_windows-x86_64/libnd4j/stream3/libnd4j/blas/cuda/NativeOps.cu:4749 code=33(cudaErrorInvalidResourceHandle) "result"
java.lang.RuntimeException: Nonzero exit code returned from runner: 1
        at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code returned from runner: 1
[error] Total time: 51 s, completed Jan 3, 2018 9:02:54 PM 

The fix is to destroy the CUDA context you created through jCuda once you are finished with it:

cuCtxDestroy(context)
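
For context, here is a minimal sketch of how the memory query could be wrapped so the context is always destroyed, even if the query throws. This is not the code from my project: the names `GpuMemory`, `MemInfo`, and `query` are made up for illustration, and it assumes jCuda is on the classpath and the GPU you want is device 0. `cuCtxCreate` makes the new context current on the calling thread, so the idea is to keep that context alive only for as long as the query needs it.

```scala
import jcuda.driver.{CUcontext, CUdevice, JCudaDriver}

// Hypothetical helper (not from the original project): reads free/total
// GPU memory through the jCuda driver API and always destroys the
// context it created before returning.
object GpuMemory {

  /** Free and total device memory, in bytes. */
  final case class MemInfo(free: Long, total: Long)

  def query(deviceOrdinal: Int = 0): MemInfo = {
    // Make jCuda throw CudaException instead of returning error codes.
    JCudaDriver.setExceptionsEnabled(true)
    JCudaDriver.cuInit(0)

    val device = new CUdevice
    JCudaDriver.cuDeviceGet(device, deviceOrdinal)

    // cuCtxCreate makes this context current on the calling thread.
    val context = new CUcontext
    JCudaDriver.cuCtxCreate(context, 0, device)

    try {
      // cuMemGetInfo writes its results into one-element arrays.
      val free  = new Array[Long](1)
      val total = new Array[Long](1)
      JCudaDriver.cuMemGetInfo(free, total)
      MemInfo(free(0), total(0))
    } finally {
      // The actual fix: destroy the context so it is no longer current
      // when Nd4j/Nd4s runs its next operation on this thread.
      JCudaDriver.cuCtxDestroy(context)
    }
  }
}
```

You would then call `GpuMemory.query()` between Nd4s operations and log the result. Putting `cuCtxDestroy` in a `finally` block is just one way to make sure the cleanup cannot be skipped.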