If you use jCuda and ND4S in the same process, you can hit the exception below. (I use jCuda only to read GPU memory information, while ND4S does the actual work.)
[error] Exception in thread "main" org.nd4j.linalg.exception.ND4JException: CUDA exception happened. Terminating. Last op: [null]
[error]     at org.nd4j.jita.allocator.pointers.cuda.cudaEvent_t.register(cudaEvent_t.java:63)
[error]     at org.nd4j.jita.flow.impl.SynchronousFlowController.registerAction(SynchronousFlowController.java:215)
[error]     at org.nd4j.jita.handler.impl.CudaZeroHandler.memcpyAsync(CudaZeroHandler.java:585)
[error]     at org.nd4j.jita.allocator.impl.AtomicAllocator.memcpyAsync(AtomicAllocator.java:920)
[error]     at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.set(BaseCudaDataBuffer.java:457)
[error]     at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.setData(BaseCudaDataBuffer.java:581)
[error]     at org.nd4j.linalg.jcublas.buffer.CudaIntDataBuffer.<init>(CudaIntDataBuffer.java:82)
[error]     at org.nd4j.linalg.jcublas.buffer.factory.CudaDataBufferFactory.createInt(CudaDataBufferFactory.java:356)
[error]     at org.nd4j.linalg.factory.Nd4j.createBufferDetached(Nd4j.java:1430)
[error]     at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:2045)
[error]     at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:47)
[error]     at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:64)
[error]     at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:26)
[error]     at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:163)
[error]     at org.nd4j.linalg.jcublas.JCublasNDArray.<init>(JCublasNDArray.java:335)
[error]     at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.create(JCublasNDArrayFactory.java:257)
[error]     at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4231)
[error]     at org.nd4j.linalg.api.shape.Shape.newShapeNoCopy(Shape.java:1230)
[error]     at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3741)
[error]     at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3796)
[error]     at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:4100)
[error]     at indexer.GpuConcepts$$anonfun$score$1.apply(GpuConcepts.scala:260)
[error]     at indexer.GpuConcepts$$anonfun$score$1.apply(GpuConcepts.scala:259)
[error]     at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]     at indexer.GpuConcepts$.score(GpuConcepts.scala:258)
[error]     at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20$$anonfun$apply$21.apply(GpuConcepts.scala:222)
[error]     at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20$$anonfun$apply$21.apply(GpuConcepts.scala:222)
[error]     at scala.collection.immutable.List.map(List.scala:284)
[error]     at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20.apply(GpuConcepts.scala:221)
[error]     at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20.apply(GpuConcepts.scala:221)
[error]     at scala.util.Try$.apply(Try.scala:192)
[error]     at indexer.GpuConcepts$$anonfun$exec$1.apply(GpuConcepts.scala:220)
[error]     at indexer.GpuConcepts$$anonfun$exec$1.apply(GpuConcepts.scala:224)
[error]     at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]     at indexer.GpuConcepts$.exec(GpuConcepts.scala:218)
[error]     at indexer.GpuConcepts$$anonfun$main$1.apply$mcV$sp(GpuConcepts.scala:103)
[error]     at indexer.GpuConcepts$$anonfun$main$1.apply(GpuConcepts.scala:103)
[error]     at indexer.GpuConcepts$$anonfun$main$1.apply(GpuConcepts.scala:103)
[error]     at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]     at indexer.GpuConcepts$.main(GpuConcepts.scala:103)
[error]     at indexer.GpuConcepts.main(GpuConcepts.scala)
[error] CUDA error at D:/jenkins/workspace/dl4j/all-multiplatform_windows-x86_64/libnd4j/stream3/libnd4j/blas/cuda/NativeOps.cu:4866 code=33(cudaErrorInvalidResourceHandle) "result"
[error] CUDA error at D:/jenkins/workspace/dl4j/all-multiplatform_windows-x86_64/libnd4j/stream3/libnd4j/blas/cuda/NativeOps.cu:4749 code=33(cudaErrorInvalidResourceHandle) "result"
java.lang.RuntimeException: Nonzero exit code returned from runner: 1
    at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code returned from runner: 1
[error] Total time: 51 s, completed Jan 3, 2018 9:02:54 PM
The issue is that the CUDA context you created through jCuda is still open when ND4S runs. Destroy it as soon as you are done with it:
cuCtxDestroy(context)
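For context, here is a minimal sketch of how the memory query and the cleanup could fit together, assuming the memory information is read through JCuda's driver API (jcuda.driver.JCudaDriver). The object name GpuMemoryInfo and the method query are hypothetical and only illustrate the pattern; whether this avoids the error on every ND4J version has not been verified.

```scala
import jcuda.driver.{CUcontext, CUdevice, JCudaDriver}

// Hypothetical helper: reads free/total GPU memory through JCuda and
// always destroys the context it created, even if the query fails.
object GpuMemoryInfo {

  /** Returns (freeBytes, totalBytes) for the given device ordinal. */
  def query(deviceOrdinal: Int = 0): (Long, Long) = {
    // Make JCuda throw a CudaException instead of returning error codes.
    JCudaDriver.setExceptionsEnabled(true)
    JCudaDriver.cuInit(0)

    val device = new CUdevice()
    JCudaDriver.cuDeviceGet(device, deviceOrdinal)

    // This is the context that has to be released before ND4S does its work.
    val context = new CUcontext()
    JCudaDriver.cuCtxCreate(context, 0, device)
    try {
      val free  = new Array[Long](1)
      val total = new Array[Long](1)
      JCudaDriver.cuMemGetInfo(free, total)
      (free(0), total(0))
    } finally {
      // The fix from above: close the context we opened.
      JCudaDriver.cuCtxDestroy(context)
    }
  }
}
```

Calling GpuMemoryInfo.query() before or between ND4S operations keeps the jCuda context short-lived. The try/finally is there so the context is released even when the memory query throws.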
Hi, what version of the software is this? Could you file an issue so we can help with this? You shouldn’t need to do anything with the internals. Thanks (Adam from deeplearning4j)
Please file an issue here: https://github.com/deeplearning4j/nd4j/issues – I would seriously consider not trying these kinds of workarounds if you can help it.