tf.one_hot()函数Windows GPU版本报错Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_INSTRUCTION;Unexpected Event status: 1


Windows10GPU上跑一段普通的TensorFlow代码报错如下

2019-04-02 09:50:47.986024: I C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.00GiB freeMemory: 9.10GiB
2019-04-02 09:50:47.991931: I C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2019-04-02 09:50:48.667536: I C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-02 09:50:48.672794: I C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]      0
2019-04-02 09:50:48.675436: I C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0:   N
2019-04-02 09:50:48.678921: I C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8795 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-04-02 09:50:51.208473: E C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\stream_executor\cuda\cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_INSTRUCTION
2019-04-02 09:50:51.213582: F C:\users\nwani\_bazel_nwani\swultrt5\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_event_mgr.cc:208] Unexpected Event status: 1

这是因为Windows版本下,GPU版本的TensorFlow里,tf.one_hot()函数有bug,最简单的解决办法是把这个函数放到CPU上(会影响速度,但不会报错)

with tf.device('/cpu:0'):
    b = tf.one_hot(a, 123)

还有一些其他报错也是这个问题,比如这里stackoverflow

E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_driver.cc:1177] could not synchronize on CUDA context: CUDA_ERROR_LAUNCH_FAILED :: No stack trace available
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED
F c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_util.cc:370] GPU sync failed


免责声明!

本站转载的文章为个人学习借鉴使用,本站对版权不负任何法律责任。如果侵犯了您的隐私权益,请联系本站邮箱yoyou2525@163.com删除。



 
粤ICP备18138465号  © 2018-2025 CODEPRJ.COM