Fixing CUDA results that differ between Debug and Release builds in Visual Studio



It is puzzling that, when programming CUDA under Visual Studio, a Debug build and a Release build of the same code can produce different results.

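We know that a Release build strips much of the compile-time and call information, but a numerical discrepancy in the results is still baffling.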

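Checking the CUDA documentation turned up `--use_fast_math`. It turns out that the Release build of the CUDA code was trading accuracy for speed. The relevant nvcc options are listed here; note that `--fmad=true` is the default even without `--use_fast_math`.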

| Name | Description |
| --- | --- |
| `--use_fast_math` | Make use of fast math library. `--use_fast_math` implies `--ftz=true` `--prec-div=false` `--prec-sqrt=false` `--fmad=true`. |
| `--ftz` {`true`\|`false`} | This option controls single-precision denormals support. `--ftz=true` flushes denormal values to zero and `--ftz=false` preserves denormal values. `--use_fast_math` implies `--ftz=true`. Allowed values for this option: true, false. Default value: false |
| `--prec-div` {`true`\|`false`} | This option controls single-precision floating-point division and reciprocals. `--prec-div=true` enables the IEEE round-to-nearest mode and `--prec-div=false` enables the fast approximation mode. `--use_fast_math` implies `--prec-div=false`. Allowed values for this option: true, false. Default value: true |
| `--prec-sqrt` {`true`\|`false`} | This option controls single-precision floating-point square root. `--prec-sqrt=true` enables the IEEE round-to-nearest mode and `--prec-sqrt=false` enables the fast approximation mode. `--use_fast_math` implies `--prec-sqrt=false`. Allowed values for this option: true, false. Default value: true |
| `--fmad` {`true`\|`false`} | This option enables (disables) the contraction of floating-point multiplies and adds/subtracts into floating-point multiply-add operations (FMAD, FFMA, or DFMA). `--use_fast_math` implies `--fmad=true`. Allowed values for this option: true, false. Default value: true |
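
To see which of these options is responsible for a Debug/Release mismatch, it can help to run one small kernel under several flag combinations. The following is a minimal sketch, not the original author's code: the file name `fp_check.cu`, the kernel `compare_kernel`, and the helper `run_compare` are hypothetical names chosen for illustration. It compares a plain `a * b + c` (which the compiler may or may not fuse, depending on `--fmad`) against CUDA intrinsics whose rounding behaviour does not depend on the compiler flags. The inputs are picked so that the exact product `a * b` falls halfway between two floats, which makes the fused and unfused results differ.

```cuda
// fp_check.cu (hypothetical file name). Builds one might compare:
//   nvcc -G  fp_check.cu -o fp_check_debug                // device debug, no optimization
//   nvcc -O3 fp_check.cu -o fp_check_default              // --fmad=true by default
//   nvcc -O3 --use_fast_math fp_check.cu -o fp_check_fast
//   nvcc -O3 --fmad=false    fp_check.cu -o fp_check_nofma
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

__global__ void compare_kernel(float a, float b, float c, float* out) {
    out[0] = a * b + c;                      // may be fused into an FMA, depending on --fmad
    out[1] = __fadd_rn(__fmul_rn(a, b), c);  // product rounded, then sum rounded; never fused
    out[2] = __fmaf_rn(a, b, c);             // always fused: a single rounding at the end
    out[3] = a / b;                          // accuracy depends on --prec-div
    out[4] = __fdiv_rn(a, b);                // always IEEE round-to-nearest division
}

// Runs the kernel once and copies the five results back to host_out.
// Returns false if any CUDA call reports an error.
bool run_compare(float a, float b, float c, float* host_out) {
    float* dev_out = nullptr;
    if (cudaMalloc(&dev_out, 5 * sizeof(float)) != cudaSuccess) return false;
    compare_kernel<<<1, 1>>>(a, b, c, dev_out);
    bool ok = (cudaGetLastError() == cudaSuccess);
    // cudaMemcpy waits for the kernel to finish before copying.
    ok = ok && (cudaMemcpy(host_out, dev_out, 5 * sizeof(float),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(dev_out);
    return ok;
}

int main() {
    // a * b = 1 + 2^-11 + 2^-24 exactly, which is halfway between two floats
    // and rounds (ties-to-even) to 1 + 2^-11 when the product is rounded on its own.
    const float a = 1.0f + ldexpf(1.0f, -12);
    const float b = a;
    const float c = -(1.0f + ldexpf(1.0f, -11));

    float out[5];
    if (!run_compare(a, b, c, out)) {
        fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    printf("a*b+c   (flag dependent) : %.9g\n", out[0]);
    printf("mul then add (unfused)   : %.9g\n", out[1]);  // exactly 0
    printf("fused multiply-add       : %.9g\n", out[2]);  // exactly 2^-24
    printf("a/b     (flag dependent) : %.9g\n", out[3]);
    printf("__fdiv_rn(a, b)          : %.9g\n", out[4]);
    return 0;
}
```

If the first line of output changes between two builds while the second and third stay fixed, FMA contraction is the likely cause, and building with `--fmad=false` (or writing the sensitive expression with `__fmul_rn`/`__fadd_rn`) should make the builds agree. Likewise, a difference on the `a/b` line points at `--prec-div`. For these particular inputs `a/b` is exactly 1, so a less trivial divisor may be needed to expose the fast-division approximation.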