Unable to call Thrust on CUDA memory

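I am trying to use the Thrust library to find the sum of an array that already resides in CUDA device memory. A few answers here say this can be done by wrapping the pointer with thrust::device_ptr, but doing so gives me an error.

Initial code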

cudaMemcpy((void *)(data + stride), (void *)d_output, sizeof(unsigned int) * rows * cols, cudaMemcpyDeviceToHost);
thrust::device_vector<unsigned int> vec((data + stride), (data + stride + (rows * cols)));
sum = thrust::reduce(vec.begin(), vec.end());

The code above works fine. But if I change it to

thrust::device_ptr<unsigned int> outputPtrBegin(d_output);
thrust::device_ptr<unsigned int> outputPtrEnd((d_output + stride + (rows * cols)));
sum = thrust::reduce(outputPtrBegin, outputPtrEnd);

it throws the following error.

terminate called after throwing an instance of 'thrust::system::system_error'
 what():  an illegal memory access was encountered
 Aborted (core dumped)

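What could be the problem? Thank you very much for your time.

EDIT: As per Robert Crovella's input, the mistake was in using stride. I have a follow-up question (related to the declarations above).

Depending on the value of toggle, I need to call Thrust on a different buffer: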

if(toggle) {
    thrust::device_ptr<unsigned int> outputPtrBegin(d_output);
    thrust::device_ptr<unsigned int> outputPtrEnd((d_output + (rows * cols)));
}
else {
    thrust::device_ptr<unsigned int> outputPtrBegin(d_X);
    thrust::device_ptr<unsigned int> outputPtrEnd((d_X + (rows * cols)));
}

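But the compiler says outputPtrBegin and outputPtrEnd are not declared, because they are declared inside the branches of the if statement. How can I declare these device pointers beforehand and then use them afterwards?

1 Answer

This is wrong: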

thrust::device_ptr<unsigned int> outputPtrEnd((d_output + stride + (rows * cols)));

It should be:

thrust::device_ptr<unsigned int> outputPtrEnd((d_output + (rows * cols)));

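In your first (working) example, you are copying a region from the device to the host. On the device, that region starts at d_output and is rows*cols elements long. That is the data you are passing to the reduce operation. Yes, on the host it happens to be copied to a region starting at data + stride, but that is irrelevant. Ultimately, your first implementation performs a reduction over rows*cols elements.

In your second implementation, you are attempting a reduction that starts at d_output and runs to d_output+stride+(rows*cols). That is not the same size of operation: it covers stride extra elements, and if the device allocation holds only rows*cols elements, those reads go past its end, which is consistent with the illegal memory access you are seeing.

Also, you may want to do something like this instead: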

thrust::device_ptr<unsigned int> outputPtrBegin(d_output);
thrust::device_ptr<unsigned int> outputPtrEnd = outputPtrBegin + (rows * cols);
sum = thrust::reduce(outputPtrBegin, outputPtrEnd);
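
For reference, here is a minimal self-contained sketch of the same pattern. The helper name sum_device_range, the buffer sizes, and the fill value are my own assumptions for illustration and do not come from the question; the point is only that the end of the range is computed as begin + count, with no stride added.

```cuda
#include <cstddef>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/reduce.h>

// Hypothetical helper: sum `count` elements starting at a raw device pointer.
unsigned int sum_device_range(unsigned int* d_ptr, std::size_t count) {
    // Wrap the raw pointer so Thrust dispatches the reduction to the device.
    thrust::device_ptr<unsigned int> first = thrust::device_pointer_cast(d_ptr);
    // Half-open range [first, first + count): exactly `count` elements.
    thrust::device_ptr<unsigned int> last = first + count;
    return thrust::reduce(first, last);
}

int main() {
    const int rows = 4, cols = 8;  // illustrative sizes (assumption)
    const std::size_t n = static_cast<std::size_t>(rows) * cols;

    // Fill a host buffer with ones so the expected sum equals n.
    std::vector<unsigned int> h_data(n, 1u);

    unsigned int* d_output = nullptr;
    if (cudaMalloc(&d_output, n * sizeof(unsigned int)) != cudaSuccess) return 1;
    cudaMemcpy(d_output, h_data.data(), n * sizeof(unsigned int),
               cudaMemcpyHostToDevice);

    unsigned int sum = sum_device_range(d_output, n);
    std::printf("sum = %u\n", sum);  // 32 ones were copied, so this prints 32

    cudaFree(d_output);
    return 0;
}
```

This needs to be compiled as a .cu file with nvcc. Passing n + stride as the count here would read past the end of the allocation, which is the situation in the failing code.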

Regarding your second question (please post new questions as new questions), instead of this:

if(toggle) {
    thrust::device_ptr<unsigned int> outputPtrBegin(d_output);
    thrust::device_ptr<unsigned int> outputPtrEnd((d_output + (rows * cols)));
}
else {
    thrust::device_ptr<unsigned int> outputPtrBegin(d_X);
    thrust::device_ptr<unsigned int> outputPtrEnd((d_X + (rows * cols)));
}

do something like this:

thrust::device_ptr<unsigned int> outputPtrBegin;
thrust::device_ptr<unsigned int> outputPtrEnd;
if(toggle) outputPtrBegin=thrust::device_pointer_cast<unsigned int>(d_output);
else outputPtrBegin=thrust::device_pointer_cast<unsigned int>(d_X);
outputPtrEnd = outputPtrBegin + (rows * cols);
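
Put together as a complete program, the idea might look like the sketch below. The helpers sum_selected and make_filled, the element count, and the fill values are assumptions of mine for illustration; only d_output, d_X, and toggle come from the question. The device_ptr is declared before the branch, so it is still in scope when the reduction runs.

```cuda
#include <cstddef>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/reduce.h>

// Hypothetical helper: choose the source buffer, then reduce `count` elements.
unsigned int sum_selected(bool toggle, unsigned int* d_output,
                          unsigned int* d_X, std::size_t count) {
    // Declared outside the if/else so it stays visible afterwards.
    thrust::device_ptr<unsigned int> first;
    if (toggle) first = thrust::device_pointer_cast(d_output);
    else        first = thrust::device_pointer_cast(d_X);
    thrust::device_ptr<unsigned int> last = first + count;
    return thrust::reduce(first, last);
}

// Hypothetical helper: allocate a device buffer holding `count` copies of `value`.
unsigned int* make_filled(std::size_t count, unsigned int value) {
    std::vector<unsigned int> h_data(count, value);
    unsigned int* d_ptr = nullptr;
    if (cudaMalloc(&d_ptr, count * sizeof(unsigned int)) != cudaSuccess)
        return nullptr;
    cudaMemcpy(d_ptr, h_data.data(), count * sizeof(unsigned int),
               cudaMemcpyHostToDevice);
    return d_ptr;
}

int main() {
    const std::size_t n = 6;                      // illustrative size (assumption)
    unsigned int* d_output = make_filled(n, 1u);  // six ones
    unsigned int* d_X = make_filled(n, 2u);       // six twos
    if (!d_output || !d_X) return 1;

    std::printf("toggle on:  %u\n", sum_selected(true, d_output, d_X, n));
    std::printf("toggle off: %u\n", sum_selected(false, d_output, d_X, n));

    cudaFree(d_output);
    cudaFree(d_X);
    return 0;
}
```

An equivalent option is to select the raw pointer first (for example with a conditional expression) and wrap it once; either way the wrapper has to be declared in the scope where thrust::reduce is called.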

Answered 2024-05-01T08:30:41+08:00
