
Can't find the in-place operation: one of the variables needed for gradient computation has been modified by an inplace operation


I am trying to compute a loss on the Jacobian of the network (i.e., to perform a double backward), and I get the following error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

I can't find the in-place operation in my code, so I don't know which line to fix.

The error occurs on the last line: loss3.backward()

    inputs_reg = Variable(data, requires_grad=True)
    output_reg = self.model.forward(inputs_reg)

    num_classes = output.size()[1]
    jacobian_list = []
    grad_output = torch.zeros(*output_reg.size())

    if inputs_reg.is_cuda:
        grad_output = grad_output.cuda()
        jacobian_list = jacobian.cuda()

    for i in range(10):
        zero_gradients(inputs_reg)
        grad_output.zero_()
        grad_output[:, i] = 1
        jacobian_list.append(torch.autograd.grad(
            outputs=output_reg,
            inputs=inputs_reg,
            grad_outputs=grad_output,
            only_inputs=True,
            retain_graph=True,
            create_graph=True)[0])

    jacobian = torch.stack(jacobian_list, dim=0)
    loss3 = jacobian.norm()
    loss3.backward()
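
One way to locate the offending line: newer PyTorch versions provide torch.autograd.detect_anomaly, which makes a failing backward also report the forward operation whose saved tensor was modified in place. A minimal sketch, assuming a toy torch.nn.Linear stand-in for self.model and random data (not the original code):

    import torch

    model = torch.nn.Linear(4, 3)   # hypothetical stand-in for self.model
    data = torch.randn(2, 4)

    with torch.autograd.detect_anomaly():
        inputs_reg = data.clone().requires_grad_(True)
        output_reg = model(inputs_reg)

        grad_output = torch.zeros_like(output_reg)
        grad_output[:, 0] = 1       # pick the first output column
        jac = torch.autograd.grad(outputs=output_reg,
                                  inputs=inputs_reg,
                                  grad_outputs=grad_output,
                                  retain_graph=True,
                                  create_graph=True)[0]

        grad_output.zero_()         # in-place edit of a tensor saved for the double backward
        loss3 = jac.norm()
        loss3.backward()            # should fail here, with a traceback pointing at the culprit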

2 Answers

  • Thanks! I replaced the code with the problematic in-place operations on grad_output:

    inputs_reg = Variable(data, requires_grad=True)
    output_reg = self.model.forward(inputs_reg)
    num_classes = output.size()[1]

    jacobian_list = []
    grad_output = torch.zeros(*output_reg.size())

    if inputs_reg.is_cuda:
        grad_output = grad_output.cuda()

    for i in range(5):
        zero_gradients(inputs_reg)

        grad_output_curr = grad_output.clone()
        grad_output_curr[:, i] = 1
        jacobian_list.append(torch.autograd.grad(
            outputs=output_reg,
            inputs=inputs_reg,
            grad_outputs=grad_output_curr,
            only_inputs=True,
            retain_graph=True,
            create_graph=True)[0])

    jacobian = torch.stack(jacobian_list, dim=0)
    loss3 = jacobian.norm()
    loss3.backward()
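
    The clone is what makes the difference: each iteration writes into a fresh tensor before it is handed to autograd.grad, so the grad_outputs tensor that earlier iterations already saved into the graph is never modified afterwards. As a rough alternative sketch that drops the shared buffer entirely (reusing inputs_reg and output_reg from the snippet above, and assuming num_classes = output_reg.size(1)), each one-hot grad_output can be taken from a row of an identity matrix:

    import torch

    # Hypothetical sketch: one-hot grad_outputs built from identity-matrix rows;
    # each row is expanded to the batch shape and never mutated after use.
    eye = torch.eye(num_classes, device=output_reg.device)
    jacobian_list = []
    for i in range(num_classes):
        grad_output_curr = eye[i].expand_as(output_reg)  # column i is 1, the rest 0
        jacobian_list.append(torch.autograd.grad(
            outputs=output_reg,
            inputs=inputs_reg,
            grad_outputs=grad_output_curr,
            retain_graph=True,
            create_graph=True)[0])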
    
  • grad_output.zero_() is in-place, and so is grad_output[:, i-1] = 0. In-place means "modify a tensor instead of returning a new one, which has the modifications applied". A non-in-place alternative is torch.where. Example usage for zeroing out column 1:

    import torch
    t = torch.randn(3, 3)
    ixs = torch.arange(3, dtype=torch.int64)
    zeroed = torch.where(ixs[None, :] == 1, torch.tensor(0.), t)
    
    zeroed
    tensor([[-0.6616,  0.0000,  0.7329],
            [ 0.8961,  0.0000, -0.1978],
            [ 0.0798,  0.0000, -1.2041]])
    
    t
    tensor([[-0.6616, -1.6422,  0.7329],
            [ 0.8961, -0.9623, -0.1978],
            [ 0.0798, -0.7733, -1.2041]])
    

    Note how t keeps its previous values and zeroed contains the values you want.
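
    Applied to the question's loop, the same idea builds each grad_output without writing into a shared tensor. A rough sketch, reusing output_reg from the question and assuming its second dimension indexes the output classes:

    import torch

    # Hypothetical sketch: build the one-hot grad_output for column i
    # out of place with torch.where instead of zero_() plus indexing.
    cols = torch.arange(output_reg.size(1), device=output_reg.device)
    grad_output = torch.where(cols[None, :] == i,
                              torch.tensor(1., device=output_reg.device),
                              torch.tensor(0., device=output_reg.device)).expand_as(output_reg)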
