
Saving predictions from a PyTorch model


I followed the PyTorch transfer learning tutorial and applied it to the Kaggle seedling classification task. I'm just not sure how to save the predictions to a CSV file so that I can submit them. Any advice would be helpful. This is what I have:

import torch
from torchvision import models
from torch.optim import lr_scheduler

use_gpu = torch.cuda.is_available()
model = models.resnet50(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

num_ftrs = model.fc.in_features
model.fc = torch.nn.Linear(num_ftrs, len(classes))
if use_gpu:
    model = model.cuda()

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

loaders = {'train':train_loader, 'valid':valid_loader, 'test': test_loader}

model = train_model(loaders, model, criterion, optimizer, exp_lr_scheduler, num_epochs=50)

2 Answers

  • 2

    Once you've trained the model, you can evaluate it on the test data. This gives you a Variable, possibly on the GPU. From there, you'll want to copy its tensor back to the CPU with cpu() and convert it to a numpy array with numpy(). Then you can use numpy's CSV functionality or, for example, pandas' DataFrame.to_csv. In the first case, you'd have something like this:

    # evaluate on Variable x with testing data
    y = model(x)
    # access Variable's tensor, copy back to CPU, convert to numpy
    arr = y.data.cpu().numpy()
    # write CSV
    np.savetxt('output.csv', arr)
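    For the second case (pandas), a minimal sketch might look like the following. The filenames and predicted class indices here are hypothetical stand-ins for whatever your test set actually produces; a DataFrame lets you attach column headers, which most Kaggle submission formats require.

```python
import numpy as np
import pandas as pd

# Hypothetical predicted class indices for four test samples,
# as you would get from y.data.cpu().numpy() after an argmax
arr = np.array([2, 0, 1, 2])

# Pair each prediction with its (hypothetical) test filename and
# write a headered CSV, dropping pandas' default row index
df = pd.DataFrame({"file": ["img0.png", "img1.png", "img2.png", "img3.png"],
                   "species": arr})
df.to_csv("submission.csv", index=False)
```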
    
  • 1

    I'm sharing the evaluation function I used for an SNLI task. Note that this is just an example, not necessarily the exact answer you're looking for. I hope it helps!

    def evaluate(model, batches, dictionary, outfile=None):
        # Turn on evaluation mode which disables dropout.
        model.eval()
    
        n_correct, n_total = 0, 0
        y_preds, y_true, output = [], [], []
        for batch_no in range(len(batches)):
            sent1, sent_len1, sent2, sent_len2, labels = helper.batch_to_tensors(batches[batch_no], dictionary)
            if model.config.cuda:
                sent1 = sent1.cuda()
                sent2 = sent2.cuda()
                labels = labels.cuda()
    
            score = model(sent1, sent_len1, sent2, sent_len2)
            preds = torch.max(score, 1)[1]
            if outfile:
                predictions = preds.data.cpu().tolist()
                for i in range(len(batches[batch_no])):
                    output.append([batches[batch_no][i].id, predictions[i]])
            else:
                y_preds.extend(preds.data.cpu().tolist())
                y_true.extend(labels.data.cpu().tolist())
                n_correct += (preds.view(labels.size()).data == labels.data).sum()
                n_total += len(batches[batch_no])
    
        if outfile:
            target_names = ['entailment', 'neutral', 'contradiction']
            with open(outfile, 'w') as f:
                f.write('pairID,gold_label' + '\n')
                for item in output:
                    f.write(str(item[0]) + ',' + target_names[item[1]] + '\n')
        else:
            return 100. * n_correct / n_total, 100. * f1_score(numpy.asarray(y_true), numpy.asarray(y_preds),
                                                               average='weighted')
    

    Usually, I call the evaluate function as follows:

    evaluate(model, test_batches, dictionary, args.save_path + 'predictions.csv')
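    Adapted to the question's setup, a minimal sketch might look like the code below. It assumes the test Dataset yields (image_batch, filenames) pairs — adjust this to however your Dataset is actually defined — and reuses the model, test_loader, classes, and use_gpu names from the question; the "file,species" header is what the seedling competition's submission format expects.

```python
import csv
import torch

def save_predictions(model, test_loader, classes, use_gpu,
                     outfile="submission.csv"):
    model.eval()  # disable dropout / put batchnorm in inference mode
    rows = []
    with torch.no_grad():  # no gradients needed at inference time
        for images, filenames in test_loader:
            if use_gpu:
                images = images.cuda()
            scores = model(images)
            # argmax over the class dimension, then back to the CPU
            preds = torch.max(scores, 1)[1].cpu().tolist()
            rows.extend((name, classes[p]) for name, p in zip(filenames, preds))
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "species"])
        writer.writerows(rows)
```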
    
