
Finding the percentage of rows where the sequence in one column matches the sequence in another column


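I hope I can explain this properly. I have a dataset from a memory experiment with two columns that I want to compare. Recall.CRESP is a column that specifies the correct answer for a memory test, given as a sequence of grid coordinates. Recall.RESP shows the participant's response.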

The columns look like this:

|Recall.CRESP                     | Recall.RESP                     |
|---------------------------------|---------------------------------|                 
|grid35grid51grid12grid43grid54   | grid35grid51grid12grid43grid54  |                
|grid22grid53grid35grid21grid44   | grid23grid53grid35grid21grid43  |
|grid12grid14grid15grid41grid23   | grid12grid24grid31grid41grid25  |
|grid15grid41grid33grid24grid55   | grid15grid41grid33grid14grid55  |

I have the following line of code, which tells me the percentage of rows in which the two columns are identical:

paste0(100 * with(Data, mean(Recall.CRESP == Recall.RESP, na.rm = TRUE)), "%")

So, for example, in my dataset Recall.CRESP exactly matches Recall.RESP 20% of the time, which means that subjects scored 5 out of 5 on the memory test 20% of the time.

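However, I would like to extend this in two ways. First, instead of a percentage of rows that are completely identical, I want the percentage of rows where the sequences partially match. For example, grid11grid42grid22grid51grid32 and grid11grid15grid55grid42grid32 share 2/5 matches: the first and last grid coordinates are the same. I do not know how to ask R for a partial sequence match of 2/5 (or any other score out of 5). Note also that in this example grid42 appears in both sequences, but it does not count as correctly recalled because of the position in which it was remembered in Recall.RESP. Order matters in these sequences.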

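The second point is that so far I have described the experiment in terms of checking the accuracy of forward recall of the memory items. However, I also have separate data in which participants recalled the items in backward order. For example, grid11grid22grid33grid44grid55 from Recall.CRESP and grid51grid44grid33grid22grid11 from Recall.RESP match correctly in 4 of 5 positions. How can I adapt the code to check the reversed sequence and compute the percentage correct out of 5?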

Any ideas would be greatly appreciated.

2 Answers

  • 1

    Here is my solution:

    Recall.CRESP <- c('grid35grid51grid12grid43grid54',
                      'grid22grid53grid35grid21grid44',
                      'grid12grid14grid15grid41grid23',
                      'grid15grid41grid33grid24grid55')
    
    Recall.RESP <- c('grid35grid51grid12grid43grid54',
                     'grid23grid53grid35grid21grid43',
                     'grid12grid24grid31grid41grid25',
                     'grid15grid41grid33grid14grid55')
    
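    df <- data.frame(Recall.CRESP, Recall.RESP, stringsAsFactors = FALSE)
    df$correctNormal <- NA
    df$correctReverse <- NA
    
    for (row in seq_len(nrow(df))) {
      # Split on 'grid'; [-1] drops the empty string before the first 'grid'
      crespVector <- unlist(strsplit(as.character(df[row, 1]), 'grid'))[-1]
      respVector <- unlist(strsplit(as.character(df[row, 2]), 'grid'))[-1]
      correctNormal <- 0
      correctReverse <- 0
      for (i in seq_along(crespVector)) {
        # Forward recall: compare the same position in both sequences
        if (crespVector[i] == respVector[i]) correctNormal <- correctNormal + 1
        # Backward recall: compare position i with the response read from the end
        if (crespVector[i] == respVector[length(respVector) + 1 - i]) correctReverse <- correctReverse + 1
      }
      # Proportion correct out of the sequence length (5 in this data)
      df$correctNormal[row] <- correctNormal / length(crespVector)
      df$correctReverse[row] <- correctReverse / length(crespVector)
    }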
    
    df
    
    ##                     Recall.CRESP                    Recall.RESP correctNormal correctReverse
    ## 1 grid35grid51grid12grid43grid54 grid35grid51grid12grid43grid54           1.0            0.2
    ## 2 grid22grid53grid35grid21grid44 grid23grid53grid35grid21grid43           0.6            0.2
    ## 3 grid12grid14grid15grid41grid23 grid12grid24grid31grid41grid25           0.4            0.0
    ## 4 grid15grid41grid33grid24grid55 grid15grid41grid33grid14grid55           0.8            0.2
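
    As a rough check, the same position-by-position comparison can be wrapped in a small function and applied to the two examples from the question. This is only a sketch: score_recall is a hypothetical helper name, the "gird" spellings in the question are assumed to be typos for "grid", and both sequences are assumed to contain the same number of cells.

    ```r
    # Hypothetical helper: proportion of positions where the response matches the key.
    # Assumes both strings contain the same number of "gridNN" cells.
    score_recall <- function(cresp, resp, reverse = FALSE) {
      # Extract the cells in order, e.g. "grid11grid42" -> c("grid11", "grid42")
      cresp_cells <- regmatches(cresp, gregexpr("grid[0-9]+", cresp))[[1]]
      resp_cells  <- regmatches(resp,  gregexpr("grid[0-9]+", resp))[[1]]
      # For backward recall, flip the response before comparing
      if (reverse) resp_cells <- rev(resp_cells)
      mean(cresp_cells == resp_cells)
    }

    # Forward example from the question: first and last cells match
    forward <- score_recall("grid11grid42grid22grid51grid32",
                            "grid11grid15grid55grid42grid32")
    forward   # 0.4, i.e. 2/5

    # Backward example from the question: four cells match once the response is reversed
    backward <- score_recall("grid11grid22grid33grid44grid55",
                             "grid51grid44grid33grid22grid11",
                             reverse = TRUE)
    backward  # 0.8, i.e. 4/5
    ```

    Matching on the full "gridNN" token rather than splitting on 'grid' avoids the leading empty string, but either approach gives the same scores here.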
    
  • 1

    I would split the strings into the columns of a matrix, which makes them easy to compare and manipulate:

    # borrowing Oriol's nicely shared data
    Recall.CRESP <- c('grid35grid51grid12grid43grid54',
                      'grid22grid53grid35grid21grid44',
                      'grid12grid14grid15grid41grid23',
                      'grid15grid41grid33grid24grid55')
    
    Recall.RESP <- c('grid35grid51grid12grid43grid54',
                     'grid23grid53grid35grid21grid43',
                     'grid12grid24grid31grid41grid25',
                     'grid15grid41grid33grid14grid55')
    
    # function to create matrices
    matrixify = function(dat) {
        dat = do.call(rbind, strsplit(dat, split = "grid"))
        dat = dat[, -1]
        mode(dat) = "numeric"
        return(dat)
    }
    
    cresp_mat = matrixify(Recall.CRESP)
    resp_mat = matrixify(Recall.RESP)
    
    ## an example of what we made: just the numbers in the right order
    cresp_mat
    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]   35   51   12   43   54
    # [2,]   22   53   35   21   44
    # [3,]   12   14   15   41   23
    # [4,]   15   41   33   24   55
    
    ## Calculating results is now easy:
    (forwards = rowMeans(cresp_mat == resp_mat))
    # [1] 1.0 0.6 0.4 0.8
    
    (reverse = rowMeans(cresp_mat == resp_mat[, 5:1]))
    # [1] 0.2 0.2 0.0 0.2
    

    Of course, you can assign the results as new columns of the original data.
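
    One possible way to do that, and to get the percentage of rows at each score that the question asks about, is sketched below. The data frame name Data is taken from the question; split_cells, n_forward, n_reverse, score_share, and exact_pct are hypothetical names introduced here, and every row is assumed to hold the same number of cells.

    ```r
    Data <- data.frame(
      Recall.CRESP = c("grid35grid51grid12grid43grid54",
                       "grid22grid53grid35grid21grid44",
                       "grid12grid14grid15grid41grid23",
                       "grid15grid41grid33grid24grid55"),
      Recall.RESP  = c("grid35grid51grid12grid43grid54",
                       "grid23grid53grid35grid21grid43",
                       "grid12grid24grid31grid41grid25",
                       "grid15grid41grid33grid14grid55"),
      stringsAsFactors = FALSE
    )

    # Hypothetical helper: one row per trial, one column per serial position
    split_cells <- function(x) {
      do.call(rbind, regmatches(x, gregexpr("grid[0-9]+", x)))
    }

    cresp_mat <- split_cells(Data$Recall.CRESP)
    resp_mat  <- split_cells(Data$Recall.RESP)
    n_items   <- ncol(cresp_mat)

    # Number of correct positions per row, forward and with the response reversed
    Data$n_forward <- rowSums(cresp_mat == resp_mat)
    Data$n_reverse <- rowSums(cresp_mat == resp_mat[, n_items:1, drop = FALSE])

    # Percentage of rows at each forward score from 0 to n_items
    score_share <- 100 * prop.table(table(factor(Data$n_forward, levels = 0:n_items)))
    score_share

    # Percentage of rows recalled perfectly (the figure the question's one-liner reports)
    exact_pct <- 100 * mean(Data$n_forward == n_items)
    exact_pct  # 25 for these four rows
    ```

    Using factor() with explicit levels keeps scores that never occur (0/5 and 1/5 here) in the table with a share of 0.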
