
How do I match data in two columns (latitude/longitude) to the closest values in two other columns?


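I have a list of known latitude/longitude coordinates, and I need to find them in a latitude/longitude grid and pull out the adjacent data. My known coordinates are in a data frame like this: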

LatLong <- structure(list(Lat_orig = c(-55.417, -55.417, -55.417, -55.417, 
-55.417), Long_orig = c(-69.58, -69.249, -69.0831, -69.417, -69.749
), Lat_grid = c(NA, NA, NA, NA, NA), Long_grid = c(NA, NA, NA, 
NA, NA), Jan = c(NA, NA, NA, NA, NA), Feb = c(NA, NA, NA, NA, 
NA), Mar = c(NA, NA, NA, NA, NA), Apr = c(NA, NA, NA, NA, NA), 
May = c(NA, NA, NA, NA, NA), Jun = c(NA, NA, NA, NA, NA), 
Jul = c(NA, NA, NA, NA, NA), Aug = c(NA, NA, NA, NA, NA), 
Sep = c(NA, NA, NA, NA, NA), Oct = c(NA, NA, NA, NA, NA), 
Nov = c(NA, NA, NA, NA, NA), Dec = c(NA, NA, NA, NA, NA)), .Names = c("Lat_orig", 
"Long_orig", "Lat_grid", "Long_grid", "Jan", "Feb", "Mar", "Apr", 
"May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), class = "data.frame", row.names = c(NA, 
-5L))



Lat_orig  Long_orig  Lat_grid  Long_grid  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
 -55.417   -69.5800        NA         NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
 -55.417   -69.2490        NA         NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
 -55.417   -69.0831        NA         NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
 -55.417   -69.4170        NA         NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
 -55.417   -69.7490        NA         NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA

I have a second data frame containing a gridded global dataset. It is very large, but here is a small piece:

Grid <- structure(list(lat = c(-55.417, -55.417, -55.417, -55.417, -55.417
), long = c(-69.75, -69.583, -69.417, -69.25, -69.083), jan = c(8.5, 
8.5, 8.4, 8.7, 8.8), feb = c(8.4, 8.5, 8.3, 8.6, 8.8), mar = c(7.3, 
7.3, 7.2, 7.5, 7.6), apr = c(5.8, 5.8, 5.7, 5.9, 6), may = c(4, 
3.9, 3.7, 4, 4), jun = c(2.7, 2.7, 2.4, 2.7, 2.7), jul = c(2.2, 
2.2, 2, 2.2, 2.3), aug = c(2.6, 2.6, 2.4, 2.7, 2.8), sep = c(3.8, 
3.9, 3.7, 4, 4.1), oct = c(5.5, 5.5, 5.3, 5.7, 5.8), nov = c(6.6, 
6.7, 6.5, 6.9, 7), dec = c(7.9, 7.9, 7.7, 8.1, 8.2)), .Names = c("lat", 
"long", "jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", 
"sep", "oct", "nov", "dec"), class = "data.frame", row.names = c(NA, 
-5L))



    lat     long  jan  feb  mar  apr  may  jun  jul  aug  sep  oct  nov  dec
-55.417  -69.750  8.5  8.4  7.3  5.8  4.0  2.7  2.2  2.6  3.8  5.5  6.6  7.9
-55.417  -69.583  8.5  8.5  7.3  5.8  3.9  2.7  2.2  2.6  3.9  5.5  6.7  7.9
-55.417  -69.417  8.4  8.3  7.2  5.7  3.7  2.4  2.0  2.4  3.7  5.3  6.5  7.7
-55.417  -69.250  8.7  8.6  7.5  5.9  4.0  2.7  2.2  2.7  4.0  5.7  6.9  8.1
-55.417  -69.083  8.8  8.8  7.6  6.0  4.0  2.7  2.3  2.8  4.1  5.8  7.0  8.2

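I need to find each latitude/longitude coordinate from LatLong in Grid, then pull the adjacent data from the Grid columns jan through dec and place them in the corresponding columns of the LatLong data frame. When I do this by hand, I first find the nearest latitude, then look through the associated longitudes for the closest match. That gives me a solution like this: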

Lat_orig  Long_orig  Lat_grid  Long_grid  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
 -55.417     -69.58   -55.417    -69.583  8.5  8.5  7.3  5.8  3.9  2.7  2.2  2.6  3.9  5.5  6.7  7.9
 -55.417    -69.249   -55.417     -69.25  8.7  8.6  7.5  5.9    4  2.7  2.2  2.7    4  5.7  6.9  8.1
 -55.417   -69.0831   -55.417    -69.083  8.8  8.8  7.6    6    4  2.7  2.3  2.8  4.1  5.8    7  8.2
 -55.417    -69.417   -55.417    -69.417  8.4  8.3  7.2  5.7  3.7  2.4    2  2.4  3.7  5.3  6.5  7.7
 -55.417    -69.749   -55.417     -69.75  8.5  8.4  7.3  5.8    4  2.7  2.2  2.6  3.8  5.5  6.6  7.9

Note that in my example all latitude values are constant, but in the full data the latitudes also vary in both data frames.

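Does anyone know the best way to do this? I tried using gdist from the Imap package, and I can find the nearest point, but only for one coordinate at a time! Does anyone know a good way to find these points and move the data into the new data frame?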

1 Answer

  • 1

    The generalized problem, in one dimension:

    Given a set of sample points and a set of reference points, how do you map each sample point to its nearest reference point?

    Let's generate some sample points and reference points.

    set.seed(100)
    pp <- sample(0:100, 10, replace = FALSE)
    # [1] 31 25 54  5 45 46 77 34 50 15
    rr <- sort(sample(0:100, 10, replace = FALSE))
    # [1]  19  27  33  39  63  64  73  88  93 100
    

    Using findInterval and midpoints:

    ## finds midpoints between reference points
    midpoints <- head(rr,-1) + diff(rr)/2
    # [1] 23.0 30.0 36.0 51.0 63.5 68.5 80.5 90.5 96.5
    ## determines which reference interval each sample point falls into
    intv <- findInterval(pp, midpoints)
    # [1] 2 1 4 0 3 3 6 2 3 0
    ## index back into reference point to find closest reference point
    rr[intv+1]
    # [1] 33 27 63 19 39 39 73 33 39 19
    

    Do this for your latitudes and your longitudes separately, and you can find the matching grid point.
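As a rough sketch of that step, the midpoint lookup can be wrapped in a small helper and applied to each axis in turn. The helper name `nearest_ref` is hypothetical (it is not part of the original answer), and the sketch assumes a regular grid, so that the nearest latitude and the nearest longitude can be found independently from the sorted unique grid values.

```r
# Hypothetical helper: snap each value in x to its nearest value in ref.
# Assumes ref holds the coordinates of one axis of a regular grid.
nearest_ref <- function(x, ref) {
  ref <- sort(unique(ref))
  # With a single reference value there are no midpoints to compute
  if (length(ref) == 1L) return(rep(ref, length(x)))
  # Midpoints between consecutive reference values
  midpoints <- head(ref, -1) + diff(ref) / 2
  # findInterval() counts how many midpoints are <= x; add 1 to index ref
  ref[findInterval(x, midpoints) + 1]
}

# Coordinates copied from the question's Grid and LatLong excerpts
grid_lat  <- c(-55.417, -55.417, -55.417, -55.417, -55.417)
grid_long <- c(-69.75, -69.583, -69.417, -69.25, -69.083)
lat_orig  <- c(-55.417, -55.417, -55.417, -55.417, -55.417)
long_orig <- c(-69.58, -69.249, -69.0831, -69.417, -69.749)

lat_grid  <- nearest_ref(lat_orig, grid_lat)
long_grid <- nearest_ref(long_orig, grid_long)
long_grid
```

For the sample data, `long_grid` reproduces the Long_grid column of the desired result in the question. A point lying exactly on a midpoint is assigned to the larger of its two neighbours, because findInterval treats each interval as closed on the left.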

    To pull in the rest of each record's data, use merge (but beware of floating-point errors).
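One possible way to sidestep the floating-point issue is to join on a text key built from the coordinates formatted to a fixed number of decimals, rather than on the raw doubles. The sketch below is only an illustration: `make_key`, `snapped`, and the choice of three decimals are assumptions made here, and only two of the month columns are kept for brevity.

```r
# Grid excerpt from the question (jan and feb only)
Grid <- data.frame(
  lat  = rep(-55.417, 5),
  long = c(-69.75, -69.583, -69.417, -69.25, -69.083),
  jan  = c(8.5, 8.5, 8.4, 8.7, 8.8),
  feb  = c(8.4, 8.5, 8.3, 8.6, 8.8)
)

# Points that have already been snapped to their nearest grid coordinates
snapped <- data.frame(
  Lat_orig  = rep(-55.417, 5),
  Long_orig = c(-69.58, -69.249, -69.0831, -69.417, -69.749),
  Lat_grid  = rep(-55.417, 5),
  Long_grid = c(-69.583, -69.25, -69.083, -69.417, -69.75)
)

# Hypothetical helper: a text key at fixed precision (3 decimals assumed),
# so tiny differences in the doubles cannot break the join
make_key <- function(lat, long, digits = 3) {
  paste(formatC(lat, format = "f", digits = digits),
        formatC(long, format = "f", digits = digits))
}
Grid$key    <- make_key(Grid$lat, Grid$long)
snapped$key <- make_key(snapped$Lat_grid, snapped$Long_grid)

# merge() joins on the shared key; rows come back sorted by key,
# not in the original order of `snapped`
result <- merge(snapped, Grid[c("key", "jan", "feb")], by = "key")
result
```

The number of decimals should match the grid's actual resolution; too few would collapse distinct grid cells into the same key.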
