Simplest way to manipulate raster data for discretizing the annual distribution of daily temperature

I have gridded raster data of Germany's historical daily mean temperature (15 years of observations) in a large RasterBrick object. This is what my gridded raster data looks like:

> Temperature_rasterData
class       : RasterBrick 
dimensions  : 31, 37, 1147, 5479  (nrow, ncol, ncell, nlayers)
resolution  : 0.25, 0.25  (x, y)
extent      : 5.75, 15, 47.25, 55  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
data source : in memory
names       : X1980.01.01, X1980.01.02, X1980.01.03, X1980.01.04, X1980.01.05, X1980.01.06, X1980.01.07, X1980.01.08, X1980.01.09, X1980.01.10, X1980.01.11, X1980.01.12, X1980.01.13, X1980.01.14, X1980.01.15, ... 
min values  :       -9.24,      -11.32,      -12.05,      -14.12,       -7.91,       -6.35,       -6.74,       -7.77,       -9.79,      -10.17,      -12.20,      -14.90,      -15.68,      -15.61,      -15.22, ... 
max values  :        2.19,        0.68,        0.30,        2.91,        5.25,        5.03,        4.33,        3.40,        1.52,        0.33,       -1.10,       -1.61,       -3.55,       -0.12,        0.19, ...

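However, I intend to discretize the annual distribution of daily temperature into a fixed set of temperature bins (I need 10 bins per year); the detailed method is described here: Temperature Effects on Productivity and Factor Reallocation. To do this, I need to find the maximum and minimum temperature values across all layers of the gridded raster data. I need the temperature range because the annual distribution of daily temperature in each grid cell has to be divided according to the MAX/MIN temperature values.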

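Unfortunately, I can't reproduce the multi-layer RasterBrick data here in R, because the original gridded raster data is very large and hard to reduce to a small raster. I hope the SO community can understand the situation. Here is a smaller, reproducible raster dataset: please give it try smallest example raster data, and here is my R script for processing the downloaded raster data: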

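library(raster)    # stack(), extent(), crop()
library(maptools)  # provides the wrld_simpl world borders dataset

temp_raster <- raster::stack('~/tg_day_2017_grid_ensmean.nc')
data(wrld_simpl)
Germany <- wrld_simpl[wrld_simpl@data$NAME == "Germany",]
deu_ext <- extent(Germany)
Deu_crop <- crop(temp_raster ,deu_ext)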

To get the temperature range of this multi-layer raster data, I tried the following, which is not very smart; I need a simpler solution. Here is my attempt in R:

nms <- names(Deu_crop)
yrs <- unique(sub('X(\\d+).+','\\1',nms))

getRange <- lapply(yrs,function(x) {
    range(Deu_crop[[grep(x,nms)]],na.rm=TRUE)
})

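I really don't know how to discretize the data in a large RasterBrick object. In particular, it is not clear to me how to manipulate the data for discretization, because it has many layers with a huge number of daily mean temperature observations. How can I do this in R? Is it possible to manipulate multi-layer raster data for discretization? Any ideas?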

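If there is a simpler way to manipulate large data, how can I discretize the annual distribution of daily temperature and make a bar chart for each year? What is the easiest way to do this in R? Thanks in advance!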

Here is the kind of bar chart I would like to make from the multi-layer raster data:

(image: desired bar chart)

Update:

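I am going to discretize the annual distribution of daily temperature observations, year by year, for each of Germany's regions (i.e., polygons). Here is the shapefile of Germany's NUTS regions: Germany's shapefile.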

1 Answer

Here is a solution (including a reproducible example):

library(raster)
library(lubridate)
library(tidyverse)

# creating some fake temperature data which matches your rasterstack

# create template raster
r <- raster(xmn=5.75, xmx= 15, ymn = 47.25, ymx =55,res=c(0.25,0.25))

# add fake temperature values
Deu_crop <- do.call(stack,lapply(1:5479,function(i) setValues(r,round(runif(n = ncell(r),min = -10,max = 25)))))

# add layer names
names(Deu_crop) <- paste0('X',gsub('-','.',ymd('1980.01.01') + days(1:5479)))

# check rasterstack

Deu_crop

# output
#
# class       : RasterStack 
# dimensions  : 31, 37, 1147, 5479  (nrow, ncol, ncell, nlayers)
# resolution  : 0.25, 0.25  (x, y)
# extent      : 5.75, 15, 47.25, 55  (xmin, xmax, ymin, ymax)
# coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
# names       : X1980.01.02, X1980.01.03, X1980.01.04, X1980.01.05, X1980.01.06, X1980.01.07, ...
# min values  :         -10,         -10,         -10,         -10,         -10,         -10, ...
# max values  :          25,          25,          25,          25,          25,          25, ...

So Deu_crop should be structurally comparable to your data, though of course with random temperature values.

The shapefile is not easy to reproduce, so I downloaded it and used it. As I already mentioned, some polygons are a bit small for extraction.

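The fastest way would be to rasterize the shapefile to match your data raster, but some polygons would not be converted and others might be assigned to the wrong cells. So in this case it is better to use raster::extract with the shapefile directly, although it takes a while; have a coffee in the meantime.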

shp <- shapefile('eurostat_NUTS3_29-May-18/deu_adm_2006.shp')

# coffee time
e <- extract(Deu_crop,shp)

# add NUTS_ID as names to list 

names(e) <- shp$NUTS_ID

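To count the days per year in each bin, I create a function that uses tidyverse functionality and use lapply to iterate over the entire extracted list (one list item per polygon):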

# define bins

bins <- seq(-10,25,length.out = 5)

myfun <- function(ix){

gather(data.frame(e[[ix]],stringsAsFactors = F),'colname','temp') %>% 
    group_by(colname) %>% summarise(temp = mean(temp)) %>% ungroup() %>% # spatial mean
    mutate(year = sub('X(\\d{4}).+','\\1',colname)) %>% # get years
  select(- colname) %>% # drop colname column
  mutate(bin1= (temp <= bins[1]) * 1) %>%  # bin1
  mutate(bin2= (temp > bins[1] & temp <= bins[2]) * 1) %>% # bin2
  mutate(bin3= (temp > bins[2] & temp <= bins[3]) * 1) %>% # bin3
  mutate(bin4= (temp > bins[3] & temp <= bins[4]) * 1) %>% # bin4
  mutate(bin5= (temp > bins[4] & temp <= bins[5]) * 1) %>% # bin5
  mutate(bin6= (temp > bins[5]) * 1) %>% select(-temp) %>% # bin6
  group_by(year) %>% summarise_all(sum) %>% mutate(NUTS_ID = names(e)[ix]) # count days per bin for each year and add NUTS_ID

}

# create single dataframe

result <- do.call(rbind,lapply(1:length(e),function(ix) myfun(ix)))
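
The per-bin counting inside myfun can also be sketched in base R with cut() and table(). This is only an illustration under stated assumptions: the vectors dates, temp and year are hypothetical stand-ins for one region's spatially averaged daily temperatures, not data from the extraction above. The break points are the same as in the answer, and cut() uses right-closed intervals by default, which matches the temp > a & temp <= b tests.

```r
# Hypothetical stand-in for one region's spatially averaged daily temperatures
set.seed(42)
dates <- seq(as.Date("1980-01-01"), as.Date("1981-12-31"), by = "day")
temp  <- runif(length(dates), min = -10, max = 25)
year  <- format(dates, "%Y")

# Same break points as in the answer: 5 breaks give 6 bins
bins <- seq(-10, 25, length.out = 5)

# Right-closed intervals (a, b], padded with -Inf and Inf for bin1 and bin6
bin <- cut(temp, breaks = c(-Inf, bins, Inf), labels = paste0("bin", 1:6))

# Rows are years, columns are bins, entries are numbers of days
counts <- table(year, bin)
counts
```

Each row of counts sums to the number of days in that year, which is a quick sanity check that no day fell outside the bins.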

A quick look at the result variable:

result

# output:
#
# # A tibble: 6,864 x 8
# year  bin1  bin2  bin3  bin4  bin5  bin6 NUTS_ID
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <chr>
# 1  1980    12    85    91    92    85     0   DEA54
# 2  1981     3    64    99   113    86     0   DEA54
# 3  1982     3    80   113    86    83     0   DEA54
# 4  1983     6    84    90    85   100     0   DEA54
# 5  1984     8    90    92    86    90     0   DEA54
# 6  1985     5    86    85    95    94     0   DEA54
# 7  1986     6    74    97   108    80     0   DEA54
# 8  1987     4    82    99    94    86     0   DEA54
# 9  1988     3    89    87    91    96     0   DEA54
#10  1989     8   103    92    73    89     0   DEA54
# # ... with 6,854 more rows
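
Since the question asks for a bar chart per year, here is a minimal sketch of how rows of such a table could be drawn with base R's barplot(). The counts matrix is typed in by hand from the first two rows of the output above (region DEA54, years 1980 and 1981); the axis labels and title are assumptions chosen for illustration.

```r
# Day counts per bin, copied from the first two rows of the output above
counts <- rbind(
  `1980` = c(12, 85, 91, 92, 85, 0),
  `1981` = c(3, 64, 99, 113, 86, 0)
)
colnames(counts) <- paste0("bin", 1:6)

# One group of bars per bin, one bar per year within each group
bp <- barplot(counts, beside = TRUE, legend.text = rownames(counts),
              xlab = "Temperature bin", ylab = "Number of days",
              main = "DEA54")
```

With the full result table, the same call could be repeated per NUTS_ID after reshaping that region's rows into a year-by-bin matrix.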

Update:

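To handle the bins, I first compute them from the minimum and maximum of the whole dataset, and then use a new function, createBins, to add them to each polygon's extracted values. This replaces the myfun part of the original solution.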

# new function

createBins <- function(df,bins_mat){

  for (i in 1:nrow(bins_mat)){

    bin <- sprintf('Bin%s;%s;%s',bins_mat[i,1],bins_mat[i,2],bins_mat[i,3])

    if (i ==1) df <- df %>% mutate(!!bin := (temp >= bins_mat[i,2] & temp <= bins_mat[i,3])*1)
    else df <- df %>% mutate(!!bin := (temp > bins_mat[i,2] & temp <= bins_mat[i,3])*1)
  }
  return(df)
}

# new version of myfun

myfun2 <- function(ix,bins_mat){
gather(data.frame(e[[ix]],stringsAsFactors = F),'colname','temp') %>% 
    group_by(colname) %>% summarise(temp = mean(temp)) %>% ungroup() %>% # spatial mean
    mutate(year = sub('X(\\d{4}).+','\\1',colname)) %>% # get years
    select(- colname) %>%  # drop colname column
    createBins(.,bins_mat) %>% select(-temp) %>%  
    group_by(year) %>% summarise_all(sum) %>% mutate(NUTS_ID = names(e)[ix]) # count days per bin for each year and add NUTS_ID
}


# 11 values to create 10 interval bins

bins <- seq(min(cellStats(Deu_crop,'min')),max(cellStats(Deu_crop,'max')),length.out = 11)

# create a bin matrix (number, bin_minimum, bin_maximum) for later function

bins_mat <- cbind(1:10,bins[1:10],bins[2:11])

# create new result

result <- do.call(rbind,lapply(1:length(e),function(ix) myfun2(ix,bins_mat)))
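
As a rough check of the 10-bin scheme, the same interval logic can be sketched in base R: 11 equally spaced break points from the global minimum to the global maximum, with the first interval closed on the left as in the i == 1 branch of createBins. The temp vector here is hypothetical random data, not values from Deu_crop.

```r
# Hypothetical daily temperatures standing in for the real values
set.seed(1)
temp <- round(runif(1000, min = -10, max = 25), 1)

# 11 break points from the global minimum to the global maximum give 10 bins
breaks <- seq(min(temp), max(temp), length.out = 11)

# include.lowest = TRUE makes the first bin [min, b1]; the others are (a, b]
bin <- cut(temp, breaks = breaks, include.lowest = TRUE,
           labels = paste0("Bin", 1:10))

# Number of days falling into each of the 10 bins
days_per_bin <- table(bin)
days_per_bin
```

Because the breaks span the full range of the data, every value lands in exactly one bin, so the counts add up to the number of observations.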

Answered on 2024-05-08T09:30:01+08:00
