
Reshaping a data frame in R (melt?)


I currently have a data frame that looks like this:

      country continent  year lifeExp      pop gdpPercap
       <fctr>    <fctr> <int>   <dbl>    <int>     <dbl>
1 Afghanistan      Asia  1952  28.801  8425333  779.4453
2 Afghanistan      Asia  1957  30.332  9240934  820.8530
3 Afghanistan      Asia  1962  31.997 10267083  853.1007
4 Afghanistan      Asia  1967  34.020 11537966  836.1971
5 Afghanistan      Asia  1972  36.088 13079460  739.9811
6 Afghanistan      Asia  1977  38.438 14880372  786.1134

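There are 140 countries. The years run at 5-year intervals from 1952 to 2007. I would like to reshape my data frame so that each country is a single row with one gdpPercap column per year, like this: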

Country   gdpPercap(1952)     gdpPercap(1957)   ...   gdpPercap(2007)
      <fctr>      <dbl>
1  Afghanistan   974.5803           ....                      ...
2      Albania  5937.0295           ...                       ...
3      Algeria  6223.3675           ...                       ...
4       Angola  4797.2313
5    Argentina 12779.3796
6    Australia 34435.3674
7      Austria 36126.4927
8      Bahrain 29796.0483
9   Bangladesh  1391.2538
10     Belgium 33692.6051

My attempt was this:

gapminder %>% #my dataframe
  filter(year >= 1952) %>%
  group_by(country) %>%
  summarise(gdpPercap = mean(gdpPercap))

OUTPUT:

country  gdpPercap <- but this takes the mean of gdpPercap from 1952-2007
        <fctr>      <dbl>
1  Afghanistan   802.6746
2      Albania  3255.3666
3      Algeria  4426.0260
4       Angola  3607.1005
5    Argentina  8955.5538
6    Australia 19980.5956
7      Austria 20411.9163
8      Bahrain 18077.6639
9   Bangladesh   817.5588
10     Belgium 19900.7581
# ... with 132 more rows

Any ideas? PS: I am new to R. I have also been looking at melt(). Any help would be greatly appreciated!

3 Answers

  • 1

    tidyr::spread() will solve your problem:

    library(dplyr); library(tidyr)
    
    gapminder %>% 
      select(country, year, gdpPercap) %>% 
      spread(year, gdpPercap)
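
    In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshape follows. It uses a made-up data frame (`toy` is hypothetical, not the real gapminder data), and the `names_prefix` value is only an illustrative choice that approximates the column labels asked for in the question:

    ```r
    library(tidyr)

    # Toy stand-in for the gapminder data (values are made up for illustration)
    toy <- data.frame(
      country   = rep(c("Here", "There"), each = 3),
      year      = rep(c(1952L, 1957L, 1962L), times = 2),
      gdpPercap = c(779, 780, 781, 782, 783, 784)
    )

    wide <- pivot_wider(
      toy,
      names_from   = year,         # one new column per distinct year
      values_from  = gdpPercap,    # cells are filled with that year's value
      names_prefix = "gdpPercap_"  # illustrative prefix, not from the question
    )

    print(wide)
    ```

    With the real data, the same call should work after select(country, year, gdpPercap), since any column not named in names_from or values_from is treated as an identifier.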
    
  • 0

    You should include year in group_by as well; after summarising, use dcast from reshape2 to reshape the data.

    Here is an example solution:

    library(dplyr)
    library(reshape2)
    # Simulated data: year and country are sampled independently so that
    # every country/year pair occurs several times and columns keep their types.
    n <- 10000
    gapminder <- data.frame(
      gdpPercap = runif(n),
      year = sample(seq(from = 1952L, to = 2007L, by = 5L), n, replace = TRUE),
      country = sample(c("India", "US", "UK"), n, replace = TRUE)
    )
    gapminder %>% #my dataframe
      filter(year >= 1952) %>%
      group_by(country, year) %>%
      summarise(gdpPercap = mean(gdpPercap)) %>%
      dcast(country ~ year, value.var="gdpPercap")
    

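    I had to generate new data because your example is not reproducible. See the link "How to make a great R reproducible example?". A reproducible example helps others understand and answer the question, and gets you answers faster.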
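
    One common way to make data reproducible is to paste the output of dput() for a small slice of the data frame into the question. A minimal sketch, assuming a made-up two-row data frame (`small` and `rebuilt` are hypothetical names) in place of something like head(gapminder):

    ```r
    # Hypothetical small slice standing in for the first rows of the real data
    small <- data.frame(
      country   = c("Afghanistan", "Afghanistan"),
      year      = c(1952L, 1957L),
      gdpPercap = c(779.4453, 820.8530)
    )

    # dput() prints R code that recreates the object; paste that text into the question
    dput(small)

    # A reader can rebuild the data by evaluating the pasted text
    pasted  <- paste(capture.output(dput(small)), collapse = "\n")
    rebuilt <- eval(parse(text = pasted))
    all.equal(small, rebuilt)
    ```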

  • 2

    The built-in reshape function can do this.

    foo.data.frame <- data.frame(
        Country=rep(c("Here", "There"), each=3),
        year=rep(c(1952, 1957, 1962),2),
        gdpPercap=779:784
        # ... other variables
    )
    
    reshape(foo.data.frame[, c("Country", "year", "gdpPercap")], 
        timevar="year", idvar="Country", direction="wide", sep=" ")
    
    #   Country gdpPercap 1952 gdpPercap 1957 gdpPercap 1962
    # 1    Here            779            780            781
    # 4   There            782            783            784
    
