
Creating a factor variable from integer years in R


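I have a panel data set like the one below, but the actual data set has thousands of observations. I want to create a new column "Year_dum" holding 14 factors for 1984-1998 (15 years). I searched for how to create dummy variables in R, but could not find a way to do it from integer years. Can anyone help me do this in R?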

+--------+------+------+------+----------+
|  Time  | year | Firm | Prod | Year_dum |
+--------+------+------+------+----------+
| Jan-84 | 1984 | A    | 28.2 |        0 |
| Feb-84 | 1984 | A    | 26.6 |        0 |
| Mar-84 | 1984 | A    | 30.3 |        0 |
| Apr-85 | 1985 | A    | 33.2 |        1 |
| May-85 | 1985 | A    | 30.1 |        1 |
| Jun-85 | 1985 | A    | 28.3 |        1 |
| Jan-84 | 1984 | B    | 28.6 |        0 |
| Feb-84 | 1984 | B    | 28.9 |        0 |
| Mar-84 | 1984 | B    | 28.1 |        0 |
| Oct-84 | 1984 | C    | 28.8 |        0 |
| Nov-85 | 1985 | C    | 31.6 |        1 |
| Dec-86 | 1986 | C    | 26.9 |        2 |
| Jan-89 | 1989 | C    | 28.6 |        5 |
| Feb-98 | 1998 | C    | 29.6 |       14 |
+--------+------+------+------+----------+

This sample data set can be reproduced from the following dput output.

structure(list(Time = structure(c(6L, 4L, 9L, 2L, 10L, 8L, 6L, 
4L, 9L, 12L, 11L, 3L, 7L, 5L, 1L, 1L, 1L), .Label = c("", "Apr-85", 
"Dec-86", "Feb-84", "Feb-98", "Jan-84", "Jan-89", "Jun-85", "Mar-84", 
"May-85", "Nov-85", "Oct-84"), class = "factor"), year = c(1984L, 
1984L, 1984L, 1985L, 1985L, 1985L, 1984L, 1984L, 1984L, 1984L, 
1985L, 1986L, 1989L, 1998L, NA, NA, NA), Firm = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L
), .Label = c("", "A", "B", "C"), class = "factor"), Prod = c(28.2, 
26.6, 30.3, 33.2, 30.1, 28.3, 28.6, 28.9, 28.1, 28.8, 31.6, 26.9, 
28.6, 29.6, NA, NA, NA), Year_dum = c(0L, 0L, 0L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 1L, 2L, 5L, 14L, NA, NA, NA)), .Names = c("Time", 
"year", "Firm", "Prod", "Year_dum"), class = "data.frame", row.names = c(NA, 
-17L))

3 Answers

  • 2

    We can try

    # na.rm = TRUE is needed because the sample data contains NA years
    df$Year_dum <- df$year - min(df$year, na.rm = TRUE)
    df$Year_dum
    #[1]  0  0  0  1  1  1  0  0  0  0  1  2  5 14 NA NA NA
    

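    Or use match. Note that this numbers the distinct years in order of first appearance, so gaps between years are not preserved (1989 maps to 3, not 5), and the NA rows are matched as a group of their own:

    with(df, match(year, unique(year))-1)
    #[1] 0 0 0 1 1 1 0 0 0 0 1 2 3 4 5 5 5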
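
    Since the question asks for a factor rather than a plain integer column, the year offset can be wrapped in factor(). The following is a minimal sketch, not part of the original answer: it uses a small stand-in data frame with only a year column, and it assumes that all 15 levels (0 to 14) should exist even when some years never occur in the data.

    ```r
    # Stand-in for the question's data (year column only, including a missing year)
    df <- data.frame(year = c(1984L, 1984L, 1985L, 1986L, 1989L, 1998L, NA))

    # Offset from the first observed year; na.rm = TRUE ignores the NA rows
    offset <- df$year - min(df$year, na.rm = TRUE)

    # Fix the levels at 0..14 so each year from 1984 to 1998 has a level, observed or not
    df$Year_dum <- factor(offset, levels = 0:14)

    nlevels(df$Year_dum)
    ```

    Setting the levels explicitly keeps the factor stable across subsets of the panel; without the levels argument, only the offsets that actually occur would become levels.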
    
  • 0

    For example, you can use the dummies package (install it first with install.packages("dummies")). An example:

    library(dummies)
    
    df <- data.frame("val" = 1:5, "year" = c(1984, 1984, 1985, 1985, 1986))
    # after creating the dummies, column-bind it to the original dataframe
    df <- cbind(df, dummy("year", df, sep = "_"))
    df
    #   val year year_1984 year_1985 year_1986
    # 1   1 1984         1         0         0
    # 2   2 1984         1         0         0
    # 3   3 1985         0         1         0
    # 4   4 1985         0         1         0
    # 5   5 1986         0         0         1
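
    If installing an add-on package is not an option, base R's model.matrix can build comparable indicator columns. This is a sketch under assumptions rather than the answerer's method: the example data frame is repeated here, and the "year_" column names are chosen only to mirror the names shown in this answer.

    ```r
    df <- data.frame(val = 1:5, year = c(1984, 1984, 1985, 1985, 1986))

    # "- 1" drops the intercept so that every year gets its own 0/1 column
    dummies <- model.matrix(~ factor(year) - 1, data = df)

    # model.matrix names the columns "factor(year)1984" etc.; rename them
    colnames(dummies) <- paste("year", sort(unique(df$year)), sep = "_")

    # Column-bind the indicator matrix to the original data frame
    df <- cbind(df, dummies)
    ```

    With the default na.action, model.matrix drops rows whose year is missing, so data containing NA years (such as the question's sample) would need those rows handled before the cbind step.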
    
  • 0

    Here is an example using only base R:

    # x is the data frame; creates one logical vector per distinct year (year_1984, year_1985, ...)
    for(y in unique(na.omit(x$year))) assign(paste("year", y, sep="_"), x$year == y)
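
    The loop above leaves the indicators as free-standing variables in the workspace. A variant that stores them as columns of the data frame instead might look like the sketch below; the stand-in data frame and the choice of 0/1 integers instead of logicals are assumptions for illustration, not something stated in the answer.

    ```r
    # Stand-in data frame; "x" follows the name used in the loop above
    x <- data.frame(year = c(1984L, 1984L, 1985L, 1986L, NA))

    # sort() drops NA, so no "year_NA" column is created
    years <- sort(unique(x$year))

    # One integer 0/1 column per distinct year; rows with a missing year stay NA
    ind <- sapply(years, function(y) as.integer(x$year == y))
    colnames(ind) <- paste("year", years, sep = "_")

    x <- cbind(x, ind)
    ```

    Keeping the indicators inside the data frame means they are subset and merged together with the rest of the panel.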
    
