
How to calculate descriptive statistics [closed]


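I have a data frame for a class, with variables for name, gender (female, male), and height. I need to calculate descriptive statistics of the height variable by gender. I would like the output to contain the following columns; any help would be appreciated.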

Gender  Freq  Mean  Median  Mode  StdDev  min  max

2 Answers

  • 1

    The summarise() function from the dplyr package is a good solution:

    library('dplyr')
    df %>% 
      na.omit() %>%                      # drop rows with missing values
      group_by(gender) %>% 
      summarise(Freq = n(),              # group count; unsure, maybe n()/NROW(df) for a proportion
                Mean = mean(height),
                Median = median(height),
                Mode = moda(height),     # moda() is defined below
                Std.Dev = sd(height),
                min = min(height),
                max = max(height))
    

    where moda(x) is a function that estimates the mode of a sample from a continuous variable:

    moda <- function(x, na.omit = TRUE){
      if (na.omit) x <- na.omit(x)
      d <- density(x)
      return(d$x[which.max(d$y)])
    }
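
    As a quick check of the kernel-density idea, here is a small sketch that applies moda() to simulated heights. The data, the seed, and the variable names h and m are made up for illustration and are not part of the question:

    ```r
    # moda() as defined above: x-location of the peak of a kernel density estimate
    moda <- function(x, na.omit = TRUE){
      if (na.omit) x <- na.omit(x)
      d <- density(x)
      return(d$x[which.max(d$y)])
    }

    set.seed(1)                                   # hypothetical data, illustration only
    h <- c(rnorm(200, mean = 170, sd = 5), NA)    # one missing value on purpose
    m <- moda(h)                                  # NA is dropped before density()
    m                                             # should land near the centre of the sample
    ```

    Note that the result depends on the bandwidth density() picks, so it is an estimate rather than an observed value; for discrete or rounded heights, the most frequent value may be what is wanted instead.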
    
  • 0

    Try this:

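    # assuming df is your data.frame and gender, height are column names
    # base R's mode() returns the storage mode ("numeric"), not the statistical mode, so use a helper
    stat_mode <- function(x) { ux <- unique(x); ux[which.max(tabulate(match(x, ux)))] }
    tapply(df$height, df$gender, function(grp) c(Freq = length(grp), Mean = mean(grp), Median = median(grp), Mode = stat_mode(grp), SD = sd(grp), min = min(grp), max = max(grp)))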
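
    tapply() with a function that returns a vector gives back one named vector per gender rather than a table. To get the layout asked for in the question, the pieces can be bound into rows. A minimal sketch, assuming made-up example data; stat_mode is a hypothetical helper (most frequent value), used because base R's mode() reports the storage type instead:

    ```r
    # hypothetical example data; names and values are invented for illustration
    df <- data.frame(
      name   = c("Ann", "Bea", "Cat", "Dan", "Eli", "Fay"),
      gender = c("female", "female", "female", "male", "male", "female"),
      height = c(160, 165, 165, 180, 175, 170)
    )

    # hypothetical helper: most frequent value (the first one wins on ties)
    stat_mode <- function(x) {
      ux <- unique(x)
      ux[which.max(tabulate(match(x, ux)))]
    }

    # one named vector of statistics per gender
    res <- tapply(df$height, df$gender, function(grp)
      c(Freq = length(grp), Mean = mean(grp), Median = median(grp),
        Mode = stat_mode(grp), StdDev = sd(grp), min = min(grp), max = max(grp)))

    # bind the per-group vectors into a matrix with one row per gender
    tab <- do.call(rbind, res)
    tab
    ```

    The result is a numeric matrix with the genders as row names; as.data.frame(tab) would turn it into a data frame if a Gender column is needed.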
    
