首页 文章

SAS使用值作为列名进行转置并进行汇总

提问于
浏览
0

我正在尝试使用值作为变量名来转置数据并按组汇总数值数据,我尝试使用proc转置和proc报告(跨越)但我不能这样做,我知道这样做的独特方式是使用数据集(如果是else和sum但是更改不是动态的)

例如,我有这个数据集:

school name   subject      picked  saving   expenses
raget  John   math         10      10500    3500
raget  John   spanish      5       1200     2000
raget  Ruby   nosubject    10      5000     1000
raget  Ruby   nosubject    2       3000      0
raget  Ruby   math         3       2000     500
raget  peter  geography    2       1000      0
raget  noname nosubject    0        0       1200

我需要一行,学生姓名“挑选”的总和,以及后来按主题挑选的总和,最后3列是挑选,储蓄和费用的总和:

school john ruby peter noname math spanish geography nosubject  picked saving expenses 
raget   15   15   2      0     13   5        2         12        32     22700  8200

如果我在学校或学科中有新学生,是否可以动态更改?

2 回答

  • 1

    它's a little difficult because you'总结在多个级别,所以我使用了 PROC SUMMARY 并选择了不同的 _TYPE_ 值 . 见下文:

    data have;
      infile datalines;  
      input school $ name $ subject : $10. picked saving expenses;
      datalines;
    raget  John   math         10      10500    3500
    raget  John   spanish      5       1200     2000
    raget  Ruby   nosubject    10      5000     1000
    raget  Ruby   nosubject    2       3000      0
    raget  Ruby   math         3       2000     500
    raget  peter  geography    2       1000      0
    raget  noname nosubject    0        0       1200
    ;
    run;
    
    proc summary data=have;
      class school name subject;
      var picked saving expenses;
      output out=want1 sum(picked)=picked sum(saving)=saving sum(expenses)=expenses;
    run;
    
    proc transpose data=want1 (where=(_type_=5)) out=subs (where=(_NAME_='picked'));
      by school;
      id subject;
    run;
    
    proc transpose data=want1 (where=(_type_=6)) out=names (where=(_NAME_='picked'));
      by school;
      id name;
    run;
    
    proc sql;
      create table want (drop=_TYPE_ _FREQ_ name subject) as
      select
      n.*,
      s.*,
      w.*
      from want1 (where=(_TYPE_=4)) w,
      names (drop=_NAME_) n,
      subs (drop=_NAME_) s
      where w.school = n.school
      and w.school = s.school;
    quit;
    

    我注意到我没有硬编码任何东西(例如没有引用 mathJohn ),因此代码足够动态 .

  • 1

    PROC REPORT是一个有趣的替代方案,特别是如果您想要打印输出而不是数据集 . 您可以使用 ODS OUTPUT 来获取输出数据集,但由于某种原因(它们是“C2”等)而定义了它.'s messy as the variable names aren't . 这个打印输出有点乱,因为 Headers 行不是所希望的 .

    data have;
    input school $ name   $ subject      $ picked  saving   expenses;
    datalines;
    raget  John   math         10      10500    3500
    raget  John   spanish      5       1200     2000
    raget  Ruby   nosubject    10      5000     1000
    raget  Ruby   nosubject    2       3000      0
    raget  Ruby   math         3       2000     500
    raget  peter  geography    2       1000      0
    raget  noname nosubject    0        0       1200
    ;;;;
    run;
    
    ods output report=want;
    proc report nowd data=have;
    columns school (name subject),(picked) picked=picked2 saving expenses;
    define picked/analysis sum ' ';
    define picked2/analysis sum;
    define saving/analysis sum ;
    define expenses/analysis sum;
    define name/across;
    define subject/across;
    define school/group;
    run;
    

相关问题