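- PROC SORT, which offers two options: NODUPKEY and NODUPRECS (alias NODUP). NODUPKEY removes duplicates by comparing only the BY variables; NODUPRECS removes duplicates by comparing entire records. The removed records can be saved with DUPOUT=. The program is as follows: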
proc sort data=sashelp.class out=unq nodupkey dupout=dup; by WEIGHT; run;
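The program above only exercises NODUPKEY. As a minimal sketch of the NODUPRECS variant, the step below first builds a test data set containing every record of sashelp.class twice, then sorts it with `by _all_` so that identical records end up adjacent before they are compared. The data set names `class2`, `unq_rec`, and `dup_rec` are hypothetical, chosen only for this illustration.

```sas
/* Hypothetical test input: sashelp.class stacked on itself,
   so every record has exactly one exact duplicate. */
data class2;
  set sashelp.class sashelp.class;
run;

/* NODUPRECS compares each record with the previous record kept,
   so sorting by all variables makes identical records adjacent. */
proc sort data=class2 out=unq_rec noduprecs dupout=dup_rec;
  by _all_;
run;
```

Sorting by a subset of variables instead of `_all_` may leave some identical records non-adjacent, in which case NODUPRECS would not remove them.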
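- Hash object. Loading the data set into a hash keyed on WEIGHT keeps only the first record for each key value, and ordered: 'y' writes the result in ascending key order. The program is as follows:
data _null_;
  if 0 then set sashelp.class;   /* define the host variables in the PDV */
  if _n_=1 then do;
    declare hash h(dataset: 'sashelp.class', ordered: 'y');
    h.definekey('WEIGHT');
    h.definedata(all: 'y');
    h.definedone();
  end;
  h.output(dataset: 'uni');      /* one record per distinct WEIGHT */
  stop;
run;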
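- DATA step. Unlike the two methods above, this one splits the data: records whose WEIGHT value occurs exactly once go to uni, and all records whose WEIGHT value is repeated go to dup. The program is as follows:
proc sort data=sashelp.class out=class;
  by WEIGHT;
run;
data uni dup;
  set class;
  by WEIGHT;
  /* first and last in its BY group means the key occurs only once */
  if first.WEIGHT and last.WEIGHT then output uni;
  else output dup;
run;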
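- PROC SQL. The same split is done with GROUP BY and HAVING; because the SELECT list has no summary function, SAS remerges the group count with the detail rows and writes a note about it to the log. The program is as follows:
proc sql;
  create table uni as
    select * from sashelp.class
    group by WEIGHT
    having count(*) = 1;
  create table dup as
    select * from sashelp.class
    group by WEIGHT
    having count(*) > 1;
quit;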
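- Hash object, producing the same uni/dup split (requires SAS 9.2+ for the MULTIDATA option). The program is as follows:
data uni(drop=rc: i);
  if _n_=1 then do;
    if 0 then set sashelp.class;
    /* h1 holds every record; duplicate keys are allowed */
    dcl hash h1(dataset: 'sashelp.class', multidata: 'y');
    h1.definekey('WEIGHT');
    h1.definedata(all: 'yes');
    h1.definedone();
    /* h2 holds one entry per distinct key and is used only for iteration */
    dcl hash h2(dataset: 'sashelp.class');
    dcl hiter hi('h2');
    h2.definekey('WEIGHT');
    h2.definedone();
  end;
  rc1=hi.first();
  do while(rc1=0);
    /* count the records in h1 for the current key, stopping at 2 */
    rc2=h1.find();
    i=0;
    do while(rc2=0 and i < 2);
      i+1;
      rc2=h1.find_next();
    end;
    if i < 2 then do;
      output;        /* key occurs once: write the record to uni */
      h1.remove();   /* and drop it from h1 */
    end;
    rc1=hi.next();   /* advance to the next distinct key */
  end;
  h1.output(dataset: 'dup');   /* what remains in h1 has repeated keys */
  stop;
run;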