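Our project had a sample size that is not a multiple of 4. Searching Baidu and CNKI turned up no ready-made "wheel", so we tried building our own. What follows is the conclusion several statisticians reached after discussion; corrections and criticism are welcome.
First, ordinary block randomization with proc plan is uncontroversial. However, the factors statement can only give every block the same length. For 24 subjects, for example, it would be factors block=6 length=4.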
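Reference: internally, SAS proc plan first puts the 6 blocks in random order and then generates 4 random numbers (a random permutation of 1 to 4) within each block. The documentation describes this with a 4-level factor One and a 3-level factor Two:
(The procedure first generates a random permutation of the integers 1 to 4 and then, for each of these, generates a random permutation of the integers 1 to 3. You can think of factor Two as being nested within factor One, where the levels of factor One are to be randomly assigned to 4 units.)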
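For comparison, the equal-length case described above (24 subjects, 6 blocks of 4) could be sketched as follows. This is a minimal sketch, not a validated randomization program: the seed value, the data set names plan24 and plan24_grp, the renamed variable pos, and the rule "positions 1 and 2 go to A" are all assumptions made for illustration.

```sas
/* Sketch: 6 blocks of length 4; both factors use the default random selection. */
/* The seed and all data set names here are illustrative assumptions.           */
proc plan seed=2020;
factors block=6 length=4 / noprint;
output out=plan24;   /* one row per block-by-position cell, 24 rows in total */
run;

data plan24_grp;
/* rename the factor "length" to avoid confusion with the LENGTH statement */
set plan24(rename=(length=pos));
SUBJid=100+_n_;      /* subject numbers 101, 102, ... in output order */
/* assumed rule: within each block, two of the four random positions get A */
if pos<=2 then g='A(T-R)';
else g='B(R-T)';
run;
```

Because each block contains every value from 1 to 4 exactly once, this rule gives two A and two B per block. It cannot express blocks of different lengths, which is why the approach below is needed.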
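The logic in this example is based on complete randomization: subjects are numbered id = 1 2 3 4 5 ..., and each receives a random number rand=uniform(seed). See the part of the code below that implements step 3.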
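Example: for a sample size of 30, the logic is:
- Set the block lengths to 4 4 4 4 4 4 6
- Put the lengths in random order, e.g. 4 4 4 4 6 4 4
- Create block_ID (block position followed by block length): the four subjects in the second block get 24 24 24 24, and the six subjects in the fifth block get 56 56 56 56 56 56 (this step makes the by statement of the later proc rank convenient)
- Give every subject a random number
- Rank within each block and assign groups
The code is as follows: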
data core;
input blockl blockID;
cards;
4 1
4 2
4 3
4 4
4 5
4 6
6 7
;/* make sure sum(blockl)=30 */
run;
%macro rd_blk;
/* first put the blocks in random order */
data random;
set core;
rand=uniform(2020);
output;
run;
proc rank data=random out=rank;
var rand;
ranks r_rank;
run;
proc sort data=rank out=result;
by r_rank;
run;
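/* step 3 above: one row per subject, plus block_id */
data core1;
set result;
lent=blockl;
/* block_id = block position followed by block length, e.g. 24 or 56; */
/* cats() avoids the numeric-to-character conversion notes of compress() */
block_id=input(cats(r_rank,lent),8.);
do i=1 to blockl;
output;
end;
run;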
data random1;
set core1;
rand1=uniform(111);
run;
proc rank data=random1 out=rank1;
by block_id ;
var rand1;
ranks r_rank1;
run;
%mend rd_blk;
%rd_blk;
data result1(keep=blockID r_rank1 SUBJid g);
set rank1;
length g $6;
retain SUBJid 100; /* subject numbers start at 101 */
SUBJid+1;
/* id = block length followed by within-block rank, e.g. 41 or 63 */
id=input(cats(lent,r_rank1),8.);
if id in (41 42 61 62 63) then g='A(T-R)';
if id in (43 44 64 65 66) then g='B(R-T)';
label blockID='Block number' r_rank1='Random rank' SUBJid='Subject ID' g='Group';
run;
proc sql;
/* the where values must match the stored values of g exactly */
select count(*) as numA from result1
where g='A(T-R)';
select count(*) as numB from result1
where g='B(R-T)';
quit;
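The parts you need to customize are: the block lengths (their sum must equal the sample size), the seeds, and the final group assignment. For example, if blockl=8:
if id in (81 82 83 84) then g='A(T-R)';
if id in (85 86 87 88) then g='B(R-T)';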
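The hard-coded id lists can probably be avoided. The sketch below is one possible generalization, not part of the original program. It assumes that the data set rank1 produced by the macro above still exists, that every block length lent is even, and that r_rank1 has no ties. The data set name result_general is made up for illustration.

```sas
/* Sketch: assign the lower half of the within-block ranks to A, the rest to B. */
/* Assumes rank1 (from %rd_blk) holds lent, block_id and r_rank1.               */
data result_general;
set rank1;
length g $6;
SUBJid=100+_n_;   /* same numbering as above: 101, 102, ... */
if r_rank1<=lent/2 then g='A(T-R)';
else g='B(R-T)';
run;

/* check the balance: every block should show equal counts of A and B */
proc freq data=result_general;
tables block_id*g / nopercent norow nocol;
run;
```

With this rule a new block length only has to be entered in the core data set. The cross-tabulation is a quick check that each block of 4 holds 2 + 2 subjects and the block of 6 holds 3 + 3.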