Design Resources Server, IASRI
Analysis Using SAS

For performing the analysis, the following steps may be used:

Data Input:
/* Please note that if varn is replaced by treatment and syield (yield) by the
character required, then this code can be used for any other situation as well */
Prepare a SAS data file
using the following statements:

options nodate nonumber;
data wholedata;
/* loc is read with the default character length of 8, so the names are stored
as Shillong, Jagdalpu and Kanke; the later steps refer to these stored names */
input loc $ rep blk varn syield;
cards;
Shillongani 1 1 1 579
Shillongani 1 1 5 532
Shillongani 1 1 9 579
Shillongani 1 1 13 463
Shillongani 1 1 17 694
Shillongani 1 1 21 671
Shillongani 1 2 2 1157
Shillongani 1 2 6 880
Shillongani 1 2 10 509
Shillongani 1 2 14 856
Shillongani 1 2 18 625
Shillongani 1 2 22 532
Shillongani 1 3 3 926
Shillongani 1 3 7 347
Shillongani 1 3 11 694
Shillongani 1 3 15 648
Shillongani 1 3 19 579
Shillongani 1 3 23 810
Shillongani 1 4 4 694
Shillongani 1 4 8 694
Shillongani 1 4 12 347
Shillongani 1 4 16 810
Shillongani 1 4 20 926
Shillongani 1 4 24 856
Shillongani 2 1 1 463
Shillongani 2 1 6 810
Shillongani 2 1 11 440
Shillongani 2 1 16 625
Shillongani 2 1 19 417
Shillongani 2 1 22 579
Shillongani 2 2 2 810
Shillongani 2 2 7 417
Shillongani 2 2 12 463
Shillongani 2 2 13 579
Shillongani 2 2 20 880
Shillongani 2 2 23 810
Shillongani 2 3 3 856
Shillongani 2 3 8 694
Shillongani 2 3 9 810
Shillongani 2 3 14 741
Shillongani 2 3 17 926
Shillongani 2 3 24 741
Shillongani 2 4 4 463
Shillongani 2 4 5 694
Shillongani 2 4 10 741
Shillongani 2 4 15 694
Shillongani 2 4 18 810
Shillongani 2 4 21 810
Shillongani 3 1 1 579
Shillongani 3 1 7 370
Shillongani 3 1 12 347
Shillongani 3 1 15 417
Shillongani 3 1 18 463
Shillongani 3 1 24 880
Shillongani 3 2 2 694
Shillongani 3 2 8 833
Shillongani 3 2 9 810
Shillongani 3 2 16 856
Shillongani 3 2 19 463
Shillongani 3 2 21 833
Shillongani 3 3 3 1042
Shillongani 3 3 5 1065
Shillongani 3 3 10 579
Shillongani 3 3 13 810
Shillongani 3 3 20 810
Shillongani 3 3 22 579
Shillongani 3 4 4 856
Shillongani 3 4 6 995
Shillongani 3 4 11 463
Shillongani 3 4 14 810
Shillongani 3 4 17 463
Shillongani 3 4 23 926
Jagdalpur 1 1 1 470.9
Jagdalpur 1 1 5 609.4
Jagdalpur 1 1 9 692.5
Jagdalpur 1 1 13 554
Jagdalpur 1 1 17 415.5
Jagdalpur 1 1 21 831
Jagdalpur 1 2 2 277
Jagdalpur 1 2 6 304.7
Jagdalpur 1 2 10 637.1
Jagdalpur 1 2 14 637.1
Jagdalpur 1 2 18 409.4
Jagdalpur 1 2 22 831
Jagdalpur 1 3 3 831
Jagdalpur 1 3 7 221.6
Jagdalpur 1 3 11 692.5
Jagdalpur 1 3 15 637.5
Jagdalpur 1 3 19 554
Jagdalpur 1 3 23 554
Jagdalpur 1 4 4 692.5
Jagdalpur 1 4 8 360.1
Jagdalpur 1 4 12 609.4
Jagdalpur 1 4 16 692.5
Jagdalpur 1 4 20 831
Jagdalpur 1 4 24 1108
Jagdalpur 2 1 1 554
Jagdalpur 2 1 6 332.4
Jagdalpur 2 1 11 831
Jagdalpur 2 1 16 692.5
Jagdalpur 2 1 19 554
Jagdalpur 2 1 22 969
Jagdalpur 2 2 2 249.3
Jagdalpur 2 2 7 221.6
Jagdalpur 2 2 12 775.6
Jagdalpur 2 2 13 554
Jagdalpur 2 2 20 969.5
Jagdalpur 2 2 23 969.5
Jagdalpur 2 3 3 969.5
Jagdalpur 2 3 8 387.8
Jagdalpur 2 3 9 831
Jagdalpur 2 3 14 675.6
Jagdalpur 2 3 17 470.9
Jagdalpur 2 3 24 1246
Jagdalpur 2 4 4 637.1
Jagdalpur 2 4 5 775.6
Jagdalpur 2 4 10 415.5
Jagdalpur 2 4 15 692.5
Jagdalpur 2 4 18 387.8
Jagdalpur 2 4 21 914.1
Jagdalpur 3 1 1 554
Jagdalpur 3 1 7 277
Jagdalpur 3 1 12 554
Jagdalpur 3 1 15 692.5
Jagdalpur 3 1 18 415.5
Jagdalpur 3 1 24 886.4
Jagdalpur 3 2 2 332.4
Jagdalpur 3 2 8 415.5
Jagdalpur 3 2 9 692.5
Jagdalpur 3 2 16 692.5
Jagdalpur 3 2 19 554
Jagdalpur 3 2 21 692.5
Jagdalpur 3 3 3 831
Jagdalpur 3 3 5 775.6
Jagdalpur 3 3 10 637.1
Jagdalpur 3 3 13 692.5
Jagdalpur 3 3 20 1052
Jagdalpur 3 3 22 831
Jagdalpur 3 4 4 664.8
Jagdalpur 3 4 6 304.7
Jagdalpur 3 4 11 554
Jagdalpur 3 4 14 554
Jagdalpur 3 4 17 415.5
Jagdalpur 3 4 23 692.5
Kanke 1 1 1 556
Kanke 1 1 5 729
Kanke 1 1 9 609
Kanke 1 1 13 521
Kanke 1 1 17 671
Kanke 1 1 21 486
Kanke 1 2 2 498
Kanke 1 2 6 602
Kanke 1 2 10 553
Kanke 1 2 14 683
Kanke 1 2 18 752
Kanke 1 2 22 755
Kanke 1 3 3 748
Kanke 1 3 7 475
Kanke 1 3 11 394
Kanke 1 3 15 646
Kanke 1 3 19 465
Kanke 1 3 23 625
Kanke 1 4 4 597
Kanke 1 4 8 532
Kanke 1 4 12 440
Kanke 1 4 16 519
Kanke 1 4 20 880
Kanke 1 4 24 660
Kanke 2 1 1 523
Kanke 2 1 6 741
Kanke 2 1 11 347
Kanke 2 1 16 567
Kanke 2 1 19 486
Kanke 2 1 22 808
Kanke 2 2 2 590
Kanke 2 2 7 694
Kanke 2 2 12 498
Kanke 2 2 13 826
Kanke 2 2 20 579
Kanke 2 2 23 613
Kanke 2 3 3 683
Kanke 2 3 8 706
Kanke 2 3 9 539
Kanke 2 3 14 595
Kanke 2 3 17 683
Kanke 2 3 24 664
Kanke 2 4 4 660
Kanke 2 4 5 660
Kanke 2 4 10 556
Kanke 2 4 15 544
Kanke 2 4 18 789
Kanke 2 4 21 535
Kanke 3 1 1 602
Kanke 3 1 7 602
Kanke 3 1 12 660
Kanke 3 1 15 556
Kanke 3 1 18 671
Kanke 3 1 24 671
Kanke 3 2 2 463
Kanke 3 2 8 544
Kanke 3 2 9 509
Kanke 3 2 16 475
Kanke 3 2 19 602
Kanke 3 2 21 556
Kanke 3 3 3 665
Kanke 3 3 5 826
Kanke 3 3 10 752
Kanke 3 3 13 579
Kanke 3 3 20 625
Kanke 3 3 22 903
Kanke 3 4 4 810
Kanke 3 4 6 856
Kanke 3 4 11 532
Kanke 3 4 14 671
Kanke 3 4 17 597
Kanke 3 4 23 856
;
proc sort data = wholedata;
/* This SAS statement sorts the data with respect to the locations in the data */
by loc;
/* If one has years in place of locations, replace loc by year */
run;
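Before running the analysis it may help to confirm that the data were read as intended. The following is only a minimal sketch, assuming the data set wholedata created above; it is not part of the original program. It lists the number of plots in every location, replication and block combination. For the data shown above, each of the 36 combinations should have a frequency of 6, and loc should appear under its stored 8-character names.

```sas
/* Sketch (not part of the original program): count the plots in each
   location x replication x block combination of wholedata */
proc freq data = wholedata;
  tables loc*rep*blk / list nocum nopercent;
run;
```

A combination with a frequency other than 6 would point to a data entry problem or to a record that was misread by the input statement.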
/* To create a table of degrees of freedom and the respective mean square errors for each location */
* use ods to output the anova table;
ods output overallanova = MSerror;
ods output LSmeans = lsmean;
* 1. To perform the analysis of data for each of the locations separately one can use the following SAS statements;
proc glm data = wholedata;
class rep blk varn;
model syield = rep blk(rep) varn;
means varn;
lsmeans varn;
by loc;
run;
quit;
ods output close;
ods trace off;
*
2. To test the homogeneity of error variances using Bartlett's Chi-square test one can use the following SAS statements;
/* This creates a data set named "required" containing the MSE for each location with its respective degrees of freedom */
data required;
set MSerror(where = (source = 'Error') keep = loc source df ms);
run;
/* To test the homogeneity of error variances, apply Bartlett's Chi-square test as follows */
/* SAS code for testing the homogeneity of variances, when variances and the degrees of freedom are given. It is useful for testing the homogeneity of error variances when the experiments are conducted over environments. */
/* This code is written by Dr Rajender Parsad and Sh. Ajeet Kumar */
/* IASRI, Library Avenue, New Delhi, 110 012, India */
proc iml;
use required;
read all into a; /* numeric columns only: a[,1] = error df, a[,2] = error variance (MSE) */
/* use error variances in m1 variable */
*a = m1[2:nrow(m1), ncol(m1)-1:ncol(m1)]; /* from m1 extract variances and number of observations */
v = 0; ct = 0; nchi = 0; St = 0;
/* a[,1] already holds the error degrees of freedom, so it is used as is; subtract 1 only if the column holds numbers of observations */
do i = 1 to nrow(a);
  /* computing pooled variance */
  St = St + a[i,1]*a[i,2];
  v = v + a[i,1];
  ct = ct + 1/a[i,1];
end;
S = St/v;
dchi = (1 + (1/(3*(nrow(a)-1)))*(ct - (1/v))); /* computing denominator of Bartlett's chi-square statistic */
do i = 1 to nrow(a);
  nchi = nchi + a[i,1]*(log(S/a[i,2]));
end;
chi = nchi/dchi;
probability = 1 - probchi(chi, (nrow(a)-1)); /* computing chi-square test statistic and probability */
df = (nrow(a)-1);
print probability chi df S; /* printing chi-square test statistic value, probability, degrees of freedom and pooled variance */
if probability >= 0.05 then Interpretation = "Error variances are homogeneous at 5% level of significance";
else Interpretation = "Error variances are heterogeneous at 5% level of significance";
print Interpretation; /* testing and printing interpretation */
pb = char(probability);
quit;
*
3. If the error variances are heterogeneous, then for applying Aitken's transformation one can use the following SAS statements;
/* If the error variances are homogeneous, there is no need for transformation; if they are heterogeneous, the data are transformed as below */
/* This SAS statement creates a table of the values of mean square error (MSE) to be used for transformation of the data */
ods html body = 'mse.xls';
proc print data = required;
var loc ms;
run;
ods html close;
data tranformed;
/* This set of SAS statements transforms the data: each yield is divided by the square root of the MSE of its location. The MSE values below are those printed by the previous step for these data */
set wholedata;
if loc = "Jagdalpu" then new_var = syield/sqrt(7029.712);
if loc = "Kanke" then new_var = syield/sqrt(8438.3);
if loc = "Shillong" then new_var = syield/sqrt(17457.515);
run;
/*
4. To perform the combined analysis of the above data set considering the locations as fixed */
proc glm data = tranformed;
class loc rep blk varn;
model new_var syield = loc rep(loc) blk(rep loc) varn loc*varn;
means varn;
lsmeans varn / pdiff;
run;
quit;
/* Please note that for performing comparisons, only the transformed data (new_var) should be used. However, for just obtaining the original lsmeans, the analysis of syield should be seen */
/*
5. To perform the combined analysis of the above data set considering the locations as random effects */
/* If one considers locations as random effects, then one can perform the combined analysis as follows */
/* Analysis using PROC GLM */
proc glm data = wholedata;
class loc rep blk varn;
model syield = loc rep(loc) blk(rep loc) varn loc*varn;
random loc blk(rep loc) loc*varn / test;
lsmeans varn / pdiff;
run;
quit;
/* Analysis using PROC MIXED */
proc mixed ratio covtest data = wholedata;
class loc rep blk varn;
model syield = varn;
random loc rep(loc) blk(rep loc) loc*varn / s;
lsmeans varn / pdiff;
run;
* 6. To prepare a Site Regression (SREG) or GGE Biplot;
/*
This proc print statement prints the data set of lsmeans in MS-Excel format, to be used in the SREG biplot */
ods html body = 'lsmean.xls';
proc print data = lsmean;
var loc varn syieldLSMean;
run;
ods html close;
/* If the loc*varn interaction is significant, then the same genotype cannot be recommended for all locations. In such a situation, one can see the performance of genotype and genotype*environment interaction using the SREG biplot */
/* For the SREG biplot we create a data file named raw, where locations are termed ENV (environment here is location), treatment numbers GEN, and lsmeans for gen GYLD */
/* We are using the program developed by Jose Crossa and his coworkers at CIMMYT, Mexico, after some minor modifications. */
/* Please note that if an RCB design has been used at all the locations, then one can use means instead of lsmeans */
/* If some of the cells in the genotype by environment table are missing, then one can obtain BLUPs and use them in place of lsmeans. A word of caution: no more than 20% of the cells should be empty in the Genotype × Environment table */
OPTIONS PS = 5000 LS = 78 NODATE;
/* After removing the *, one can get the output as a cgm file directly, which can be imported into PowerPoint or Word documents for clarity. */
*FILENAME BIPLOT 'C:\Documents and Settings\owner\Desktop\comana.cgm';
*To have cgm files run it in BATCH;
*GOPTIONS DEVICE=CGMOF97L GSFNAME=BIPLOT GSFMODE=REPLACE;
/* One has to run the program twice: the first time to see the portion of variation explained by the two components in the output file; then one has to change the values for factor 1 and factor 2 in the file at the appropriate places. */
DATA RAW;
INPUT ENV $ GEN $ GYLD;
YLD = GYLD;
CARDS;
Jagdalpu 1 542.09147
Jagdalpu 2 267.39841
Jagdalpu 3 848.89802
Jagdalpu 4 696.1121
Jagdalpu 5 720.86558
Jagdalpu 6 311.71538
Jagdalpu 7 260.20248
Jagdalpu 8 369.21657
Jagdalpu 9 729.74581
Jagdalpu 10 542.26323
Jagdalpu 11 717.70558
Jagdalpu 12 651.01871
Jagdalpu 13 577.83177
Jagdalpu 14 606.85026
Jagdalpu 15 717.30296
Jagdalpu 16 687.08169
Jagdalpu 17 440.21927
Jagdalpu 18 419.94609
Jagdalpu 19 564.03212
Jagdalpu 20 918.83585
Jagdalpu 21 838.33456
Jagdalpu 22 834.47294
Jagdalpu 23 762.42873
Jagdalpu 24 1073.0971
Kanke 1 571.00595
Kanke 2 531.85913
Kanke 3 692.86905
Kanke 4 669.26587
Kanke 5 751.46488
Kanke 6 721.89464
Kanke 7 582.61012
Kanke 8 599.69702
Kanke 9 562.27679
Kanke 10 642.56369
Kanke 11 404.89821
Kanke 12 519.92798
Kanke 13 631.2502
Kanke 14 623.57956
Kanke 15 598.15813
Kanke 16 541.0121
Kanke 17 615.14742
Kanke 18 761.82123
Kanke 19 543.36091
Kanke 20 679.67044
Kanke 21 555.30833
Kanke 22 839.51389
Kanke 23 659.4
Kanke 24 656.44444
Shillong 1 641.08333
Shillong 2 867.08333
Shillong 3 884.34921
Shillong 4 647.15079
Shillong 5 752.90437
Shillong 6 906.97659
Shillong 7 383.52341
Shillong 8 733.59563
Shillong 9 775.75675
Shillong 10 537.22897
Shillong 11 545.5377
Shillong 12 402.14325
Shillong 13 619.79881
Shillong 14 780.52897
Shillong 15 578.62897
Shillong 16 790.70992
Shillong 17 734.20437
Shillong 18 623.73452
Shillong 19 502.42341
Shillong 20 824.97103
Shillong 21 799.86429
Shillong 22 538.90238
Shillong 23 827.09206
Shillong 24 843.14127
;
proc glm data = raw outstat = stats;
class env gen;
model yld = env gen env*gen / ss4;
run;
quit;
/* One has to replace MSE by the MSE in the combined analysis, DFE by the error degrees of freedom in the combined analysis, and NREP by the number of replications at each location in the combined analysis. */
data stats2;
set stats;
drop _name_ _type_;
/* compare case-insensitively so that the error row is dropped whether it is stored as 'ERROR' or 'error' */
if upcase(_source_) = 'ERROR' then delete;
mse = 10975.176; * mse in combined analysis when locations are random;
dfe = 111; * degrees of freedom in combined analysis;
nrep = 3; * number of replications at each location;
ss = ss*nrep;
ms = ss/df;
f = ms/mse;
prob = 1 - probf(f, df, dfe);
run;
proc print data = stats2 noobs;
var _source_ df ss ms f prob;
run;
proc glm data = raw noprint;
class env gen;
model yld = env / ss4;
output out = outres r = resid;
run;
quit;
proc sort data = outres;
by gen env;
run;
proc transpose data = outres out = outres2;
by gen;
id env;
var resid;
run;
proc iml;
use outres2;
read all into resid;
ngen = nrow(resid);
nenv = ncol(resid);
use stats2;
read var {mse} into msem;
read var {dfe} into dfem;
read var {nrep} into nrep;
call svd(u, l, v, resid);
minimo = min(ngen, nenv);
l = l[1:minimo,];
ss = (l##2)*nrep;
suma = sum(ss);
porcent = ((1/suma)#ss)*100;
porcenta = 0;
do i = 1 to minimo;
  df = (ngen-1) + (nenv-1) - (2*i-1);
  dfa = dfa//df;
  porceacu = porcent[i,];
  porcenta = porcenta + porceacu;
  porcenac = porcenac//porcenta;
end;
dfe = j(minimo, 1, dfem);
mse = j(minimo, 1, msem);
ssdf = ss||porcent||porcenac||dfa||dfe||mse;
l12 = l##0.5;
scoreg1 = u[,1]#l12[1,];
scoreg2 = u[,2]#l12[2,];
scoreg3 = u[,3]#l12[3,];
scoree1 = v[,1]#l12[1,];
scoree2 = v[,2]#l12[2,];
scoree3 = v[,3]#l12[3,];
factor1 = max(abs(scoreg1||scoreg2));
factor2 = max(abs(scoree1||scoree2));
factor = max(factor1, factor2);
scoreg = (scoreg1||scoreg2||scoreg3)*(1/factor);
scoree = (scoree1||scoree2||scoree3)*(1/factor);
scores = scoreg//scoree;
create sumas from ssdf;
append from ssdf;
close sumas;
create scores from scores;
append from scores;
close scores;
quit;
data ss_sreg;
set sumas;
ss_sreg = col1;
porcent = col2;
porcenac = col3;
df_sreg = col4;
dfe = col5;
mse = col6;
drop col1 - col6;
ms_sreg = ss_sreg/df_sreg;
f_sreg = ms_sreg/mse;
probf = 1 - probf(f_sreg, df_sreg, dfe);
run;
proc print data = ss_sreg noobs;
var ss_sreg porcent porcenac;
run;
proc sort data = raw;
by gen;
run;
proc means data = raw noprint;
by gen;
var yld;
output out = mediag mean = yld;
run;
data nameg;
set mediag;
type = 'GEN'; /* upper case, to match the case-sensitive test in the labels step */
name = gen;
keep type name yld;
run;
proc sort data = raw;
by env;
run;
proc means data = raw noprint;
by env;
var yld;
output out = mediae mean = yld;
run;
data namee;
set mediae;
type = 'ENV'; /* upper case, to match the case-sensitive test in the labels step */
name1 = 's'||env;
name = compress(name1);
keep type name yld;
run;
data nametype;
set nameg namee;
run;
data biplot;
merge nametype scores;
dim1 = col1;
dim2 = col2;
dim3 = col3;
drop col1-col3;
run;
title1 'biplot of audpc';
proc print data = biplot noobs;
var type name yld dim1 dim2 dim3;
run;
data labels;
set biplot;
retain xsys '2' ysys '2';
length function text color $8; /* color is given a length so that 'blue' is not truncated */
text = name;
if type = 'GEN' then
do;
  color = 'red';
  size = 1.0;
  style = 'hwcgm001';
  x = dim1;
  y = dim2;
  if dim1 >= 0 then position = '5';
  else position = '5';
  function = 'LABEL';
  output;
end;
if type = 'ENV' then
do;
  color = 'blue';
  size = 1.0;
  style = 'hwcgm001';
  x = 0.0;
  y = 0.0;
  function = 'MOVE';
  output;
  x = dim1;
  y = dim2;
  function = 'DRAW';
  output;
  if dim1 >= 0 then position = '6';
  else position = '4';
  function = 'LABEL';
  output;
end;
run;
/* One has to run the program twice: the first time to see the portion of variation explained by the two components, then to change the values for factor 1 and factor 2 in the program file at the appropriate places */
proc gplot data = biplot;
plot dim2*dim1 / annotate = labels frame
  vref = 0.0 href = 0.0
  cvref = black chref = black
  lvref = 3 lhref = 3
  vaxis = axis2 haxis = axis1
  vminor = 1 hminor = 1 nolegend;
symbol1 v = none c = black h = 0.7;
symbol2 v = none c = black h = 0.7;
axis2
  length = 5.0 in
  order = (-1 to 1.0 by 0.2) /* one has to change the values for factor 2 */
  label = (f = hwcgm001 c = green h = 1.2 a = 90 r = 0 'Factor 2 (28.18%)') /* please change the % variation explained as per data */
  offset = (3)
  value = (h = 1.0)
  minor = none;
* length = 7.0 in FOR CGM files;
axis1
  length = 7.0 in
  order = (-0.8 to 1.0 by 0.2) /* one has to change the values for factor 1 */
  label = (f = hwcgm001 c = green h = 1.2 'Factor 1 (61.25%)') /* please change the % variation explained as per data */
  offset = (3)
  value = (h = 1.0)
  minor = none;
* length = 7.0 in FOR CGM files;
title1 f = hwcgm001 c = Red h = 2.0 'SREG biplot of the Grain Yield of Toria at 3 Locations'; /* Give the title as required in the output */
run;
quit;
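Step 3 above types the three MSE values into the data step by hand. As an alternative, the values can be taken directly from the data set required created in step 2. The following is only a sketch under stated assumptions, not the authors' method: it assumes that required still holds one row per location with the variables loc and ms, and the data set name transformed_auto is hypothetical.

```sas
/* Sketch: attach each location's error mean square to its plots and
   rescale the yield, instead of typing the MSE values by hand.
   transformed_auto is a hypothetical data set name. */
proc sort data = wholedata;
  by loc;
run;
proc sort data = required;
  by loc;
run;

data transformed_auto;
  merge wholedata required(keep = loc ms);
  by loc;
  new_var = syield / sqrt(ms);   /* divide by the root MSE of the location */
run;
```

With this variant the transformation follows the data automatically when locations are added or the MSE values change; the combined analysis of step 4 would then read transformed_auto in place of the data set created in step 3.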
For exposure to SAS, SPSS, MINITAB, SYSTAT and MS-EXCEL for analysis of data from designed experiments, please see Module I of Electronic Book II: Advances in Data Analytical Techniques, available at the Design Resource Server (www.iasri.res.in/design).