Design Resources Server

Analysis of Data from Designed Experiments

Cluster Analysis


Analysis Using SAS

 

To perform the analysis, enter the data in the following format. Here the genotypes are identified by trt (trt, the ID variable, must be a character variable) and y1-y8 are the eight characters of interest.

Prepare a SAS data set using:

 

DATA cluster;

input trt $ y1-y8; /* y1 to y8 are the 8 characters of interest */

/* For the variable trt, one can also give the original names of the genotypes */

cards;

1          42.43   7.51     78.32   128.24             45.26   6.51     7.74     7.66

2          43.83   7.91     71.20   114.71             82.90   6.51     6.34     7.05

3          40.77   9.77     85.47   129.50             189.22 6.64     6.91     10.09

4          43.35   8.18     68.25   113.16             176.59 6.77     6.57     8.91

5          45.74   7.97     83.52   123.77             80.53   6.57     6.83     7.34

6          45.43   8.11     84.45   132.23             77.30   6.47     6.81     7.35

7          42.68   7.68     92.47   128.09             36.70   6.55     7.16     6.43

8          39.18   6.64     73.50   122.04             49.24   6.25     7.20     6.75

9          45.98   8.10     67.41   123.39             45.25   6.20     8.01     6.68

10        43.64   8.56     84.44   138.81             90.98   6.64     6.28     7.04

11        44.68   8.11     91.71   125.12             65.78   6.40     6.49     6.51

12        45.90   7.50     70.85   122.81             54.94   6.36     8.85     8.48

13        42.61   7.57     75.78   120.91             85.98   6.57     7.02     6.91

14        42.56   8.21     94.64   134.48             111.13 6.65     6.60     8.06

15        45.86   7.78     84.67   123.80             82.93   6.67     6.75     6.89

16        41.70   8.00     95.02   137.69             116.10 6.63     6.58     7.45

17        43.25   7.78     82.50   129.05             107.23 6.68     6.83     7.46

18        43.05   8.10     73.76   120.28             203.81 6.64     6.65     10.18

19        40.24   7.48     74.66   121.99             88.66   6.69     6.53     7.46

20        44.43   8.32     83.58   122.86             83.86   6.58     6.58     8.57

21        44.34   7.81     75.01   129.42             74.00   6.61     7.53     8.53

22        44.67   7.98     75.55   123.79             102.31 6.62     6.72     7.85

23        43.54   7.65     94.30   134.73             77.17   6.53     6.93     7.26

24        45.10   8.01     91.40   134.84             86.93   6.52     7.32     7.12

25        45.03   10.15   85.85   133.02             73.07   6.63     6.32     6.85

26        46.82   7.54     85.22   130.92             65.22   6.63     6.59     7.10

27        46.52   7.65     82.51   125.34             66.98   6.59     6.60     6.90

28        42.33   7.20     65.85   120.38             66.24   6.61     7.41     7.17

29        42.98   8.06     84.04   128.80             94.25   6.65     6.91     8.18

30        46.19   7.60     94.02   137.15             97.35   6.59     6.26     7.38

31        45.69   8.03     92.94   136.41             68.96   6.47     6.41     6.54

32        44.46   7.79     85.94   132.44             83.46   6.65     6.74     7.15

33        46.50   7.92     82.59   133.78             56.92   6.39     7.11     6.43

34        46.45   8.30     81.20   134.81             92.09   6.43     8.09     7.52

35        43.05   7.98     84.54   131.97             88.59   6.62     6.59     7.78

36        43.64   7.49     66.53   114.51             107.24 6.78     7.11     8.29

37        44.57   7.76     91.55   135.29             49.93   6.48     6.94     6.74

38        44.34   8.00     94.67   140.19             103.65 6.48     6.50     7.40

39        43.65   7.66     66.88   119.78             127.28 6.59     7.13     8.70

40        43.58   7.92     80.75   133.31             147.71 6.79     6.40     8.58

41        44.45   8.06     84.76   121.68             85.58   6.10     8.22     7.75

42        47.87   8.12     86.94   125.43             81.97   6.39     7.61     8.11

43        45.23   7.22     79.38   126.39             51.82   6.33     9.44     7.25

44        43.53   7.55     82.47   131.00             35.56   6.04     10.36   7.59

45        43.13   7.52     67.45   116.49             133.48 6.70     6.59     9.20

46        41.95   7.34     71.86   121.80             95.91   6.74     6.52     7.58

47        41.06   7.28     69.43   120.32             74.69   6.65     6.78     7.18

48        39.71   7.24     64.23   114.25             130.62 6.76     6.59     8.83

49        41.30   7.18     64.53   113.78             93.68   6.78     6.52     7.71

50        41.95   7.30     66.53   115.44             139.30 6.71     6.32     7.96

51        43.48   7.38     66.36   115.12             106.66 6.74     6.24     8.01

52        43.22   7.56     71.29   115.29             159.95 6.68     6.44     8.54

53        40.15   7.31     67.65   115.35             140.49 6.77     6.69     8.48

54        44.30   7.73     63.26   117.41             144.21 6.71     6.59     8.42

55        38.25   7.24     63.71   113.24             104.85 6.76     6.69     7.46

56        44.07   7.53     64.34   113.68             96.62   6.75     6.48     8.01

57        43.79   7.75     65.22   116.27             128.92 6.70     6.55     8.58

58        43.11   7.68     62.48   115.50             143.87 6.79     6.52     8.97

59        40.87   7.55     66.39   114.95             141.49 6.79     6.43     8.31

60        42.98   7.36     68.64   115.32             115.23 6.72     6.46     8.06

61        47.40   7.76     65.37   115.10             134.99 6.64     6.60     7.41

62        39.00   7.67     64.94   113.05             122.66 6.72     6.49     7.71

63        44.38   7.41     68.30   115.99             128.34 6.76     6.66     8.65

64        42.13   7.19     68.88   119.77             90.78   6.67     7.21     7.90

65        42.68   7.40     65.26   118.73             115.82 6.79     7.03     8.35

66        40.62   7.85     65.17   113.06             134.02 6.79     6.38     8.44

67        44.43   7.40     67.14   117.82             115.09 6.73     7.37     8.47

68        41.56   6.94     69.03   115.53             93.68   6.63     6.79     7.23

69        41.07   7.00     63.97   115.42             91.20   6.83     7.53     7.92

70        41.10   7.71     63.98   113.52             144.02 6.86     6.65     11.09

71        42.45   7.12     65.92   117.29             79.98   6.75     7.18     7.15

72        42.12   7.35     60.95   108.99             128.10 6.77     6.40     8.13

73        41.00   7.33     65.33   113.44             130.96 6.77     6.37     7.97

74        43.67   7.64     62.95   118.32             119.09 6.72     7.15     9.26

75        46.49   7.97     88.06   126.87             75.97   6.58     6.53     6.78

76        42.98   7.39     66.57   119.79             118.84 6.65     7.04     8.54

77        41.01   7.02     59.90   113.64             104.40 6.70     7.09     8.36

78        48.85   6.84     45.32   104.53             66.53   6.75     8.27     7.69

79        49.60   7.17     59.37   110.36             82.16   6.76     7.93     7.80

80        49.50   7.40     62.24   113.10             144.37 6.50     6.64     9.01

81        44.53   7.63     65.14   113.71             140.34 6.76     6.74     8.31

82        46.59   7.47     85.08   123.53             72.37   6.67     6.37     6.74

83        44.78   7.47     85.72   126.54             113.46 6.69     6.59     6.92

84        42.22   7.33     69.77   115.38             105.00 6.70     6.67     8.11

85        37.10   7.13     80.79   122.32             64.18   6.44     6.29     6.69

86        44.42   6.94     66.76   120.34             49.67   6.33     8.35     7.28

87        45.53   7.52     67.79   114.47             146.09 6.71     6.54     8.13

88        42.50   7.48     62.49   114.59             130.72 6.79     6.61     8.24

89        46.06   7.67     86.69   125.51             75.84   6.44     6.44     7.12

90        36.44   7.45     71.74   114.81             137.22 6.73     6.28     8.23

91        42.67   7.36     70.64   121.17             43.06   6.71     6.79     6.13

92        40.44   7.06     59.99   107.81             99.93   6.86     7.11     8.53

93        39.35   7.51     55.26   120.88             91.51   6.63     6.86     6.98

94        45.41   7.08     67.92   118.80             64.25   6.63     7.93     7.61

95        43.19   7.32     65.39   118.82             98.70   6.67     7.07     8.18

96        45.43   7.52     65.12   118.90             95.82   6.79     7.61     8.21

97        44.34   7.48     63.66   113.72             110.46 6.79     7.02     7.97

98        42.76   7.25     63.02   109.59             104.84 6.84     7.08     9.16

99        45.07   8.56     92.05   135.80             94.57   6.68     6.55     7.60

100      40.96   7.39     63.33   112.40             96.36   6.76     6.90     7.56

101      42.15   6.97     61.06   114.76             118.46 6.78     6.71     8.42

102      40.36   7.00     69.93   120.29             53.84   6.23     8.44     7.15

103      42.68   7.81     62.51   113.08             152.97 6.75     6.83     8.49

104      39.78   7.00     69.38   121.18             76.18   6.57     8.58     9.46

105      41.20   7.06     68.03   118.67             69.91   6.61     6.48     6.91

106      40.65   7.16     65.63   118.89             92.78   6.71     7.03     7.99

107      43.32   7.78     65.56   118.91             128.72 6.82     6.92     8.63

108      50.53   7.07     68.18   118.80             89.23   6.72     7.84     8.66

109      44.89   7.82     65.60   114.99             124.45 6.77     6.67     8.40

110      46.23   7.72     67.76   120.83             111.15 6.72     7.52     9.12

 ;

 

/* The following statements use the DISTANCE procedure to obtain a distance matrix that will be used as input to the subsequent clustering procedure. The OUT= option creates an output SAS data set called distmat that contains the distance matrix. METHOD=EUCLID requests Euclidean distances (this is also the default). For use in PROC CLUSTER, a distance or dissimilarity measure such as METHOD=EUCLID or METHOD=DGOWER should be chosen. */

 


  
PROC distance data=cluster method=EUCLID out=distmat;

      var interval(y1-y8); /* The VAR statement lists the variables and their measurement level */

      id trt;  /* the variable in the ID statement should be a character variable */

   run;
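The eight characters are on quite different scales (y5 ranges from about 35 to 204, while y6 stays between about 6.0 and 6.9), so unstandardized Euclidean distances are driven mainly by the wide-ranging variables. If that is not intended, one possible variation is to standardize each variable inside PROC DISTANCE through the STD= option of the VAR statement. This is only a sketch and not part of the original analysis; the output data set name distmat_std is hypothetical.

```sas
/* Sketch, assuming all eight characters should contribute equally.      */
/* STD=STD standardizes each variable to mean 0 and standard deviation 1 */
/* before the Euclidean distances are computed.                          */
PROC distance data=cluster method=EUCLID out=distmat_std;
   var interval(y1-y8 / std=std);
   id trt;
run;
```

To cluster on the standardized distances, distmat_std would replace distmat in the DATA= option of the PROC CLUSTER step.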

 

  /* The ID statement specifies that the variable trt should be copied to the OUT= data set and used to generate names for the distance variables: the distance variables in the output data set are named after the values of the ID variable. The PROC PRINT step below prints the Euclidean distances between pairs of treatments. */

 

   PROC print data=distmat;

      id trt;

   run;
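To make the printed matrix easier to read, a single entry can be checked by hand. The sketch below recomputes the Euclidean distance between genotypes 1 and 2 directly from their y1-y8 values as listed in the data; the array names a and b and the variables ss and d are illustrative only.

```sas
/* Hand check of one distance: square root of the sum of squared */
/* differences over the eight characters of genotypes 1 and 2.   */
DATA _null_;
   array a[8] _temporary_ (42.43 7.51 78.32 128.24 45.26 6.51 7.74 7.66);
   array b[8] _temporary_ (43.83 7.91 71.20 114.71 82.90 6.51 6.34 7.05);
   ss = 0;
   do i = 1 to 8;
      ss = ss + (a[i] - b[i])**2;   /* accumulate squared differences */
   end;
   d = sqrt(ss);
   put 'Euclidean distance between trt 1 and trt 2: ' d 8.4;
run;
```

The value written to the SAS log should agree with the entry for treatments 1 and 2 in the printed distmat data set, provided the distances were computed without standardization as in the PROC DISTANCE step above.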

 

 /* The PROC CLUSTER statement starts the CLUSTER procedure; the METHOD= specification determines the clustering method used by the procedure. Here we have used METHOD=AVERAGE (average linkage, i.e. the unweighted pair-group method using arithmetic averages, UPGMA). Note that with METHOD=AVERAGE, PROC CLUSTER squares the input distances unless the NOSQUARE option is specified. The OUTTREE= option creates the output data set tree, which is used as input to the TREE procedure that produces the dendrogram. */

 

   PROC cluster data=distmat method=average outtree=tree;

     id trt;

   run;

 

 /* The following statements use the TREE procedure to produce a dendrogram of the clusters, with the SAS data set tree as input. The OUT= option creates an output SAS data set named out that contains the cluster membership of each genotype. The NCLUSTERS= option specifies the number of clusters wanted in the data set out (here we have taken 3 clusters). */

 

/* After removing the leading * from the three statements below, the dendrogram is written directly to a CGM file, which can be imported into PowerPoint or Word documents. */

 

*OPTIONS PS = 5000 LS=78 NODATE;

*FILENAME DENDRO 'C:\Documents and Settings\Owner\Desktop\dend.cgm'; 

*GOPTIONS DEVICE=CGMOF97L GSFNAME=DENDRO GSFMODE=REPLACE;

 

goptions   htext=1pct ;

PROC tree data=Tree nclusters=3 horizontal hordisplay=right lines=(color=blue) out=out;

id trt;

run;

 

 /* The following statements use the SORT procedure to sort the data set out by trt */

   PROC sort data=out;

      by trt;

   run;

 

 /* The following statements use the SORT procedure to sort the data set cluster by trt */

 

   PROC sort data = cluster;

   by trt;

   run;

 

    /* The following statements merge the two data sets cluster and out by trt, so that each genotype carries its cluster membership alongside y1-y8 */

 

   DATA clus;

      merge cluster out;

      by trt;

   run;

 

    /* The following statements use the SORT procedure to sort the data set clus by cluster number */

 

   PROC sort data=clus;

      by cluster;

   run;

 

    /* The following statements use the PRINT procedure to print the genotypes cluster by cluster */

 

   PROC print data=clus;

      id trt;

      by cluster;

   run;
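Once clus has been created, a short summary often makes the three clusters easier to interpret than the full listing. The following is a minimal sketch, assuming the data set clus from the steps above with the variable cluster written by PROC TREE; the choice of statistics is illustrative.

```sas
/* Number of genotypes in each cluster */
PROC freq data=clus;
   tables cluster;
run;

/* Mean and standard deviation of each character within each cluster */
PROC means data=clus n mean std maxdec=2;
   class cluster;
   var y1-y8;
run;
```

Comparing the cluster means of y1-y8 indicates which characters separate the clusters most clearly.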

 

 


For exposure to SAS, SPSS, MINITAB, SYSTAT and MS-EXCEL for analysis of data from designed experiments, please see Module I of Electronic Book II: Advances in Data Analytical Techniques, available at Design Resources Server (www.iasri.res.in/design).