Design Resources Server

Analysis of Data from Designed Experiments

Correlation and Regression

IASRI

 Analysis Using SAS

 

Questions 1 and 2 can be answered using PROC CORR of SAS. The scatter plot can be drawn using PROC PLOT, and questions 4 to 8 can be answered using PROC REG. We begin with questions 1 and 2; the SAS statements for answering them are given below.

 

Data Input:

To perform the analysis, enter the data in the following format.

Here the serial number is denoted by SN, plant population by PP, average plant height by PH, average number of green leaves by NGL and yield by YLD. These names are only a convention; any other variable names or coding may be used.

Prepare a SAS data set using the following statements:

 

data corr;    /* any other valid name may be used for the data set */

input sn pp ph ngl yld;

cards;

1        142.00   0.525   8.2     2.47

2        143.00   0.64     9.5     4.76

3        107.00   0.66     9.3     3.31

4          78.00   0.66     7.5     1.97

5        100.00   0.46     5.9     1.34

6           86.50  0.345   6.4     1.14

7         103.50  0.86     6.4     1.5

8         155.99  0.33     7.5     2.03

9           80.88  0.285   8.4     2.54

10       109.77  0.59   10.6     4.9

11         61.77  0.265   8.3     2.91

12         79.11  0.66   11.6     2.76

13       155.99  0.42     8.1     0.59

14         61.81  0.34     9.4     0.84

15         74.5 0.63     8.4     3.87

16         97.00  0.705    7.2     4.47

17         93.14  0.68      6.4     3.31

18         37.43  0.665    8.4     1.57

19         36.44  0.275    7.4     0.53

20         51.00  0.28      7.4     1.15

21       104.00  0.28      9.8     1.08

22         49.00  0.49      4.8     1.83

23        54.66   0.385    5.5     0.76

24        55.55   0.265    5.0     0.43

25        88.44   0.98      5.0     4.08

26        99.55   0.645    9.6     2.83

27        63.99   0.635    5.6     2.57

28      101.77   0.29      8.2     7.42

29      138.66   0.72      9.9     2.62

30        90.22   0.63      8.4     2.00

31        76.92   1.25      7.3     1.99

32      126.22   0.58      6.9     1.36

33        80.36   0.605    6.8     0.68

34      150.23   1.19      8.8     5.36

35        56.50   0.355    9.7     2.12

36      136.00   0.59    10.2     4.16

37      144.50   0.61      9.8     3.12

38      157.33   0.605    8.8     2.07

39        91.99   0.38      7.7     1.17

40      121.50   0.55      7.7     3.62

41        64.50   0.32      5.7     0.67

42      116.00   0.455    6.8     3.05

43        77.50   0.72    11.8     1.70

44        70.43   0.625  10.0     1.55

45      133.77   0.535    9.3     3.28

46        89.99   0.49      9.8     2.69

;
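
Before running the analyses, it can help to confirm that the data were read as intended. The following is a minimal sketch, assuming the data set was created under the name corr as above; the choice of five observations and of the summary statistics is arbitrary.

```sas
/* Sketch: quick check of the data set created above */
proc print data=corr (obs=5);      /* list the first five observations */
run;

proc means data=corr n mean std min max;
   var pp ph ngl yld;              /* N should be 46 for every variable */
run;
```

If N is less than 46 for any variable, a data line was probably mistyped or misaligned.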

 

/* Obtain the correlation coefficient between each pair of the variables PP, PH, NGL and YLD using the following SAS statements */

proc corr;

var pp ph ngl yld;

run;

 

 

/* Obtain the partial correlation between NGL and YLD after removing the linear effects of PP and PH using the following SAS statements */

 

proc corr;

var ngl yld;

partial pp ph;

run;
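
The phrase "removing the linear effect of PP and PH" can be made concrete: the partial correlation coefficient equals the ordinary correlation between the residuals of NGL and YLD after each has been regressed on PP and PH. The sketch below illustrates this equivalence; it is not a replacement for the PARTIAL statement. The data set name resid and the variable names res_ngl and res_yld are hypothetical names chosen for this illustration.

```sas
/* Sketch: partial correlation computed as the correlation of residuals */
proc reg data=corr noprint;
   model ngl yld = pp ph;                 /* regress NGL and YLD on PP and PH */
   output out=resid r=res_ngl res_yld;    /* one residual variable per dependent variable */
run;
quit;

proc corr data=resid;
   var res_ngl res_yld;                   /* coefficient matches the PARTIAL result */
run;
```

The coefficient should agree with the one reported by the PARTIAL statement. The p-value printed here will differ, because this second PROC CORR step does not know that two degrees of freedom were already used in removing PP and PH.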

 

/* Obtain the scatter plot using the following SAS statements */

proc plot;

plot  pp*yld = '*';

/*pp=VERTICAL AXIS yld = HORIZONTAL AXIS.*/

run;
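
PROC PLOT produces a character-based (line printer) plot. Where a graphical scatter plot is preferred and the installed SAS release provides PROC SGPLOT, a sketch such as the following could be used instead. This is an alternative to the PROC PLOT step above, not part of the original analysis; the axes are arranged to match it (pp vertical, yld horizontal).

```sas
/* Sketch: graphical scatter plot, assuming PROC SGPLOT is available */
proc sgplot data=corr;
   scatter x=yld y=pp;     /* yld on the horizontal axis, pp on the vertical axis */
run;
```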

 

/* Fit a multiple linear regression equation taking yield as the dependent variable and the biometrical characters as explanatory variables. Print the matrices used in the regression computations using the following SAS statements */

 

proc reg;

model yld= pp ph ngl/p r influence vif collin xpx i;

/* testing the significance of regression coefficients. This is also done by default in regression fitting*/

test1: test pp=0;

test2: test ph=0;

test3: test ngl=0;

*testing the equality of two regression coefficients;

test4: test pp-ph=0;

test4a: test pp=0, ph=0;

/* test4 tests whether the regression coefficients of pp and ph are equal, whereas test4a tests jointly whether the regression coefficients of pp and ph are both zero; equations separated by commas in a TEST statement are tested simultaneously */

test5: test ph-ngl=0;

test5a: test ph=0, ngl=0;

run;

/* 

p: Calculates predicted values from the input data and the estimated model. The display includes the observation number, the ID variable (if one is specified), the actual and predicted values, and the residual. If the CLI, CLM, or R option is specified, the P option is unnecessary.

r: Requests an analysis of the residuals. The results include everything requested by the P option plus the standard errors of the mean predicted and residual values, the studentized residual, and Cook's D statistic, which measures the influence of each observation on the parameter estimates.

influence: Computes influence statistics for each observation.

vif: Produces variance inflation factors with the parameter estimates. The variance inflation factor is the reciprocal of tolerance.

collin: Requests a detailed analysis of collinearity among the regressors. This includes eigenvalues, condition indices, and the decomposition of the variances of the estimates with respect to each eigenvalue.

xpx: Displays the X'X crossproducts matrix for the model. The crossproducts matrix is bordered by the X'Y and Y'Y matrices.

i: Displays the inverse of the crossproducts matrix, (X'X)^(-1). The inverse is bordered by the parameter estimates and SSE matrices. */
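
The statement that variance inflation is the reciprocal of tolerance can be written out as follows. Here $R_j^2$ denotes the coefficient of determination obtained when the j-th explanatory variable is regressed on the remaining explanatory variables; the symbols $\mathrm{TOL}_j$ and $\mathrm{VIF}_j$ are notation introduced for this illustration.

```latex
\mathrm{TOL}_j = 1 - R_j^2,
\qquad
\mathrm{VIF}_j = \frac{1}{\mathrm{TOL}_j} = \frac{1}{1 - R_j^2}
```

For example, if pp were regressed on ph and ngl with $R_j^2 = 0.5$, the tolerance would be 0.5 and the variance inflation factor 2. A variance inflation factor close to 1 indicates that the variable is nearly uncorrelated with the other regressors.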

 

/* A regression model without an intercept can be fitted in either of the following two ways */

 

proc reg;

model yld=pp ph ngl;

restrict intercept=0;  /* A RESTRICT statement is used to place restrictions on the parameter estimates in the MODEL preceding it. */

run;

   

proc reg;

model yld=pp ph ngl/noint;   /* Use the NOINT option to fit a model without an intercept term */

run;  

 

 


For exposure to SAS, SPSS, MINITAB, SYSTAT and MS-EXCEL for analysis of data from designed experiments, please see Module I of Electronic Book II: Advances in Data Analytical Techniques, available at Design Resource Server (www.iasri.res.in/design).