AGENDA
Linear Regression

LINEAR REGRESSION
Statistical method for determining a linear relationship between two variables, assuming normality and independence
Two types of variables
  Independent
  Dependent
Method of least squares

DATA

WHICH PREDICTION LINE

PREDICTION LINE

DETERMINING PREDICTION LINE
Calculate the total squared error between the
  Actual value
  Predicted value according to the prediction line

DIFFERENTIATE
Want the minimum of the sum of squared error
Take partial derivatives of the previous equation with respect to a and b
Set to 0

NORMAL EQUATIONS
With some rearrangement
Resulting linear equations are called normal equations

INTERCEPT AND SLOPE
Substitute the sums of x, y, xy, and x squared values in the normal equations
Solve for a
  Intercept with y axis at x = 0
Solve for b
  Coefficient of slope for x values

EXAMPLE

NORMAL EQUATIONS

PREDICTION EQUATION

INFERENCES
R squared…
T value for coefficients…

R SQUARED
What percentage of the variation in the data can be accounted for by the equation
The closer to 1.0, the better the fit
For physical experiments, R squared should be > 0.9
For people-related experiments, R squared should be > 0.30
R squared formula…

R SQUARED

INTERCEPT PARAMETER
Alpha
Typically tested against the intercept alpha being 0
Hypothesis test
  H0: alpha = 0
  Ha: alpha ≠ 0
Two-sided critical value based on n - 2 df
Test statistic of…

TEST STATISTIC FOR INTERCEPT

SLOPE PARAMETER
Beta
Typically tested against the slope beta being 0
Hypothesis test
  H0: beta = 0
  Ha: beta ≠ 0
Two-sided critical value based on n - 2 df
Test statistic of…

SLOPE PARAMETER B

IN EXCEL
Data Analysis-Regression
Specify
  Input x range
  Input y range
Returns
  R squared
  Coefficients
  T stat
  P value
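
The steps outlined above (solve the normal equations for a and b, compute R squared, then form t values for the intercept and slope) can be sketched in plain Python as a cross-check on the Excel output. This is a minimal sketch rather than the deck's own worked example: the function name fit_line, the dictionary keys, and the sample data are hypothetical, and the standard-error formulas are the usual textbook ones for simple linear regression with n - 2 degrees of freedom. It does not compute p values; the t values would still be compared against a two-sided critical value from a t table.

```python
import math


def fit_line(xs, ys):
    """Least squares fit of y = a + b*x (hypothetical helper for illustration)."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")

    # Sums that appear in the normal equations.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Normal equations:
    #   sum_y  = n*a     + b*sum_x
    #   sum_xy = a*sum_x + b*sum_x2
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical")
    b = (n * sum_xy - sum_x * sum_y) / denom  # slope
    a = (sum_y - b * sum_x) / n               # intercept at x = 0

    # R squared: share of the variation in y accounted for by the line.
    mean_x = sum_x / n
    mean_y = sum_y / n
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical")
    r_squared = 1 - ss_res / ss_tot

    # Standard errors based on n - 2 degrees of freedom.
    # Assumption: the fit is not perfect (ss_res > 0), otherwise the
    # standard errors are 0 and the t values below are undefined.
    s2 = ss_res / (n - 2)
    sxx = sum_x2 - sum_x ** 2 / n
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))

    return {
        "intercept": a,
        "slope": b,
        "r_squared": r_squared,
        "t_intercept": a / se_a,  # tests H0: alpha = 0
        "t_slope": b / se_b,      # tests H0: beta = 0
        "df": n - 2,
    }


# Made-up sample data, not taken from the slides.
result = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in result.items():
    print(name, round(value, 4))
```

The returned intercept, slope, R squared, and t values correspond to the Coefficients, R squared, and T stat entries that the Excel regression tool reports for the same input ranges.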