AGENDA
- Treatment of Data
- Homework for next Wednesday
- Quiz next Wednesday

TREATMENT OF DATA
- General Concepts
- Pareto and Dot Diagrams
- Frequency Distributions
- Graphs of Frequency Distributions
- Descriptive Measures
- Quartiles and Percentiles
- X bar and s

GENERAL CONCEPTS
- Raw statistical data from surveys, experiments, etc. can be too overwhelming to understand
- The data must be condensed and represented in a manner that is more easily understood
  - Graphically
  - Tabular or numerical form

PARETO DIAGRAMS
- Special bar chart
- Based on the Pareto 80-20 Principle
- Ordered in descending order of interest
- Allows attention to be directed to the most important areas
- Frequently include cost-related data

PARETO CHART
[Figure: Pareto chart; surviving labels: Gasket, Ring]

DOT DIAGRAMS
- Visually summarize individual data
- Check for unusual patterns
- Easily identify outliers
- Show differences in data sources
  - Machines
  - Personnel
  - Materials

FREQUENCY DISTRIBUTIONS
- Table of data
- Divided into classes / categories / cells
- Number of cells is usually related to the total number of observations
- Class / category / cell limits
- Class / category / cell frequencies

CUMULATIVE DISTRIBUTION
- Total number of observations less than a given value

GRAPHS OF FREQUENCY DISTRIBUTIONS
- Histogram of cell observations
- Horizontal or vertical
- Bar size is based on the number of observations in each cell

OGIVE
- Graph of cumulative distribution

STEM AND LEAF DISPLAYS
- Used for smaller sets of data
- Does not lose any information
- Shows the classes as well as the actual data values
- Data values are listed to the right of the classes

DESCRIPTIVE MEASURES
- Mean
- Median
- Mode
- Minimum
- Maximum
- Range
- Variance
- Standard Deviation
- Coefficient of variation

MEAN
- X bar
- Arithmetic average of all values
- Sum of all values divided by number of values
- Sample mean and population mean

MEDIAN
- “Middle value”
- Observations are ordered from smallest to largest
- Position of the median observation depends on the number of observations
- Odd number of observations: the median is the observation in position (n+1)/2
  - For 5 observations, the median is the value of the (5+1)/2 = 3rd observation
- Even number of observations: the median is the average of the two observations in positions n/2 and (n+2)/2
  - For 6 observations, average the values of the 3rd and 4th observations

MODE
- Most common value

MINIMUM
- Smallest value

MAXIMUM
- Largest value

RANGE
- Method to measure the dispersion of the values
- Largest value minus the smallest value
- Can be misleading when outliers are present
- Does not take into account the distribution or bunching of values
- Simple and fast to calculate, so commonly used in industry, particularly with SPC charts

SAMPLE VARIANCE
- Absolute measure of dispersion
- When many values are far from the mean, the variance is large
- When many values are close to the mean, the variance is small
- Based on
  - Sample mean
  - Squared difference of observations from sample mean
  - Number of observations in sample

SAMPLE STANDARD DEVIATION
- Absolute measure of dispersion
- Square root of the variance

COEFFICIENT OF VARIATION
- Relative measure of dispersion
- Relative to the sample mean

QUARTILES AND PERCENTILES
- Quartiles
  - Groupings of 25% of the observations
  - 1st, 2nd, 3rd, 4th quartile
- Percentiles
  - At least 100p% of the observations are at or below the value
  - At least 100(1-p)% of the observations are at or above the value

PROCEDURE FOR CALCULATING PERCENTILES
- Order observations smallest to largest
- Calculate n * p
  - Not an integer: round up to the next highest integer and find the corresponding ordered value
  - Integer (call it k): calculate the mean of the kth and (k+1)th observations

BOX PLOT
[Figure: box plot]

CALCULATOR FORMULA
- Mathematically equivalent formula for calculating sample variance
- Commonly used in programming applications

HOMEWORK
- Due next Wednesday
- 2.68, 2.70

QUIZ
- During the second half of the class next Wednesday
- Covers the material discussed in chapter 2
- Does not cover everything presented in the book
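
As a study aid for the median rule, the percentile procedure, and the calculator formula covered above, here is a minimal Python sketch. The function names are hypothetical and the data are made up for illustration. The slides do not reproduce the variance formulas in text, so the n - 1 divisor and the particular calculator form used here are assumptions based on the usual textbook definitions, not necessarily the exact expressions shown in class.

```python
import math


def median(values):
    """Median by position: (n+1)/2 for odd n; mean of positions n/2 and (n+2)/2 for even n."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]  # convert 1-based position to 0-based index
    return (xs[n // 2 - 1] + xs[(n + 2) // 2 - 1]) / 2


def sample_variance(values):
    """Definition form: squared deviations from the sample mean, divided by n - 1 (assumed divisor)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_calculator(values):
    """Calculator form (assumed): (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return (n * total_sq - total ** 2) / (n * (n - 1))


def percentile(values, p):
    """100p-th percentile following the slide procedure, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    xs = sorted(values)              # step 1: order smallest to largest
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    np_ = n * p                      # step 2: calculate n * p
    k = round(np_)
    # Integer case (with a small tolerance for floating-point error):
    # mean of the kth and (k+1)th ordered observations.
    if 1 <= k < n and math.isclose(np_, k, rel_tol=0.0, abs_tol=1e-9):
        return (xs[k - 1] + xs[k]) / 2
    # Non-integer case: round up and take that ordered observation.
    k = math.ceil(np_)
    return xs[k - 1]


if __name__ == "__main__":
    data = [2, 4, 4, 5, 7, 9, 10, 12]  # made-up observations
    s2 = sample_variance(data)
    s = math.sqrt(s2)                            # sample standard deviation
    cv = s / (sum(data) / len(data)) * 100       # coefficient of variation, in percent
    print("median:", median(data))
    print("first quartile:", percentile(data, 0.25))
    print("variance (definition):", s2)
    print("variance (calculator):", sample_variance_calculator(data))
    print("std dev:", s, "CV %:", cv)
```

The two variance functions should agree up to rounding error, which is the sense in which the calculator formula is "mathematically equivalent". The calculator form needs only the running sums of x and x squared, so it works in a single pass over the data, though it can lose precision when the values are large relative to their spread.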