##### Statistics Chapters 8, 10-13

Population

All units of possible interest

Sample

The part of the population upon which we actually take measurements

Frequency Curves

A "picture" of the population

Symmetric Frequency Curve

Not skewed, symmetric

Left Skewed Frequency Curve

Negatively Skewed

Right Skewed Frequency Curve

Positively Skewed

Proportion

The PROPORTION of the population with a measurement in a certain range is equal to the AREA under the frequency curve over that range
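
A minimal sketch of this idea, assuming for illustration only that the frequency curve is a normal curve with a hypothetical mean of 100 and standard deviation of 15 (these numbers are not from the notes). The area over a range is taken from the cumulative distribution function in Python's standard library.

```python
from statistics import NormalDist

def proportion_between(curve, low, high):
    """Area under the frequency curve between low and high."""
    return curve.cdf(high) - curve.cdf(low)

# Hypothetical population: normal curve, mean 100, standard deviation 15.
curve = NormalDist(mu=100, sigma=15)

# Proportion of the population with a measurement between 85 and 115.
p = proportion_between(curve, 85, 115)
print(round(p, 3))  # about 0.683
```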

Standardized Scores (Z-Scores)

The number of standard deviations an observation lies from the mean: z = (observation - mean) / standard deviation
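
As a small illustration, the z-score can be computed as below. The data values are hypothetical, and the sketch assumes the list is the whole population, so it uses the population standard deviation.

```python
from statistics import mean, pstdev

def z_score(value, population):
    """How many standard deviations value lies from the population mean."""
    mu = mean(population)
    sigma = pstdev(population)  # population standard deviation
    return (value - mu) / sigma

# Hypothetical measurements with mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, data))  # 2.0: two standard deviations above the mean
print(z_score(1, data))  # -2.0: two standard deviations below the mean
```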

Percentiles

An observation's PERCENTILE (or PERCENTILE RANK) is the percentage of the population that is equal to or less than the observation
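
A short sketch of this definition, using a hypothetical list of ten exam scores as the population:

```python
def percentile_rank(value, population):
    """Percentage of the population that is equal to or less than value."""
    at_or_below = sum(1 for x in population if x <= value)
    return 100 * at_or_below / len(population)

# Hypothetical population of ten exam scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Six of the ten scores are 80 or lower.
print(percentile_rank(80, scores))  # 60.0
```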

Measurement Variables

Continuous: any value in an interval is possible, including fractions

Discrete: only a countable number of values are possible

Deterministic Relationships

Perfect

One variable can be perfectly predicted using the other variable

y= function of x

Ex: x= height in feet

y= height in inches

y=12x

Statistical Relationships

Not perfect

The two variables may be related, but perfect prediction is not possible

y= (function of x) + error

Error: everything that affects y besides x

Ex: x= amount of time spent studying

y= score on exam

What factors affect exam scores other than amount of time studying?

Linear Relationships

A relationship of the form y=a+bx, where

a= y-intercept

b= slope

Correlation

A measurement of the STRENGTH and DIRECTION of the LINEAR relationship between two measurement variables

r= coefficient of correlation
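
The notes do not give a formula for r. The sketch below assumes the usual Pearson correlation coefficient and checks it on the height example (feet vs. inches), where the relationship is perfectly linear. The function name `pearson_r` is just an illustrative choice.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

feet = [4, 5, 6]
inches = [12 * x for x in feet]      # y = 12x, a perfect linear relationship

print(pearson_r(feet, inches))        # 1.0
print(pearson_r(feet, inches[::-1]))  # -1.0 (y falls as x rises)
```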

Positive Correlation

r>0

x and y tend to move in the same direction: larger x values tend to go with larger y values

Negative Correlation

r<0

x and y tend to move in opposite directions: larger x values tend to go with smaller y values

No Correlation

r=0

x and y are not linearly related

Perfect Correlation

r=1 or r=-1

All the points fall on a line

Regression

A procedure for estimating the "best" line relating two variables

Linear equation: y=a+bx

To fit the line we need to estimate ("guess") a and b from the data
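
The notes do not say how the "best" line is chosen. A common choice, assumed here, is least squares: pick a and b to minimize the sum of squared vertical distances from the points to the line. The study-time data below are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y-intercept
    return a, b

# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
exam = [52, 61, 68, 79, 85]

a, b = fit_line(hours, exam)
print(round(a, 1), round(b, 1))  # 43.8 8.4
```

With these made-up numbers, each extra hour of studying goes with about 8.4 more points on the fitted line.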

Extrapolation

Using the regression line for prediction at an x-value that is outside the x-range of the original data. Predictions that involve extrapolation are NOT valid
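
One way to guard against this, sketched under assumptions: the helper `predict` and all numbers below are hypothetical. It returns the prediction along with a flag that says whether the x-value lies outside the range of the data used to fit the line.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y = a + b*x and flag whether x is outside the fitted x-range."""
    is_extrapolation = x < x_min or x > x_max
    return a + b * x, is_extrapolation

# Hypothetical fitted line y = 43.8 + 8.4x, built from x-values 1 through 5.
a, b = 43.8, 8.4
x_min, x_max = 1, 5

inside = predict(a, b, 3, x_min, x_max)    # x = 3 is within the data range
outside = predict(a, b, 10, x_min, x_max)  # x = 10 is extrapolation

print(inside[1], outside[1])  # False True
```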

Outliers

Points that are far removed from the rest of the data, often due to errors in collecting or entering data or to rare events

Can either strengthen or weaken a correlation

Non-Linear Relationships

A correlation of r=0 means that in the sample no LINEAR relationship exists between x and y (even if r is 0 or close to 0, a strong, even perfect, non-linear relationship may exist).
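
A small made-up example of this: y = x² is a perfect non-linear relationship, yet over x-values placed symmetrically around 0 its correlation is 0. The sketch assumes r is the Pearson coefficient, and `pearson_r` is an illustrative helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # y is perfectly determined by x, but not linearly

r = pearson_r(xs, ys)
print(r)  # 0.0
```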

Risk

Proportion with "the trait"

Baseline Risk

Risk of having "the trait" for a group used as the basis for comparison
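
A brief sketch with hypothetical counts: risk is the number in a group with the trait divided by the group size, and the baseline risk is that same calculation for the comparison group.

```python
def risk(with_trait, group_size):
    """Proportion of a group that has the trait."""
    return with_trait / group_size

# Hypothetical counts (not from the notes).
baseline_risk = risk(10, 200)  # comparison group: 10 of 200 have the trait
group_risk = risk(30, 200)     # group of interest: 30 of 200 have the trait

print(baseline_risk, group_risk)  # 0.05 0.15
```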

Contingency Tables

A contingency table is a display showing the combinations of categories for two categorical variables and the number of individuals that fall in each combination

Rows

The sides of the table, usually the explanatory variable (EXPL)

Columns

The top of the table, usually the response variable (RESP)
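
A minimal sketch of building such a table by counting. The observations and category names are hypothetical: each pair is (explanatory category, response category), so the explanatory variable ends up in the rows and the response variable in the columns.

```python
from collections import Counter

# Hypothetical observations: (explanatory, response) for each individual.
observations = [
    ("studied", "pass"), ("studied", "pass"), ("studied", "fail"),
    ("did not study", "pass"), ("did not study", "fail"),
    ("did not study", "fail"),
]

counts = Counter(observations)
rows = sorted({expl for expl, _ in observations})  # explanatory categories
cols = sorted({resp for _, resp in observations})  # response categories

# table[row][col] = number of individuals in that combination of categories.
table = {row: {col: counts[(row, col)] for col in cols} for row in rows}

for row in rows:
    print(row, table[row])
```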