PSYC 325 Research Test 1
one observes nature, proposes a law to generalise an observed pattern, confirms it by many observations, while discarding disconfirmed laws
According to scientific inductivism...
science is based on observation, and the acceptance and rejection of possible hypotheses based on these observations
Inductive reasoning makes
broad generalisations from specific observations ie. there are data, then conclusions are drawn from the data
Example of inductivism
Swans I've seen are white, so I draw the conclusion that all swans are white (jump in logic)
Falsificationism is based on...
deduction: you are trying to reject something
we cannot confirm hypotheses, only falsify them
hold a theory and based on it we make a prediction of its consequences (we predict what the observations should be if the theory were correct). From general (theory) to specific (the observations)
from specific (data) to general (theory)
Deductive hypothesis testing
Begin with a hypothesis (eg. all swans are white), and then collect data. Data can falsify (eg. run across black swan).
Hypothesis testing t-test
P-value is below .05
We "reject the null hypothesis that there is no difference between the two groups"
Falsificationism: we have "failed to falsify our hypothesis" ie. "we have found support for our hypothesis"
no difference between groups
if our hypothesis is supported: we want to reject this
P-value is above .05
We "fail to reject the null hypothesis"
The result isn't significant, so we cannot reject the null hypothesis: our hypothesis is not supported, but the null hypothesis is not proven true either
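The reject / fail-to-reject decision on these cards can be sketched in code. This is only an illustration, not the course's method: it swaps the t-test for a permutation test so that it runs on the Python standard library alone, and the two groups of scores are made-up numbers.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference between group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0
    return (extreme + 1) / (n_permutations + 1)

def decide(p_value, alpha=0.05):
    """Apply the .05 cut-off used on the cards."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Hypothetical scores for two groups
treatment = [14, 15, 16, 17, 18, 19, 20, 21]
control = [8, 9, 10, 11, 12, 13, 14, 15]
p = permutation_p_value(treatment, control)
print(p, decide(p))
```

A p-value above .05 would return "fail to reject the null hypothesis", which is not the same as showing that the null hypothesis is true.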
Cyclical process: general principle
you create a formal inductive rule
Cyclical process: deduction
you recognise that your informal observations aren't rooted in science, so you desire to test your hypothesis
Cyclical process: prediction
you make a formal deductive hypothesis
Cyclical process: specific instances: individual events
you systematically collect data in an effort to falsify your hypothesis. Based on your findings, you will probably revise your hypothesis
Cyclical process: Observations
you casually notice lots of white swans at a nearby lake
Cyclical process: induction
your use of logic leads you to believe that swans are white
Deductive/inductive hypothesis testing cycle
3. general principle
eg. frequencies, means, and SDs. Aligned with inductive purposes because one can discern patterns
eg. t-tests, MANOVAs
Does statistical test reject or fail to reject the null hypothesis
What is the role of insights, observations, and theories in setting up a study?
They can play a large role in constructing inductive theories or hypotheses. Qualitative research can fulfil this role.
How do we collect data that will be useful in elucidating a hypothesis?
Design a good study using strong methodology that prevents threats to internal validity (i.e., biases such as social desirability)
How do we analyse that data so that we end up with good evidence?
Statistical methods that illuminate associations between variables or differences between groups. Inferential statistics indicate how likely such results would be if only chance were operating; they do not by themselves make the results reliable and valid.
How do we combine and collect the results of numerous studies in order to make good conclusions?
Meta-analyses: combination of numerous studies by independent researchers, to make valid conclusions about the data. eg. most studies rejected the null hypothesis, or most studies failed to reject the null hypothesis
Essential goal of ethics:
don't harm people who are helping you
Essential concepts of ethics
- stress and psychological harm
- informed consent
- privacy and confidentiality
- care of animals
- costs and benefits
One should warn participants, who have the right to refuse or to stop participation
stress above that experienced on a daily basis is considered to be too much (eg. seeing pictures of dead bodies; finding oneself in a possible building fire; seeing explicit sexual images)
Stress and psychological harm
One can use these manipulations, but one should forewarn participants of excessive stress to allow them to withdraw. As long as you gain informed consent, ethics committees can approve that study. Participants must know all details in order to give informed consent.
telling someone something that is not true, or leaving out something that they should know
these photos are of convicted murderers vs. false feedback that tells a subject that they are deficient, flawed, or abnormal.
Must only be harmless deception in psychological research
Even if you tell the subject later that you deceived them...
the manipulation may still have an unintended lasting effect. Person may distrust psychologists afterwards, and form a bad opinion of research.
All participants should ideally be given an accurate description of what they will experience in the study, and have the opportunity to decide whether they wish to participate or not.
Must also be told that they can cease participation at any time without penalty.
Informed consent + deception
You cannot describe the study accurately if you are performing deception. Researchers usually deceive by omitting information or being vague
Anonymous: one need not sign a consent form, participation is taken as consent
Not anonymous: mandatory to obtain informed consent
Participant should be debriefed: told about the precise nature of the study
-Gives person satisfaction of knowing exactly what they participated in
-If there was deception, debriefing must be thorough and complete
-Gives participant chance to talk to the experimenter, give feedback, complain etc.
the experimenter doesn't know who contributed data
the experimenter knows who contributed data, but will not tell anyone else. Experimenter protects identities
Why not make everything anonymous?
Not always feasible eg. interviewing families over time-- get to know them intimately. Can store the data separately from the list of names, and then destroy the list after it is no longer needed.
Aggregated data (grouped)
No identification of specific individuals
Used for individuals who are quoted or described. Or initials.
anyone younger than 16 years of age, elderly persons or anyone with a cognitive deficit/ mental disorder should have a guardian sign for them
prisoners are a special case too
Who speaks for special populations?
IRB (Institutional Review Board) in U.S
Ethics Committees in NZ
Review applications to determine whether individuals are sufficiently protected
New Zealand Psychological Society's Ethics Code
All universities have an "ethical treatment of lab animals" code of ethics- must be treated in a "humane" fashion, no unnecessary suffering
Costs v benefits
Each researcher must consider the balance of costs to participants vs. benefits to society
Fraud in research
Changing data to get predicted results (replications expose the truth)
Theory is made up of ... whereas observations are based on ...
We understand constructs through capturing data that represents constructs
Formal representation of constructs
Conceptual variables and operationalizations
Can't measure constructs directly because they are hypothetical, so have to measure them indirectly through variables
Conceptual variables and operationalizations example
Wellbeing (construct): (operationalizations) 5-item scale, no. of smiles, brain scan
Three types of operationalization
Conceptual variables and operationalizations example
Does savouring predict wellbeing?
We want to conduct research based on variables that are...
reliable and valid
We want to conduct research based on variables that are reliable and valid. Why?
Then we have confidence that they are fairly representing the construct and not something else.
a measurement tool that consistently generates a similar empirical estimate
Most-least reliable variables:
stable demographic variables: most
psychological variables rooted in personality: moderate
quickly changing and highly variable variables such as mood: lowest
How do we assess reliability?
most measures of test-retest reliability are simply correlations of scores for the same individuals at two or more points-in-time
value depends on situation eg. you don't want a mood measure to yield a high correlation, but you want gender to be very high
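A test-retest check of this kind can be sketched as a Pearson correlation between two administrations of the same scale. The scores below are hypothetical, and the helper name pearson_r is an assumption for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people at two points in time
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 16, 17, 21]
test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```

Whether a given value counts as good depends on the variable: a stable attribute should give a high correlation, a fast-changing one need not.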
Types of reliability
-test-retest reliability: correlation over time for the same individuals
-internal reliability (Cronbach's Alpha): average level of intercorrelation among all of the items
close to 1: excellent
below 0.5: unacceptable
How to find Cronbach's alpha
Algebraic equation that combines number of items, average variance, and average covariance to come up with final numerical value
More items increase alpha. A higher average covariance among items increases alpha.
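The combination of number of items, average variance, and average covariance described above is commonly written as alpha = (k * average covariance) / (average variance + (k - 1) * average covariance), where k is the number of items. A minimal sketch of that form, using hypothetical item scores:

```python
from itertools import combinations
from statistics import mean, variance

def covariance(x, y):
    """Sample covariance between two items."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def cronbach_alpha(items):
    """items: one list of scores per scale item, same respondents in each."""
    k = len(items)
    avg_var = mean([variance(col) for col in items])
    avg_cov = mean([covariance(a, b) for a, b in combinations(items, 2)])
    return (k * avg_cov) / (avg_var + (k - 1) * avg_cov)

# Hypothetical 3-item scale answered by five respondents
item_1 = [1, 2, 3, 4, 5]
item_2 = [2, 2, 3, 5, 5]
item_3 = [1, 3, 3, 4, 4]
alpha = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha, 3))
```

Re-running cronbach_alpha with one item left out is one way to see whether dropping that item would raise the overall alpha.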
Improving internal reliability?
You can remove items if removing them improves the overall alpha, esp. to shorten the scale
whether the scale yields similar numerical values for the SAME INDIVIDUALS over time
low reliability: might mean scale is psychometrically poor, or that your phenomenon is just inherently unstable
how internally consistent the items of the measure are.
high cronbach's alpha: indicates that the items on the scale tend to correlate with each other to a high degree
A good scale:
will evidence reasonable stability over time, and it will be internally consistent
our scale measures what we intend it to measure
Types of validity
Do the items on the scale relate to or tap the overall construct? Does the following item assess what you are measuring
to what extent does the scale predict expected outcomes?
to what extent does the scale measure the intended hypothetical construct?
(scale, not indiv. items. pay attention to definition of construct)
More types of validity
Convergent validity, discriminant validity
measures the extent to which the scale in question correlates with scales that assess something similar
extent to which a scale does NOT correlate with scales that are expected to be unrelated
looking for NON-significance, not a negative correlation (eg. comparing to an opposite scale would be convergent validity)
Why are reliability and validity good?
We want "good scales," and these are defined as scales possessing reliability and validity
-we want our scales to RELIABLY produce a similar score for the same individuals for attributes that don't change much and those that change moderately
-we want our scales to measure what they are intended to measure, and nothing else.
If you are using a pre-existing scale, you need to be assured that:
- internal reliability is acceptable
- test-retest reliability is good
- the items of the scale seem to capture the intended construct (content validity)
- it has been shown to predict expected outcomes (criterion validity)
- it has been shown to correlate with similar scales (convergent validity)
- it does not correlate with dissimilar scales
Construct validity is the...
highest order, most abstract type of validity, and can only be demonstrated through repeated demonstrations that the scale represents the intended construct in numerous and various contexts-- the real world
good construct validity if numerous studies evidence all of the above-mentioned types of validity
Types of variables/ scales of measurement
Nominal (categorical) variables, ordinal variables, interval (continuous) variables, ratio variables
numerical variables that indicate membership within a particular group eg. men= 0, women= 1, other= 2
based on rankings- only feasible with relatively small groups of comparisons
interval (continuous) variables
variables with numerous obtained numerical values between the maximum and minimum
like interval variables, but has a true zero point.
minimum numerical value has a special meaning
In psychology, the most common type of variable is... Why?
Continuous/interval. Many statistical tests (t-test, ANOVA, regression, etc.) rely on assumptions of equal spacing between points on a scale and normal distributions. Interval and ratio data are more likely to achieve normality than other types (it is impossible for nominal and ordinal
Other types of analyses must be used if your outcome variable is...
nominal or ordinal
nominal or ordinal variables use:
non-parametric tests (can use parametric but only as predictors or IVs)
Ratio and interval variables use:
based on normality
Self report measures advantages
-who knows better than the individual in question?
particularly useful for internal beliefs, attitudes, and emotions that are not evident to other people (eg. depression, anxiety, mindfulness, intentions)
-easier and more efficient
and maybe more accurate than obtaining observations of the person, other people's reports, or neuropsychological indices
self-report measures problems
- response set/bias
- format of the question
- questions tailored for samples
levels of measurement
Categorical/ nominal, ordinal/ ranked, continuous: interval, ratio
Yes/no binary pros and cons
good for children/ simple, very restrictive, lacks richness that other data can give you, limits participant responses
gives participants lots of freedom and range, doesn't constrain anything, good for qualitative studies: when you're not worried about numbers, not good for children: creative answers not useful sometimes, wording must give you some sort of data that is useful to you
Fill-in-the-blank produces... data
categorical data in more of a variable form
Likert produces... data
Multiple choice pros and cons
gives participants options, but doesn't give them all of the options, must be mindful of what answers you offer
Multiple choice produces... data
Yes/no binary produces... data
- yes/no (binary)
- multiple choice
Use of "don't know" in a survey
Must weigh up need for data because it is nice to give option
Sometimes "don't know" response is just as useful
good for kids who aren't good with putting emotions into words
Smiley/frowny faces scale
commonly used to assess liking/disliking
also use to assess pain
easy for participants to respond to this because they're relevant to lots of people: everyone knows what a smiley/frowny face is
BUT there is still ambiguity: guy in middle
good for unique populations (children, low literacy)
eg. visual analogues: good for pain- blank line, cursor along line
Problems with digital administration
- Can't skim forward and backward quickly and easily
- Can't quickly determine how far through the survey you are
- Fonts can be small and hard to read
- Tied to a screen (typically a desk computer or laptop, although tablets and smartphones can work well depending on the survey platform)
- Computers can die and data can be lost
Why digital administration?
- Point-and-click is easier and faster than using pen or pencil on paper
- Can compile data (in Excel or SPSS formats) very quickly and without error
- Can create a survey (a set of self-report questionnaires) more easily, and can edit it more easily
- Almost everyone has a screen to read the questions (although smartphones have small screens)
- Can enact "skip and branch" logic more easily than in paper surveys
Item wording: what to avoid (examples?)
- technical terms
- double-barreled questions
- emotive language
- leading questions
- invasion of privacy
- sensitive topics with young people
It's usually not practical to obtain data from your entire population, so take data from a representative sample and generalise outwards
to what degree can you generalise the findings to a larger group?
If you can't afford to sample everyone in your population, focus on a subset of the population to draw your sample
eg: population: children in NZ
sampling frame: children in Wellington
sample: a subset of children in Wellington
Population and sample examples
Probability Sampling tends to be...
expensive, time-consuming, but it's better than non-probability sampling
types of probability sampling
simple random, stratified random, cluster
Simple random sampling:
every person in the population has an equal chance of being sampled
stratified random sampling:
divide population along dimensions (eg. gender, SES, ethnicity, etc.) and be sure that you sample proportionately across these dimensions
obtain participants from pre-existing groups or clusters. Try to get a random sample of clusters
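The three probability sampling types listed above can be sketched with the standard library. The population, the urban/rural strata, and the school clusters are all hypothetical, and the function names are assumptions for illustration.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every member has an equal chance of selection."""
    return random.Random(seed).sample(population, n)

def stratified_sample(population, stratum_of, n, seed=0):
    """Sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounding means the total can differ slightly from n
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole pre-existing groups and take everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

# Hypothetical population: 60 urban and 40 rural children
population = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
srs = simple_random_sample(population, 10)
strat = stratified_sample(population, lambda p: p[0], 10)

# Hypothetical clusters: three schools of three children each
schools = {"school_a": [1, 2, 3], "school_b": [4, 5, 6], "school_c": [7, 8, 9]}
clus = cluster_sample(schools, 2)
```

The stratified sample keeps the 60/40 split of the population, whereas a simple random sample of 10 may drift from it by chance.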
Non-probability sampling tends to be...
cheaper and easier, but you worry about representativeness
Types of nonprobability sampling:
sample from readily available sources. handy for the researcher. biases are introduced.
obtain appropriate percentages of different types of participants (eg. gender, ethnicity), but one is still obtaining these participants from readily available sources
you select individuals who fit within a particular category to fit a purpose
recruit an initial group of participants, and then you obtain referrals from them to obtain data from their friends and acquaintances. Useful for rare types of participants (e.g. surfers).
Most research psychologists use... because...
non-probability sampling, because it costs a lot of money, time, and effort to obtain probability samples
What to pay attention to about samples when you read research:
who is the most commonly sampled population in psychology research?
are we missing out?
Who is the most commonly sampled population in Psychology research?
Western, Educated, Industrialised, Rich, Democratic (WEIRD)
(+ lots of uni students)
if you're studying children and adolescents and only obtain about 60% parental permissions, what are the other 40% like?
When you have a low response rate, who are the participants you get?
Nonrepresentativeness of the sample bias
when the sampling frame significantly differs from the population, you have introduced biases
nonrepresentativeness of the sample, self-selection bias, ethics
- passive ethical consent for children and adolescents
- compensation and inducements
- interesting ways to collect data: laptops or iPads; internet; testing on cell phones; diary studies; etc.
- underutilised samples: eg. from school to after-school program, to avoid survey fatigue (people don't want to answer lots of surveys) and increase motivation
The role of technology
increases our access to information:
-surveys over internet
-observations of naturally occurring behaviour
-through cell phones and tablets (multi-media portable computers)
-through surveillance of one's use of technological devices
Experience Sampling Method (ESM)
- captures data on an hourly, daily, or weekly basis
- good for rapidly changing variables on relatively small samples
"the experience sampling methods, also referred to as daily diary method, is an intensive longitudinal research methodology that involves asking participants to report on their thoughts, feelings, behaviours, and/or environment on multiple occasions over time"
Advantages of ESM
- capturing phenomena nearer the time that they occur: better memory for events, feelings, and thoughts
- obtaining multiple assessments of variables of interest, and more assessments of a construct yield better reliability and validity
- can identify contexts for important psychological states
Problems with ESM:
- difficult to recruit and retain individuals who don't mind that their day is interrupted by signals to report states
- participants must be comfortable with the recording device
- lots of missing data
- difficult to analyse this type of data: numerous repeated measures
Probability of not making a Type 2 error
The more statistical power we have in our design, the better our decision making
Type 1 error (alpha)
Incorrect decision: we reject null hypothesis, but null hypothesis was true
Type 2 error (beta)
Incorrect decision: we do not reject null hypothesis, but null hypothesis is not true
Values of power (1 - beta)
from 1 (perfect ability to avoid a Type 2 error) to 0 (a Type 2 error every time). Ideally power is on the high side (.80 or so)
Type 2: Not rejecting null hypothesis when it is false: Null hypothesis is false
this means that there probably actually is a difference between your means- you'd find a difference most of the time. Your test is one of the only times you didn't find a difference.
Type 2: Do not reject null hypothesis when it is false: fail to reject
We accept the null hypothesis, and say that there is no difference between the means (our p-value is non-significant: above .05), when in fact there is a difference (null hypothesis is false)
A type 2 error is "not rejecting the null hypothesis when it is false"
in the world, there is a true difference but your statistical test yielded a p-value greater than .05, so you mistakenly do NOT reject the null hypothesis.
Greater statistical power comes from...
larger sample sizes
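The link between sample size and power can be sketched with a normal approximation to a two-sided, two-group comparison. This is an approximation rather than an exact t-test power calculation, and the effect size of 0.5 standard deviations is an assumed value chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) for a two-sided, two-group test.

    Uses the normal approximation; effect_size_d is the true difference
    between group means in standard deviation units (an assumption here).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size_d * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (20, 64, 200):
    print(n, round(approx_power(0.5, n), 2))
```

With no true effect (effect_size_d of 0) the function returns alpha, the Type 1 error rate, and for any non-zero effect power rises as n_per_group grows.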
Options for missing values in a dataset
Ignore the missing values and allow SPSS to perform listwise or pairwise deletions
Impute the missing value with a proper method
an analysis drops only those participants that have a missing value for a variable involved in the analysis
If you conduct correlations on a variety of variables that are missing different values, you get different Ns.
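The difference in Ns between listwise and pairwise deletion can be sketched on a small made-up dataset. This does not reproduce SPSS itself; the variable names and values are hypothetical, and None marks a missing value.

```python
# Hypothetical dataset: None marks a missing value
rows = [
    {"mood": 3, "sleep": 7, "stress": 2},
    {"mood": None, "sleep": 6, "stress": 4},
    {"mood": None, "sleep": 5, "stress": 3},
    {"mood": 2, "sleep": None, "stress": 5},
    {"mood": 5, "sleep": 8, "stress": 1},
    {"mood": 4, "sleep": 7, "stress": 2},
]

def listwise(rows, variables):
    """Keep only participants with complete data on every listed variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]

def pairwise_n(rows, var_a, var_b):
    """Count participants who have data on both variables of one correlation."""
    return sum(1 for r in rows if r[var_a] is not None and r[var_b] is not None)

listwise_n = len(listwise(rows, ["mood", "sleep", "stress"]))
pair_ns = {
    ("mood", "sleep"): pairwise_n(rows, "mood", "sleep"),
    ("mood", "stress"): pairwise_n(rows, "mood", "stress"),
    ("sleep", "stress"): pairwise_n(rows, "sleep", "stress"),
}
print(listwise_n, pair_ns)
```

Listwise deletion gives one N for every analysis, while pairwise deletion gives each correlation its own N.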
Types of imputation
MI (multiple imputation), EM (expectation maximisation), and FIML (full information maximum likelihood)
MI (multiple imputation)
can be computed in SPSS but it is unwieldy because it generates multiple datasets
EM (expectation maximisation)
also can be computed in SPSS, and generates only a single dataset
FIML (full information maximum likelihood)
used in structural equation modelling
Why is imputation good?
it increases the number of participants who have complete data, so your sample size reaches its maximum: it increases power
More power = a lower chance that you'll mistakenly fail to reject the null hypothesis