Validity and Reliability

Professor Dr. Md. Abdur Rahim, PhD
MBBS, DPH, Dip (H-Econ), M.Phil
University of Oslo, Norway

08/15/25 Dr. M A Rahim,PhD 1


What is validity?
Validity refers to how accurately a method measures what it is intended to measure.

Validity is the ability of a test to indicate which individuals have the disease and which do not.

If research has high validity, it produces results that correspond to real properties, characteristics, and variations in the physical or social world.

Validity is analogous to accuracy

The validity of a test is how well the given test reflects another test of known greater accuracy.

Validity assumes that there is a gold standard to which a test can be compared.


What is reliability?
Reliability refers to how consistently a method measures something.

• If the same result can be consistently achieved by using the same methods under the same circumstances, the measurement is considered reliable.
• You measure the temperature of a liquid sample several times under identical conditions.
• The thermometer displays the same temperature every time, so the results are reliable.

Reproducibility, repeatability, reliability

All mean that the results of a test or measure are identical or closely similar each time it is conducted.

Because of variation in laboratory procedures, observers, or changing conditions of test subjects (such as time, location), a test may not consistently yield the same result when repeated.

Different types of variation
Intra-observer variation

• A doctor uses a symptom questionnaire to diagnose a patient with a long-term medical condition.
• Several different doctors use the same questionnaire with the same patient but give different diagnoses.
• This indicates that the questionnaire has low reliability as a measure of the condition.
• Reliability and validity are closely related, but they mean different things. A measurement can be reliable without being valid. However, if a measurement is valid, it is usually also reliable.

For example,
• If you measure a cup of rice three times, and you get the same result each time, that result is reliable.
• The validity, on the other hand, refers to the measurement's accuracy. This means that if the standard weight for a cup of rice is 5

Test of Validity
• Sensitivity
  – the ability of a test to correctly identify those who have a disease
  – a test with high sensitivity will have few false negatives
• Specificity
  – the ability of a test to correctly identify those who do not have the disease
  – a test that has high specificity will have few false positives


• High reliability is one indicator that a measurement is valid.
• If a method is not reliable, it probably isn’t valid.
• If the thermometer shows different temperatures each time, even though you have carefully controlled conditions to ensure the sample’s temperature stays the same, the thermometer is probably malfunctioning, and therefore its measurements are not valid.


Types of validity
Research validity is categorized into four main types:
1. Construct validity
2. Content validity
3. Face validity
4. Criterion validity

1. Construct Validity
Construct validity evaluates whether a particular measurement tool actually represents the thing that we want to measure. It plays a key role in establishing the overall validity of a method.

2. Content Validity
Content validity evaluates whether a test represents the different aspects of a specific construct. To generate valid results, the content of the survey, test, or other measurement method you use must cover the relevant &

3. Face Validity
Face validity considers how appropriate the content of a test looks on the surface. It is similar to content validity, but it is a more subjective and informal type of assessment.

4. Criterion Validity
Criterion validity evaluates how closely a test’s results correspond to another test’s results.
What is a criterion?
A criterion is an external measurement of a similar thing. In other words, it is a widely used and established test that

Conclusion
Research validity is the backbone of credible and meaningful research. By ensuring that your study accurately measures what it is intended to measure, you can draw valuable conclusions. The different types of validity work together to provide a comprehensive assessment of your research’s credibility. By prioritizing validity in research, you can contribute to the advancement of knowledge and create a more accurate understanding of the world.

The 4 Types of Reliability in Research: Definitions & Examples
Reliability tells you how consistently a method measures something. When you apply the same method to the same sample under the same conditions, you should get the same results. If not, the method of measurement may be unreliable or bias may have crept into your research.

There are four main types of reliability. Each can be estimated by comparing different sets of results produced by the same method.


Type of reliability     Measures the consistency of…
Test-retest             The same test over time.
Interrater              The same test conducted by different people.
Parallel forms          Different versions of a test which are designed to be equivalent.
Internal consistency    The individual items of a test.
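As a rough illustration of test-retest reliability, the sketch below correlates two rounds of scores from the same people. The scores are invented for illustration, and using Pearson's correlation as the consistency estimate is an assumption; the slides do not prescribe a particular statistic.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five people tested twice under the same conditions
first_round = [10, 12, 14, 16, 18]
second_round = [11, 12, 15, 15, 19]

r = pearson_r(first_round, second_round)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 would suggest that the method ranks people consistently over time; a low value would point to an unreliable measure or to real change between the two rounds.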


Reliability of Tests
Sources of variability that can affect the reproducibility of results of a test:

1. Biological variation (e.g. blood pressure)
2. Reliability of the instrument itself
3. Intra-observer variability (differences in repeated measurement by the same screener)
4. Inter-observer variability (inconsistency in the way different screeners apply or interpret test results)

Validity and reliability in research findings


Type I & Type II Errors | Differences, Examples, Visualizations
In statistics, a Type I error is a false positive conclusion, while a Type II error is a false negative conclusion.
Making a statistical decision always involves uncertainties, so the risks of making these errors are unavoidable in hypothesis testing.

The probability of making a Type I error is the significance level, or alpha (α), while the probability of making a Type II error is beta (β). These risks can be minimized through careful planning in your study design.

Example: Type I vs Type II error
You decide to get tested for COVID-19 based on mild symptoms. There are two errors that could potentially occur:
• Type I error (false positive): the test result says you have coronavirus, but you actually don’t.
• Type II error (false negative): the test result says you don’t have coronavirus, but you actually do.


Error in statistical decision-making
Using hypothesis testing, you can decide whether your data support or disprove your research predictions, stated as null and alternative hypotheses.
Hypothesis testing starts with the assumption of no difference between groups or no relationship between variables in the population. This is the null hypothesis.
It is always paired with an alternative hypothesis, which is your research prediction of an actual difference between groups or a true relationship between variables.

Example: Null and alternative hypothesis
You test whether a new drug intervention can alleviate symptoms of an autoimmune disease.


In this case:
• The null hypothesis (H₀) is that the new drug has no effect on symptoms of the disease.
• The alternative hypothesis (H₁) is that the drug is effective for alleviating symptoms of the disease.

Then, you decide whether the null hypothesis can be rejected based on your data and the results of a statistical test. Since these decisions are based on probabilities, there is always a risk of drawing the wrong conclusion.



If your results show statistical significance, that means they are very unlikely to occur if the null hypothesis is true. In this case, you would reject your null hypothesis. But sometimes, this may actually be a Type I error.

If your findings do not show statistical significance, they have a high chance of occurring if the null hypothesis is true. Therefore, you fail to reject your null hypothesis. But sometimes, this may be a Type II error.

Example: Type I and Type II errors
A Type I error happens when you get false positive results: you conclude that the drug intervention improved symptoms when it actually didn’t. These improvements could have arisen from other random factors or measurement errors.

A Type II error happens when you get false negative results: you conclude that the drug intervention didn’t improve symptoms when it actually did. Your study may have missed key indicators of improvements or attributed any improvements to other factors instead.


A Type I error occurs if a null hypothesis that is actually true in the population is rejected. This type of error is a false positive.
Alternatively, a Type II error occurs if a null hypothesis that is actually false in the population is not rejected. This type of error is a false negative.
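To make the link between α and the Type I error concrete, here is a small simulation sketch. It repeatedly draws samples from a population in which the null hypothesis is true and counts how often a two-sided z-test still rejects it. The choice of a z-test with known standard deviation, the sample size, the number of trials, and the seed are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def rejects_null(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Fraction of trials that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # Data come from the null distribution, so any rejection is a Type I error
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rejects_null(sample, alpha=alpha):
            false_positives += 1
    return false_positives / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

Over many trials the observed rate should settle near the chosen α, which is what it means for α to be the probability of a Type I error.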


What Causes Type II Errors?
A Type II error is commonly caused by the statistical power of a test being too low. Power is the probability of correctly rejecting a false null hypothesis, i.e. 1 − β.
The higher the statistical power, the greater the chance of avoiding a Type II error.
It’s often recommended that a study be designed for a statistical power of at least 80% before any testing is conducted.
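The relationship between sample size and power can be sketched with a simple calculation. The example below assumes a two-sided one-sample z-test and a standardized effect size of 0.5; both are illustrative assumptions, not values taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect_size is the true mean shift in standard-deviation units.
    """
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)
    # Probability that the test statistic lands in either rejection region
    return (1 - std_normal.cdf(critical - shift)) + std_normal.cdf(-critical - shift)

for n in (10, 20, 32, 50):
    power = z_test_power(effect_size=0.5, n=n)
    print(f"n={n:3d}  power={power:.2f}  beta={1 - power:.2f}")
```

Under these assumptions, power rises with sample size, and a sample of about 32 is roughly where it crosses the 80% mark.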


Intra-subject variation is a variation in the results of a test conducted over (a short period of) time on the same individual.
The difference is due to the changes (such as physiological, environmental, etc.) occurring to that individual over that time period.


Inter-observer variation is a variation in the result of a test due to multiple observers examining the result (inter = between).
Intra-observer variation is a variation in the result of a test due to the same observer examining the result at different times (intra = within).
The difference is due to the extent to which observer(s) agree or disagree when interpreting the same test result.
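One simple way to put a number on observer agreement is the share of cases on which two readings match. The sketch below uses invented readings and plain percent agreement, which is an assumption made for illustration; it does not correct for agreement that would occur by chance.

```python
def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two sets of readings give the same result."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both lists must cover the same cases")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

# Hypothetical readings of the same ten test results
observer_1 = ["+", "+", "-", "-", "+", "-", "-", "+", "-", "-"]
observer_2 = ["+", "-", "-", "-", "+", "-", "+", "+", "-", "-"]
observer_1_repeat = ["+", "+", "-", "-", "+", "-", "-", "+", "+", "-"]

inter = percent_agreement(observer_1, observer_2)         # between observers
intra = percent_agreement(observer_1, observer_1_repeat)  # same observer, two occasions
print(f"inter-observer agreement: {inter:.0%}")
print(f"intra-observer agreement: {intra:.0%}")
```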

Validity of Tests
Key Measures
• Sensitivity
• Specificity
• Positive Predictive Value
• Negative Predictive Value


Sensitivity
• Proportion of individuals who have the disease who test positive (a.k.a. true positive rate)
• Tells us how well a “+” test picks up disease

                       Disease
                       Yes     No      Total
Screening test   +     a       b       a+b
                 -     c       d       c+d
Total                  a+c     b+d     N

Sensitivity = a / (a + c)

Specificity
• Proportion of individuals who don’t have the disease who test negative (true negative rate)
• Tells us how well a “-” test detects no disease

Using the same 2×2 table:

Specificity = d / (b + d)

Predictive Value
• Measures whether or not an individual actually has the disease, given the results of a test
• Affected by
  – specificity
  – prevalence of preclinical disease
  – sensitivity
• Prevalence = (a + c) / (a + b + c + d)

Positive Predictive Value
• Proportion of individuals who test positive who actually have the disease

P.P.V. = a / (a + b)

Negative Predictive Value
• Proportion of individuals who test negative who don’t have the disease

N.P.V. = d / (c + d)

A test is used in 50 people with disease and 50 people without. These are the results.

                   Disease
                   Present   Absent   Total
Test   Positive    48        3        51
       Negative    2         47       49
Total              50        50       100

Sensitivity = 48/50 = 96%
Specificity = 47/50 = 94%
Positive Predictive Value = 48/51 = 94.1%
Negative Predictive Value = 47/49 = 95.9%
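The four measures can be computed directly from the cells of the 2×2 table. The following is a minimal sketch; the function name `screening_measures` and the dictionary keys are chosen here for illustration, and the counts are those of the 50-versus-50 example.

```python
def screening_measures(a, b, c, d):
    """Validity measures from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }

results = screening_measures(a=48, b=3, c=2, d=47)
for name, value in results.items():
    print(f"{name}: {value:.1%}")
```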

Validity of Tests

                        True Disease Status
                        +       -
Results of test   +     a       b
                  -     c       d

a = true positive
b = false positive
c = false negative
d = true negative
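In practice the cells a, b, c and d are tallied from individual records that pair each person's true disease status with the test result. The sketch below shows one way to do that tally; the record format (a pair of booleans) and the sample records are hypothetical.

```python
from collections import Counter

def two_by_two(records):
    """Count a, b, c, d from (has_disease, tests_positive) pairs."""
    counts = Counter(records)
    a = counts[(True, True)]    # diseased, test positive: true positive
    b = counts[(False, True)]   # not diseased, test positive: false positive
    c = counts[(True, False)]   # diseased, test negative: false negative
    d = counts[(False, False)]  # not diseased, test negative: true negative
    return a, b, c, d

# Hypothetical records: (true disease status, test result)
records = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
a, b, c, d = two_by_two(records)
print(a, b, c, d)
```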

Sensitivity: The probability of testing positive if the disease is truly present
Sensitivity = a / (a + c)

Specificity: The probability of screening negative if the disease is truly absent
Specificity = d / (b + d)

Validity of Tests

                                      Breast Cancer
                                      +        -
Physical exam and mammography   +     132      983
                                -     45       63650

Sensitivity = a / (a + c) = 132 / (132 + 45) = 74.6%
Specificity = d / (b + d) = 63650 / (983 + 63650) = 98.5%

Sensitivity: Screening by physical exam and mammography will identify 75% of all true breast cancer cases.
Specificity: Screening by physical exam and mammography will correctly classify 98.5% of all non-breast cancer patients as being disease free.

Performance Yield

Predictive value positive (PV+): The probability that a person actually has the disease given that he or she tests positive.
PV+ = a / (a + b)

Predictive value negative (PV-): The probability that a person is truly disease free given that he or she tests negative.
PV- = d / (c + d)

                        True Disease Status
                        +       -
Results of test   +     400     995
                  -     100     98905

Sensitivity: a / (a + c) = 400 / (400 + 100) = 80%
Specificity: d / (b + d) = 98905 / (995 + 98905) = 99%
PV+: a / (a + b) = 400 / (400 + 995) = 29%
PV-: d / (c + d) = 98905 / (100 + 98905) = 99.9%

Among persons who screen positive, 29% are found to have the disease.
Among persons who screen negative, 99.9% are found to be disease free.

Factors that influence PV+ and PV-

1. The more specific the test, the higher the PV+
2. The higher the prevalence of preclinical disease in the screened population, the higher the PV+
3. The more sensitive the test, the higher the PV-
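These three factors can be explored numerically. The sketch below derives PV+ and PV- from sensitivity, specificity, and prevalence using the same cell definitions as the 2×2 table, expressed as proportions of the population. The sensitivity (80%) and specificity (99%) are those of the Performance Yield example; the prevalence values are assumptions chosen only to show the trend.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PV+ and PV- for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    pv_pos = true_pos / (true_pos + false_pos)
    pv_neg = true_neg / (true_neg + false_neg)
    return pv_pos, pv_neg

# Same test characteristics throughout; only the prevalence changes
for prevalence in (0.001, 0.005, 0.05, 0.20):
    pv_pos, pv_neg = predictive_values(0.80, 0.99, prevalence)
    print(f"prevalence={prevalence:.3f}  PV+={pv_pos:.1%}  PV-={pv_neg:.1%}")
```

With the test held fixed, PV+ climbs steeply as prevalence rises, which is why the same screening test yields far more false alarms in a low-prevalence population.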
THANKS FOR YOUR ATTENTION
