Reliability and Validity

Quantitative Methodology

Reliability and validity are important considerations when selecting a survey instrument.  Reliability refers to the extent to which the instrument yields the same results over multiple trials.  Validity refers to the extent to which the instrument measures what it was designed to measure.  In research, there are three ways to approach validity: content validity, construct validity, and criterion-related validity.

Content validity measures the extent to which the items that comprise the scale accurately represent or measure the information that is being assessed.  Are the questions that are asked representative of the possible questions that could be asked?

Construct validity concerns what the calculated scores mean and whether they can be generalized.  Construct validity uses statistical analyses, such as correlations, to verify the relevance of the questions.  Questions from an existing, similar instrument that has been found reliable can be correlated with questions from the instrument under examination to determine if construct validity is present.  If the scores are highly correlated, this is called convergent validity.  If convergent validity exists, construct validity is supported.

Criterion-related validity concerns how well the scores from the instrument predict a known outcome they are expected to predict.  Statistical analyses, such as correlations, are used to determine if criterion-related validity exists.  Scores from the instrument in question should be correlated with a measure of the outcome they are known to predict.  If the correlation is > .60, criterion-related validity is supported.
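The correlation check described above can be sketched in a few lines of Python.  This is a minimal illustration, not a prescribed procedure: the function name pearson_r, the variable names, and the six pairs of scores are all hypothetical, and the .60 cutoff is simply the rule of thumb stated in the text.  The same calculation applies to the convergent validity check, with scores from the existing instrument in place of the outcome.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / sqrt(ss_x * ss_y)

# Hypothetical data: instrument scores and the outcome for six respondents.
instrument_scores = [12, 15, 9, 20, 17, 11]
outcome_scores = [30, 34, 25, 41, 39, 27]

r = pearson_r(instrument_scores, outcome_scores)
# Rule of thumb from the text: a correlation above .60 supports validity.
meets_rule_of_thumb = r > 0.60
print(f"r = {r:.2f}, meets .60 rule of thumb: {meets_rule_of_thumb}")
```

In practice a statistics package would also report a significance test for the correlation; the sketch only computes the coefficient itself.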

Reliability can be assessed with the test-retest method, the alternative form method, the internal consistency method, the split-halves method, and inter-rater reliability.

The test-retest method administers the same instrument to the same sample at two different points in time, for example one year apart.  If the scores from the two administrations are highly correlated (> .60), the instrument can be considered reliable.

The alternative form method requires two different instruments consisting of similar content.  The same sample must take both instruments, and the scores from the two instruments are then correlated.  If the correlation is high, the instrument is considered reliable.

Internal consistency uses one instrument administered only once.  Coefficient alpha (or Cronbach’s alpha) is used to assess the internal consistency of the items.  If the alpha value is .70 or higher, the instrument is considered reliable.

The split-halves method also requires one test administered once.  The items in the scale are divided into two halves, and the scores on the two halves are correlated.  Because that correlation reflects the reliability of a test only half as long as the full instrument, the Spearman-Brown correction must be applied to estimate the reliability of the entire survey.

Inter-rater reliability involves comparing the observations of two or more individuals and assessing how closely the observations agree.  Kappa values can be calculated in this instance.
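Three of the statistics named above (Cronbach's alpha, the Spearman-Brown corrected split-half estimate, and kappa) can be sketched with the Python standard library.  This is an illustrative sketch under stated assumptions: all function names and data are hypothetical, the split-half example uses an odd/even item split (one common choice among several), and the kappa shown is Cohen's kappa for exactly two raters.

```python
from collections import Counter
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def cronbach_alpha(rows):
    """Coefficient alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])  # number of items
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def split_half_reliability(rows):
    """Correlate odd- and even-numbered item halves, then apply Spearman-Brown."""
    first_half = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, ...
    r_half = pearson_r(first_half, second_half)
    # Spearman-Brown correction: step the half-test correlation up to full length.
    return 2 * r_half / (1 + r_half)

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical data: five respondents answering four Likert-type items.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
split_half = split_half_reliability(responses)

# Hypothetical data: two raters coding the same eight observations.
ratings_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes"]
ratings_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]
kappa = cohen_kappa(ratings_a, ratings_b)

print(f"alpha = {alpha:.2f}, split-half = {split_half:.2f}, kappa = {kappa:.2f}")
```

The alpha value would then be compared with the .70 guideline given above.  With more than two raters, a different agreement statistic (such as Fleiss' kappa) would be needed; the sketch does not cover that case.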