Reliability and validity are important aspects of selecting a survey instrument. Reliability refers to the extent to which the instrument yields the same results over multiple trials. Validity refers to how well the instrument measures what you intend it to measure. In research, there are three main approaches to validity: content validity, construct validity, and criterion-related validity.
Content validity evaluates how well the items on the scale represent or measure the information you intend to assess. Do the questions you ask represent all the possible questions you could ask?
Construct validity concerns what the calculated scores actually represent and whether you can generalize from them. It relies on statistical analyses, such as correlations, to verify that the questions measure the intended construct. You can correlate scores from an existing, reliable instrument with scores from the instrument under examination to determine whether construct validity is present. A high correlation between the two sets of scores indicates convergent validity, and establishing convergent validity supports construct validity.
Criterion-related validity refers to how well the instrument’s scores predict a known outcome that you expect them to predict. Again, you use statistical analyses, such as correlations, to determine whether criterion-related validity exists: correlate scores from the instrument with a criterion measure the scores are expected to predict. If the correlation exceeds .60, criterion-related validity exists as well.
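As a rough illustration of both correlation checks, the convergent-validity comparison from the previous paragraph and the criterion-related check above, here is a minimal Python sketch using NumPy. The respondent scores, the established comparison instrument, and the criterion outcome are all invented for the example; only the > .60 rule of thumb comes from the text.

```python
import numpy as np

# Illustrative data: total scores for 10 respondents on the new instrument,
# an established instrument measuring the same construct, and a criterion
# outcome the scores are expected to predict (all values are made up).
new_scores = np.array([12, 18, 25, 30, 22, 15, 28, 20, 26, 17])
established_scores = np.array([14, 20, 27, 31, 24, 16, 29, 21, 27, 18])
criterion_outcome = np.array([2.1, 2.8, 3.6, 3.9, 3.1, 2.4, 3.8, 2.9, 3.5, 2.5])

# Convergent evidence for construct validity: correlate the new instrument
# with the established one.
r_convergent = np.corrcoef(new_scores, established_scores)[0, 1]

# Criterion-related validity: correlate the new instrument with the criterion
# and apply the > .60 rule of thumb from the text.
r_criterion = np.corrcoef(new_scores, criterion_outcome)[0, 1]

print(f"convergent r = {r_convergent:.2f}")
print(f"criterion r  = {r_criterion:.2f}, meets .60 threshold: {r_criterion > 0.60}")
```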
You can assess reliability using the test-retest method, alternative form method, internal consistency method, split-halves method, and inter-rater reliability.
The test-retest method administers the same instrument to the same sample at two different points in time, perhaps at one-year intervals. If the scores from the two administrations correlate highly (> .60), you can consider the instrument reliable. The alternative form method requires two different instruments with similar content. You have the same sample take both instruments and then correlate the two sets of scores; if you find high correlations, you can consider the instrument reliable.
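A minimal sketch of the test-retest calculation, again with invented scores; the same correlation applies to the alternative form method if the second array holds scores from the parallel instrument rather than a repeat administration.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative data: the same ten respondents measured at two time points
# (values are made up).
time1 = np.array([12, 18, 25, 30, 22, 15, 28, 20, 26, 17])
time2 = np.array([13, 17, 26, 29, 21, 16, 27, 22, 25, 18])

# Test-retest reliability: correlate the two administrations and apply
# the > .60 rule of thumb from the text.
r, p = pearsonr(time1, time2)
print(f"test-retest r = {r:.2f}, reliable by the .60 rule: {r > 0.60}")
```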
The internal consistency method uses one instrument administered only once. You use coefficient alpha (Cronbach’s alpha) to assess the internal consistency of the items; if the alpha value is .70 or higher, you can consider the instrument reliable. The split-halves method also requires a single administration of one test. You divide the scale items into two halves and correlate them, which estimates the reliability of each half of the test; to estimate the reliability of the entire survey, you then apply the Spearman-Brown correction. Inter-rater reliability involves comparing the observations of two or more raters and assessing how well those observations agree. You can calculate kappa values to quantify that agreement.
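The sketch below pulls these three calculations together using the standard formulas for Cronbach’s alpha, the Spearman-Brown correction, and Cohen’s kappa. The item-response matrix and the rater codes are invented for the example; the .70 threshold is the rule of thumb from the text.

```python
import numpy as np

# Illustrative item-response matrix: 8 respondents x 4 items scored 1-5
# (values are made up).
items = np.array([
    [4, 5, 4, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
    [2, 2, 2, 1],
    [3, 4, 3, 3],
])
k = items.shape[1]

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).
item_vars = items.var(axis=0, ddof=1)
total_var = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}, meets .70 threshold: {alpha >= 0.70}")

# Split-halves: correlate odd-item totals with even-item totals, then apply
# the Spearman-Brown correction to estimate full-length reliability.
half1 = items[:, ::2].sum(axis=1)
half2 = items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(half1, half2)[0, 1]
r_full = (2 * r_half) / (1 + r_half)
print(f"split-half r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")

# Inter-rater reliability: Cohen's kappa for two raters assigning the same
# observations to one of two categories (coded 0/1 here).
rater_a = np.array([1, 0, 1, 1, 0, 1, 0, 0])
rater_b = np.array([1, 0, 1, 0, 0, 1, 0, 1])
p_observed = np.mean(rater_a == rater_b)                  # observed agreement
p_expected = (np.mean(rater_a) * np.mean(rater_b)         # chance agreement
              + np.mean(rater_a == 0) * np.mean(rater_b == 0))
kappa = (p_observed - p_expected) / (1 - p_expected)
print(f"Cohen's kappa = {kappa:.2f}")
```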