Data Cleaning

After data has been collected, it should usually be cleaned, or screened, so that the data to be examined is as ‘perfect’ as it can be. Data cleaning can involve a number of assessments. For example, suppose a survey questionnaire was put online and data was collected via a website. Such surveys usually include a question asking whether the respondent agrees to participate. If a participant selects ‘do not wish or agree to participate,’ his or her responses should not be examined and should be removed from the data set.
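The consent check can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the field name `consent`, the value `"agree"`, and the sample rows are all hypothetical and would need to match however a given survey tool exports its data.

```python
# Hypothetical survey export: one dict per respondent.
# The "consent" field and its values are assumptions made for illustration.
responses = [
    {"id": 1, "consent": "agree", "q1": 4},
    {"id": 2, "consent": "do not agree", "q1": 5},
    {"id": 3, "consent": "agree", "q1": 2},
]


def keep_consenting(rows, consent_field="consent", agree_value="agree"):
    """Return only the rows whose consent field equals the agree value."""
    return [row for row in rows if row.get(consent_field) == agree_value]


consenting = keep_consenting(responses)
print([row["id"] for row in consenting])  # [1, 3]
```

Rows with a missing consent field are also dropped here, on the assumption that consent should be explicit.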

Another data screening assessment checks inclusion criteria. If a participant does not meet the specified inclusion criteria, then his or her responses should not be examined and should be removed from the data set. Specific inclusion criteria depend on the goal of the research. For example, if a study only wants to examine the responses from male participants, any responses that came from females should be removed from the data set. Or, if a study wants to examine a certain age group, then participants who do not fit into that age group should be removed from the data set.

Another assessment that often occurs is the examination of missing cases. Participants are sometimes able to skip questions in the survey questionnaire, leaving blank or missing data. For example, if a study had 40 survey questions and one participant chose to answer only three of them, then that participant contributes very little and should be removed from the data set.
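The inclusion and missing-data checks described above might be combined along the following lines. This is only a sketch under stated assumptions: the age range of 18 to 25, the 50% completion cutoff, and the field names are hypothetical choices for illustration, and a real study would set them from its own design.

```python
def meets_inclusion(row, min_age=18, max_age=25):
    """Hypothetical inclusion rule: keep respondents in a target age range."""
    age = row.get("age")
    return age is not None and min_age <= age <= max_age


def answered_fraction(row, question_fields):
    """Fraction of survey questions that have a non-missing answer."""
    answered = sum(1 for q in question_fields if row.get(q) is not None)
    return answered / len(question_fields)


def screen(rows, question_fields, min_answered=0.5):
    """Keep rows that meet the inclusion rule and answered enough questions.

    min_answered is an assumed cutoff; the right value depends on the study.
    """
    kept = []
    for row in rows:
        if not meets_inclusion(row):
            continue  # outside the target group
        if answered_fraction(row, question_fields) < min_answered:
            continue  # too much missing data to be useful
        kept.append(row)
    return kept


questions = [f"q{i}" for i in range(1, 41)]  # a 40-question survey


def make_row(respondent_id, age, n_answered):
    """Build a sample respondent who answered the first n_answered questions."""
    row = {"id": respondent_id, "age": age}
    for i, q in enumerate(questions):
        row[q] = 3 if i < n_answered else None
    return row


rows = [
    make_row(1, 20, 40),  # in range, complete
    make_row(2, 40, 40),  # outside the age range
    make_row(3, 22, 3),   # in range, but answered only 3 of 40 questions
]
kept = screen(rows, questions)
print([r["id"] for r in kept])  # [1]
```

Keeping the two checks as separate functions makes it easy to report how many participants were removed for each reason.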

Outliers should also be checked for. When examining scores within the data set, it is important not to have extreme values that skew a variable too much. For example, if a study focused on test scores, and the scores averaged around 80, a participant with a test score of 12 would most likely be considered an outlier and should be removed from the data set.
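One common way to flag such values is the interquartile range (IQR) rule, sketched below. The text above does not name a specific method, so the IQR rule, the 1.5 multiplier, and the sample scores are assumptions chosen for illustration; other rules, such as standardized scores, are also widely used.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Uses the conventional 1.5 * IQR fences; k is an assumed default.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical test scores clustered around 80, plus one score of 12.
scores = [78, 82, 80, 85, 79, 81, 77, 83, 12, 80]

outliers = iqr_outliers(scores)
cleaned = [s for s in scores if s not in outliers]
print(outliers)  # [12]
```

Quartile-based fences are used here because a single extreme score barely moves the quartiles, whereas it can noticeably shift the mean and standard deviation in a small sample.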