Transforming variables to meet an assumption

  • Transforming variables can correct for outliers and for violations of assumptions (normality, linearity, and homoscedasticity/homogeneity of variance); however, interpretation is then limited to the transformed scores.
    • Normality assumes that the dependent variables are normally distributed (symmetrical and bell-shaped) for each group
    • Homogeneity of variance assumes that groups have equal error variances
    • Linearity assumes a straight line relationship between the variables
    • Homoscedasticity assumes that the variability of scores around the regression line is roughly constant across all values of the predictor(s)
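
As a rough illustration of screening for two of these assumptions, the sketch below computes the skewness of each group (a symmetric distribution has skewness near 0) and the ratio of the largest to the smallest group variance. The scores and the helper names `skewness` and `variance_ratio` are made up for this example, and these are descriptive screening statistics rather than formal tests of normality or homogeneity.

```python
import statistics


def skewness(scores):
    """Skewness as the third standardized moment (population form)."""
    n = len(scores)
    mean = statistics.fmean(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical scores for two groups (invented for illustration only).
group_a = [2, 3, 3, 4, 4, 4, 5, 5, 6]      # symmetric around 4
group_b = [1, 1, 2, 2, 3, 4, 8, 15, 30]    # long right tail

skew_a = skewness(group_a)
skew_b = skewness(group_b)
ratio = variance_ratio([group_a, group_b])

print(f"skewness, group A: {skew_a:.2f}")
print(f"skewness, group B: {skew_b:.2f}")
print(f"largest/smallest variance ratio: {ratio:.1f}")
```

In this made-up data, group B is strongly positively skewed and far more variable than group A, which is the kind of pattern that would prompt a closer look at normality and homogeneity of variance.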
  • Violations of homogeneity can usually be corrected by transforming the dependent variable (DV). Alternatively, instead of transforming the DV, use a more stringent alpha level with the untransformed DV.
  • Ensure that the transformed variable(s) meet the assumptions (such as normality and few or no outliers). Often it is not clear in advance which transformation will best meet the assumptions, so finding one is a matter of trial and error.
  • If some variables are skewed and others are not, transforming the skewed variables usually improves the analysis; however, that is not always the case.
  • Examining the means of transformed scores is comparable to examining the medians of untransformed scores. A transformation affects the mean but not the median, because the median depends only on rank order and these transformations preserve (or exactly reverse) rank order. Therefore, conclusions about the means of transformed variables apply to the medians of the untransformed variables. Interpret results accordingly.
  • Examples of transformations include taking the square root of the variable(s), taking the natural logarithm, and taking the multiplicative inverse (1/x). For negatively skewed variables, reflect the variable first and then apply the appropriate transformation.
  • According to Tabachnick & Fidell (2007), to reflect a variable, find the largest score in the distribution and add 1 to it; this forms a constant that is larger than any score in the distribution. Create a new variable by subtracting each score from the constant. Interpret the reflected variable appropriately: either reverse the direction of the interpretation or re-reflect the variable after transforming it. Keep in mind that reflection reverses the scale, so scores that were high before reflection are low afterward, and vice versa.
  • To transform for normality: According to Bradley (1982), taking the inverse of the scores is the best of several alternatives for skewed (or J-shaped) distributions.  However, according to Tabachnick & Fidell (2007), this alternative may not render the distribution normal.
  • Example: A multiple linear regression is proposed with GPA scores and IQ scores predicting security scores (the dependent or outcome variable). Normality of the security scores (where 5 = highly secure and 1 = not at all secure) was assessed with a Kolmogorov-Smirnov (KS) test. The test was significant, indicating that the assumption of normality was not met. Homoscedasticity was assessed with residual plots, and that assumption was also not met. Because of these violations, the dependent variable (security scores) was transformed according to the recommendations described by Tabachnick and Fidell (2007): the natural logarithm of the dependent variable was used. The assumptions of normality and homoscedasticity were re-assessed on the transformed variable, and both were met. The regression was then run using the transformed dependent variable, the natural logarithm of security scores. Results of further analyses must be interpreted in terms of the natural logarithm of security scores.
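
The transformations and the reflection procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the scores are invented, the helper names `skewness` and `reflect` are hypothetical, and skewness is used here only as a quick descriptive check on how each transformation changes the shape of the distribution.

```python
import math
import statistics


def skewness(scores):
    """Skewness as the third standardized moment (population form)."""
    n = len(scores)
    mean = statistics.fmean(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5


def reflect(scores):
    """Reflect scores: subtract each score from (largest score + 1)."""
    constant = max(scores) + 1
    return [constant - x for x in scores]


# Hypothetical positively skewed scores (all > 0, so log and 1/x are defined).
raw = [1, 1, 2, 2, 3, 4, 8, 15, 30]

sqrt_scores = [math.sqrt(x) for x in raw]
log_scores = [math.log(x) for x in raw]    # natural logarithm
inverse_scores = [1 / x for x in raw]      # note: reverses the rank order

for label, scores in [("raw", raw), ("square root", sqrt_scores),
                      ("natural log", log_scores), ("inverse", inverse_scores)]:
    print(f"{label:12s} skewness = {skewness(scores):.2f}")

# Hypothetical negatively skewed scores: reflect first, then transform.
neg = [1, 16, 23, 27, 28, 29, 29, 30, 30]
reflected = reflect(neg)                   # constant is 30 + 1 = 31
reflected_log = [math.log(x) for x in reflected]
print(f"negatively skewed, raw:        {skewness(neg):.2f}")
print(f"after reflecting and log:      {skewness(reflected_log):.2f}")

# The median follows the transformation; the mean does not.
print("median of logs:", statistics.median(log_scores))
print("log of median: ", math.log(statistics.median(raw)))
print("back-transformed mean of logs:", math.exp(statistics.fmean(log_scores)))
print("mean of raw scores:           ", statistics.fmean(raw))
```

Because the reflected scores run in the opposite direction from the originals, any result based on `reflected_log` has to be read with the direction reversed, as noted above. The last four lines illustrate the point about medians: the median of the logged scores equals the log of the original median (with an odd number of scores), while the back-transformed mean of the logged scores differs from the mean of the raw scores.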