CHAID

CHAID stands for Chi-square Automatic Interaction Detector. The technique was created by Gordon V. Kass in 1980. CHAID is a decision tree technique used to discover the relationship between variables: it determines how the predictor variables best combine to explain the outcome in a given dependent variable. CHAID works with categorical (nominal or ordinal) data, and continuous data is converted into ordinal categories during the analysis. In the analysis of contingency tables, CHAID is well suited to deciding which variable is the most important for classification. CHAID can also build non-binary classification trees, in which more than two branches may leave a node. The resulting tree lets us see the relationship between the dependent variable and the related factors visually.

In business and psychology, much research is conducted using surveys, and in most surveys the answers are categorical rather than continuous. Discovering the relationship between categorical variables is challenging, and CHAID is well suited to answering this kind of survey research question.

In CHAID analysis, we develop a decision tree or classification tree. The analysis starts with identifying the target variable, or dependent variable. CHAID splits the target into two or more categories, which are called the initial nodes. Nodes are then split using statistical tests (chi-square tests when the target is categorical). There are two kinds of variables: predictor variables and the target variable. There may be one or more predictor variables, and they may be nominal, ordinal or continuous in nature. There should be one target variable, which may be nominal, ordinal or continuous. An advantage of CHAID over regression analysis is that it does not require the data to be normally distributed.

Like cluster analysis, CHAID analysis merges categories of the independent variables.
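The conversion of a continuous predictor into ordinal categories can be illustrated with a small sketch. This is only one possible approach, assuming equal-frequency (quantile) bins; the function names, the number of bins and the income figures are hypothetical and are not taken from any particular CHAID implementation.

```python
from bisect import bisect_right

def quantile_cut_points(values, n_bins):
    """Return n_bins - 1 cut points that split the sorted values into
    groups of roughly equal size."""
    s = sorted(values)
    n = len(s)
    return [s[(k * n) // n_bins] for k in range(1, n_bins)]

def to_ordinal(values, n_bins=4):
    """Map each continuous value to an ordinal category 0 .. n_bins - 1."""
    cuts = quantile_cut_points(values, n_bins)
    # bisect_right counts how many cut points are <= the value,
    # which is the index of the bin the value falls into.
    return [bisect_right(cuts, v) for v in values]

# Hypothetical incomes (in thousands) binned into four ordinal categories.
incomes = [12, 85, 40, 23, 67, 54, 31, 98]
income_category = to_ordinal(incomes, n_bins=4)
print(income_category)
```

Equal values always land in the same category here, because the category depends only on the value and the cut points.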

Merging: In CHAID analysis, a two-way cross tabulation is formed between the dependent variable and each independent variable, and categories of the independent variable that do not differ significantly with respect to the dependent variable are merged. A Bonferroni-adjusted p-value is then calculated for the merged cross tabulation.
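As a rough illustration of the test behind this step, the sketch below computes a Pearson chi-square p-value for a two-way cross tabulation and applies a simple Bonferroni adjustment. The function names and the counts are hypothetical, and the adjustment simply multiplies the p-value by an assumed number of comparisons; the multiplier used by actual CHAID software depends on the predictor type and the number of merged categories and is not reproduced here.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution for integer df >= 1,
    built up from the closed forms for df = 1 and df = 2."""
    z = x / 2.0
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(z)) if df % 2 else math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def pearson_chi2(table):
    """Chi-square test of independence for a cross tabulation given as a
    list of rows of counts. Assumes no row or column total is zero."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)

def bonferroni(p, n_comparisons):
    """Simple Bonferroni adjustment (an assumed, simplified multiplier)."""
    return min(1.0, p * n_comparisons)

# Hypothetical cross tabulation: rows are two predictor categories,
# columns are two categories of the dependent variable.
crosstab = [[30, 10],
            [10, 30]]
stat, df, p = pearson_chi2(crosstab)
p_adjusted = bonferroni(p, n_comparisons=3)
print(stat, df, p, p_adjusted)
```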

CHAID algorithm: The CHAID algorithm merges the categories of a predictor based on their similarity in relation to the dependent variable. It builds a decision tree by repeatedly splitting a subset of the data into two or more nodes. Merging continues until no non-significant pair of categories remains.
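The merging loop described above might be sketched as follows. This is a simplified illustration, not a full CHAID implementation: it merges the pair of categories whose distributions of the dependent variable are least different (largest chi-square p-value) until every remaining pair differs significantly. The function names, the `alpha_merge` threshold of 0.05 and the counts are assumptions made for this example.

```python
import math
from itertools import combinations

def chi2_p_value(table):
    """Pearson chi-square p-value for a table given as a list of rows of counts."""
    cols = [c for c in zip(*table) if sum(c) > 0]   # drop empty columns
    rows = [r for r in zip(*cols) if sum(r) > 0]    # drop empty rows
    if len(rows) < 2 or len(cols) < 2:
        return 1.0
    df = (len(rows) - 1) * (len(cols) - 1)
    n = sum(map(sum, rows))
    stat = 0.0
    for r in rows:
        for j, observed in enumerate(r):
            expected = sum(r) * sum(cols[j]) / n
            stat += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution for integer df.
    z, a = stat / 2.0, (0.5 if df % 2 else 1.0)
    q = math.erfc(math.sqrt(z)) if df % 2 else math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def merge_categories(counts, alpha_merge=0.05, ordinal=False):
    """counts maps each predictor category to its counts per target category.
    Returns a list of (merged category names, summed counts)."""
    groups = [([name], list(c)) for name, c in counts.items()]
    while len(groups) > 2:
        if ordinal:   # only neighbouring categories may be merged
            pairs = [(i, i + 1) for i in range(len(groups) - 1)]
        else:         # nominal predictor: any pair may be merged
            pairs = list(combinations(range(len(groups)), 2))
        p, i, j = max((chi2_p_value([groups[a][1], groups[b][1]]), a, b)
                      for a, b in pairs)
        if p <= alpha_merge:
            break     # no non-significant pair remains
        merged = (groups[i][0] + groups[j][0],
                  [x + y for x, y in zip(groups[i][1], groups[j][1])])
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.insert(i, merged)
    return groups

# Hypothetical counts of [low risk, high risk] customers per age group.
age_counts = {"18-29": [20, 30], "30-44": [22, 28],
              "45-59": [40, 10], "60+": [42, 8]}
merged_groups = merge_categories(age_counts, ordinal=True)
print(merged_groups)
```

In this made-up example the two younger groups behave alike and the two older groups behave alike, so four age categories collapse into two before the predictor is considered for a split.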

Decision tree components in CHAID analysis:

In CHAID analysis, the following are the components of the decision tree:

  1. Root node: In CHAID analysis, the root node is the dependent variable or target variable. For example, CHAID can be used if a bank wants to predict credit card risk based on information such as age, income and number of credit cards. In this example, the dependent variable, credit card risk, is the root node.
  2. Parent node: The algorithm splits the target variable into two or more categories. These categories are called parent nodes or initial nodes. In the bank example, the high, medium and low risk categories are the parent nodes.
  3. Child node: The independent variable categories that come below the parent categories in the CHAID analysis tree are called child nodes.
  4. Terminal node: The last categories of the CHAID analysis tree are called terminal nodes. In the tree, the categories with the greatest influence on the dependent variable come first and the less important categories come last, which is why the last ones are called terminal nodes.
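The components above can be pictured with a minimal tree structure. The sketch below is only an illustration: the `Node` class, the `terminal_nodes` helper and the labels are hypothetical, loosely following the bank example, and the node with three children shows the kind of non-binary split CHAID allows.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_terminal(self) -> bool:
        # A terminal node has no further splits below it.
        return not self.children

def terminal_nodes(node):
    """Collect the labels of all terminal nodes, depth first."""
    if node.is_terminal():
        return [node.label]
    labels = []
    for child in node.children:
        labels.extend(terminal_nodes(child))
    return labels

# Hypothetical tree: the root is the target variable, split three ways by
# income; the low-income node is split again by age.
root = Node("credit card risk", [
    Node("income: low", [
        Node("age: under 30"),
        Node("age: 30 and over"),
    ]),
    Node("income: medium"),
    Node("income: high"),
])
print(terminal_nodes(root))
```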

CHAID and SPSS:

Not all statistical software supports this technique, but SPSS has an option to perform the analysis. In SPSS, open the “Analyze” menu, choose “Classify” and then select “Tree”. Select the dependent and independent variables and any other necessary options.
