A decision tree is a widely used data mining technique for classifying multiple covariates (those variables that are co-related to dependent as well as independent variables) to develop a prediction algorithm for the target variable. A decision tree helps classify the data into branch-like sections starting from the root node, internal, and leaf nodes.
Decision tree starts with a root node, and subsequently, it branches into possible outcomes. These outcomes then result in additional nodes until the correct output is achieved. The biggest advantage of using a decision tree is that it presents the necessary information to the user in the most effective manner.
A decision tree is amongst the most supervised machine learning algorithm that is used for regression analysis and problems related to the classification of variables.
Let us consider an example to understand whether an individual is fit or not fit based on food choice and lifestyle using a decision tree:
From the above decision tree, one can quickly figure out that an individual whose age is less than or equal to 30 years and does not eat junk food is healthy. On the other hand, a person over the age of 30 years and exercises regularly is fit, and if the person does not exercise, then he/she is considered unfit.
In the above example, “Age” is the root node, “Eat Junk food” and “Exercise” are the decision node. “Fit” and “Unfit” are terminal nodes.
Let us understand the key terms used in the decision tree.
A decision tree is generally used by businesses while making any decision. Practically, there are ‘n’ numbers of solutions possible using a decision tree, which can be adopted by companies as well as the government of any nation.
After understanding the decision tree, its components, and its application, one would like to know how a decision tree works.
In this section, we would look at specific steps that are followed while making a judgement using a decision tree. Let us consider an example where we have to decide whether Alan should accept a new job role or not using a decision tree. We would follow the below-mentioned steps:
Example:
Alan has an offer of $25,000, and we have to decide if Alan should accept the offer or not. In the below diagram, whether Alan would take the new role or not would not only depend on the salary offered, but he would also look at other convenience like travel time, food and travelling expenses provided by the company or not.
Using a decision tree has several advantages. Some of them are highlighted below: