Decision trees are widely used in data mining, text mining, information extraction, machine learning, pattern recognition and more. But there are many other types of classifiers; you may have heard of neural networks or support vector machines. So why use a decision tree? Well, decision trees are represented graphically as hierarchical structures, which makes them easy to read and understand. In fact, they are among the few models that are interpretable, where you can understand exactly why the classifier makes a decision. They can also handle both numerical and categorical data.
Now that we know why decision trees matter, let's look at what a decision tree actually is.
A decision tree is a classifier expressed as a recursive partition of the instance space, and it is a popular technique in data mining. In a decision tree, each internal node splits the instance space into two or more sub-spaces according to a certain discrete function of the input attribute values. It represents decisions and decision making, and allows the user to take a problem with multiple possible solutions and display it in a simple, easy-to-understand format.
Parts of a decision tree:-
--root node (the topmost node in the tree; it has no incoming edges).
--internal node or test node (denotes a test on an attribute; it has outgoing edges and is represented as a circle).
--branch (denotes an outcome of the test; it is the edge leaving a test node).
--leaf node or decision node (represents a classification or decision; it is denoted as a triangle).
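The parts listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a full implementation: the class names `TestNode` and `Leaf`, the `classify` helper, and the toy "outlook"/"windy" tree are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """Leaf (decision) node: holds the final class label."""
    label: str


@dataclass
class TestNode:
    """Internal (test) node: tests one attribute.

    Each key in `branches` is one outcome of the test, and the value is
    the child node reached by following that branch.
    """
    attribute: str
    branches: Dict[str, Union["TestNode", Leaf]] = field(default_factory=dict)


def classify(node, instance):
    """Start at the root and follow branches until a leaf is reached."""
    while isinstance(node, TestNode):
        outcome = instance[node.attribute]  # result of the test at this node
        node = node.branches[outcome]       # follow the matching branch
    return node.label


# Hypothetical toy tree: the root tests "outlook"; one internal node tests "windy".
root = TestNode("outlook", {
    "overcast": Leaf("play"),
    "rainy": Leaf("stay"),
    "sunny": TestNode("windy", {"yes": Leaf("stay"), "no": Leaf("play")}),
})

print(classify(root, {"outlook": "sunny", "windy": "no"}))  # prints "play"
```

Because every prediction is just a path from the root to a leaf, the reason for a decision can be read off directly from the tests along that path.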
Decision trees used in data mining can be divided into two types, as follows:-
-Regression tree analysis (when the predicted result can be taken as a real number).
-Classification tree analysis (when the predicted result is the class to which the data belongs).
The term Classification And Regression Tree (CART) analysis is an umbrella term used to refer to both of the above.
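The difference between the two types shows up at the leaves. As a rough sketch, assuming the common convention that a classification leaf predicts the majority class of the training examples that reach it and a regression leaf predicts their mean, it might look like this (the function names `classification_leaf` and `regression_leaf` are hypothetical):

```python
from collections import Counter
from statistics import mean


def classification_leaf(labels):
    """Predict the most common class among the training labels in the leaf."""
    return Counter(labels).most_common(1)[0][0]


def regression_leaf(targets):
    """Predict the mean of the training targets in the leaf."""
    return mean(targets)


print(classification_leaf(["spam", "ham", "spam"]))  # prints "spam"
print(regression_leaf([10.0, 12.0, 14.0]))           # prints 12.0
```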
Decision tree algorithms:-
--ID3 (Iterative Dichotomiser 3)
--C4.5 (successor of ID3)
--CART (Classification And Regression Tree)
--CHAID (Chi-squared Automatic Interaction Detector)
--MARS (Multivariate Adaptive Regression Splines)
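These algorithms differ mainly in how they choose the test at each node. The text above does not go into this, but as commonly described, ID3 picks the attribute with the highest information gain (based on entropy), while CART typically uses Gini impurity for classification. The following is a minimal sketch of those two measures on made-up data; the function names and the toy rows are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Parent entropy minus the weighted entropy of the groups
    produced by splitting on `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Made-up toy data: "windy" separates the classes perfectly, "humid" does not.
rows = [
    {"windy": "yes", "humid": "yes"},
    {"windy": "yes", "humid": "no"},
    {"windy": "no", "humid": "yes"},
    {"windy": "no", "humid": "no"},
]
labels = ["stay", "stay", "play", "play"]

# Choose the attribute with the highest information gain, as ID3 would.
best = max(["windy", "humid"], key=lambda a: information_gain(rows, labels, a))
print(best)  # prints "windy"
```

A full algorithm would apply this choice recursively to each group of rows until the leaves are pure or some stopping rule is met.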
For more information on decision trees, refer to the following links:-