To call the algorithm, we need to get the list of attributes and an initial decision tree node.
602 Chapter 7 Learning
Data Structures and Interfaces
The pseudo-code assumes that examples have the following interface:
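The interface listing did not survive in this copy. As a minimal sketch (the `getValue` method and `action` member are assumptions based on the surrounding text, not the original listing), it might look like:

```python
class Example:
    """One observation: a set of attribute values and the action taken."""

    def __init__(self, values, action):
        self.values = values  # e.g., {"health": 39, "ammo": 12}
        self.action = action  # the action this example demonstrates

    def getValue(self, attribute):
        # Return this example's value for the given attribute.
        return self.values[attribute]
```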
The initial decision node can simply be created empty. So the call may look something like this:
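The call itself was lost from this copy. In the same pseudo-code style it might look something like the following (`makeTree`, `DecisionNode`, and the attribute names are assumptions, not the original listing):

```
examples = loadExamples()              # the gathered observations
attributes = [HEALTH, AMMO, IN_COVER]  # hypothetical attribute list
initialNode = DecisionNode()           # created empty
makeTree(examples, attributes, initialNode)
# initialNode is now the root of the learned tree
```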
7.5 Decision Tree Learning 603
the attribute values must be divided into a small number of discrete categories (usually two). This division can be performed automatically as an independent process, and with the categories in place, the rest of the decision tree learning algorithm remains identical.
We repeat the process by moving the lowest valued example from category B into category A and calculating the information gain in the same way. Whichever division gave the greatest information gain is used as the division. To enable future examples, not in the set, to be correctly classified by the resulting tree, we need a numeric threshold value. This is calculated by finding the average of the highest value in category A and the lowest value in category B.
This process works by trying every possible position to place the threshold that will give different daughter sets of examples. It finds the split with the best information gain and uses that.
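As a sketch of this search (function and variable names are my own; the example actions attached to the values 25 and 50 below are assumptions, since only 17 Defend and 39 Attack survive in this copy), using the standard Shannon entropy over the actions:

```python
from math import log2

def entropy(actions):
    # Shannon entropy of the action distribution, in bits.
    total = len(actions)
    counts = {}
    for a in actions:
        counts[a] = counts.get(a, 0) + 1
    return -sum((c / total) * log2(c / total) for c in counts.values())

def best_threshold(examples):
    """examples: list of (value, action) pairs for one continuous attribute.
    Tries every split position between sorted values and returns the
    (threshold, gain) pair with the highest information gain."""
    examples = sorted(examples)
    actions = [a for _, a in examples]
    base = entropy(actions)
    bestGain, bestThreshold = -1.0, None
    for i in range(1, len(examples)):
        # Category A holds the first i examples, category B the rest.
        a, b = actions[:i], actions[i:]
        gain = base - (len(a) * entropy(a) + len(b) * entropy(b)) / len(examples)
        if gain > bestGain:
            bestGain = gain
            # Threshold: average of highest value in A and lowest in B.
            bestThreshold = (examples[i - 1][0] + examples[i][0]) / 2
    return bestThreshold, bestGain
```

With the values from the worked example, `best_threshold([(17, "Defend"), (25, "Defend"), (39, "Attack"), (50, "Attack")])` places the threshold at 32.0, midway between 25 and 39, with an information gain of 1 bit.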
[Tables: the example attribute values 17, 25, 39, and 50 (e.g., 17 Defend, 39 Attack) listed against categories A and B, one table per candidate division as examples are moved from category B into category A.]
Pseudo-Code
We can incorporate this threshold step in the splitByAttribute function from the previous pseudo-code.
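The listing itself is not reproduced in this copy. A sketch in the same pseudo-code style (`informationGain` is assumed to be the function defined with the earlier pseudo-code; the structure here is a reconstruction, not the original listing):

```
def splitByAttribute(examples, attribute):
    # Sort the examples by their value for this attribute.
    sorted = sortByAttributeValue(examples, attribute)

    bestGain = 0
    bestSets = None
    bestThreshold = None

    # Try every position for the division between A and B.
    for i in 1..len(sorted)-1:
        setA = sorted[0..i]    # the lowest valued examples
        setB = sorted[i..end]  # the rest
        gain = informationGain(sorted, setA, setB)
        if gain > bestGain:
            bestGain = gain
            bestSets = (setA, setB)
            # Average of the highest value in A and the lowest in B.
            bestThreshold = (setA[last].getValue(attribute) +
                             setB[0].getValue(attribute)) / 2

    return bestSets, bestThreshold
```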
Data Structures and Interfaces
We have used the list of examples as a stack in the code above. An object is removed from one list and added to another list using push and pop. Many collection data structures have these fundamental operations. If you are implementing your own lists, using a linked list, for example, this can be achieved simply by moving the “next” pointer from one list to another.
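For instance, in a minimal singly linked stack (an illustrative sketch, not the book's implementation), moving an example between lists rewires one pointer and copies no data:

```python
class Node:
    """A linked-list cell holding one example."""
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedStack:
    def __init__(self):
        self.head = None

    def push(self, node):
        # Reuse the node object: only its next pointer changes.
        node.next = self.head
        self.head = node

    def pop(self):
        node = self.head
        self.head = node.next
        node.next = None
        return node

def move(fromList, toList):
    # Move the top node from one list to the other.
    toList.push(fromList.pop())
```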
A similar approach can be used to create more than one threshold value. As the number of splits increases, there is an exponential increase in the number of different scenarios that need to have their information gain calculated.
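To see the blow-up (a quick illustrative calculation, not from the book): with n sorted example values there are n-1 gaps, so placing k thresholds gives C(n-1, k) candidate divisions, each needing an information-gain evaluation; summed over all k this reaches 2^(n-1).

```python
from math import comb

def candidateDivisions(n, k):
    # Number of ways to place k thresholds in the n-1 gaps
    # between n sorted attribute values.
    return comb(n - 1, k)
```

For 20 example values, one threshold gives 19 candidate divisions, but four thresholds already give 3876.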
There are several algorithms for splitting input data into multiple categories with the lowest entropy. In general, the same thing can also be achieved using any classification algorithm, such as a neural network.