Catalogue of Artificial Intelligence Techniques


Jump to: Top | Entry | References | Comments

View Maths as: Images | MathML

ID3 Algorithm

Aliases: ID3 Approach

Keywords: classification, nonnumeric, pattern, recognition

Categories: Pattern Recognition and Image Processing

Author(s): Joe Bain

This algorithm is a procedure for generating efficient discrimination trees for elements that have non-numeric values or attributes. The algorithm works on a set of training data, which has been sorted into classes, and creates a tree which divides the data based on common attributes. The tree can then be used to classify real-world data of the same variety. It is best used on very large sets of data where each element is made up of a long list of attributes. There are more sophisticated techniques, such as the Procedure M, but these are not so efficient.

The algorithm works by dividing the given data based on each of the different attributes the elements have and determines the increase or decrease in mixed-class branches after each divide. The divide which produces the most sorted set of branches (or least entropy) in terms of class is finalised and the algorithm moves onto the second stage to repeat the procedure on each of the branches. This continues until each lowest branch contains only elements of one class.

The main advantages of the ID3 algorithm are that it is easily implemented, being quite a simple process, and its running time increases only linearly with the complexity of the problem. The main disadvantage is that the tree cannot be updated when new data is classified incorrectly, instead a new tree must be generated.



Add Comment

No comments.