Catalogue of Artificial Intelligence Techniques
Aliases: Ensemble learning
Keywords: boosting, classification, ensemble
Author(s): Chris Jeffery
Boosting is a useful technique in binary classification, the separation of data into two classes. It is a learning method that builds an ensemble of weighted hypotheses, each generated by some weak learner, which together form a single more accurate hypothesis.
Consider a problem in which a collection of data must be separated into two classes, A and B, where some unknown function maps each data object to its classification.
Given a sample set and a number of iterations k:
Initially, every object in the sample set is assigned an equal weighting, forming a uniform distribution.
loop for i = 1 ... k
- A weak learner builds a hypothesis H(i) based on the weighted sample set
- A training error E(i) is calculated for H(i) over the weighted sample set, and the algorithm terminates early if E(i) reaches 0.5, since the weak learner is then doing no better than chance
- A weighting W(i) is assigned to H(i) so as to minimise some loss function, intended to reflect the expected misclassification of data outside the sample set.
- Each object in the sample set is then assigned a new weighting: the weightings of misclassified objects are increased the most, while those of correctly classified objects furthest from the decision boundary are decreased the most. The weightings are then rescaled to form a distribution.
The final hypothesis is a weighted vote of all k hypotheses: the sign of the sum W(1)H(1) + ... + W(k)H(k).
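The loop above can be sketched as a minimal AdaBoost-style implementation, assuming one-dimensional data and threshold "decision stumps" as the weak learner. The function names train_adaboost and predict are illustrative, and the closed-form hypothesis weight W(i) = 0.5 ln((1 - E(i)) / E(i)) is the particular choice made by AdaBoost, not mandated by the entry:

```python
import math

def train_adaboost(xs, ys, k):
    """Boost 1-D threshold stumps for k rounds.

    xs: list of floats; ys: list of +1/-1 labels.
    Returns a list of weighted hypotheses (alpha, threshold, polarity),
    where the stump predicts `polarity` when x >= threshold, else -polarity.
    """
    n = len(xs)
    w = [1.0 / n] * n                      # uniform initial distribution
    ensemble = []
    for _ in range(k):
        # Weak learner: exhaustively pick the stump with least weighted error
        best = None
        for t in sorted(set(xs)):
            for pol in (+1, -1):
                err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                          if (pol if xi >= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        if err >= 0.5:                     # no better than chance: stop early
            break
        eps = max(err, 1e-10)              # guard against log(1/0)
        alpha = 0.5 * math.log((1 - eps) / eps)   # hypothesis weight W(i)
        ensemble.append((alpha, t, pol))
        # Re-weight: raise misclassified objects, lower correct ones,
        # then renormalise so the weightings again form a distribution
        preds = [(pol if xi >= t else -pol) for xi in xs]
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, ys, preds)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Final hypothesis: sign of the weighted vote W(1)H(1) + ... + W(k)H(k)."""
    s = sum(alpha * (pol if x >= t else -pol) for alpha, t, pol in ensemble)
    return 1 if s >= 0 else -1
```

On a labelling that no single stump can fit, such as [+1, -1, -1, +1], three boosted stumps already recover every training label:

```python
ensemble = train_adaboost([1.0, 2.0, 3.0, 4.0], [1, -1, -1, 1], 3)
[predict(ensemble, x) for x in [1.0, 2.0, 3.0, 4.0]]  # → [1, -1, -1, 1]
```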
- R. Meir and G. Rätsch, "An Introduction to Boosting and Leveraging", in Advanced Lectures on Machine Learning (S. Mendelson and A. Smola, eds.), Springer, 2003, pp. 119-184.