Catalogue of Artificial Intelligence Techniques




Dimensionality Reduction

Aliases: Dimension Reduction

Keywords: construction, extraction, feature, learning, mining, selection, statistics

Categories: Knowledge Representation

Author(s): Steven Holmes

Dimensionality reduction is a technique for reducing the number of features (random variables) associated with a dataset. It is used in many problem domains, including data mining, data classification and pattern recognition. Reducing the number of features is advantageous because it lowers the time and space needed for analysis and mitigates the "curse of dimensionality", the problem that the number of measurements needed to estimate a probability distribution grows exponentially with the number of features. The latter benefit is particularly important for machine learning, where often only a few samples are available, each with a large number of features.
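
The exponential growth can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the distribution is estimated with a histogram using a fixed number of bins per feature; the function name and the choice of 10 bins are hypothetical and not part of this entry.

```python
def histogram_cells(bins_per_feature, n_features):
    """Number of cells in a histogram that splits every feature
    into the same number of bins (an illustrative assumption)."""
    return bins_per_feature ** n_features


# With 10 bins per feature, each extra feature multiplies the number
# of cells to be populated with measurements by 10.
for n_features in (1, 2, 5, 10):
    print(n_features, histogram_cells(10, n_features))
```

Under this assumption, keeping even one measurement per cell quickly becomes impractical as features are added, which is the motivation for reducing their number first.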

Effective dimensionality reduction retains the properties of the original dataset that are of interest. A bank, for instance, might wish to find out which customers may be receptive to mortgage offers. In this case, given the full customer records, it would be useful to reduce them to a feature vector that indicates which customers are likely to buy a house soon. Similarly, in OCR it may be useful to reduce the (likely large) feature vector for an input to those features that best discriminate between letters.

Feature selection tries to select the most useful subset of features from the original feature vector. Techniques for this include "filters", which rank and select individual features, and "subset selection" methods, which try to select groups of complementary features (and so avoid the pitfalls of considering features in isolation).
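
As a rough illustration of a filter, the sketch below scores each feature independently and keeps the top k. The scoring rule (absolute Pearson correlation with a numeric target), the function name filter_select and the synthetic data are all assumptions made for this example; other ranking criteria are equally possible.

```python
import numpy as np


def filter_select(X, y, k):
    """Score each feature by |Pearson correlation| with y and
    return the indices of the k best features plus all scores."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant features (zero variance) get a score of 0 instead of NaN.
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    order = np.argsort(scores)[::-1]  # best feature first
    return order[:k], scores


# Synthetic data: only features 1 and 3 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + 0.1 * rng.normal(size=200)

selected, scores = filter_select(X, y, k=2)
print(selected, np.round(scores, 2))
```

Because each feature is scored on its own, a filter like this can miss features that are only useful in combination, which is the pitfall that subset selection methods are meant to avoid.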

Feature construction (or feature extraction) tries to construct a new feature vector from the original. Two common, simple, general techniques for this are PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis), both of which use a linear map from the original feature vector to the new one.
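
The linear map can be sketched for PCA as follows. This is a minimal version computed from the singular value decomposition of the centred data, not a definitive implementation; the function names pca_fit and pca_transform and the synthetic data are assumptions for the example, and LDA (which additionally uses class labels) is not shown.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the feature means, a (d, k) projection matrix W and the
    variance captured along each of the k principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are orthonormal directions, ordered by decreasing
    # singular value, i.e. by decreasing variance of the projected data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components].T
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, W, explained_variance


def pca_transform(X, mean, W):
    """Apply the linear map: centre the data, then project onto W."""
    return (X - mean) @ W


# Synthetic data: 5 observed features driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(100, 5))

mean, W, explained_variance = pca_fit(X, n_components=2)
Z = pca_transform(X, mean, W)   # new 2-feature representation
X_approx = Z @ W.T + mean       # map back to the original 5 features
```

Here every new feature in Z is a weighted sum of the original features, so the reduction is a single matrix multiplication once W has been found.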

Both feature selection and feature construction can benefit dramatically when domain-specific knowledge is incorporated into the process.


