Catalogue of Artificial Intelligence Techniques

N-grams

Keywords: bi-gram, markov, natural language processing, tri-gram, word prediction

Categories: Speech


Author(s): Hannah Stewart

The n-gram is a model of word prediction used in natural language processing. The model works by looking at the previous N-1 words in order to predict the next word. Predictions are made by comparing the combination of words with those of a corpus (plural: corpora), an online collection of text and speech used as a basis for the statistical processing of natural language.
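As a concrete illustration, the following is a minimal Python sketch of this idea; the toy corpus and the choice N=3 (a tri-gram model) are assumptions made for the example, not part of the entry.

    from collections import Counter, defaultdict

    # Toy corpus; a real model would be trained on a much larger corpus.
    corpus = "the cat sat on the mat the cat ate the fish".split()

    N = 3  # a tri-gram model: predict a word from the previous N-1 = 2 words

    # Count how often each word follows each (N-1)-word context.
    counts = defaultdict(Counter)
    for i in range(len(corpus) - N + 1):
        context = tuple(corpus[i:i + N - 1])
        counts[context][corpus[i + N - 1]] += 1

    def predict(context):
        # Return the word most often seen after this context, if any.
        followers = counts[tuple(context)]
        return followers.most_common(1)[0][0] if followers else None

    print(predict(["the", "cat"]))  # -> 'sat' ('sat' and 'ate' tie; first seen wins)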

The probability of a word is estimated using relative frequencies:
p/q
where p = the number of times the word appears in the corpus and q = the total number of words in the corpus.
This creates a probability distribution across the possible words. The final predicted word is chosen by comparing the conditional probabilities of the possible words.
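A short sketch of this relative-frequency estimate, using the same assumed toy corpus as above:

    from collections import Counter

    corpus = "the cat sat on the mat the cat ate the fish".split()

    q = len(corpus)                # q: total number of words in the corpus
    word_counts = Counter(corpus)  # p for each word: its count in the corpus

    # Relative frequencies p/q form a probability distribution over the words.
    probabilities = {word: p / q for word, p in word_counts.items()}
    print(probabilities["the"])    # 4 occurrences / 11 words, approx. 0.364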

The Markov assumption is used within n-grams: the probability of a word is assumed to depend only on the few words immediately preceding it. For example, a bi-gram is a first-order Markov model which approximates the probability of the next word by looking one word into the past; a tri-gram is a second-order Markov model and looks two words into the past; in general, the N-gram is an (N-1)th-order Markov model and looks N-1 words into the past.
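Under this assumption, a bi-gram conditional probability can be estimated as the count of the two-word sequence divided by the count of the first word. A minimal sketch on the assumed toy corpus:

    from collections import Counter

    corpus = "the cat sat on the mat the cat ate the fish".split()

    unigram_counts = Counter(corpus)
    bigram_counts = Counter(zip(corpus, corpus[1:]))

    def bigram_probability(previous, word):
        # First-order Markov estimate: P(word | previous) depends only on
        # the single previous word.
        return bigram_counts[(previous, word)] / unigram_counts[previous]

    print(bigram_probability("the", "cat"))  # count('the cat') / count('the') = 2/4 = 0.5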

As n-gram models must be trained from one or more corpora, they are not perfect: any corpus is finite, so some words and combinations of words will be missing from it and would be assigned zero probability. To help with this problem, a technique called smoothing is used.
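The entry does not name a specific smoothing method; add-one (Laplace) smoothing is one of the simplest and is sketched below on the same assumed toy corpus.

    from collections import Counter

    corpus = "the cat sat on the mat the cat ate the fish".split()

    unigram_counts = Counter(corpus)
    bigram_counts = Counter(zip(corpus, corpus[1:]))
    V = len(unigram_counts)  # vocabulary size

    def smoothed_bigram_probability(previous, word):
        # Add-one (Laplace) smoothing: add 1 to every bi-gram count so that
        # combinations missing from the corpus get a small non-zero probability.
        return (bigram_counts[(previous, word)] + 1) / (unigram_counts[previous] + V)

    print(smoothed_bigram_probability("cat", "fish"))  # unseen bi-gram: 1/9, not 0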


