This experiment examines the effect of building an n-gram model over part-of-speech tags rather than words. The motivation is to mitigate the data sparsity that affects higher-order word n-gram models. Shakespeare’s plays serve as the training and test corpora. Part-of-speech tag n-grams are first used to calculate log probabilities on the test corpora; word probabilities are then incorporated using maximum likelihood estimates (MLE) of P(word|tag). A comparison of the perplexities of the word n-gram model and the part-of-speech tag n-gram model shows that the part-of-speech model severely underperforms. …
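The scoring scheme described above can be illustrated with a minimal Python sketch. It is not the implementation used in the experiment; several details are assumptions made only for illustration: the model order (bigram), add-k smoothing of the tag transitions, a small probability floor for word/tag pairs unseen in training, and the availability of tags for the test sentences. The function names `train`, `log_prob`, and `perplexity` are hypothetical. Each token contributes log P(tag | previous tag) + log P(word | tag), with P(word | tag) taken as the MLE count ratio.

```python
import math
from collections import Counter

def train(tagged_sents):
    """Collect tag-bigram, tag-context, (word, tag), and tag counts."""
    bigrams, contexts, emissions, tags = Counter(), Counter(), Counter(), Counter()
    for sent in tagged_sents:
        prev = "<s>"  # sentence-start symbol (assumed convention)
        for word, tag in sent:
            bigrams[(prev, tag)] += 1
            contexts[prev] += 1
            emissions[(word, tag)] += 1
            tags[tag] += 1
            prev = tag
    return bigrams, contexts, emissions, tags

def log_prob(sent, model, k=1.0, floor=1e-6):
    """Sum of log P(tag | prev tag) + log P(word | tag) over one sentence."""
    bigrams, contexts, emissions, tags = model
    n_tags = len(tags)
    total, prev = 0.0, "<s>"
    for word, tag in sent:
        # Add-k smoothed tag transition (smoothing choice is an assumption).
        p_tag = (bigrams[(prev, tag)] + k) / (contexts[prev] + k * n_tags)
        # MLE emission: count(word, tag) / count(tag).
        p_word = emissions[(word, tag)] / tags[tag] if tags[tag] else 0.0
        # Floor avoids log(0) for unseen pairs (an assumption, not from the text).
        total += math.log(p_tag) + math.log(max(p_word, floor))
        prev = tag
    return total

def perplexity(sents, model):
    """Per-word perplexity: exp of the negative average log probability."""
    n_words = sum(len(s) for s in sents)
    ll = sum(log_prob(s, model) for s in sents)
    return math.exp(-ll / n_words)

train_sents = [[("the", "DT"), ("king", "NN")], [("the", "DT"), ("queen", "NN")]]
test_sents = [[("the", "DT"), ("king", "NN")]]
model = train(train_sents)
ppl = perplexity(test_sents, model)
```

A word n-gram baseline would be scored with the same `perplexity` formula, replacing the per-token term with the log of P(word | previous words), so the two numbers are directly comparable.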