Perplexity in Markov N-Gram Models
While implementing the perplexity function for Markov n-gram models, as described on page 14 of Jurafsky & Martin's SLP (to appear), I ran into floating-point overflow and underflow issues and had to derive equations that avoid them.
Here is my solution in detail, along with its bigram implementation.
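The core of the fix is the standard log-space reformulation; what follows is my reconstruction of the idea, not a verbatim copy of the final code. For a test sequence $w_1 \dots w_N$, the textbook definition

\[
PP(W) = P(w_1 w_2 \dots w_N)^{-\frac{1}{N}}
\]

multiplies $N$ probabilities, each less than 1, so the product underflows to zero long before $N$ gets interesting (and the $-1/N$-th power can then overflow). Summing log-probabilities instead, and exponentiating only once at the end, keeps every intermediate value in a safe range:

\[
PP(W) = \exp\!\left(-\frac{1}{N} \sum_{i=1}^{N} \log P(w_i \mid w_{i-1})\right)
\]

Below is a minimal Python sketch of the bigram case (libcorsis itself is not written in Python, and `bigram_prob` is a hypothetical callback standing in for whatever probability lookup the model exposes):

```python
import math

def bigram_perplexity(tokens, bigram_prob):
    """Perplexity of `tokens` under a bigram model.

    `bigram_prob(prev, word)` is a hypothetical stand-in that
    should return P(word | prev) for the model being evaluated.
    """
    n = len(tokens) - 1  # number of bigram transitions scored
    log_sum = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = bigram_prob(prev, word)
        if p == 0.0:
            # unseen bigram with no smoothing: log is undefined
            # and the true perplexity is infinite
            return float("inf")
        log_sum += math.log(p)
    # exponentiate once, at the end, so no product of tiny
    # floats is ever formed and nothing under- or overflows
    return math.exp(-log_sum / n)


# Toy usage with a hard-coded bigram table
probs = {("<s>", "a"): 0.5, ("a", "b"): 0.25, ("b", "a"): 0.75}
print(bigram_perplexity(["<s>", "a", "b", "a"],
                        lambda p, w: probs.get((p, w), 0.0)))
```

The design point is simply that the average happens in log space: the per-word log-probabilities are all of modest magnitude, so their mean is too, and the single final `exp` is the only place a large or small number can appear.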
It took me a lot of whiteboarding and a few hours to figure this one out, but the resulting libcorsis code is 100% free of foreign intellectual property ^_^.
I am now looking for ways to analyze distributions graphically, and I am testing different encapsulations of probability values and their interaction with the public API.