nltk.lm.smoothing module
Smoothing algorithms for language modeling.
According to Chen & Goodman (1995), these should work with both backoff and interpolation.
- class nltk.lm.smoothing.AbsoluteDiscounting[source]
Bases: Smoothing
Smoothing with absolute discount.
- __init__(vocabulary, counter, discount=0.75, **kwargs)[source]
- Parameters
vocabulary (nltk.lm.vocabulary.Vocabulary) – The Ngram vocabulary object.
counter (nltk.lm.counter.NgramCounter) – The counts of the vocabulary items.
discount (float) – Amount subtracted from each observed ngram count; defaults to 0.75.
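A smoothing object is built from a vocabulary and an ngram counter and is queried through the two methods the Smoothing base class declares (alpha_gamma(word, context) and unigram_score(word) in current NLTK). The sketch below illustrates that usage with a toy two-sentence corpus and the padded_everygram_pipeline helper from nltk.lm.preprocessing; the values noted in the comments hold only for this particular corpus and discount.

    from nltk.lm import NgramCounter, Vocabulary
    from nltk.lm.preprocessing import padded_everygram_pipeline
    from nltk.lm.smoothing import AbsoluteDiscounting

    # Toy corpus: two pre-tokenized sentences.
    text = [["a", "b", "c"], ["a", "c", "d", "c"]]

    # Padded everygrams (up to bigrams) for counting, plus a flat word stream.
    train, words = padded_everygram_pipeline(2, text)
    vocab = Vocabulary(words, unk_cutoff=1)
    counter = NgramCounter(train)

    smoother = AbsoluteDiscounting(vocab, counter, discount=0.75)

    # alpha: discounted relative frequency of "b" after "a";
    # gamma: probability mass freed up by the discount for lower-order estimates.
    alpha, gamma = smoother.alpha_gamma("b", ("a",))
    print(alpha, gamma)                   # 0.125 0.75 for this corpus
    print(smoother.unigram_score("c"))    # relative frequency of "c" among unigram tokens

An interpolating model combines the two values as alpha + gamma * lower_order_score, so gamma controls how much weight the lower-order estimate receives.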
- class nltk.lm.smoothing.KneserNey[source]
Bases: Smoothing
Kneser-Ney Smoothing.
This is an extension of absolute discounting: higher-order counts are discounted as above, while lower-order estimates are based on continuation counts (in how many distinct contexts a word occurs) rather than raw frequencies.
Resources:
- https://pages.ucsd.edu/~rlevy/lign256/winter2008/kneser_ney_mini_example.pdf
- https://www.youtube.com/watch?v=ody1ysUTD7o
- https://medium.com/@dennyc/a-simple-numerical-example-for-kneser-ney-smoothing-nlp-4600addf38b8
- https://www.cl.uni-heidelberg.de/courses/ss15/smt/scribe6.pdf
- https://www-i6.informatik.rwth-aachen.de/publications/download/951/Kneser-ICASSP-1995.pdf
- __init__(vocabulary, counter, order, discount=0.1, **kwargs)[source]
- Parameters
vocabulary (nltk.lm.vocabulary.Vocabulary) – The Ngram vocabulary object.
counter (nltk.lm.counter.NgramCounter) – The counts of the vocabulary items.
order (int) – Highest ngram order of the language model using this smoother.
discount (float) – Amount subtracted from each observed ngram count; defaults to 0.1.
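In practice KneserNey is usually consumed indirectly: the KneserNeyInterpolated model in nltk.lm (not documented in this section) constructs it internally from the model's order and discount. Below is a minimal usage sketch along those lines, reusing the toy corpus from above; the resulting probabilities depend entirely on that corpus.

    from nltk.lm import KneserNeyInterpolated
    from nltk.lm.preprocessing import padded_everygram_pipeline

    text = [["a", "b", "c"], ["a", "c", "d", "c"]]
    order = 3

    # Padded everygrams for training plus the word stream for the vocabulary.
    train, vocab = padded_everygram_pipeline(order, text)

    # The model forwards order and discount to its internal KneserNey smoother.
    lm = KneserNeyInterpolated(order, discount=0.1)
    lm.fit(train, vocab)

    print(lm.score("c", ["a"]))                     # P(c | a) under interpolated Kneser-Ney
    print(lm.perplexity([("a", "c"), ("c", "d")]))  # perplexity over a couple of bigrams

nltk.lm also exposes an AbsoluteDiscountingInterpolated model that plays the same role for the AbsoluteDiscounting smoother above.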