nltk.corpus.reader.lin module

class nltk.corpus.reader.lin.LinThesaurusCorpusReader

Bases: CorpusReader

Wrapper for the LISP-formatted thesauruses distributed by Dekang Lin.

__init__(root, badscore=0.0)

Initialize the thesaurus.

Parameters
  • root (str) – root directory containing the thesaurus LISP files

  • badscore (float) – the score assigned to a pair of words that do not appear in each other’s sets of synonyms
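The effect of badscore can be pictured with a small stand-in for the thesaurus. This is only a sketch, not NLTK's implementation: the nested-dict layout, the `lookup_score` helper, and the scores are all hypothetical, and the lookup is assumed to check only the first word's synonym set.

```python
from typing import Dict

# Hypothetical layout: word -> {synonym: score}, with made-up scores.
Thesaurus = Dict[str, Dict[str, float]]

toy: Thesaurus = {
    "business": {"industry": 0.26, "company": 0.21},
    "enterprise": {"company": 0.18},
}


def lookup_score(thesaurus: Thesaurus, word1: str, word2: str,
                 badscore: float = 0.0) -> float:
    """Return the stored score, or badscore when word2 is not listed under word1."""
    return thesaurus.get(word1, {}).get(word2, badscore)


print(lookup_score(toy, "business", "industry"))          # stored score
print(lookup_score(toy, "business", "enterprise"))        # default badscore
print(lookup_score(toy, "business", "enterprise", -1.0))  # custom badscore
```

A negative badscore, as in the last call, is one way to tell a missing entry apart from a genuinely low score.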

scored_synonyms(ngram, fileid=None)

Returns a list of scored synonyms, as (synonym, score) tuples, for the given ngram.

Parameters
  • ngram (str) – ngram to look up

  • fileid (str) – thesaurus fileid to search in. If None, search all fileids.

Returns

If fileid is specified, a list of (synonym, score) tuples; otherwise, a list of (fileid, list) tuples, where each inner list consists of (synonym, score) tuples.
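As a rough illustration of working with both return shapes, the sketch below ranks scored synonyms for a single fileid and then for every fileid. The `top_synonyms` helper, the sample scores, and the fileid names are hypothetical stand-ins, and the pairs are assumed to be ordered (synonym, score). In real use the input would come from `scored_synonyms` on a loaded reader.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, float]


def top_synonyms(pairs: Iterable[Pair], k: int = 3) -> List[Pair]:
    """Return the k highest-scoring (synonym, score) pairs."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up stand-in for reader.scored_synonyms("business", fileid="simN.lsp")
single = [("trade", 0.12), ("industry", 0.26), ("firm", 0.19), ("company", 0.21)]
print(top_synonyms(single, k=2))

# Made-up stand-in for reader.scored_synonyms("business"): one entry per fileid
per_file = [("simA.lsp", []), ("simN.lsp", single), ("simV.lsp", [])]
for fileid, pairs in per_file:
    print(fileid, top_synonyms(pairs, k=2))
```

Sorting in the helper avoids relying on any particular ordering of the returned pairs.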

similarity(ngram1, ngram2, fileid=None)

Returns the similarity score for two ngrams.

Parameters
  • ngram1 (str) – first ngram to compare

  • ngram2 (str) – second ngram to compare

  • fileid (str) – thesaurus fileid to search in. If None, search all fileids.

Returns

If fileid is specified, the score for the two ngrams alone; otherwise, a list of (fileid, score) tuples.
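Because the return type depends on whether fileid is given, calling code may want to normalize it. The following is a minimal sketch under that assumption; `best_score` is a hypothetical helper, not part of NLTK, and the sample values are made up.

```python
from typing import List, Tuple, Union

# Either a bare score (fileid given) or (fileid, score) tuples (fileid=None).
SimilarityResult = Union[float, List[Tuple[str, float]]]


def best_score(result: SimilarityResult, default: float = 0.0) -> float:
    """Collapse either return shape of similarity() into a single score."""
    if isinstance(result, (int, float)):
        return float(result)
    # Take the highest score across fileids; fall back to default if empty.
    return max((score for _fileid, score in result), default=default)


# Made-up stand-ins for the two return shapes
print(best_score(0.25))
print(best_score([("simA.lsp", 0.0), ("simN.lsp", 0.25), ("simV.lsp", 0.0)]))
```

Taking the maximum is only one possible policy. Averaging, or picking the fileid that matches the expected part of speech, may suit other uses better.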

synonyms(ngram, fileid=None)

Returns a list of synonyms for the given ngram.

Parameters
  • ngram (str) – ngram to look up

  • fileid (str) – thesaurus fileid to search in. If None, search all fileids.

Returns

If fileid is specified, a list of synonyms; otherwise, a list of (fileid, list) tuples, where each inner list contains synonyms.
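When all fileids are searched, the per-file lists can be merged into one. The sketch below shows one way to do that while keeping first-seen order. The `merge_synonyms` helper, the fileid names, and the sample words are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def merge_synonyms(per_file: Iterable[Tuple[str, Iterable[str]]]) -> List[str]:
    """Flatten (fileid, synonyms) tuples into one de-duplicated list."""
    seen: Dict[str, None] = {}  # dict keys preserve insertion order
    for _fileid, synonyms in per_file:
        for synonym in synonyms:
            seen.setdefault(synonym, None)
    return list(seen)


# Made-up stand-in for reader.synonyms("business") with fileid=None
per_file = [
    ("simA.lsp", ["commercial"]),
    ("simN.lsp", ["industry", "company"]),
    ("simV.lsp", ["industry"]),
]
print(merge_synonyms(per_file))
```

The helper accepts any iterable of synonyms, so it does not depend on the inner collections being actual lists.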

nltk.corpus.reader.lin.demo()