cntlist - file listing number of times each tagged sense occurs in a semantic concordance
A cntlist file for a semantic concordance lists the number of times each semantically tagged sense occurs in the concordance and its sense number in the WordNet database. Each line in the file corresponds to a sense in the WordNet database to which at least one semantic tag points. Only senses that are tagged in a concordance are in the concordance's cntlist file. See semcor(7WN) for information about semantic concordances and a list of those included in this release.
A file, cntlist, is provided with each semantic concordance for informational and statistical purposes only. The files are not used by any software provided with the WordNet or semantic concordance packages. A cntlist file is generated by counting the number of sent_num,word_num pairs in all location_lists for each sense_key in a taglist(5WN) file. The data is sorted in descending numerical order, and the resulting file ranks the senses from most to least frequently tagged in the semantic concordance.
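The counting step described above can be sketched as follows. This is a minimal illustration, not the tool that produces the distributed files. The in-memory layout (a mapping from sense_key to location_lists, each a list of sent_num,word_num pairs) and the names Taglist and count_tags are assumptions; the on-disk layout is described in taglist(5WN). The sample data is made up.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory form of a taglist (an assumption, not the file format):
# sense_key -> location_lists, each a list of (sent_num, word_num) pairs.
Location = Tuple[int, int]
Taglist = Dict[str, List[List[Location]]]


def count_tags(taglist: Taglist) -> Dict[str, int]:
    """Return tag_cnt per sense_key: total pairs across all its location_lists."""
    return {
        sense_key: sum(len(location_list) for location_list in location_lists)
        for sense_key, location_lists in taglist.items()
    }


# Made-up sample data, for illustration only.
sample_taglist: Taglist = {
    "dog%1:05:00::": [[(1, 4), (7, 2)], [(3, 9)]],
    "run%2:38:00::": [[(2, 1)]],
}
tag_counts = count_tags(sample_taglist)
```

Here the first sense_key has two location_lists holding three pairs in total, so its tag_cnt is 3.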
WordNet Database cntlist File
In the WordNet database, words are assigned sense numbers based on frequency of use in the semantic concordances. The cntlist file used by grind(1WN) to build the WordNet database and assign the sense numbers is a union of the cntlist files from the various semantic concordances. This combined cntlist file is provided with the WordNet package and is found in the WNSEARCHDIR directory.
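One way to picture the combined file is as a merge of per-concordance counts. The sketch below assumes that the counts for a sense_key appearing in more than one concordance are added together; the text above says only that the file is a union, so this rule is an assumption. The function name merge_counts and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def merge_counts(per_concordance: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Combine tag counts from several concordances.

    Assumption: counts for a sense_key shared by several concordances are summed.
    """
    combined: Counter = Counter()
    for counts in per_concordance:
        combined.update(counts)  # Counter.update adds to existing counts
    return dict(combined)


# Made-up counts for two concordances, for illustration only.
concordance_a = {"dog%1:05:00::": 3, "run%2:38:00::": 1}
concordance_b = {"dog%1:05:00::": 2, "cat%1:05:00::": 4}
merged = merge_counts([concordance_a, concordance_b])
```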
File Format
Each line in a cntlist file contains information for one sense. The file is ordered from most to least frequently tagged sense. The fields are separated by one space, and each line is terminated with a newline character. Senses having the same tag_cnt value are listed in reverse alphabetical order of the lemma field of the sense_key.
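The ordering rule can be sketched as a single sort on the pair (tag_cnt, lemma) in descending order. This assumes the sense_key layout described in senseidx(5WN), where the lemma is the part before the "%" character. The names Entry, lemma_of, and sort_entries are hypothetical and the sample entries are made up.

```python
from typing import List, Tuple

# (tag_cnt, sense_key, sense_number); a hypothetical in-memory record.
Entry = Tuple[int, str, int]


def lemma_of(sense_key: str) -> str:
    """Return the lemma field, assumed to be the text before the first '%'."""
    return sense_key.split("%", 1)[0]


def sort_entries(entries: List[Entry]) -> List[Entry]:
    """Order by descending tag_cnt, then by reverse alphabetical lemma on ties."""
    return sorted(entries, key=lambda e: (e[0], lemma_of(e[1])), reverse=True)


# Made-up entries, for illustration only.
entries = [
    (3, "cat%1:05:00::", 1),
    (7, "run%2:38:00::", 1),
    (3, "dog%1:05:00::", 1),
]
ordered = sort_entries(entries)
```

With this data the entry for run comes first, and the two entries tied at 3 appear with dog before cat.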
Each line in cntlist is of the form:
tag_cnt sense_key sense_number
where tag_cnt is the decimal number of times the sense is tagged in the corresponding semantic concordance, sense_key is a WordNet sense encoding, and sense_number is a WordNet sense number. Both sense_key and sense_number are described in senseidx(5WN).
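A line in this format can be read with a small parser such as the sketch below. The names CntlistEntry and parse_cntlist_line are hypothetical, and the sample line is made up rather than taken from a distributed file.

```python
from typing import NamedTuple


class CntlistEntry(NamedTuple):
    tag_cnt: int       # times the sense is tagged in the concordance
    sense_key: str     # WordNet sense encoding, see senseidx(5WN)
    sense_number: int  # WordNet sense number


def parse_cntlist_line(line: str) -> CntlistEntry:
    """Parse one 'tag_cnt sense_key sense_number' line into a CntlistEntry."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) != 3:
        raise ValueError(f"expected 3 space-separated fields, got {len(fields)}: {line!r}")
    tag_cnt, sense_key, sense_number = fields
    return CntlistEntry(int(tag_cnt), sense_key, int(sense_number))


# Made-up sample line, for illustration only.
entry = parse_cntlist_line("12 dog%1:05:00:: 1\n")
```

Splitting on a single space rather than on arbitrary whitespace follows the stated format, so a malformed line with extra spaces is rejected instead of silently accepted.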
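In directory SEMCORDIR/conc on Unix platforms, SEMCORDIR\conc on PC platforms, and SEMCORDIR:conc on Macintosh platforms:
cntlist - cntlist file for the semantic concordance conc
In directory WNSEARCHDIR:
cntlist - combined cntlist file used by grind(1WN)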
grind(1WN), senseidx(5WN), taglist(5WN), semcor(7WN).