| Tanl Linguistic Pipeline |
Base class for parsers.
#include <Parser.h>
Public Member Functions | |
| Parser (WordIndex &predIndex) | |
| virtual void | train (SentenceReader *sentenceReader, char const *modelFile) |
| Train a statistical model using sentences obtained through sentenceReader, and save the generated model to modelFile. | |
| virtual Sentence * | parse (Sentence *sentence) |
| Parse the given sentence. | |
| virtual void | parse (SentenceReader *sentenceReader, std::ostream &os=std::cout) |
| Parse all sentences extracted by sentenceReader, sending output to os. | |
| virtual void | revise (SentenceReader *sentenceReader, char const *actionFile=0) |
| Produce a revision of a document's parses, using either a model or an action file. | |
| std::deque< Sentence * > | collectSentences (Enumerator< Sentence * > *sentenceReader) |
| Collect sentences and replace infrequent token attributes with UNKNOWN. | |
| virtual void | showEval (int tokenCount, int las, int uas, int sentCount) |
| Print accuracy estimates. | |
| void | writeHeader (std::ostream &os) |
| Write model header to stream. | |
| Enumerator< Sentence * > * | pipe (Enumerator< std::vector< Token * > * > &tve) |
| IPipe interface. | |
| Enumerator< Sentence * > * | pipe (Enumerator< Sentence * > &tce) |
| Alternative pipeline interface that allows connecting directly to a SentenceReader. | |
| virtual void | preprocess (Sentence *sentence) |
| Preprocess sentence, e.g. normalize tokens. | |
Static Public Member Functions | |
| static Parser * | create (char const *modelFile=0) |
| Create a Parser based on the configuration and data in file modelFile. | |
| static bool | readHeader (std::istream &is) |
| Read model header from stream. | |
| static std::string | procStat () |
| Return a string of process statistics: user+sys CPU time, elapsed real time, CPU usage, and memory usage. | |
Public Attributes | |
| WordIndex & | predIndex |
| GlobalInfo | info |
Static Public Attributes | |
| static IXE::conf< int > | featureCutoff |
| Drop features that occur fewer than this number of times. | |
| static IXE::conf< int > | lexCutoff |
| Forms or lemmas occurring fewer than lexCutoff times are collapsed to Unknown. | |
| static IXE::conf< bool > | verbose |
| Control output. | |
Base class for parsers.
| virtual Sentence* Parser::Parser::parse | ( | Sentence * | sentence | ) | [virtual] |
Parse the given sentence.
Reimplemented in Parser::ApParser, Parser::MeParser, Parser::MlpParser, Parser::MultiSvmParser, and Parser::SvmParser.
Referenced by Parser::ParserSentPipe::Current(), Parser::ParserPipe::Current(), and Parser::ParserPipePython::Current().
| Enumerator< Sentence * > * Parser::Parser::pipe | ( | Enumerator< std::vector< Token * > * > & | tve | ) |
IPipe interface.
Parameters: tve
| void Parser::Parser::preprocess | ( | Sentence * | sentence | ) | [virtual] |
Preprocess sentence, e.g. normalize tokens.
References Tanl::Token::links.
Referenced by collectSentences(), Parser::SvmParser::parse(), Parser::MultiSvmParser::parse(), Parser::MlpParser::parse(), Parser::MeParser::parse(), and Parser::ApParser::parse().
| static bool Parser::Parser::readHeader | ( | std::istream & | is | ) | [static] |
Read model header from stream.
| virtual void Parser::Parser::revise | ( | SentenceReader * | sentenceReader, | |
| char const * | actionFile = 0 | |||
| ) | [inline, virtual] |
Produce a revision of a document's parses, using either a model or an action file.
If an actionFile is provided, it must contain a list of actions, one per line, to apply to the parse trees; otherwise, the actions that perform the revisions are determined using the model.
Reimplemented in Parser::ApParser, Parser::MeParser, and Parser::MlpParser.
| void Parser::Parser::showEval | ( | int | tokenCount, | |
| int | las, | |||
| int | uas, | |||
| int | sentCount | |||
| ) | [virtual] |
Print accuracy estimates.
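As an illustration of how the four arguments might relate, the sketch below turns them into percentages. It assumes that las and uas are counts of tokens with a correct labeled and unlabeled attachment respectively, which this page does not state; formatEval and its output layout are hypothetical and do not reproduce what showEval actually prints.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not Tanl's showEval.
// Assumption: `las` and `uas` are counts of correctly attached tokens
// (labeled and unlabeled), out of `tokenCount` tokens in total.
std::string formatEval(int tokenCount, int las, int uas, int sentCount) {
    // Guard against division by zero when no tokens were scored.
    double lasPct = tokenCount > 0 ? 100.0 * las / tokenCount : 0.0;
    double uasPct = tokenCount > 0 ? 100.0 * uas / tokenCount : 0.0;
    char buf[128];
    std::snprintf(buf, sizeof buf,
                  "Sentences: %d, Tokens: %d, LAS: %.2f%%, UAS: %.2f%%",
                  sentCount, tokenCount, lasPct, uasPct);
    return std::string(buf);
}
```

For example, formatEval(200, 150, 170, 10) reports an LAS of 75.00% and a UAS of 85.00% under these assumptions.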
| void Parser::Parser::writeHeader | ( | std::ostream & | os | ) |
Write model header to stream.
| os |

Referenced by Parser::MultiSvmParser::train(), Parser::MlpParser::train(), Parser::MeParser::train(), and Parser::ApParser::train().
| conf< int > Parser::Parser::featureCutoff [static] |
Drop features that occur fewer than this number of times.
Referenced by Parser::MlpModel::collectEvents(), Parser::MultiSvmParser::train(), Parser::MeParser::train(), and Parser::ApParser::train().
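The effect of a frequency cutoff of this kind can be sketched independently of Tanl. In the example below, FeatureCounts and applyFeatureCutoff are hypothetical names, and features are represented as plain strings for illustration; the trainers listed above may store and filter features differently.

```cpp
#include <map>
#include <string>

// Hypothetical representation: feature name -> number of occurrences.
using FeatureCounts = std::map<std::string, int>;

// Keep only the features seen at least `cutoff` times, i.e. drop those
// that occur fewer than `cutoff` times.
FeatureCounts applyFeatureCutoff(const FeatureCounts& counts, int cutoff) {
    FeatureCounts kept;
    for (const auto& entry : counts) {
        if (entry.second >= cutoff) kept.insert(entry);
    }
    return kept;
}
```

With a cutoff of 2, a feature seen once is dropped while features seen twice or more are kept.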
| conf< int > Parser::Parser::lexCutoff [static] |
Forms or lemmas occurring fewer than lexCutoff times are collapsed to Unknown.
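A minimal sketch of this kind of collapsing is shown below. It operates on a flat list of word forms rather than on Tanl's Sentence and Token types, and the helper name collapseRareForms and the placeholder string "UNKNOWN" are assumptions made for illustration only.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of Tanl: replace every form that occurs
// fewer than `lexCutoff` times with a placeholder.
// Assumption: the placeholder spelling "UNKNOWN" is illustrative.
std::vector<std::string> collapseRareForms(const std::vector<std::string>& forms,
                                           int lexCutoff,
                                           const std::string& unknown = "UNKNOWN") {
    // First pass: count how often each form occurs.
    std::unordered_map<std::string, int> freq;
    for (const auto& form : forms) ++freq[form];

    // Second pass: keep frequent forms, collapse the rest.
    std::vector<std::string> result;
    result.reserve(forms.size());
    for (const auto& form : forms) {
        result.push_back(freq[form] < lexCutoff ? unknown : form);
    }
    return result;
}
```

Collapsing rare forms this way lets a model learn a single shared representation for words it has seen too rarely to estimate reliably.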