Tanl Linguistic Pipeline

Parser::MlpModel Class Reference

Public Types
typedef std::vector< Tanl::Classifier::PID >   X
typedef Tanl::Classifier::ClassID              Y
typedef std::pair< X, Y >                      Case
typedef std::vector< Case >                    Cases
typedef std::vector< Case * >                  ValidationSet
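A Case pairs the numeric feature identifiers of one training instance (X) with its outcome class (Y), and Cases is the collection used for training. A minimal sketch of assembling such cases by hand, with stand-in typedefs for the Tanl classifier ID types and made-up feature values:

    #include <utility>
    #include <vector>

    // Stand-ins for Tanl::Classifier::PID and Tanl::Classifier::ClassID (assumption).
    typedef unsigned PID;
    typedef int      ClassID;

    typedef std::vector<PID>   X;      // feature (predicate) identifiers
    typedef ClassID            Y;      // outcome class
    typedef std::pair<X, Y>    Case;   // one numeric training instance
    typedef std::vector<Case>  Cases;  // the training set

    int main() {
      Cases cases;
      X features;
      features.push_back(3);           // hypothetical feature IDs
      features.push_back(17);
      features.push_back(42);
      Y outcome = 2;                   // hypothetical outcome class
      cases.push_back(Case(features, outcome));
      return 0;
    }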
Public Member Functions

MlpModel (int numFeatures, int numOutcomes, int numHidden, int numLayers=1)
list< Event * >  collectEvents (Enumerator< Sentence * > &sentenceReader, GlobalInfo &info)
                     Collect events from sentenceReader.
void             buildCases (list< Event * > &events, Cases &cases)
                     Create numeric cases out of the training events.
double           train (Case &, int &)
                     Compute the gradients with respect to the negative log likelihood.
void             train (Cases &cases, int epoch, ofstream &ofs)
                     Train the model on a set of cases.
void             validate (ValidationSet &vs, double &avg, double &std)
int              crossentropy_softmax (Vector &x, double sm[])
                     Compute the softmax of x and its negative log likelihood.
Vector           gradCrossentropy (Vector &x, int y)
int              estimate (std::vector< PID > &features, double prob[])
void             load (ifstream &ifs, char const *file="")
void             save (ofstream &ofs)
void             writeLabels (ofstream &ofs)
streampos        writeData (ofstream &ofs)
void             clearLabels ()
Protected Attributes

Matrix     w1
Matrix     w2
Matrix     wh
Vector     b1
Vector     b2
Vector     bh
int        numLayers       number of hidden layers
int        numHidden       number of hidden variables
int        numFeatures     number of features
WordIndex  outcomeIndex
void Parser::MlpModel::buildCases (list< Event * > &events, Cases &cases)

Create numeric cases out of the training events.

Parameters:
    events   the training events
    cases    filled with the numeric cases built from events (output)
References numFeatures, numLayers, and Tanl::Classifier::Classifier::verbose.
Referenced by Parser::MlpParser::train().
int Parser::MlpModel::crossentropy_softmax (Vector &x, double sm[])

Compute:

    softmax(x)[i] = exp(x[i]) / sum_j(exp(x[j]))

We compute this by subtracting off the max of x, which avoids numerical instability:

    m = max_j x[j]
    softmax(x)[i] = exp(x[i] - m) / sum_j(exp(x[j] - m))

The negative log likelihood at index t is:

    nll(x, t) = -log(softmax(x)[t])
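A minimal C++ sketch of this max-subtraction trick, using std::vector in place of the pipeline's Vector type (the helper name and signature are illustrative, not this class's API):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Numerically stable softmax: subtract the maximum before exponentiating,
    // then return the negative log likelihood of the target index t.
    double softmax_nll(const std::vector<double>& x, std::vector<double>& sm, int t) {
      double m = *std::max_element(x.begin(), x.end());
      double sum = 0.0;
      sm.resize(x.size());
      for (size_t i = 0; i < x.size(); ++i) {
        sm[i] = std::exp(x[i] - m);   // exp(x[i] - m) cannot overflow for finite x
        sum += sm[i];
      }
      for (size_t i = 0; i < x.size(); ++i)
        sm[i] /= sum;                 // softmax(x)[i] = exp(x[i]-m) / sum_j exp(x[j]-m)
      return -std::log(sm[t]);        // nll(x, t) = -log(softmax(x)[t])
    }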
Referenced by estimate(), and train().
int Parser::MlpModel::estimate (std::vector< PID > &features, double prob[])
References crossentropy_softmax(), and numLayers.
Referenced by Parser::MlpParser::parse().
void Parser::MlpModel::train (Cases &cases, int epoch, ofstream &ofs)

Train the model with cases, performing epoch iterations and saving intermediate model weights to ofs.
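A sketch of how the training entry points above might be chained, under the assumption that the surrounding Tanl/Parser headers supply Enumerator, Sentence, GlobalInfo and Event; the network sizes, epoch count and file name are made up for illustration:

    #include <fstream>
    #include <list>

    void trainMlp(Enumerator< Sentence * > &sentenceReader, GlobalInfo &info) {
      // Hypothetical network sizes.
      Parser::MlpModel model(/*numFeatures*/ 100000, /*numOutcomes*/ 40,
                             /*numHidden*/ 200, /*numLayers*/ 1);

      // 1. Extract symbolic events from the corpus.
      std::list<Event*> events = model.collectEvents(sentenceReader, info);

      // 2. Turn the events into numeric (feature IDs, class) cases.
      Parser::MlpModel::Cases cases;
      model.buildCases(events, cases);

      // 3. Run the epoch loop, checkpointing intermediate weights to the stream.
      std::ofstream ofs("mlp.model");
      model.train(cases, /*epoch*/ 20, ofs);
    }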
References Parser::MovingAverage::add(), Parser::Parser::procStat(), train(), Tanl::Classifier::Classifier::verbose, and writeData().
double Parser::MlpModel::train (Case &cas, int &argmax)

Compute the gradients with respect to the negative log likelihood:

    x    = h w2 + b2
    xw1  = SUM_f w1[f]
    h    = softsign(xw1 + b1)
    h'   = 1 / (1 + abs(xw1 + b1))^2
    nll  = -x[t] + log(Sum_j(exp(x[j])))

    d nll/dx  = -d x[t]/dx + (1 / Sum_j(exp(x[j]))) * Sum_j(exp(x[j]) d x[j]/dx)
              = [0 .. -1 .. 0] + softmax(x)

    d nll/dw1 = d nll/dx dx/dw1 = d nll/dx (dh/dw1 w2) = d nll/dx (h' w2) dxw1/dx
    d nll/db1 = d nll/dx dx/db1 = d nll/dx (dh/db1 w2) = d nll/dx (h' w2)
    d nll/dw2 = d nll/dx dx/dw2 = d nll/dx h
    d nll/db2 = d nll/dx dx/db2 = d nll/dx

Returns in argmax the most likely result.
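A compact sketch of this forward/backward step for a single hidden layer with softsign activation, using dense std::vector arithmetic in place of the pipeline's Matrix/Vector types; the function name, learning-rate handling and layout are illustrative, not the class's implementation:

    #include <cmath>
    #include <vector>

    typedef std::vector<double>  Vec;
    typedef std::vector<Vec>     Mat;   // Mat[row][col]

    // One gradient step for the active feature IDs `feats` with gold class t.
    // w1: numFeatures x numHidden, w2: numHidden x numOutcomes.
    double backpropStep(const std::vector<int>& feats, int t,
                        Mat& w1, Vec& b1, Mat& w2, Vec& b2, double lr) {
      size_t H = b1.size(), O = b2.size();

      // Forward: xw1 = sum of the w1 rows of the active features.
      Vec pre1(H, 0.0), h(H), hprime(H);
      for (size_t j = 0; j < H; ++j) {
        for (size_t k = 0; k < feats.size(); ++k) pre1[j] += w1[feats[k]][j];
        pre1[j] += b1[j];
        double a = std::fabs(pre1[j]);
        h[j]      = pre1[j] / (1.0 + a);            // softsign
        hprime[j] = 1.0 / ((1.0 + a) * (1.0 + a));  // softsign derivative
      }
      Vec x(O, 0.0);
      for (size_t k = 0; k < O; ++k) {
        for (size_t j = 0; j < H; ++j) x[k] += h[j] * w2[j][k];
        x[k] += b2[k];
      }

      // Stable softmax and nll (see crossentropy_softmax above).
      double m = x[0];
      for (size_t k = 1; k < O; ++k) if (x[k] > m) m = x[k];
      Vec sm(O); double sum = 0.0;
      for (size_t k = 0; k < O; ++k) { sm[k] = std::exp(x[k] - m); sum += sm[k]; }
      for (size_t k = 0; k < O; ++k) sm[k] /= sum;
      double nll = -std::log(sm[t]);

      // Backward: d nll/dx = softmax(x) - onehot(t).
      Vec dx(sm); dx[t] -= 1.0;
      Vec dh(H, 0.0);
      for (size_t j = 0; j < H; ++j)
        for (size_t k = 0; k < O; ++k) dh[j] += dx[k] * w2[j][k];

      // Gradient descent updates: d nll/dw2 = dx outer h, d nll/db2 = dx,
      // and the hidden-layer gradients chained through the softsign derivative.
      for (size_t j = 0; j < H; ++j)
        for (size_t k = 0; k < O; ++k) w2[j][k] -= lr * dx[k] * h[j];
      for (size_t k = 0; k < O; ++k) b2[k] -= lr * dx[k];
      for (size_t j = 0; j < H; ++j) {
        double dpre = dh[j] * hprime[j];
        for (size_t k = 0; k < feats.size(); ++k) w1[feats[k]][j] -= lr * dpre;
        b1[j] -= lr * dpre;
      }
      return nll;
    }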
References crossentropy_softmax(), and numLayers.
Referenced by Parser::MlpParser::train(), and train().
streampos Parser::MlpModel::writeData (ofstream &ofs)
References numFeatures, numHidden, and numLayers.
Referenced by Parser::MlpParser::train(), and train().