| Tanl Linguistic Pipeline |
Public Types | |
| typedef unsigned char | ID |
Public Member Functions | |
| Encoding (char const name[], ID id, float averageBytesPerChar=1.0, float maxBytesPerChar=1.0) | |
| Encoding (char const *name, float averageBytesPerChar=1.0, float maxBytesPerChar=1.0) | |
| std::string | Name () |
| name of this encoding | |
| size_t | Encode (Encoding const *fromCode, char const *in, size_t inlen, char *&out, size_t outlen=0) const |
Converts a multibyte sequence starting at in, of length inlen, from character encoding fromCode to this encoding. | |
Static Public Member Functions | |
| static Encoding const * | get (char const *name) |
| Get the encoding with the given name. | |
| static Encoding const * | get (ID id) |
| Get the encoding with the given id. | |
| static void | Register (Encoding *encoding) |
| register known Encodings | |
| static void | Register (char const *alias, char const *canonical) |
Public Attributes | |
| std::string | name |
| the official canonical name | |
| ID | id |
| the internal id for the encoding | |
| float | averageBytesPerChar |
| the average bytes used to encode one character | |
| float | maxBytesPerChar |
| the maximum number of bytes used to encode one character | |
size_t Tanl::Text::Encoding::Encode (Encoding const * fromCode, char const * in, size_t inlen, char *& out, size_t outlen = 0) const
Converts a multibyte sequence starting at in, of length inlen, from character encoding fromCode to this encoding.
The converted sequence is stored in out, for a maximum size of outlen. If outlen is 0, a buffer is allocated with malloc() and returned in out.
References averageBytesPerChar and name.
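
The buffer contract described above (caller-supplied buffer when outlen is non-zero, malloc() allocation when it is 0) can be illustrated with a small stand-alone sketch. This is not Tanl code: encodeLatin1ToUtf8 and toUtf8 are hypothetical helpers that hard-code one conversion (ISO-8859-1 to UTF-8) in place of the fromCode lookup. The meaning of the return value (bytes written, 0 on failure) and the need for the caller to free() an allocated buffer are assumptions, since the documentation here does not state them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in, not Tanl code: converts ISO-8859-1 to UTF-8 while
// following the buffer contract documented for Encoding::Encode.
// Assumption: the return value is the number of bytes written to `out`,
// and 0 signals failure (allocation failed or buffer too small).
std::size_t encodeLatin1ToUtf8(char const* in, std::size_t inlen,
                               char*& out, std::size_t outlen = 0)
{
    assert(in != nullptr || inlen == 0);
    bool const allocated = (outlen == 0);
    if (allocated) {
        // Worst case is 2 output bytes per input byte, which plays the
        // role of maxBytesPerChar for this particular conversion.
        outlen = inlen * 2 + 1;
        out = static_cast<char*>(std::malloc(outlen));
        if (out == nullptr)
            return 0;
    }
    std::size_t written = 0;
    for (std::size_t i = 0; i < inlen; ++i) {
        unsigned char const c = static_cast<unsigned char>(in[i]);
        std::size_t const need = (c < 0x80) ? 1 : 2;
        if (written + need > outlen) {
            // Caller-supplied buffer is too small; give up.
            if (allocated) { std::free(out); out = nullptr; }
            return 0;
        }
        if (c < 0x80) {
            out[written++] = static_cast<char>(c);
        } else {
            // Two-byte UTF-8 form for code points U+0080..U+00FF.
            out[written++] = static_cast<char>(0xC0 | (c >> 6));
            out[written++] = static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return written;
}

// Usage with outlen left at 0: the callee allocates, the caller frees.
std::string toUtf8(std::string const& latin1)
{
    char* out = nullptr;
    std::size_t const n =
        encodeLatin1ToUtf8(latin1.data(), latin1.size(), out);
    std::string result = out ? std::string(out, n) : std::string();
    std::free(out);  // free(nullptr) is a no-op
    return result;
}
```

Against the documented interface, the equivalent call would presumably go through an Encoding const * obtained from Encoding::get(), passing the source encoding as fromCode. Which encoding names get() accepts is not listed here.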