Message Archiving Benchmark: How Many Letters Are in Messages?
Let’s look at the distribution of the number of letters in a message’s body. Note that this is not the byte length; it is the number of Unicode characters. Cyrillic characters take 2 bytes each in UTF-8, so some messages can be up to twice as long in bytes. Also, AFAIK English sentences are generally shorter than Russian ones, so […]
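To make the difference between character count and byte length concrete, here is a minimal Python sketch. The helper name `char_and_byte_length` and the sample strings are made up for illustration; they are not part of the benchmark itself.

```python
def char_and_byte_length(text):
    """Return (number of Unicode code points, number of bytes in UTF-8)."""
    return len(text), len(text.encode("utf-8"))


# Hypothetical sample messages, not taken from the benchmark data.
english = "Hello"
russian = "Привет"
mixed = "Привет, world!"

print(char_and_byte_length(english))  # (5, 5): ASCII letters take 1 byte each
print(char_and_byte_length(russian))  # (6, 12): Cyrillic letters take 2 bytes each
print(char_and_byte_length(mixed))    # (14, 20): only the Cyrillic part doubles
```

A message that mixes Cyrillic with ASCII punctuation, digits, or spaces ends up somewhere between 1 and 2 bytes per character, which is why "2 times longer" is an upper bound for such text rather than a fixed ratio.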