making sense of the world
To an ever-increasing extent, conversations on the Internet are global conversations, with individuals representing different markets all valuable parts of the whole. Frequently, a set of texts online will represent multiple languages with no explicit indication of in which language each text was written. Happily, this is a problem with a good solution, as automated Language Detection can be quite accurate for many applications. While the accuracy is limited when the texts are very short, use unconventional vocabulary, and/or two languages are very similar to one another, Language Detection systems are about as accurate as a skilled person in classfying documents' language, and are capable of serving a much larger number of languages than a person could typically recognize.
Named Entity Recognition