The practical part of an introduction to text analysis with R usually opens to the sound of the hunting horn. "Tokenization" is the watchword, and "elimination" the first goal. The analyst does not care about punctuation (remove_punct = TRUE)! The analyst has to reduce the burden of his or her word sack, and that is where the "stopwords" come into view: function words that connect ideas and set them in motion, such as "then" and "when", "how" and "because". Any content? No! Throw them away! It may be true that programs for data analysis were developed for, and are mostly used by, marketing experts who do not really care about ideas in motion and usually end up with some "sentiment analysis": Feeling good? OK! It is also true that in AI, in fields like Natural Language Understanding, caution is the dominant rule. Keep the stopwords, you never know! There are good reasons for doing so. AI should include every word that touches huma...
The literary quality of these texts is evident. If you have studied German literature, you cannot miss the allusions, not only to GDR songs and the Brothers Grimm but also to Trakl and French Symbolism. But how can we analyze Rammstein's texts? I want to have the machine read them. The programming languages R and Python offer many packages with interesting methods for getting into the texts. I am curious about what we can find out by means of machine learning.
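To make the hunting routine from the beginning concrete, here is a minimal sketch in plain Python of the two usual steps: tokenize (with or without punctuation), then throw the stopwords away. Everything in it is an assumption for illustration only: the function names tokenize and drop_stopwords, the tiny STOPWORDS set, and the made-up sample sentence. The remove_punct parameter merely mimics the R option quoted above. Real stopword lists are much longer, and real tokenizers are more careful.

```python
import re

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"then", "when", "how", "because", "the", "and", "i", "you"}


def tokenize(text, remove_punct=True):
    """Lowercase the text and split it into tokens.

    With remove_punct=True only word tokens are returned; otherwise
    every punctuation mark is kept as a token of its own.
    """
    pattern = r"\w+" if remove_punct else r"\w+|[^\w\s]"
    return re.findall(pattern, text.lower())


def drop_stopwords(tokens, stopwords=STOPWORDS):
    """Return the tokens that are not in the stopword set."""
    return [t for t in tokens if t not in stopwords]


# Made-up sample sentence, not taken from any lyrics.
sample = "I left because you asked how, and then when."

tokens = tokenize(sample)
content_only = drop_stopwords(tokens)

print(tokens)        # all word tokens, connectives included
print(content_only)  # what is left after the "elimination"
```

Of the nine word tokens in the sample, only "left" and "asked" survive the elimination; the "because", the "how" and the "then" that held the sentence together are gone. That is the loss you risk when you empty the word sack too early.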