2879 words (tokens) is not very much: 1570 unique tokens, 1317 after stemming. All the same, finding patterns across the 97 poems in Till Lindemann's "Quiet Nights" turns out to be difficult. I looked for n-grams and skipgrams and found nothing that went beyond the limits of single poems. Asking R's quanteda for co-occurrences, with n=50 we get a cloud. What do we see? A predominance of words for human and animal bodies or parts of them: "hund", "mund", "haar", "schweiß", "blut", "fleisch", "herz" ... Inquiring into these elements could be an interesting project. As a net of words, though, the entire image is only confusing. Lowering n to 20, the picture gets clearer, but the connecting lines are really weak. "Dass" (the conjunction "that") should be eliminated as a stopword. Going down to n=7: the animals have vanished, and "heart" and "face" are the only remaining parts of the body. At the end, after stemming and low...
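For readers who prefer Python to R, the co-occurrence step described above can be sketched roughly as follows. This is not the quanteda code behind the clouds, only a minimal stand-in using the standard library. It rests on two assumptions of mine: that n means the number of most frequent words kept before counting, and that two words "co-occur" when they appear in the same poem. The function names (`tokenize`, `cooccurrences`) and the three toy lines are invented for illustration; they are not Lindemann's text.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (German letters included)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def cooccurrences(docs, top_n, stopwords=frozenset()):
    """Count how often pairs of the top_n most frequent words share a document.

    Assumption: "co-occurrence" here means "appears in the same poem".
    Returns (word frequencies, pair counts).
    """
    tokenized = [[t for t in tokenize(d) if t not in stopwords] for d in docs]
    freq = Counter(t for doc in tokenized for t in doc)
    keep = {word for word, _ in freq.most_common(top_n)}

    pairs = Counter()
    for doc in tokenized:
        # Each word counts once per document; sorting makes the pair order stable.
        present = sorted(set(doc) & keep)
        pairs.update(combinations(present, 2))
    return freq, pairs


# Hypothetical toy "poems", only to show the shape of the result.
toy_docs = [
    "herz und blut und mund",
    "blut und fleisch",
    "herz und hund",
]
toy_stopwords = frozenset({"und", "dass"})

freq, pairs = cooccurrences(toy_docs, top_n=2, stopwords=toy_stopwords)
print("tokens:", sum(freq.values()), "unique:", len(freq))
print("pairs:", dict(pairs))
```

Lowering `top_n` thins the network in the same way as going from 50 to 20 to 7 above: fewer words survive, so fewer pairs are left to draw. Passing "dass" in `stopwords` removes it before anything is counted.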
Rammstein Read by the Machine
The literary quality of these texts is evident. If you have studied German literature, you cannot ignore the allusions, not only to GDR songs and the Brothers Grimm, but also to Trakl and French Symbolism. But how can we analyze Rammstein texts? I want to have the machine read them. The programming languages R and Python offer a lot of packages with interesting methods for getting into the texts. I am curious about what machine learning can help us find out.