Ella Marija Lani Yelich-O'Connor, better known by her stage name "Lorde", is a New Zealand artist born in 1996. With two studio albums, "Pure Heroine" and "Melodrama", released in 2013 and 2017 respectively, she has established herself as one of the biggest names in international pop music - but what do her songs talk about when analyzed computationally? Here, we will use Natural Language Processing, a set of Data Science techniques for understanding text, to analyze the vocabulary, emotions and patterns in the artist's lyrics.
Across the two albums analyzed, Lorde released around 20 songs. How long are they "mathematically"? In this part, we will see how many words make up the lyrics of each song. Stopwords and words with fewer than 3 characters are excluded from this count.
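The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stopword list below is a tiny made-up sample (the list actually used in the analysis is not given), the function name `count_words` is hypothetical, and the sample sentence is placeholder text, not a lyric.

```python
import re

# Tiny illustrative stopword list (an assumption; the original list is not given).
STOPWORDS = {"the", "and", "you", "that", "are", "for", "with", "but", "not"}

def count_words(lyrics):
    """Return (total, distinct) counts of the words kept after filtering."""
    # Lowercase the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    # Drop stopwords and words with fewer than 3 characters.
    kept = [t for t in tokens if len(t) >= 3 and t not in STOPWORDS]
    return len(kept), len(set(kept))

# Placeholder text, not an actual lyric.
sample = "The city lights are bright and the city is loud, but we stay up"
total, distinct = count_words(sample)
print(total, distinct)
```

Returning both the total and the distinct count keeps the two notions of "length" separate: a song can be long yet repetitive when its total is high and its distinct count is low.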
Most of the songs have between 100 and 160 words, with only a few outside this range. The most "repetitive" song is Liability (Reprise), with a total of 34 words used, while Hard Feelings/Loveless is the most "diverse", with 201 distinct words. Both tracks are from the album "Melodrama". Whether the latter is truly the most diverse is debatable, as it is actually composed of two independent songs; because they are presented on the same track of the album, they were treated here as a single track.
Now that we understand the distribution of terms used, we have another question: What are the most used words in Lorde's compositions? To answer it, we will first look at a general summary of the two albums combined, and then see how those terms differ between the albums individually.
Analyzing both albums, the words "like", "love" and "know" are the most used in the singer's songs. But here we are analyzing all the songs together. What are the most used words in Melodrama songs? What about Pure Heroine?
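A ranking like this, overall and per album, can be sketched with Python's `collections.Counter`. The token lists below are placeholders chosen only to show the mechanics; they are not the real counts from the lyrics, and the names `tokens_by_album` and `top_terms` are hypothetical.

```python
from collections import Counter

# Hypothetical, already-filtered tokens per album (placeholder data, not real counts).
tokens_by_album = {
    "Pure Heroine": ["like", "love", "like", "know", "like", "love", "like", "people", "like"],
    "Melodrama": ["love", "supercut", "know", "love", "supercut", "know"],
}

def top_terms(tokens, n=3):
    """Return the n most frequent terms as (term, count) pairs."""
    return Counter(tokens).most_common(n)

# Ranking over both albums together.
all_tokens = [t for tokens in tokens_by_album.values() for t in tokens]
overall = top_terms(all_tokens)

# Ranking for each album separately.
per_album = {album: top_terms(tokens) for album, tokens in tokens_by_album.items()}

print(overall)
print(per_album)
```

Computing the combined and the per-album rankings side by side makes it easy to spot a term that leads overall only because one album uses it heavily.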
How different are the most used terms on each Lorde album?
You can see that "love" is used a lot in both albums, but "like", the most used term overall, doesn't even appear among Melodrama's most used words. Pure Heroine therefore heavily influences the combined ranking - not just for this term, but for many others. In Melodrama, "supercut" is used a lot, and it is also the name of one of the tracks on the album.
We now know which terms are most commonly used on each album, but that doesn't mean those terms are necessarily important. In Data Science, we have a metric called TF-IDF (term frequency-inverse document frequency), which measures how important a term is to one text relative to the rest of a collection of texts: a term scores high when it is frequent in that text but rare in the others. Basically, the TF-IDF is a score, and the higher the score of a term, the more significant it is for that work. What are the most important terms in each album according to the TF-IDF metric?
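To make the score concrete, here is a minimal TF-IDF sketch using only the standard library. It assumes one common formulation (term frequency as a share of the document's tokens, multiplied by the natural log of the number of documents divided by the number of documents containing the term); libraries differ in smoothing and normalization, and the article does not say which variant or which unit of "document" (song or album) was used. The documents below are hypothetical placeholders.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs maps a document name to its list of tokens; returns name -> {term: score}."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            # tf = share of the document's tokens; idf = ln(N / document frequency).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores

# Placeholder tokens for three hypothetical documents (not real lyrics).
docs = {
    "song_a": ["love", "boom", "boom", "night"],
    "song_b": ["love", "supercut", "supercut", "night"],
    "song_c": ["love", "club", "people", "people"],
}
scores = tf_idf(docs)
print(scores["song_a"])
```

In this toy data, "love" appears in every document, so its inverse document frequency is ln(1) = 0 and it scores zero everywhere, while "boom" is frequent in one document and absent from the rest, so it ranks highest there. That is the intuition behind the metric: frequent words are only "important" if they are also distinctive.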
Lorde's listeners will notice that individual songs weigh heavily on each album's most important terms. In Melodrama, words like "boom" and "supercut" continue to be very important in the album's "description", while in Pure Heroine we see high scores for words like "club", "people" and "love", which I personally think describe the theme of the album well.
The NRC Emotion Lexicon (Mohammad, S. and Turney, P., 2013), which we will refer to simply as "NRC", is a set of English terms linked to 8 distinct emotions (Joy, Sadness, Fear, Anger, Trust, Anticipation, Surprise and Disgust; Trust is referred to here as "confidence"), and also to two general sentiments, "positive" and "negative". Using it as a tool, we will better understand the feelings Lorde expresses in each of her studio albums.
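The way a lexicon like this is applied can be sketched as a simple lookup: each word in the lyrics is matched against the lexicon, and every category attached to it gets one more count. The lexicon below is a toy stand-in with invented entries; it is NOT copied from the actual NRC Emotion Lexicon, and the names `TOY_LEXICON` and `emotion_counts` are hypothetical.

```python
from collections import Counter

# Toy lexicon in a word -> categories shape. These entries are invented for
# illustration and are NOT taken from the actual NRC Emotion Lexicon.
TOY_LEXICON = {
    "love": {"joy", "positive"},
    "lost": {"sadness", "negative"},
    "fight": {"anger", "fear", "negative"},
    "friend": {"joy", "trust", "positive"},
}

def emotion_counts(tokens, lexicon):
    """Count how many times each emotion or sentiment category is triggered."""
    counts = Counter()
    for token in tokens:
        # A word can belong to several categories; words not in the lexicon add nothing.
        for category in lexicon.get(token, ()):
            counts[category] += 1
    return counts

# Placeholder tokens, not real lyrics.
tokens = ["love", "lost", "fight", "love", "city"]
counts = emotion_counts(tokens, TOY_LEXICON)
print(counts.most_common())
```

Because one word may carry several categories, the category totals can add up to more than the number of words, and words missing from the lexicon are silently ignored. Both effects are worth keeping in mind when comparing albums.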
As its name implies, "Melodrama" is a more dramatic and melancholic album, with "fear" and "sadness" being the most identified feelings in the compositions, followed by doses of "joy" and "anger". Pure Heroine is more hopeful and energetic, showing high doses of "confidence" and "anticipation" in its lyrics, also followed by "joy".
As explained earlier, the NRC also classifies many words as "positive" or "negative" in sentiment. The previous charts show that both albums have emotional fluctuations, so which terms contribute the most to each of these two sentiments? This is what we will analyze in this part of the study.
In Melodrama's so-called "positive" moments, Lorde talks a lot about love, inspiration, conversations and perfection, while the album's negative feelings are related to losses, departures and fights. Pure Heroine also talks about love and conversations, but adds friendship and kindness in its positive moments. Its negative feelings appear in situations of loss and error. In this second album, it is curious to notice that "teen" and "teens" contribute to both positive and negative feelings - which makes sense given the stage of life.