Analysis by Mikaela Schultz
Spotify is a digital music, podcast and video streaming service that gives you access to millions of songs and other content from artists all over the world. As of April 2019, Spotify had 217 million active users, including 100 million paying subscribers. Spotify pays royalties based on an artist's streams as a proportion of total streams on the service. When it was released, Spotify completely changed the music industry for the better. The application ranks top artists by their total number of streams, which is a fairly accurate measure of how successful each artist is. With this information, I completed a sentiment analysis of the top 10 most-streamed artists on Spotify to see whether the sentiment of their lyrics changed between their first and most recent albums.
Does musical success have an effect on the sentiment of lyrics?
In this project, the following libraries were used: tidyverse, tidytext, gridExtra, dplyr, genius, and wordcloud2.
To carry out the analysis, I needed information on the top streaming artists on Spotify. Rolling Stone published an article on October 10th, 2018 that referenced Spotify’s “Decade Of Discovery”. The pictograph in the article, created by Spotify, shows the most-streamed artists since the app was created. Below is a snapshot of the pictograph that shows the top 10 most-streamed artists of all time on Spotify.
The artists included are all from the US, with three exceptions. Although Ed Sheeran, Calvin Harris and The Weeknd are very popular in the US, Sheeran and Harris reside in the UK and The Weeknd resides in Canada. I was not able to find a complete list of Spotify’s most-streamed artists that was limited to US artists.
From this list, I was able to compile each artist’s first and most recent albums using Wikipedia’s discography pages. Next, I used the Genius API to pull the lyrics for these albums into R. The AFINN lexicon was used for this project. It assigns each word a score between -5 and 5, with negative scores indicating a negative sentiment and positive scores indicating a positive sentiment.
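As a minimal sketch of how the lexicon scoring works, the example below joins a table of word counts to a small sentiment table. The lexicon and counts here are hypothetical stand-ins (the scores are illustrative, not read from the real AFINN file), and the score column is called `score` to match the code in this post.

```r
library(dplyr)

# Hypothetical mini-lexicon in the AFINN layout: one word, one integer score.
# Scores are illustrative only; the real lexicon should be loaded with
# get_sentiments("afinn") from tidytext.
toy_lexicon <- data.frame(
  word  = c("love", "happy", "leave", "die"),
  score = c(3, 3, -1, -3),
  stringsAsFactors = FALSE
)

# Hypothetical word counts, shaped like the output of count(word, sort = TRUE).
toy_counts <- data.frame(
  word = c("love", "night", "die", "leave"),
  n    = c(12, 9, 5, 4),
  stringsAsFactors = FALSE
)

# inner_join() keeps only the words that appear in the lexicon,
# so "night" drops out because it carries no sentiment score.
toy_sentiment <- toy_counts %>%
  inner_join(toy_lexicon, by = "word")

print(toy_sentiment)
```

Words without a lexicon entry are silently dropped by the join, which is why the later charts only show words that carry a sentiment score.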
Below is one chart I created using Drake’s first album. I started by creating a variable D1, which contains all of the lyrics from Drake’s album “Thank Me Later”.
# Load the packages used in this step
library(tidyverse)
library(tidytext)
library(genius)

# Create a variable called D1 which contains all of the lyrics from Drake's album "Thank Me Later"
D1 <- genius_album(artist = "Drake", album = "Thank Me Later")

# Use the pipe operator to split the lyrics into words, drop stop words,
# and count the most popular words in the album
D1 %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE) -> D1Count

# Join the AFINN sentiment scores, then keep the top 20 most popular words
D1Count %>%
  inner_join(get_sentiments("afinn"), by = "word") -> D1Sentiment
D1Sentiment %>%
  head(20) -> D1Sentiment2

# Create a column called color that uses an ifelse statement to color the
# sentiment score red if it is below 0 and green otherwise.
# Note: newer versions of tidytext name the AFINN score column `value`
# rather than `score`; adjust the column name here and in the plot if needed.
D1Sentiment2$color <- ifelse(D1Sentiment2$score < 0, "red", "green")

# Create a bar graph that shows each sentiment score. The "color=color" in
# ggplot() and scale_color_identity() are what allow the graph to color the
# bars red and green based on the ifelse statement.
Drake1 <- ggplot(D1Sentiment2, aes(reorder(word, -n), score, color = color)) +
  geom_col(fill = "white") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Drake - Thank Me Later",
       x = "Top 20 Most Popular Words",
       y = "Sentiment Score") +
  theme(plot.title = element_text(size = 15, hjust = 0.5)) +
  scale_color_identity()
Drake1
This process was repeated for each album for all 10 artists, so each artist has this code twice - once for their first album and once for their most recent album.
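Because the same steps run 20 times, they could also be wrapped in a function. The sketch below is one way to do that, not the code used for this project: `top_sentiment_words()` is a hypothetical helper, the lyrics and lexicon are toy data so the example runs without calling Genius, and the lexicon column is assumed to be named `score`.

```r
library(dplyr)
library(tidytext)

# Hypothetical helper (not part of any package): takes a data frame with a
# `lyric` column and a lexicon with `word` and `score` columns, and returns
# the most frequent scored words with a color column for plotting.
top_sentiment_words <- function(lyrics, lexicon, n_words = 20) {
  lyrics %>%
    unnest_tokens(word, lyric) %>%            # one row per word
    anti_join(stop_words, by = "word") %>%    # drop common stop words
    count(word, sort = TRUE) %>%              # most frequent words first
    inner_join(lexicon, by = "word") %>%      # keep only scored words
    head(n_words) %>%
    mutate(color = ifelse(score < 0, "red", "green"))
}

# Toy inputs standing in for genius_album() output and the AFINN lexicon.
toy_lyrics <- data.frame(
  lyric = c("love love love the night", "leave me and die", "love to leave"),
  stringsAsFactors = FALSE
)
toy_lexicon <- data.frame(
  word  = c("love", "leave", "die"),
  score = c(3, -1, -3),
  stringsAsFactors = FALSE
)

result <- top_sentiment_words(toy_lyrics, toy_lexicon)
print(result)
```

With a helper like this, each album would need only one call, for example passing the result of `genius_album()` as `lyrics` and `get_sentiments("afinn")` as `lexicon` (renaming the score column if the installed lexicon uses `value`).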
For viewing purposes, I used the gridExtra package to group the artists by genre and examine the sentiments for each. Each artist has two bar graphs. The top graph represents the artist’s first album and the bottom graph represents the artist’s most recent album. I also created word clouds for each genre to visualize the most used words in each artist’s first and most recent albums. This was done by creating a data frame that included the word counts for each artist’s first and most recent albums, which was then passed to the word cloud function. Note that because words like “love” and “yeah” were popular across all genres, they were left out of each word cloud.
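The layout and word cloud preparation described above can be sketched as follows. This is an illustration under assumptions: the word counts are made up, `make_plot()` is a hypothetical helper, and the exact code used for the project may differ.

```r
library(dplyr)
library(ggplot2)
library(gridExtra)

# Hypothetical word counts for one artist's first and most recent albums.
first_album <- data.frame(
  word = c("love", "yeah", "heart", "life"),
  n    = c(30, 25, 12, 9),
  stringsAsFactors = FALSE
)
recent_album <- data.frame(
  word = c("love", "life", "wanna", "hold"),
  n    = c(22, 14, 11, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper that draws one bar chart per album.
make_plot <- function(df, title) {
  ggplot(df, aes(reorder(word, -n), n)) +
    geom_col() +
    labs(title = title, x = "Word", y = "Count")
}

# Stack the first album on top of the most recent album in one column.
# arrangeGrob() builds the layout without drawing it;
# grid::grid.draw(stacked) or grid.arrange() would render it.
stacked <- arrangeGrob(
  make_plot(first_album, "First album"),
  make_plot(recent_album, "Most recent album"),
  ncol = 1
)

# Build the word cloud input: combine both albums, drop the words that
# dominate every genre, and sum the counts per word.
cloud_data <- bind_rows(first_album, recent_album) %>%
  filter(!word %in% c("love", "yeah")) %>%
  group_by(word) %>%
  summarise(freq = sum(n)) %>%
  arrange(desc(freq)) %>%
  as.data.frame()

print(cloud_data)
# wordcloud2::wordcloud2(cloud_data)  # uncomment to render the word cloud
```

The word cloud function reads the word from the first column and its frequency from the second, which is why the summed counts are stored in a `freq` column next to `word`.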
Included in this genre are three pop artists - Ed Sheeran, Justin Bieber and Ariana Grande.
Ed Sheeran’s first album, “+”, came out in 2011. The sentiment analysis shows its sentiments evenly split between positive and negative. His most recent album, “Divide”, however, appears to have a large increase in words with a positive sentiment value. Justin Bieber and Ariana Grande show similar patterns, with similar positive and negative ratings for both of their albums. It is interesting to note that for all 6 albums in this genre, the most popular word had a positive sentiment rating. Besides love, the most popular words in the pop category are hold, wanna, life and heart.
Drake, The Weeknd, and Rihanna were the artists included in this genre.
The difference in sentiments for this genre is glaring compared to the rest. The Weeknd shows a stark shift between positive and negative sentiments over time. His first album, “Kiss Land”, came out in 2013. Among the top lyrics throughout the album are love, sh*t, diamond, leave and die. The most popular words in the album are split between positive and negative sentiment. However, The Weeknd’s most recent album, “Starboy”, which came out in 2016, had mostly negative sentiment ratings: 16 of the top 20 words throughout the album had a negative sentiment rating. This is quite shocking. Drake and Rihanna follow a similar pattern to The Weeknd, just not as pronounced. All three artists go from fairly even sentiments to predominantly negative sentiments across their most popular words. Additionally, the word cloud shows that “life” was the most popular word across all 6 albums.
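A comparison such as "16 of the top 20 words were negative" can be computed directly from the scored word tables. The sketch below uses made-up scores and a hypothetical helper, `summarise_polarity()`, and assumes the score column is named `score`.

```r
library(dplyr)

# Hypothetical helper (not part of any package): counts the positive and
# negative words in one album's table of scored top words.
summarise_polarity <- function(sentiment, album) {
  data.frame(
    album    = album,
    words    = nrow(sentiment),
    negative = sum(sentiment$score < 0),
    positive = sum(sentiment$score > 0),
    stringsAsFactors = FALSE
  )
}

# Toy scored words standing in for two albums' top-word tables.
first_album  <- data.frame(word = c("love", "leave", "nice", "die"),
                           score = c(3, -1, 3, -3))
recent_album <- data.frame(word = c("lost", "die", "leave", "love"),
                           score = c(-3, -3, -1, 3))

# One row per album, plus the share of top words that are negative.
polarity <- bind_rows(
  summarise_polarity(first_album, "first"),
  summarise_polarity(recent_album, "most recent")
) %>%
  mutate(share_negative = negative / words)

print(polarity)
```

Putting both albums in one table makes the shift from an even split to a mostly negative one easy to read off for each artist.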