Rows: 132
Columns: 4
$ `Artist ` <chr> "Taylor Swift ", "Taylor Swift ", "Taylor Swift ", "Taylor S…
$ Album <chr> "Taylor Swift ", "Taylor Swift ", "Taylor Swift ", "Taylor S…
$ `Title ` <chr> "Tim McGraw", "Picture to Burn", "Teardrops on my Guitar ", …
$ Lyrics <chr> "He said the way my blue eyes shinx\nPut those Georgia stars…
4. Tidy the Data
Stop words need custom handling for this data. Because lyrics lean heavily on traditional stop words that carry meaning within a song, I chose to filter out only stop words of three characters or fewer (using the snowball lexicon).
I’d like to visualize the bigrams, or word pairs, in each artist’s lyrics, excluding any pair that simply repeats the same word. I also considered removing numbers, but since I plan to visualize only the top bigrams, they won’t be a factor.
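The tidying described above might look roughly like the sketch below. It runs on a small made-up tibble (`toy_lyrics`) rather than the real data. Two choices in it are my assumptions, not something the post confirms: a bigram is dropped if either of its words is a short stop word, and `percent` is each bigram's share of the remaining bigrams.

```r
library(dplyr)
library(tidyr)
library(tidytext)

# Hypothetical stand-in for the real lyrics data (one row per song)
toy_lyrics <- tibble(
  Lyrics = c(
    "Oh oh baby, love me now\nLove me now and forever",
    "Baby love me like the summer rain"
  )
)

# Snowball stop words of three characters or fewer
short_stop_words <- stop_words %>%
  filter(lexicon == "snowball", nchar(word) <= 3)

tok_toy_lyrics <- toy_lyrics %>%
  # split lyrics into overlapping word pairs (lowercased, punctuation stripped)
  unnest_tokens(bigram, Lyrics, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%
  separate(bigram, into = c("word1", "word2"), sep = " ") %>%
  # assumption: drop the pair if either word is a short stop word
  filter(
    !word1 %in% short_stop_words$word,
    !word2 %in% short_stop_words$word
  ) %>%
  # drop pairs that repeat the same word, e.g. "oh oh"
  filter(word1 != word2) %>%
  mutate(bigram = paste(word1, word2)) %>%
  count(bigram, sort = TRUE) %>%
  # assumption: percent is the share of all remaining bigrams
  mutate(percent = n / sum(n))

tok_toy_lyrics
```

Applying the same steps to each artist's lyrics would produce tables shaped like `tok_taylor_swift_lyrics` and `tok_beyonce_lyrics`, with the `bigram` and `percent` columns that the plotting code expects.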
my_theme <- theme(
  # choose font family
  text = element_text(family = 'Charm', size = 14, color = 'gray98'),
  axis.ticks = element_blank(),
  plot.background = element_rect(fill = "gray20", color = "gray20"),
  panel.background = element_rect(fill = "gray20", color = "gray20"),
  panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.border = element_blank(),
  axis.line.x = element_line(color = 'gray98'),
  axis.title.x = element_text(color = 'gray98'),
  axis.text.y = element_blank(),
  axis.text.x = element_text(color = 'gray98'),
  legend.position = 'none'
)
6. Plot
tswift_viz <- tok_taylor_swift_lyrics %>%
  slice_max(percent, n = 10) %>%
  ggplot(., aes(x = percent, y = reorder(bigram, percent))) +
  geom_col(fill = 'cyan3') +
  geom_text(aes(label = bigram), hjust = 1.5, size = 12, color = 'gray98', family = 'Charm', fontface = 'bold') +
  labs(x = 'Word Count (% of corpus)', y = '') +
  scale_x_continuous(labels = percent) +
  my_theme

beyonce_viz <- tok_beyonce_lyrics %>%
  slice_max(percent, n = 10) %>%
  ggplot(., aes(x = percent, y = reorder(bigram, percent))) +
  geom_col(fill = 'green3') +
  geom_text(aes(label = bigram), hjust = 1.5, size = 12, color = 'gray98', family = 'Charm', fontface = 'bold') +
  labs(x = 'Word Count (% of corpus)', y = '') +
  scale_x_continuous(labels = percent) +
  my_theme

p <- tswift_viz + beyonce_viz +
  plot_annotation(
    title = glue("Most frequently used bigrams in <span style='color:cyan3'>**Taylor Swift**</span> and <span style='color:green3'>**Beyonce**</span> songs"),
    subtitle = 'Top 10 bigrams by artist',
    caption = '<br>Tidy Tuesday Week 40 (2020)<br>Created by @mickey_rafa',
    theme = theme(
      plot.title = element_textbox_simple(size = rel(4), family = 'Charm', face = 'bold', color = 'gray98', margin = margin(t = 10)),
      plot.subtitle = element_textbox_simple(size = rel(2.5), family = 'Charm', face = 'bold', color = 'gray98'),
      plot.caption = element_textbox(size = rel(2), color = 'gray98', family = 'Charm', lineheight = .3),
      plot.background = element_rect(fill = "gray20", color = NA),
      panel.background = element_rect(fill = "gray20"),
      plot.margin = unit(c(0, 0, 0, 0), "pt")
    )
  )
7. Save
# Save the plot as PNG
ggsave(
  filename = glue("tt_{tt_year}_{tt_week}.png"),
  plot = p,
  width = 6,
  height = 4,
  units = "in",
  dpi = 320
)

# make thumbnail for page
magick::image_read(glue("tt_{tt_year}_{tt_week}.png")) %>%
  magick::image_resize(geometry = "400") %>%
  magick::image_write(glue("tt_{tt_year}_{tt_week}_thumbnail.png"))