Title: | All Things Data and Springsteen |
---|---|
Description: | An R data package containing setlists from all Bruce Springsteen concerts over 1973-2021. Also includes all his song details such as lyrics and albums. Data extracted from: <http://brucebase.wikidot.com/>. |
Authors: | Joey O'Brien [aut, cre] |
Maintainer: | Joey O'Brien <[email protected]> |
License: | CC0 |
Version: | 0.1.0 |
Built: | 2025-01-31 03:24:39 UTC |
Source: | https://github.com/obrienjoey/springsteen |
Metadata for concerts played by Bruce Springsteen, both solo and with numerous bands, from 1973 to the present day. Can be joined with setlists using gig_key.
concerts
A data frame with 6 variables:
Primary key of the data frame.
Date of the concert.
Full location of concert including venue name.
State concert was performed in (if in USA).
City in which the concert was performed (if not in USA).
Country concert was performed in.
library(dplyr)

# What countries have been played in the most?
concerts %>%
  count(country, sort = TRUE)

# What decade did most shows take place in?
library(lubridate)
concerts %>%
  select(date) %>%
  mutate(decade = (year(date) %/% 10) * 10) %>%
  count(decade)
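The join on gig_key mentioned in the description can be sketched as follows. This is a minimal illustration in base R, not taken from the package: concerts_toy and setlists_toy are hypothetical stand-ins with placeholder values that only mimic the gig_key, country, and song columns, so the snippet runs without the package installed.

```r
# Hypothetical stand-ins for the package's `concerts` and `setlists` data;
# keys and values are placeholders, not real records.
concerts_toy <- data.frame(
  gig_key = c("gig_1", "gig_2"),
  country = c("USA", "Ireland"),
  stringsAsFactors = FALSE
)
setlists_toy <- data.frame(
  gig_key = c("gig_1", "gig_1", "gig_2"),
  song    = c("Song A", "Song B", "Song A"),
  stringsAsFactors = FALSE
)

# Attach concert metadata to every song played, matching rows on gig_key.
played <- merge(setlists_toy, concerts_toy, by = "gig_key")

# Count how many song performances fall in each country.
plays_per_country <- table(played$country)
print(plays_per_country)
```

With dplyr loaded, inner_join(setlists_toy, concerts_toy, by = "gig_key") gives the same rows, and the same pattern should carry over to the real concerts and setlists data frames.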
Metadata for the setlists of concerts played by Bruce Springsteen both solo and with numerous bands from the years 1973 to present day.
setlists
A data frame with 4 variables:
Key associated with the concert which the setlist is from.
Key associated with the song played.
Name of the song played.
Order of appearance for the song in the setlist.
library(dplyr)

# what are the top five most played songs?
setlists %>%
  count(song, sort = TRUE) %>%
  slice(1:5)

# what is the average show length?
setlists %>%
  count(gig_key) %>%
  summarise(ave_length = mean(n))
Data describing all songs which have been played by Bruce Springsteen, both solo and with numerous bands, from 1973 to the present day. Can be joined with setlists using song_key.
songs
A data frame with 4 variables:
Primary key of the data frame.
Title of the song.
Lyrics of the song if available in the database.
Name of the album on which the song appears if available in the database.
library(dplyr)

# What are the most common albums?
songs %>%
  filter(!is.na(album)) %>%
  count(album, sort = TRUE)

# What word occurs most frequently in the lyrics from the album 'Born To Run'
library(tidytext)
songs %>%
  filter(album == 'Born To Run') %>%
  select(title, lyrics) %>%
  unnest_tokens(word, lyrics) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words, by = 'word')
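The song_key join mentioned in the description can be sketched in the same spirit, for instance to ask which albums are performed most often live. This is only an illustration in base R: songs_toy and setlists_toy are hypothetical stand-ins with placeholder values that mimic the song_key, album, and gig_key columns, not real package data.

```r
# Hypothetical stand-ins for the package's `songs` and `setlists` data;
# keys, albums and rows are placeholders, not real records.
songs_toy <- data.frame(
  song_key = c("song_a", "song_b", "song_c", "song_d"),
  album    = c("Album X", "Album X", "Album Y", NA),
  stringsAsFactors = FALSE
)
setlists_toy <- data.frame(
  gig_key  = c("gig_1", "gig_1", "gig_1", "gig_2", "gig_2"),
  song_key = c("song_a", "song_b", "song_d", "song_a", "song_c"),
  stringsAsFactors = FALSE
)

# Attach album information to every performance, matching rows on song_key.
performances <- merge(setlists_toy, songs_toy, by = "song_key")

# Drop performances of songs with no album recorded, then count per album.
performances <- performances[!is.na(performances$album), ]
album_plays <- sort(table(performances$album), decreasing = TRUE)
print(album_plays)
```

Filtering out missing albums before counting mirrors the filter(!is.na(album)) step in the examples above.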
Data describing the tours associated with concerts played by Bruce Springsteen, both solo and with numerous bands, from 1973 to the present day. Concerts prior to 1973 and non-tour shows (e.g., practice shows and promotion shows) have been removed. Some shows are associated with more than one tour, so some entries from concerts appear twice. Can be joined with setlists or concerts using gig_key.
tours
A data frame with 2 variables:
Primary key of the data frame.
Tour associated with the concert. Note some concerts have more than one tour associated with them.
library(dplyr)

# How many shows were on each tour?
tours %>%
  count(tour, sort = TRUE)
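Because a concert can be associated with more than one tour, row counts in tours are not the same as concert counts, and joining on gig_key repeats those concerts. A minimal base R sketch of how one might check for this, using a hypothetical tours_toy data frame with placeholder values in place of the real data:

```r
# Hypothetical stand-in for the package's `tours` data;
# keys and tour names are placeholders, not real records.
tours_toy <- data.frame(
  gig_key = c("gig_1", "gig_2", "gig_2"),
  tour    = c("Tour A", "Tour A", "Tour B"),
  stringsAsFactors = FALSE
)

# Number of rows versus number of distinct concerts.
n_rows     <- nrow(tours_toy)
n_concerts <- length(unique(tours_toy$gig_key))

# Concerts that are listed under more than one tour.
tours_per_gig <- table(tours_toy$gig_key)
multi_tour_gigs <- names(tours_per_gig)[tours_per_gig > 1]

print(c(rows = n_rows, concerts = n_concerts))
print(multi_tour_gigs)
```

When counting concerts after a join with tours, counting distinct gig_key values rather than rows avoids double counting these shows.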
Checks whether new data is available in the development version of the package (GitHub). If new data is available, the function lets the user update the datasets.
update_data()
## Not run:
update_data()
## End(Not run)