Data Analysis: The Eleven Myriad Things (110,000 Songs)

As promised, herewith begins the data analysis of my first 110,000 songs heard through iTunes.

Top Level

Saving the caveats for later, we dive in with the statement that I have listened at least once to 110,000 tracks in iTunes. Crossing this milestone occurred just before 5 PM on April 17, 2019. Remaining to be heard at that point were 81,831 audio tracks, meaning 57.34% of my listenable* collection has been heard. The heard files occupy 705.21 GB of data, and span 442 days, 10 hours, 37 minutes, and 36 seconds in length. The unheard tracks take up 567.17 GB of disk space, and would take 326 days, 13 hours, 38 minutes, and 42 seconds of uninterrupted time to hear in total. Dividing the total time to hear the 110,000 tracks (a little over a year and two-and-a-half months) by the number of tracks gives us an average track length of 5 minutes and 48 seconds, though that summary average is a trifle deceptive, as we shall see.

Average Track Length: 5:48

The average file size is 6.57 MB, though these have an even wider variance than the length of the files in question.

The average song rating for those tracks which I have rated is 3-2/3 stars, using iTunes’ much-deprecated rating system.

Average Track Rating: 3.67

Not all songs have been rated: I do not rate tracks designed to impart instruction in a foreign language (including Old English), and I have only begun consistently rating songs in more recent years (see caveats for issues with this datapoint). Only 142 tracks fall into the Language genre, leaving 32,830 songs I’ve heard without rating them.

 

Total Songs Rated: 77,028

The vast majority of the eleven myriad† songs here were listened to only one time. 91,172 tracks were heard only once. Of the remaining 18,828 songs, the majority (10,881) were heard twice. The numbers drop off drastically after that as the play count increases, with only 760 songs being heard more than 12 times. (There are some issues with the higher play counts, explained in caveats below.)
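The subtractions behind the rating and play-count figures above are easy to reproduce. What follows is a minimal Python sketch that uses only the totals quoted in the text; the variable names are invented for illustration, and the original numbers came out of Excel rather than this code.

```python
# Totals quoted in the text above.
TOTAL_HEARD = 110_000
RATED = 77_028
LANGUAGE_TRACKS = 142   # language-instruction tracks, deliberately left unrated
HEARD_ONCE = 91_172
HEARD_TWICE = 10_881

# Heard tracks that carry no rating at all.
unrated = TOTAL_HEARD - RATED
# Unrated tracks that are not language lessons.
unrated_songs = unrated - LANGUAGE_TRACKS

# Tracks played more than once, and those played three or more times.
heard_more_than_once = TOTAL_HEARD - HEARD_ONCE
heard_three_plus = heard_more_than_once - HEARD_TWICE

# Share of the heard tracks that were played exactly once.
share_heard_once = round(100 * HEARD_ONCE / TOTAL_HEARD, 2)

print(unrated_songs)         # 32830
print(heard_more_than_once)  # 18828
print(heard_three_plus)      # 7947
print(share_heard_once)      # 82.88
```

The one figure here not stated in the text is the 7,947 tracks heard three or more times, which is simply what remains once the twice-heard tracks are set aside.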

With all due caveats about the earliest play dates, it took 5,935 days to listen to these 110K tracks, that is to say, 16 years and 3 months (plus 3 days as change). Thus, I listened to, on average, eighteen-and-a-half songs per day. As will be seen, this listening rate fluctuated drastically over the course of the past sixteen years, though the overall rate is remarkably similar to the rate for the most recent 10,000 songs heard.

18.53 New Songs Heard per Day

Conversely, although only 442 days’ worth of music were listened to during that 5,935-day period, when we summarize based on the number of plays, we find that I’ve listened to just under 600 days of sound files of one species or another. (The actual figure is 599 days, 2 hours, 39 minutes, and 5 seconds, but who’s counting?) Spread over those 5,935 days, this means that I listened to iTunes (on average) just over two hours and twenty-five minutes each day.

2 hours, 25.36 minutes per day Listening to Stuff
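For those who would rather check the top-level averages than take them on faith, here is a minimal Python sketch. It works only from the totals quoted above (tracks heard and unheard, total length, total played time, and elapsed days); the names are invented for illustration, and this is not how the original figures were produced.

```python
from datetime import timedelta

# Totals quoted in the text above.
TRACKS_HEARD = 110_000
TRACKS_UNHEARD = 81_831
DAYS_ELAPSED = 5_935

# Combined length of the heard tracks, each counted once.
heard_length = timedelta(days=442, hours=10, minutes=37, seconds=36)
# Combined length when every play of every track is counted.
played_length = timedelta(days=599, hours=2, minutes=39, seconds=5)


def as_minutes_seconds(seconds: float) -> str:
    """Format a duration in seconds as m:ss, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


avg_track_seconds = heard_length.total_seconds() / TRACKS_HEARD
songs_per_day = TRACKS_HEARD / DAYS_ELAPSED
share_heard = 100 * TRACKS_HEARD / (TRACKS_HEARD + TRACKS_UNHEARD)
# Dividing a timedelta by an integer yields the average listening time per day.
listening_per_day = played_length / DAYS_ELAPSED
minutes_per_day = listening_per_day.total_seconds() / 60

print(as_minutes_seconds(avg_track_seconds))  # 5:48
print(round(songs_per_day, 2))                # 18.53
print(round(share_heard, 2))                  # 57.34
print(int(minutes_per_day // 60), "hours,", round(minutes_per_day % 60, 2), "minutes")
```

Using `timedelta` keeps the days-hours-minutes bookkeeping out of the arithmetic, which is where a slide rule tends to slip.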

Top Level Genre Information

Though I tried to scrub this data as effectively as possible (see caveats below), I found that my Genre information contained much garbage, and then found as I tried to correct that problem that this was and is an endless task, so I decided to end the task, and present you with the information as I have it now. You may feel free to object all you wish to the stuff presented here, and I look forward to your vicious attacks upon this and any other useless data.

That said, the songs or other audio already heard fall into 89 different genres. Fully one quarter of all tracks are classified as Rock (27,639 files). The next two most populated genres are Radio Show and the bipolar and not quite useful Alternative & Punk categories, each making up approximately 6% of all tracks already heard (6,773 and 6,567 examples, respectively). I am endeavoring to separate ‘Alternative’ from ‘Punk’, and hope to complete that before another ten thousand songs are heard; currently 4,421 files are classed as simply Punk, while another 3,009 are labeled as Alternative. If those were added to the almost pointless ‘Apples & Oranges’ category mentioned before, that would make the combined weight of these tracks equal to over 12.5% of all heard files. Also of note are 3,109 files for which no genre information is attached. Looking at those files for which at least 1,000 examples are found, the breakdown by genre is as follows:

Genre Count %
Rock 27,639 25.13%
Radio Show 6,773 6.16%
Alternative & Punk 6,567 5.97%
Pop 5,771 5.25%
Jazz 4,614 4.19%
Punk 4,421 4.02%
Folk 4,314 3.92%
Country 3,985 3.62%
Classical 3,589 3.26%
World 3,191 2.90%
(blank) 3,109 2.83%
Blues 3,066 2.79%
Alternative 3,009 2.74%
Soundtrack 2,341 2.13%
Electronica/Dance 2,315 2.10%
Hip Hop/Rap 1,998 1.82%
R&B 1,864 1.69%
Latin 1,767 1.61%
Easy Listening 1,340 1.22%
Analog CyberPunk 1,193 1.08%
Spoken Word 1,114 1.01%
Metal 1,093 0.99%
Gospel & Religious 1,092 0.99%
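The percentage column in the table above is each genre's count divided by the 110,000 tracks heard. A minimal Python sketch of that calculation follows, using a handful of counts copied from the table; the function and variable names are invented for illustration.

```python
TOTAL_HEARD = 110_000

# A few counts copied from the genre table above.
genre_counts = {
    "Rock": 27_639,
    "Radio Show": 6_773,
    "Alternative & Punk": 6_567,
    "Punk": 4_421,
    "Alternative": 3_009,
}


def share(count: int, total: int = TOTAL_HEARD) -> float:
    """Return count as a percentage of total, rounded to two places."""
    return round(100 * count / total, 2)


for genre, count in genre_counts.items():
    print(f"{genre}: {share(count)}%")

# Combined weight of the three overlapping Alternative/Punk labels.
combined = sum(genre_counts[g] for g in ("Alternative & Punk", "Punk", "Alternative"))
print(combined, share(combined))  # 13997 12.72
```

The combined figure bears out the claim of "over 12.5%" for the three Alternative and Punk labels taken together.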

And of course there’s a picture for those visually-minded among you:

 

Top Level Popular Artists

One way of determining the most popular artists is by looking at the number of songs played during the eleven myriad tracks I’ve listened to. Doing so gives us the following breakdown:

Most Popular Artists by Number of Songs Played
Artist (#1-#25) Songs % Artist (#26-#50) Songs %
Bob Dylan 3,439 3.13% Theater Five 253 0.23%
{unknown} 1,435 1.30% Emmylou Harris 251 0.23%
CBS Radio Mystery Theater 1,049 0.95% Ella Fitzgerald 249 0.23%
The Beatles 1,023 0.93% Led Zeppelin 247 0.22%
The Grateful Dead 1,021 0.93% Pink Floyd 242 0.22%
Wanda Jackson 774 0.70% {sound effects} 239 0.22%
Johann Sebastian Bach 756 0.69% Eric Clapton 238 0.22%
Lux Radio Theatre 670 0.61% Green Day 235 0.21%
Jerry Garcia 560 0.51% Duke Ellington 232 0.21%
Johnny Cash 518 0.47% Frank Sinatra 231 0.21%
Neil Young 494 0.45% Leonard Cohen 231 0.21%
The Rolling Stones 481 0.44% The Bevis Frond 226 0.21%
Wolfgang Amadeus Mozart 335 0.30% Radiohead 216 0.20%
David Bowie 327 0.30% Ludwig van Beethoven 215 0.20%
The Byrds 326 0.30% Pete Seeger 204 0.19%
The Clash 313 0.28% R.E.M. 200 0.18%
The Ramones 308 0.28% Talking Heads 200 0.18%
Jimi Hendrix 296 0.27% Lloyd Cole 199 0.18%
The Green Hornet 281 0.26% Richard Wagner 197 0.18%
William Conrad 278 0.25% The Simpsons 186 0.17%
The Who 276 0.25% Jack Webb 183 0.17%
Suspense 274 0.25% Electric Light Orchestra 181 0.16%
Bruce Springsteen 263 0.24% Jethro Tull 181 0.16%
The Beach Boys 254 0.23% Tom Waits 181 0.16%
Elvis Costello 253 0.23% Earle Graser 180 0.16%

There are, however, other ways of looking at this data. We can, for example, look at the total number of plays for each artist to garner a different view of my biggest faves, for there are bound to be some I love but for whom I don’t have lots of tracks, or simply for whom there were never that many songs released to begin with, but which I listen to over and over again. When I looked upon this data from this perspective, malheureusement, I found that the issues surfaced above and noted in the caveats regarding higher play counts washed away any useful information. I next attempted to use ratings as a guide (despite the caveats), but that merely rearranged the data already given in the table above without surfacing any really new insights. I also tried using these same factors with an additional weighting by length of each track, but that merely promoted the radio shows and other artists with longer songs (or at least longer noodling; I’m looking at you, Jerry).

There were a few more revelations to come when we looked at the popular artists over time, which data are presented in the next section.

 

The View Over Time

The revelations of these 110,000 datapoints of sound attain sharper perspective when viewed over time. Here we learn some surprises and also see how drastically my listening habits have changed over the past 20K tracks. Three main factors appear: consistent average song length until the past few years, increasing file size, and widely varying listening rates until recently. The following chart gives a high-level overview (broken out by successive groups of ten thousand songs heard) of these and other aggregate data.

Summary Data Over Time
Total Songs Played Data Size Avg Track Length Begin/End Dates Days to Reach 10K Songs Avg Songs per Day
10K 49.28 GB 3:36 1/15/2003, 2:43 AM to 3/28/2005, 11:35 PM 803 12.45
20K 49.44 GB 3:43 3/28/2005, 11:38 AM to 12/5/2005, 9:32 AM 252 39.68
30K 49.83 GB 3:51 12/5/2005, 9:47 AM to 2/24/2007, 1:46 PM 446 22.42
40K 46.84 GB 3:54 2/24/2007, 1:53 PM to 8/25/2010, 7:32 PM 1,278 7.82
50K 49.61 GB 3:51 8/25/2010, 7:42 PM to 4/13/2013, 1:33 PM 962 10.40
60K 56.91 GB 3:46 4/13/2013, 1:33 PM to 4/18/2014, 7:20 PM 370 27.03
70K 68.35 GB 3:51 4/19/2014, 7:28 AM to 10/23/2014, 3:11 PM 187 53.48
80K 63.52 GB 3:57 10/23/2014, 3:16 PM to 9/22/2015, 11:38 AM 334 29.94
90K 66.1 GB 4:19 9/22/2015, 11:40 AM to 3/20/2016, 10:42 AM 180 55.56
100K 104.71 GB 11:57 3/20/2016, 10:54 AM to 10/18/2017, 2:45 PM 577 17.33
110K 100.61 GB 16:58 10/18/2017, 3:13 PM to 4/17/2019, 4:51 PM 546 18.32
Totals 705.21 GB 5:48 1/15/2003, 2:43 AM to 4/17/2019, 4:51 PM 5,935 (for 110K) 18.53
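The day counts and daily rates in the table above follow directly from the begin and end dates of each tranche. Here is a minimal Python sketch that recomputes them, on the assumption that the table counts whole calendar days and ignores the time of day; the names are invented for illustration.

```python
from datetime import date

SONGS_PER_TRANCHE = 10_000

# (label, begin date, end date) copied from the table above; times of day dropped.
tranches = [
    ("10K", date(2003, 1, 15), date(2005, 3, 28)),
    ("20K", date(2005, 3, 28), date(2005, 12, 5)),
    ("30K", date(2005, 12, 5), date(2007, 2, 24)),
    ("40K", date(2007, 2, 24), date(2010, 8, 25)),
    ("50K", date(2010, 8, 25), date(2013, 4, 13)),
    ("60K", date(2013, 4, 13), date(2014, 4, 18)),
    ("70K", date(2014, 4, 19), date(2014, 10, 23)),
    ("80K", date(2014, 10, 23), date(2015, 9, 22)),
    ("90K", date(2015, 9, 22), date(2016, 3, 20)),
    ("100K", date(2016, 3, 20), date(2017, 10, 18)),
    ("110K", date(2017, 10, 18), date(2019, 4, 17)),
]

# Calendar days spent on each tranche, and the resulting songs-per-day rate.
days = {label: (end - begin).days for label, begin, end in tranches}
rates = {label: round(SONGS_PER_TRANCHE / d, 2) for label, d in days.items()}

for label in days:
    print(label, days[label], rates[label])

total_days = sum(days.values())
print(total_days, round(len(tranches) * SONGS_PER_TRANCHE / total_days, 2))  # 5935 18.53
```

Summing the per-tranche day counts gives the 5,935 days used for the overall rate.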

 

The primary change to my listening habits, which is apparent in the chart above, has been the introduction of large swathes of radio shows to my listening budget. These old shows are freely available on Archive.org and many other sites, and you can find them easily by searching on the term “OTR” (for “Old Time Radio” — although all radio narrative content is now ‘Old Time’, since ‘New Time Radio’ is now called “podcasts”). Several factors impelled the addition of this material to my frequent listening, the primary one being the material changes to my conditions of existence on this vale of tears some few years back. Naturally, these tracks have added substantially to the Average Track Length of my songs heard, as most shows are at least a half hour in length, with some exceptional cases lasting over two-and-a-half hours.‡ Although a very small number of radio shows were heard in earlier years, this category of audio file entered heavy rotation at the end of August in 2015, and has been a consistent part of my iTunes diet since that time.

One other change in — not my listening, but — my ripping habits has been to convert songs into higher bitrate mp3 files. I’ve done this because of the larger hard disks available, as well as the supposed benefits which accrue from the bigger files. I have to confess, however, that my hearing is possibly too poor to notice the difference, though even I can hear the ‘tin can’ effect of a few files I’ve grabbed which were ripped at 32 or even 16 (horrors!) kbps. I generally use 320 kbps, and I do continue to rip as mp3, primarily for ease of portability and future-proofing (to the extent that that is possible).

Lastly, I see that though the aggregate numbers show a remarkable consistency (indeed, the overall value for songs per day matches almost preternaturally well with the same datapoint for the past two tranches of 10K songs), my listening shows long lulls as well as frantic listening. The difference between listening to eight songs a day versus over fifty-five a day, particularly over such a long stretch as 10,000 songs, is quite substantial. A quick back-of-the-envelope (have you considered getting your statements online?) calculation reveals that the last figure, for the 81st-90th thousand songs, multiplied by the average song length during that period of four and a third minutes, means I was listening to four hours of iTunes each and every day. Perhaps not that remarkable to some, but it impresses me, though with what I am still unclear.
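The back-of-the-envelope figure can be worked out without the envelope. The Python sketch below multiplies a tranche's songs per day by its average track length, using the values from the summary table; the function name is invented for illustration, and since the rate counts only first hearings, the result says nothing about repeat plays.

```python
def daily_listening_hours(songs_per_day: float, avg_length: str) -> float:
    """Hours of audio per day, given a daily song rate and an m:ss average length."""
    minutes, seconds = (int(part) for part in avg_length.split(":"))
    return songs_per_day * (minutes * 60 + seconds) / 3600


# The 81st-90th thousand songs: 55.56 songs per day at 4:19 apiece.
busy = daily_listening_hours(55.56, "4:19")
# For contrast, the 31st-40th thousand: 7.82 songs per day at 3:54 apiece.
quiet = daily_listening_hours(7.82, "3:54")

print(round(busy, 2))   # 4.0
print(round(quiet, 2))  # 0.51
```

The contrast is starker in hours than in songs: about four hours a day in the busiest stretch against roughly half an hour in the quietest.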

Listening Rates

Obviously not only the underlying material but also the environmental factors have changed over this 4/5ths of a score of years. That is, at times I had little ability to listen to my own music in the car, or I was just not driving that much, while at other times I did nothing but listen to my tunes. Or, I listened to iTunes at work, or at times the work was so intricate and involved that music would have been only a distraction. These and many other factors lie behind the differing song rates shown in the table above.

Looking more closely at the data by slicing not at each 10,000 but at each 1,000 songs presents a more nuanced picture. We see a long stretch where my listening dropped to almost single digits per day. Before and after that lengthy period of time (which lasted from approximately April 2006 through January 2013) the listening rate jumps up and down higgledy-piggledy. So says the following chart:


 

Or, we can look at the inverse of this chart, and view how many days were required to reach each set of 1,000 songs:


 

Popular Artists Over Time

As mentioned in the discussion of popular artists, the popularity of the various artists changed over the more than sixteen-year span covered by this data. Though so many disparate artists were heard that any attempt to catalogue the ‘top’ artists seems futile, yet I shall assay that futility, and here below present the Top 25 Artists heard during each tranche of 10,000 songs. I use the same methodology as the Popular Artists in the previous section of general overview, with all the caveats, problems, and sighs inherent to same. For comparison’s sake, the data is presented in a scrollable table so that you can view how the top artists changed over time.

Most Popular Artists by Number of Songs Played per each 10K Songs


10k Songs Played 20k Songs Played 30k Songs Played 40k Songs Played 50k Songs Played 60k Songs Played 70k Songs Played 80k Songs Played 90k Songs Played 100k Songs Played 110k Songs Played
Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs Rank Artist #Songs
1 Johnny Cash 184 1 Johnny Cash 111 1 Bob Dylan 491 1 Bob Dylan 416 1 The Beatles 201 1 Wanda Jackson 519 1 Bob Dylan 564 1 Jerry Garcia 259 1 Bob Dylan 303 1 Bob Dylan 397 1 CBS Radio Mystery Theater 995
2 Johann Sebastian Bach 170 2 {unknown} 96 2 Yes 110 2 {unknown} 337 2 Bob Dylan 162 2 Bob Dylan 336 2 The Grateful Dead 282 2 Bob Dylan 227 2 {unknown} 191 2 Lux Radio Theatre 325 2 Lux Radio Theatre 345
3 Bob Dylan 122 3 Bob Dylan 90 3 The Grateful Dead 77 3 Sound Effects 81 3 Emmylou Harris 138 3 The Beatles 214 3 Wanda Jackson 219 3 The Beatles 190 3 The Grateful Dead 134 3 The Byrds 140 3 Bob Dylan 331
4 The Beatles 92 4 Johann Sebastian Bach 68 4 KKFS 74 4 The Simpsons 61 4 {unknown} 137 4 {unknown} 122 4 {unknown} 104 4 {unknown} 100 4 Johann Sebastian Bach 126 4 Johann Sebastian Bach 139 4 William Conrad 251
5 The Carter Family 92 5 Duke Ellington 64 5 Johnny Cash 71 5 Johann Sebastian Bach 58 5 Billie Holiday 96 5 De La Soul 112 5 Leonard Cohen 99 5 The Grateful Dead 86 5 Roger McGuinn 93 5 Jack Webb 134 5 Suspense 225
6 Jimi Hendrix 85 6 Martin Luther King, Jr. 63 6 {unknown} 69 6 Yo La Tengo 47 6 (tie) The Grateful Dead 67 6 The Grateful Dead 108 6 (tie) The Beatles 98 6 The Beach Boys 84 6 Jerry Garcia 74 6 The Grateful Dead 116 6 The Green Hornet 161
7 {unknown} 81 7 Wolfgang Amadeus Mozart 60 7 (tie) Neil Young 59 7 The White Stripes 45 6 (tie) The Rolling Stones 67 7 Ella Fitzgerald 92 6 (tie) The Everly Brothers 98 7 The Bevis Frond 73 7 Elmore James 67 7 Sears Radio Theater 115 7 Theater Five 150
8 X 77 8 P.D.Q. Bach 53 7 (tie) The Rolling Stones 59 8 Neil Young 44 8 Johann Sebastian Bach 58 8 Neil Young 90 8 Eric Clapton 94 8 Townes Van Zandt 68 8 John Lennon 60 8 {unknown} 109 8 Earle Graser 137
9 Wolfgang Amadeus Mozart 75 9 Hüsker Dü 52 9 The Ramones 56 9 David Bowie 41 9 Iron & Wine 53 9 Lloyd Cole 68 9 Lloyd Cole 90 9 Schmetterlinge 62 9 The Adventures of Superman 59 9 Theater Five 103 9 Electric Light Orchestra 134
10 The Who 69 10 James Brown 51 10 The Simpsons 54 10 Richard Wagner 40 10 Tom Waits 50 10 Jethro Tull 65 10 Pete Seeger 84 10 (tie) David Bowie 59 10 The Clash 58 10 NBC University Theater 93 10 The Whistler 129
11 Tom Jones 68 11 The Grateful Dead 49 11 Elvis Costello 52 11 (tie) Johnny Cash 39 11 (tie) Radiohead 49 11 Green Day 60 11 Frank Zappa 83 10 (tie) The Ramones 59 11 Radiohead 57 11 Bruce Springsteen 92 11 Cliff Edwards 99
12 Bongwater 66 12 Enrico Caruso 46 12 The Misfits 46 11 (tie) Wolfgang Amadeus Mozart 39 11 (tie) Stevie Wonder 49 12 Elvis Costello 52 12 The Clash 76 12 Friedrich Hollaender 47 12 The Beatles 54 12 The Green Hornet 77 12 {unknown} 89
13 Richard Wagner 64 13 (tie) The Rolling Stones 44 13 The Bevis Frond 45 13 (tie) Duke Ellington 38 13 Funkadelic 44 13 Johann Sebastian Bach 49 13 Small Faces 74 13 (tie) Pink Floyd 44 13 The Rolling Stones 53 13 Church of the SubGenius 74 13 Fred Waring’s Pennsylvanians 69
14 Neil Young 63 13 (tie) The Velvet Underground 44 14 Soft Cell 43 13 (tie) Franklin Delano Roosevelt 38 14 Hot Tuna 41 14 Joni Mitchell 48 14 Allen Brothers 68 13 (tie) The Rolling Stones 44 14 Willie Nelson 49 14 Jerry Garcia 73 14 The Weird Circle 68
15 Willie Nelson 57 15 John Coltrane 43 15 (tie) Pop Will Eat Itself 42 15 KKFS 37 15 Barenaked Ladies 40 15 Lonnie Donegan 40 15 The Rolling Stones 60 15 (tie) Kurt Weill 42 15 Graham Parker 43 15 Eugene Ormandy 62 15 John Stanley 65
16 (tie) Howlin’ Wolf 55 16 Me First And The Gimme Gimmes 42 15 (tie) The Moody Blues 42 16 The Rolling Stones 35 16 Pavement 38 16 (tie) Jimi Hendrix 38 16 Led Zeppelin 59 15 (tie) Neil Young 42 16 Ludwig van Beethoven 42 16 Special Ed 61 16 Bret Morrison 64
16 (tie) Juan Garcia Esquivel 55 17 (tie) Charlie Parker 41 17 (tie) Brian Eno 41 17 The Ramones 33 17 Wolfgang Amadeus Mozart 37 16 (tie) Primus 38 17 Green Day 58 17 (tie) Bruce Springsteen 37 17 (tie) Lou Reed 41 17 CBS Radio Mystery Theater 54 17 John Dehner 58
18 (tie) Captain Beefheart 54 17 (tie) Sonic Youth 41 17 (tie) Duke Ellington 41 18 Wilco 32 18 The Police 36 18 (tie) Nick Cave 34 18 Bee Gees 55 17 (tie) D.R.I. 37 17 (tie) Pentagram 41 18 Bob Bailey 52 18 Bill Johnstone 57
18 (tie) Drivin’ N’ Cryin’ 54 17 (tie) Tom Jones 41 19 Johann Sebastian Bach 41 19 (tie) Ludwig van Beethoven 29 19 (tie) Belle & Sebastian 35 18 (tie) The Rolling Stones 34 19 Big Audio Dynamite 51 19 The Easybeats 35 19 (tie) R.E.M. 39 19 Suspense 49 19 (tie) Bob Bailey 53
20 Gustav Mahler 51 20 Frank Sinatra 40 20 Jethro Tull 40 19 (tie) The Grateful Dead 29 19 (tie) Count Basie 35 20 (tie) Judy Collins 33 20 (tie) Odetta 48 20 The Clash 34 19 (tie) Vera Ward Hall 39 20 (tie) Beth Custer 47 19 (tie) Neil Young 53
21 (tie) Elvis Presley 49 21 (tie) The Beatles 39 21 (tie) Peterson Field Guides 34 21 (tie) The Who 28 19 (tie) Frank Sinatra 35 20 (tie) Wilco 33 20 (tie) U2 48 21 (tie) The Circle Jerks 33 21 Leatherface 37 20 (tie) John Dehner 47 21 (tie) Jack Webb 49
21 (tie) P.D.Q. Bach 49 21 (tie) They Might Be Giants 39 21 (tie) Wolfgang Amadeus Mozart 34 21 (tie) Winston Churchill 28 22 R.E.M. 34 22 Original Broadway Cast 32 22 (tie) Jerry Garcia 47 21 (tie) Tom T. Hall 33 21 Townes Van Zandt 36 22 Work Of Saws 44 21 (tie) The Beatles 49
23 (tie) Comedian Harmonists 48 23 (tie) Belle & Sebastian 38 23 (tie) Frank Sinatra 33 23 CBS 27 23 (tie) Subhumans 33 23 (tie) Carole King 30 22 (tie) Pink Floyd 47 23 (tie) Davie Allan & The Arrows 32 23 Elton and Betty White 35 23 Earle Graser 43 23 (tie) Betty Hutton 46
23 (tie) They Might Be Giants 48 23 (tie) Vera Lynn 38 23 (tie) John Cale 33 24 (tie) The Doors 26 23 (tie) The Beach Boys 33 23 (tie) Guided By Voices 30 22 (tie) The Bevis Frond 47 23 (tie) Depeche Mode 32 24 Neil Young 34 24 Neil Young 42 23 (tie) Walk Softly, Peter Troy 46
25 (tie) The Grateful Dead 47 25 (tie) Count Basie 37 23 (tie) The Clash 33 24 (tie) They Might Be Giants 26 25 Gang Of Four 32 23 (tie) The White Stripes 30 25 Lou Reed 42 23 (tie) The Kinks 32 25 (tie) Drive-By Truckers 32 25 Space Patrol 41 25 (tie) Ludwig van Beethoven 45
25 (tie) The Rolling Stones 47 25 (tie) Jimi Hendrix 37 24 (tie) U2 26 25 (tie) Gene 32 25 (tie) Mr District Attorney 45

Unsurprisingly, Bob Dylan features prominently in all eleven (11) slices of these 110,000 songs, never falling any lower than the 3rd position. The last two myriads show the rise of Radio Show in my listening diet. There may also be evidence of increased data capture, as the ‘{unknown}’ artist entry falls below 10th place for the first time in the most recent set of 10,000 songs heard. Looking closely at each tranche reveals my interests over time; for example, the set ending with 40k songs heard includes FDR, Winston Churchill, and CBS among the Top 25, indicating this was when I was listening to old news broadcasts from World War II (‘CBS’ here refers to the progenitor of the World News Today program). There are a few other surprises and oddities in the data (112 De La Soul tracks!?!), but I’ll leave most of those as an exercise for the reader.
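The two rank claims above can be checked mechanically against the table. Here is a minimal Python sketch with the ranks and song counts for Bob Dylan and ‘{unknown}’ transcribed by hand from each tranche; the data structures and names are invented for illustration.

```python
TRANCHES = ["10k", "20k", "30k", "40k", "50k", "60k", "70k", "80k", "90k", "100k", "110k"]

# (rank, songs) per tranche, transcribed from the table above.
bob_dylan = [(3, 122), (3, 90), (1, 491), (1, 416), (2, 162), (2, 336),
             (1, 564), (2, 227), (1, 303), (1, 397), (3, 331)]
unknown_ranks = [7, 2, 6, 2, 4, 4, 4, 4, 2, 8, 12]

# Lowest position Dylan ever reaches (a larger number is a lower position).
worst_dylan_rank = max(rank for rank, _ in bob_dylan)

# His per-tranche counts should add up to his overall total of 3,439 songs.
dylan_total = sum(songs for _, songs in bob_dylan)

# Tranches in which '{unknown}' falls below 10th place.
unknown_below_tenth = [t for t, rank in zip(TRANCHES, unknown_ranks) if rank > 10]

print(worst_dylan_rank)     # 3
print(dylan_total)          # 3439
print(unknown_below_tenth)  # ['110k']
```

That the per-tranche Dylan counts sum to the 3,439 in the overall table is a small reassurance that the two tables were cut from the same cloth.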

Sample Tracks Over Time

Flying over sixteen years of data means that no meaningful detail can really be seen at the most granular level, so permit me to provide that meaningless detail. Without (much) further ado, I present 0.1% of all the songs heard in this set of 110,000 tracks, randomly chosen by grabbing the 1000th, 2000th, etc.

Each 1000th Song Heard
# Track Artist Album Genre Last Heard
1,000 “I’m Stuck In A Pagoda (With Tricia Toyota)” The Dickies Still Got Live, Even If You Don’t Want It Rock 4/19/03
2,000 “Think Again” Minor Threat Complete Discography Alternative & Punk 8/13/03
3,000 “City Of New Orleans” Willie Nelson Revolutions in Time…the journey 1975-1993 Country 2/2/04
4,000 “Funny How Time Slips Away” Tom Jones 26 Country Hits Easy Listening 5/16/04
5,000 “I Can’t Get Started” Charles Mingus Mingus Three Jazz 7/11/04
6,000 Brandenburg Concerto No. 3 in G Major, BWV 1048, I. Allegro The Swingle Singers Bach Hits Back & A Cappella Amadeus Classical 12/8/04
7,000 “7 AM” Dirty Vegas Dirty Vegas Electronica/Dance 1/13/05
8,000 “She’s Too Much” Johnny Littlejohn Chess Blues Guitar, Two Decades of Killer Fretwork 1949-1969 Blues 1/31/05
9,000 “Pink Champagne” Joe Liggins Specialty Sampler Blues 3/2/05
10,000 “Luck Be a Lady Tonight” Frank Sinatra Vocal 3/28/05
11,000 “Grandma” Mari Boine Radiant Warmth Folk 4/10/05
12,000 “Hold On I’m Comin'” Voltage GS I Love You : Japanese Garage Bands Of The 1960s Rock 4/24/05
13,000 “Blue Lines” Massive Attack Blue Lines Electronica/Dance 5/19/05
14,000 “Maggie May” A.L. Lloyd English Drinking Songs Folk 6/30/05
15,000 “People of the Sun” Rage Against The Machine Evil Empire Metal 7/29/05
16,000 “Tina” Camper Van Beethoven 2003-02-28 – Santa Cruz, CA, The Catalyst Alternative 8/17/05
17,000 “Swan” Andersens Songs For Nao: Fourteen Bands From Japan World 9/13/05
18,000 “Scatterbrain (As Dead As Leaves)” Radiohead Hail To The Thief Alternative & Punk 10/2/05
19,000 “Why Theory” Gang Of Four 100 Flowers Bloom Alternative & Punk 10/21/05
20,000 “Ecce Gratum” Carl Orff Carmina Burana Classical 12/5/05
21,000 “Making People Normal” bis Social Dancing Rock 1/11/06
22,000 “Death Is A Star” The Clash Rat Patrol from Fort Bragg Rock 1/23/06
23,000 “A Lot Of Living To Do” Johnny Adams There Is Always One More Time Blues 2/2/06
24,000 “To Forgive Is To Suffer” Death The Sound of Perseverance Metal 2/17/06
25,000 “Hell Yeah” Beck Bootleg Rock 2/26/06
26,000 “Last Match” The Aislers Set The Last Match Alternative & Punk 3/8/06
27,000 “Jack Goes to School” Denis Leary Merry Fuckin’ Christmas Comedy 3/20/06
28,000 “Chromatic” Mouse On Mars Deutscher Funk Rock 4/6/06
29,000 “Wild Horses” (Live Stripped Version) The Rolling Stones Rarities 1971-2003 Rock 7/14/06
30,000 “Throwaway Style” The Exploding Hearts Guitar Romantic Rock 2/24/07
31,000 “Higher And Higher” The Moody Blues To Our Children’s Children’s Children Rock 5/28/07
32,000 “Pedro Navaja” Rubén Blades & Willie Colon 20th Anniversary Of The NY Salsa Festival: 1975-1995 Latin 9/20/07
33,000 “Scissors & Glue” Conceit Wasted Talent Hip Hop/Rap 3/3/08
34,000 “If I Could Be Anything” Casper The Friendly Ghost Musical Adventure In Make-Believe Children’s 6/20/08
35,000 “Love You To Death” 400 Blows Angel’s Trumpets And Devil’s Trombones Punk 10/2/08
36,000 “The Sound Of Life Today” Super Furry Animals Guerrilla Alternative & Punk 1/13/09
37,000 “Mas Fuerte” CuCu Diamantes Mas Fuerte – Canción de la Semana Pop 4/25/09
38,000 “Do You Have A Strategy” Uninhabitable Mansions Live on WFMU Sept 2008 Rock 11/20/09
39,000 “What We All Want” Gang Of Four Return The Gift Alternative & Punk 5/17/10
40,000 “Unburden Unbound” Gang Of Four 100 Flowers Bloom Alternative & Punk 8/25/10
41,000 “Blitzkrieg Bop” The Ramones No Thanks! The ’70s Punk Rebellion Punk 12/26/10
42,000 “Aaron & Maria” The American Analog Set Know by Heart Indie 6/19/11
43,000 “Shitty City” Gluecifer Respect The Rock America Rock 12/8/11
44,000 “Move Along” The All-American Rejects Move Along Alternative 4/3/12
45,000 “Everything Is Broken” (Alternate Mix) Bob Dylan Exclusive Rock 7/26/12
46,000 “Watch What Happens” Count Basie On The Road Jazz 11/11/12
47,000 “Rain Dance” Andy Andrews Timeless Wisdom From The Traveler Spoken Word 1/22/13
48,000 “The Last Time” The Rolling Stones London Singles Rock 2/21/13
49,000 “Like Sonny” John Coltrane Coltrane Jazz Jazz 3/26/13
50,000 “When I Fall” Barenaked Ladies Born On A Pirate Ship Alternative & Punk 4/13/13
51,000 “Beer:30” Reverend Horton Heat The Full-Custom Gospel Sounds of The Reverend Horton Heat Rock 4/30/13
52,000 “Oh, Lady Be Good” Ella Fitzgerald Ella: The Legendary Decca Recordings Jazz 5/28/13
53,000 “Hanging On Too Long” The Sinceros The Sound Of Sunbathing Pop 7/1/13
54,000 “Strange New Cottage in Berkeley” Allen Ginsberg Howl and Other Poems Spoken Word 9/15/13
55,000 “That Great Day” T.C.I. Women’s Four Goodbye, Babylon Gospel & Religious 1/6/14
56,000 “Burst” Magazine Definitive Daze Punk 2/18/14
57,000 “Just The Motion” Richard & Linda Thompson Complete Radio Sessions 1980-1981 Pop 3/5/14
58,000 “Baby” (Stephen Street mix) Lloyd Cole Cleaning Out The Ashtrays Pop 3/23/14
59,000 “Row Jimmy” The Grateful Dead Dick’s Picks Volume 7 Rock 4/3/14
60,000 “Lovesick Blues” Wanda Jackson Sundsvall (Live In Sweden) Country 4/18/14
61,000 “I’ll Never Forget To Remember” Watt Wilfong Songwriter Demos Other 5/5/14
62,000 “Konna Kaze ni Sugite iku no Nara” Asakawa Maki Darkness IV World 5/22/14
63,000 “Ship of Fools” The Grateful Dead 1982-12-31 – Oakland, CA, Oakland Auditorium Rock 6/5/14
64,000 “Gates Of Urizen” Bruce Dickinson The Chemical Wedding Metal 6/18/14
65,000 “Don’t Ask My Name” Korean People’s Army Beautiful Music of North Korea World 7/2/14
66,000 “Walk Slow” Little Willie John Little Willie John: All 15 of His Chart Hits from 1953-1962 Blues 7/17/14
67,000 “Broken Hearted, Ragged & Dirty Too” Sleepy John Estes The Early Blues Roots of Bob Dylan Blues 8/2/14
68,000 “I Like Pie, I Like Cake” The Four Clefs Those Dirty Blues, Vol. 3 Blues 9/6/14
69,000 “Drum Solo” Frank Zappa The Mystery Box Rock 10/1/14
70,000 “clouds” Fat Hed The Jump Room Hip Hop/Rap 10/23/14
71,000 “Rip Van Winkle” The Nutmegs Herald 574 Vocal 11/11/14
72,000 “Nebul” Matthias Koeppel Alles Lalula 2: Songs & Poeme von der Beat-Generation bis heute Spoken Word 12/10/14
73,000 “First Shall Be Last And The Last Shall Be First” Peetie Wheatstraw Decca 7167 Blues 1/27/15
74,000 “Apple Suckling Tree” (Take 2) Bob Dylan & The Band The Basement Tapes Complete: The Bootleg Series, Vol. 11 Rock 3/12/15
75,000 attencion 3 finals irdial The Conet Project The Conet Project Other 5/23/15
76,000 “They Love Each Other” Jerry Garcia Band 1977-08-07 – Berkeley, CA, The Keystone [SBD] Rock 7/8/15
77,000 “The Six Wives of Henry VIII” Buena Vista High Symphonic & Show Band Buena Vista High Symphonic and Show Band (Sierra Vista, AZ) Rock 7/20/15
78,000 “Give Peace A Chance” Plastic Ono Band Live Peace in Toronto 1969 Rock 8/3/15
79,000 “DOOM DADA” T.O.P DOOM DADA – Single Hip Hop/Rap 8/17/15
80,000 “Sit and Wonder” Prince Buster 200% Dynamite! Reggae 9/22/15
81,000 “Gimme Danger” Iggy & The Stooges Raw Power Alternative & Punk 10/12/15
82,000 “Freiheitskämpfer” Floh De Cologne 1974 Mumien Rock 11/3/15
83,000 “Medley” Foster Brooks Foster Brooks “Sings” Novelty 11/23/15
84,000 “Fucked” Partly Cloudy Analog CyberPunk Third Series X Analog CyberPunk 12/8/15
85,000 “They’ve Got Me In The Bottle” Brian Brain Analog CyberPunk Addendum IX Analog CyberPunk 12/20/15
86,000 “Breath” Pierre Henry Le Voyage Tibetan Book Of The Dead Avant-Garde 1/8/16
87,000 “Submarine Bells” The Chills 1990-06-09 – Melbourne, Australia, The Club Rock 1/23/16
88,000 “François Villon” Boulat Okoudjava Le Soldat en papier World 2/6/16
89,000 “Lomir Sich Iberbeten” Martha Schlamme The Yiddish Dream Folk 2/23/16
90,000 “Show Biz Kids” Steely Dan Live At The Rainbow May 20, 1974 Rock 3/20/16
91,000 “Eviction” London PX Orders EP Punk 4/23/16
92,000 “Doll” Moaning Lisa Wonderful Rock 6/12/16
93,000 “Color Him Father” Linda Martell Plantation Gold Country 6/28/16
94,000 “The Cheating Line” Paul Martin Plantation Gold Country 7/28/16
95,000 “This Little Girl of Mine” Ray Charles Ray Charles R&B 10/22/16
96,000 01 Xmas 2005 edit Special Ed Xmas 2005 Holiday 1/6/17
97,000 “Sadats (Saints of Marrakesh)” Cheb I Sabbah La Kahena World 3/28/17
98,000 “(I’m Going To Sit Right Down and) Write Myself A Letter” Johnny Mercer Capitol 141 Pop 6/24/17
99,000 “Sinyaro” Brikama Jali Kunda – Griots of West Africa and Beyond World 9/3/17
100,000 “The Nemesis” (1/10/43) The Whistler The Whistler Radio Show 10/18/17
101,000 “Stay A Little Longer, Santa” Shemekia Copeland The Perfect Christmas Holiday 12/25/17
102,000 “Went to See the Gypsy” (Demo Version) Bob Dylan Single Rock 2/21/18
103,000 “To The Future” (5/27/50) Ray Bradbury Dimension X Radio Show 4/18/18
104,000 “Doo Wacka Doo” Tony Randall Vo Vo De Oh Doe 365 Days Project 6/10/18
105,000 “That’s Alright Mama” Bob Dylan The Freewheelin’ Bob Dylan Outtakes Folk 8/7/18
106,000 “Idiot Prayer” Nick Cave & The Bad Seeds The Boatman’s Call Indie 9/26/18
107,000 “Hickory, Dickory, Doom” (2/26/79) CBS Radio Mystery Theater CBS Radio Mystery Theater Radio Show 11/9/18
108,000 “Down where the Swanee River flows” George Wilton Ballard Edison Blue Amberol 2969 Pop 1/23/19
109,000 7. Recitative: “Behold, A Virgin Shall Conceive” George Frederic Handel Handel: Messiah Classical 3/6/19
110,000 “Fly Me to the Moon” B. Howard Customusic AC “Sampler” Pop 4/17/19
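Taking every 1,000th song is a systematic sample, not a random one, and it amounts to a single slice. The Python sketch below assumes the tracks are already in the order they were first heard, and uses plain integers as stand-ins for the track records; the function name is invented for illustration.

```python
def every_nth(items: list, n: int = 1000) -> list:
    """Return the nth, 2nth, 3nth, ... items of a list, counting from 1."""
    return items[n - 1 :: n]


# Stand-in for 110,000 tracks in the order first heard: just their positions.
heard_order = list(range(1, 110_001))

sample = every_nth(heard_order)

print(len(sample))                     # 110
print(sample[0], sample[-1])           # 1000 110000
print(len(sample) / len(heard_order))  # 0.001
```

One hundred and ten rows out of 110,000 is the promised 0.1%.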

Concluding Remarks

§ (see footnotes)

Given the fact that the underlying data analysis terms have changed (see my earlier post on this subject), I am presenting this information more as a baseline for future reports rather than as continuing commentary on my listening habits. Of course, the underlying dataset presents all manner of wonders for the enthralled searcher, but I confess that I am looking forward at this point to just shoving this turkey out the door and getting back to listening to iTunes. (I cannot listen to audio while typing anymore, another resentment I have against my teenaged self.) The plethora of data I have is, as I have said before, pointless — doubly so because I am not selling anything based upon it, which seems to be all data may be used for in our New New World.

I wish all of you well, and will report back when I have listened to another 1,000 tracks (I am about halfway there since I began compiling this information on April 17th [UPDATE: Now fewer than 50 tracks away as I finally hit the ‘Publish’ button]). I am primarily going to use the information I have to clean up some of the cruddier parts of my data, which is harmless enough I suppose, though pointless. For now I look forward to closing out the stupidly large files I’ve been messing with to gather this information for your perusal.

Good Day

Technical Notes

All data generated using Excel for Mac 2011, based on iTunes library and playlist export text files. For certain calculations I used the i41CX+ app for the iPhone as well as a Pickett N803-ES Log Log Speed Rule Dual Base slide rule. All audio files managed through iTunes, now on version 12.8.2.3, with additional file manipulation with Audacity as well as brute force tweaking of filetypes to generate ring tones, etc. The iTunes Library is maintained on an external hard drive, with two other hard drives for backup using rsync. Most (though not all) files are also kept separately in physical formats that will likely become obsolete along with so much else.
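The analysis here was done in Excel, but anyone who would rather script it could read the same kind of export with a few lines of Python. The sketch below is hypothetical: it assumes a tab-delimited export with a header row containing columns named Name, Artist, and Genre (worth verifying against an actual file), and it uses a tiny in-memory sample built from four rows of the sample-track table instead of a real export.

```python
import csv
import io
from collections import Counter

# In-memory stand-in for a tab-delimited export file. The column names are an
# assumption; the four rows are copied from the "Each 1000th Song Heard" table.
sample_export = (
    "Name\tArtist\tGenre\n"
    "Think Again\tMinor Threat\tAlternative & Punk\n"
    "City Of New Orleans\tWillie Nelson\tCountry\n"
    "Funny How Time Slips Away\tTom Jones\tEasy Listening\n"
    "Why Theory\tGang Of Four\tAlternative & Punk\n"
)

# DictReader maps each row to the header names; the delimiter is a tab.
reader = csv.DictReader(io.StringIO(sample_export), delimiter="\t")
rows = list(reader)

# Tally tracks per genre, treating an empty genre as "(blank)".
genre_tally = Counter(row["Genre"] or "(blank)" for row in rows)

for genre, count in genre_tally.most_common():
    print(genre, count)
```

For a real file, `io.StringIO(sample_export)` would be replaced by `open(path, encoding=..., newline="")`, with the encoding set to whatever the export was actually written in.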

Caveats

Song Ratings

Some problems exist in the data available to me, as some glitch between my iPhone (used to listen to most tracks) and iTunes causes intermittent ratings to be applied to whole albums, which I never do. Those ratings get translated as individual song ratings for songs which have no explicit rating, and it is not possible to distinguish between the two (explicit vs. induced) in the data export file I used for this analysis. It is an annoying problem, and one could add it to the heap of complaints that people seem to have about iTunes as a piece of software. I say, in contrast, however, that I know of no other program that could give me data about what I had listened to for the past 15 years, unless I wrote it myself, which I am not capable of doing. Of course, it may be objected: who would want to do so? I can only submit myself as the proof of the rule you would seek to impose.

I should also point out that I do not use 1-star ratings for anything save as a placeholder for possibly corrupt files. This is because I use the actual description associated with the star ratings in iTunes, and 1 star supposedly means in this system that “I hate it” — and I have not hated any track I have listened to. I have come close (looking at you, U-God), but thus far I have found that “I don’t like it” expresses my feelings well enough. Thus the ratings curve is skewed, but isn’t that true of all modern grade curves?

Play Counts

The highest play counts are somewhat suspect, as these are almost all songs which were tracked during a period in which my iTunes was being shared for my daughter’s iPod usage as well as my own. Thus certain Green Day and My Chemical Romance tracks (which I love) have plays which I cannot swear are all mine. And the most played track is Florence + the Machine’s “Howl”, which I can aver has not been heard by me 157 times, though that is what the data says. Perhaps the only track having more than 70 plays for which I can claim all those plays is the iTunes special version of John Cage’s 4’33” (I do not quote the song title for clarity), which I have heard 75 times — whatever that statement means.

Early Dates

The ability to track information about large datasets always comes with a cost. In general this cost is to be seen in the sheer difficulty of maintaining internally consistent datapoints across the entire set, a difficulty which can only grow as the number of entities tracked becomes greater. While the overall makeup, trends, and detail of the aggregate information will only become more precise as the number of points becomes larger, there will always be database inconsistencies which threaten to hobble complete understanding of the full dataset. This is due to three primary factors: mistaken, incomplete, or corrupt information in the original data capture; inconsistent data entry, especially with multiple sources; and changes to noted data points, schema, or methodology over time. There will always be certain outliers in any sufficiently large set of data that have missing, incorrect, or otherwise inconsistent information stored for particular datapoints. These outliers will surface to plague analysis once a ‘deep dive’ into the data is begun, and how these are handled determines much of what is possible in a complete analysis. Bottom line: You can never know everything in your data universe, unless the number of points in your data is so small as to be worthless for statistical purposes.

With that ridiculously overstated preface, I note that the set of 110,000 songs heard has a small number of songs which are missing one vital statistic: Date Last Played. The set was generated by looking for a value for ‘Number of Plays’ greater than zero, and for 127 tracks the database contains no data for the ‘Date Last Played’ datapoint. Since this last bit of information is used to generate the view of when I listened to these files over time, we have a small (just over 0.115%) set of files for which I can’t tell you when I heard them, although iTunes assures me that I did. A very small number of these files simply do not exist anymore, lost in the great 20GB hard drive crash back in the earliest days of my iPod usage. (I’ll always miss you, variants of Blondie’s “Rapture” from that long lost EP.) It appears from a cursory examination of the two other datetime datapoints (‘Last Modified’ and ‘Date Added’) that none of these files was messed with or was created before the earliest date seen for plays: January 13, 2003. Thus I use this date for the earliest information given in the time sequence analysis. However, petulant perfectionists should note that there may be something wrong with the earliest dates given for song plays, as most of the 127 items missing this factoid seem to be from the beginnings of time, at least as concerns my iTunes tracking.
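
For anyone who wants to run the same check outside a spreadsheet, here is a minimal Python sketch of the filter described above. It is not the Excel workflow actually used for this report; the column names "Plays" and "Last Played" and the four sample rows are assumptions made up for illustration, so adjust them to whatever your own export file contains.

```python
import csv
import io

# Hypothetical tab-delimited sample standing in for an iTunes export file.
# The column names "Plays" and "Last Played" are assumptions.
SAMPLE = (
    "Name\tPlays\tLast Played\n"
    "Track A\t3\t4/17/19, 4:55 PM\n"
    "Track B\t1\t\n"
    "Track C\t\t\n"
    "Track D\t2\t1/13/03, 9:00 AM\n"
)


def heard_without_date(rows):
    """Return (heard, undated): tracks with at least one play,
    and the subset of those with no last-played date."""
    heard = [r for r in rows if int(r["Plays"] or 0) > 0]
    undated = [r for r in heard if not r["Last Played"].strip()]
    return heard, undated


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
heard, undated = heard_without_date(rows)
print(len(heard), "heard,", len(undated), "without a date")

# The share quoted above: 127 undated tracks out of 110,000 heard.
share = 127 / 110_000 * 100
print(f"{share:.4f}% of heard tracks have no last-played date")
```

An empty play count is treated as zero here, which is itself a judgment call: a blank could just as easily mean the count was lost.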

Also note that I don’t have any information about song usage — or even existence — before the cataclysmic hard drive failure mentioned above, where I lost the entirety (at that time) of my iTunes collection when my 20GB external hard disk failed utterly. Since that tragedy I have, of course, instituted a rigorous backup program, and since that time, also of course, no such failure has recurred.

I also note that the inconsistency between the datapoints ‘Number of Plays’ and ‘Date Last Played’ means that a different view of my data could give a different value for the total number of files heard, as should be obvious. What is not as obvious is that this particular inconsistency seems to preponderate over the opposite; that is, the number of entries in the database which have a ‘Date Last Played’ value but no value for the ‘Number of Plays’ field is minuscule, with only eight (8) instances found. It seems likely that the original issue is or was caused by problematic data capture between iPods and iTunes through the various OSes and app versions used. Emerson!
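
To make the "different view, different total" point concrete, the following Python sketch sorts tracks into the four possible combinations of the two fields and then counts "heard" three different ways. The sample tuples and the category names are hypothetical, and this is only one plausible way to tally the mismatch.

```python
from collections import Counter

# Hypothetical (plays, last_played) pairs; None marks a missing value.
tracks = [
    (3, "2019-04-17"),
    (1, None),          # play count but no date
    (0, "2005-06-01"),  # date but no play count
    (0, None),          # never heard
    (2, "2003-01-13"),
]


def classify(plays, last_played):
    """Label a track by which of the two fields carry a value."""
    has_plays = plays is not None and plays > 0
    has_date = bool(last_played)
    if has_plays and has_date:
        return "consistent"
    if has_plays:
        return "plays_only"
    if has_date:
        return "date_only"
    return "unheard"


counts = Counter(classify(p, d) for p, d in tracks)

# Three competing definitions of "heard".
heard_by_plays = counts["consistent"] + counts["plays_only"]
heard_by_date = counts["consistent"] + counts["date_only"]
heard_by_either = heard_by_plays + counts["date_only"]
print(heard_by_plays, heard_by_date, heard_by_either)
```

If the 127 and 8 figures above both come from the same export, the play-count definition gives 110,000 heard tracks, the date definition would give 110,000 - 127 + 8 = 109,881, and counting either field would give 110,008.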

Data Scrubbing

Besides the issues with date datapoints mentioned above, many other inconsistencies and outright errors exist in the full dataset. One of the main issues noted immediately at the beginning of my analysis was genre information, which was often either missing or so specific as to be useless. (It is not clear, for example, how useful such putative ‘genres’ as “Dylanesque”, “Meditative”, or “The Camera As Pen” actually are.) Though I have attempted to modify this and other entries in the underlying data, it soon became clear that to wait until all 110,000 song files had been completely reviewed and updated with so-called ‘correct’ information would both be pointless (see note about inherent inconsistency above) and take such a long time as to obviate any information I might care to impart about this ‘milestone’, as I could keep massaging the data long past the point where I have heard 120, 130, or even 150,000 tracks. I have decided to call a halt to the massive effort to impose some order upon all my iTunes files, though I have greatly modified the genre information, and hope to continue to do so in the future.

Thus whereas I originally had over 98 entries which were assigned a Genre for which they were the single example, I now only have 89 Genres all told in the set of music already played. I still have three categories containing only a single exemplar, but feel those Genres are reflective of my own musical taste and hope to add other songs to them as I continue my ad hoc data scrubbing. Thus I cannot promise that the underlying data will not be changed in the backend before the next full-scale analysis; I can only promise that I will attempt to play fair with you and let you know just how I’ve munged the information I have.
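
Finding the single-exemplar Genres is a simple counting exercise. The Python sketch below shows one way to do it, assuming the genre tags have already been pulled out of the export into a list; the sample tags and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical genre tags for a handful of played tracks.
genres = ["Rock", "Rock", "Jazz", "Dylanesque", "Jazz", "Meditative", "Rock"]


def singleton_genres(genre_list):
    """Return, in sorted order, the genres that appear on exactly one track.
    Blank or missing tags are ignored."""
    counts = Counter(g.strip() for g in genre_list if g and g.strip())
    return sorted(g for g, n in counts.items() if n == 1)


print(singleton_genres(genres))
```

Running the same count before and after a round of scrubbing gives a quick measure of how much the genre list has been consolidated.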

Footnotes

* The term “listenable” refers only to the type of file, and does not imply that the sounds contained therein are worthy of being heard.
† Taking “myriad” to mean “ten thousand”, as its original Greek root word does in most cases.
‡ The longest radio show tracks are actually composites of separate daily shows concatenated into single files for ease of listening. Thus, for example, the heyday of the great radio show Yours Truly, Johnny Dollar consisted of daily 15-minute shows which (generally) told a single story over the course of a single week. I took the individual shows and generated a single mp3 file of the complete story arc. (This led to certain needed tweaks to such parameters as Track Number etc., which you almost certainly don’t care about, assuming you’ve even read this far to find these words in this throw-away section of this meaningless report.)
§ If you see a connection between this self-indulgent data analysis and the known association of this Cyndi Lauper tune with masturbation, you are much more clever than I.