WFMU Playlist Analysis

Background

WFMU-FM is a listener-supported, non-commercial radio station broadcasting at 91.1 MHz FM in Jersey City, NJ, right across the Hudson from lower Manhattan. It is currently the longest-running freeform radio station in the United States.

The programming on WFMU can only be described as eclectic, which is no surprise given the station’s free-format approach. Even with this eclecticism the programming is consistently great, a product of the remarkable commitment of the station’s unpaid DJ staff. As a result it’s become one of the most well-regarded radio stations in the US, at one point being called the best radio station in the US by Rolling Stone, CMJ, and the Village Voice. Neutral Milk Hotel lead singer Jeff Mangum had a show on WFMU for a bit, and a number of bands have performed live in the station’s studio.

As a lifelong listener of tri-state radio, I’ve always found WFMU a welcome oasis of ‘different’ in the increasingly commercially dominated FM band. It’s been my go-to station for years, and I’ve lost count of the number of new bands and songs the station has turned me on to.

Project

Since roughly the year 2000, DJs have been maintaining playlists of their sets on the WFMU website. As a fan of the station, I think it’s a great feature to have this record of what’s been played. I decided to take this one step further and set out to find which artists, albums, and songs are consistently played on WFMU by analyzing the individual playlists of shows.

The dataset
It’s a fairly large dataset: 78,000 individual playlists at the time of my analysis, or about 12 a day since 2000. The number of songs per set varies a lot, but it averages out to roughly 15, bringing the total to roughly 1.2 million records. A copy of that is available here (note: fields with _clean are after my data quality efforts).

The process
I scraped the playlists using a Python script, which was fairly straightforward given that the playlists are in structured tables. After this I loaded the file into a PostgreSQL database and exported it to OpenRefine to begin the clean-up. While the playlists themselves are structured, what actually goes into the cells appears to be up to the DJs, so there is no real standardization of how artists, albums, or songs appear. Additionally, a lot of the entries on the playlists aren’t songs, and those weren’t relevant for this analysis.
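To give a feel for the kind of clean-up involved, here is a minimal Python sketch of grouping inconsistent spellings of the same artist under one key. It is not the script or the OpenRefine steps I actually used; it is loosely modeled on the "fingerprint" keying idea that OpenRefine offers for clustering, and the function names and sample values are made up for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical spellings collide."""
    # Fold accented characters down to plain ASCII.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Trim, lowercase, and drop punctuation.
    text = text.strip().lower()
    text = re.sub(r"[^\w\s]", "", text)
    # Sort and de-duplicate tokens so word order does not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group raw values by fingerprint, skipping empty cells."""
    groups = defaultdict(set)
    for value in values:
        key = fingerprint(value)
        if key:
            groups[key].add(value)
    return dict(groups)


# Hypothetical raw entries, as different DJs might type them.
raw_artists = ["The Fall", "the fall ", "Fall, The", "Sun Ra", "SUN RA", ""]
clusters = cluster(raw_artists)

for key, variants in clusters.items():
    print(key, "<-", sorted(variants))
```

Each cluster can then be reviewed by hand and mapped to a single cleaned value. A key like this is deliberately aggressive, so it will occasionally merge names that really are different, which is why a manual review step still matters.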

I’ve still got a ways to go but figured I’d start posting some initial tidbits as I play around with the data.

Top Tracks

Two views of the most played tracks on WFMU across all DJs.
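For anyone who wants to reproduce a ranking like this from the cleaned records, the counting itself is simple. The sketch below assumes each record is a dictionary with `artist_clean` and `song_clean` keys; those exact field names and the sample rows are hypothetical, chosen only to match the _clean naming mentioned above.

```python
from collections import Counter

# Hypothetical cleaned records; the real dataset has roughly 1.2 million rows.
rows = [
    {"dj": "A", "artist_clean": "Sun Ra", "song_clean": "Space Is the Place"},
    {"dj": "B", "artist_clean": "Sun Ra", "song_clean": "Space Is the Place"},
    {"dj": "A", "artist_clean": "Sun Ra", "song_clean": "Space Is the Place"},
    {"dj": "B", "artist_clean": "The Fall", "song_clean": "Totally Wired"},
    {"dj": "C", "artist_clean": "The Fall", "song_clean": "Totally Wired"},
    {"dj": "C", "artist_clean": "", "song_clean": ""},  # a non-song entry
]


def top_tracks(rows, n=10):
    """Return the n most played (artist, song) pairs with their play counts."""
    counts = Counter(
        (row["artist_clean"], row["song_clean"])
        for row in rows
        # Skip entries with a blank artist or song, such as non-song rows.
        if row["artist_clean"] and row["song_clean"]
    )
    return counts.most_common(n)


for (artist, song), plays in top_tracks(rows, n=2):
    print(f"{plays:>3}  {artist} - {song}")
```

Counting on the (artist, song) pair rather than the song title alone keeps different songs that share a title from being lumped together.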

Top Songs by DJ

A series of playlists featuring songs played more than once by WFMU DJs on their shows.

Top Artists

The top artists played on WFMU across all DJs.

Top Albums

The top albums played on WFMU across all DJs.