CoCoPops: The Coordinated Corpus of Popular Musics

This repository is the home of the Coordinated Corpus of Popular Musics (CoCoPops). CoCoPops is a meta-corpus of melodic and harmonic transcriptions of popular music. CoCoPops has been developed primarily by Nat Condit-Schultz and Claire Arthur in the Computational and Cognitive Musicology Lab, within the Georgia Tech Center for Music Technology. Many other wonderful people have contributed to CoCoPops as well, both directly and indirectly.

Contents and Format

The goal of CoCoPops is to make a large amount of comparable melodic/harmonic data available in a consistent, standardized format. All CoCoPops files are stored in humdrum format. CoCoPops currently consists of two main subcorpora, the Billboard subset and the Rolling Stone subset. We plan to add more as additional datasets and corpora of popular music with melodic transcriptions become available.

The Billboard subcorpus contains two elements: 1) the entire McGill Billboard Dataset (Burgoyne et al., 2011) converted into humdrum format, as well as 2) new expert melodic transcriptions of (currently) 214 of these songs. The RollingStone subcorpus contains the entire RS200 (de Clercq & Temperley, 2011) converted into humdrum format. All transcriptions also include formal information and time stamps, as well as other data and metadata. In total, CoCoPops currently includes 414 complete melodic-harmonic transcriptions of 398 unique tracks. The complete list of songs can be found in the master CoCoPops_Sample.tsv file.

Directory Structure

Each subcorpus of CoCoPops is contained in its own subdirectory.

Each subdirectory in this repo has its own README file, with detailed explanations of the directory structure and contents.

Standards

Filenames

In CoCoPops, all files associated with a particular sampled recording follow the same file-naming convention. File names are formatted as Artist_Title_Year.extension, with the following formatting guidelines: The Artist and Title portions of each filename contain only Roman letters and Arabic numerals, with no special characters (no hyphens, no apostrophes, no commas, no parentheses, etc.) and no spaces between words. For instance, files associated with the song "Honey, Honey" by ABBA will all be labeled ABBA_HoneyHoney_1974: for example, ABBA_HoneyHoney_1974.xml, ABBA_HoneyHoney_1974.mus, and ABBA_HoneyHoney_1974.hum. The spelled-out word "And" is always used, never "&". In lieu of spaces, the first character of each word is upper case, while the rest are lower case. Parentheticals in titles are simply removed, so Otis Redding's "(Sittin' On) the Dock of the Bay" is just OtisRedding_TheDockOfTheBay_1968. The Year part of the filename is the year the song was sampled, which is not necessarily the year the song was released (in the Billboard subset, it is the year the song charted on the Hot 100).

References

To cite this work (and for suggestions on how or whom to cite depending on usage), please see the full citation of our 2023 ISMIR paper below.