magnatagatune-list

A list of automatic music tagging (audio-based) research articles that are evaluated on the MagnaTagATune dataset.

Download

MagnaTagATune is kindly hosted by MIRG of City University London. Visit their dataset page to download the mp3s and the other files.

Folders and files

After downloading the three files, merge them with `cat mp3.zip.* > mp3_all.zip`, then unzip with `unzip mp3_all.zip`. You will get 16 folders, '0' to '9' and 'a' to 'f'.

Many works split the folders 13:1:3 for training/validation/testing. However, this might not be the best choice, since the folders do NOT share the same tag distribution. At a minimum, shuffle the training set; I would recommend shuffling all of the clips and then splitting, as sketched below. Still, you may need the same split for comparison with previous work, as in my case :-(
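
If you do shuffle, a minimal sketch of the shuffle-then-split approach could look like this (the `mp3/` path, the seed, and the 8:1:1 ratio are my choices here, not a convention from the papers):

```python
import glob
import random

# Collect every clip from the 16 extracted folders ('0'-'9', 'a'-'f').
paths = sorted(glob.glob('mp3/*/*.mp3'))

random.seed(42)  # fixed seed so the split is reproducible
random.shuffle(paths)

# Split into training/validation/testing; adjust the ratio as needed.
n = len(paths)
n_train, n_valid = int(0.8 * n), int(0.1 * n)
train = paths[:n_train]
valid = paths[n_train:n_train + n_valid]
test = paths[n_train + n_valid:]
```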

Tags (sorted by popularity)

Top 50

guitar, classical, slow, techno, strings, drums, electronic, rock, fast, piano, ambient, beat, violin, vocal, synth, female, indian, opera, male, singing, vocals, no vocals, harpsichord, loud, quiet, flute, woman, male vocal, no vocal, pop, soft, sitar, solo, man, classic, choir, voice, new age, dance, male voice, female vocal, beats, harp, cello, no voice, weird, country, metal, female voice, choral

Top 50 by categories

(figure: the top-50 tags grouped by category)

Top 51-188

electro, drum, male vocals, jazz, violins, eastern, female vocals, instrumental, bass, modern, no piano, harpsicord, jazzy, string, baroque, foreign, orchestra, hard rock, electric, trance, folk, chorus, chant, voices, classical guitar, spanish, heavy, upbeat, no guitar, acoustic, male singer, electric guitar, electronica, oriental, funky, tribal, banjo, dark, medieval, man singing, organ, blues, irish, no singing, bells, percussion, no drums, woman singing, noise, spacey, singer, female singer, middle eastern, chanting, no flute, low, strange, calm, wind, lute, heavy metal, different, punk, oboe, celtic, sax, flutes, talking, women, arabic, hard, mellow, funk, fast beat, house, rap, not english, no violin, fiddle, female opera, water, india, guitars, no beat, chimes, drone, male opera, trumpet, duet, birds, industrial, sad, plucking, girl, silence, men, operatic, horns, repetitive, airy, world, eerie, deep, hip hop, space, light, keyboard, english, not opera, not classical, not rock, clapping, horn, acoustic guitar, disco, orchestral, no strings, old, echo, lol, soft rock, no singer, jungle, bongos, reggae, monks, clarinet, scary, synthesizer, female singing, piano solo, no voices, woodwind, happy, viola, soprano, quick, clasical

histogram of tags

As the histogram below shows, the tag distribution is heavily unbalanced.

(figure: histogram of tag counts)
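
To reproduce the histogram, a sketch like the following should work, assuming the official annotations_final.csv (tab-separated, with a clip_id column, one 0/1 column per tag, and an mp3_path column):

```python
import csv
from collections import Counter

counts = Counter()
with open('annotations_final.csv') as f:
    reader = csv.DictReader(f, delimiter='\t')
    tag_names = [c for c in reader.fieldnames
                 if c not in ('clip_id', 'mp3_path')]
    for row in reader:
        for tag in tag_names:
            counts[tag] += int(row[tag])

# Print the tags sorted by popularity, e.g. the top 50 listed above.
for tag, count in counts.most_common(50):
    print(tag, count)
```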

Proposed tag preprocessing

I wrote code to merge these synonyms:

```python
synonyms = [['beat', 'beats'],
            ['chant', 'chanting'],
            ['choir', 'choral'],
            ['classical', 'clasical', 'classic'],
            ['drum', 'drums'],
            ['electro', 'electronic', 'electronica', 'electric'],
            ['fast', 'fast beat', 'quick'],
            ['female', 'female singer', 'female singing', 'female vocals', 'female voice', 'woman', 'woman singing', 'women'],
            ['flute', 'flutes'],
            ['guitar', 'guitars'],
            ['hard', 'hard rock'],
            ['harpsichord', 'harpsicord'],
            ['heavy', 'heavy metal', 'metal'],
            ['horn', 'horns'],
            ['india', 'indian'],
            ['jazz', 'jazzy'],
            ['male', 'male singer', 'male vocal', 'male vocals', 'male voice', 'man', 'man singing', 'men'],
            ['no beat', 'no drums'],
            ['no singer', 'no singing', 'no vocal', 'no vocals', 'no voice', 'no voices', 'instrumental'],
            ['opera', 'operatic'],
            ['orchestra', 'orchestral'],
            ['quiet', 'silence'],
            ['singer', 'singing'],
            ['space', 'spacey'],
            ['string', 'strings'],
            ['synth', 'synthesizer'],
            ['violin', 'violins'],
            ['vocal', 'vocals', 'voice', 'voices'],
            ['strange', 'weird']]
```
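
For illustration, the merge can be applied like this (using the first tag of each group as the canonical one is my convention; pick whichever representative you prefer):

```python
# Map every synonym to the first tag of its group.
merge_map = {syn: group[0] for group in synonyms for syn in group}

def merge_tags(tags):
    """Replace synonyms with their canonical tag and drop duplicates."""
    return sorted({merge_map.get(tag, tag) for tag in tags})

print(merge_tags(['female vocals', 'woman', 'guitars']))
# -> ['female', 'guitar']
```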

I'm not 100% sure whether these two groups should be merged:

```python
['opera', 'operatic'],
['hard', 'hard rock'],
```

Papers

This list is based on the Google Scholar search results for papers that cite the dataset.
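
Most of the results below are reported as AUC (averaged per tag or per clip) and precision@k. For reference, here is a minimal sketch of one common reading of these metrics, using NumPy and scikit-learn; the array names and shapes are assumptions, and the exact definitions vary slightly between papers:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate(y_true, y_score, k=3):
    """y_true, y_score: (n_clips, n_tags) arrays of binary labels / scores."""
    # AUC-tag: ROC-AUC of each tag's ranking over clips, averaged over tags.
    # Assumes every tag has at least one positive and one negative clip.
    auc_tag = np.mean([roc_auc_score(y_true[:, j], y_score[:, j])
                       for j in range(y_true.shape[1])])
    # AUC-clip: ROC-AUC of each clip's ranking over tags, averaged over clips.
    auc_clip = np.mean([roc_auc_score(y_true[i], y_score[i])
                        for i in range(y_true.shape[0])])
    # precision@k: fraction of a clip's top-k predicted tags that are correct,
    # averaged over clips.
    top_k = np.argsort(-y_score, axis=1)[:, :k]
    prec_at_k = np.mean([y_true[i, top_k[i]].mean()
                         for i in range(y_true.shape[0])])
    return auc_tag, auc_clip, prec_at_k
```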

2016

not yet

2015

According to this paper (and this one):

| algorithm | Deep-BoF (proposed) | P 2011 | P Hamel 2011 | P Hamel 2012 |
| --- | --- | --- | --- | --- |
| AUC tag | 0.888 | 0.845 | 0.861 | 0.870 |
| AUC clip | 0.956 | 0.938 | 0.943 | 0.949 |
| Precision@3 | 0.511 | 0.449 | 0.467 | 0.481 |
| Precision@6 | 0.358 | 0.430 | 0.327 | 0.339 |
| Precision@9 | 0.275 | 0.249 | 0.255 | 0.263 |
| Precision@12 | 0.225 | 0.205 | 0.211 | 0.216 |
| Precision@15 | 0.190 | 0.175 | 0.181 | 0.184 |

2014

According to this paper (and this one):

| filter length | stride | AUC (spectrograms) | AUC (raw audio) |
| --- | --- | --- | --- |
| 1024 | 1024 | 0.8690 | 0.8366 |
| 256 | 256 | 0.8815 | 0.8487 |

| model | nmse | mean average precision (mAP) | ? |
| --- | :---: | :---: | :---: |
| linear regression | 0.965 | 0.823 | 0.0099 |
| MLP (1 hidden layer) | 0.939 | 0.841 | 0.0179 |
| MLP (2 hidden layers) | 0.924 | 0.837 | 0.0179 |

| task | AUC |
| --- | --- |
| tag (top 50 tags) | <0.88 |
| tag (all 188 tags) | <0.86 |

2013

According to this paper:

| algorithm | precision@1 | precision@3 |
| --- | --- | --- |
| k-NN | 39.4% | 28.6% |
| k-NN (Wsabie space) | 45.2% | 31.9% |
| Wsabie | 48.7% | 37.5% |
| Affinity Weighted Embedding | 52.7% | 39.2% |

And according to this one:

| algorithm | average AUC |
| --- | --- |
| Laplacian 1 frame | 0.898 |
| Multiresolution spectrograms | 0.888 |

2012

According to this paper.

2011

According to this paper:

Table 1:

| algorithm | AUC | average training time |
| --- | --- | --- |
| MFCC(20) | 0.77 ± 0.04 | 5.9 h |
| MEL-spectrum(128) | 0.853 ± 0.008 | 5.2 h |
| PMSC(120) | 0.876 ± 0.004 | 1.5 h |

Table 2:

| measure | Manzagol | Zhi | Mandel | Marsyas | mel-spec+PFC | PMSC+PFC | PSMC+MTSL |
| --- | --- | --- | --- | --- | --- | --- | --- |
| average AUC-tag | 0.750 | 0.673 | 0.821 | 0.831 | 0.820 | 0.845 | 0.861 |
| average AUC-clip | 0.810 | 0.748 | 0.886 | 0.933 | 0.930 | 0.938 | 0.943 |
| precision@3 | 0.255 | 0.224 | 0.323 | 0.440 | 0.430 | 0.449 | 0.467 |
| precision@6 | 0.194 | 0.192 | 0.245 | 0.314 | 0.305 | 0.320 | 0.327 |
| precision@9 | 0.159 | 0.168 | 0.197 | 0.244 | 0.240 | 0.249 | 0.255 |
| precision@12 | 0.136 | 0.146 | 0.167 | 0.201 | 0.198 | 0.205 | 0.211 |
| precision@15 | 0.119 | 0.127 | 0.145 | 0.172 | 0.170 | 0.175 | 0.181 |

The four leftmost algorithm columns (Manzagol to Marsyas) are results from MIREX 2009.


2010

According to this paper (and this one):

| feature set | f-score | g-mean | f-score (S2) | g-mean (S2) |
| --- | --- | --- | --- | --- |
| SAF | 0.3775 | 0.6101 | 0.3962 | 0.6252 |
| BLF-PCA | 0.4163 | 0.6410 | 0.4201 | 0.6439 |

| feature set | avg. f-score | avg. g-mean | avg. f-score (S2) | avg. g-mean (S2) |
| --- | --- | --- | --- | --- |
| SAF | 0.1777 | 0.3573 | 0.1932 | 0.3784 |
| BLF-PCA | 0.2136 | 0.4019 | 0.2185 | 0.4081 |

Global average (Table 2):

| algorithm | precision | recall | accuracy | F-score |
| --- | --- | --- | --- | --- |
| Audio SVM | 0.307 | 0.315 | 0.969 | 0.311 |
| Affinity SVM | 0.351 | 0.354 | 0.971 | 0.353 |

Affinity SVM, per-tag evaluation (Table 4). Table 3 is omitted since Table 4 outperforms it overall.

| number of tags | precision | recall | accuracy | F-score |
| --- | --- | --- | --- | --- |
| 20 | 0.418 | 0.691 | 0.856 | 0.518 |
| 30 | 0.346 | 0.671 | 0.862 | 0.453 |
| 40 | 0.394 | 0.397 | 0.914 | 0.395 |
| 50 | 0.369 | 0.372 | 0.923 | 0.371 |
| 100 | 0.259 | 0.262 | 0.951 | 0.260 |
| all (188) | 0.184 | 0.186 | 0.971 | 0.185 |

2009

N/A