InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer
This repository contains the code and examples for our paper InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer. The Video-Music Transformer (VMT) is an attention-based multi-modal model that generates piano music for a given video.
Our Dataset
We release a new dataset of over 7 hours of piano scores with fine-grained alignment between pop music videos and MIDI files. Our complete InverseMV dataset is available here.
Demo
Here are example video fragments from our dataset. Note that we do not apply any post-production: each file pairs the original video with a WAVE file rendered from the MIDI output of the model.
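The MIDI-to-WAVE rendering step described above can be sketched as follows. This is a minimal, hypothetical stand-in using only the Python standard library: it renders model-output notes, represented here as `(midi_pitch, start_sec, end_sec)` tuples (an assumed format, not the repository's actual data structure), as plain sine tones. A real pipeline would instead render the MIDI file through a soundfont synthesizer such as FluidSynth.

```python
# Minimal sketch: render model-output notes to a WAV file with the stdlib.
# The note format and sine-wave synthesis are illustrative assumptions only.
import math
import struct
import wave

SAMPLE_RATE = 22050


def midi_to_hz(pitch):
    """Convert a MIDI pitch number to frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2 ** ((pitch - 69) / 12)


def render_notes(notes, path):
    """Render (midi_pitch, start_sec, end_sec) tuples as sine tones to a WAV file."""
    duration = max(end for _, _, end in notes)
    samples = [0.0] * int(duration * SAMPLE_RATE)
    for pitch, start, end in notes:
        freq = midi_to_hz(pitch)
        for i in range(int(start * SAMPLE_RATE), int(end * SAMPLE_RATE)):
            samples[i] += 0.3 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        )
        wav.writeframes(frames)


# A short C-major arpeggio as a stand-in for model output.
render_notes([(60, 0.0, 0.5), (64, 0.5, 1.0), (67, 1.0, 1.5)], "demo.wav")
```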
Original
The original music of the videos.
VMT
The music generated by our VMT model.
Seq2Seq
The music generated by the baseline Seq2Seq model.
Citation
Please cite our paper if you use InverseMV in your work:
@article{lin2021inversemv,
  title={InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer},
  author={Lin, Chin-Tung and Yang, Mu},
  journal={arXiv preprint arXiv:2112.15320},
  year={2021}
}