Msanii: High Fidelity Music Synthesis on a Shoestring Budget

A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.

Abstract

In this paper, we present Msanii, a novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. Our model combines the expressiveness of mel spectrograms, the generative capabilities of diffusion models, and the vocoding capabilities of neural vocoders. We demonstrate the effectiveness of Msanii by synthesizing 190 seconds of stereo music at a high sample rate (44.1 kHz) without the use of concatenative synthesis, cascading architectures, or compression techniques. To the best of our knowledge, this is the first work to successfully employ a diffusion-based model for synthesizing such long music samples at high sample rates. Our demo can be found here and our code here.

Disclaimer

This is a work in progress and has not been finalized. The results and approach presented are subject to change and should not be considered final.

Samples

See more here.

- Midnight Melodies
- Echoes of Yesterday
- Rainy Day Reflections
- Starlight Sonatas

Setup

Set up your virtual environment using conda or venv.
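
A minimal sketch of the environment step, assuming a POSIX shell with `python3` on the path. The directory name `.venv` and the conda environment name `msanii` are arbitrary choices for illustration, not names the project requires.

```shell
# Option 1: venv from the Python standard library
# (".venv" is an arbitrary directory name chosen for this sketch)
python3 -m venv .venv
. .venv/bin/activate   # POSIX shells; Windows uses a different activation script

# Option 2: conda (the environment name "msanii" is an assumption)
# conda create -n msanii python pip
# conda activate msanii
```

Once the environment is active, the `pip install` commands in this section install into it rather than into the system Python.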

Install the package from Git

    pip install -q git+https://github.com/Kinyugo/msanii.git

Install the package in editable mode

    git clone https://github.com/Kinyugo/msanii.git
    cd msanii
    pip install -q -r requirements.txt
    pip install -e .

Training

Notebook

<a target="_blank" href="https://colab.research.google.com/github/Kinyugo/msanii/blob/main/notebooks/msanii_training.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>

CLI

To train via the CLI, you need to define a config file. Sample config files are in the conf directory.

    wandb login
    python -m msanii.scripts.training <path-to-your-config.yml-file>

Inference

Notebook

<a target="_blank" href="https://colab.research.google.com/github/Kinyugo/msanii/blob/main/notebooks/msanii_inference.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>

CLI

Msanii supports several inference tasks, and each task requires its own config file. Sample config files are in the conf directory.

    gdown 1G9kF0r5vxYXPSdSuv4t3GR-sBO8xGFCe # model checkpoint
    python -m msanii.scripts.inference <task> <path-to-your-config.yml-file>

Demo

HF Spaces & Notebook

Hugging Face Spaces Open In Colab

CLI

To run the demo via the CLI, you need to define a config file. Sample config files are in the conf directory.

    gdown 1G9kF0r5vxYXPSdSuv4t3GR-sBO8xGFCe # model checkpoint
    python -m msanii.demo.demo <path-to-your-config.yml-file>

Contribute to the Project

We are always looking for ways to improve and expand our project, and we welcome contributions from the community. Here are a few ways you can get involved: