AutoRegressive-VITS

(WIP) Text-to-speech using an autoregressive transformer and VITS

Note

Todo

structure

structure.png

Training pipeline

  1. Jointly train the S2 VITS decoder and the quantizer.
  2. Extract semantic tokens with the trained quantizer.
  3. Train the S1 text-to-semantic model on the extracted tokens.
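
To make step 2 more concrete, here is a minimal sketch of what "extracting semantic tokens" could look like. It assumes the quantizer is a plain nearest-neighbor vector quantizer over per-frame features, which this README does not confirm. The function names, the toy codebook, and the frame values are all hypothetical and are not taken from this repository.

```python
# Hypothetical illustration of step 2: map continuous frame features to
# discrete semantic tokens by nearest-codebook lookup.
# ASSUMPTION: the quantizer behaves like a simple vector quantizer; the real
# project may use a different scheme. All names here are illustrative.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def extract_semantic_tokens(frames, codebook):
    """Return, for each frame, the index of the closest codebook vector."""
    tokens = []
    for frame in frames:
        distances = [squared_distance(frame, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens

# Toy codebook with three 2-D entries (a trained quantizer would learn these).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# Toy per-frame features standing in for encoder outputs.
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.9], [0.4, 0.3]]

tokens = extract_semantic_tokens(frames, codebook)
print(tokens)  # [0, 1, 2, 0]
```

Under this assumption, the resulting token sequence is what the S1 text-to-semantic model in step 3 would be trained to predict from text.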

VITS S2 training

GPT S1 training

Inference

Pretrained models