Hyena

This repository provides a JAX/Flax implementation of the Hyena architecture introduced in Poli et al. (2023). A full training run of a small 1.5M-parameter model on the Shakespeare dataset can be found in the included intro.ipynb. The run reaches a best validation loss of ~1.45, on par with the nanoGPT results.

Details

Specifically, the following is implemented: