ComfyUI-ELLA
<div align="center"> <img src="./assets/ELLA-Diffusion.jpg" width="30%" > <br/> <a href='https://ella-diffusion.github.io/'><img src='https://img.shields.io/badge/Project-Page-green'></a> <a href='https://arxiv.org/abs/2403.05135'><img src='https://img.shields.io/badge/arXiv-2403.05135-b31b1b.svg'></a> </div>

ComfyUI implementation for ELLA.
:star2: Changelog
- [2024.4.30] Add a new node, ELLA Text Encode, to automatically concat ELLA and CLIP conditioning.
- [2024.4.24] Upgraded the ELLA Apply method for better compatibility with the ComfyUI ecosystem. Refer to the method mentioned in ComfyUI_ELLA PR #25.
  - DEPRECATED: Apply ELLA without sigmas is deprecated and will be removed in a future version.
- [2024.4.22] Fix unstable image quality with multi-batch. Add CLIP concat (LoRA trigger words are now supported).
- [2024.4.19] Documenting nodes.
- [2024.4.19] Initial repo.
:pushpin: Notice
- SIGMAS from the BasicScheduler node, or TIMESTEPS from the Set ELLA Timesteps node, must be the same as the KSampler settings. This is because ELLA introduces a Timestep-Aware Semantic Connector (TSC), which dynamically adapts semantic features over the sampling time steps.
- If you need to concat CLIP CONDITIONING to make LoRA trigger words effective, the ELLA output CONDITIONING always needs to be linked to the conditioning_to input of the Conditioning (Concat) node.
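The first notice can be checked mechanically on a workflow exported in ComfyUI's API format. The following is a minimal sketch, not part of this repository: it assumes the workflow is a dict of node id to `class_type` and `inputs`, and that the schedule is determined by the `scheduler`, `steps`, and `denoise` inputs shared by BasicScheduler and KSampler. The helper names are hypothetical, and every BasicScheduler is compared against every KSampler, which is only sensible for simple workflows.

```python
from typing import Any

# Inputs assumed to determine the sigma schedule on both node types.
SCHEDULE_KEYS = ("scheduler", "steps", "denoise")


def schedule_settings(node: dict[str, Any]) -> dict[str, Any]:
    """Return the literal schedule-related inputs of a node."""
    inputs = node.get("inputs", {})
    # In API-format JSON a linked input is a [node_id, output_index] list;
    # linked values cannot be compared statically, so they are skipped.
    return {
        key: inputs[key]
        for key in SCHEDULE_KEYS
        if key in inputs and not isinstance(inputs[key], list)
    }


def find_schedule_mismatches(workflow: dict[str, dict[str, Any]]) -> list[str]:
    """List settings that differ between BasicScheduler and KSampler nodes."""
    schedulers = [(i, n) for i, n in workflow.items() if n.get("class_type") == "BasicScheduler"]
    samplers = [(i, n) for i, n in workflow.items() if n.get("class_type") == "KSampler"]
    problems = []
    for sched_id, sched in schedulers:
        for samp_id, samp in samplers:
            a, b = schedule_settings(sched), schedule_settings(samp)
            for key in SCHEDULE_KEYS:
                if key in a and key in b and a[key] != b[key]:
                    problems.append(
                        f"{key}: BasicScheduler {sched_id} has {a[key]!r}, "
                        f"KSampler {samp_id} has {b[key]!r}"
                    )
    return problems


# Illustrative workflow fragment with a deliberate scheduler mismatch.
example = {
    "1": {"class_type": "BasicScheduler",
          "inputs": {"model": ["4", 0], "scheduler": "normal", "steps": 20, "denoise": 1.0}},
    "2": {"class_type": "KSampler",
          "inputs": {"model": ["4", 0], "scheduler": "karras", "steps": 20, "denoise": 1.0}},
}
for problem in find_schedule_mismatches(example):
    print(problem)
```

An empty result only means that no literal mismatch was found; the model linked into both nodes still has to be the same one.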
:books: Example workflows
The examples directory has workflow examples. You can load these images directly into ComfyUI as workflows.
All legacy workflows remain compatible, but they are deprecated and will be removed in a future version.
:tada: It works with ControlNet!
:tada: It works with LoRA trigger words by concatenating CLIP CONDITIONING!
:warning: NOTE again that the ELLA CONDITIONING always needs to be linked to the conditioning_to input of the Conditioning (Concat) node.
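The wiring rule can also be checked on an API-format workflow export. This is a sketch under stated assumptions: `ConditioningConcat` is the class type behind the Conditioning (Concat) node, while the ELLA class type names are not listed in this README, so they are passed in by the caller. `HypotheticalEllaNode` below is a placeholder, not a real node name, and only direct links are inspected.

```python
from typing import Any, Optional


def source_class_type(workflow: dict[str, dict[str, Any]], link: Any) -> Optional[str]:
    """Return the class_type of the node a [node_id, output_index] link points to."""
    if isinstance(link, list) and link:
        return workflow.get(str(link[0]), {}).get("class_type")
    return None


def concat_wiring_problems(
    workflow: dict[str, dict[str, Any]], ella_class_types: set[str]
) -> list[str]:
    """Report Conditioning (Concat) nodes where ELLA feeds conditioning_from only."""
    problems = []
    for node_id, node in workflow.items():
        if node.get("class_type") != "ConditioningConcat":
            continue
        inputs = node.get("inputs", {})
        to_type = source_class_type(workflow, inputs.get("conditioning_to"))
        from_type = source_class_type(workflow, inputs.get("conditioning_from"))
        if from_type in ella_class_types and to_type not in ella_class_types:
            problems.append(
                f"node {node_id}: ELLA output is linked to conditioning_from; "
                "link it to conditioning_to instead"
            )
    return problems


# Placeholder class type; read the real one from your exported workflow.
ELLA_TYPES = {"HypotheticalEllaNode"}

miswired = {
    "10": {"class_type": "HypotheticalEllaNode", "inputs": {}},
    "11": {"class_type": "CLIPTextEncode", "inputs": {}},
    "12": {"class_type": "ConditioningConcat",
           "inputs": {"conditioning_to": ["11", 0], "conditioning_from": ["10", 0]}},
}
for problem in concat_wiring_problems(miswired, ELLA_TYPES):
    print(problem)
```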
The ELLA Text Encode node can simplify the workflow.
With the upgrade (2024.4.24), some interesting workflows can be implemented, such as using ELLA only for the positive conditioning, as shown below:
positive + negative | positive only |
---|---|
However, there is no guarantee that positive-only will bring better results.
Workflow with AYS.
AYS gives more visual detail and better text alignment; refer to the paper.
w/ AYS | w/o AYS |
---|---|
EMMA support is a work in progress.
:green_book: Install
Download or git clone this repository into the ComfyUI/custom_nodes/ directory. ComfyUI-ELLA requires the latest version of ComfyUI. If something doesn't work, be sure to upgrade.
cd ComfyUI/custom_nodes
git clone https://github.com/TencentQQGYLab/ComfyUI-ELLA
Next, install the dependencies.
cd ComfyUI-ELLA
pip install -r requirements.txt
:orange_book: Models
These models must be placed in the corresponding directories under models.
Remember that you can also use any custom location by setting an ella and an ella_encoder entry in the extra_model_paths.yaml file.
- ComfyUI/models/ella, create it if not present.
  - Place ELLA Models here.
- ComfyUI/models/ella_encoder, create it if not present.
  - Place the FLAN-T5 XL Text Encoder here. It should be a folder in the transformers structure, with a config.json.
In summary, you should have the following model directory structure:
ComfyUI/models/ella/
└── ella-sd1.5-tsc-t5xl.safetensors
ComfyUI/models/ella_encoder/
└── models--google--flan-t5-xl--text_encoder
    ├── config.json
    ├── model.safetensors
    ├── special_tokens_map.json
    ├── spiece.model
    ├── tokenizer_config.json
    └── tokenizer.json
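The layout above can be verified before launching ComfyUI. The following is a minimal sketch that uses only the file and folder names from the tree above. The function name is hypothetical, and the default `ComfyUI/models` root is an assumption: custom locations configured through extra_model_paths.yaml are not handled.

```python
from pathlib import Path

# Names taken from the directory structure listed above.
ELLA_MODEL = "ella-sd1.5-tsc-t5xl.safetensors"
ENCODER_DIR = "models--google--flan-t5-xl--text_encoder"
ENCODER_FILES = (
    "config.json",
    "model.safetensors",
    "special_tokens_map.json",
    "spiece.model",
    "tokenizer_config.json",
    "tokenizer.json",
)


def missing_model_files(models_root: Path) -> list[Path]:
    """Return the expected ELLA and encoder files that are not present."""
    expected = [models_root / "ella" / ELLA_MODEL]
    expected += [models_root / "ella_encoder" / ENCODER_DIR / name for name in ENCODER_FILES]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains ComfyUI/.
    for path in missing_model_files(Path("ComfyUI/models")):
        print(f"missing: {path}")
```

If the script prints nothing, every file from the tree is in place.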
:book: Nodes reference
:mag: Common problems
- XXX not implemented for 'Half'. See issue #12
- AYS + ELLA produces dark images. See issue #39
  - Check whether add_noise of the SamplerCustom node is enabled.
  - Lower the cfg of the SamplerCustom node.
:memo: TODO
- Support prompt weighting
:hugs: Contributors (direct & indirect)
<table> <tr> <td align="center"><a href="https://github.com/JettHu"><img src="https://avatars.githubusercontent.com/u/35261585?s=460&v=4" width="32px;" alt=""/><br /><sub><b>JettHu</b></sub></a></td> <td align="center"><a href="https://github.com/budui"><img src="https://avatars.githubusercontent.com/u/16448529?s=460&v=4" width="32px;" alt=""/><br /><sub><b>budui</b></sub></a></td> <td align="center"><a href="https://github.com/kijai"><img src="https://avatars.githubusercontent.com/u/40791699?s=460&v=4" width="32px;" alt=""/><br /><sub><b>kijai</b></sub></a></td> <td align="center"><a href="https://github.com/huagetai"><img src="https://avatars.githubusercontent.com/u/1137341?s=460&v=4" width="32px;" alt=""/><br /><sub><b>huagetai</b></sub></a></td> </tr> </table>

:yum: Thanks
- ComfyUI: https://github.com/comfyanonymous/ComfyUI
- Diffusers (borrowed timestep modules): https://github.com/huggingface/diffusers
:wink: Citation
@misc{hu2024ella,
title={ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment},
author={Xiwei Hu and Rui Wang and Yixiao Fang and Bin Fu and Pei Cheng and Gang Yu},
year={2024},
eprint={2403.05135},
archivePrefix={arXiv},
primaryClass={cs.CV}
}