Awesome
DemoCaricature (CVPR'24)
Official PyTorch Implementation of "DemoCaricature: Democratising Caricature Generation with a Rough Sketch."
<img src="assets/figures/banner.jpg" width="800"/>Abstract: In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch. Our objective is to strike a delicate balance between abstraction and identity, while preserving the creativity and subjectivity inherent in a sketch. To achieve this, we present Explicit Rank-1 Model Editing alongside single-image personalisation, selectively applying nuanced edits to cross-attention layers for a seamless merge of identity and style. Additionally, we propose Random Mask Reconstruction to enhance robustness, directing the model to focus on distinctive identity and style features. Crucially, our aim is not to replace artists but to eliminate accessibility barriers, allowing enthusiasts to engage in the artistry.
⏳ Coming Soon
- Training scripts
Environment
pip install -r requirements.txt
Weights
Check pretrained identities in ./identities
!
Local Gradio Demo
python gradio_app.py
Diffusers Implementation
import torch
from diffusers import T2IAdapter, EulerDiscreteScheduler
from PIL import Image, ImageOps
from handler import ExplicitROMEHandler
from pipeline import TextualStableDiffusionAdapterWithTauPipeline
model_name = "runwayml/stable-diffusion-v1-5"
adapter_name = "TencentARC/t2iadapter_sketch_sd15v2"
adapter = T2IAdapter.from_pretrained(adapter_name, torch_dtype=torch.bfloat16)
scheduler = EulerDiscreteScheduler.from_pretrained(model_name, subfolder="scheduler")
pipe = TextualStableDiffusionAdapterWithTauPipeline.from_pretrained(
model_name,
adapter=adapter,
scheduler=scheduler,
dtype=torch.bfloat16,
variant="fp16",
).to("cuda")
pipe.safety_checker = None
handler = ExplicitROMEHandler(pipe)
handler.load_explicit_rome("identities/Barack_Obama", token="<ID>")
sketch = Image.open("assets/sketches/Barack_Obama.jpg").convert("L")
sketch = ImageOps.invert(sketch)
generator = torch.Generator("cuda").manual_seed(100)
sample = handler(
prompt="a caricature of <ID>",
image=sketch,
num_inference_steps=20,
guidance_scale=9,
rome_scale=1.1,
adapter_conditioning_scale=0.8,
adapter_conditioning_tau=0.65,
generator=generator,
).images[0]
sample.save("Barack_Obama_caricature.jpg")
<p float="left">
<img src="assets/sketches/Barack_Obama.jpg" width="300"/>
<img src="assets/results/Barack_Obama.jpg" width="300"/>
</p>