Awesome-ICCV2023-Low-Level-Vision

A Collection of Papers and Codes in ICCV2023 related to Low-Level Vision

[Under Construction] If you find any missing papers or typos, feel free to open an issue or a pull request.

Related collections for low-level vision

Overview

Image Restoration
- Video Restoration
Super Resolution
- Image Super Resolution
- Video Super Resolution
Image Rescaling
Denoising
- Image Denoising
Deblurring
- Image Deblurring
- Video Deblurring
Deraining
Dehazing
Demosaicing
HDR Imaging / Multi-Exposure Image Fusion
Frame Interpolation
Image Enhancement
- Low-Light Image Enhancement
Image Harmonization/Composition
Image Completion/Inpainting
Image Stitching
Image Compression
Image Quality Assessment
Style Transfer
Image Editing
Image Generation/Synthesis / Image-to-Image Translation
- Video Generation
Others

Image Restoration

SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device

Paper: https://arxiv.org/abs/2308.08137
Code: https://github.com/sanechips-multimedia/syenet

DiffIR: Efficient Diffusion Model for Image Restoration

Paper: https://arxiv.org/abs/2303.09472
Code: https://github.com/Zj-BinXia/DiffIR
Tags: Diffusion

PIRNet: Privacy-Preserving Image Restoration Network via Wavelet Lifting

Paper: ICCV Open Access Version
Code: https://github.com/gclonghorn/PIRNet

Focal Network for Image Restoration

Paper: ICCV Open Access Version
Code: https://github.com/c-yn/FocalNet

Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration

Paper: https://arxiv.org/abs/2306.06513

Under-Display Camera Image Restoration with Scattering Effect

Paper: https://arxiv.org/abs/2308.04163
Code: https://github.com/NamecantbeNULL/SRUDC
Tags: Under-Display Camera

FSI: Frequency and Spatial Interactive Learning for Image Restoration in Under-Display Cameras

Paper: ICCV Open Access Version
Tags: Under-Display Camera

Multi-weather Image Restoration via Domain Translation

Paper: ICCV Open Access Version
Code: https://github.com/pwp1208/Domain_Translation_Multi-weather_Restoration
Tags: Multi-weather

Adverse Weather Removal with Codebook Priors

Paper: ICCV Open Access Version
Tags: Adverse Weather Removal

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

Paper: https://arxiv.org/abs/2307.08996
Tags: Authentic Face Restoration, Diffusion

Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery

Paper: https://arxiv.org/abs/2308.16460
Code: https://github.com/YuyanZhou1/Improving-Lens-Flare-Removal
Tags: Flare Removal

High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net

Paper: https://arxiv.org/abs/2308.14221
Code: https://github.com/CXH-Research/DocShadow-SD7K
Tags: Document Shadow Removal

Boundary-Aware Divide and Conquer: A Diffusion-Based Solution for Unsupervised Shadow Removal

Paper: ICCV Open Access Version
Tags: Shadow Removal

Leveraging Inpainting for Single-Image Shadow Removal

Paper: https://arxiv.org/abs/2302.05361
Tags: Shadow Removal

Fine-grained Visible Watermark Removal

Paper: https://openaccess.thecvf.com/content/ICCV2023/html/Niu_Fine-grained_Visible_Watermark_Removal_ICCV_2023_paper.html
Tags: Watermark Removal

Physics-Driven Turbulence Image Restoration with Stochastic Refinement

Paper: https://arxiv.org/abs/2307.10603
Code: https://github.com/VITA-Group/PiRN
Tags: Turbulence Image

Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild

Paper: ICCV Open Access Version
Code: https://github.com/Shaocr/Building-Bridge-Across-the-Time-Disruption-and-Restoration-of-Murals-In-the-Wild
Tags: Mural Restoration

DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration

Paper: https://arxiv.org/abs/2303.06682
Code: https://github.com/miaoyuchun/DDS2M
Tags: Diffusion, Hyperspectral

Fingerprinting Deep Image Restoration Models

Paper: ICCV Open Access Version

Self-supervised Monocular Underwater Depth Recovery, Image Restoration, and a Real-sea Video Dataset

Paper: ICCV Open Access Version
Code: https://github.com/nishavarghese15/DRUVA

Image Reconstruction

Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction

Paper: https://arxiv.org/abs/2308.10820
Code: https://github.com/MyuLi/PADUT

Video Restoration

Snow Removal in Video: A New Dataset and A Novel Method

Paper: ICCV Open Access Version
Code: https://github.com/haoyuc/VideoDesnowing
Tags: Desnowing

Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation

Paper: https://arxiv.org/abs/2309.13700
Code: https://github.com/scott-yjyang/ViWS-Net

Fast Full-frame Video Stabilization with Iterative Optimization

Paper: https://arxiv.org/abs/2307.12774
Code: https://github.com/zwyking/Fast-Stab
Tags: Video Stabilization

Minimum Latency Deep Online Video Stabilization

Paper: https://arxiv.org/abs/2212.02073
Code: https://github.com/liuzhen03/NNDVS
Tags: Video Stabilization

Task Agnostic Restoration of Natural Video Dynamics

Paper: https://arxiv.org/abs/2206.03753
Code: https://github.com/MKashifAli/TARONVD

[Back-to-Overview]

Super Resolution

Image Super Resolution

On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement

Paper: https://arxiv.org/abs/2307.12027
Code: https://github.com/Luciennnnnnn/DualFormer

SRFormer: Permuted Self-Attention for Single Image Super-Resolution

Paper: https://arxiv.org/abs/2303.09735
Code: https://github.com/HVision-NKU/SRFormer

DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution

Paper: https://arxiv.org/abs/2301.02031
Code: https://github.com/NeonLeexiang/DLGSANet

Dual Aggregation Transformer for Image Super-Resolution

Paper: https://arxiv.org/abs/2308.03364
Code: https://github.com/zhengchen1999/DAT

MSRA-SR: Image Super-resolution Transformer with Multi-scale Shared Representation Acquisition

Paper: ICCV Open Access Version

Content-Aware Local GAN for Photo-Realistic Super-Resolution

Boosting Single Image Super-Resolution via Partial Channel Shifting

Paper: ICCV Open Access Version
Code: https://github.com/OwXiaoM/_PCS

Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution

Paper: https://arxiv.org/abs/2308.05022
Code: https://github.com/AVC2-UESTC/CRAFT-SR

Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution

Paper: https://arxiv.org/abs/2302.13800
Code: https://github.com/sunny2109/SAFMN
Tags: Efficient

Lightweight Image Super-Resolution with Superpixel Token Interaction

Paper: ICCV Open Access Version
Code: https://github.com/ArcticHare105/SPIN
Tags: Lightweight

Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution

Paper: https://arxiv.org/abs/2307.08544
Code: https://github.com/liuguandu/RC-LUT
Tags: Efficient

Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution

Paper: https://arxiv.org/abs/2211.13654
Code: https://github.com/Jiamian-Wang/Iterative-Soft-Shrinkage-SR
Tags: Efficient

MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces

Paper: https://arxiv.org/abs/2309.08113
Code: https://github.com/yinzhicun/MetaF2N
Tags: Blind

Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution

LMR: A Large-Scale Multi-Reference Dataset for Reference-Based Super-Resolution

Paper: https://arxiv.org/abs/2303.04970
Tags: Reference-Based

Real-CE: A Benchmark for Chinese-English Scene Text Image Super-resolution

Paper: https://arxiv.org/abs/2308.03262
Code: https://github.com/mjq11302010044/Real-CE
Tags: Text SR

Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution

Paper: https://arxiv.org/abs/2302.08058
Code: https://github.com/ZhengyuLiang24/EPIT
Tags: Light Field

Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution

Paper: https://arxiv.org/abs/2303.08942
Code: https://github.com/Zhaozixiang1228/GDSR-SSDNet
Tags: Depth Map

HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models

Paper: ICCV Open Access Version
Tags: Hyperspectral, Diffusion

ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution

Paper: https://arxiv.org/abs/2307.14010
Code: https://github.com/Rexzhan/ESSAformer
Tags: Hyperspectral

Rethinking Multi-Contrast MRI Super-Resolution: Rectangle-Window Cross-Attention Transformer and Arbitrary-Scale Upsampling

Paper: ICCV Open Access Version
Tags: MRI

Decomposition-Based Variational Network for Multi-Contrast MRI Super-Resolution and Reconstruction

Paper: ICCV Open Access Version
Code: https://github.com/lpcccc-cv/MC-VarNet
Tags: MRI

CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution

Paper: https://arxiv.org/abs/2303.16242
Code: https://github.com/NarcissusEx/CuNeRF
Tags: Medical, NeRF

Burst Super Resolution

Towards Real-World Burst Image Super-Resolution: Benchmark and Method

Self-Supervised Burst Super-Resolution

Paper: ICCV Open Access Version
Tags: Self-Supervised

Video Super Resolution

Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution

Paper: https://arxiv.org/abs/2303.09826
Code: https://github.com/researchmm/VQD-SR

Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution

Paper: ICCV Open Access Version

Spatial-Temporal Video Super-Resolution

MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution

Paper: https://arxiv.org/abs/2307.07988
Code: https://github.com/sichun233746/MoTIF

[Back-to-Overview]

Image Rescaling

Downscaled Representation Matters: Improving Image Rescaling with Collaborative Downscaled Images

Paper: https://arxiv.org/abs/2211.10643

[Back-to-Overview]

Denoising

Image Denoising

Random Sub-Samples Generation for Self-Supervised Real Image Denoising

Score Priors Guided Deep Variational Inference for Unsupervised Real-World Single Image Denoising

Paper: https://arxiv.org/abs/2308.04682
Tags: Unsupervised

Unsupervised Image Denoising in Real-World Scenarios via Self-Collaboration Parallel Generative Adversarial Branches

Paper: https://arxiv.org/abs/2308.06776
Tags: Unsupervised

Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network

Multi-view Self-supervised Disentanglement for General Image Denoising

Paper: https://arxiv.org/abs/2309.05049
Tags: Self-supervised

Iterative Denoiser and Noise Estimator for Self-Supervised Image Denoising

Paper: ICCV Open Access Version
Tags: Self-Supervised

Noise2Info: Noisy Image to Information of Noise for Self-Supervised Image Denoising

Paper: ICCV Open Access Version
Tags: Self-Supervised

The Devil is in the Upsampling: Architectural Decisions Made Simpler for Denoising with Deep Image Prior

Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising

Paper: https://arxiv.org/abs/2308.03448
Code: https://github.com/Srameo/LED

ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

Paper: https://arxiv.org/abs/2307.07710
Code: https://github.com/wyf0912/ExposureDiffusion

Towards General Low-Light Raw Noise Synthesis and Modeling

Paper: https://arxiv.org/abs/2307.16508
Code: https://github.com/fengzhang427/LRD
Tags: Noise Modeling

Hybrid Spectral Denoising Transformer with Guided Attention

Paper: https://arxiv.org/abs/2303.09040
Code: https://github.com/Zeqiang-Lai/HSDT
Tags: Hyperspectral Image Denoising

[Back-to-Overview]

Deblurring

Image Deblurring

Multiscale Structure Guided Diffusion for Image Deblurring

Paper: https://arxiv.org/abs/2212.01789

Multi-Scale Residual Low-Pass Filter Network for Image Deblurring

Paper: ICCV Open Access Version

Single Image Defocus Deblurring via Implicit Neural Inverse Kernels

Paper: ICCV Open Access Version

Single Image Deblurring with Row-dependent Blur Magnitude

Paper: ICCV Open Access Version

Non-Coaxial Event-Guided Motion Deblurring with Spatial Alignment

Paper: ICCV Open Access Version
Tags: Event-Based

Generalizing Event-Based Motion Deblurring in Real-World Scenarios

Video Deblurring

Exploring Temporal Frequency Spectrum in Deep Video Deblurring

Paper: ICCV Open Access Version

[Back-to-Overview]

Deraining

From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal

Paper: https://arxiv.org/abs/2308.03867
Code: https://github.com/yunguo224/LHP-Rain

Learning Rain Location Prior for Nighttime Deraining

Paper: ICCV Open Access Version
Code: https://github.com/zkawfanx/RLP

Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks

Unsupervised Video Deraining with An Event Camera

Paper: ICCV Open Access Version

Both Diverse and Realism Matter: Physical Attribute and Style Alignment for Rainy Image Generation

Paper: ICCV Open Access Version

[Back-to-Overview]

Dehazing

MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing

Paper: https://arxiv.org/abs/2308.14036
Code: https://github.com/FVL2020/ICCV-2023-MB-TaylorFormer

[Back-to-Overview]

Demosaicing

Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors

Paper: https://arxiv.org/abs/2307.10667

[Back-to-Overview]

HDR Imaging / Multi-Exposure Image Fusion

Alignment-free HDR Deghosting with Semantics Consistent Transformer

Paper: https://arxiv.org/abs/2305.18135
Code: https://github.com/Zongwei97/SCTNet

MEFLUT: Unsupervised 1D Lookup Tables for Multi-exposure Image Fusion

Paper: https://arxiv.org/abs/2309.11847
Code: https://github.com/Hedlen/MEFLUT

RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image

Paper: https://arxiv.org/abs/2309.02020
Code: https://github.com/jackzou233/RawHDR

Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction

Paper: https://arxiv.org/abs/2309.03900

LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction

Paper: https://arxiv.org/abs/2308.11116

Joint Demosaicing and Deghosting of Time-Varying Exposures for Single-Shot HDR Imaging

GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild

Paper: https://arxiv.org/abs/2211.12352

Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Prediction

Paper: https://arxiv.org/abs/2304.12372
Code: https://github.com/lvsn/beyondthepixel

[Back-to-Overview]

Frame Interpolation

Video Object Segmentation-aware Video Frame Interpolation

Paper: ICCV Open Access Version
Code: https://github.com/junsang7777/VOS-VFI

Rethinking Video Frame Interpolation from Shutter Mode Induced Degradation

Paper: ICCV Open Access Version

[Back-to-Overview]

Image Enhancement

Iterative Prompt Learning for Unsupervised Backlit Image Enhancement

Paper: https://arxiv.org/abs/2303.17569
Code: https://github.com/ZhexinLiang/CLIP-LIT

Low-Light Image Enhancement

ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

Paper: https://arxiv.org/abs/2307.07710
Code: https://github.com/wyf0912/ExposureDiffusion

Implicit Neural Representation for Cooperative Low-light Image Enhancement

Paper: https://arxiv.org/abs/2303.11722
Code: https://github.com/Ysz2022/NeRCo

Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Paper: https://arxiv.org/abs/2308.08220

Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model

Paper: https://arxiv.org/abs/2308.13164

Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement

Paper: https://arxiv.org/abs/2303.06705
Code: https://github.com/caiyuanhao1998/Retinexformer

Low-Light Image Enhancement with Multi-Stage Residue Quantization and Brightness-Aware Attention

Paper: ICCV Open Access Version

Dancing in the Dark: A Benchmark towards General Low-light Video Enhancement

Paper: ICCV Open Access Version
Code: https://github.com/ciki000/DID

NIR-assisted Video Enhancement via Unpaired 24-hour Data

Paper: ICCV Open Access Version
Code: https://github.com/MyNiuuu/NVEU

Coherent Event Guided Low-Light Video Enhancement

Paper: ICCV Open Access Version
Code: https://github.com/sherrycattt/EvLowLight

[Back-to-Overview]

Image Harmonization/Composition

Deep Image Harmonization with Learnable Augmentation

Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation

TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition

Paper: https://arxiv.org/abs/2307.12493
Code: https://github.com/Shilin-LU/TF-ICON

[Back-to-Overview]

Image Completion/Inpainting

Diverse Inpainting and Editing with GAN Inversion

Paper: https://arxiv.org/abs/2307.15033

Rethinking Fast Fourier Convolution in Image Inpainting

Paper: ICCV Open Access Version

Continuously Masked Transformer for Image Inpainting

Paper: ICCV Open Access Version
Code: https://github.com/keunsoo-ko/CMT

MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices

Paper: ICCV Open Access Version
Code: https://github.com/Picsart-AI-Research/MI-GAN

PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting

Paper: https://arxiv.org/abs/2304.06107
Code: https://github.com/humansensinglab/PATMAT

Video Inpainting

ProPainter: Improving Propagation and Transformer for Video Inpainting

Paper: https://arxiv.org/abs/2309.03897
Code: https://github.com/sczhou/ProPainter

Semantic-Aware Dynamic Parameter for Video Inpainting Transformer

Paper: ICCV Open Access Version

CIRI: Curricular Inactivation for Residue-aware One-shot Video Inpainting

Paper: ICCV Open Access Version
Code: https://github.com/Arise-zwy/CIRI

[Back-to-Overview]

Image Stitching

Parallax-Tolerant Unsupervised Deep Image Stitching

Paper: https://arxiv.org/abs/2302.08207
Code: https://github.com/nie-lang/UDIS2

[Back-to-Overview]

Image Compression

RFD-ECNet: Extreme Underwater Image Compression with Reference to Feature Dictionary

Paper: ICCV Open Access Version
Code: https://github.com/lilala0/RFD-ECNet

COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability

Paper: https://arxiv.org/abs/2309.07926
Code: https://github.com/ImJongminPark/COMPASS

Computationally-Efficient Neural Image Compression with Shallow Decoders

Paper: ICCV Open Access Version
Code: https://github.com/mandt-lab/shallow-ntc

Dec-Adapter: Exploring Efficient Decoder-Side Adapter for Bridging Screen Content and Natural Image Compression

Paper: ICCV Open Access Version

Semantically Structured Image Compression via Irregular Group-Based Decoupling

Paper: https://arxiv.org/abs/2305.02586

TransTIC: Transferring Transformer-based Image Compression from Human Perception to Machine Perception

Paper: https://arxiv.org/abs/2306.05085
Code: https://github.com/NYCU-MAPL/TransTIC

AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing

Paper: ICCV Open Access Version

COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec

Paper: ICCV Open Access Version
Code: https://github.com/Orange-OpenSource/Cool-Chic

Video Compression

Scene Matters: Model-based Deep Video Compression

Paper: https://arxiv.org/abs/2303.04557

[Back-to-Overview]

Image Quality Assessment

Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks

Test Time Adaptation for Blind Image Quality Assessment

Paper: https://arxiv.org/abs/2307.14735

Troubleshooting Ethnic Quality Bias with Curriculum Domain Adaptation for Face Image Quality Assessment

Paper: ICCV Open Access Version
Code: https://github.com/oufuzhao/EQBM

SQAD: Automatic Smartphone Camera Quality Assessment and Benchmarking

Paper: ICCV Open Access Version
Code: https://github.com/aiff22/SQAD

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

Paper: https://arxiv.org/abs/2211.04894
Code: https://github.com/VQAssessment/DOVER

[Back-to-Overview]

Style Transfer

AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks

Paper: https://arxiv.org/abs/2307.09724
Code: https://github.com/Kibeom-Hong/AesPA-Net

Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers

Paper: https://arxiv.org/abs/2304.11335
Code: https://github.com/NevSNev/UniST

All-to-key Attention for Arbitrary Style Transfer

Paper: https://arxiv.org/abs/2212.04105
Code: https://github.com/LearningHx/StyA2K

StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models

Paper: https://arxiv.org/abs/2308.07863

StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model

Paper: https://arxiv.org/abs/2303.09268
Code: https://github.com/zipengxuc/StylerDALLE

Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Paper: https://arxiv.org/abs/2303.08622
Code (unofficial): https://github.com/ouhenio/text-guided-diffusion-style-transfer

HairNeRF: Geometry-Aware Image Synthesis for Hairstyle Transfer

Paper: ICCV Open Access Version

[Back-to-Overview]

Image Editing

Adaptive Nonlinear Latent Transformation for Conditional Face Editing

Paper: https://arxiv.org/abs/2307.07790
Code: https://github.com/Hzzone/AdaTrans

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Paper: https://arxiv.org/abs/2304.08465
Code: https://github.com/TencentARC/MasaCtrl

Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation

HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending

Paper: ICCV Open Access Version
Code: https://github.com/wty-ustc/HairCLIPv2

StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces

Paper: https://arxiv.org/abs/2303.06146
Code: https://github.com/williamyang1991/StyleGANEX

Diverse Inpainting and Editing with GAN Inversion

Paper: https://arxiv.org/abs/2307.15033

Effective Real Image Editing with Accelerated Iterative Diffusion Inversion

Paper: https://arxiv.org/abs/2309.04907

Conceptual and Hierarchical Latent Space Decomposition for Face Editing

Paper: ICCV Open Access Version

Editing Implicit Assumptions in Text-to-Image Diffusion Models

Paper: https://arxiv.org/abs/2303.08084
Code: https://github.com/bahjat-kawar/time-diffusion

Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models

Paper: https://arxiv.org/abs/2305.04441

A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance

Paper: ICCV Open Access Version
Code: https://github.com/ChenWu98/cycle-diffusion

Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation

Paper: https://arxiv.org/abs/2212.09262

Video Editing

RIGID: Recurrent GAN Inversion and Editing of Real Face Videos

Paper: https://arxiv.org/abs/2308.06097
Code: https://github.com/cnnlstm/RIGID

Pix2Video: Video Editing using Image Diffusion

Paper: https://arxiv.org/abs/2303.12688
Code: https://github.com/duyguceylan/pix2video

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

Paper: https://arxiv.org/abs/2303.09535
Code: https://github.com/ChenyangQiQi/FateZero

StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Paper: https://arxiv.org/abs/2308.09592
Code: https://github.com/rese1f/StableVideo

VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs

Paper: https://arxiv.org/abs/2304.06020
Code: https://github.com/MoayedHajiAli/VidStyleODE-official

[Back-to-Overview]

Image Generation/Synthesis / Image-to-Image Translation

Text-to-Image / Text Guided / Multi-Modal

Adding Conditional Control to Text-to-Image Diffusion Models

Paper: https://arxiv.org/abs/2302.05543
Code: https://github.com/lllyasviel/ControlNet

MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models

Paper: https://arxiv.org/abs/2303.13126
Code: https://github.com/MagicFusion/MagicFusion.github.io

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation

Paper: https://arxiv.org/abs/2302.13848
Code: https://github.com/csyxwei/ELITE

Unleashing Text-to-Image Diffusion Models for Visual Perception

Paper: https://arxiv.org/abs/2303.02153
Code: https://github.com/wl-zhao/VPD

Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Paper: https://arxiv.org/abs/2307.10816
Code: https://github.com/Sierkinhane/BoxDiff

Ablating Concepts in Text-to-Image Diffusion Models

Paper: https://arxiv.org/abs/2303.13516
Code: https://github.com/nupurkmr9/concept-ablation

Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis

Paper: https://arxiv.org/abs/2308.08157
Code: https://github.com/pmh9960/GCDP

HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation

Paper: https://arxiv.org/abs/2304.04269
Code: https://github.com/IDEA-Research/HumanSD

Story Visualization by Online Text Augmentation with Context Memory

Paper: https://arxiv.org/abs/2308.07575

DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment

Paper: https://arxiv.org/abs/2308.11206

Dense Text-to-Image Generation with Attention Modulation

Paper: https://arxiv.org/abs/2308.12964
Code: https://github.com/naver-ai/DenseDiffusion

ITI-GEN: Inclusive Text-to-Image Generation

Paper: https://arxiv.org/abs/2309.05569
Project: https://czhang0528.github.io/iti-gen

Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis

Paper: https://arxiv.org/abs/2211.02408

Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models

Paper: https://arxiv.org/abs/2304.01515

Human Preference Score: Better Aligning Text-to-Image Models with Human Preference

Paper: https://arxiv.org/abs/2303.14420
Code: https://github.com/tgxs002/align_sd

Zero-shot spatial layout conditioning for text-to-image diffusion models

Paper: https://arxiv.org/abs/2306.13754

A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis

Paper: ICCV Open Access Version

Evaluating Data Attribution for Text-to-Image Models

Paper: https://arxiv.org/abs/2306.09345
Code: https://github.com/PeterWang512/GenDataAttribution

Expressive Text-to-Image Generation with Rich Text

Paper: https://arxiv.org/abs/2304.06720
Code: https://github.com/SongweiGe/rich-text-to-image

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis

Localizing Object-level Shape Variations with Text-to-Image Diffusion Models

Paper: https://arxiv.org/abs/2303.11306
Code: https://github.com/orpatashnik/local-prompt-mixing

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models

Paper: ICCV Open Access Version

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

Paper: https://arxiv.org/abs/2304.05390
Code: https://github.com/eslambakr/HRS_benchmark

Anti-DreamBooth: Protecting Users from Personalized Text-to-image Synthesis

Paper: https://arxiv.org/abs/2303.15433
Code: https://github.com/VinAIResearch/Anti-DreamBooth

Discriminative Class Tokens for Text-to-Image Diffusion Models

Paper: https://arxiv.org/abs/2303.17155
Code: https://github.com/idansc/discriminative_class_tokens

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

Paper: https://arxiv.org/abs/2303.10056
Code: https://github.com/salesforce/GlueGen

Image-to-Image / Image Guided

Reinforced Disentanglement for Face Swapping without Skip Connection

Paper: https://arxiv.org/abs/2307.07928

BlendFace: Re-designing Identity Encoders for Face-Swapping

Paper: https://arxiv.org/abs/2307.10854
Code: https://github.com/mapooon/BlendFace

General Image-to-Image Translation with One-Shot Image Guidance

GaFET: Learning Geometry-aware Facial Expression Translation from In-The-Wild Images

Paper: https://arxiv.org/abs/2308.03413

Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation

Paper: https://arxiv.org/abs/2308.12968
Code: https://github.com/Yuxinn-J/Scenimefy

UGC: Unified GAN Compression for Efficient Image-to-Image Translation

Paper: https://arxiv.org/abs/2309.09310

Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis

Paper: https://arxiv.org/abs/2310.00224

Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion

Paper: ICCV Open Access Version
Code: https://github.com/BrandonHanx/PoCoLD

Others for image generation

Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration

Paper: https://arxiv.org/abs/2307.09621

Masked Diffusion Transformer is a Strong Image Synthesizer

Paper: https://arxiv.org/abs/2303.14389
Code: https://github.com/sail-sg/MDT

Q-Diffusion: Quantizing Diffusion Models

Paper: https://arxiv.org/abs/2302.04304
Code: https://github.com/Xiuyu-Li/q-diffusion

The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation

Paper: https://arxiv.org/abs/2211.12347
Code: https://github.com/lingxiao-li/HAE

LFS-GAN: Lifelong Few-Shot Image Generation

Paper: https://arxiv.org/abs/2308.11917
Code: https://github.com/JJuOn/LFS-GAN

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

Paper: https://arxiv.org/abs/2303.09833
Code: https://github.com/vvictoryuki/FreeDoM

Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations

Paper: https://arxiv.org/abs/2308.10554

Smoothness Similarity Regularization for Few-Shot GAN Adaptation

Paper: https://arxiv.org/abs/2308.09717

UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation

Paper: https://arxiv.org/abs/2309.14335
Code: https://github.com/UnitedHuman/UnitedHuman

Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation

Paper: https://arxiv.org/abs/2304.13681
Code: https://github.com/echen01/ray-conditioning

Personalized Image Generation for Color Vision Deficiency Population

Paper: ICCV Open Access Version

EGC: Image Generation and Classification via a Diffusion Energy-Based Model

Paper: https://arxiv.org/abs/2304.02012
Code: https://github.com/guoqiushan/egc

Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers

Paper: ICCV Open Access Version

Neural Characteristic Function Learning for Conditional Image Generation

Paper: ICCV Open Access Version
Code: https://github.com/zhangjialu126/ccf_gan

LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis

Paper: https://arxiv.org/abs/2301.04604
Code: https://github.com/zhujiapeng/linkgan

Perceptual Artifacts Localization for Image Synthesis Tasks

Paper: ICCV Open Access Version
Code: https://github.com/owenzlz/PAL4VST

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

Paper: https://arxiv.org/abs/2303.11305
Code: https://github.com/mkshing/svdiff-pytorch

Erasing Concepts from Diffusion Models

Paper: https://arxiv.org/abs/2303.07345
Code: https://github.com/rohitgandikota/erasing

A Complete Recipe for Diffusion Generative Models

Paper: ICCV Open Access Version
Code: https://github.com/mandt-lab/PSLD

Efficient Diffusion Training via Min-SNR Weighting Strategy

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

Paper: https://arxiv.org/abs/2309.03729
Code: https://github.com/sjtuplayer/few-shot-diffusion

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration

Paper: https://arxiv.org/abs/2309.10438
Code: https://github.com/lilijiangg/AutoDiffusion

Video Generation

Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer

Paper: https://arxiv.org/abs/2307.07754
Code: https://github.com/rocketappslab/bdmm

MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions

Paper: https://arxiv.org/abs/2307.10008
Code: https://github.com/DreamtaleCore/MODA

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation

Paper: https://arxiv.org/abs/2308.16909
Code: https://github.com/johannwyh/StyleInV

The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion

Paper: https://arxiv.org/abs/2309.04509
Project: https://ku-vai.github.io/TPoS/

SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning

Paper: ICCV Open Access Version

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Paper: https://arxiv.org/abs/2212.11565
Code: https://github.com/showlab/Tune-A-Video

Text2Performer: Text-Driven Human Video Generation

Paper: https://arxiv.org/abs/2304.08483
Code: https://github.com/yumingj/Text2Performer

StyleLipSync: Style-based Personalized Lip-sync Video Generation

Paper: https://arxiv.org/abs/2305.00521
Project: https://stylelipsync.github.io/

Mixed Neural Voxels for Fast Multi-view Video Synthesis

Paper: https://arxiv.org/abs/2212.00190
Code: https://github.com/fengres/mixvoxels

WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction

Paper: https://arxiv.org/abs/2211.14308
Code: https://github.com/16lemoing/waldo

DreamPose: Fashion Video Synthesis with Stable Diffusion

Paper: https://arxiv.org/abs/2304.06025
Code: https://github.com/johannakarras/DreamPose

Structure and Content-Guided Video Synthesis with Diffusion Models

Paper: https://arxiv.org/abs/2302.03011
Project: https://research.runwayml.com/gen1

Others

DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders

Paper: https://arxiv.org/abs/2212.11613
Code: https://github.com/piddnad/DDColor
Tags: Colorization

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

Paper: https://arxiv.org/abs/2303.06840
Code: https://github.com/Zhaozixiang1228/MMIF-DDFM
Tags: Image Fusion

Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer

Paper: https://arxiv.org/abs/2212.03434
Code: https://github.com/ryeocthiv/CQFormer

Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging

Paper: https://arxiv.org/abs/2306.11316
Code: https://github.com/zsm1211/CTM-SCI
Tags: Snapshot Compressive Imaging

Deep Optics for Video Snapshot Compressive Imaging

Paper:
Code: https://github.com/pwangcs/DeepOpticsSCI
Tags: Snapshot Compressive Imaging

SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning

Paper: https://arxiv.org/abs/2308.09040
Code: https://github.com/fh2019ustc/SimFIR
Tags: Fisheye Image Rectification

Single Image Reflection Separation via Component Synergy

Paper: https://arxiv.org/abs/2308.10027
Code: https://github.com/mingcv/DSRNet
Tags: Image Reflection Separation

Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion

Paper: https://arxiv.org/abs/2308.16083
Tags: Pan-Sharpening

Talking Head Generation

Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation

Paper: https://arxiv.org/abs/2307.09906
Code: https://github.com/harlanhong/ICCV2023-MCNET

Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation

Paper: https://arxiv.org/abs/2309.04946
Code: https://github.com/yuangan/EAT_code

Handwriting/Font Generation

Few Shot Font Generation via Transferring Similarity Guided Global Style and Quantization Local Style

Paper: https://arxiv.org/abs/2309.00827
Code: https://github.com/awei669/VQ-Font