Awesome-ICCV2023-Low-Level-Vision
A collection of papers and code from ICCV 2023 related to low-level vision.
[Under construction] If you find missing papers or typos, feel free to open an issue or a pull request.
Related collections for low-level vision
- Awesome-ICCV2021-Low-Level-Vision
- Awesome-CVPR2023/2022-Low-Level-Vision
- Awesome-NeurIPS2022/2021-Low-Level-Vision
- Awesome-ECCV2022-Low-Level-Vision
- Awesome-AAAI2022-Low-Level-Vision
- Awesome-CVPR2021/2020-Low-Level-Vision
- Awesome-ECCV2020-Low-Level-Vision
Overview
Image Restoration
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
DiffIR: Efficient Diffusion Model for Image Restoration
- Paper: https://arxiv.org/abs/2303.09472
- Code: https://github.com/Zj-BinXia/DiffIR
- Tags: Diffusion
PIRNet: Privacy-Preserving Image Restoration Network via Wavelet Lifting
Focal Network for Image Restoration
- Paper: ICCV Open Access Version
- Code: https://github.com/c-yn/FocalNet
Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration
Under-Display Camera Image Restoration with Scattering Effect
- Paper: https://arxiv.org/abs/2308.04163
- Code: https://github.com/NamecantbeNULL/SRUDC
- Tags: Under-Display Camera
FSI: Frequency and Spatial Interactive Learning for Image Restoration in Under-Display Cameras
- Paper: ICCV Open Access Version
- Tags: Under-Display Camera
Multi-weather Image Restoration via Domain Translation
- Paper: ICCV Open Access Version
- Code: https://github.com/pwp1208/Domain_Translation_Multi-weather_Restoration
- Tags: Multi-weather
Adverse Weather Removal with Codebook Priors
- Paper: ICCV Open Access Version
- Tags: Adverse Weather Removal
Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond
- Paper: https://arxiv.org/abs/2307.08996
- Tags: Authentic Face Restoration, Diffusion
Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery
- Paper: https://arxiv.org/abs/2308.16460
- Code: https://github.com/YuyanZhou1/Improving-Lens-Flare-Removal
- Tags: Flare Removal
High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
- Paper: https://arxiv.org/abs/2308.14221
- Code: https://github.com/CXH-Research/DocShadow-SD7K
- Tags: Document Shadow Removal
Boundary-Aware Divide and Conquer: A Diffusion-Based Solution for Unsupervised Shadow Removal
- Paper: ICCV Open Access Version
- Tags: Shadow Removal
Leveraging Inpainting for Single-Image Shadow Removal
- Paper: https://arxiv.org/abs/2302.05361
- Tags: Shadow Removal
Fine-grained Visible Watermark Removal
- Paper: https://openaccess.thecvf.com/content/ICCV2023/html/Niu_Fine-grained_Visible_Watermark_Removal_ICCV_2023_paper.html
- Tags: Watermark Removal
Physics-Driven Turbulence Image Restoration with Stochastic Refinement
- Paper: https://arxiv.org/abs/2307.10603
- Code: https://github.com/VITA-Group/PiRN
- Tags: Turbulence Image
Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild
- Paper: ICCV Open Access Version
- Code: https://github.com/Shaocr/Building-Bridge-Across-the-Time-Disruption-and-Restoration-of-Murals-In-the-Wild
- Tags: Murals Restoration
DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration
- Paper: https://arxiv.org/abs/2303.06682
- Code: https://github.com/miaoyuchun/DDS2M
- Tags: Diffusion, Hyperspectral
Fingerprinting Deep Image Restoration Models
- Paper: ICCV Open Access Version
Self-supervised Monocular Underwater Depth Recovery, Image Restoration, and a Real-sea Video Dataset
Image Reconstruction
Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction
Video Restoration
Snow Removal in Video: A New Dataset and A Novel Method
- Paper: ICCV Open Access Version
- Code: https://github.com/haoyuc/VideoDesnowing
- Tags: Desnowing
Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation
Fast Full-frame Video Stabilization with Iterative Optimization
- Paper: https://arxiv.org/abs/2307.12774
- Code: https://github.com/zwyking/Fast-Stab
- Tags: Video Stabilization
Minimum Latency Deep Online Video Stabilization
- Paper: https://arxiv.org/abs/2212.02073
- Code: https://github.com/liuzhen03/NNDVS
- Tags: Video Stabilization
Task Agnostic Restoration of Natural Video Dynamics
Super Resolution
Image Super Resolution
On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement
SRFormer: Permuted Self-Attention for Single Image Super-Resolution
DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution
Dual Aggregation Transformer for Image Super-Resolution
MSRA-SR: Image Super-resolution Transformer with Multi-scale Shared Representation Acquisition
- Paper: ICCV Open Access Version
Content-Aware Local GAN for Photo-Realistic Super-Resolution
- Paper: https://openaccess.thecvf.com/content/ICCV2023/html/Park_Content-Aware_Local_GAN_for_Photo-Realistic_Super-Resolution_ICCV_2023_paper.html
- Code: https://github.com/jkpark0825/CAL
- Tags: GAN
Boosting Single Image Super-Resolution via Partial Channel Shifting
- Paper: ICCV Open Access Version
- Code: https://github.com/OwXiaoM/_PCS
Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution
Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution
- Paper: https://arxiv.org/abs/2302.13800
- Code: https://github.com/sunny2109/SAFMN
- Tags: Efficient
Lightweight Image Super-Resolution with Superpixel Token Interaction
- Paper: ICCV Open Access Version
- Code: https://github.com/ArcticHare105/SPIN
- Tags: Lightweight
Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution
- Paper: https://arxiv.org/abs/2307.08544
- Code: https://github.com/liuguandu/RC-LUT
- Tags: Efficient
Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution
- Paper: https://arxiv.org/abs/2211.13654
- Code: https://github.com/Jiamian-Wang/Iterative-Soft-Shrinkage-SR
- Tags: Efficient
MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces
- Paper: https://arxiv.org/abs/2309.08113
- Code: https://github.com/yinzhicun/MetaF2N
- Tags: Blind
Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution
- Paper: ICCV Open Access Version
- Code: https://github.com/edbca/DARSR
- Tags: Blind
LMR: A Large-Scale Multi-Reference Dataset for Reference-Based Super-Resolution
- Paper: https://arxiv.org/abs/2303.04970
- Tags: Reference-Based
Real-CE: A Benchmark for Chinese-English Scene Text Image Super-resolution
- Paper: https://arxiv.org/abs/2308.03262
- Code: https://github.com/mjq11302010044/Real-CE
- Tags: Text SR
Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution
- Paper: https://arxiv.org/abs/2302.08058
- Code: https://github.com/ZhengyuLiang24/EPIT
- Tags: Light Field
Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution
- Paper: https://arxiv.org/abs/2303.08942
- Code: https://github.com/Zhaozixiang1228/GDSR-SSDNet
- Tags: Depth Map
HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models
- Paper: ICCV Open Access Version
- Tags: Hyperspectral, Diffusion
ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution
- Paper: https://arxiv.org/abs/2307.14010
- Code: https://github.com/Rexzhan/ESSAformer
- Tags: Hyperspectral
Rethinking Multi-Contrast MRI Super-Resolution: Rectangle-Window Cross-Attention Transformer and Arbitrary-Scale Upsampling
- Paper: ICCV Open Access Version
- Tags: MRI
Decomposition-Based Variational Network for Multi-Contrast MRI Super-Resolution and Reconstruction
- Paper: ICCV Open Access Version
- Code: https://github.com/lpcccc-cv/MC-VarNet
- Tags: MRI
CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution
- Paper: https://arxiv.org/abs/2303.16242
- Code: https://github.com/NarcissusEx/CuNeRF
- Tags: Medical, NeRF
Burst Super Resolution
Towards Real-World Burst Image Super-Resolution: Benchmark and Method
- Paper: https://arxiv.org/abs/2309.04803
- Code: https://github.com/yjsunnn/FBANet
- Tags: Real-World
Self-Supervised Burst Super-Resolution
- Paper: ICCV Open Access Version
- Tags: Self-Supervised
Video Super Resolution
Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution
Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution
- Paper: ICCV Open Access Version
Spatial-Temporal Video Super-Resolution
MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution
Image Rescaling
Downscaled Representation Matters: Improving Image Rescaling with Collaborative Downscaled Images
Denoising
Image Denoising
Random Sub-Samples Generation for Self-Supervised Real Image Denoising
- Paper: https://arxiv.org/abs/2307.16825
- Code: https://github.com/p1y2z3/SDAP
- Tags: Self-Supervised
Score Priors Guided Deep Variational Inference for Unsupervised Real-World Single Image Denoising
- Paper: https://arxiv.org/abs/2308.04682
- Tags: Unsupervised
Unsupervised Image Denoising in Real-World Scenarios via Self-Collaboration Parallel Generative Adversarial Branches
- Paper: https://arxiv.org/abs/2308.06776
- Tags: Unsupervised
Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network
- Paper: https://arxiv.org/abs/2304.09507
- Code: https://github.com/jyicu/CBSN
- Tags: Self-Supervised
Multi-view Self-supervised Disentanglement for General Image Denoising
- Paper: https://arxiv.org/abs/2309.05049
- Tags: Self-Supervised
Iterative Denoiser and Noise Estimator for Self-Supervised Image Denoising
- Paper: ICCV Open Access Version
- Tags: Self-Supervised
Noise2Info: Noisy Image to Information of Noise for Self-Supervised Image Denoising
- Paper: ICCV Open Access Version
- Tags: Self-Supervised
The Devil is in the Upsampling: Architectural Decisions Made Simpler for Denoising with Deep Image Prior
- Paper: https://arxiv.org/abs/2304.11409
- Code: https://github.com/YilinLiu97/FasterDIP-devil-in-upsampling
Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising
ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
Towards General Low-Light Raw Noise Synthesis and Modeling
- Paper: https://arxiv.org/abs/2307.16508
- Code: https://github.com/fengzhang427/LRD
- Tags: Noise Modeling
Hybrid Spectral Denoising Transformer with Guided Attention
- Paper: https://arxiv.org/abs/2303.09040
- Code: https://github.com/Zeqiang-Lai/HSDT
- Tags: Hyperspectral Image Denoising
Deblurring
Image Deblurring
Multiscale Structure Guided Diffusion for Image Deblurring
Multi-Scale Residual Low-Pass Filter Network for Image Deblurring
- Paper: ICCV Open Access Version
Single Image Defocus Deblurring via Implicit Neural Inverse Kernels
- Paper: ICCV Open Access Version
Single Image Deblurring with Row-dependent Blur Magnitude
- Paper: ICCV Open Access Version
Non-Coaxial Event-Guided Motion Deblurring with Spatial Alignment
- Paper: ICCV Open Access Version
- Tags: Event-Based
Generalizing Event-Based Motion Deblurring in Real-World Scenarios
- Paper: https://arxiv.org/abs/2308.05932
- Code: https://github.com/XiangZ-0/GEM
- Tags: Event-Based
Video Deblurring
Exploring Temporal Frequency Spectrum in Deep Video Deblurring
- Paper: ICCV Open Access Version
Deraining
From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
Learning Rain Location Prior for Nighttime Deraining
- Paper: ICCV Open Access Version
- Code: https://github.com/zkawfanx/RLP
Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks
- Paper: https://arxiv.org/abs/2308.14153
- Code: https://github.com/Ephemeral182/UDR-S2Former_deraining
Unsupervised Video Deraining with An Event Camera
- Paper: ICCV Open Access Version
Both Diverse and Realism Matter: Physical Attribute and Style Alignment for Rainy Image Generation
- Paper: ICCV Open Access Version
Dehazing
MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing
Demosaicing
Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors
HDR Imaging / Multi-Exposure Image Fusion
Alignment-free HDR Deghosting with Semantics Consistent Transformer
MEFLUT: Unsupervised 1D Lookup Tables for Multi-exposure Image Fusion
RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image
Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction
LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction
Joint Demosaicing and Deghosting of Time-Varying Exposures for Single-Shot HDR Imaging
- Paper: https://vclab.kaist.ac.kr/iccv2023/iccv2023-single-shot-hdr-imaging.pdf
- Code: https://github.com/KAIST-VCLAB/singshot-hdr-demosaicing
GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild
Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Prediction
Frame Interpolation
Video Object Segmentation-aware Video Frame Interpolation
Rethinking Video Frame Interpolation from Shutter Mode Induced Degradation
- Paper: ICCV Open Access Version
Image Enhancement
Iterative Prompt Learning for Unsupervised Backlit Image Enhancement
Low-Light Image Enhancement
ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
Implicit Neural Representation for Cooperative Low-light Image Enhancement
Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network
Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model
Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement
Low-Light Image Enhancement with Multi-Stage Residue Quantization and Brightness-Aware Attention
- Paper: ICCV Open Access Version
Dancing in the Dark: A Benchmark towards General Low-light Video Enhancement
- Paper: ICCV Open Access Version
- Code: https://github.com/ciki000/DID
NIR-assisted Video Enhancement via Unpaired 24-hour Data
- Paper: ICCV Open Access Version
- Code: https://github.com/MyNiuuu/NVEU
Coherent Event Guided Low-Light Video Enhancement
Image Harmonization/Composition
Deep Image Harmonization with Learnable Augmentation
- Paper: https://arxiv.org/abs/2308.00376
- Code: https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization
Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation
- Paper: https://arxiv.org/abs/2308.00356
- Code: https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
Image Completion/Inpainting
Diverse Inpainting and Editing with GAN Inversion
Rethinking Fast Fourier Convolution in Image Inpainting
- Paper: ICCV Open Access Version
Continuously Masked Transformer for Image Inpainting
MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices
PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting
Video Inpainting
ProPainter: Improving Propagation and Transformer for Video Inpainting
Semantic-Aware Dynamic Parameter for Video Inpainting Transformer
- Paper: ICCV Open Access Version
CIRI: Curricular Inactivation for Residue-aware One-shot Video Inpainting
Image Stitching
Parallax-Tolerant Unsupervised Deep Image Stitching
Image Compression
RFD-ECNet: Extreme Underwater Image Compression with Reference to Feature Dictionary
COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability
Computationally-Efficient Neural Image Compression with Shallow Decoders
Dec-Adapter: Exploring Efficient Decoder-Side Adapter for Bridging Screen Content and Natural Image Compression
- Paper: ICCV Open Access Version
Semantically Structured Image Compression via Irregular Group-Based Decoupling
TransTIC: Transferring Transformer-based Image Compression from Human Perception to Machine Perception
AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing
- Paper: ICCV Open Access Version
COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec
Video Compression
Scene Matters: Model-based Deep Video Compression
Image Quality Assessment
Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks
- Paper: ICCV Open Access Version
- Code: https://github.com/woshidandan/Image-Color-Aesthetics-Assessment/tree/main
Test Time Adaptation for Blind Image Quality Assessment
Troubleshooting Ethnic Quality Bias with Curriculum Domain Adaptation for Face Image Quality Assessment
- Paper: ICCV Open Access Version
- Code: https://github.com/oufuzhao/EQBM
SQAD: Automatic Smartphone Camera Quality Assessment and Benchmarking
- Paper: ICCV Open Access Version
- Code: https://github.com/aiff22/SQAD
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
Style Transfer
AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks
Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers
All-to-key Attention for Arbitrary Style Transfer
StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
- Paper: https://arxiv.org/abs/2303.08622
- Code (unofficial): https://github.com/ouhenio/text-guided-diffusion-style-transfer
HairNeRF: Geometry-Aware Image Synthesis for Hairstyle Transfer
- Paper: ICCV Open Access Version
Image Editing
Adaptive Nonlinear Latent Transformation for Conditional Face Editing
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
- Paper: https://arxiv.org/abs/2304.02051
- Code: https://github.com/aimagelab/multimodal-garment-designer
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation
- Paper: https://arxiv.org/abs/2307.08448
- Code: https://github.com/AndysonYs/Selective-Diffusion-Distillation
HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending
StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
Diverse Inpainting and Editing with GAN Inversion
Effective Real Image Editing with Accelerated Iterative Diffusion Inversion
Conceptual and Hierarchical Latent Space Decomposition for Face Editing
- Paper: ICCV Open Access Version
Editing Implicit Assumptions in Text-to-Image Diffusion Models
Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance
Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation
Video Editing
RIGID: Recurrent GAN Inversion and Editing of Real Face Videos
Pix2Video: Video Editing using Image Diffusion
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
StableVideo: Text-driven Consistency-aware Diffusion Video Editing
VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs
Image Generation/Synthesis / Image-to-Image Translation
Text-to-Image / Text Guided / Multi-Modal
Adding Conditional Control to Text-to-Image Diffusion Models
MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
Unleashing Text-to-Image Diffusion Models for Visual Perception
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
- Paper: https://arxiv.org/abs/2306.05357
- Code: https://github.com/nanlliu/Unsupervised-Compositional-Concepts-Discovery
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Ablating Concepts in Text-to-Image Diffusion Models
Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis
HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
Story Visualization by Online Text Augmentation with Context Memory
DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
Dense Text-to-Image Generation with Attention Modulation
ITI-GEN: Inclusive Text-to-Image Generation
- Paper: https://arxiv.org/abs/2309.05569
- Project: https://czhang0528.github.io/iti-gen
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
Human Preference Score: Better Aligning Text-to-Image Models with Human Preference
Zero-shot Spatial Layout Conditioning for Text-to-Image Diffusion Models
A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis
- Paper: ICCV Open Access Version
Evaluating Data Attribution for Text-to-Image Models
Expressive Text-to-Image Generation with Rich Text
Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
- Paper: https://arxiv.org/abs/2304.03869
- Code: https://github.com/UCSB-NLP-Chang/Diffusion-SpaceTime-Attn
Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
- Paper: ICCV Open Access Version
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Anti-DreamBooth: Protecting Users from Personalized Text-to-image Synthesis
Discriminative Class Tokens for Text-to-Image Diffusion Models
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Image-to-Image / Image Guided
Reinforced Disentanglement for Face Swapping without Skip Connection
BlendFace: Re-designing Identity Encoders for Face-Swapping
General Image-to-Image Translation with One-Shot Image Guidance
- Paper: https://arxiv.org/abs/2307.14352
- Code: https://github.com/CrystalNeuro/visual-concept-translator
GaFET: Learning Geometry-aware Facial Expression Translation from In-The-Wild Images
Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
UGC: Unified GAN Compression for Efficient Image-to-Image Translation
Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis
Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion
Others for image generation
Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration
Masked Diffusion Transformer is a Strong Image Synthesizer
Q-Diffusion: Quantizing Diffusion Models
The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation
LFS-GAN: Lifelong Few-Shot Image Generation
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
Smoothness Similarity Regularization for Few-Shot GAN Adaptation
UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation
Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation
Personalized Image Generation for Color Vision Deficiency Population
- Paper: ICCV Open Access Version
EGC: Image Generation and Classification via a Diffusion Energy-Based Model
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
- Paper: ICCV Open Access Version
Neural Characteristic Function Learning for Conditional Image Generation
LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis
Perceptual Artifacts Localization for Image Synthesis Tasks
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Erasing Concepts from Diffusion Models
A Complete Recipe for Diffusion Generative Models
Efficient Diffusion Training via Min-SNR Weighting Strategy
- Paper: https://arxiv.org/abs/2303.09556
- Code: https://github.com/TiankaiHang/Min-SNR-Diffusion-Training
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption
AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration
Video Generation
Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer
MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
- Paper: https://arxiv.org/abs/2303.13439
- Code: https://github.com/Picsart-AI-Research/Text2Video-Zero
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion
- Paper: https://arxiv.org/abs/2309.04509
- Project: https://ku-vai.github.io/TPoS/
SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning
- Paper: ICCV Open Access Version
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Text2Performer: Text-Driven Human Video Generation
StyleLipSync: Style-based Personalized Lip-sync Video Generation
- Paper: https://arxiv.org/abs/2305.00521
- Project: https://stylelipsync.github.io/
Mixed Neural Voxels for Fast Multi-view Video Synthesis
WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction
DreamPose: Fashion Video Synthesis with Stable Diffusion
Structure and Content-Guided Video Synthesis with Diffusion Models
- Paper: https://arxiv.org/abs/2302.03011
- Project: https://research.runwayml.com/gen1
Others
DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders
- Paper: https://arxiv.org/abs/2212.11613
- Code: https://github.com/piddnad/DDColor
- Tags: Colorization
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
- Paper: https://arxiv.org/abs/2303.06840
- Code: https://github.com/Zhaozixiang1228/MMIF-DDFM
- Tags: Image Fusion
Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer
Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
- Paper: https://arxiv.org/abs/2306.11316
- Code: https://github.com/zsm1211/CTM-SCI
- Tags: Snapshot Compressive Imaging
Deep Optics for Video Snapshot Compressive Imaging
- Paper:
- Code: https://github.com/pwangcs/DeepOpticsSCI
- Tags: Snapshot Compressive Imaging
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
- Paper: https://arxiv.org/abs/2308.09040
- Code: https://github.com/fh2019ustc/SimFIR
- Tags: Fisheye Image Rectification
Single Image Reflection Separation via Component Synergy
- Paper: https://arxiv.org/abs/2308.10027
- Code: https://github.com/mingcv/DSRNet
- Tags: Image Reflection Separation
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion
- Paper: https://arxiv.org/abs/2308.16083
- Tags: Pan-sharpening
Talking Head Generation
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation
Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
Handwriting/Font Generation
Few Shot Font Generation via Transferring Similarity Guided Global Style and Quantization Local Style