Awesome
WACV-2025-Papers
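Conference dates: February 28 – March 4, 2025
Conference website: https://wacv2025.thecvf.com/
❣❣❣ WACV 2024 papers: categorization in progress
For 2024 survey papers, see ↘️2024-CV-Surveys
2025 paper collections by category: click here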
↘️WACV-2025-Papers ↘️CVPR-2025-Papers
2024 paper collections by category: click here
↘️WACV-2024-Papers ↘️CVPR-2024-Papers ↘️ECCV-2024-Papers
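2023 paper collections by category: click here
2022 paper collections by category: click here
2021 paper collections by category: click here
2020 paper collections by category: click here
Updated December 13: 2 new papers, 155+2 in total.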
- SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering<br>:star:code
- Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering
Transformer
Dense Prediction
Neural Radiance Fields
Anomaly Detection
- SPACE: SPAtial-aware Consistency rEgularization for anomaly detection in Industrial applications
- Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination
- Anomaly Detection for People with Visual Impairments Using an Egocentric 360-Degree Camera
- ROADS: Robust Prompt-driven Multi-Class Anomaly Detection under Domain Shift
- FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data<br>:star:code
- Anomaly Localization
Deepfake
Robots
- SLAM
Scene
- LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations<br>:star:code
Object Pose Estimation
Dataset/Benchmark
- CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach<br>:star:code
- PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation
- SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection
- SEED4D: A Synthetic Ego–Exo Dynamic 4D Data Generator, Driving Dataset and Benchmark<br>:star:code<br>:star:code
- DrIFT: Autonomous Drone Dataset with Integrated Real and Synthetic Data, Flexible Views, and Transformed Domains<br>:star:code
- Benchmark
Avatars
Vision-Language
- Active Learning for Vision-Language Models
- @Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology<br>:star:code<br>:house:project
- Style-Pro: Style-Guided Prompt Learning for Generalizable Vision-Language Models
- Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis
- Retaining and Enhancing Pre-trained Knowledge in Vision-Language Models with Prompt Ensembling
- Video-Language
- VLN
- Visual Grounding
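Semi/Self-Supervised Learning
- Self-Supervised Learning
Few/Zero-Shot Learning/DG/DA (Domain Generalization/Domain Adaptation)
Machine Learning
- Class-Incremental Learning
- Contrastive Learning
- Continual Learning
- Multi-Task Learning
Human Motion Generation
GAN/Image Synthesis
- Texture Generation
- Image Generation
- Recipe Generation
- Image Editing
- Text-to-Image
- Image-to-Image Translation
Visual Question Answering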
- Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering
OCR
3D (3D Reconstruction/3D Vision)
- Planar Gaussian Splatting
- LIPIDS: Learning-based Illumination Planning In Discretized (Light) Space for Photometric Stereo
- Depth Estimation
Point Cloud
- PocoLoco: A Point Cloud Diffusion Model of Human Shape in Loose Clothing<br>:star:code
- 3D Point Cloud
- Point Cloud Classification
- Point Cloud Segmentation
- Point Cloud Registration
Person Re-id
- ReMix: Training Generalized Person Re-identification on a Mixture of Data
- AnonyNoise: Anonymizing Event Data with Smart Noise to Outsmart Re-Identification and Preserve Privacy<br>:star:code
- Cloth-Changing Re-ID
- Person Search
Action Detection
- Open-Vocabulary Action Detection
- Skeleton-Based Action Recognition
Human Pose Estimation
- Human Reconstruction
- Human Mesh Recovery
- 3D Pose Estimation
- Human Motion Recovery
- Gesture Generation
Medical Image Processing
- PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices<br>:star:code
- Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images<br>:star:code
- AMNCutter: Affinity-Attention-Guided Multi-View Normalized Cutter for Unsupervised Surgical Instrument Segmentation<br>:star:code
- Multimodal Fusion Learning with Dual Attention for Medical Imaging<br>:star:code
- LQ-Adapter: ViT-Adapter with Learnable Queries for Gallbladder Cancer Detection from Ultrasound Image<br>:star:code
- Medical Image Segmentation
- Radiology Report Generation
- MRI
Autonomous Driving
- CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving<br>:house:project
- S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving
- Trajectory Prediction
Biometrics
UAV/Remote Sensing/Satellite Image
- PGRID: Power Grid Reconstruction in Informal Developments Using High-Resolution Aerial Imagery
- Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery<br>:star:code
Object Tracking
- MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation<br>:star:code
- Improving Accuracy and Generalization for Efficient Visual Tracking
- Point Tracking
Object Detection
- Mixed Patch Infrared-Visible Modality Agnostic Object Detection<br>:star:code<br>:house:project
- No Annotations for Object Detection in Art through Stable Diffusion<br>:star:code
- 3D OD
- VOD
Image/Video Retrieval
- Image Retrieval
- Video Retrieval
- Information Retrieval
Image/Video Compression
Image Classification
- Enhancing Visual Classification using Comparative Descriptors
- Class-Conditioned Transformation for Enhanced Robust Image Classification<br>:star:code
Image/Video Processing
- Image Restoration
- Image Inpainting
- Image Enhancement
- Image Quality Assessment
- Video Enhancement
- Video Deblurring
Image Segmentation
- HSDA: High-frequency Shuffle Data Augmentation for Bird's-Eye-View Map Segmentation<br>:star:code
- Semantic Segmentation
- COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes
- Modality-Incremental Learning with Disjoint Relevance Mapping Networks for Image-based Semantic Segmentation
- Epipolar Attention Field Transformers for Bird's Eye View Semantic Segmentation
- Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation
- Few-Shot Semantic Segmentation
- VSS
- VPS
Face
- Continual Learning of Personalized Generative Face Models with Experience Replay<br>:star:code
- Face Recognition
- Face Verification
- Face Generation
- Facial Expression Recognition
- Facial Landmark Detection
Others
- Dense Depth from Event Focal Stack
- MAGMA: Manifold Regularization for MAEs<br>:star:code
- SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting
- Secrets of Edge-Informed Contrast Maximization for Event-Based Vision
- Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier<br>:star:code
- PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement
- Active Event Alignment for Monocular Distance Estimation
- Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets<br>:star:code
- Self-Relaxed Joint Training: Sample Selection for Severity Estimation with Ordinal Noisy Labels<br>:star:code
- EI-Nexus: Towards Unmediated and Flexible Inter-Modality Local Feature Extraction and Matching for Event-Image Data<br>:star:code
- High-Fidelity Document Stain Removal via A Large-Scale Real-World Dataset and A Memory-Augmented Transformer
- SEMU-Net: A Segmentation-based Corrector for Fabrication Process Variations of Nanophotonics with Microscopic Images
- Situational Scene Graph for Structured Human-centric Situation Understanding
- Compositional Segmentation of Cardiac Images Leveraging Metadata<br>:star:code
- DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination
- TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes<br>:star:code
- MS-Glance: Non-semantic context vectors and the applications in supervising image reconstruction<br>:star:code
- Debiasify: Self-Distillation for Unsupervised Bias Mitigation
- TaxaBind: A Unified Embedding Space for Ecological Applications<br>:star:code
- Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications<br>:star:code
- Through the Curved Cover: Synthesizing Cover Aberrated Scenes with Refractive Field
- HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images
- WAFFLE: Multimodal Floorplan Understanding in the Wild<br>:house:project
- Distillation of Diffusion Features for Semantic Correspondence<br>:star:code
- Divergent Domains, Convergent Grading: Enhancing Generalization in Diabetic Retinopathy Grading<br>:star:code
- HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning<br>:star:code
- STLight: a Fully Convolutional Approach for Efficient Predictive Learning by Spatio-Temporal joint Processing
- Design-o-meter: Towards Evaluating and Refining Graphic Designs<br>:star:code
- Ordinal Multiple-instance Learning for Ulcerative Colitis Severity Estimation with Selective Aggregated Transformer<br>:star:code
- TreeFormer: Single-view Plant Skeleton Estimation via Tree-constrained Graph Generation<br>:star:code
- I Spy With My Little Eye: A Minimum Cost Multicut Investigation of Dataset Frames<br>:star:code
- Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation
- SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models<br>:star:code
- EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos<br>:house:project
- Multi-view Image Diffusion via Coordinate Noise and Fourier Attention
- LLS: Local Learning Rule for Deep Neural Networks Inspired by Neural Activity Synchronization
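- SHIP: Structural Hierarchies for Instance-dependent Partial Labels<br>This paper introduces a modular component designed to integrate seamlessly into deep learning architectures, particularly where a label hierarchy exists. SHIP enhances instance-dependent partial label learning (PLL) and improves accuracy by 2.6% across a range of algorithms.
2020 paper collections by category: click here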
↘️CVPR-2020-Papers ↘️ECCV-2020-Papers
<a name="00"/>2021 paper collections by category: click here
↘️ICCV-2021-Papers ↘️CVPR-2021-Papers
<a name="000"/>2022 paper collections by category: click here
↘️CVPR-2022-Papers ↘️WACV-2022-Papers ↘️ECCV-2022-Papers
<a name="0000"/>2023 paper collections by category: click here
↘️CVPR-2023-Papers ↘️WACV-2023-Papers ↘️ICCV-2023-Papers ↘️2023-CV-Surveys