Awesome

WACV-2024-Papers

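Conference dates: January 3-7, 2024

Conference website: https://wacv2024.thecvf.com/

❣❣❣ The categorized list of WACV 2024 papers is complete

📢📢📢 Award-winning papers

For 2024 survey papers, see ↘️2024-CV-Surveys

Categorized paper collections for 2024: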

↘️WACV-2024-Papers

Categorized paper collections for 2023:

↘️CVPR-2023-Papers ↘️WACV-2023-Papers ↘️ICCV-2023-Papers ↘️2023-CV-Surveys

| :cat: | :dog: | :tiger: | :wolf: |
|---|---|---|---|
| 1.Other | 2.SR (Super-Resolution) | 3.Image/Video Retrieval | 4.Image/Video Caption |
| 5.Image/Video Compression | 6.Medical Image | 7.3D (3D reconstruction / 3D vision) | 8.Face |
| 9.Image Segmentation | 10.Object Detection | 11.Object Tracking | 12.UAV/RS/Satellite Image (UAV / remote sensing / satellite imagery) |
| 13.ReID (person re-identification / gait recognition / pedestrian detection) | 14.OCR (text detection and recognition) | 15.Video | 16.Action Detection |
| 17.HPE (Human Pose Estimation) | 18.Animal | 19.Object Pose Estimation | 20.GAN/Generation |
| 21.SLAM/AR/VR/Robotics | 22.VQA (Visual Question Answering) | 23.VL (Vision-Language) | 24.LLM (Large Language Models) |
| 25.Multimodal | 26.Human Motion Prediction | 27.HOI (Human-Object Interaction) | 28.Point-Cloud |
| 29.SGG (Scene Graph Generation) | 30.GNN/GCN | 31.Automated Driving | 32.Scene Flow Estimation |
| 33.Optical Flow Estimation | 34.NAS | 35.MC/KD/Pruning (Model Compression / Knowledge Distillation / Pruning) | 36.NLP |
| 37.ML (Machine Learning) | 38.Visual Representation Learning | 39.Few/Zero-Shot Learning/DG/DA (domain generalization / domain adaptation) | 40.Self/Semi-supervised learning |
| 41.Image Processing (low-level image processing, quality assessment) | 42.Image Classification | 43.Image Fusion | 44.Visual Industrial Inspection |
| 45.Visual Tampering Detection | 46.Dense Prediction | 47.Edge Detection | 48.Image/Video Editing |
| 49.Vision Transformers | 50.Dataset | 51.Sound (audio/speech) | 52.Gaze Estimation |
| 53.Crack Segmentation | 54.Style Transfer | 55.Biometrics | 56.Event Cameras |
| 57.Neural Radiance Fields (NeRF) | 58.Novel View Synthesis | 59.Rendering | 60.Graphic Layout |
| 61.Computational Imaging (e.g., optical, geometric, and light-field imaging) | | | |

61.Computational Imaging (e.g., optical, geometric, and light-field imaging)

60.Graphic Layout

Unsupervised Graphic Layout Grouping with Transformers

59.Rendering

58.Novel View Synthesis

57.Neural Radiance Fields(NeRF)

56.Event Cameras

Masked Event Modeling: Self-Supervised Pretraining for Event Cameras

55.Biometrics

54.Style Transfer

53.Crack Segmentation

Designing a Hybrid Neural System To Learn Real-World Crack Segmentation From Fractal-Based Simulation

52.Gaze Estimation

51.Sound (audio/speech)

Lip synchronization
- Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization
Sound source localization
- Can CLIP Help Sound Source Localization?
Audio separation
- LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
- Visually Guided Audio Source Separation With Meta Consistency Learning
3D sound source detection
- Sound3DVDet: 3D Sound Source Detection Using Multiview Microphone Array and RGB Images
Audio-visual segmentation
- Annotation-Free Audio-Visual Segmentation
Speech video synthesis
- DR2: Disentangled Recurrent Representation Learning for Data-Efficient Speech Video Synthesis
Interactive drum sounds from body rhythm
- Let the Beat Follow You - Creating Interactive Drum Sounds From Body Rhythm

50.Dataset

49.Vision Transformers

48.Image/Video Editing

47.Edge Detection

Self-Supervised Edge Detection Reconstruction for Topology-Informed 3D Axon Segmentation and Centerline Detection

46.Dense Prediction

45.Visual Tampering Detection

44.Visual Industrial Inspection

43.Image Fusion

Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image Fusion<br>:star:code

42.Image Classification

41.Image Processing (low-level image processing, quality assessment)

40.Self/Semi-supervised learning

39.Few/Zero-Shot Learning/Domain Generalization/Adaptation

38.Visual Representation Learning

Group-Wise Contrastive Bottleneck for Weakly-Supervised Visual Representation Learning

37.Machine Learning

36.NLP

Few-Shot Event Classification in Images Using Knowledge Graphs for Prompting

35.Model Compression/Knowledge Distillation/Pruning

34.NAS

33.Optical Flow Estimation

32.Scene Flow Estimation

OptFlow: Fast Optimization-Based Scene Flow Estimation Without Supervision

31.Automated Driving

Lane detection
- CLRerNet: Improving Confidence of Lane Detection With LaneIoU
Autonomous driving
Driver impairment assessment
- Estimating Blood Alcohol Level Through Facial Features for Driver Impairment Assessment
Traffic sign detection
- Natural Light Can Also Be Dangerous: Traffic Sign Misinterpretation Under Adversarial Natural Light Attacks
Obstacle detection
- Have We Ever Encountered This Before? Retrieving Out-of-Distribution Road Obstacles From Driving Scenes
Driver action and intention recognition
- Evaluation of Video Masked Autoencoders' Performance and Uncertainty Estimations for Driver Action and Intention Recognition

30.GNN/GCN

29.Scene Graph Generation

28.Point-Cloud

27.Human-Object Interactions

26.Human Motion Prediction

25.Multimodal

24.Large Language Models

Zero-Shot Building Attribute Extraction From Large-Scale Vision and Language Models

23.Vision-Language

22.Visual Question Answering

21.SLAM/Augmented Reality/Virtual Reality/Robotics

20.GAN/Generation

19.Object Pose Estimation

18.Animal

Canine pose analysis
- RGBT-Dog: A Parametric Model and Pose Prior for Canine Body Analysis Data Creation
Animal re-identification
- WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification

17.Human Pose Estimation

16.Action Detection

15.Video

14.OCR (text detection and recognition)

DTrOCR: Decoder-only Transformer for Optical Character Recognition
On Manipulating Scene Text in the Wild with Diffusion Models
DECDM: Document Enhancement using Cycle-Consistent Diffusion Models
Text detection
Text Spotting
- Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
- Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis
Scene-Text Spotting
- STEP - Towards Structured Scene-Text Spotting
Document Dewarping
- DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction
Scene text understanding
- Textual Alchemy: CoFormer for Scene Text Understanding
Document layout segmentation
- A One-Shot Learning Approach To Document Layout Segmentation of Ancient Arabic Manuscripts
Typography generation
- Towards Diverse and Consistent Typography Generation
Information extraction
- Graph Neural Networks for End-to-End Information Extraction From Handwritten Documents

13.ReID (person re-identification / gait recognition / pedestrian detection)

12.UAV/Remote Sensing/Satellite Image

11.Object Tracking

10.Object Detection

9.Image Segmentation

8.Face

7.3D (3D reconstruction / 3D vision)

6.Medical Image

5.Image/Video Compression

4.Image/Video Caption

3.Image/Video Retrieval

Image retrieval
Recipe retrieval
- Fine-Grained Alignment for Cross-Modal Recipe Retrieval
3D shape retrieval
- Domain Adaptive 3D Shape Retrieval From Monocular Images
Text-to-outfit retrieval (fashion recommendation)
- Lost Your Style? Navigating with Semantic-Level Approach for Text-to-Outfit Retrieval
Text-to-shape retrieval
- TriCoLo: Trimodal Contrastive Loss for Text To Shape Retrieval

2.Super-Resolution

1.Other

Categorized paper collections for 2020:

↘️CVPR-2020-Papers ↘️ECCV-2020-Papers

Categorized paper collections for 2021:

↘️ICCV-2021-Papers ↘️CVPR-2021-Papers

Categorized paper collections for 2022:

↘️CVPR-2022-Papers ↘️WACV-2022-Papers ↘️ECCV-2022-Papers

📢📢📢 Award-winning papers

🏆 Best Paper Award (Algorithms)

🏆 Best Paper Award (Applications)

🏆 Best Student Paper

🏆 Best Paper Honorable Mention
