HOI-Learning-List

Some recent (2015-now) Human-Object Interaction Learning studies. If you find any errors or problems, please don't hesitate to let me know.

A list of Transformer-based vision works: https://github.com/DirtyHarryLYL/Transformer-in-Vision.

Image Dataset/Benchmark

Video HOI Datasets

3D HOI Datasets

Survey

Method

HOI Image Generation

HOI Recognition: Image-based, to recognize all the HOIs in one image.

Unseen or zero-shot learning (image-level recognition).

HOI for Robotics.

HOI Detection: Instance-based, to detect the human-object pairs and classify the interactions.

Unseen or zero/low-shot or weakly-supervised learning (instance-level detection).

Video HOI methods

3D HOI Reconstruction/Generation/Understanding

Result

PaStaNet-HOI:

Proposed by TIN (TPAMI version, Transferable Interactiveness Network). It is built on HAKE data and includes 110K+ images and 520 HOIs (the 80 "no_interaction" HOIs of HICO-DET are excluded to avoid incomplete labeling). Its data distribution is more severely long-tailed, which makes it more difficult.
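
The category count above follows from dropping one "no_interaction" category per object class: HICO-DET has 600 HOI categories over 80 object classes, so 600 - 80 = 520. The snippet below is a minimal sketch of that filtering step on a toy vocabulary. The object and verb names and the helper functions are illustrative assumptions, not the actual PaStaNet-HOI construction code.

```python
# Hypothetical toy vocabulary. The real HICO-DET label set has 80 object
# classes and 600 HOI categories, 80 of which are "no_interaction".
verbs_per_object = {
    "bicycle": ["ride", "hold", "no_interaction"],
    "horse": ["ride", "feed", "no_interaction"],
    "cup": ["hold", "drink_with", "no_interaction"],
}


def build_hoi_categories(verbs_per_object):
    """Enumerate (verb, object) HOI categories from the vocabulary."""
    return [(verb, obj) for obj, verbs in verbs_per_object.items() for verb in verbs]


def drop_no_interaction(categories):
    """Remove the per-object "no_interaction" categories."""
    return [(verb, obj) for verb, obj in categories if verb != "no_interaction"]


all_hois = build_hoi_categories(verbs_per_object)
kept_hois = drop_no_interaction(all_hois)

# One category is removed per object class, mirroring 600 - 80 = 520.
print(len(all_hois), len(kept_hois))
```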

Detector: COCO pre-trained

| Method | mAP |
| --- | --- |
| iCAN | 11.00 |
| iCAN+NIS | 13.13 |
| TIN | 15.38 |

HICO-DET:

1) Detector: COCO pre-trained

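| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) | Full(ko) | Rare(ko) | Non-Rare(ko) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Shen et al. | WACV2018 | 6.46 | 4.24 | 7.12 | - | - | - |
| HO-RCNN | WACV2018 | 7.81 | 5.37 | 8.54 | 10.41 | 8.94 | 10.85 |
| InteractNet | CVPR2018 | 9.94 | 7.16 | 10.77 | - | - | - |
| Turbo | AAAI2019 | 11.40 | 7.30 | 12.60 | - | - | - |
| GPNN | ECCV2018 | 13.11 | 9.34 | 14.23 | - | - | - |
| Xu et al. | ICCV2019 | 14.70 | 13.26 | 15.13 | - | - | - |
| iCAN | BMVC2018 | 14.84 | 10.45 | 16.15 | 16.26 | 11.33 | 17.73 |
| Wang et al. | ICCV2019 | 16.24 | 11.16 | 17.75 | 17.73 | 12.78 | 19.21 |
| Lin et al. | IJCAI2020 | 16.63 | 11.30 | 18.22 | 19.22 | 14.56 | 20.61 |
| Functional (suppl) | AAAI2020 | 16.96 | 11.73 | 18.52 | - | - | - |
| Interactiveness | CVPR2019 | 17.03 | 13.42 | 18.11 | 19.17 | 15.51 | 20.26 |
| No-Frills | ICCV2019 | 17.18 | 12.17 | 18.68 | - | - | - |
| RPNN | ICCV2019 | 17.35 | 12.78 | 18.71 | - | - | - |
| PMFNet | ICCV2019 | 17.46 | 15.65 | 18.00 | 20.34 | 17.47 | 21.20 |
| SIGN | ICME2020 | 17.51 | 15.31 | 18.53 | 20.49 | 17.53 | 21.51 |
| Interactiveness-optimized | CVPR2019 | 17.54 | 13.80 | 18.65 | 19.75 | 15.70 | 20.96 |
| Liu et al. | arXiv | 17.55 | 20.61 | - | - | - | - |
| Wang et al. | ECCV2020 | 17.57 | 16.85 | 17.78 | 21.00 | 20.74 | 21.08 |
| UnionDet | arXiv2023 | 17.58 | 11.72 | 19.33 | 19.76 | 14.68 | 21.27 |
| In-GraphNet | IJCAI-PRICAI 2020 | 17.72 | 12.93 | 19.31 | - | - | - |
| HOID | CVPR2020 | 17.85 | 12.85 | 19.34 | - | - | - |
| MLCNet | ICMR2020 | 17.95 | 16.62 | 18.35 | 22.28 | 20.73 | 22.74 |
| SAG | arXiv | 18.26 | 13.40 | 19.71 | - | - | - |
| Sarullo et al. | arXiv | 18.74 | - | - | - | - | - |
| DRG | ECCV2020 | 19.26 | 17.74 | 19.71 | 23.40 | 21.75 | 23.89 |
| Analogy | ICCV2019 | 19.40 | 14.60 | 20.90 | - | - | - |
| VCL | ECCV2020 | 19.43 | 16.55 | 20.29 | 22.00 | 19.09 | 22.87 |
| VS-GATs | arXiv | 19.66 | 15.79 | 20.81 | - | - | - |
| VSGNet | CVPR2020 | 19.80 | 16.05 | 20.91 | - | - | - |
| PFNet | CVM | 20.05 | 16.66 | 21.07 | 24.01 | 21.09 | 24.89 |
| ATL(w/ COCO) | CVPR2021 | 20.08 | 15.57 | 21.43 | - | - | - |
| FCMNet | ECCV2020 | 20.41 | 17.34 | 21.56 | 22.04 | 18.97 | 23.12 |
| ACP | ECCV2020 | 20.59 | 15.92 | 21.98 | - | - | - |
| PD-Net | ECCV2020 | 20.81 | 15.90 | 22.28 | 24.78 | 18.88 | 26.54 |
| SG2HOI | ICCV2021 | 20.93 | 18.24 | 21.78 | 24.83 | 20.52 | 25.32 |
| TIN-PAMI | TPAMI2021 | 20.93 | 18.95 | 21.32 | 23.02 | 20.96 | 23.42 |
| ATL | CVPR2021 | 21.07 | 16.79 | 22.35 | - | - | - |
| PMN | arXiv | 21.21 | 17.60 | 22.29 | - | - | - |
| IPGN | TIP2021 | 21.26 | 18.47 | 22.07 | - | - | - |
| DJ-RN | CVPR2020 | 21.34 | 18.53 | 22.18 | 23.69 | 20.64 | 24.60 |
| OSGNet | IEEE Access | 21.40 | 18.12 | 22.38 | - | - | - |
| K-BAN | arXiv2022 | 21.48 | 16.85 | 22.86 | 24.29 | 19.09 | 25.85 |
| SCG+ODM | ECCV2022 | 21.50 | 17.59 | 22.67 | - | - | - |
| DIRV | AAAI2021 | 21.78 | 16.38 | 23.39 | 25.52 | 20.84 | 26.92 |
| SCG | ICCV2021 | 21.85 | 18.11 | 22.97 | - | - | - |
| HRNet | TIP2021 | 21.93 | 16.30 | 23.62 | 25.22 | 18.75 | 27.15 |
| ConsNet | ACMMM2020 | 22.15 | 17.55 | 23.52 | 26.57 | 20.8 | 28.3 |
| SKGHOI | arXiv2023 | 22.61 | 15.87 | 24.62 | - | - | - |
| IDN | NeurIPS2020 | 23.36 | 22.47 | 23.63 | 26.43 | 25.01 | 26.85 |
| QAHOI-Res50 | arXiv2021 | 24.35 | 16.18 | 26.80 | - | - | - |
| DOQ | CVPR2022 | 25.97 | 26.09 | 25.93 | - | - | - |
| STIP | CVPR2022 | 28.81 | 27.55 | 29.18 | 32.28 | 31.07 | 32.64 |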

2) Detector: pre-trained on COCO, fine-tuned on HICO-DET train set (with GT human-object pair boxes) or one-stage detector (point-based, transformer-based)

The fine-tuned detector learns to detect only the interactive humans and objects (those with interactiveness), which suppresses many wrong pairings (non-interactive human-object pairs) and boosts performance.
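
To illustrate why suppressing non-interactive pairs helps, the sketch below pairs every detected human with every other detection and then keeps only pairs whose interactiveness score passes a threshold. It is a toy illustration under stated assumptions: the `Detection` type, the score table, and the 0.5 threshold are hypothetical, and a fine-tuned detector achieves a similar effect implicitly rather than through an explicit filter like this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    name: str   # hypothetical instance id, e.g. "person_0"
    label: str  # detector class label


def exhaustive_pairs(detections):
    """Pair every detected human with every other detection."""
    humans = [d for d in detections if d.label == "person"]
    return [(h, o) for h in humans for o in detections if o is not h]


def keep_interactive(pairs, interactiveness, threshold=0.5):
    """Drop pairs whose (hypothetical) interactiveness score is below threshold."""
    return [(h, o) for h, o in pairs if interactiveness(h, o) >= threshold]


detections = [
    Detection("person_0", "person"),
    Detection("bike_0", "bicycle"),
    Detection("cup_0", "cup"),
]

# Made-up scores standing in for a learned interactiveness predictor.
toy_scores = {("person_0", "bike_0"): 0.9, ("person_0", "cup_0"): 0.1}


def toy_interactiveness(human, obj):
    return toy_scores.get((human.name, obj.name), 0.0)


pairs = exhaustive_pairs(detections)
kept = keep_interactive(pairs, toy_interactiveness)
print(len(pairs), len(kept))
```

With fewer non-interactive candidates reaching the HOI classifier, there are fewer chances for false positives, which is the effect the paragraph above describes.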

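| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) | Full(ko) | Rare(ko) | Non-Rare(ko) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UniDet | ECCV2020 | 17.58 | 11.72 | 19.33 | 19.76 | 14.68 | 21.27 |
| IP-Net | CVPR2020 | 19.56 | 12.79 | 21.58 | 22.05 | 15.77 | 23.92 |
| RR-Net | arXiv | 20.72 | 13.21 | 22.97 | - | - | - |
| PPDM (paper) | CVPR2020 | 21.10 | 14.46 | 23.09 | - | - | - |
| PPDM (github-hourglass104) | CVPR2020 | 21.73/21.94 | 13.78/13.97 | 24.10/24.32 | 24.58/24.81 | 16.65/17.09 | 26.84/27.12 |
| Functional | AAAI2020 | 21.96 | 16.43 | 23.62 | - | - | - |
| SABRA-Res50 | arXiv | 23.48 | 16.39 | 25.59 | 28.79 | 22.75 | 30.54 |
| VCL | ECCV2020 | 23.63 | 17.21 | 25.55 | 25.98 | 19.12 | 28.03 |
| ATL | CVPR2021 | 23.67 | 17.64 | 25.47 | 26.01 | 19.60 | 27.93 |
| PST | ICCV2021 | 23.93 | 14.98 | 26.60 | 26.42 | 17.61 | 29.05 |
| SABRA-Res50FPN | arXiv | 24.12 | 15.91 | 26.57 | 29.65 | 22.92 | 31.65 |
| ATL(w/ COCO) | CVPR2021 | 24.50 | 18.53 | 26.28 | 27.23 | 21.27 | 29.00 |
| IDN | NeurIPS2020 | 24.58 | 20.33 | 25.86 | 27.89 | 23.64 | 29.16 |
| FCL | CVPR2021 | 24.68 | 20.03 | 26.07 | 26.80 | 21.61 | 28.35 |
| HOTR | CVPR2021 | 25.10 | 17.34 | 27.42 | - | - | - |
| FCL+VCL | CVPR2021 | 25.27 | 20.57 | 26.67 | 27.71 | 22.34 | 28.93 |
| OC-Immunity | AAAI2022 | 25.44 | 23.03 | 26.16 | 27.24 | 24.32 | 28.11 |
| ConsNet-F | ACMMM2020 | 25.94 | 19.35 | 27.91 | 30.34 | 23.4 | 32.41 |
| SABRA-Res152 | arXiv | 26.09 | 16.29 | 29.02 | 31.08 | 23.44 | 33.37 |
| QAHOI-Res50 | arXiv2021 | 26.18 | 18.06 | 28.61 | - | - | - |
| Zou et al. | CVPR2021 | 26.61 | 19.15 | 28.84 | 29.13 | 20.98 | 31.57 |
| SKGHOI | arXiv2023 | 26.95 | 21.28 | 28.56 | - | - | - |
| RGBM | arXiv2022 | 27.39 | 21.34 | 29.20 | 30.87 | 24.20 | 32.87 |
| GTNet | arXiv | 28.03 | 22.73 | 29.61 | 29.98 | 24.13 | 31.73 |
| K-BAN | arXiv2022 | 28.83 | 20.29 | 31.31 | 31.05 | 21.41 | 33.93 |
| AS-Net | CVPR2021 | 28.87 | 24.25 | 30.25 | 31.74 | 27.07 | 33.14 |
| QPIC-Res50 | CVPR2021 | 29.07 | 21.85 | 31.23 | 31.68 | 24.14 | 33.93 |
| GGNet | CVPR2021 | 29.17 | 22.13 | 30.84 | 33.50 | 26.67 | 34.89 |
| QPIC-CPC | CVPR2022 | 29.63 | 23.14 | 31.57 | - | - | - |
| QPIC-Res101 | CVPR2021 | 29.90 | 23.92 | 31.69 | 32.38 | 26.06 | 34.27 |
| SCG | ICCV2021 | 29.26 | 24.61 | 30.65 | 32.87 | 27.89 | 34.35 |
| MHOI | TCSVT2022 | 29.67 | 24.37 | 31.25 | 31.87 | 27.28 | 33.24 |
| PhraseHOI | AAAI2022 | 30.03 | 23.48 | 31.99 | 33.74 | 27.35 | 35.64 |
| CDT | TNNLS 2023 | 30.48 | 25.48 | 32.37 | - | - | - |
| SQAB | Displays2023 | 30.82 | 24.92 | 32.58 | 33.58 | 27.19 | 35.49 |
| MSTR | CVPR2022 | 31.17 | 25.31 | 32.92 | 34.02 | 28.83 | 35.57 |
| SSRT | CVPR2022 | 31.34 | 24.31 | 33.32 | - | - | - |
| OCN | AAAI2022 | 31.43 | 25.80 | 33.11 | - | - | - |
| SCG+ODM | ECCV2022 | 31.65 | 24.95 | 33.65 | - | - | - |
| DT | CVPR2022 | 31.75 | 27.45 | 33.03 | 34.50 | 30.13 | 35.81 |
| ParSe (COCO) | NeurIPS2022 | 31.79 | 26.36 | 33.41 | - | - | - |
| CATN (w/ Bert) | CVPR2022 | 31.86 | 25.15 | 33.84 | 34.44 | 27.69 | 36.45 |
| SQA | ICASSP2023 | 31.99 | 29.88 | 32.62 | 35.12 | 32.74 | 35.84 |
| CDN | NeurIPS2021 | 32.07 | 27.19 | 33.53 | 34.79 | 29.48 | 36.38 |
| STIP | CVPR2022 | 32.22 | 28.15 | 33.43 | 35.29 | 31.43 | 36.45 |
| DEFR | arXiv2021 | 32.35 | 33.45 | 32.02 | - | - | - |
| PQNet-L | MMAsia2022 | 32.45 | 27.80 | 33.84 | 35.28 | 30.72 | 36.64 |
| CDN-s+HQM | ECCV2022 | 32.47 | 28.15 | 33.76 | - | - | - |
| UPT | CVPR2022 | 32.62 | 28.62 | 33.81 | 36.08 | 31.41 | 37.47 |
| OpenCat | CVPR2023 | 32.68 | 28.42 | 33.75 | - | - | - |
| Iwin | ECCV2022 | 32.79 | 27.84 | 35.40 | 35.84 | 28.74 | 36.09 |
| RLIP-ParSe (VG+COCO) | NeurIPS2022 | 32.84 | 26.85 | 34.63 | - | - | - |
| PR-Net | arXiv2023 | 32.86 | 28.03 | 34.30 | - | - | - |
| MUREN | CVPR2023 | 32.87 | 28.67 | 34.12 | 35.52 | 30.88 | 36.91 |
| SDT | arXiv2022 | 32.97 | 28.49 | 34.31 | 36.32 | 31.90 | 37.64 |
| HODN | TMM2023 | 33.14 | 28.54 | 34.52 | 35.86 | 31.18 | 37.26 |
| SG2HOI | arXiv2023 | 33.14 | 29.27 | 35.72 | 35.73 | 32.01 | 36.43 |
| PDN | PR2023 | 33.18 | 27.95 | 34.75 | 35.86 | 30.57 | 37.43 |
| DOQ | CVPR2022 | 33.28 | 29.19 | 34.50 | - | - | - |
| IF | CVPR2022 | 33.51 | 30.30 | 34.46 | 36.28 | 33.16 | 37.21 |
| ICDT | ICANN2023 | 34.01 | 27.60 | 35.92 | 36.29 | 29.88 | 38.21 |
| PSN | arXiv2023 | 34.02 | 29.44 | 35.39 | - | - | - |
| KI2HOI | arXiv2024 | 34.20 | 32.26 | 36.10 | 37.85 | 35.89 | 38.78 |
| VIL+ | ACMMM2023 | 34.21 | 30.58 | 35.30 | 37.67 | 34.88 | 38.50 |
| Multi-Step | ACMMM2023 | 34.42 | 30.03 | 35.73 | 37.71 | 33.74 | 38.89 |
| OBPA-Net | PRCV2023 | 34.63 | 32.83 | 35.16 | 36.78 | 35.38 | 38.04 |
| MLKD | WACV2024 | 34.69 | 31.12 | 35.74 | - | - | - |
| HOICLIP | CVPR2023 | 34.69 | 31.12 | 35.74 | 37.61 | 34.47 | 38.54 |
| PViC w/ detr | ICCV2023 | 34.69 | 32.14 | 35.45 | 38.14 | 35.38 | 38.97 |
| GEN-VLKT+SCA | arXiv2023 | 34.79 | 31.80 | 35.68 | - | - | - |
| HOIGen | ACMMM2024 | 34.84 | 34.52 | 34.94 | - | - | - |
| SBM | PRCV2023 | 34.92 | 31.67 | 35.85 | 38.79 | 35.43 | 39.60 |
| (w/ CLIP) | CVPR2022 | 34.95 | 31.18 | 36.08 | 38.22 | 34.36 | 39.37 |
| SOV-STG (res101) | arXiv2023 | 35.01 | 30.63 | 36.32 | 37.60 | 32.77 | 39.05 |
| GeoHOI | arXiv2024 | 35.05 | 33.01 | 35.71 | 37.12 | 34.79 | 37.97 |
| PartMap | ECCV2022 | 35.15 | 33.71 | 35.58 | 37.56 | 35.87 | 38.06 |
| GFIN | NN2023 | 35.28 | 31.91 | 36.29 | 38.80 | 35.48 | 39.79 |
| CLIP4HOI | NeurIPS2023 | 35.33 | 33.95 | 35.74 | 37.19 | 35.27 | 37.77 |
| LOGICHOI | NeurIPS2023 | 35.47 | 32.03 | 36.22 | 38.21 | 35.29 | 39.03 |
| QAHOI-Swin-Large-ImageNet-22K | arXiv2021 | 35.78 | 29.80 | 37.56 | 37.59 | 31.66 | 39.36 |
| DPADN | AAAI2024 | 35.91 | 35.82 | 35.94 | 38.99 | 39.61 | 38.80 |
| -L + CQL | CVPR2023 | 36.03 | 33.16 | 36.89 | 38.82 | 35.51 | 39.81 |
| HOICLIP+DP-HOI | CVPR2024 | 36.56 | 34.36 | 37.22 | - | - | - |
| AGER | ICCV2023 | 36.75 | 33.53 | 37.71 | 39.84 | 35.58 | 40.23 |
| FGAHOI | arXiv2023 | 37.18 | 30.71 | 39.11 | 38.93 | 31.93 | 41.02 |
| ViPLO | CVPR2023 | 37.22 | 35.45 | 37.75 | 40.61 | 38.82 | 41.15 |
| RmLR | ICCV2023 | 37.41 | 28.81 | 39.97 | 38.69 | 31.27 | 40.91 |
| HCVC | arXiv2023 | 37.54 | 37.01 | 37.78 | 39.98 | 39.01 | 40.32 |
| ADA-CM | ICCV2023 | 38.40 | 37.52 | 38.66 | - | - | - |
| UniVRD w/ extra data+VLM | arXiv2023 | 38.61 | 33.39 | 40.16 | - | - | - |
| SCTC | AAAI2024 | 39.12 | 36.09 | 39.87 | - | - | - |
| BCOM | CVPR2024 | 39.34 | 39.90 | 39.17 | 42.24 | 42.86 | 42.05 |
| UniHOI | NeurIPS2023 | 40.95 | 40.27 | 41.32 | 43.26 | 43.12 | 43.25 |
| DiffHOI w/ syn data | arXiv2023 | 41.50 | 39.96 | 41.96 | 43.62 | 41.41 | 44.28 |
| DiffusionHOI | NeurIPS2024 | 42.54 | 42.95 | 42.35 | 44.91 | 45.18 | 44.83 |
| SOV-STG (swin-l) | arXiv2023 | 43.35 | 42.25 | 43.69 | 45.53 | 43.62 | 46.11 |
| PViC w/ h-detr (swin-l) | ICCV2023 | 44.32 | 44.61 | 44.24 | 47.81 | 48.38 | 47.64 |
| MP-HOI | CVPR2024 | 44.53 | 44.48 | 44.55 | - | - | - |
| SICHOI | CVPR2024 | 45.04 | 45.61 | 44.88 | 48.16 | 48.37 | 48.09 |
| RLIPv2-ParSeDA w/ extra data | ICCV2023 | 45.09 | 43.23 | 45.64 | - | - | - |
| CycleHOI | arXiv2024 | 45.71 | 46.14 | 45.52 | 49.23 | 49.87 | 48.96 |
| Pose-Aware | CVPR2024 | 46.01 | 46.74 | 45.80 | 49.50 | 50.59 | 49.18 |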

3) Ground Truth human-object pair boxes (only evaluating HOI recognition)

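| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) |
| --- | --- | --- | --- | --- |
| iCAN | BMVC2018 | 33.38 | 21.43 | 36.95 |
| Interactiveness | CVPR2019 | 34.26 | 22.90 | 37.65 |
| Analogy | ICCV2019 | 34.35 | 27.57 | 36.38 |
| ATL | CVPR2021 | 43.32 | 33.84 | 46.15 |
| IDN | NeurIPS2020 | 43.98 | 40.27 | 45.09 |
| ATL(w/ COCO) | CVPR2021 | 44.27 | 35.52 | 46.89 |
| FCL | CVPR2021 | 45.25 | 36.27 | 47.94 |
| GTNet | arXiv | 46.45 | 35.10 | 49.84 |
| SCG | ICCV2021 | 51.53 | 41.01 | 54.67 |
| K-BAN | arXiv2022 | 52.99 | 34.91 | 58.40 |
| ConsNet | ACMMM2020 | 53.04 | 38.79 | 57.3 |
| ViPLO | CVPR2023 | 62.09 | 59.26 | 62.93 |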

4) Interactiveness detection (interactive or not + pair box detection):

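| Method | Pub | HICO-DET | V-COCO |
| --- | --- | --- | --- |
| TIN++ | TPAMI2022 | 14.35 | 29.36 |
| PPDM | CVPR2020 | 27.34 | - |
| QPIC | CVPR2021 | 32.96 | 38.33 |
| CDN | NeurIPS2021 | 33.55 | 40.13 |
| PartMap | ECCV2022 | 38.74 | 43.61 |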

5) Enhanced with HAKE:

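| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) | Full(ko) | Rare(ko) | Non-Rare(ko) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| iCAN | BMVC2018 | 14.84 | 10.45 | 16.15 | 16.26 | 11.33 | 17.73 |
| iCAN + HAKE-HICO-DET | CVPR2020 | 19.61 (+4.77) | 17.29 | 20.30 | 22.10 | 20.46 | 22.59 |
| Interactiveness | CVPR2019 | 17.03 | 13.42 | 18.11 | 19.17 | 15.51 | 20.26 |
| Interactiveness + HAKE-HICO-DET | CVPR2020 | 22.12 (+5.09) | 20.19 | 22.69 | 24.06 | 22.19 | 24.62 |
| Interactiveness + HAKE-Large | CVPR2020 | 22.66 (+5.63) | 21.17 | 23.09 | 24.53 | 23.00 | 24.99 |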
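
The parenthesized gains in the table above are differences in Full(def) mAP between each HAKE-enhanced model and its baseline. A minimal sketch that recomputes them from the listed numbers (the variable and function names are illustrative, not from any released code):

```python
# Full(def) mAP values copied from the table above.
baseline_full = {"iCAN": 14.84, "Interactiveness": 17.03}

# (enhanced method, its baseline, Full(def) mAP)
enhanced_full = [
    ("iCAN + HAKE-HICO-DET", "iCAN", 19.61),
    ("Interactiveness + HAKE-HICO-DET", "Interactiveness", 22.12),
    ("Interactiveness + HAKE-Large", "Interactiveness", 22.66),
]


def gain(baseline_map, enhanced_map):
    """Absolute mAP improvement, rounded to two decimals."""
    return round(enhanced_map - baseline_map, 2)


gains = {name: gain(baseline_full[base], value) for name, base, value in enhanced_full}
for name, delta in gains.items():
    print(f"{name}: +{delta:.2f}")
```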

6) Zero-Shot HOI detection:

Unseen action-object combination scenario (UC)

| Method | Pub | Detector | Unseen(def) | Seen(def) | Full(def) |
| --- | --- | --- | --- | --- | --- |
| Shen et al. | WACV2018 | COCO | 5.62 | - | 6.26 |
| Functional | AAAI2020 | HICO-DET | 11.31 ± 1.03 | 12.74 ± 0.34 | 12.45 ± 0.16 |
| ConsNet | ACMMM2020 | COCO | 16.99 ± 1.67 | 20.51 ± 0.62 | 19.81 ± 0.32 |
| CDT | TNNLS 2023 | - | 18.06 | 23.34 | 20.72 |
| EoID | AAAI2023 | - | 23.01 ± 1.54 | 30.39 ± 0.40 | 28.91 ± 0.27 |
| HOICLIP | CVPR2023 | - | 25.53 | 34.85 | 32.99 |
| KI2HOI | arXiv2024 | - | 27.43 | 35.76 | 34.56 |
| CLIP4HOI | NeurIPS2023 | - | 27.71 | 33.25 | 32.11 |
| HOIGen | ACMMM2024 | - | 30.26 | 34.23 | 33.44 |
| VCL (NF-UC) | ECCV2020 | HICO-DET | 16.22 | 18.52 | 18.06 |
| ATL(w/ COCO) (NF-UC) | CVPR2021 | HICO-DET | 18.25 | 18.78 | 18.67 |
| FCL (NF-UC) | CVPR2021 | HICO-DET | 18.66 | 19.55 | 19.37 |
| RLIP-ParSe (NF-UC) | NeurIPS2022 | COCO, VG | 20.27 | 27.67 | 26.19 |
| SCL | arXiv | HICO-DET | 21.73 | 25.00 | 24.34 |
| OpenCat(NF-UC) | CVPR2023 | HICO-DET | 23.25 | 28.04 | 27.08 |
| GEN-VLKT* (NF-UC) | CVPR2022 | HICO-DET | 25.05 | 23.38 | 23.71 |
| EoID (NF-UC) | AAAI2023 | HICO-DET | 26.77 | 26.66 | 26.69 |
| HOICLIP (NF-UC) | CVPR2023 | HICO-DET | 26.39 | 28.10 | 27.75 |
| LOGICHOI (NF-UC) | NeurIPS2023 | - | 26.84 | 27.86 | 27.95 |
| Wu et al. (NF-UC) | AAAI2024 | - | 27.35 | 22.09 | 23.14 |
| UniHOI (NF-UC) | NeurIPS2023 | - | 28.45 | 32.63 | 31.79 |
| KI2HOI (NF-UC) | arXiv2024 | - | 28.89 | 28.31 | 27.77 |
| DiffHOI w/ syn data (NF-UC) | arXiv2023 | HICO-DET + syn data | 29.45 | 31.68 | 31.24 |
| HCVC (NF-UC) | arXiv2023 | - | 28.44 | 31.35 | 30.77 |
| CLIP4HOI (NF-UC) | NeurIPS2023 | - | 31.44 | 28.26 | 28.90 |
| HOIGen (NF-UC) | ACMMM2024 | - | 33.98 | 32.86 | 33.08 |
| VCL (RF-UC) | ECCV2020 | HICO-DET | 10.06 | 24.28 | 21.43 |
| ATL(w/ COCO) (RF-UC) | CVPR2021 | HICO-DET | 9.18 | 24.67 | 21.57 |
| FCL (RF-UC) | CVPR2021 | HICO-DET | 13.16 | 24.23 | 22.01 |
| SCL (RF-UC) | arXiv | HICO-DET | 19.07 | 30.39 | 28.08 |
| RLIP-ParSe (RF-UC) | NeurIPS2022 | COCO, VG | 19.19 | 33.35 | 30.52 |
| GEN-VLKT* (RF-UC) | CVPR2022 | HICO-DET | 21.36 | 32.91 | 30.56 |
| OpenCat(RF-UC) | CVPR2023 | HICO-DET | 21.46 | 33.86 | 31.38 |
| Wu et al. (RF-UC) | AAAI2024 | - | 23.32 | 30.09 | 28.53 |
| HOICLIP (RF-UC) | CVPR2023 | HICO-DET | 25.53 | 34.85 | 32.99 |
| LOGICHOI (RF-UC) | NeurIPS2023 | - | 25.97 | 34.93 | 33.17 |
| KI2HOI (RF-UC) | arXiv2024 | - | 26.33 | 35.79 | 34.10 |
| CLIP4HOI | NeurIPS2023 | - | 28.47 | 35.48 | 34.08 |
| UniHOI (RF-UC) | NeurIPS2023 | - | 28.68 | 33.16 | 32.27 |
| DiffHOI w/ syn data (RF-UC) | arXiv2023 | HICO-DET + syn data | 28.76 | 38.01 | 36.16 |
| HCVC (RF-UC) | arXiv2023 | - | 30.95 | 37.16 | 35.87 |
| HOIGen (RF-UC) | ACMMM2024 | - | 31.01 | 34.57 | 33.86 |
| RLIPv2-ParSeDA (RF-UC) | ICCV2023 | VG, COCO, O365 | 31.23 | 45.01 | 42.26 |

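
Zero-shot* HOI detection without fine-tuning (NF)

| Method | Pub | Backbone | Dataset | Detector | Full | Rare | Non-Rare |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RLIP-ParSeD | NeurIPS2022 | ResNet-50 | COCO + VG | DDETR | 13.92 | 11.20 | 14.73 |
| RLIP-ParSe | NeurIPS2022 | ResNet-50 | COCO + VG | DETR | 15.40 | 15.08 | 15.50 |
| RLIPv2-ParSeDA | ICCV2023 | Swin-L | VG+COCO+O365 | DDETR | 23.29 | 27.97 | 21.90 |
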
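
Unseen object scenario (UO)

| Method | Pub | Detector | Full(def) | Seen(def) | Unseen(def) |
| --- | --- | --- | --- | --- | --- |
| Functional | AAAI2020 | HICO-DET | 13.84 | 14.36 | 11.22 |
| FCL | CVPR2021 | HICO-DET | 19.87 | 20.74 | 15.54 |
| ConsNet | ACMMM2020 | COCO | 20.71 | 20.99 | 19.27 |
| Wu et al. | AAAI2024 | - | 27.73 | 27.87 | 27.05 |
| ATL | CVPR2021 | - | 15.11 | 21.54 | 20.47 |
| GEN-VLKT | CVPR2022 | - | 10.51 | 28.92 | 25.63 |
| LOGICHOI | NeurIPS2023 | - | 15.67 | 30.42 | 28.23 |
| HOICLIP | CVPR2023 | - | 16.20 | 30.99 | 28.53 |
| KI2HOI | arXiv2024 | - | 16.50 | 31.70 | 28.84 |
| HCVC | arXiv2023 | - | 16.78 | 33.31 | 30.53 |
| CLIP4HOI | NeurIPS2023 | - | 31.79 | 32.73 | 32.58 |
| HOIGen | ACMMM2024 | - | 36.35 | 32.90 | 33.48 |
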
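
Unseen action scenario (UA)

| Method | Pub | Detector | Full(def) | Seen(def) | Unseen(def) |
| --- | --- | --- | --- | --- | --- |
| ConsNet | ACMMM2020 | COCO | 19.04 | 20.02 | 14.12 |
| CDT | TNNLS 2023 | - | 19.68 | 21.45 | 15.17 |
| Wu et al. | AAAI2024 | - | 26.43 | 28.13 | 17.92 |
| EoID | AAAI2023 | - | 29.22 | 30.46 | 23.04 |
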
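
Unseen action scenario (UV), results from EoID

| Method | Pub | Detector | Unseen(def) | Seen(def) | Full(def) |
| --- | --- | --- | --- | --- | --- |
| HOIGen | ACMMM2024 | - | 20.27 | 34.31 | 32.34 |
| GEN-VLKT | CVPR2022 | - | 20.96 | 30.23 | 28.74 |
| EoID | AAAI2023 | - | 22.71 | 30.73 | 29.61 |
| HOICLIP | CVPR2023 | - | 24.30 | 32.19 | 31.09 |
| LOGICHOI | NeurIPS2023 | - | 24.57 | 31.88 | 30.77 |
| HCVC | arXiv2023 | - | 24.69 | 36.11 | 34.51 |
| KI2HOI | arXiv2024 | - | 25.20 | 32.95 | 31.85 |
| CLIP4HOI | NeurIPS2023 | - | 26.02 | 31.14 | 30.42 |
| UniHOI | NeurIPS2023 | - | 26.05 | 36.78 | 34.68 |
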
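
Another setting

| Method | Pub | Unseen | Seen | Full |
| --- | --- | --- | --- | --- |
| Shen et al. | WACV2018 | 5.62 | - | 6.26 |
| Functional | AAAI2020 | 10.93 | 12.60 | 12.26 |
| VCL | ECCV2020 | 10.06 | 24.28 | 21.43 |
| ATL | CVPR2021 | 9.18 | 24.67 | 21.57 |
| FCL | CVPR2021 | 13.16 | 24.23 | 22.01 |
| THID (w/ CLIP) | CVPR2022 | 15.53 | 24.32 | 22.96 |
| EoID | AAAI2023 | 22.04 | 31.39 | 29.52 |
| GEN-VLKT | CVPR2022 | 21.36 | 32.91 | 30.56 |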

7) Few-Shot HOI detection:

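1% of HICO-DET data used in fine-tuning

| Method | Pub | Backbone | Dataset | Detector | Data | Full | Rare | Non-Rare |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RLIP-ParSeD | NeurIPS2022 | ResNet-50 | COCO + VG | DDETR | 1% | 18.30 | 16.22 | 18.92 |
| RLIP-ParSe | NeurIPS2022 | ResNet-50 | COCO + VG | DETR | 1% | 18.46 | 17.47 | 18.76 |
| RLIPv2-ParSeDA | ICCV2023 | Swin-L | VG+COCO+O365 | DDETR | 1% | 32.22 | 31.89 | 32.32 |

10% of HICO-DET data used in fine-tuning

| Method | Pub | Backbone | Dataset | Detector | Data | Full | Rare | Non-Rare |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RLIP-ParSeD | NeurIPS2022 | ResNet-50 | COCO + VG | DDETR | 10% | 22.09 | 15.89 | 23.94 |
| RLIP-ParSe | NeurIPS2022 | ResNet-50 | COCO + VG | DETR | 10% | 22.59 | 20.16 | 23.32 |
| RLIPv2-ParSeDA | ICCV2023 | Swin-L | VG+COCO+O365 | DDETR | 10% | 37.46 | 34.75 | 38.27 |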

8) Weakly-supervised HOI detection:

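| Method | Pub | Backbone | Dataset | Detector | Full | Rare | Non-Rare |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Explanation-HOI | ECCV2020 | ResNeXt101 | COCO | FRCNN | 10.63 | 8.71 | 11.20 |
| MX-HOI | WACV2021 | ResNet-101 | COCO | FRCNN | 16.14 | 12.06 | 17.50 |
| PPR-FCN (from Weakly-HOI-CLIP) | ICCV2017 | ResNet-50, CLIP | COCO | FRCNN | 17.55 | 15.69 | 18.41 |
| Align-Former | BMVC2021 | ResNet-101 | - | - | 20.85 | 18.23 | 21.64 |
| Weakly-HOI-CLIP | ICLR2023 | ResNet-101, CLIP | COCO | FRCNN | 25.70 | 24.52 | 26.05 |
| OpenCat | CVPR 2023 | DETR | - | - | 25.82 | 24.35 | 26.19 |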

Ambiguous-HOI

Detector: COCO pre-trained

| Method | mAP |
| --- | --- |
| iCAN | 8.14 |
| Interactiveness | 8.22 |
| Analogy(reproduced) | 9.72 |
| DJ-RN | 10.37 |
| OC-Immunity | 10.45 |

SWiG-HOI

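| Method | Pub | Non-Rare | Unseen | Seen | Full |
| --- | --- | --- | --- | --- | --- |
| JSR | ECCV2020 | 10.01 | 6.10 | 2.34 | 6.08 |
| CHOID | ICCV2021 | 10.93 | 6.63 | 2.64 | 6.64 |
| QPIC | CVPR2021 | 16.95 | 10.84 | 6.21 | 11.12 |
| THID (w/ CLIP) | CVPR2022 | 17.67 | 12.82 | 10.04 | 13.26 |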

V-COCO: Scenario1

1) Detector: COCO pre-trained or one-stage detector

| Method | Pub | AP(role) |
| --- | --- | --- |
| Gupta et al. | arXiv | 31.8 |
| InteractNet | CVPR2018 | 40.0 |
| Turbo | AAAI2019 | 42.0 |
| GPNN | ECCV2018 | 44.0 |
| UniVRD w/ extra data+VLM | arXiv2023 | 45.19 |
| iCAN | BMVC2018 | 45.3 |
| Xu et al. | CVPR2019 | 45.9 |
| Wang et al. | ICCV2019 | 47.3 |
| UniDet | ECCV2020 | 47.5 |
| Interactiveness | CVPR2019 | 47.8 |
| Lin et al. | IJCAI2020 | 48.1 |
| VCL | ECCV2020 | 48.3 |
| Zhou et al. | CVPR2020 | 48.9 |
| In-GraphNet | IJCAI-PRICAI 2020 | 48.9 |
| Interactiveness-optimized | CVPR2019 | 49.0 |
| TIN-PAMI | TPAMI2021 | 49.1 |
| IP-Net | CVPR2020 | 51.0 |
| DRG | ECCV2020 | 51.0 |
| RGBM | arXiv2022 | 51.7 |
| VSGNet | CVPR2020 | 51.8 |
| PMN | arXiv | 51.8 |
| PMFNet | ICCV2019 | 52.0 |
| Liu et al. | arXiv | 52.28 |
| FCL | CVPR2021 | 52.35 |
| PD-Net | ECCV2020 | 52.6 |
| Wang et al. | ECCV2020 | 52.7 |
| PFNet | CVM | 52.8 |
| Zou et al. | CVPR2021 | 52.9 |
| SIGN | ICME2020 | 53.1 |
| ACP | ECCV2020 | 52.98 (53.23) |
| FCMNet | ECCV2020 | 53.1 |
| HRNet | TIP2021 | 53.1 |
| SGCN4HOI | IEEESMC2022 | 53.1 |
| ConsNet | ACMMM2020 | 53.2 |
| IDN | NeurIPS2020 | 53.3 |
| SG2HOI | ICCV2021 | 53.3 |
| OSGNet | IEEE Access | 53.43 |
| SABRA-Res50 | arXiv | 53.57 |
| K-BAN | arXiv2022 | 53.70 |
| IPGN | TIP2021 | 53.79 |
| AS-Net | CVPR2021 | 53.9 |
| RR-Net | arXiv | 54.2 |
| SCG | ICCV2021 | 54.2 |
| HOKEM | arXiv2023 | 54.6 |
| SABRA-Res50FPN | arXiv | 54.69 |
| GGNet | CVPR2021 | 54.7 |
| MLCNet | ICMR2020 | 55.2 |
| HOTR | CVPR2021 | 55.2 |
| DIRV | AAAI2021 | 56.1 |
| UnionDet | arXiv2023 | 56.2 |
| SABRA-Res152 | arXiv | 56.62 |
| PhraseHOI | AAAI2022 | 57.4 |
| GTNet | arXiv | 58.29 |
| QPIC-Res101 | CVPR2021 | 58.3 |
| ADA-CM | ICCV2023 | 58.57 |
| QPIC-Res50 | CVPR2021 | 58.8 |
| ICDT | ICANN2023 | 59.4 |
| CATN (w/ fastText) | CVPR2022 | 60.1 |
| FGAHOI | arXiv2023 | 60.5 |
| Iwin | ECCV2022 | 60.85 |
| UPT-ResNet-101-DC5 | CVPR2022 | 61.3 |
| CDT | TNNLS 2023 | 61.43 |
| SBM | PRCV2023 | 61.5 |
| SDT | arXiv2022 | 61.8 |
| OpenCat | CVPR2023 | 61.9 |
| MSTR | CVPR2022 | 62.0 |
| ViPLO | CVPR2023 | 62.2 |
| Multi-Step | ACMMM2023 | 62.4 |
| PViC w/ detr | ICCV2023 | 62.8 |
| PR-Net | arXiv2023 | 62.9 |
| IF | CVPR2022 | 63.0 |
| PartMap | ECCV2022 | 63.0 |
| QPIC-CPC | CVPR2022 | 63.1 |
| DOQ | CVPR2022 | 63.5 |
| HOICLIP | CVPR2023 | 63.5 |
| GEN-VLKT (w/ CLIP) | CVPR2022 | 63.58 |
| SG2HOI | arXiv2023 | 63.6 |
| QPIC+HQM | ECCV2022 | 63.6 |
| SOV-STG | arXiv2023 | 63.9 |
| KI2HOI | arXiv2024 | 63.9 |
| CDN | NeurIPS2021 | 63.91 |
| PViC w/ h-detr (swin-l) | ICCV2023 | 64.1 |
| OBPA-Net | PRCV2023 | 64.1 |
| RmLR | ICCV2023 | 64.17 |
| RLIP-ParSe (COCO+VG) | NeurIPS2022 | 64.2 |
| LOGICHOI | NeurIPS2023 | 64.4 |
| MHOI | TCSVT2022 | 64.5 |
| GEN-VLKT+SCA | arXiv2023 | 64.5 |
| PDN | PR2023 | 64.7 |
| ParSe (COCO) | NeurIPS2022 | 64.8 |
| SSRT | CVPR2022 | 65.0 |
| SQAB | Displays2023 | 65.0 |
| OCN | AAAI2022 | 65.3 |
| SQA | ICASSP2023 | 65.4 |
| AGER | ICCV2023 | 65.68 |
| DiffHOI | arXiv2023 | 65.7 |
| BCOM | CVPR2024 | 65.8 |
| PSN | arXiv2023 | 65.9 |
| DPADN | AAAI2024 | 62.62 |
| Pose-Aware | CVPR2024 | 63.0 |
| CO-HOI | arXiv2024 | 65.44 |
| STIP | CVPR2022 | 66.0 |
| DT | CVPR2022 | 66.2 |
| MP-HOI | CVPR2024 | 66.2 |
| CLIP4HOI | NeurIPS2023 | 66.3 |
| GENs+DP-HOI | CVPR2024 | 66.6 |
| GEN-VLKT-L + CQL | CVPR2023 | 66.8 |
| CycleHOI | arXiv2024 | 66.8 |
| HODN | TMM2023 | 67.0 |
| DiffusionHOI | NeurIPS2024 | 67.1 |
| VIL+DisTR | ACMMM2023 | 67.6 |
| UniHOI | NeurIPS2023 | 68.05 |
| SCTC | AAAI2024 | 68.2 |
| HCVC | arXiv2023 | 68.4 |
| MUREN | CVPR2023 | 68.8 |
| GeoHOI | arXiv2024 | 69.4 |
| GFIN | NN2023 | 70.1 |
| SICHOI | CVPR2024 | 71.1 |
| RLIPv2-ParSeDA w/ extra data | ICCV2023 | 72.1 |

2) Enhanced with HAKE:

| Method | Pub | AP(role) |
| --- | --- | --- |
| iCAN | BMVC2018 | 45.3 |
| iCAN + HAKE-Large (transfer learning) | CVPR2020 | 49.2 (+3.9) |
| Interactiveness | CVPR2019 | 47.8 |
| Interactiveness + HAKE-Large (transfer learning) | CVPR2020 | 51.0 (+3.2) |

3) Weakly-supervised HOI detection:

| Method | Pub | Backbone | Dataset | Detector | AP(role)-S1 | AP(role)-S2 |
| --- | --- | --- | --- | --- | --- | --- |
| Weakly-HOI-CLIP | ICLR2023 | ResNet-101, CLIP | COCO | FRCNN | 44.74 | 49.97 |

HOI-COCO:

Based on V-COCO.

| Method | Pub | Full | Seen | Unseen |
| --- | --- | --- | --- | --- |
| VCL | ECCV2020 | 23.53 | 8.29 | 35.36 |
| ATL(w/ COCO) | CVPR2021 | 23.40 | 8.01 | 35.34 |

HICO

1) Default

| Method | mAP |
| --- | --- |
| R*CNN | 28.5 |
| Girdhar et al. | 34.6 |
| Mallya et al. | 36.1 |
| RAM++ LLM | 37.6 |
| Pairwise | 39.9 |
| RelViT | 40.12 |
| DEFR-base | 44.1 |
| OpenTAP | 51.7 |
| DEFR-CLIP | 60.5 |
| HTS | 60.5 |
| DEFR/16 CLIP | 65.6 |

2) Enhanced with HAKE:

| Method | mAP |
| --- | --- |
| Mallya et al. | 36.1 |
| Mallya et al.+HAKE-HICO | 45.0 (+8.9) |
| Pairwise | 39.9 |
| Pairwise+HAKE-HICO | 45.9 (+6.0) |
| Pairwise+HAKE-Large | 46.3 (+6.4) |