E3M: Zero-Shot Spatio-Temporal Video Grounding

[ECCV 2024] Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation