<img src='https://s3.amazonaws.com/drivendata-public-assets/logo-white-blue.png' height='70'> <br> <br> <img alt="A side-by-side comparison of two videos showing a frame from a video on the left and the same frame manipulated with emojis on the right." src="https://drivendata-public-assets.s3.amazonaws.com/meta-vsc-hero.png" style=" object-fit: scale-down; max-height: 150px; width: 100%; "> <sub>Credit: BrentOzar</sub>
Meta AI Video Similarity Challenge
Goal of the Competition
Competitors built models to help detect whether a given query video is derived from any of the videos in a large reference set.
The ability to identify and track content on social media platforms, called content tracing, is crucial to the experience of users on these platforms. Previously, Meta AI and DrivenData hosted the Image Similarity Challenge, in which participants developed state-of-the-art models capable of accurately detecting when an image was derived from a known image. The motivation for detecting copies and manipulations of videos is similar: enforcing copyright protections, identifying misinformation, and removing violent or objectionable content.
This competition allowed users to test their skills in building a key part of that content tracing system, and in so doing contribute to making social media more trustworthy and safe for the people who use it.
There were two tracks to this challenge:
- For the Descriptor Track, the goal was to generate useful vector representations of videos for this video similarity task. Competitors generated descriptors for both query and reference videos, and a standardized similarity search based on pairwise inner products then produced ranked video match predictions.
- For the Matching Track, the goal was to create a model that directly detects clips of a query video that correspond to clips in one or more videos in a large corpus of reference videos.
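As a rough illustration of the Descriptor Track search step, here is a minimal sketch of ranking video pairs by pairwise inner product. It is not the challenge's evaluation code: the function name `rank_matches`, the `top_k` parameter, the toy descriptors, and the choice to keep the highest-scoring descriptor pair per video pair are all assumptions made for this example.

```python
import numpy as np


def rank_matches(query_desc, query_ids, ref_desc, ref_ids, top_k=2):
    """Rank (query video, reference video) pairs by descriptor inner product.

    query_desc: (num_query_descriptors, dim) float array
    query_ids:  video ID for each row of query_desc
    ref_desc:   (num_ref_descriptors, dim) float array
    ref_ids:    video ID for each row of ref_desc
    """
    # Inner product between every query descriptor and every reference descriptor.
    sims = query_desc @ ref_desc.T

    # Assumption: a video pair is scored by its best-matching descriptor pair.
    best = {}
    for qi, q_id in enumerate(query_ids):
        # Indices of the top_k highest-scoring reference descriptors for this row.
        for ri in np.argsort(-sims[qi])[:top_k]:
            pair = (q_id, ref_ids[ri])
            score = float(sims[qi, ri])
            if score > best.get(pair, float("-inf")):
                best[pair] = score

    # Highest-confidence predictions first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy 2-dimensional descriptors (hypothetical values).
query_desc = np.array([[1.0, 0.0], [0.0, 1.0]])
query_ids = ["Q1", "Q2"]
ref_desc = np.array([[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]])
ref_ids = ["R1", "R2", "R3"]

ranked = rank_matches(query_desc, query_ids, ref_desc, ref_ids)
for (q_id, r_id), score in ranked:
    print(q_id, r_id, round(score, 3))
```

Because the score is a plain inner product, descriptor norms affect the ranking; whether to normalize descriptors beforehand is a design choice left open in this sketch.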
Winning Submissions
See below for links to winning submissions' arXiv papers and code.
Descriptor Track
Place | Team or User | Code | Paper | Score | Summary of Model |
---|---|---|---|---|---|
1 | do something | GitHub repository | A Dual-level Detection Method for Video Copy Detection | 0.8717 | Uses a model derived from the provided baseline with an edit detection model and a video decomposition model to separate stacked videos. |
2 | FriendshipFirst | GitHub repository | Feature-compatible Progressive Learning for Video Copy Detection | 0.8514 | Utilizes feature-compatible progressive learning, with a model ensemble that generates comparable (compatible) similarity feature vectors. |
3 | cvl-descriptor | GitHub repository | 3rd Place Solution to Meta AI Video Similarity Challenge | 0.8362 | Leverages a previous winning Image Similarity Challenge model with test-time augmentation and edit prediction models to generate descriptors. |
Matching Track
Place | Team or User | Code | Paper | Score | Summary of Model |
---|---|---|---|---|---|
1 | do something more | GitHub repository | A Similarity Alignment Model for Video Copy Segment Matching | 0.9153 | Uses an align-refine pipeline for aligning video copy segments. |
2 | CompetitionSecond | GitHub repository | Feature-compatible Progressive Learning for Video Copy Detection | 0.7711 | Builds on the feature-compatible progressive learning approach and uses a temporal network to localize copied segments. |
3 | cvl-matching | GitHub repository | 3rd Place Solution to Meta AI Video Similarity Challenge | 0.7036 | Uses the Descriptor Track model with a temporal network to localize copied segments. |