Awesome Multi-Camera Networks
- Multi-camera Networks research notes. Target venues: systems conferences (OSDI/SOSP/ATC/EuroSys), networking (NSDI/SIGCOMM/SoCC), mobile computing (MobiCom/MobiSys/SenSys/UbiComp), data analytics (VLDB/SIGMOD), and computer vision/machine learning (ICCV/CVPR/ECCV/ICML/ICLR/NeurIPS).
- Unlike the books below, I collect papers from the system and AI perspectives, respectively. To avoid diving into the details of specific vision tasks (e.g., object detection), I only list low-resource learning, domain adaptation & continual learning, and dynamic deep neural networks under AI Algorithm, because these three topics generalize across vision tasks and help us deploy deep learning based vision applications. In the end, I list datasets and useful toolboxes.
Note: specific vision algorithms (tracking, object detection, segmentation and action recognition) are not collected in this note. If you want to learn or try them, you can refer to SenseTime-CUHK OpenMMLab, which provides a suite of toolboxes to help AI researchers/engineers implement vision algorithms. For example, you can try 50+ image-based object detection models using the same mmdetection API and 10+ video-based detection and tracking methods using the same mmtracking API.
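For a quick taste, here is a minimal sketch of MMDetection's (2.x) high-level inference API; the config/checkpoint paths are placeholders that should point to a pair downloaded from the MMDetection model zoo, not files shipped with this note:

```python
# Minimal MMDetection (2.x) inference sketch; paths below are placeholders.
from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'   # example model-zoo config
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco.pth'      # downloaded weights

model = init_detector(config_file, checkpoint_file, device='cuda:0')  # build model + load weights
result = inference_detector(model, 'demo.jpg')                        # per-class boxes for one image
```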
Outline
- Book and Survey - a starting point to understand basic concepts behind multi-camera networks
- Researchers, Workshops and Courses - follow them to get recent research trends in multi-camera networks
- Topics - group recent papers into different sub-topics (e.g., camera calibration)
- System
- Edge video analytics - speed up analysis pipeline
- Configuration search - search for the most suitable analytics configuration
- Database - distributed data processing
- Video streaming - video compression
- Resource management - allocate limited compute and bandwidth resources
- Prediction serving and model update - model exchange, prediction serving, model monitoring and model updates
- Multi-Camera Collaboration - improve performance and reduce deployment cost
- Privacy - data privacy, model privacy and computation privacy
- AI Algorithm
- Low-resource learning - efficient learning under limited data/annotations/computation/time
- In AI, low-resource learning is often called low-shot learning (few-/one-/zero-shot learning), which aims to retrain or train from scratch with only a few new samples. Inspired by style transfer, many image/speech synthesis tasks leverage Adaptive Instance Normalization (AdaIN) layers to calibrate the distribution of the input data (a minimal AdaIN sketch follows this outline). But in object detection there are not many existing works on low-resource learning; I only found two related papers (SpotTune, CVPR'19, 159 citations; Budget-Aware Adapters, ICCV'19, 10 citations), which are not tied to detection architectures and are applicable to all CNN models.
- Domain adaptation and continual learning - robustness and sustainability
- For continual learning, most AI works focus on how to learn unseen classes and how to memorize seen classes (avoiding catastrophic forgetting). Thus, it is also called incremental learning.
- For domain adaptation, AI researchers aim to improve the generalization of existing pretrained models. Based on the given target data (labeled or unlabeled), existing algorithms can be split into two categories: (1) supervised retraining; (2) unsupervised domain adaptation (source-free and source-target-joint training).
- Recent works on Model Exchange & Serving and Model Monitoring & Updates are summarized in these slides from Architecture of ML Systems (SS2021, Graz University of Technology).
- Dynamic deep neural networks - computing flexibility
- Dataset - test your ideas on popular datasets
- Toolbox - verify your ideas quickly using toolbox
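As referenced in the low-resource learning item above, here is a minimal PyTorch sketch of the AdaIN operation (align the per-channel statistics of content features to those of style/target features); the tensor shapes are illustrative assumptions, not tied to any specific model:

```python
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """AdaIN: normalize content features per channel, then re-scale/shift them
    with the style features' per-channel statistics (inputs are N x C x H x W)."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean

# Illustrative usage with random feature maps.
out = adain(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```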
Book and Survey
- Multi-Camera Networks: Principles and Applications. 2005.
- Camera Networks: The Acquisition and Analysis of Videos over Wide Areas (Synthesis Lectures on Computer Vision). 2012.
- M. Valera et al. Intelligent distributed surveillance systems: a review. 2005.
- Wang et al. Intelligent multi-camera video surveillance: a review. 2012.
- Ye et al. Wireless Video Surveillance: A Survey. 2013.
- Zhang et al. Deep Learning in Mobile and Wireless Networking: A Survey. IEEE Communications Surveys & Tutorials, 2019.
Researchers, Workshops and Courses
Researchers (organization and research interests)
- System (live video analytics, distributed computing, video streaming, privacy, collaborative/continual learning)
- Matthias Boehm (Graz University of Technology, Austria) - data management and deep learning based data analytics
- Arun Kumar (University of California San Diego, USA) - data management and deep learning based data analytics
- Ganesh Ananthanarayanan (Microsoft Research, USA) - live video analytics, distributed computing
- Yuanchao Shu (Microsoft Research, USA) - live video analytics, collaborative/continual learning
- Feng Qian (University of Minnesota Twin Cities, USA) - video streaming
- Junchen Jiang (The University of Chicago, USA) - video streaming
- Ravi Netravali (Princeton, USA) - edge video AI
- Fengyuan Xu (Nanjing University, China) - the Internet of Video Things (IoVT) and privacy-preserving edge AI
- Shivaram Venkataraman (University of Wisconsin-Madison, USA) - real-time video processing
- Deep learning based algorithms (tracking, object detection, segmentation and action recognition)
- Andrea Cavallaro (Queen Mary University of London, UK) - multi-modal fusion, privacy-aware video analytics (based on adversarial-training/learning)
- Amit K. Roy-Chowdhury (UC Riverside, USA) - tracking, reID, super-resolution and domain adaptation
- Jenq-Neng Hwang (University of Washington, USA) - tracking, reID, localization and visual odometry
- Hamed Haddadi (Imperial College London, UK) - privacy-preserving edge AI
- Ying Wu (Northwestern, USA) - tracking, detection, reID and segmentation
- Gaoang Wang (Zhejiang University, China) - scene-aware multi-object tracking
- Haibin Ling (Stony Brook University, USA) - visual tracking in drones
- Mubarak Shah (University of Central Florida, USA) - zero/few-shot learning in video based tracking/segmentation/action recognition
- Ming-Hsuan Yang (UC Merced, USA) - low-resource (data or compute) learning for tracking/detection/segmentation
Workshops (video analytics)
- The 3rd Workshop on Hot Topics in Video Analytics and Intelligent Edges (ACM MobiCom'21) - focus on deep learning based video analytics
- Multi-camera Multiple People Tracking Workshop (IEEE ICCV'21) - track multiple people from indoor scenes using multiple RGB cameras
- Multimedia Systems Conference (ACM MMSys'21) - contain multiple topics in video analysis
Courses
- CS294: Machine Learning Systems (Fall 2019, Berkeley) - contain all concepts/background behind machine learning systems (the best reference website!)
- 706.550: Architecture of ML Systems (Summer 2021, Graz University of Technology) - the architecture and essential concepts of modern ML systems for both local and large-scale machine learning (based on non-deep ML analytics)
- CS231A: Computer Vision, From 3D Reconstruction to Recognition (Winter 2021, Stanford) - focus on basic concepts behind many computer vision tasks across multi-camera networks (camera models, calibration, single- and multi-view geometry, stereo systems, structure from motion, stereo matching, depth estimation, optical flow and optimal estimation)
- COS 598a: Machine Learning-Driven Video Systems (Spring 2022, Princeton) - targets recent research interests in video analytics (Strong Recommendation)
- CS34702 Topics in Networks: Machine Learning for Networking and Systems (Fall 2020, UChicago) - targets recent research works on networking systems (the video streaming and cloud scheduling topics are recommended)
- CSE 234: Data Systems for Machine Learning (Fall 2021, UCSD) - focus on the lifecycle of ML-based data analytics, including data sourcing and preparation for ML, programming models and systems for scalable ML model building, and systems for faster ML deployment
- CSE 291F: Advanced Data Analytics and ML Systems (Winter 2019, UCSD) - the emerging area of advanced data analytics and ML systems, at the intersection of data management, ML/AI, and systems.
- CS6465: Emerging Cloud Technologies and Systems Challenges (Fall 2019, Cornell) - emerging cloud computing technology, opportunities and challenges.
Topics
System
Edge video analytics
[1] Li et al. Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics. In SIGCOMM'20.<br> [2] Xu et al. Video Analytics with Zero-streaming Cameras. In ATC'21.<br> [3] Jha et al. Visage: Enabling Timely Analytics for Drone Imagery. In MobiCom'21.<br> [4] Jiang et al. Flexible High-resolution Object Detection on Edge Devices with Tunable Latency. In MobiCom'21.<br> [5] Han et al. LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision. In MobiCom'21.<br> [6] Zhang et al. Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Parallel Offloading. In MobiCom'21.<br> [7] Xiao et al. Towards Performance Clarity of Edge Video Analytics. In SEC'21.
Configuration search
Database
Video streaming
[1] Y. Yan et al. Learning in situ: a randomized experiment in video streaming. In NSDI'20.<br> [2] Kim et al. Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning. In SIGCOMM'20.<br> [3] Du et al. Server-Driven Video Streaming for Deep Learning Inference. In SIGCOMM'20.<br> [4] Han et al. ViVo: Visibility-aware Mobile Volumetric Video Streaming. In MobiCom'20.<br> [5] Zhang et al. SENSEI: Aligning Video Streaming Quality with Dynamic User Sensitivity. In NSDI'21.<br>
Resource management
[1] Zhang et al. The Design and Implementation of a Wireless Video Surveillance System. In MobiCom'15.<br> [2] Xu et al. Approximate Query Service on Autonomous IoT Cameras. In MobiSys'20.<br> [3] Bhardwaj et al. Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers. In NSDI'22. - targets when to retrain models and how to reduce resource usage across many concurrent inference and retraining tasks. <br> [4] Zhou et al. Octo: INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning. In ATC'21.
Prediction serving and model update
[1] Suprem et al. ODIN: Automated Drift Detection and Recovery in Video Analytics. In VLDB'21. - detects domain drift and updates the corresponding models automatically. <br> [2] Romero et al. INFaaS: Automated Model-less Inference Serving. In ATC'21. Best paper award! - the first model-less prediction serving system<br> [3] Feng et al. Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew. In ATC'21. - model selection based on automatically detected class skews<br> [4] Wang et al. SmartHarvest: Harvesting Idle CPUs Safely and Efficiently in the Cloud. In EuroSys'21. - identifies and harvests idle resources<br> [5] Hu et al. Scrooge: A Cost-Effective Deep Learning Inference System. In SoCC'21. - considers input complexity<br> [6] Ling et al. RT-mDL: Supporting Real-Time Mixed Deep Learning Tasks on Edge Platforms. In SenSys'21. - schedules multiple DL jobs on resource-constrained devices<br> [7] Schelter et al. Learning to Validate the Predictions of Black Box Classifiers on Unseen Data. In SIGMOD'20. - a tool to monitor models' performance without annotations<br> [8] Agarwal et al. Boggart: Accelerating Retrospective Video Analytics via Model-Agnostic Ingest Processing. arXiv preprint 2106.15315.<br> [9] Gunasekaran et al. Cocktail: Leveraging Ensemble Learning for Optimized Model Serving in Public Cloud. In NSDI'22. - improves prediction serving performance via ensemble learning<br>
Multi-Camera Collaboration
[1] Jain et al. Scaling Video Analytics Systems to Large Camera Deployments. In HotMobile'19.<br> [2] Liu et al. Who2com: Collaborative Perception via Learnable Handshake Communication. In ICRA'20.<br> [3] Liu et al. When2com: Multi-Agent Perception via Communication Graph Grouping. In CVPR'20.<br> [4] Zeng et al. Distream: Scaling Live Video Analytics with Workload-Adaptive Distributed Edge Intelligence. In SenSys'20.<br> [5] Jain et al. Spatula: Efficient cross-camera video analytics on large camera networks. In SEC'20. Best Paper Award!<br> [6] Tong et al. Large-Scale Vehicle Trajectory Reconstruction with Camera Sensing Network. In MobiCom'21.
Privacy
Useful external links | Keywords |
---|---|
Tutorial on privacy-preserving data analysis (The Alan Turing Institute) | todo |
The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-21) | todo |
A Dive into Privacy Preserving Machine Learning (OpML'20) | todo |
CrypTen (Facebook AI Research) | Privacy Preserving Machine Learning framework, PyTorch, Multi-Party Computation (MPC) |
[1] (TAMU and Adobe Research) Wu et al. Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study. In ECCV'18.<br> [2] (CMU) Wang et al. Enabling Live Video Analytics with a Scalable and Privacy-Aware Framework. In 2018 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM'18).<br> [3] (KAIST, USTC, Rice, NJU, SNU, PKU and MSRA) Lee et al. Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX. In MobiCom'19.<br> [4] (NUS) Shen et al. Human-imperceptible Privacy Protection Against Machines. In MM'19.<br> [5] (PSU and Facebook) Khazbak et al. TargetFinder: Privacy Preserving Target Search through IoT Cameras. In IoTDI'19 (Best Paper Award).<br> [6] (Tsinghua and USTC) Li et al. Invisible: Federated Learning over Non-Informative Intermediate Updates against Multimedia Privacy Leakages. In MM'20.<br> [7] (UCB and MSR) Poddar et al. Visor: Privacy-Preserving Video Analytics as a Cloud Service. In the 29th USENIX Security Symposium (Security'20).<br> [8] (ICL, QMUL, Telefónica Research and Samsung AI) Mo et al. DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments. In MobiSys'20.<br> [9] (NJU, Cornell and MSRA) Wu et al. PECAM: Privacy-Enhanced Video Streaming and Analytics via Securely-Reversible Transformation. In MobiCom'21.<br> [10] (ASU) Hu et al. LensCap: Split-Process Framework for Fine-Grained Visual Privacy Control for Augmented Reality Apps. In MobiSys'21.<br> [11] (CUHK) Ouyang et al. ClusterFL: A Similarity-Aware Federated Learning System for Human Activity Recognition. In MobiSys'21.<br> [12] (ICL and Telefónica Research) Mo et al. PPFL: Privacy-preserving Federated Learning with Trusted Execution Environments. In MobiSys'21 (Best paper award).<br> [13] (CMU, UCSD and MSR) Dsouza et al. Amadeus: Scalable, Privacy-Preserving Live Video Analytics. arXiv preprint 2011.05163.<br> [14] (MIT, Princeton, UChicago and Rutgers) Cangialosi et al. Privid: Practical, Privacy-Preserving Video Analytics Queries. In NSDI'22.<br>
AI Algorithm
Low-resource learning
[1] H. Aghdam et al. Active Learning for Deep Detection Neural Networks. In ICCV'19. (Public Code, Note)
Domain adaptation and continual learning
xxx
Dynamic deep neural networks
xxx
Dataset
- Duke MTMC (8 cameras, non-overlapping)
- Nvidia CityFlow (>40 cameras, overlapping and non-overlapping)
- EPFL WildTrack (7 cameras, overlapping)
- EPFL-RLC (3 cameras, overlapping)
- CMU Panoptic Dataset (>50 cameras, overlapping)
- University of Illinois STREETS (100 cameras, non-overlapping)
- Awesome reID dataset
Toolbox
- CUHK-mmcv: a foundational Python library for computer vision research that supports many research projects (2D/3D detection, semantic segmentation, image and video editing, pose estimation, action understanding and image classification).
- JDAI-CV-fastreid: a Python library implementing SOTA re-identification methods (including pedestrian and vehicle re-identification). They also provide good documentation.
- Cheetah: an end-to-end deep learning based prediction serving server that speeds up the deployment of image classification, object detection, segmentation and tracking models; it is built on the NVIDIA Triton server and Docker (see the client sketch after this list).
- Chameleon: an efficient continuous adaptation framework based on NVIDIA TAO.
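Since Cheetah builds on NVIDIA Triton, here is a minimal sketch of querying a Triton Inference Server with the official tritonclient package; the server URL, model name ("detector"), and tensor names ("INPUT__0"/"OUTPUT__0") are hypothetical placeholders for illustration, not Cheetah's actual interface:

```python
# Minimal Triton HTTP client sketch; model/tensor names are placeholders and
# must match the deployed model's config.pbtxt.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

image = np.random.rand(1, 3, 224, 224).astype(np.float32)            # dummy input batch
inputs = [httpclient.InferInput("INPUT__0", list(image.shape), "FP32")]
inputs[0].set_data_from_numpy(image)
outputs = [httpclient.InferRequestedOutput("OUTPUT__0")]

result = client.infer(model_name="detector", inputs=inputs, outputs=outputs)
print(result.as_numpy("OUTPUT__0").shape)                             # raw output tensor
```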