MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data

This repository contains the codebase and resources for the MSEG-VCUQ framework, detailed in the paper "MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data." The framework integrates vision foundation models and convolutional neural networks (CNNs) with uncertainty quantification (UQ) to improve segmentation accuracy and reliability across diverse high-speed video (HSV) phase detection modalities.


📁 Repository Structure

<details> <summary><strong>cnn_uq/</strong> - Convolutional Neural Networks and Uncertainty Quantification</summary>

This folder focuses on U-Net CNNs and uncertainty quantification (UQ) for HSV segmentation (see the minimal UQ sketch below this section).

</details>

<details> <summary><strong>videosam/</strong> - Vision Foundation Models: VideoSAM Framework</summary>

This folder contains the VideoSAM framework, which couples CNNs with the transformer-based Segment Anything Model (SAM). It is the vision foundation model component of MSEG-VCUQ, tailored for multimodal segmentation tasks (see the minimal SAM inference sketch below this section).

</details>

<details> <summary><strong>paper/</strong> - Research Documentation</summary>

This folder contains the paper and its supporting research documentation.

</details>

<details> <summary><strong>README.md</strong></summary>

This file provides a comprehensive overview of the repository and guides users in navigating its structure.

</details>
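
To make the UQ idea concrete, here is a minimal Monte Carlo dropout sketch in PyTorch. It is not the repository's implementation: the TinySegNet stand-in, the dropout rate, and the sample count are illustrative placeholders, and the actual U-Net architecture and UQ pipeline live in cnn_uq/.

```python
import torch
import torch.nn as nn

# Toy encoder-decoder stand-in for a segmentation U-Net; the real
# architecture in cnn_uq/ will differ from this sketch.
class TinySegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(p=0.5),  # kept active at inference for MC dropout
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=20):
    """Run n_samples stochastic forward passes with dropout enabled and
    return the mean segmentation probability and its per-pixel std."""
    model.eval()
    for m in model.modules():  # re-enable dropout layers only
        if isinstance(m, (nn.Dropout, nn.Dropout2d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.std(dim=0)

frame = torch.rand(1, 1, 64, 64)  # dummy grayscale HSV frame
mean_mask, uncertainty = mc_dropout_predict(TinySegNet(), frame)
```

The per-pixel standard deviation across the stochastic passes acts as a simple uncertainty map; the paper's UQ treatment goes well beyond this sketch.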
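
Likewise, here is a minimal prompt-based SAM inference pass, shown through the Hugging Face transformers port of SAM rather than VideoSAM itself; the facebook/sam-vit-base checkpoint and the single point prompt are illustrative assumptions, not the framework's actual configuration.

```python
import numpy as np
import torch
from PIL import Image
from transformers import SamModel, SamProcessor

# Base SAM checkpoint from the Hugging Face hub; VideoSAM's own weights
# and CNN coupling live in videosam/ and are not reproduced here.
processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
model = SamModel.from_pretrained("facebook/sam-vit-base")

# Dummy grayscale HSV frame converted to RGB, plus one point prompt
# (the pixel coordinates are placeholders).
frame = Image.fromarray(
    np.random.randint(0, 255, (256, 256), dtype=np.uint8)
).convert("RGB")
input_points = [[[128, 128]]]  # one (x, y) prompt on the region of interest

inputs = processor(frame, input_points=input_points, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Upscale the predicted low-resolution masks back to the frame size.
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks.cpu(),
    inputs["original_sizes"].cpu(),
    inputs["reshaped_input_sizes"].cpu(),
)
```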

🚀 Getting Started

To begin using MSEG-VCUQ, follow these steps:

  1. Clone the repository:

    git clone https://github.com/chikap421/mseg_vcuq.git
    cd mseg_vcuq
    
  2. Install the required dependencies:

    pip install -r requirements.txt
    
  3. Explore the subdirectories (cnn_uq, videosam, paper) for specific tools, datasets, and models.

  4. Follow the folder-specific README.md files for instructions on replicating results or running experiments.


📜 License

This repository is licensed under the MIT License. See the LICENSE file for details.

🖋️ Citations

If you use this repository in your research, please cite:

    @misc{maduabuchi2024msegvcuqmultimodalsegmentationenhanced,
          title={MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data},
          author={Chika Maduabuchi and Ericmoore Jossou and Matteo Bucci},
          year={2024},
          eprint={2411.07463},
          archivePrefix={arXiv},
          primaryClass={cs.CV},
          url={https://arxiv.org/abs/2411.07463},
    }

🌟 Acknowledgments

We acknowledge the contributions of the MIT Red Lab, collaborators, and funding agencies that supported this research.