Using vision transformers for retinal video object detection and instance-level semantic segmentation

This project focuses on applying vision-transformer-based models, such as the Detection Transformer (DETR), as well as panoptic instance-level semantic segmentation models, to retinal video tasks. We will use a dataset of 3,800 training images, all annotated for different features in the retinal images. Our goal is to establish a benchmark for how well a deep neural network (DNN) detects these features, measured by precision and recall (that is, the false positives and false negatives among predicted objects), the ROC curve, and mean average precision (mAP). We may then investigate possible improvements, for example data-centric approaches to finding the most valuable training data, and ablation studies.
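As a rough illustration of the detection metrics named above, the sketch below computes precision and recall for a single image by matching predicted boxes to annotated boxes on intersection over union (IoU). The function names, the box format `(x_min, y_min, x_max, y_max)`, the greedy matching rule, and the 0.5 IoU threshold are all assumptions made for this example, not details of the project's actual evaluation pipeline.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    iou_thr: float = 0.5,  # assumed threshold, chosen for illustration
) -> Tuple[float, float]:
    """Greedy matching: higher-scoring predictions claim ground-truth boxes first."""
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching annotation
    fn = len(truths) - tp  # annotations that no prediction matched
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

Sweeping the confidence score threshold and recomputing these two values traces out a precision-recall curve; mAP summarizes the area under that curve averaged over feature classes.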

Term

Spring 2023

Topic

Data Visualizations

Physical Science/Engineering

Technical Area(s)

Machine Learning (ML)
