This project focuses on applying vision-transformer-based models, such as Detection Transformer (DETR), as well as panoptic instance-level semantic segmentation models, to retinal video tasks. We will use a dataset of 3,800 training images, each annotated for different features in the retinal images. Our goal is to establish a benchmark for how well a deep neural network (DNN) detects these features, measured by precision and recall (which reflect the false positives and false negatives among predicted objects), the ROC curve, and mean average precision (mAP), among other metrics. We may then investigate possible improvements, for example data-centric approaches that identify the most valuable training data, and ablation studies.
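To make the detection metrics concrete, the following is a minimal Python sketch of how precision and recall could be computed for a single image and a single feature class. It is an illustration only, not the project's evaluation code: the function names `iou` and `precision_recall`, the `(x1, y1, x2, y2)` box format, the greedy highest-score-first matching, and the IoU threshold of 0.5 are all assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thresh=0.5):
    """Greedily match predictions to ground-truth boxes.

    preds:  list of (box, score) pairs (hypothetical format)
    truths: list of ground-truth boxes
    Returns (precision, recall, tp, fp, fn).
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    # Visit predictions from most to least confident.
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = None, iou_thresh
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching annotation
    fn = len(truths) - tp  # annotations that no prediction matched
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall, tp, fp, fn


# Example: one correct detection, one spurious detection, one missed feature.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
print(precision_recall(preds, truths))  # (0.5, 0.5, 1, 1, 1)
```

Under these assumptions, mAP would be obtained by sweeping the score threshold to trace a precision-recall curve per class, taking the area under each curve, and averaging over classes; established evaluation tools may differ in the exact matching and interpolation rules.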

Using vision transformers for retinal video object detection and instance-level semantic segmentation - Spring 2023 Discovery Project
Term
Spring 2023
Topic
Data Visualizations
Physical Science/Engineering
Technical Area(s)
Machine Learning (ML)