The escalating frequency and severity of natural disasters pose significant threats to human health, infrastructure, and ecosystems. Timely, accurate information has the potential to transform disaster management, particularly through access to visual data for rapid response and recovery. Unmanned aerial systems (UAS) equipped with affordable sensors offer opportunities to collect extensive, high-resolution data, especially from otherwise inaccessible areas. The challenge, however, lies in effectively analyzing these large datasets.
The RescueNet initiative contributes high-resolution UAS imagery with detailed classification, semantic segmentation, and visual question-answering annotations. The challenge comprises two tracks: semi-supervised semantic segmentation and visual question answering (VQA).
The data was collected with a small UAS platform, DJI Mavic Pro quadcopters, after Hurricane Michael. The full dataset contains 4494 images, divided into training (~80%), validation (~10%), and test (~10%) sets. For Track 1, Semi-Supervised Semantic Segmentation, the training set contains around 900 labeled images (~25% of the training set) and around 2695 unlabeled images (~75% of the training set).
For Track 2, Visual Question Answering (VQA): in total, 100,000 questions were generated from 4300 images (the RescueNet-VQA dataset). Each image is associated with multiple questions, depending on the question type. The training set contains around 70,000 image-question (QA) pairs; the remaining pairs are reserved for model evaluation, and the training and evaluation sets are mutually exclusive.
RescueNet Challenge Tracks
This challenge mainly offers two tracks for post-disaster damage assessment on the RescueNet dataset. The first track focuses on Semantic Segmentation, while the second track focuses on Visual Question Answering.
Semi-Supervised Semantic Segmentation: The semantic segmentation labels include: 1) Background, 2) Water, 3) Building No Damage, 4) Building Minor Damage, 5) Building Major Damage, 6) Building Total Destruction, 7) Road-Clear, 8) Road-Blocked, 9) Vehicle, 10) Tree, 11) Pool. Only a small portion of the training images have their corresponding masks available.
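A common starting point for this track is pseudo-labeling: a model trained on the small labeled subset predicts masks for the unlabeled images, and only high-confidence pixels are kept for further training. The sketch below illustrates the confident-pixel selection step; the 0.9 threshold and the -1 "ignore" index are illustrative assumptions, not part of the challenge specification.

```python
import numpy as np

NUM_CLASSES = 11      # the 11 RescueNet segmentation labels
CONF_THRESHOLD = 0.9  # assumed confidence cutoff; a tunable hyperparameter

def pseudo_label(prob_map, threshold=CONF_THRESHOLD):
    """Convert a per-pixel softmax map of shape (H, W, C) into a pseudo-label
    mask, marking low-confidence pixels with -1 so they can be ignored in
    the training loss."""
    labels = prob_map.argmax(axis=-1)
    confidence = prob_map.max(axis=-1)
    labels[confidence < threshold] = -1
    return labels

# Toy example: a random 2x2 "image" with NUM_CLASSES-way softmax outputs.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(NUM_CLASSES), size=(2, 2))
mask = pseudo_label(probs)  # shape (2, 2), entries in {-1, 0, ..., 10}
```

In practice the threshold trades pseudo-label coverage against noise: a higher value keeps fewer but more reliable pixels.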
For the Visual Question Answering (VQA) task, we provide QA pairs. The questions fall into nine categories: Simple Counting (SC), Complex Counting (CC), Building Condition Recognition (BCR), Road Condition Recognition (RCR), Level of Damage (LOD), Risk Assessment (RA), Density Estimation (DE), Positional (POS), and Change Detection (CD).
In Simple Counting, the questions ask for the number of objects in an image regardless of their attributes. For example, “How many buildings are in the image?”.
In Complex Counting, attribute-based counting questions are asked, for example, “How many majorly damaged structures are seen in the images?”.
In Building Condition Recognition, the questions ask whether any building with a given attribute is present. “Are there any structures that have been destroyed completely by the disaster?” is an example of this question category.
In Road Condition Recognition, the questions probe the condition of the roads, e.g., “What is the condition of the road?” and “Is any part of the road undamaged in this scene?”.
In Level of Damage, we ask questions to identify the overall level of damage in a scene. “How badly damaged is the scene?” is an example.
In Risk Assessment, we evaluate the risk posed by the damage. “Does the recovery action need to be taken urgently?” is an example question for this category.
In the Density Estimation question category, we estimate the density of buildings in a given scene. This category includes questions such as “How dense is the area?”.
The dataset also includes the Positional question category, for which high-level scene understanding is necessary. “How much damage does the largest building in this image have?” and “What is the damage status of the smallest building in this image?” are examples.
In Change Detection, we identify whether the number of buildings has changed between the pre- and post-disaster scenes, owing to buildings that were totally destroyed. “Is there any change in the number of buildings after the disaster on the scene?” is an example.
How to Participate
Submit the predicted results for the corresponding validation and test sets on Codabench. For VQA, prediction results must be submitted as a text (.txt) file.
Codabench Link for Track 1: https://www.codabench.org/competitions/1409/
Codabench Link for Track 2: https://www.codabench.org/competitions/1550/
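A minimal sketch of writing VQA predictions to a .txt file is shown below. The "question_id answer" line format and the IDs are assumptions for illustration only; consult the Codabench starter kit for the exact format the scorer expects.

```python
# Hypothetical question IDs mapped to predicted answers; the real IDs come
# from the RescueNet-VQA evaluation data.
predictions = {
    "q_0001": "yes",
    "q_0002": "3",
}

# Write one prediction per line: "<question_id> <answer>" (assumed format).
with open("vqa_predictions.txt", "w") as f:
    for qid, answer in predictions.items():
        f.write(f"{qid} {answer}\n")
```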
Submit the following supplementary documents via the provided links for each track:
A 4-page document written in NeurIPS format describing the method and results at the end of the competition.
Trained model weights and source code for evaluation/inference
Relevant training curves (loss and/or accuracy) of the trained model
Relevant training logs that show the training loss and/or accuracy for each epoch
The Semantic Segmentation task will be evaluated based on the mean IoU (mIoU) score.
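For reference, mean IoU can be computed from a per-pixel confusion matrix as sketched below. This is a sketch of the standard metric, not the official evaluation script; classes absent from both the prediction and the ground truth are skipped here, which is one common convention.

```python
import numpy as np

NUM_CLASSES = 11  # the 11 RescueNet segmentation labels

def mean_iou(pred, gt, num_classes=NUM_CLASSES):
    """Mean intersection-over-union from a confusion matrix.

    `pred` and `gt` are integer class masks of the same shape; classes
    that appear in neither mask are excluded from the average.
    """
    pred, gt = np.asarray(pred).ravel(), np.asarray(gt).ravel()
    # Build the num_classes x num_classes confusion matrix in one pass.
    conf = np.bincount(gt * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes,
                                                           num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both masks
    return float((inter[valid] / union[valid]).mean())
```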
Visual Question Answering will be evaluated based on accuracy. We will consider only one answer per question: the answer with the highest probability from the model.
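The single-answer rule amounts to taking the argmax over the model's answer distribution before scoring. The sketch below illustrates this; the answer vocabulary and probabilities are made up for the example, as the real answer set is defined by the challenge data.

```python
import numpy as np

# Hypothetical answer vocabulary; the real one comes from RescueNet-VQA.
ANSWERS = ["yes", "no", "1", "2", "3"]

def top1_answers(probs):
    """Keep only the highest-probability answer per question, matching the
    single-answer evaluation rule."""
    return [ANSWERS[i] for i in np.argmax(probs, axis=1)]

def accuracy(preds, gts):
    """Fraction of questions whose top answer matches the ground truth."""
    return sum(p == g for p, g in zip(preds, gts)) / len(gts)

# Two questions, five candidate answers each (rows sum to 1).
probs = np.array([[0.10, 0.70, 0.10, 0.05, 0.05],
                  [0.60, 0.20, 0.10, 0.05, 0.05]])
preds = top1_answers(probs)
```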
Challenge starts: October 15, 2023
Development phase of the Challenge ends: December 1, 2023
Testing phase of the Challenge: December 1, 2023 - December 10, 2023
Winners announced: at NeurIPS 2023
Maryam Rahnemoonfar, Computer Vision and Remote Sensing Laboratory (Bina Lab), Lehigh University
Younghyun Koo, Computer Vision and Remote Sensing Laboratory (Bina Lab), Lehigh University
Tashnim Chowdhury, UMBC
Argho Sarkar, UMBC
Robin Murphy, Texas A&M University
Leila Hashemi Beni, North Carolina A&T