ACM ICMR 2021
Radar Object Detection Challenge on Radio Frequency (RF) Images for Autonomous Driving Related Applications
Challenge Available: Jan - Apr 2021
The dataset can only be used for academic purposes. By using this dataset and related software, you agree to cite our dataset and baseline paper. The download link will be available before the challenge starts.
The evaluation server is hosted on CodaLab. For detailed evaluation and submission instructions, please refer to the evaluation server. Link for the evaluation server and leaderboard: https://competitions.codalab.org/competitions/28019.
Baseline Method (RODNet)
The source code for the baseline method (RODNet) is available at https://github.com/yizhou-wang/RODNet. Please cite our baseline paper if it is helpful for your research.
Registration start date: January 15th, 2021
Dataset available date: January 18th, 2021
First-phase submission start date: January 18th, 2021
First-phase submission deadline: March 12th, 2021
Second-phase submission start date: March 12th, 2021
Second-phase submission deadline: March 26th, 2021
Winner announcement date: April 5th, 2021
Introduction and Significance
Among the sensors commonly used in autonomous or assisted driving (e.g., camera, LiDAR, and radar), radar is usually considered a robust and cost-effective solution even in adverse driving scenarios, e.g., weak/strong lighting and bad weather. Because autonomous driving systems must work across all such conditions, radar is an important alternative, not only as a complement to other sensors but also as a possible replacement for semantic perception.
However, the object detection task on radar data is not well explored in either academia or industry. The reasons are threefold:
Radar signal, especially radio frequency (RF) data, is not an intuitive type of data like RGB images, so its role is seriously underestimated.
Very few public datasets with proper object annotations are available for machine learning methods.
It is difficult to extract semantic information for object classification from radar signals.
Although some related research works [1, 2, 3, 4] have explored the object detection task on RF data, there is no public benchmark in this area, to the best of our knowledge. To further explore the radar’s potential and attract more attention to the field, a public benchmark is crucial and urgently required.
In this challenge, we aim to detect and classify the objects in the radar’s field of view (FoV) based on our previous work [1] and the self-collected ROD2021 dataset. The radar data is represented by normalized radio frequency (RF) images in the radar’s range-azimuth, i.e., bird’s-eye view (BEV), coordinates. There are four different driving scenarios in this challenge, i.e., parking lot, campus road, city street, and highway. The object annotations provided in the training set are a mix of human labels and camera-radar fused (CRF) annotations, with three different object categories, i.e., pedestrian, cyclist, and car. The final score is the overall AP on the testing set.
Our previous work RODNet [1], which was accepted at WACV 2021 with very high review scores, will be used as the baseline method for this challenge. It addresses the radar object detection problem using annotations systematically generated by a proposed CRF algorithm. This algorithm is also used to provide annotations for the training set in this challenge.
We will use the ROD2021 dataset, which is a subset of the CRUW dataset, for this challenge. The sensor platform for the ROD2021 dataset contains an RGB camera and a 77 GHz FMCW MMW radar, which are well calibrated and synchronized. Both the camera and the radar run at 30 FPS. The radar’s field of view (FoV) is 0-25 m in range and ±60° in azimuth.
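Since object locations are given in range-azimuth coordinates, it can help to convert them to metric bird's-eye-view offsets and to check them against the stated field of view. The following is a minimal sketch only: the axis convention (azimuth 0° straight ahead, positive to the right) and the function names are assumptions for illustration and are not taken from the dataset tools.

```python
import math

# Field of view stated for the challenge radar: 0-25 m, +/-60 degrees.
MAX_RANGE_M = 25.0
MAX_AZIMUTH_DEG = 60.0


def in_fov(range_m, azimuth_deg):
    """Return True if a (range, azimuth) point lies inside the stated FoV."""
    return 0.0 <= range_m <= MAX_RANGE_M and abs(azimuth_deg) <= MAX_AZIMUTH_DEG


def polar_to_bev(range_m, azimuth_deg):
    """Convert (range, azimuth) to (x, y) in meters.

    Assumed convention (hypothetical): x is lateral (right positive),
    y is forward, and azimuth 0 points straight ahead.
    """
    theta = math.radians(azimuth_deg)
    return range_m * math.sin(theta), range_m * math.cos(theta)


x, y = polar_to_bev(10.0, 30.0)
print(round(x, 3), round(y, 3))  # lateral and forward offsets in meters
```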
There are 50 sequences in total, 40 for training and 10 for testing. Each sequence contains around 800-1700 frames, and the sequences cover four different driving scenarios, i.e., parking lot (PL), campus road (CR), city street (CS), and highway (HW). In addition, we have several vision-hard sequences of poor image quality, i.e., weak/strong lighting, blur, etc. These data are only used for testing/evaluation purposes to illustrate the radar’s robustness compared with the camera.
In this challenge, the training set contains both RGB and RF image sequences, while the testing set only contains radar RF image sequences. We provide the annotations on the RF images for all the training data, and the participants are also allowed to use both RGB and RF images during the training stage. However, RF images are the only input data allowed in the testing stage.
The provided camera data in the training set are in the format of RGB image sequences. The images are pre-processed by the following steps:
Image undistortion (for monocular camera) or rectification (for stereo cameras).
Besides the four different driving scenarios, there are three statuses for camera data, i.e., normal, blur, and night, representing different qualities of the RGB images.
The provided radar data are pre-processed sequences of range-azimuth heatmaps after several Fast Fourier Transforms (FFTs). The technical details of the pre-processing can be described as follows:
First, we apply an FFT to the samples to estimate the range of the reflections.
A low-pass filter (LPF) is then utilized to remove the high-frequency noise across all chirps (described below) in each frame at the rate of 30 FPS.
After the LPF, we conduct a second FFT on the samples along different receiver antennas to estimate the azimuth angle of the reflections and obtain the final RF images.
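The two FFT steps above can be illustrated on a synthetic signal. This is only a sketch under stated assumptions, not the organizers' pre-processing code: it uses a naive DFT from the standard library, omits the low-pass filtering step, assumes complex samples, an even number of receiver antennas, and half-wavelength antenna spacing, and all names (`dft`, `range_azimuth_map`, `azimuth_deg`) are hypothetical.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(n^2)); fine for a small demo."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def range_azimuth_map(chirp):
    """chirp[a][s] is complex sample s from receiver antenna a.

    Returns magnitudes indexed as [range_bin][azimuth_bin], with the
    azimuth axis shifted so that bin n_ant // 2 corresponds to 0 degrees.
    Assumes an even number of antennas.
    """
    n_ant = len(chirp)
    # First FFT: along the samples of each antenna -> range bins.
    range_profiles = [dft(samples) for samples in chirp]
    n_rng = len(range_profiles[0])
    half = n_ant // 2
    heatmap = []
    for r in range(n_rng):
        # Second FFT: across antennas at a fixed range bin -> azimuth bins.
        spectrum = dft([range_profiles[a][r] for a in range(n_ant)])
        shifted = spectrum[half:] + spectrum[:half]  # move zero angle to the centre
        heatmap.append([abs(v) for v in shifted])
    return heatmap


def azimuth_deg(azimuth_bin, n_ant):
    """Map a shifted azimuth bin to degrees, assuming half-wavelength spacing."""
    return math.degrees(math.asin(2.0 * (azimuth_bin - n_ant // 2) / n_ant))


# Synthetic single reflection: range bin 5, phase step of 2 bins across antennas.
N_SAMPLES, N_ANT = 16, 8
chirp = [
    [cmath.exp(2j * math.pi * (5 * s / N_SAMPLES + 2 * a / N_ANT))
     for s in range(N_SAMPLES)]
    for a in range(N_ANT)
]
heatmap = range_azimuth_map(chirp)
peak_range, peak_azimuth = max(
    ((r, a) for r in range(N_SAMPLES) for a in range(N_ANT)),
    key=lambda ra: heatmap[ra[0]][ra[1]],
)
print(peak_range, peak_azimuth, round(azimuth_deg(peak_azimuth, N_ANT), 1))
```

In this toy case the peak of the heatmap falls at the range bin and azimuth bin of the simulated reflection, which is the property the released RF images rely on. Converting a range bin to meters requires radar parameters that are not given here.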
After the pre-processing, we get normalized radar RF images for the frames at 30 FPS. Unlike video sequences, FMCW radar data has a special unit called a chirp. Chirps can be treated as “sub-frames” that are uniformly distributed in time within each frame. Although our radar sensor can provide 255 chirps per frame, we only release the data of 4 chirps per frame, which are uniformly selected. In our experiments, we can achieve good object detection performance using only 4 chirps.
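As an illustration of what uniform selection of 4 chirps out of 255 could look like, here is a minimal sketch. The indexing rule (start at chirp 0, integer spacing) and the function name are assumptions; the chirp indices actually released with the dataset may differ.

```python
def uniform_chirp_indices(total_chirps, kept_chirps):
    """Evenly spaced chirp indices, starting from chirp 0 (assumed rule)."""
    if not 0 < kept_chirps <= total_chirps:
        raise ValueError("kept_chirps must be between 1 and total_chirps")
    return [(i * total_chirps) // kept_chirps for i in range(kept_chirps)]


print(uniform_chirp_indices(255, 4))  # [0, 63, 127, 191]
```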
The object annotations on the training set will be provided for this challenge. The annotations include the object classes and their locations in the radar’s range-azimuth coordinates. There are three different classes of objects we need to distinguish, i.e., pedestrian, cyclist, and car. No other object attributes (e.g., object size) are included in this challenge. The training set annotations are provided in one of two ways: human labeling or the camera-radar fusion (CRF) algorithm, while the testing set annotations are purely human-labeled ground truth. Since we also provide the RGB images for the training set, the participants can use both RGB and RF images to design their own annotation generation method. Hand-labeling on the training set is not allowed.
The ROD2021 dataset (a subset of CRUW) for this challenge will be available to the participants once the challenge starts. The participants are required to use the provided training set with annotations to develop an object detection method using the radar data only as the input. The participants are also allowed to propose their own object annotation methods based on the RGB and RF images in the training set, but the proposed object annotation method needs to be clearly described in your method description as well as any future paper at ICMR 2021. The object detection results should be submitted to our evaluation server (hosted by CodaLab), including the object classes and object locations in the radar range-azimuth coordinates, i.e., in the bird’s-eye view. Each object in the radar’s FoV is represented by a point in the RF image.
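Because each object is a point rather than a box, detections have to be matched to ground truth by location. The official metric is defined on the evaluation server and may differ from what follows; this is only a generic sketch of average precision for point detections of a single class, where the matching rule (nearest unmatched ground-truth point within a hypothetical distance threshold `max_dist`) and the tuple layouts are assumptions.

```python
import math


def average_precision(detections, ground_truth, max_dist=2.0):
    """AP for one class of point detections (illustrative only).

    detections:   list of (frame_id, x, y, score)
    ground_truth: list of (frame_id, x, y)
    max_dist:     hypothetical matching threshold in meters
    """
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for frame_id, x, y, _score in sorted(detections, key=lambda d: -d[3]):
        best, best_dist = None, max_dist
        for i, (gt_frame, gx, gy) in enumerate(ground_truth):
            if gt_frame != frame_id or i in matched:
                continue
            dist = math.hypot(x - gx, y - gy)
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is None:
            hits.append(False)  # false positive
        else:
            matched.add(best)   # each ground-truth point matches at most once
            hits.append(True)   # true positive
    # Precision and recall after each detection.
    true_positives = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))
    # Make precision non-increasing, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


gt = [(0, 0.0, 10.0), (0, 5.0, 15.0)]
dets = [(0, 0.1, 10.2, 0.9), (0, -8.0, 3.0, 0.8), (0, 5.2, 14.9, 0.7)]
print(average_precision(dets, gt))
```

An overall score across pedestrian, cyclist, and car could then be obtained by running this per class and averaging, although the exact aggregation used by the challenge is not specified here.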
There will be two phases for this challenge:
First phase: a randomly selected 30% of the overall testing set is used for evaluation.
Second phase: the remaining 70% of the testing set. The final score is the AP in the second phase.
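The 30%/70% partition is performed by the organizers, so participants do not need to reproduce it. Purely to illustrate the idea, the sketch below splits a list of test items reproducibly; whether the real split is made over sequences or frames is not stated, and the seed, identifiers, and function name are hypothetical.

```python
import random


def split_phases(test_ids, first_fraction=0.3, seed=0):
    """Randomly split test items into (first_phase, second_phase)."""
    ids = sorted(test_ids)
    random.Random(seed).shuffle(ids)  # seeded for a reproducible split
    n_first = round(len(ids) * first_fraction)
    return sorted(ids[:n_first]), sorted(ids[n_first:])


# Hypothetical identifiers for the 10 testing sequences.
sequence_ids = [f"test_seq_{i:02d}" for i in range(10)]
first_phase, second_phase = split_phases(sequence_ids)
print(len(first_phase), len(second_phase))  # 3 7
```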
Some detailed rules are listed as follows:
The participants can form their own teams from different organizations, and the number of participants is not limited. However, only one team is allowed from an individual organization.
The participants are NOT allowed to use external data for either training or validation.
The teams need to provide their open-source code through GitHub after the challenge results are announced.
The participants are not allowed to use extra information from human labeling on the training dataset or testing dataset for the challenge’s target labels.
The participants are allowed to propose their own object annotation methods based on the RGB and RF images in the training set, but the proposed object annotation method needs to be clearly described in your method description as well as any future paper at ICMR 2021.
During each of the two phases of the competition, each team can submit results for evaluation only once per day, and fewer than 10 attempts in total.
Remember to submit your best results to the leaderboard before the phase deadline.
The provided dataset can only be used for academic purposes. By using this dataset and related software, you agree to cite our dataset and baseline paper [1].
The teams with good performance will be invited to submit their papers. The top-3 teams will be invited to present their work at ACM ICMR 2021.
The prizes for the winners are sponsored by ETRI. ETRI will provide the prizes as follows:
First Place: $2000
Second Place: $1000
Third Place: $500
Yizhou Wang (firstname.lastname@example.org): Ph.D. Student, University of Washington.
Jenq-Neng Hwang (email@example.com): Professor, University of Washington.
Hui Liu (firstname.lastname@example.org): Professor, University of Washington; President, Silkwave Holdings Limited.
Gaoang Wang (email@example.com): Assistant Professor, Zhejiang University.
Kwang-Ju Kim (firstname.lastname@example.org): Electronics and Telecommunications Research Institute (ETRI), South Korea.
[1] Wang, Yizhou, et al. "RODNet: Radar Object Detection using Cross-Modal Supervision." Winter Conference on Applications of Computer Vision (WACV). IEEE, 2021.
[2] Major, Bence, et al. "Vehicle Detection With Automotive Radar Using Deep Learning on Range-Azimuth-Doppler Tensors." Proceedings of the IEEE International Conference on Computer Vision Workshops. 2019.
[3] Ouaknine, Arthur, et al. "CARRADA Dataset: Camera and Automotive Radar with Range-Angle-Doppler Annotations." arXiv preprint arXiv:2005.01456 (2020).
[4] Palffy, Andras, et al. "CNN based Road User Detection using the 3D Radar Cube." IEEE Robotics and Automation Letters 5.2 (2020): 1263-1270.