Online continual learning from data streams in dynamic environments is a critical direction in computer vision. However, realistic benchmarks and fundamental studies in this area are still missing. To bridge the gap, we present a new online continual object detection benchmark with an egocentric video dataset, Objects Around Krishna (OAK). New object categories emerge in our benchmark in a pattern similar to what a single person might see in their day-to-day life. The dataset also captures the natural distribution shifts that occur as the person travels to different places. These long-running egocentric videos provide a realistic playground for continual learning algorithms, especially in online embodied settings.
OAK provides exhaustive bounding box annotations of 80 video snippets (∼17.5 hours) for 105 object categories in outdoor scenes. Of these categories, 16 are from the PASCAL VOC dataset, and the remaining ones are frequent classes determined by running an LVIS-pretrained Mask R-CNN model on the raw videos. We evaluate the performance of a model from three aspects:
We provide documentation and tools for inspection, preparation, and evaluation for the Wanderlust Challenge.
Here we list the sources for the Wanderlust Challenge. For instructions on how to set up and use this data, please see the Wanderlust devkit. The following files are available for download:
Please check our evaluation server for more up-to-date results.