TIDE: Test-Time Few-Shot Object Detection

Weikai Li, Hongfeng Wei, Yanlai Wu, Jie Yang, Yudi Ruan, Yuan Li, Ying Tang

Research output: Contribution to journal › Article › peer-review

Abstract

Few-shot object detection (FSOD) aims to extract semantic knowledge from limited object instances of novel categories within a target domain. Recent advances in FSOD focus on fine-tuning the base model on a few objects via meta-learning or data augmentation. Despite their success, most of these methods rely on parametric readjustment to generalize to novel objects, which poses considerable challenges in Industry 5.0: 1) fine-tuning takes a certain amount of time, and 2) the parameters of the constructed model may be unavailable due to privilege protection, which makes fine-tuning impossible. Such constraints naturally limit their application in scenarios with real-time configuration requirements or in black-box settings. To tackle these challenges, we formalize a novel FSOD task, referred to as test-time few-shot detection (TIDE), in which the model is not tuned during the configuration procedure. To that end, we introduce an asymmetric architecture for learning a support-instance-guided dynamic category classifier. Further, a cross-attention module and a multiscale resizer are provided to enhance model performance. Experimental results on multiple FSOD platforms show that the proposed TIDE significantly outperforms contemporary methods. The implementation code is available at https://github.com/deku-0621/TIDE.
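
The idea of a classifier that is guided by support instances at test time, with no parameter updates, can be illustrated with a minimal sketch. This is not the authors' architecture (see the linked repository for that); it swaps in a simple prototype-based classifier and assumes that a frozen feature extractor has already produced embeddings for the support instances and the query regions. The function names, the temperature value, and the toy 2-D features are all hypothetical, and every class is assumed to have at least one support instance.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def build_dynamic_classifier(support_feats, support_labels, num_classes):
    """Average the support embeddings of each class into one prototype row.

    Hypothetical helper: the prototypes act as classifier weights that are
    computed from the support set at test time, without any gradient step.
    """
    prototypes = np.zeros((num_classes, support_feats.shape[1]))
    for c in range(num_classes):
        prototypes[c] = support_feats[support_labels == c].mean(axis=0)
    return l2_normalize(prototypes)


def classify(query_feats, prototypes, temperature=0.1):
    """Score query embeddings against the prototypes with a softmax over cosine similarity."""
    logits = l2_normalize(query_feats) @ prototypes.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs


# Toy data: two novel classes, two support instances each (assumed embeddings).
support_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
support_labels = np.array([0, 0, 1, 1])
query_feats = np.array([[0.8, 0.2], [0.2, 0.7]])

prototypes = build_dynamic_classifier(support_feats, support_labels, num_classes=2)
preds, probs = classify(query_feats, prototypes)
print(preds)  # [0 1]
```

Under these assumptions, adding a new category only requires recomputing one prototype row from its support instances, so the underlying model can stay frozen or be treated as a black box.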

Original language: English (US)
Pages (from-to): 1-10
Number of pages: 10
Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems
State: Accepted/In press - 2024
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering
