Abstract
Continual learning approaches help a model learn new information or new tasks sequentially while retaining previously acquired knowledge. However, such approaches are known to be extremely vulnerable to adversarial backdoor attacks, in which an intelligent adversary introduces a small amount of misinformation during training, in the form of an imperceptible backdoor pattern, to cause deliberate forgetting of a specific task or class at test time. In this work, we propose a novel defensive framework that counters such an attack by turning the attacker's primary strength - hiding the backdoor pattern by making it imperceptible to humans - against itself. To do so, we train the model on decoy data carrying a random, perceptible pattern, with the deliberate intent of overpowering the attacker's imperceptible pattern. We demonstrate the effectiveness of the proposed defense with several commonly used replay-based continual learning algorithms (both generative and exact replay) on continual learning benchmark variants of the CIFAR-10, CIFAR-100, and MNIST datasets. We show that the proposed framework considerably improves the robustness of continual learning algorithms without any knowledge of the attacker's target task, target class, or the shape, size, and location of the attacker's backdoor pattern. Moreover, our framework does not depend on the underlying continual learning algorithm and does not rely on detecting the attack samples (to subsequently remove them from further consideration); instead, it attempts to correctly classify even the attack samples, thereby ensuring robustness of continual learning models. We refer to the proposed defensive framework as adversary-aware continual learning (AACL).
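The decoy idea described above can be sketched in a few lines: stamp a visible pattern onto copies of some training images while keeping their *true* labels, so the model binds the dominant perceptible pattern to the correct class rather than to an adversary's hidden trigger. This is only an illustrative sketch assuming grayscale image tensors; the function names, the `fraction` parameter, and the fixed patch location are hypothetical and not taken from the paper.

```python
import numpy as np

def stamp_decoy(images, pattern, top=0, left=0):
    """Overlay a perceptible decoy patch onto a batch of images.

    images:  (N, H, W) float array in [0, 1]
    pattern: (h, w)    float array in [0, 1] -- the visible decoy pattern
    """
    stamped = images.copy()
    h, w = pattern.shape
    stamped[:, top:top + h, left:left + w] = pattern
    return stamped

def make_decoy_augmented_set(images, labels, pattern, fraction=0.2, seed=0):
    """Append decoy-stamped copies of a random subset, keeping the TRUE labels.

    Intuition (a sketch of the defense, not the paper's exact procedure):
    the model learns to associate the strong perceptible pattern with the
    correct class, which is intended to overpower any weaker, imperceptible
    backdoor trigger an attacker may have planted.
    """
    rng = np.random.default_rng(seed)
    n = max(1, int(fraction * len(images)))
    idx = rng.choice(len(images), size=n, replace=False)
    decoys = stamp_decoy(images[idx], pattern)
    aug_images = np.concatenate([images, decoys], axis=0)
    aug_labels = np.concatenate([labels, labels[idx]], axis=0)
    return aug_images, aug_labels
```

The augmented set would then be fed to whatever replay-based continual learning algorithm is in use; the defense is deliberately agnostic to that choice.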
| Original language | English (US) |
|---|---|
| Pages (from-to) | 126108-126121 |
| Number of pages | 14 |
| Journal | IEEE Access |
| Volume | 12 |
| DOIs | |
| State | Published - 2024 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering