A dilution-based defense method against poisoning attacks on deep learning systems

Hweerang Park, Youngho Cho


A poisoning attack in deep learning (DL) is a type of adversarial attack that injects maliciously manipulated data samples into a training dataset so that a DL model trained on the poisoned dataset misclassifies inputs, significantly degrading its performance and reliability. The traditional defense approach against poisoning attacks tries to detect poisoned data samples in the training dataset and remove them. However, since new sophisticated attacks that evade existing detection methods continue to emerge, detection alone cannot effectively counter poisoning attacks. For this reason, in this paper, we propose a novel dilution-based defense method that mitigates the effect of poisoned data by adding clean data to the training dataset. According to our experiments, our dilution-based defense technique can significantly decrease the success rate of poisoning attacks and improve classification accuracy by effectively reducing the contamination ratio of the manipulated data. In particular, our proposed method outperformed an existing defense method (CutMix data augmentation) by up to 20.9 percentage points in classification accuracy.
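The core quantity the abstract refers to, the contamination ratio, can be sketched as follows. This is an illustrative toy example only, not the authors' implementation; the function and variable names are ours. It shows how adding clean samples to a fixed number of poisoned samples lowers the fraction of the training set that is poisoned:

```python
def contamination_ratio(n_poison: int, n_total: int) -> float:
    """Fraction of the training set that consists of poisoned samples."""
    return n_poison / n_total


def dilute(n_poison: int, n_total: int, n_added_clean: int) -> float:
    """Contamination ratio after adding n_added_clean clean samples
    to a dataset of n_total samples containing n_poison poisoned ones."""
    return contamination_ratio(n_poison, n_total + n_added_clean)


# Example: 500 poisoned samples in a 10,000-sample dataset (5% contamination).
before = contamination_ratio(500, 10_000)   # 0.05
# Doubling the dataset with clean data halves the contamination ratio.
after = dilute(500, 10_000, 10_000)         # 0.025
```

Whether this reduction translates into a lower attack success rate depends on the attack and model; the paper's experiments quantify that effect.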


Adversarial attacks; Adversarial defense method; Adversarial machine learning; Backdoor attacks; Deep learning; Poisoning attacks



DOI: http://doi.org/10.11591/ijece.v14i1.pp645-652

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).