TY - GEN
T1 - Integrating symbolic and statistical methods for testing intelligent systems
T2 - 19th Design, Automation and Test in Europe Conference and Exhibition, DATE 2016
AU - Ramanathan, Arvind
AU - Pullum, Laura L.
AU - Hussain, Faraz
AU - Chakrabarty, Dwaipayan
AU - Jha, Sumit Kumar
N1 - Publisher Copyright:
© 2016 EDAA.
PY - 2016/4/25
Y1 - 2016/4/25
N2 - Embedded intelligent systems ranging from tiny implantable biomedical devices to large swarms of autonomous unmanned aerial systems are becoming pervasive in our daily lives. While we depend on the flawless functioning of such intelligent systems, and often take their behavioral correctness and safety for granted, it is notoriously difficult to generate test cases that expose subtle errors in the implementations of machine learning algorithms. Hence, the validation of intelligent systems is usually achieved by studying their behavior on representative data sets, using methods such as cross-validation and bootstrapping. In this paper, we present a new testing methodology for studying the correctness of intelligent systems. Our approach uses symbolic decision procedures coupled with statistical hypothesis testing to validate machine learning algorithms. We show how we have employed our technique to successfully identify subtle bugs (such as bit flips) in implementations of the k-means algorithm. Such errors are not readily detected by standard validation methods such as randomized testing. We also use our algorithm to analyze the robustness of a human detection algorithm built using the OpenCV open-source computer vision library. We show that the human detection implementation can fail to detect humans in perturbed video frames even when the perturbations are so small that the corresponding frames look identical to the naked eye.
UR - http://www.scopus.com/inward/record.url?scp=84973662053&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973662053&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84973662053
T3 - Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016
SP - 786
EP - 791
BT - Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 March 2016 through 18 March 2016
ER -