A data-centric approach towards identification and prediction of anomalies in industrial cyber-physical systems

Abstract

The theory, methodology, industrial use-cases and experimental results presented in this thesis address the need to handle unexpected anomalies in industrial Cyber-Physical Systems (CPS). A common example is an unexpected delay during data processing or machine workflow steps, which results in missed deadlines and affects the correct operation or the yield of the system. It is highly advantageous to detect when the system behaviour leaves the normal state and enters the abnormal domain, and equally so to predict the type of anomaly about to occur by identifying the anomalous behavioural trend. As modern industrial CPS are data-rich ecosystems, we rely on metrics revealing Extra-Functional Behaviour (EFB). To achieve such detection and prediction, our data-centric methodology exploits repetitive patterns: it considers repeated units of execution, i.e., execution phases, to define suitable compartmentalisations of the execution timeline. We provide alternative experimental implementations that differ in their assumptions about system visibility, i.e., information position. Through a combination of fingerprinting constructs, i.e., behavioural signatures and behavioural passports generated from the EFB sensory data, and Artificial Intelligence (AI) models, we achieve anomaly identification accuracies above 99%. We demonstrate how our Classic ML workflow, based on traditional Machine Learning (ML) classifiers, differs from our Advanced DL workflow, based on Deep Learning (DL) with Convolutional Neural Network (CNN) models. Although both workflows classify anomalies with high accuracy, Classic ML is superior in this regard, with 99.23% accuracy against 94.85%.