Error Propagation Analysis for Hybrid Block Diagram and Finite State Machine Models
Final Report Abstract
In this project, we developed a new approach to designing dependable heterogeneous control systems modeled as hybrid Simulink discrete-time Block Diagrams (BD) and Finite State Machines (FSM), and to analyzing their reliability characteristics at the model level under the influence of random hardware faults. The project made three key scientific contributions.
The first contribution is a new reliability evaluation method for control algorithms. Control system engineers commonly use MATLAB Simulink to design complex control systems, so reliability must be addressed at the Simulink level. A given control algorithm can be implemented through different but functionally equivalent combinations of Simulink blocks, and these implementations may exhibit markedly different reliability properties. The method analyzes the assembly code generated from a Simulink model and transforms it into a stochastic Dual-graph Error Propagation Model (DEPM) of data errors in RAM and the CPU. Reliability properties are computed numerically by automatically transforming the DEPM into one or several discrete-time Markov chain (DTMC) models and applying state-of-the-art probabilistic model checking techniques. As an illustrative case study, the method was applied to four functionally equivalent Simulink implementations of a PID controller and identified the most reliable implementation, i.e., the one least vulnerable to data errors.
The second contribution is a set of fault-tolerant design patterns that can be applied at the control algorithm level. A comprehensive analysis of common dependability-oriented hardware and software fault-tolerant designs revealed several implementation-independent conceptual similarities, which we formalized into design patterns. We introduced three basic design patterns: comparison, voting, and sparing. These basic patterns can be combined to form more complex and efficient structures, and we demonstrated that well-known dependability-oriented hardware and software fault-tolerant architectures follow them. The patterns were evaluated at the Simulink level. Based on them, we introduced MORE, a MOdel-based REdundancy approach to achieving fault tolerance. MORE can be applied in the model-based design phase to tolerate soft errors caused by hardware-related faults, and it is independent of any development tool-chain or target hardware platform. It helps control system engineers protect vulnerable parts of a model directly with fault tolerance mechanisms at the model level.
The third contribution is a computationally efficient method for the reliability evaluation of control systems at the model level. The method evaluates the reliability of Simulink models under data errors in RAM and CPU registers early in the development process. The reliability properties of each individual Simulink block are first evaluated by probabilistic modeling of data errors at the assembly level and stored in a database. These results are then applied at the Simulink model level: the Simulink model is transformed into a model-level DEPM that takes into account the block functions (output and update), their execution sequence, and the data flow structure. The effectiveness of the approach was verified experimentally by evaluating the reliability of Simulink models at both the assembly level and the model level. The experimental results indicate that the reliability properties evaluated at the model level are almost equivalent to those evaluated at the assembly level, while the model-level analysis is much more computationally efficient: the DTMCs generated for the model-level assessment were significantly smaller than those generated for the assembly-level assessment. The model-level assessment thus allows control system engineers to make design decisions and assess reliability at the model level, where model-based design is applied.
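To make the DTMC-based evaluation step more concrete, the following is a minimal Python sketch of how a failure probability can be read off a discrete-time Markov chain. It is not the project's DEPM tooling: the three states, the transition probabilities, and the names `CHAIN`, `step`, and `bounded_reachability` are illustrative assumptions. The sketch computes the probability that a data error reaches the output within a given number of execution steps, which corresponds to a bounded reachability query in probabilistic model checking.

```python
from typing import Dict

Chain = Dict[str, Dict[str, float]]

# Hypothetical chain for one execution step of a control loop.
# All states and probabilities are made-up illustration values.
CHAIN: Chain = {
    "ok": {"ok": 0.999, "error": 0.001},                    # a data error occurs
    "error": {"ok": 0.6, "error": 0.3, "failure": 0.1},     # masked / persists / reaches output
    "failure": {"failure": 1.0},                            # absorbing failure state
}


def validate(chain: Chain) -> None:
    """Check that every row of the transition matrix sums to 1."""
    for state, row in chain.items():
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError(f"outgoing probabilities of {state!r} do not sum to 1")


def step(chain: Chain, dist: Dict[str, float]) -> Dict[str, float]:
    """Advance a probability distribution over states by one transition."""
    nxt = {state: 0.0 for state in chain}
    for state, prob in dist.items():
        for successor, p in chain[state].items():
            nxt[successor] += prob * p
    return nxt


def bounded_reachability(chain: Chain, start: str, target: str, steps: int) -> float:
    """Probability of reaching `target` from `start` within `steps` transitions.

    Reading the result off the distribution is only valid when `target`
    is absorbing, so that is checked explicitly.
    """
    validate(chain)
    if chain[target] != {target: 1.0}:
        raise ValueError("target state must be absorbing")
    dist = {state: 0.0 for state in chain}
    dist[start] = 1.0
    for _ in range(steps):
        dist = step(chain, dist)
    return dist[target]


if __name__ == "__main__":
    for k in (1, 2, 100, 10_000):
        print(k, bounded_reachability(CHAIN, "ok", "failure", k))
```

The cost of each step grows with the number of states and transitions, which is why a smaller generated DTMC translates directly into a cheaper analysis. A production setup would hand the chain to a probabilistic model checker rather than iterate the distribution by hand.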
Publications
- “Automatic transformation of UML system models for model-based error propagation analysis of mechatronic systems,” IFAC-PapersOnLine, vol. 49, no. 21, pp. 439–446, 2016
K. Ding, T. Mutzke, A. Morozov, and K. Janschek
(See online at https://doi.org/10.1016/j.ifacol.2016.10.643)
- “Flight control software failure mitigation: Design optimization for software-implemented fault detectors,” IFAC-PapersOnLine, vol. 49, no. 17, pp. 248–253, 2016
A. Morozov and K. Janschek
(See online at https://doi.org/10.1016/j.ifacol.2016.09.043)
- “Classification of hierarchical fault-tolerant design patterns,” in 2017 IEEE 15th Intl Conf on Dependable, Autonomic and Secure Computing (DASC), pp. 612–619, IEEE, 2017
K. Ding, A. Morozov, and K. Janschek
(See online at https://doi.org/10.1109/DASC-PICom-DataCom-CyberSciTec.2017.108)
- “Test suite prioritization for efficient regression testing of model-based automotive software,” in 2017 International Conference on Software Analysis, Testing and Evolution (SATE), pp. 20–29, IEEE, 2017
A. Morozov, K. Ding, T. Chen, and K. Janschek
(See online at https://doi.org/10.1109/SATE.2017.11)
- “MORE: Model-based redundancy for Simulink,” in International Conference on Computer Safety, Reliability, and Security, pp. 250–264, Springer, 2018
K. Ding, A. Morozov, and K. Janschek
(See online at https://doi.org/10.1007/978-3-319-99130-6_17)
- “Reliability evaluation of functionally equivalent Simulink implementations of a PID controller under silent data corruption,” in 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE), pp. 47–57, IEEE, 2018
K. Ding, A. Morozov, and K. Janschek
(See online at https://doi.org/10.1109/ISSRE.2018.00016)
- “Efficient model-level reliability analysis of Simulink models,” in International Conference on Computer Safety, Reliability, and Security, Springer, 2019
K. Ding, A. Morozov, and K. Janschek
(See online at https://doi.org/10.1007/978-3-030-26601-1_10)
- “OpenErrorPro: A new tool for stochastic model-based reliability and resilience analysis,” in 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE), IEEE, 2019
A. Morozov, K. Ding, M. Steurer, and K. Janschek
(See online at https://doi.org/10.1109/ISSRE.2019.00038)