Runtime Monitoring for Safe Automated Driving Systems

Mass-produced passenger vehicles are one of the greatest inventions of the 20th century that significantly changed human lives. Several safety measures, such as traffic signs, traffic lights, mandatory driver education, seat belts, airbags, and anti-lock braking systems, were introduced throughout the years. Today, a further increase in safety, comfort, and efficiency is being targeted by developing systems with automated driving capabilities. These systems range from those supporting the driver with a particular driving task to those taking all driving responsibilities from the driver (i.e., full driving automation). The development and series production of the former has already been accomplished, whereas reaching full driving automation still presents many challenges.

The main reason is the shift of all driving responsibilities, including the responsibility for the overall vehicle safety, from the human driver to a computer-based system responsible for the automated driving functionality (i.e., the Automated Driving System (ADS)). Such a shift makes the ADS highly safety-critical, and the consensus of cross-domain experts is that there is no “silver bullet” for ensuring the required levels of safety. Instead, a set of complementary safety methods is necessary. In this context, runtime monitoring that continuously verifies the safe operation of the ADS, once deployed on public roads, is a promising complementary approach for ensuring safety. However, the development of a runtime monitoring solution is a challenge on its own. On a conceptual level, the complex and opaque technology used in ADS often makes researchers doubt “what” a runtime monitor should verify and “how” such verification should be performed.

This thesis proposes novel runtime monitoring solutions for verifying the safe operation of ADS. On a conceptual level, a novel Runtime Verification (RV) approach, namely the Safe Driving Envelope-Verification (SDE-V), answers the “what” and “how” of monitoring an ADS. In particular, the SDE-V approach verifies whether the ADS path planner output (i.e., a trajectory) is safe to be executed by the vehicle’s actuators. To perform this verification, the trajectory is checked against the following safety rules: (i) the trajectory shall not lead into a collision with obstacles on the road, and (ii) the trajectory shall not leave the road edge.

Since the beginning of 2014, Ayhan Mehmed has been involved in the three-year European Industrial Doctorate Programme RetNet. Thanks to that programme, he has been (i) an industrial Ph.D. student at the School of Innovation, Design, and Engineering at Mälardalen University, Sweden, and (ii) a Marie Curie researcher at TTTech Computertechnik AG in Vienna, Austria. Since 2017, he has been involved in multiple industrial projects to develop concepts for safety-critical automated driving systems. Today, Ayhan is a Technology Expert at The Autonomous, where he seeks to establish and maintain a global network of technical experts that collaborate on relevant safety challenges from the automated driving field.

Mälardalen University Doctoral Dissertation 324
Runtime Monitoring for Safe Automated Driving Systems, Ayhan Mehmed, 2020
Address: P.O. Box 883, SE-721 23 Västerås, Sweden
Address: P.O. Box 325, SE-631 05 Eskilstuna, Sweden
E-mail: info@mdh.se  Web: www.mdh.se
ISBN 978-91-7485-489-3  ISSN 1651-4238


Abstract

Mass-produced passenger vehicles are one of the greatest inventions of the 20th century that significantly changed human lives. Several safety measures, such as traffic signs, traffic lights, mandatory driver education, seat belts, airbags, and anti-lock braking systems, were introduced throughout the years. Today, a further increase in safety, comfort, and efficiency is being targeted by developing systems with automated driving capabilities. These systems range from those supporting the driver with a particular function (e.g., ensuring the vehicle drives at a constant speed while keeping a safe distance to other road participants) to those taking all driving responsibilities from the driver (i.e., full driving automation). The development and series production of the former has already been accomplished, whereas reaching full driving automation still presents many challenges. The main reason is the shift of all driving responsibilities, including the responsibility for the overall vehicle safety, from the human driver to a computer-based system responsible for the automated driving functionality (i.e., the Automated Driving System (ADS)). Such a shift makes the ADS highly safety-critical, and the consensus of cross-domain experts is that there is no “silver bullet” for ensuring the required levels of safety. Instead, a set of complementary safety methods is necessary. In this context, runtime monitoring that continuously verifies the safe operation of the ADS, once deployed on public roads, is a promising complementary approach for ensuring safety. However, the development of a runtime monitoring solution is a challenge on its own. On a conceptual level, the complex and opaque technology used in ADS often makes researchers doubt “what” a runtime monitor should verify and “how” such verification should be performed.

This thesis proposes novel runtime monitoring solutions for verifying the safe operation of ADS. On a conceptual level, a novel Runtime Verification (RV) approach, namely the Safe Driving Envelope-Verification (SDE-V), answers the “what” and “how” of monitoring an ADS. In particular, the SDE-V approach verifies whether the ADS path planner output (i.e., a trajectory) is safe to be executed by the vehicle’s actuators. To perform this verification, the trajectory is checked against the following safety rules: (i) the trajectory shall not lead into a collision with obstacles on the road, and (ii) the trajectory shall not leave the road edge. Towards realizing the proposed SDE-V concept into an actual solution, additional concepts, methods, and architectural solutions have been developed. Our contributions in this context include: (i) a concept for reducing the false positive rate of SDE-V, (ii) a method for evaluating the quality of runtime monitors by investigating to what extent they can handle faults related to different classes of real accident scenarios, (iii) a modular and scalable fail-operational architecture which enables the integration of multiple RV approaches alongside the SDE-V, (iv) an estimation of a “forecast horizon” to ensure the timely execution of emergency actions upon an ADS failure detection by SDE-V, and (v) an approach to tackle the out-of-sequence measurement problem in sensor fusion-based ADS. A prototype implementation of SDE-V has been realized on an automotive-grade embedded platform. Based on its promising results, a future industrial implementation project has been initiated.

Sammanfattning

Massproducerade personbilar är en av 1900-talets största uppfinningar som väsentligt förändrade människors liv. Genom åren har flera säkerhetsåtgärder introducerats, såsom trafikskyltar, trafikljus, obligatorisk förarutbildning, säkerhetsbälten, krockkuddar och låsningsfria bromsar. Införandet av automatiserade körfunktioner lovar ytterligare ökad passagerarsäkerhet, körkomfort och effektivitet. Dessa system sträcker sig från (i) de som stöder föraren med en viss funktion (t.ex. att fordonet körs med konstant hastighet samtidigt som de håller ett säkert avstånd till andra trafikanter) till (ii) att ta över allt köransvar från föraren (dvs. en självkörande bil). Adaptiva farthållare finns redan i många bilar, medan det fortfarande återstår många utmaningar innan självkörande bilar kan introduceras på marknaden. Den främsta utmaningen är relaterad till överföringen av allt köransvar, inklusive ansvaret för den totala fordonssäkerheten, från den mänskliga föraren till ett datorbaserat system som ansvarar för den autonoma körfunktionen (systemet som hanterar detta kallas Automated Driving System (ADS)). Ett ADS är i högsta grad säkerhetskritiskt, och experter inom flera discipliner är överens om att det inte finns någon enkel lösning för att säkerställa erforderlig säkerhet. I stället är det nödvändigt med en uppsättning kompletterande säkerhetsmetoder. Ett lovande kompletterande tillvägagångssätt i detta sammanhang är övervakning under körning som kontinuerligt verifierar säker drift av ADS för att säkerställa att säkerheten bibehålls. Utvecklingen av en sådan lösning är dock en utmaning i sig. På begreppsnivå ställer den komplexa teknik som används i ADS forskare inför frågan ”vad behöver verifieras och hur ska verifieringen gå till?”.

Denna avhandling föreslår nya lösningar för övervakning under körning för att verifiera säker drift av ADS. På en konceptuell nivå svarar en ny övervakningsmetod – en så kallad runtime verification (RV)-metod – nämligen Safe Driving Envelope-Verification (SDE-V), på ”vad” som behöver övervakas och ”hur” övervakning av en ADS kan genomföras. I synnerhet verifierar SDE-V-metoden om ADS planering av fordonets väg (bana) är säker att köra. För att utföra denna verifiering sker kontroll mot följande säkerhetsregler: (i) bana som inte leder till kollision med hinder på vägen och (ii) bana som inte lämnar vägbanan. För att förverkliga det föreslagna SDE-V-konceptet utvecklades ytterligare koncept, metoder och arkitekturella lösningar. Våra huvudsakliga bidrag utgörs av: (i) ett koncept för att minska felaktiga slutsatser i SDE-V-konceptet, (ii) en metod för utvärdering av hur väl olika metoder för övervakning under körning fungerar, utgående från ett antal scenarier baserade på verkliga olyckor, (iii) en modulär och skalbar feltolerant (fail-operational) arkitektur som möjliggör integrering av flera RV-tillvägagångssätt tillsammans med SDE-V, (iv) uppskattning av den så kallade ”prognoshorisonten” för att säkerställa tillräckligt snabb aktivering av nödåtgärder när SDE-V detekterar en kritisk situation, och (v) en metod som hanterar problem med ordningen av händelser i ADS som använder sig av aggregering av data från flera givare (sensor fusion). En prototypimplementering av SDE-V på en industriell fordonsplattform har givit så lovande resultat att ett industriellt implementeringsprojekt initierats.

In loving memory of my grandparents, Kerime and Mehmed.


Acknowledgements

Special gratitude to Michael Scopchanov, who back in 2014 was the person that motivated and supported me to start this journey. An outstanding teacher and friend - I hope to make you proud of this work. Throughout the years, I had the fortune of meeting and working with truly inspirational people and good friends. Special thanks to my supervisors Wilfried Steiner, Sasikumar Punnekkat, Aida Čaušević, and Hans Hansson for believing in me and guiding me in my research and life matters. Many thanks to my project mates that became truly good friends: Elena Lisova, Marina Gutiérrez, Francisco Pozo, and Pablo Gutiérrez Peón. The memories we made together were fantastic: Elena and the “Crazy Shark”, Francisco and his experiences in Floridsdorf, Marina and her life advice, and sharing a house with Pablo not one but three times! Furthermore, I’d like to say thank you to Moritz Antlanger, Christoph Schulze, Mališa Marijan, and Velibor Ilić for working together on the Safety Co-pilot project - thank you for the hard work and the pleasant times together. Special thanks to Salvador Rodriguez Lopez for supporting my one-year study leave, which helped make significant progress on this thesis work. I am also thankful to the people at MDH for always being friendly and welcoming. With the hope not to miss someone, I’d like to say thank you to Simin Cai, Sara Afshar, Mohammad Ashjaei, Svetlana Girs, Saad Mubeen, Matthias Becker, Mirgita Frasheri, Per Hellström, Alessio Bucaioni, Leo Hatvani, and Nesredin Mahmud. You folks made such a positive impact and warmed me with your humanity in these cold winter days. To the Jungschar friends from Vienna - Alex, Berni, Danny, Ei, Ines, Kathrin, Lena, Lisa, Thomas - thank you for the positive memories in the past years; apart from the joy, they gave me the energy to come back to the thesis work with a fresh mind.

To my parents Gyulnaz and Ilhan, who unconditionally dedicated their lives to teaching me about life and the quality of being humane: I love you. To the best sister in the world, Aylya Mehmed. Thank you as well to my second family in Vienna - Monika and Walter Denk, Kathi Seiler, and Jakob Denk - for the acceptance, support, and love you gave me. To the friends that endured throughout the time: Ercan Ahmed, Vladislav Kovachev, Ervin Çetiner, and Hristo Hristoskov. Last but not least, to my partner in life and wife, Angelika. The amount of love and support you gave me throughout the years cannot fit into these pages. Thank you for all the understanding you showed when I had to prioritize this thesis work over our family! Thank you for being a truly remarkable mother of our two little boys, Noel and Luis. I love you!

Ayhan Mehmed
November 2020, Vienna, Austria

The work in this thesis has been conducted within the RetNet Project supported by the Marie Skłodowska-Curie actions under the Seventh Framework Programme.

List of publications

Publications included in the thesis

Paper A: System Architecture and Application-Specific Verification Method for Fault-Tolerant Automated Driving System, Ayhan Mehmed, Wilfried Steiner, Moritz Antlanger, Sasikumar Punnekkat. Published in Proceedings of the 30th IEEE Intelligent Vehicles Symposium, Workshop on Ensuring and Validating Safety for Automated Vehicles (EVSAV), Paris, France, June 2019.

Paper B: The Monitor as Key Architecture Element for Safe Self-Driving Cars, Ayhan Mehmed, Moritz Antlanger, Wilfried Steiner. Published in Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S), Valencia, Spain, June 2020.

Paper C: Systematic False Positive Mitigation in Safe Automated Driving Systems, Ayhan Mehmed, Wilfried Steiner, Aida Čaušević. Accepted in Proceedings of the International IEEE Symposium on Industrial Electronics and Applications (INDEL), Banja Luka, Bosnia and Herzegovina, November 2020.

Paper D: Early Concept Evaluation of a Runtime Monitoring Approach for Safe Automated Driving, Ayhan Mehmed, Aida Čaušević, Wilfried Steiner, Sasikumar Punnekkat. Submitted to Proceedings of the 93rd IEEE International Conference on Vehicular Technology (VTC), Technical Track on Vehicle Cooperation and Control, Assisted and Autonomous Driving, Helsinki, Finland, April 2021.

The included publications are reformatted to comply with the thesis printing format.

Paper E: Forecast Horizon for Automated Safety Actions in Automated Driving Systems, Ayhan Mehmed, Moritz Antlanger, Wilfried Steiner, Sasikumar Punnekkat. Published in Proceedings of the 38th International Conference on Computer Safety, Reliability and Security (SafeComp), Turku, Finland, September 2019.

Paper F: Deterministic Ethernet: Addressing the Challenges of Asynchronous Sensing in Sensor Fusion Systems, Ayhan Mehmed, Sasikumar Punnekkat, Wilfried Steiner. Published in Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Workshop on Safety and Security of Intelligent Vehicles (SSIV), Denver, CO, USA, June 2017.

Additional publications not included in the thesis

Paper G: Development of Sensor Fusion Based ADAS Modules in Virtual Environments, Velibor Ilić, Mališa Marijan, Ayhan Mehmed, and Moritz Antlanger. Published in Proceedings of the International Conference on Zooming Innovation in Consumer Technologies (ZINC), Novi Sad, Serbia, May 2018. IEEE Computer Society.

Paper H: Next Generation Real-Time Networks Based on IT Technologies, Wilfried Steiner, Pablo Gutiérrez Peón, Marina Gutiérrez, Ayhan Mehmed, Guillermo Rodriguez-Navas, Elena Lisova, Francisco Pozo. Published in Proceedings of the International Conference on Emerging Technologies and Factory Automation (ETFA), Berlin, Germany, September 2016. IEEE Computer Society.

Paper I: Improving Dependability of Vision-Based Advanced Driver Assistance Systems Using Navigation Data and Checkpoint Recognition, Ayhan Mehmed, Sasikumar Punnekkat, Wilfried Steiner, Giacomo Spampinato, and Martin Lettner. Published in Proceedings of the International Conference on Computer Safety, Reliability and Security (SafeComp), Delft, The Netherlands, September 2015. Springer, LNCS.

Paper J: Improving Intelligent Vehicle Dependability by Means of Infrastructure-Induced Tests, Wilfried Steiner, Ayhan Mehmed, Sasikumar Punnekkat. Published in Proceedings of the International Conference on Dependable Systems and Networks (DSN), Workshop on Safety and Security of Intelligent Vehicles (SSIV), Rio de Janeiro, Brazil, June 2015. IEEE Computer Society.

The publications are listed in reverse chronological order.


Contents

I Thesis

1 Introduction
  1.1 Thesis Focus
  1.2 Overview of contributions
  1.3 Thesis outline

2 Background and Motivation
  2.1 Automated Driving Systems
    2.1.1 Levels of Driving Automation
    2.1.2 Prioritization Function
  2.2 Runtime Monitoring
    2.2.1 Types of Runtime Monitors
    2.2.2 Conditions of Runtime Monitor Verification Outcome
    2.2.3 Performance Measures of a Runtime Monitor
    2.2.4 Receiver Operating Characteristic Curve
    2.2.5 Balancing Decision Threshold
    2.2.6 Forecast Horizon

3 Research Methodology

4 Research Problems & Goals
  4.1 RP A: The Verification Approach
  4.2 RP B: Achieving High Hit-Rate Over Entire Failure Population
    4.2.1 TPR Requirements for ADS with a Fallback Driver
    4.2.2 TPR Requirements for ADS without a Fallback Driver
  4.3 RP C: Reducing False Alarms
  4.4 RP D: Performing Accurate TPR Measurements
  4.5 RP E: Requirements for Scalable Design
  4.6 RP F: Estimating the Forecast Horizon
  4.7 RP G: The Out-of-Sequence Measurements Problem
  4.8 Research Goals

5 Thesis Contributions
  5.1 Design-Science Research Contributions
    5.1.1 C1: The Safe Driving Envelope Verification Concept
    5.1.2 C2: The Congruency Exchange Concept
    5.1.3 C3: An Approach for Early TPR Evaluation
    5.1.4 C4: The Verification Modules
    5.1.5 C5: Estimating Forecast Horizon
    5.1.6 C6: Ensuring Precise Sensor Measurements Timestamping
  5.2 Overview of Included Papers
    5.2.1 Paper A: System Architecture and Application-Specific Verification Method for Fault-Tolerant Automated Driving System
    5.2.2 Paper B: The Monitor as Key Architecture Element for Safe Self-Driving Cars
    5.2.3 Paper C: Systematic False Positive Mitigation in Safe Automated Driving Systems
    5.2.4 Paper D: Early Concept Evaluation of a Runtime Monitoring Approach for Safe Automated Driving
    5.2.5 Paper E: Forecast Horizon for Automated Safety Actions in Automated Driving Systems
    5.2.6 Paper F: Deterministic Ethernet: Addressing the Challenges of Asynchronous Sensing in Sensor Fusion Systems

6 Related Work
  6.1 Step 1: Identifying Research Questions
  6.2 Step 2: Defining Search String
  6.3 Step 3: Paper Collection
  6.4 Step 4: Paper Selection
    6.4.1 Step 4.1: Abstract screening
    6.4.2 Step 4.2: Classification by domain
    6.4.3 Step 4.3: Classification by sub-domain
    6.4.4 Step 4.4: Full text reading
  6.5 Step 5: Related Work Analyses
    6.5.1 Technical University of Munich (TUM)
    6.5.2 Intel
    6.5.3 Others

7 Evaluation
  7.1 Evaluation of SDE-V
  7.2 Evaluation of Related Work
  7.3 Feasibility study

8 Conclusions and Future Work
  8.1 Conclusions
  8.2 Future Work

Bibliography

II Included Papers

9 Paper A: System Architecture and Application-Specific Verification Method for Fault-Tolerant Automated Driving System
  9.1 Introduction
  9.2 Problem Statement and Proposed Solution
    9.2.1 Unconditional Prioritization
    9.2.2 The Problem of Replica Determinism
    9.2.3 Proposed System Architecture
  9.3 Safety Co-pilot
    9.3.1 Sensor Fusion
    9.3.2 Verification Modules
    9.3.3 Decision Maker
  9.4 Vehicle Collision Verification
    9.4.1 Safe Driving Envelope Assembler
    9.4.2 Safe Driving Envelope Verifier
  9.5 Experimental Results
  9.6 Conclusions
  Bibliography

10 Paper B: The Monitor as Key Architecture Element for Safe Self-Driving Cars
  10.1 Introduction
  10.2 Fail-operational Design Paradigms
    10.2.1 Triple Modular Redundancy (TMR) ADS
    10.2.2 Commander/Monitor (Com/Mon) ADS with Fallback
  10.3 Monitor Design Aspects
    10.3.1 Is a monitor without ML possible, or how can we minimize ML?
    10.3.2 Can the monitor share sensors with the commander and/or the fallback?
    10.3.3 What coupling and information exchange between the monitor and other architectural elements is justifiable?
    10.3.4 Further challenges in monitor design for safe self-driving cars
  10.4 Conclusions
  Bibliography

11 Paper C: Systematic False Positive Mitigation in Safe Automated Driving Systems
  11.1 Introduction
  11.2 Background
    11.2.1 General ADS Architecture
    11.2.2 General ADS Failure Assumptions
    11.2.3 General ADS Fail-Silent Architecture
  11.3 Problem statement
  11.4 Related Work
  11.5 Proposed solution
    11.5.1 Merging Process
    11.5.2 Trajectory planning based on merged free space
    11.5.3 Trajectory verification
    11.5.4 Trajectory and Free Space Formalization
  11.6 Formal Verification
  11.7 Conclusions
  Bibliography

12 Paper D: Early Concept Evaluation of a Runtime Monitoring Approach for Safe Automated Driving
  12.1 Introduction
  12.2 The Runtime Monitoring Approach
  12.3 Runtime Monitor Performance Metrics
    12.3.1 Runtime Monitor Verification Outcomes
    12.3.2 Performance Measures of a Runtime Monitor
    12.3.3 The Importance of a High TPR Rate
  12.4 Problem statement
  12.5 Proposed Solution
    12.5.1 Viewpoint
    12.5.2 Assumptions
    12.5.3 Proposed method
  12.6 Evaluation
    12.6.1 SDE-V Evaluation
    12.6.2 Evaluation of other runtime monitoring approaches
  12.7 Conclusions
  Bibliography

13 Paper E: Forecast Horizon for Automated Safety Actions in Automated Driving Systems
  13.1 Introduction
  13.2 Problem Statement
  13.3 Parameters Influencing the Forecast Horizon
    13.3.1 The FTTI Concept and Proposed Extension
    13.3.2 Fault Detection Time
    13.3.3 Fault Reaction Time
    13.3.4 Automated Safety Action
    13.3.5 Safety Margin
  13.4 Complexity Reduction
    13.4.1 Specific Relevant Scenarios
    13.4.2 Safe States and Strategies for the Specific Relevant Scenarios
    13.4.3 Road Conditions and Speed Limits
  13.5 Estimation of the Forecast Horizon
    13.5.1 Scenario 1
    13.5.2 Scenario 2
  13.6 Conclusions
  Bibliography

14 Paper F: Deterministic Ethernet: Addressing the Challenges of Asynchronous Sensing in Sensor Fusion Systems
  14.1 Introduction
  14.2 The OOSM Problem
    14.2.1 The Kalman Filter
    14.2.2 The Cause of the OOSM
    14.2.3 The Effect of OOSM
    14.2.4 Solutions Dealing with OOSM and their Drawbacks
  14.3 Methods for Measurement Timestamping
    14.3.1 Timestamping Data at Arrival (Centralized Method)
    14.3.2 Triggering Method
    14.3.3 Timestamping at the Time of Acquisition (Distributed Method)
  14.4 The Use of Deterministic Ethernet Networks for Precise Timestamping
    14.4.1 System-wide Synchronized Time
    14.4.2 Low Latency and Jitter
    14.4.3 Suitability of the Standards for Achieving Precise Measurement Timestamps
  14.5 Conclusion and Future Work
  Bibliography

Appendices
  A List of Abbreviations
  B NHTSA Dataset

I Thesis


Chapter 1

Introduction

Safety is of utmost importance when developing Automotive Embedded Systems (AES). Today, there are well-established standards and methods for ensuring the safety of classic AES, such as the Anti-Lock Braking System (ABS), Electronic Stability Control (ESC), and similar. Generally speaking, at the development phase, relevant safety standards such as ISO 26262 and IEC 61508 are followed to ensure that a specific safety grade is reached (Figure 1.1). Once the AES is deployed on public roads, two common approaches are used. Firstly, classic safety mechanisms like Built-In Self-Tests (BIST), Cyclic Redundancy Checks (CRC), and watchdog timers are implemented to ensure the detection and mitigation of failures at runtime. Secondly, maintenance plans are carried out at specific time intervals or mileage to avoid failures due to aged components.

Figure 1.1: Overview of approaches for ensuring AES safety.
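To make the contrast with ADS-level monitoring concrete, one of the classic runtime mechanisms mentioned above can be illustrated with a minimal sketch: a CRC appended to a message lets the receiver detect corruption. This is an illustrative example only; the message content and frame layout are invented, and Python's standard-library `zlib.crc32` stands in for whichever CRC variant a real automotive stack would use:

```python
import zlib

def protect(payload: bytes) -> bytes:
    """Append a CRC-32 checksum so the receiver can detect corruption."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")

def verify(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored one."""
    payload, received_crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == received_crc

frame = protect(b"wheel_speed=42.0")
assert verify(frame)                           # intact frame passes
corrupted = b"wheel_speed=43.0" + frame[-4:]   # single-character corruption
assert not verify(corrupted)                   # mismatch is detected
```

Note the limitation this chapter points at: such a check detects transmission or memory corruption, but says nothing about whether a correctly transmitted ADS decision is actually safe.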

(37) 4. Chapter 1. Introduction. While these standards and approaches have worked well to ensure the safety of standard, well-constrained, and deterministic AES, they are certainly insufficient when it comes to the development of future Automated Driving Systems (ADS). The leading reasons for that are the (i) the shift of all safety responsibilities from the driver to the ADS (SAE L3-L5 AD (see Section 2.1.1)), as well as (ii) the novelty and overall complexity of the technology needed for achieving the Automated Driving (AD) functionality. To address this challenge, new standards are currently under development (see Figure 1.1 light gray rectangles). Some of the many standards are ISO/PAS 21448 (SOTIF), ISO/PRF TR 4804 (SaFAD), IEEE P2846, IEEE P2851, SAE J3016, SAE J3018, and UL 4600. Furthermore, new approaches for ensuring the safe operation of ADS during the deployment phase (i.e., when the vehicle is already on public roads) are also emerging. For example, enhanced maintenance methods that allow over-the-air updates once software and algorithmic issues are discovered. Fail-operational safety mechanisms are being implemented to enable the system to continue operating safely on public roads, even under predefined number of failures. Finally, as classic runtime safety mechanisms like BIST and CRC cannot detect all types of ADS failures (see SOTIF performance limitations in Section 2.2), more advanced runtime monitors, tailored for the AD domain, are being developed.. 1.1 Thesis Focus This thesis aims at building a runtime monitoring solution for real-time verification of the safe operation of ADS (see Figure 1.1). In particular, a Type 4 runtime monitor is developed (See Figure 1.2). A Type 4 monitor verifies whether. Figure 1.2: Overview of types of runtime monitors..

the output trajectory of the ADS path planning component is safe to be executed by the vehicle's actuators. The verification is done by checking whether the planned ADS trajectory complies with the following set of safety rules:

• The output of the ADS (e.g., a trajectory that needs to be followed by vehicle actuators) shall not lead the vehicle to a collision with dynamic (e.g., another road participant) or static obstacles on the road.

• The output of the ADS shall not lead into a non-drivable area (e.g., into a pothole, off the road, or off a cliff).

Examples of the other monitor types include monitoring other in-vehicle systems (Type 1), BIST and CRC (Type 2), monitoring the perception part of the ADS (Type 3), and monitoring the actuating part of the system (Type 5). A detailed description is given in Section 2.2.1.

1.2 Overview of contributions

Generally speaking, from idea generation to being deployed in the real world for the intended purpose, a system goes through four main phases (see Figure 1.3). In the Concept Phase, (i) the idea is generated, (ii) the core concept is implemented, and (iii) feasibility studies are performed. Successful completion of the concept phase is followed by the Development Phase, where the system is developed according to the relevant development process (e.g., ISO 26262). Once the system is developed, its production and deployment phases are carried out. The work conducted in this thesis primarily targets the Concept Phase. The contributions have tackled a wide range of research and engineering challenges. At the concept level, a novel Runtime Verification (RV) approach that uses application-specific knowledge to detect violations of the previously introduced safety rules is proposed. The concept is referred to as the Safe Driving Envelope-Verification (SDE-V).

Figure 1.3: System development phases and focus of thesis contributions.
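As a minimal illustration of the kind of rule checking the SDE-V performs, consider the following sketch. All names, the circle/box geometry, and the margin value are simplified placeholders, not the thesis design; a real implementation operates on rich environment models.

```python
# Illustrative sketch of a Type 4 check: a planned trajectory is a list of
# (x, y) waypoints; obstacles are circles (x, y, radius); the drivable area
# is an axis-aligned box. All geometry is a simplified placeholder.

def violates_collision_rule(trajectory, obstacles, safety_margin=0.5):
    """Rule (i): no waypoint may come closer to an obstacle than its radius plus a margin."""
    for x, y in trajectory:
        for ox, oy, radius in obstacles:
            if ((x - ox) ** 2 + (y - oy) ** 2) ** 0.5 < radius + safety_margin:
                return True
    return False

def violates_drivable_area_rule(trajectory, area):
    """Rule (ii): every waypoint must stay inside the drivable area (here a box)."""
    x_min, y_min, x_max, y_max = area
    return any(not (x_min <= x <= x_max and y_min <= y <= y_max)
               for x, y in trajectory)

def trajectory_is_safe(trajectory, obstacles, area):
    return (not violates_collision_rule(trajectory, obstacles)
            and not violates_drivable_area_rule(trajectory, area))

# A straight trajectory on a road segment, once passing next to an obstacle
# and once driving straight into it.
plan = [(i, 0.0) for i in range(10)]
print(trajectory_is_safe(plan, obstacles=[(5.0, 3.0, 1.0)], area=(0, -4, 10, 4)))  # True
print(trajectory_is_safe(plan, obstacles=[(5.0, 0.0, 1.0)], area=(0, -4, 10, 4)))  # False
```

Checking per-waypoint distances against inflated obstacle outlines is one simple way to encode both safety rules; the actual SDE-V verification is considerably more involved.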

Towards the realization of the proposed SDE-V concept into an actual runtime monitoring solution, further contributions have been made. These encompass:

• A novel approach for reducing the false positive rate of the SDE-V.

• An approach for high-level evaluation of runtime monitors' true positive rate at the early concept phase.

• A modular and scalable fail-operational architecture that enables the gradual implementation of other relevant runtime verification approaches alongside the SDE-V approach.

• An approach for estimating the "forecast horizon"1, to ensure the timely execution of emergency actions upon an ADS failure detection by the SDE-V.

• A method for precise sensor measurement timestamping to tackle the problem of out-of-sequence measurements in sensor fusion-based ADS.

1.3 Thesis outline

This thesis is organized into two main parts. The first part provides a summary of the thesis, organized as follows. In Chapter 1, a high-level introduction to ADS, the problem statement, and the motivation for selecting runtime monitoring as a research topic are given. Chapter 2 provides an in-depth background on ADS and runtime monitoring to ensure a common ground when discussing the topics of interest. Chapter 3 presents the research methodology used in conducting the research activities. Chapter 4 defines the research problems and research goals. Chapter 5 contains the contributions of this thesis and provides an overview of the included papers. Chapter 6 surveys the related work on runtime monitoring for ADS. Chapter 7 presents a high-level evaluation of the SDE-V approach and other relevant approaches. A summary of the SDE-V feasibility study is also given in Chapter 7. Chapter 8 concludes part one of the thesis with conclusions and future work. The second part of the thesis consists of a collection of peer-reviewed research publications that encompass all thesis contributions.
1 Forecast horizon defines how much in advance a hazard has to be identified by a runtime monitor in order to ensure the timely execution of an automated safety action (see Section 2.2.6).

Chapter 2. Background and Motivation

2.1 Automated Driving Systems

Vehicles with Automated Driving (AD) capabilities aim at making driving more comfortable, efficient, and ultimately, safer. Generally speaking, the AD functionality is implemented in a computer-based system known as an ADS [1]. Equipped with a variety of sensors (e.g., radar, lidar, camera, and others), the ADS perceives the environment, calculates driving commands (e.g., trajectories) by means of complex sensor data processing, situation analysis, and planning algorithms, and controls the vehicle with actuators (e.g., steering, braking, throttle), as depicted in Figure 2.1.

Figure 2.1: Generalized ADS block diagram.
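The sense-plan-act pipeline of Figure 2.1 can be summarized in a minimal Python skeleton. Every stage below is a deliberately trivial placeholder standing in for the complex perception, planning, and control software of a real ADS; all names are illustrative.

```python
# Minimal sense-plan-act skeleton mirroring Figure 2.1. Each stage is a
# placeholder for the real perception, planning, and actuation components.

def perceive(sensor_readings):
    """Fuse raw sensor data (radar, lidar, camera, ...) into an environment model."""
    return {"obstacles": sensor_readings.get("detections", [])}

def plan(environment, vehicle_state):
    """Situation analysis and planning: produce a trajectory (list of waypoints)."""
    # Trivial stand-in: drive straight ahead from the current position.
    x, y = vehicle_state["position"]
    return [(x + step, y) for step in range(1, 4)]

def actuate(trajectory):
    """Translate the trajectory into steering/braking/throttle commands."""
    return {"steering": 0.0, "throttle": 0.3, "target": trajectory[0]}

state = {"position": (0.0, 0.0)}
env = perceive({"detections": []})
commands = actuate(plan(env, state))
print(commands["target"])  # (1.0, 0.0)
```

The point of the skeleton is only to fix the data flow (sensors to environment model to trajectory to actuator commands) that the rest of the chapter refers to.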

2.1.1 Levels of Driving Automation

Depending on its capabilities, the ADS can support the driving function at different levels of automation (see Figure 2.2). At the lowest level of automation (Level 1 (L1)), the ADS is capable of assisting the driving tasks by taking over (i) the longitudinal motion control (e.g., control of brakes, throttle) of the vehicle or (ii) the lateral motion control (e.g., steering). The first enables features like Adaptive Cruise Control (ACC) and Automated Emergency Braking (AEB), whereas the second enables features like Lane Keep Assistance (LKA). At the intermediate levels of automation (L2, L3), the ADS is capable of taking over both the longitudinal and the lateral control of the vehicle, hence enabling ACC and LKA to be used at the same time. It is essential to understand that L1-L3 AD capable vehicles do still require a human driver to be alert and ready to interfere with the ADS' decisions immediately (for L1, L2) or within a predefined time frame (for L3). In this case, the ADS can be characterized as in Figure 2.3. The human driver is an operator that activates or deactivates the system. Once the activation criteria are met (e.g., the vehicle is in the correct operational design domain), the ADS starts perceiving the environment and the state of the vehicle (e.g., speed, position, orientation, etc.) using a set of sensors (e.g., camera, lidar, radar, inertial measurement unit, etc.). Based on the sensor inputs, the Automated Driving Function (ADF) calculates driving commands (i.e., trajectories) that are then executed by actuators in the vehicle.

Figure 2.2: Levels of driving automation. Based on SAE J3016 [1].

The human driver is also responsible for the backup driving function. Hence, in parallel to the ADS, the human driver perceives the environment and the vehicle's state using human senses. The human driver is ready to interfere with the ADS at any time, e.g., whenever he/she identifies the vehicle operation as erroneous or receives a request for intervention from the ADS. The prioritization function is designed in a way that the driving commands exercised by the human driver (e.g., steering, braking) are always and unconditionally prioritized over the driving commands from the ADF. In such a setup, the human driver is responsible for the overall vehicle safety, thus significantly reducing the safety requirements on the ADS.

Figure 2.3: ADS with a human in charge of the backup driving function.
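The unconditional prioritization described above reduces to a one-line arbitration rule. The sketch below is hypothetical (the command structures and function name are illustrative, not taken from the thesis):

```python
# Sketch of an L1-L3 prioritization function: any driver input unconditionally
# overrides the commands of the Automated Driving Function (ADF).

def prioritize(driver_commands, adf_commands):
    """Driver input, when present, always wins over the ADF's commands."""
    return driver_commands if driver_commands is not None else adf_commands

print(prioritize(None, {"steering": 0.1}))             # ADF drives
print(prioritize({"braking": 1.0}, {"steering": 0.1})) # driver overrides
```

Contrast this with the dual standby architecture of Section 2.1.2, where no such unconditional rule is possible and a runtime monitor must decide which redundant ADF to trust.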

Ultimately, in the race for higher levels of AD capabilities (L4, L5 in Figure 2.2), all driving responsibilities will be shifted to the ADS, and the driver will no longer be responsible for the backup driving function. Such ADSs have the characteristics depicted in Figure 2.4. In this case, the ADS is responsible for performing (i) the ADF, now called the Primary ADF, (ii) the backup driving function, also known as the Backup ADF, and (iii) the prioritization function. In this setup, the overall responsibility for automated vehicle safety lies with the ADS. The shift of responsibility significantly increases the safety requirements towards the ADS, requiring a safety level comparable to that of an aircraft system (e.g., about 1 billion operating hours per catastrophic event [2]). Developing a system to this safety level is a complex task that requires numerous cross-domain challenges to be addressed. For example, traditional approaches for ensuring safety require (i) diverse implementation of the ADS components and (ii) the use of fault-tolerance techniques, which are relatively unproven, particularly in the AD domain. We elaborate in more detail on these problems in the next section.

Figure 2.4: ADS without a human in charge of the backup driving function.

2.1.2 Prioritization Function

In contrast to a human backup driver, a Backup ADF shall not be prioritized unconditionally, as it can also be in an unsafe state due to common cause failures. That is, both the Primary and the Backup ADF may fail due to a single internal or external root cause and generate driving commands leading to an accident. Figure 2.5 illustrates an example of a common cause failure, where, due to foggy weather, both the Primary and the Backup ADF do not recognize a static object (e.g., a rock) on the road, which then leads to the generation of driving commands (e.g., trajectories) colliding with the object. To avoid such hazardous scenarios, safety experts follow two conventional approaches during system development. Firstly, the system is designed considering the diversity of the redundant components to avoid common cause failures. In the context of ADS, diversity between the Primary and Backup ADF can be achieved by developing these systems by different teams or companies, with different sensors, algorithms, hardware, or different programming languages. Secondly, instead of simply prioritizing the Backup ADF in case of a Primary ADF failure, a more intelligent prioritization function is used to manage their coordinated operation. Such functionality can be achieved using fault-tolerance techniques. Fault-tolerance techniques use some form of redundancy in the system (e.g., hardware, software) and fault detection techniques to ensure that the system's output commands are safe. Common fault-tolerant architectures are the self-checking pair and triple modular redundancy (see Figure 2.6).

Figure 2.5: Example of a common cause failure.

Figure 2.6: Common fault-tolerant architectures.

To detect faults in the overall system, both architectures in Figure 2.6 rely on differences between the redundant components' outputs. For example, in the case of the self-checking pair architecture (Figure 2.6 (a)), the outputs of the redundant components (System 1, System 2) are given to the comparator, which compares whether the signals are identical. The comparator requires that the outputs of System 1 and System 2 are identical (with some tolerance) before any output is sent to the actuators. In the case of triple modular redundancy (Figure 2.6 (b)), a voting block (i.e., a voter) collects the outputs from the three redundant components (System 1, System 2, and System 3) and compares whether they are identical. The voter requires at least two out of the three systems to produce the same output before any output is sent to the actuators. It is important to note that both comparison and voting approaches work under the assumption that all redundant components show consistent outputs in the absence of faults, also known as replica determinism [3]. In the context of AD and the system described in Figure 2.4, replica determinism can, in general, be expressed as: "The Primary and Backup ADF shall output consistent driving commands that safely maneuver the vehicle on the road". However, achieving such consistent driving commands is not so simple. The rationale for this is the requirement for diversity between the redundant ADS components (i.e., the Primary and Backup ADFs) to cope with common cause failures in the first place. As the Primary and Backup ADF are implemented with diversity in mind, they will not produce the same output (e.g., trajectories) in the same driving scenarios - a problem known as replica indeterminism. The resulting problems of this inconsistency are illustrated in Figure 2.7 (a), (b), and (c).
Figure 2.7 (a) illustrates a case where the Primary and Backup ADFs generate two correct but differing results. Because of that, a simple comparison of the Primary and Backup ADFs' outputs (e.g., in a self-checking pair) will falsely conclude that a fault exists (i.e., a false positive).
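A tolerance-based comparator of a self-checking pair can be sketched as follows. Even with a tolerance band, two correct but diverse trajectories may differ enough to be flagged, producing exactly the false positive of Figure 2.7 (a). All values and the deviation metric are illustrative assumptions.

```python
# Tolerance-based comparator of a self-checking pair: trajectories are lists
# of (x, y) waypoints of equal length; a deviation beyond the tolerance at
# any waypoint is reported as a suspected fault.

def comparator(traj_1, traj_2, tolerance=0.5):
    """Return True (fault suspected) if any waypoint pair deviates beyond the tolerance."""
    return any(abs(x1 - x2) + abs(y1 - y2) > tolerance
               for (x1, y1), (x2, y2) in zip(traj_1, traj_2))

# Two *correct* ways to pass an obstacle: the Primary swerves left, the Backup right.
primary = [(0, 0), (1, 1.0), (2, 0)]
backup  = [(0, 0), (1, -1.0), (2, 0)]
print(comparator(primary, backup))  # True -> fault reported although both are safe
```

Widening the tolerance reduces such false positives but also lets genuinely diverging (possibly unsafe) outputs pass, which is the core dilemma of comparison-based fault detection under replica indeterminism.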

Figure 2.7: Scenarios of inconsistent Primary and Backup ADF outputs. (a) Inconsistent outputs: both correct; (b) inconsistent outputs: correct and incorrect; (c) inconsistent outputs: all three correct.

Furthermore, the comparison approach will likewise not be able to conclude which ADF is faulty (see Figure 2.7 (b)). It is a similar matter with fault-tolerant approaches that use majority voting (e.g., triple modular redundancy). As all three (Primary ADF, Backup ADF 1, and Backup ADF 2) are likely to generate different outputs (see Figure 2.7 (c)), no consensus will be reached. Inexact voting or approximate agreement algorithms can theoretically solve the replica indeterminism problem. However, these algorithms are proven to be "more complicated due to intransitivity of approximate equality" [4]. One way to address the replica indeterminism problem in fault-tolerant computing is to use correctness checks for fault detection instead of comparison or voting. The dual standby fault-tolerant architecture is one example application of the correctness checks (see Figure 2.8).

Figure 2.8: General dual standby fault-tolerant architecture.

In a typical dual standby

fault-tolerant architecture, one of the redundant systems is assigned to be the Primary (System 1) and the other the Backup (System 2). The architecture has a runtime monitor block that verifies the correctness of the Primary system based on application-specific knowledge about how the system may fail. The result of the verification is then given to the switch. If the Primary system is identified to be in a failure state, the switch engages the Backup system, and thus the Backup system's commands are forwarded to the actuators. Figure 2.9 depicts an example of adopting the dual standby fault-tolerant architecture to the ADS architecture described earlier in Figure 2.4. The runtime monitor verifies the correctness of the Primary ADF. Based on the verification results, the prioritization function (equivalent to the switch) prioritizes the Primary or Backup ADF driving commands. If the verification has been successful (i.e., the Primary ADF is concluded to be correct), the prioritization function forwards the Primary ADF driving commands to the actuators. In case the Primary ADF is verified to be faulty, the prioritization function switches to the Backup ADF and forwards its driving commands to the actuators.

Figure 2.9: Dual standby fault-tolerant ADS architecture with RM.

Moreover, to

avoid switching to a faulty backup system, the runtime monitor can also verify the Backup ADF for latent failures. Such an architecture solves the problem of unconditional prioritization of the Backup ADF, as well as the problem of replica indeterminism. However, it also opens new challenges for scientists and practitioners. For example, one of the key elements responsible for safety in such an architecture, namely the runtime monitor, has yet to be developed for the use case of AD. We look into these challenges in the next section.

2.2 Runtime Monitoring

A runtime monitor is a hardware and software item that continuously executes an RV approach to verify the current execution of the ADS against given correctness properties. Classic runtime monitors like BIST, CRC, and watchdog timers are well-known and commonly used measures for detecting functional failures of the Electrical/Electronic (E/E) system. However, they are not proven to detect the other failures an ADS can experience: failures caused by performance limitations or security attacks (see Figure 2.10). An unsafe ADS operation may occur due to E/E malfunctions (Figure 2.10, left-hand side) [5]. These malfunctions can be classified into systematic and random failures. Systematic failures are caused by faults introduced into the system during the design, development, or manufacturing of an item and happen in a deterministic way. One such example is an incorrectly calibrated stereo camera, which leads to wrong distance measurements to the surrounding objects and, therefore, to possibly wrong driving decisions.

Figure 2.10: Failures leading to unsafe ADS operation.

On the other hand,

random failures can occur in a non-deterministic manner during the lifetime of the hardware. Such failures usually occur due to physical causes, i.e., a correctly calibrated stereo camera may temporarily or permanently lose calibration after the vehicle passes through a bump on the road. Moreover, systems that rely on sensing the external environment can experience unsafe operation even in the absence of E/E system malfunctions, due to insufficiencies in the implementation of the intended functionality (i.e., performance limitations) (Figure 2.10, middle). Such performance limitations can be due to "the inability of a function to correctly comprehend the situation and operate safely" [6]. An AI-based path planning algorithm, for instance, can incorrectly comprehend a complicated new road situation and initiate unsafe actuation. Performance limitations also include "insufficient robustness of the function with respect to sensor input variations or diverse environmental conditions" [6]. For instance, a camera-based automated driving system used for lane keeping can incorrectly determine the lane due to adverse illumination conditions (e.g., direct sunlight, weather conditions, light reflections, glares, artificial light) and therefore fail in keeping the vehicle on the road. Finally, considering the high connectivity of future automated vehicles (e.g., V2X), unsafe operation of the vehicle may occur due to cyber-security attacks on the ADS (see Figure 2.10, right-hand side). Invasive attacks, such as code modification, code injection, packet sniffing, packet fuzzing, and in-vehicle spoofing, can result in an abnormal and unsafe operation of the ADS [7]. Because of the high and diverse number of failures an ADS can experience, more advanced runtime monitoring solutions are necessary: i.e., runtime monitoring solutions particularly developed for the AD use case.
However, the development of such a runtime monitoring solution is not a challenge that is easily tackled.

• To begin with, a good understanding of the system is necessary to be able to develop a runtime monitor that detects system failures.

• Furthermore, the verification approach executed in the runtime monitor needs (i) to run in real-time (e.g., with an execution cycle in the range of milliseconds), as well as (ii) not to add significantly to the overall AD system's end-to-end latency (e.g., from sensing to actuating).

• Finally, the integrity of the runtime monitor is essential for ensuring automated vehicle safety. Consider a case where the runtime monitor does not detect a failure in the Primary ADF (i.e., a false negative) and thus misleads

the prioritization function to forward the unsafe Primary ADF driving commands to the actuators. We provide details on this failure and other types of failure in Section 2.2.2.

2.2.1 Types of Runtime Monitors

In this section, we classify runtime monitors into five main categories, as shown earlier in Figure 1.2. Below is a high-level summary.

• Type 1: A modern vehicle consists of one to a few dozen systems that range from the Engine Control System (ECS) to ABS, and others. Each of these systems implements certain safety monitors (i.e., runtime monitors) that continuously monitor relevant system parameters. An example safety monitor for the ECS could be monitoring the engine's water cooling system temperature. Once the temperature reaches a certain upper bound, actions like turning on the engine's ventilator or informing the driver via the dashboard are taken.

• Type 2: Monitors such as BIST, CRC, watchdog timers, and resource management verify the overall health of the system. For example, a watchdog timer can generate a system reset if the ADS neglects to service it (e.g., reset a timer). Such a simple runtime monitor can resolve various software and hardware faults that cause a system to hang.

• Type 3: This type of runtime monitor targets detecting failures of the perception part of the ADS. This encompasses the detection of sensor failures and failures on the algorithmic side (e.g., sensor fusion, AI algorithms). An example of this type of monitor would be injecting "virtual objects" through a sensor interface and verifying whether the perception system detects the objects - hence verifying whether the system is responsive overall.

• Type 4: Runtime monitors of this type typically verify whether the output of the path planner (e.g., a trajectory that needs to be followed by vehicle actuators) complies with specific safety requirements.
For example, for a lane-keeping function, a safety requirement can be "the generated trajectory keeps the vehicle within its lane boundaries plus some safety margin" [8].

• Type 5: Such monitors typically check for (i) the overall health of the actuators or (ii) unsafe actuation. An example of the former is

verifying that the hydraulic braking system's pressure is in the desired range. An example of the latter is continuously checking that actuators do not over-actuate, e.g., sudden steering commands and similar.

2.2.2 Conditions of Runtime Monitor Verification Outcome

Figure 2.11 illustrates two rectangles representing datasets with condition positive (P) cases (i.e., ADS at fault) and condition negative (N) cases (i.e., ADS not at fault)1. Furthermore, Figure 2.11 summarizes the four conditions for any given verification outcome of a runtime monitor. There are two types of correct conditions a runtime monitor can experience, i.e., true positive and true negative. A true positive is a positive detection of a fault in an incorrectly operating system, i.e., when a runtime monitor correctly identifies that a faulty ADS is at fault.

Figure 2.11: State space of runtime verification outputs.

1 The dataset of ADS faults is typically a list of known failures made by the development team and often does not represent the entire fault/failure population of ADS - an issue we will address in Research Problem B in Section 4.2.

A true negative is a negative detection of a fault in a correctly operating system, i.e., when a runtime monitor correctly identifies that a non-faulty ADS is not at fault. Furthermore, there are two types of error conditions a runtime monitor can experience: false positive (type I error) and false negative (type II error). A false positive is a positive detection of a fault in a correctly operating system, i.e., when a runtime monitor falsely concludes that a non-faulty ADS is at fault. In the context of Automated Driving, the occurrence of a false positive error is primarily an availability concern2, e.g., a false positive will lead to a switch-over to the Backup ADF that may then pull the vehicle to the road edge (making the AD functionality unavailable). A false negative is a negative detection of a fault in an incorrectly operating system, i.e., when a runtime monitor falsely concludes that a faulty ADS is not at fault. Such failures are highly safety-critical, especially for L3-L5 ADS, as any failure not detected by the monitor is likely to lead to unsafe operation and potentially to a hazard.

2.2.3 Performance Measures of a Runtime Monitor

The True Positive Rate (TPR), True Negative Rate (TNR), False Positive Rate (FPR), and False Negative Rate (FNR) are the most common measures of the performance of a runtime monitor. We elaborate on these performance measures in the following.

True Positive Rate

TPR, also known as sensitivity, hit rate, or recall, is a metric that "measures the proportion of positive cases in the data that are correctly identified as such" [9]. Specifically, it is the ratio of the total number of TPs to the total number of positive cases (i.e., condition positive (P) cases, see Figure 2.11) in a dataset, expressed as a percentage. The TPR can also be estimated as 1 − FNR.

TPR = (TP / P) × 100 [%] = 1 − FNR [%]    (2.1)

2 In some cases, a false positive can be a safety concern as well.
A false positive decision by the runtime monitor may trigger an emergency braking. Such sudden and unreasonable braking can put other road participants in danger and potentially lead to an accident (e.g., a rear-end collision).

True Negative Rate

TNR, also known as specificity, "measures the proportion of negative cases in the data that are correctly identified as such" [9]. It is the ratio of the total number of TNs to the total number of negative cases (i.e., condition negative (N) cases) in the dataset, expressed as a percentage. The TNR can also be estimated as 1 − FPR.

TNR = (TN / N) × 100 [%] = 1 − FPR [%]    (2.2)

False Positive Rate

FPR, also known as fall-out or false-alarm ratio, measures the proportion of negative cases in the data that are wrongly identified as positive [10]. It is the ratio of the total number of FPs to the total number of negative cases (i.e., condition negative (N) cases, see Figure 2.11) in the dataset, expressed as a percentage. The FPR can also be estimated as 1 − TNR.

FPR = (FP / N) × 100 [%] = 1 − TNR [%]    (2.3)

False Negative Rate

FNR, also known as miss rate, measures the proportion of positive cases in the data that are wrongly identified as negative. It is the ratio of the total number of FNs to the total number of positive cases (i.e., condition positive (P) cases, see Figure 2.11) in the dataset, expressed as a percentage. The FNR can also be estimated as 1 − TPR.

FNR = (FN / P) × 100 [%] = 1 − TPR [%]    (2.4)
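The four rates of Equations (2.1)-(2.4) can be computed directly from the confusion-matrix counts, as in the minimal sketch below (the counts are made-up example values):

```python
# Performance measures of a runtime monitor from raw confusion-matrix counts,
# expressed in percent as in Equations (2.1)-(2.4).

def monitor_rates(tp, fn, tn, fp):
    p, n = tp + fn, tn + fp          # condition positive / negative cases
    return {
        "TPR": 100.0 * tp / p,       # sensitivity / recall
        "FNR": 100.0 * fn / p,       # miss rate     (= 100 - TPR)
        "TNR": 100.0 * tn / n,       # specificity   (= 100 - FPR)
        "FPR": 100.0 * fp / n,       # false-alarm rate
    }

rates = monitor_rates(tp=90, fn=10, tn=950, fp=50)
print(rates["TPR"], rates["FNR"])  # 90.0 10.0
print(rates["TNR"], rates["FPR"])  # 95.0 5.0
```

Note that TPR + FNR = 100% and TNR + FPR = 100% by construction, which is exactly the complementarity stated after each equation.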

2.2.4 Receiver Operating Characteristic Curve

A common approach to intuitively present a runtime monitor's diagnostic ability is the use of a Receiver Operating Characteristic (ROC) curve. It is a graphical plot that illustrates the rate of true positives against the rate of false positives at different threshold settings (see Figure 2.12). The line labeled random classifier is an example of a runtime monitor that has a 50% probability of distinguishing a positive case from a negative case, that is, a 50% probability that a failure of an ADS can be detected. The black circle labeled ideal classifier is an example of a flawless classifier that can distinguish a positive case from a negative case with a 100% probability. The line labeled real-life classifier approximates the diagnostic ability of a realistic runtime monitor. As illustrated in Figure 2.12, the higher the TPR gets, the higher the FPR of the classifier becomes. The two threshold points T1 and T2 in Figure 2.12 indicate the TPR and FPR at different selected thresholds. In statistical classification, the threshold, also known as the decision threshold, enables making a binary decision on whether an observation belongs to a positive class. Extrapolating this to the context of runtime monitoring, the threshold is used to decide whether the current observation of a runtime monitor about a system condition is positive or negative, i.e., whether the ADS is in failure or not.

Figure 2.12: Receiver Operating Characteristic (ROC) curve.
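The points of a ROC curve can be traced by sweeping the decision threshold over scored observations, as the following sketch shows (scores and labels are made-up example values):

```python
# Sweep the decision threshold over monitor scores (probability that the ADS
# is in failure) and collect (FPR, TPR) points - the raw data of a ROC curve.

def roc_points(scores, labels, thresholds):
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # monitor's failure probability
labels = [1,   1,   0,   1,   0,   0]      # ground truth: 1 = ADS in fault
# Lowering the threshold moves up and to the right along the ROC curve:
# the TPR rises, but so does the FPR.
print(roc_points(scores, labels, thresholds=[0.85, 0.5, 0.2]))
```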

2.2.5 Balancing Decision Threshold

A low threshold value results in classifying more observations as positive, and therefore the TPR increases. A high threshold leads to classifying more observations as negative, and therefore to a high TNR (see Figure 2.13). Figure 2.13 (a) depicts an example of an observation that has resulted in a 0.7 probability of being a positive case (e.g., a failure). Figure 2.13 (b) is an example of a low threshold (T2), which leads to classifying the observation as positive. Figure 2.13 (c) is an example of a high threshold (T1), which results in classifying the observation as negative.

Figure 2.13: Two decision thresholds with different classification outcomes. (a) Observation with 0.7 probability of a positive case; (b) low threshold (positive classification); (c) high threshold (negative classification).

What intuitively comes to mind is to set the threshold low enough in order to increase the TPR. However, this also increases the FPR, as indicated in Figure 2.12 with threshold point T2. To better illustrate the problem, Figure 2.14 recalls Figure 2.11, however with different threshold examples.

Figure 2.14: Classification results with the high threshold T1 on the left (a) and the low threshold T2 on the right (b).
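The classification flip in Figure 2.13 reduces to a single comparison of the observation's probability against the threshold (the numeric values are taken from the figure; the function name is illustrative):

```python
# Binary decision of a runtime monitor: an observation's failure probability
# is compared against the decision threshold (values from Figure 2.13).

def classify(probability, threshold):
    """Return 'positive' (failure suspected) if the probability reaches the threshold."""
    return "positive" if probability >= threshold else "negative"

observation = 0.7          # Case X: 0.7 probability of being a positive case
print(classify(observation, threshold=0.5))  # low threshold T2 -> positive
print(classify(observation, threshold=0.9))  # high threshold T1 -> negative
```

The same 0.7 observation is thus counted as a TP or FP under T2, and as a TN or FN under T1, depending on the ground truth, which is what shifts the boundary in Figure 2.14.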

Therefore, the threshold should be carefully chosen to achieve the desired balance between TPR and FPR. A good analogy is fishing with a net: "As you use a finer net, a smaller number of fish slip through, but you also catch more seaweed and garbage. The $64,000 question is how fine a net to use for the best possible results." [11]

2.2.6 Forecast Horizon

The RV approach developed in this thesis is envisioned to be implemented in future fail-operational ADS architectures, e.g., the dual standby ADS architecture presented earlier in Figure 2.9. In such an architecture, the runtime verification approach verifies the correctness of the Primary ADF, and based on the verification results, the prioritization function prioritizes the Primary or Backup ADF. In the event of a faulty Primary ADF, the Backup ADF is supposed to take over control and execute so-called Automated Safety Actions (ASAs) that aim to bring the vehicle to a safe state, e.g., by safely circumventing obstacles on the road or reaching a safe halt before colliding with them. In some cases, the time needed to execute the ASA can be longer than the time to the hazard, thus jeopardizing the overall safety of the vehicle (see Figure 2.15).

Figure 2.15: Example scenarios with different outcomes. (a) Sufficient time to execute the ASA; (b) insufficient time to execute the ASA; (c) fault detected earlier, ASA executed.

Figure 2.15 depicts three scenarios with different outcomes. In Figure 2.15 (a), the Primary ADF experiences a fault (i.e., it does not detect the obstacle and therefore generates actuating commands (e.g., trajectories) that drive the vehicle into the obstacle), and the fault is detected by the RV approach. The Backup ADF takes over and executes the ASA by reaching the right lane, hence reaching a safe state. Figure 2.15 (b) depicts a similar scenario to Figure 2.15 (a). However, this time all lanes are blocked with obstacles - thus, the ASA strategy is to initiate an emergency braking. As the time needed to execute the emergency braking is longer than the time to the obstacle, the vehicle crashes into the obstacle. A way to ensure the ASA execution, and therefore reaching a safe state, is to provide the required time for completing the ASA by detecting a hazard early enough in advance. In this research work, this is described as the forecast horizon, which is the "lower bound of a time interval that defines how much in advance an impending potential hazard has to be identified so that the execution of the ASA is guaranteed". Figure 2.15 (c) presents how the advance identification of a hazard ensures the execution of the ASA. The scenario is similar to the one given in Figure 2.15 (b), but this time, the RV approach has identified the Primary ADF fault earlier and hence switched to the Backup ADF earlier. As a result, the Backup ADF has enough time to execute the ASA (e.g., emergency braking) and reach a safe state (e.g., reach a safe halt before colliding with the obstacle). Estimating the forecast horizon is not a trivial problem. The rationale for that is further explained in Section 4.6, research problem F.
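As a back-of-the-envelope illustration of why the forecast horizon matters, consider an emergency-braking ASA under a constant-deceleration assumption. The physics below is a deliberate simplification (real estimation must account for road conditions, actuator delays, trajectory curvature, and more, as discussed in research problem F); all parameter values are illustrative.

```python
# Illustrative forecast-horizon estimate for an emergency-braking ASA:
# the time to brake from speed v at constant deceleration a, plus the
# detection/switch-over latency, bounds how far in advance a hazard must
# be identified for the safe halt to be guaranteed.

def forecast_horizon(speed_mps, decel_mps2, reaction_s):
    """Lower bound [s]: switch-over/reaction time plus time to a full stop (v / a)."""
    return reaction_s + speed_mps / decel_mps2

def braking_distance(speed_mps, decel_mps2, reaction_s):
    """Distance [m] covered during the reaction time plus the braking maneuver (v^2 / 2a)."""
    return speed_mps * reaction_s + speed_mps ** 2 / (2 * decel_mps2)

v = 20.0   # 72 km/h
print(forecast_horizon(v, decel_mps2=8.0, reaction_s=0.5))   # 3.0 s
print(braking_distance(v, decel_mps2=8.0, reaction_s=0.5))   # 35.0 m
```

In the sketch, a hazard at 72 km/h must be identified at least about 3 s (35 m) ahead for the emergency braking to complete, mirroring the difference between scenarios (b) and (c) in Figure 2.15.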

Chapter 3. Research Methodology

Traditionally, the goal of science has been to explore, develop, and justify theories (i.e., principles and laws) that explain or predict given phenomena and, at the same time, advance the knowledge in a given area. Eventually, such theories are used by researchers and practitioners seeking to design and develop a solution for a given real-world problem. Research where the solution is designed and developed rather than discovered is referred to as Design Science Research (DSR). Also known as "sciences of the artificial" [12], "systemeering" [13], and "constructive research" [14], DSR is a problem-solving method that "seeks to create innovations that define ideas, practices, technical capabilities, and products" [15]. This thesis uses DSR as the method for conducting its research studies. Over the years, different DSR processes have been proposed [15–18]. We find the process proposed by Vaishnavi and Kuechler [18] the most suitable for the research conducted in this thesis. Figure 3.1 presents our adaptation of this process, in which we include two additional steps before the solution implementation step. In Step 1, the real-world problems in the AD safety field are investigated. For that, a thorough study of the state of practice and the state of the art has been carried out. As a result, two problems were identified: (i) the insufficiency of current standards and safety mechanisms for ensuring the safe operation of ADS on public roads (see Chapter 1), and (ii) the problem of replica indeterminism in fault-tolerant ADS, described earlier in Section 2.1. In Step 2, solutions for the identified problems are proposed. Based on the formulated problems and the background study in Chapter 2, we focus on runtime monitoring as a complementary solution for addressing the identified problems.

Figure 3.1: An adaptation of a design science research process, based on [18].

The development of a runtime monitoring solution for ADS requires solving a variety of research and engineering problems. The solution implementation step (Step 3 in Figure 3.1) is intended to guide these activities.1 In Step 3.1, research problems are identified. A detailed description of the identified research problems is given in Section 4.1 to Section 4.7. In Step 3.2, research goals are defined. We summarize the research goals in Section 4.8.

1 The solution implementation step is an iterative process. For example, since not all research problems are known initially, they can be refined in the later stages of the research once a given know-how is reached. The corresponding research goals are then further defined.

Figure 3.2: Proposed solution.

In Step 3.3, solutions are proposed. Chapter 5 provides a detailed description of the proposed solutions; a summary is given next. Figure 3.2 depicts a summary of the solutions proposed in this thesis. We propose an RV concept, namely the Safe Driving Envelope Verification (SDE-V). Moreover, we propose solutions for problems encountered during the further development of the SDE-V concept. For example, we propose an approach for reducing the false positive rate of the SDE-V concept: the congruency exchange concept. We also propose a method for the high-level evaluation of the true positive rate of runtime monitors. Furthermore, we propose a scalable architectural solution that allows the gradual implementation of multiple runtime verification approaches alongside the SDE-V. Another proposed solution is a method for estimating the forecast horizon, to guarantee the execution of automated safety actions upon a failure detection by a runtime monitor. Finally, we propose methods for ensuring the correct timestamping of sensor measurements, to solve the Out-of-Sequence Measurements (OOSM) problem in sensor fusion-based ADS. In Step 3.4 and Step 3.5, the proposed solutions are respectively implemented and evaluated. A dedicated chapter (Chapter 7) summarizes the implementation and evaluation results. Finally, in Step 3.6 of the solution implementation, conclusions are summarized. The outcome of each solution implementation iteration is an artifact, which essentially is a contribution containing new knowledge (see Figure 3.1). This knowledge is then used to advance the general and specific fields of interest. The artifact can be in the form of constructs, models, architectures, methods, concepts, and instantiations (see Table 3.1).
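To make the SDE-V idea concrete, the following is a heavily simplified sketch — not the thesis's implementation — of checking a planned trajectory against the two safety rules: no collision with obstacles and no road-edge departure. The straight-corridor road model, axis-aligned obstacle boxes, and all coordinates are illustrative assumptions.

```python
# Minimal illustrative sketch of an SDE-V-style check (simplified,
# not the thesis implementation). Obstacles are axis-aligned boxes
# (xmin, xmax, ymin, ymax); the road is a straight corridor whose
# edges lie at y = road_y_min and y = road_y_max.

def point_in_box(x, y, box):
    xmin, xmax, ymin, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax

def trajectory_safe(traj, obstacles, road_y_min, road_y_max):
    """True iff every trajectory point stays on the road and outside
    every obstacle box."""
    for x, y in traj:
        if not (road_y_min <= y <= road_y_max):
            return False  # rule (ii): road-edge departure
        if any(point_in_box(x, y, b) for b in obstacles):
            return False  # rule (i): collision with an obstacle
    return True

road = (0.0, 7.0)                    # two 3.5 m lanes
obstacle = [(30.0, 35.0, 0.0, 3.5)]  # stopped car in the right lane

straight = [(x, 1.75) for x in range(0, 50, 5)]    # stays in right lane
evasive = ([(x, 1.75) for x in range(0, 25, 5)] +
           [(x, 5.25) for x in range(25, 50, 5)])  # moves to left lane

print(trajectory_safe(straight, obstacle, *road))  # False: hits the car
print(trajectory_safe(evasive, obstacle, *road))   # True: safe overtake
```

A real SDE-V check would of course reason about vehicle geometry, dynamics, and predicted obstacle motion rather than sampled points against static boxes; the sketch only shows where the two safety rules enter the verification.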

Table 3.1: A summary of artifact types [18]. An extensive list can be found in [19].

1. Constructs — Terms, notations, definitions, and concepts that are needed for formulating problems and their possible solutions.
2. Models — A representation of a possible solution to a practical problem.
3. Architectures — Describe the internal organization of a computer in an abstract way; that is, they define the capabilities of the computer and its programming model.
4. Methods — Sets of steps used to perform tasks; how-to knowledge.
5. Concepts — An idea of how something is, or how something should be done, to solve a problem of interest.
6. Instantiations — The realization of the artifact in an environment.

Constructs provide the language in which problems and solutions are defined and communicated. "The field of mathematics was revolutionized, for example, with the constructs defined by Arabic numbers, zero, and place notation" [15]. Another example is the Unified Modeling Language (UML). Models use constructs to represent a topic or a solution for a given problem (e.g., a UML database model). Methods define processes for how to solve a given problem and achieve a goal; Fault Tree Analysis (FTA) and Failure Mode and Effects Analysis (FMEA) are both methods for conducting a hazard analysis. Concepts are the initial proposal for solving a problem of interest. Last but not least, instantiations demonstrate the feasibility of an artifact (e.g., a concept) in a real or prototypical implementation. The artifacts generated in this thesis are as follows. The SDE-V approach and the congruency exchange approach are classified as concepts. Moreover, some artifacts support the further development of the RV concepts. These include three methods: (i) a method for the high-level evaluation of the TPR of runtime monitors, (ii) a method for estimating the forecast horizon, and (iii) a method for achieving correct sensor measurement timestamping in sensor fusion systems. Finally, we provide one architecture artifact: enhancing the scalability of the architecture in which the runtime monitoring solution is developed. A detailed description of the generated artifacts can be found in Chapter 5.

Chapter 4. Research Problems & Goals

Building a runtime monitoring solution requires various challenges to be solved. We classify them into five groups (see Figure 4.1). On the concept development level, the complex and opaque technology used in ADS often makes researchers and practitioners struggle to define an RV approach for detecting system failures (see Research Problem A (RP A)). Once the initial RV concept is defined (see the circle in Figure 4.1), researchers and practitioners have to deal with various challenges to realize it as an actual runtime monitoring solution for ADS. We classify these challenges into safety (see RP B, RP F, RP G), availability (see RP C), efficiency (see RP D), and scalability (see RP E). In the following, we provide details on the identified research problems.

Figure 4.1: Research problems and their classification.
