Analysis of enterprise IT service availability
Enterprise architecture modeling for assessment, prediction, and decision-making
ULRIK FRANKE
Doctoral Thesis Stockholm, Sweden 2012
Academic dissertation which, with the permission of KTH Royal Institute of Technology (Kungliga Tekniska Högskolan), is submitted for public examination for the degree of Doctor of Technology on Wednesday, 31 October 2012, in room F3, Lindstedtsvägen 26, KTH, Stockholm.
Opponent: Ass. prof. João Paulo A. Almeida, Federal University of Espírito Santo, Brazil
TRITA EE 2012:032 • ISSN 1653-5146 • ISRN KTH/ICS/R--12/02--SE • ISBN 978-91-7501-443-2
Abstract
Information technology has become increasingly important to individuals and organizations alike. Not only does IT allow us to do what we always did faster and more effectively, but it also allows us to do new things, organize ourselves differently, and work in ways previously unimaginable. However, these advantages come at a cost: as we become increasingly dependent upon IT services, we also demand that they are continuously and uninterruptedly available for use. Despite advances in reliability engineering, the complexity of today’s increasingly integrated systems offers a non-trivial challenge in this respect.
How can high availability of enterprise IT services be maintained in the face of constant additions and upgrades, decade-long life-cycles, dependencies upon third parties, and the ever-present business-imposed requirement of flexible and agile IT services?
The contribution of this thesis includes (i) an enterprise architecture framework that offers a unique and action-guiding way to analyze service availability, (ii) identification of causal factors that affect the availability of enterprise IT services, (iii) a study of the use of fault trees for enterprise architecture availability analysis, and (iv) principles for how to think about availability management.
This thesis is a composite thesis of five papers. Paper 1 offers a framework for thinking about enterprise IT service availability management, highlighting the importance of variance of outage costs. Paper 2 shows how enterprise architecture (EA) frameworks for dependency analysis can be extended with Fault Tree Analysis (FTA) and Bayesian network (BN) techniques. FTA and BN are proven formal methods for reliability and availability modeling.
Paper 3 describes a Bayesian prediction model for systems availability, based on expert elicitation from 50 experts. Paper 4 combines FTA and constructs from the ArchiMate EA language into a method for availability analysis on the enterprise level. The method is validated by five case studies, where annual downtime estimates were always within eight hours of the actual values. Paper 5 extends the Bayesian prediction model from Paper 3 and the modeling method from Paper 4 into a full-blown enterprise architecture framework, expressed in a probabilistic version of the Object Constraint Language. The resulting modeling framework is tested in nine case studies of enterprise information systems.
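As a rough illustration of the kind of calculation that fault-tree-based availability analysis rests on, the following minimal sketch combines component availabilities and converts the result into annual downtime. It is not the method of Paper 4. The function names, the example service, and its availability figures are hypothetical, and component failures are assumed to be independent.

```python
from math import prod

HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def all_required(availabilities):
    """Every component is needed: the service is up only if all are up.

    In fault-tree terms this is an OR gate over the component failures.
    """
    return prod(availabilities)


def any_sufficient(availabilities):
    """Redundant components: the service is down only if all are down.

    In fault-tree terms this is an AND gate over the component failures.
    """
    return 1 - prod(1 - a for a in availabilities)


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


# Hypothetical service: two redundant application servers (99% each)
# in front of a single database (99.9%). All figures are made up.
servers = any_sufficient([0.99, 0.99])
service = all_required([servers, 0.999])

print(f"service availability: {service:.7f}")
print(f"expected annual downtime: {annual_downtime_hours(service):.2f} h")
```

On this scale, an eight-hour error in annual downtime corresponds to an availability error of roughly 0.09 percentage points (8 / 8760).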
Keywords: Service Level Agreement, outage costs, Enterprise Architecture, enterprise IT service availability, decision-making, metamodeling, Enterprise Architecture analysis, Bayesian networks, fault trees, Predictive Probabilistic Architecture Modeling Framework