Fault Isolation in Object Oriented Control Systems

Magnus Larsson*, Inger Klein*, Dan Lawesson** and Ulf Nilsson**

*Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
Email: magnusl,inger@isy.liu.se

**Department of Computer and Information Science
Linköping University, SE-581 83 Linköping, Sweden
WWW: http://www.ida.liu.se
Email: danla,ulfni@isy.liu.se

12 December, 2000

Report no.: LiTH-ISY-R-2324

Presented at SAFEPROCESS 2000

Technical reports from the Automatic Control group in Linköping are available by anonymous ftp at the address ftp.control.isy.liu.se. This report is [...]

Abstract: This article addresses the problem of fault propagation between software modules in a large-scale control system with an object-oriented architecture. There exists a conflict between object-oriented design goals, such as encapsulation and modularity, and the possibility to suppress propagating error conditions. The propagation manifests itself as many irrelevant error messages, and hence causes problems for system operators and service personnel when attempting to isolate the real fault. We propose a fault isolation scheme aimed at achieving clear and concise fault information to the operator without violating encapsulation and modularity. The approach is implemented and tested on a commercial industrial robot control system from ABB Robotics, and a patent application has been filed with the Swedish patent office (PRV), Larsson and Eriksson (1999).

Keywords: Fault isolation, object modeling techniques, control system, safety-critical, propagation

1. INTRODUCTION

Developing control systems for complex systems is a difficult and increasingly important task. Traditional software development methods based on structured analysis and functional decomposition (see e.g. DeMarco (1979)) are today often replaced by object-oriented methods, see e.g. Douglass (1998); Rumbaugh et al. (1991). The new methods have many advantages over traditional approaches, including a better ability to master complexity and to facilitate maintenance and reuse (see e.g. Booch (1994)). However, new problems arise; the problem addressed here is fault propagation in an object-oriented software architecture for a large-scale, configurable and safety-critical control system. As basic inspiration and case study we have used a commercial control system for industrial robots developed by ABB [...] programmable and has an object-oriented architecture.

Object-oriented design goals such as encapsulation and modularity often stand in direct conflict with the need to generate concise information about a fault situation, and to avoid propagating error messages. Error messages are sent by individual objects to notify an operator that an error condition has been detected. The aim to encapsulate information implies that individual objects, or groups of objects, in general do not know how close they are to a fault or whether the fault has already been adequately reported.

The focus of this paper is an operational and safety-critical control system running without direct supervision; in case of a serious fault, the first priority is to take the system to a safe state. Only then is it possible to start analyzing what may have caused the fault. Operators or service [...] a failure are fairly inexperienced with the system and have little insight into its internal design. Since error reporting often reflects the internal design of the system, it can be very difficult for the operator to understand which error message is most relevant and closest to the fault. In this paper we propose a liberal error reporting policy in combination with a fault isolation layer hiding the log and the core control system from the operator. The layer post-processes the fault information and is able to present clear and concise fault information to the operator, thus facilitating design principles such as encapsulation and modularity.

It should be noted that the number of error messages or alarms in a fault scenario need not be especially large to cause problems for an inexperienced operator; the number typically ranges from 3 to 20 in our use case. The strength of the proposed approach does not lie in the number of error messages handled in each fault scenario, but in the wide range of potential fault scenarios handled by a general method.

2. INFORMATION USED DURING FAULT ISOLATION

We propose a fault isolation scheme where error messages are explained locally, by means of information already available to an object at run-time, thus not violating the principles of encapsulation and modularity. This local information, made available in an error message, is called the error message signature, which together with a conceptual explanation model makes it possible to infer cause-effect relations between the error messages. The most relevant error message(s) can then be presented to the operator. When the information from the error messages is inconclusive, we use a structural system model to find dependencies between objects and hence "fill in the gaps". To the authors' knowledge, this is a novel approach.

Error messages are divided into internal error messages and relational error messages; in a relational message the complaining object is called the complainer, and the imputed object is called the complainee. Relational error messages are further specialized into those where (1) the complainee is known and where (2) the complainee is unknown. If the system is regarded as a collection of collaborating, fairly intelligent, but narrow-minded individuals, these three types can be characterized by the statements "I did it", "he did it" and "I didn't do it" respectively.
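The three message types can be made concrete with a small data structure. The following Python sketch is only an illustration under stated assumptions: the class name `ErrorMessage`, its fields and the sample objects and codes are hypothetical and are not taken from the implemented system, whose signature format is not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorMessage:
    """One error message with a hypothetical, simplified signature."""
    code: int                          # error number as it would appear in a log
    complainer: str                    # object that sent the message
    relational: bool                   # False: internal, True: relational
    complainee: Optional[str] = None   # imputed object, if the complainer knows it

    def kind(self) -> str:
        """Classify the message as one of the three types named in the text."""
        if not self.relational:
            return "internal"              # "I did it"
        if self.complainee is not None:
            return "relational-known"      # "he did it"
        return "relational-unknown"        # "I didn't do it"


# Made-up messages for illustration only.
messages = [
    ErrorMessage(code=1, complainer="driver", relational=False),
    ErrorMessage(code=2, complainer="io", relational=True, complainee="driver"),
    ErrorMessage(code=3, complainer="task", relational=True),
]
kinds = [m.kind() for m in messages]
```

Keeping the complainee optional mirrors the point that an object reports only what it knows locally; nothing in the record requires knowledge of the rest of the system.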

The information provided in the error messages is complemented with a structural system model. A [...] far too complex (in the order of 10[...] lines of code in the ABB case). Even if it is not yet common practice to do so, it is widely recognized that developing system models helps in designing correct systems; hence it is not unreasonable to assume that software in the future will be accompanied by models at different levels of abstraction. The modeling language used here was the Unified Modeling Language (UML) (see e.g. Douglass (1998)). The UML is a design notation for object-oriented systems and also serves as system documentation. For the fault isolation process we use the UML class diagrams and task diagrams.

Closely collaborating and related classes can in the UML be collected into modules called packages. A package models a specific subject or concern in the system, and supplies a more high-level model of the system architecture than the classes and the class relationships. How this information is used in the fault isolation approach will be demonstrated below.

For the system model (in the form of UML class diagrams) to be useful for fault isolation, the system and system model should be such that the static class structure reflects the run-time object structure well. Classes in control systems are often highly specialized. Even if the complete run-time object structure often is very dynamic and constantly changing, there are usually only a few "major players" among the objects that are always present. If these objects have the main responsibility for error reporting, they can provide enough similarity with the static class structure for the system model based on class diagrams to be useful for fault isolation. Another important property is that the inheritance hierarchies are seldom very deep; often only one or two levels.

Since the information used for fault isolation is partitioned into a UML model and error message signatures, the approach scales quite well. The fault isolation scheme is easy to maintain and to extend when the system changes, since it is an integral part of the software development process and the software itself.

3. A FAULT ISOLATION SCENARIO

Due to space limitations it is not possible to fully describe the formal notation or algorithms used in the implemented system. Instead we illustrate the method by a real fault scenario from the ABB Robotics industrial robot application. For a complete, and formal, treatment see the thesis Larsson (1999) or the report Larsson et al. (1999). The purpose of the example is to illustrate the fault [...]

[Figure 1: UML class diagram with the classes ibsser, eiodevIBS, eiodev and eioexe (from EIO); the associations carry the multiplicities 1 and 1..*.]

Fig. 1. Extract of class diagram from the package Drivers.

     7. 10008 Program restarted            0105 13:45.9
        The task MAIN has restart to execute.
        The originator is the production window.

     8. 71061 I/O bus error                0105 13:45.30
        Description\Reason:
        - An abnormal rate of errors on bus IBS has been detected.

     9. 71107 InterBus-S bus failure       0105 13:45.31
        Description\Reason:
        - Lost contact at address 2.3

    10. 71139 Access error from IO         0105 13:45.35
        Description\Reason:
        - Cannot Read or Write signal DO3_1 due to communication down.

    11. 40503 Reference error              0105 13:45.35
        Device descriptor is not valid for a digital write operation

    12. 40223 Execution error              0105 13:45.35
        Task MAIN: Fatal runtime error

    13. 10020 Execution error state        0105 13:45.35
        The program execution has reached a spontaneous error state

    14. 10005 Program stopped              0105 13:45.35
        The task MAIN has stopped. The reason is that an external or internal stop has occurred.

Fig. 2. Error log for the example.
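A post-processing layer first has to turn a log like the one in Figure 2 into records. The Python sketch below shows one way this could look, assuming that every entry starts with a header line of the form "sequence number, error number, title, date, time" as in the figure. The names `LogEntry`, `HEADER` and `parse_log` are hypothetical, and since the error message signatures are not part of the log, they are not parsed here.

```python
import re
from dataclasses import dataclass, field

# Header line layout assumed from Figure 2: "8. 71061 I/O bus error 0105 13:45.30".
HEADER = re.compile(
    r"^\s*(\d+)\.\s+(\d+)\s+(.*?)\s+(\d{4})\s+(\d{1,2}:\d{2}\.\d+)\s*$"
)


@dataclass
class LogEntry:
    seq: int      # running number of the entry in the log
    code: int     # error number
    title: str    # short error title
    date: str     # date field, kept as written in the log
    time: str     # time field, kept as written in the log
    body: list = field(default_factory=list)  # free-text description lines


def parse_log(text):
    """Group the lines of a log into entries, one per header line."""
    entries = []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            seq, code, title, date, time = match.groups()
            entries.append(LogEntry(int(seq), int(code), title, date, time))
        elif line.strip() and entries:
            # Any other non-empty line belongs to the most recent entry.
            entries[-1].body.append(line.strip())
    return entries


sample = r"""8. 71061 I/O bus error 0105 13:45.30
Description\Reason:
- An abnormal rate of errors on
bus IBS has been detected.
9. 71107 InterBus-S bus failure 0105 13:45.31
Description\Reason:
- Lost contact at address 2.3
"""
entries = parse_log(sample)
```

The date and time fields are deliberately kept as strings, because their exact meaning is not specified in the text.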

In Figure 1, the part of the system model relevant to our fault scenario is shown in UML class diagram notation. Classes are shown graphically using rectangles with the name of the class inside. That an object uses some service performed by another object is modeled with an arrow between classes, a so-called association. A class can be a specialization of a more general class; this is called inheritance, and is indicated by an arrow with a [...]

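[Figure 3(a), original base graph: object nodes pgmexe, RealInstruction, rlio, eio, eiount, eiobus, eioexe and ibsser, grouped into the packages PGM, REAL, EIO and DRIVERS, annotated with the error message numbers 40223, 40503, 71139, 71061 and 71107 and one self-loop marked int.]

[Figure 3(b), explanation graph: nodes for the error messages 40503, 40223, 71139, 71061 and 71107.]

Fig. 3. Original base and explanation graphs.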

The fault considered here is a malfunctioning field bus. The resulting error message log is given in Figure 2. The error message signatures are not shown in the log, but the local information provided in the error message signatures is visualized in a so-called base graph, see Figure 3(a). Each node of the base graph corresponds to an object that has either sent an error message (a complainer) or is pointed out by another object (a complainee). The edges between nodes correspond to relational error messages and should be read "complains on". The self-loop adorned with int corresponds to an internal error message. There is also one inheritance relation in the base graph. The packages are shown using dashed boxes with the package name in the upper left corner.
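The construction of a base graph from signatures can be sketched in a few lines of Python. This is a minimal illustration, not the implemented algorithm: the tuple layout `(code, complainer, complainee)`, the encoding of an internal message as a self-loop, and the object names and codes are all assumptions made for the example. Inheritance relations and packages are left out.

```python
from collections import defaultdict

# Hypothetical signatures: (error code, complainer, complainee).
# complainee == complainer encodes an internal message (a self-loop);
# complainee is None when the complainer cannot name whom to blame.
signatures = [
    (101, "bus_driver", "bus_driver"),   # internal message
    (102, "io_layer", "bus_driver"),     # relational, complainee known
    (103, "program", None),              # relational, complainee unknown
]


def build_base_graph(sigs):
    """Return (nodes, edges); edges maps (complainer, complainee) to codes."""
    nodes = set()
    edges = defaultdict(list)
    for code, complainer, complainee in sigs:
        nodes.add(complainer)
        if complainee is not None:
            nodes.add(complainee)
            # Read the edge as "complainer complains on complainee".
            edges[(complainer, complainee)].append(code)
    return nodes, dict(edges)


def is_connected(nodes, edges):
    """Check connectivity of the graph, ignoring edge direction."""
    if not nodes:
        return True
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == nodes


nodes, edges = build_base_graph(signatures)
connected = is_connected(nodes, edges)
```

In this made-up example the object `program` has nobody to blame, so the graph is not connected; that is the situation in which a structural system model becomes useful.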

The base graph describes dependencies between objects, but the aim is to point out the error message closest to the fault. For this purpose we construct an explanation graph, see Figure 3(b). The explanation graph is in some sense the dual of the base graph; the nodes correspond to error messages and the edges represent dependencies between error messages. The goal of the fault isolation scheme is to produce a connected explanation graph without any cycles where all error messages can be traced to one primary error message. This error message can then be presented to the operator. In our scenario this primary error [...]

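[Figure 4(a), extended base graph: object nodes pgmexe, RealInstruction, rlio, eio, eiount, eiobus, eioexe, ibsser and eiodevIBS, annotated with the error message numbers 40223, 40503, 71139, 71061 and 71107 and one self-loop marked int.]

[Figure 4(b), explanation graph: nodes for the error messages 40503, 40223, 71139, 71061 and 71107.]

Fig. 4. Extended base graph and explanation graph.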

[...] but by using the package information we achieve a connected explanation graph anyway.
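How an explanation graph can single out a primary error message is sketched below in Python. The derivation rule is an assumption made for this example only, since the formal rules are given in the cited thesis and report: a message in which X complains on Y is taken to be explained by every message sent by Y. The signatures, object names and codes are hypothetical.

```python
from collections import defaultdict

# Hypothetical signatures: (error code, complainer, complainee);
# complainee == complainer encodes an internal message.
signatures = [
    (1, "a", "b"),
    (2, "b", "c"),
    (3, "c", "c"),
]


def explanation_graph(sigs):
    """Map each message to the set of messages assumed to explain it."""
    sent_by = defaultdict(set)
    for code, complainer, _ in sigs:
        sent_by[complainer].add(code)
    explained_by = {}
    for code, complainer, complainee in sigs:
        if complainee is None or complainee == complainer:
            explained_by[code] = set()   # nothing else to blame
        else:
            explained_by[code] = set(sent_by.get(complainee, set()))
    return explained_by


def has_cycle(explained_by):
    """Depth-first search for a cycle among the explanation edges."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {m: WHITE for m in explained_by}

    def visit(m):
        color[m] = GREY
        for n in explained_by[m]:
            if color[n] == GREY or (color[n] == WHITE and visit(n)):
                return True
        color[m] = BLACK
        return False

    return any(color[m] == WHITE and visit(m) for m in explained_by)


def primary_messages(explained_by):
    """Messages that no other message explains (the maximal ones)."""
    return sorted(m for m, causes in explained_by.items() if not causes)


graph = explanation_graph(signatures)
primary = primary_messages(graph)
cyclic = has_cycle(graph)
```

With a connected, acyclic graph and exactly one maximal message, that message is the candidate to show to the operator; several maximal messages indicate that the available information does not isolate a single cause.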

If the base graph is not connected, as in our case, it may be necessary to extend the base graph using the UML system model (Figure 1). This can be done on both the class and package levels. Algorithms for doing this are further described in Larsson (1999); Larsson et al. (1999). In our example, an extension on the class level is possible. The basic idea is to try to find complainees for objects in the base graph which do not have "somebody to blame".

The system model is searched for associations between classes corresponding to objects in the base graph, possibly via intermediate classes. The result is an extended base graph and corresponding explanation graph as in Figure 4. Note that the conclusions of the fault isolation approach are strengthened by the extension, even though the original explanation graph already was connected.
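The search for associations, possibly via intermediate classes, can be illustrated with a breadth-first search. The Python sketch below is an assumption-laden illustration rather than the published algorithm: associations are taken to be directed from the class using a service to the class providing it, the class names are invented, and the function `find_complainee_path` is hypothetical.

```python
from collections import deque

# Hypothetical class-level system model: class -> classes whose services it uses.
associations = {
    "controller": ["channel"],
    "channel": ["device"],
    "device": [],
}

# Classes whose objects already appear in the base graph (assumed).
base_classes = {"controller", "device"}


def find_complainee_path(model, start, in_base_graph):
    """Breadth-first search from `start` along associations.

    Returns the shortest path ending in another class of the base graph,
    including any intermediate classes, or None if no such class is reachable.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for nxt in model.get(path[-1], ()):
            if nxt in seen:
                continue
            if nxt in in_base_graph:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


path = find_complainee_path(associations, "controller", base_classes)
no_path = find_complainee_path(associations, "device", base_classes)
```

Here `controller` has no complainee of its own, and the search proposes `device` via the intermediate class `channel`; an extended base graph would add these nodes and edges before the explanation graph is rebuilt.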

In this example the generation of the explanation graph is easy, but the situation becomes more complicated if, e.g., IPC error messages are present (communication errors between concurrent tasks). In those cases the base graph consists of two parts, one based on the class diagrams and one based on task diagrams, and the explanation [...]

We have presented a scheme for fault isolation in object-oriented control systems. The method is based on the error messages in the error log, and uses a UML model of the system to complete the explanation graph, which shows the cause-effect relationships between error messages. The strength of the proposed approach does not lie in the number of error messages handled in each fault scenario, but in the wide range of potential fault scenarios handled by a general method.

The method outlined here has been implemented. The core of the fault isolation layer consists of ca. 2000 lines of C++ code and is able to access UML models developed in Rational Rose. The algorithms have been tried on a set of real fault scenarios from the ABB Robotics industrial robot control system. In the nine examples considered, the system was able to pinpoint the primary error message in seven cases; in the remaining two cases the error was caused in a sub-system that was not part of the UML model, hence there was no hope of pinpointing the error. In those cases, the fault isolation tool returned several maximal error messages, but offered a deepened insight into the fault scenario.

The system model used above captures the structure of the system. In our future work we will examine the possibility to use a system model that also contains behavioral information. Naturally, a more detailed model allows for more precise diagnosis, but it also poses problems in terms of maintenance of the model and finding (in some sense) correct rules for reasoning. Statecharts are included in the UML, and they are our present candidate for a behavioral system model. One way of performing reasoning would then be to use a model checker.

References

G. Booch. Object-Oriented Analysis and Design: With Applications. Benjamin/Cummings, 2nd edition, 1994.

T. DeMarco. Structured Analysis and System Specification. Prentice-Hall, 1979.

B.P. Douglass. Real-Time UML: Developing Efficient Objects for Embedded Systems. Addison Wesley, 1998.

M. Larsson. Behavioral and Structural Model Based Approaches to Discrete Diagnosis. PhD thesis 608, Department of Electrical Engineering, Linköping University, Linköping, Sweden, 1999. Can be acquired at http://control.isy.liu.se/publications/.

M. Larsson and P. Eriksson. Fault isolation [...]

[...] Model based fault isolation for object-oriented control systems. Technical Report LiTH-ISY-R-2205, Dept. of Electrical Engineering, Linköping University, 1999. Can be acquired at http://control.isy.liu.se/publications/.

J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling [...]
