An Object-Oriented Approach to Crafting Compilers
Jan Bosch*
Department of Computer Science and Business Administration,
University of Karlskrona/Ronneby,
S-372 25, Ronneby, Sweden.
E-mail: Jan.Bosch@ide.hk-r.se,
URL: http://www.pt.hk-r.se/~bosch
Abstract. Conventional compilers often are large entities that are highly
complex, difficult to maintain and hard to reuse. In this article it is
argued that this is due to the inherently functional approach to compiler
construction. An alternative approach to compiler construction is
proposed, based on object-oriented principles, which solves (or at least
lessens) the problems of compiler construction. The approach is based
on delegating compiler objects (dcos) that provide a structural
decomposition of compilers in addition to the conventional functional
decomposition. The dco approach makes use of the parser delegation and lexer
delegation techniques, which provide reuse and modularisation of
syntactical and lexical specifications, respectively.
1 Introduction
Traditionally, compiler constructors have taken a functional approach to the
process of compiling program text. In its simplest form, the process consists
of a lexical analyser, converting program text into a token stream, a parser,
converting the token stream into a parse tree, and a code generator, converting
the parse tree into output code. Both the lexical analyser and the parser are
monolithic entities in that only a single instance of each exists in an
application.
The monolithic approach to compiler construction is becoming increasingly
problematic due to changing requirements on compiler construction techniques.
Whereas previously applications were built using one of the few general purpose
languages, nowadays a specialised, application domain language is often used.
Examples can be found in the fourth generation development environments, e.g.
Paradox, and formal specification environments, e.g. SDL. Another example is
the use of compilation techniques to obtain structured input from the user, as
in, e.g. modern phones. In a modern phone exchange, the user can request
services by dialing the digits associated with a service. Example services are
follow-me [...]
* This work has been supported by the Blekinge Forskningsstiftelse.
is activated for the user. The available services in most systems, however, are
subject to regular change. A third example is that of extensible language
models. An extensible language can be extended with new constructs and the
semantics of existing language constructs can be changed. In this article, an
extensible object-oriented language, LayOM, is used as an example.
The changing requirements described above call for modularisation and reuse
of compiler specifications, since means to modularise and reuse such
specifications would reduce the effort required in the construction of
compilers.
In this article, delegating compiler objects (dcos) are proposed as an approach
to compiler construction that supports modularisation and reuse of compiler
specifications. The dco approach to compiler development allows one to
recursively decompose a compiler into structural components, i.e. nested
compilers. The dco concept provides a structural decomposition of a compiler in
addition to the traditional functional compiler decomposition. When using dcos,
an input program text is not compiled by a single compiler, but by a set of
cooperating compiler objects. A compiler object can delegate parts of the
compilation process to other compiler objects. The advantage of this approach is
that reusability and maintainability are increased considerably due to the
modular approach.
To evaluate the mechanism, a tool, phest, has been developed that implements
the dco concept. Using this tool, compilers have been constructed that
convert LayOM code into C++ and C code.
The remainder of this article is organised as follows. In the next section, the
problems of traditional compiler construction techniques that we identified are
described. Section 3 describes the layered object model that will be used as
the running example throughout the paper. Section 4 describes the dco approach
to compiler construction, whereas sections 4.2, 4.3 and 4.4 respectively
describe the parser delegation, lexer delegation and parse graph node object
techniques employed by the dco approach. Section 5 describes the phest tool
supporting the techniques discussed in this paper. Section 6 describes related
work and the paper is concluded in section 7.
2 The Problems of Conventional Compilers
Traditionally, the compilation process is decomposed along the different
functions that convert program code into a description in another language,
e.g. lexing, parsing and code generation. We have identified three problems of
this approach to compiler construction:
- Complexity: A traditional, monolithic compiler tries to deal with the
complexity of a compiler application through decomposing the compilation
process into a number of subsequent phases. Although this indeed decreases the
complexity, this approach is not scalable because a large problem cannot [...]
lexing, parsing, semantic analysis and code generation phases we have
experienced to be insufficient.
- Maintainability: Although the compilation process is decomposed into
multiple phases, each phase itself can be a large and complex entity with many
interdependencies. Maintaining the parser, for example, can be a difficult
task when the syntax description is large and has many interdependencies
between the production rules. In the traditional approaches, the syntax
description of the language cannot be decomposed into smaller, independent
components.
- Reusability: Although the domain of compilers has a rich theoretical base,
building a compiler often means starting from scratch, even when similar
applications are available. The notion of reusability has no supporting
mechanism in compiler construction. Nevertheless, crafting a compiler generally
is a large and expensive undertaking and reuse, when available, would be
highly beneficial.
Summarising, the conventional approach to compiler construction results in
large, complex modules that cause the aforementioned problems. This is due to
the one-level functional decomposition. We consider that an object-oriented
approach to compiler construction that supports reuse and modularisation of
compiler specifications provides a solution to the aforementioned problems.
3 Example: Layered Object Model
The layered object model (LayOM) [6] is an extensible object model that
extends the conventional object model to improve expressiveness. An object in
LayOM (see figure 1) consists of five major components, i.e. variables (nested
objects), methods, states, categories and layers. The LayOM object model can be
further extended by adding new types of layers or new object model components.
The variables and methods of a LayOM object are defined as in most
object-oriented languages. A state, as defined in LayOM, is a dimension of the
abstract object state [5]. The notion of abstract object state provides a
systematic and structured approach to make the conceptual state of the object
accessible at the interface. A category is used to define a client category,
i.e. a distinguishing characteristic of a group of clients that are to be
treated similarly.
The layers encapsulate the object such that messages sent to the object or
sent by the object itself have to pass all layers. Each layer can change, delay,
redirect or respond to a message or just let it pass. Layers are, among
others, used for representing and implementing inter-object relations [4] and
design patterns [7].
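The way layers intercept messages can be illustrated with a minimal Python sketch. It is not LayOM syntax: the class names `Layer`, `RenameLayer`, `RejectLayer` and `LayeredObject`, as well as the account example, are hypothetical and only model the idea that every incoming message passes all layers before a method is invoked.

```python
class Layer:
    """Base layer: lets every message pass unchanged."""

    def handle(self, message):
        return message


class RenameLayer(Layer):
    """Hypothetical layer that changes a message into another one."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def handle(self, message):
        return self.mapping.get(message, message)


class RejectLayer(Layer):
    """Hypothetical layer that responds to selected messages by refusing them."""

    def __init__(self, blocked):
        self.blocked = set(blocked)

    def handle(self, message):
        if message in self.blocked:
            raise PermissionError(f"{message} rejected by layer")
        return message


class LayeredObject:
    """Object whose incoming messages must pass all layers, outermost first."""

    def __init__(self, methods, layers):
        self.methods = methods  # message name -> callable
        self.layers = layers    # outermost layer first

    def send(self, message):
        for layer in self.layers:
            message = layer.handle(message)
        return self.methods[message]()


account = LayeredObject(
    methods={"balance": lambda: 100, "close": lambda: "closed"},
    layers=[RenameLayer({"saldo": "balance"}), RejectLayer({"close"})],
)
result = account.send("saldo")  # renamed to "balance" by the outer layer
```

In this sketch only messages received by the object are modelled; messages sent by the object itself would pass the same layers in the opposite direction.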
4 Delegating Compiler Objects
The delegating compiler object (dco) approach aims at modular, extensible and
[...]
Fig. 1. LayOM object
[...] traditional approaches to compiler compilation had difficulty providing
required features due to the lack of modularisation and reuse. The underlying
rationale of dcos is that next to the functional decomposition into a lexer,
parser and code generator, we offer another, structural decomposition dimension
that can be used to decompose a compiler into a set of subcompilers. Rather
than having a single compiler consisting of a lexer, parser and code generator,
an input text can be compiled by a group of compiler objects that cooperate to
achieve their task. Each compiler object consists of one or more lexers, one or
more parsers and a parse graph. A compiler object, when detecting that a
particular part of the syntax is to be compiled, can instantiate a new compiler
object and delegate the compilation of that particular part to the new compiler
object.
The delegating compiler object concept makes use of parser delegation and
lexer delegation for achieving the structural decomposition of the grammar and
the lexer specification, respectively. These techniques will be described in
sections 4.2 and 4.3. Section 4.4 discusses the parse graph node objects.
4.1 Delegating Compiler Objects
A delegating compiler object (dco) is a generalisation of a conventional
compiler in that the conventional compiler is used as a component in the dco
approach. In this approach, a compiler object consists of one or more lexers,
one or more parsers and a parse graph.
A consequence of decomposing a compiler into subcompilers is that the syntax
of the compiled language should be decomposed into the main constructs
of the language. Each construct is then compiled by a dco. Each dco can
instantiate other dcos and delegate parts of the compilation to these dcos.
These dcos can, in turn, instantiate other compiler objects and delegate the
compilation to them.
As an example, the structure of the compiler for LayOM is shown in figure 2.
As mentioned, LayOM is an extensible language, requiring its compiler to be
extensible. In addition, the layer types that are part of LayOM have their
individual syntax and semantics, but with considerable overlap. The LayOM
compiler is constructed using dcos since these provide modularisation and reuse
of the compiler specification. The initial compiler object of the LayOM
compiler is the class dco. The parser of the class dco instantiates the other
dcos and delegates control over parts of the compilation process to the
instantiated dcos.
Fig. 2. dco-based LayOM compiler
A dco-based compiler consists, as described, of a set of dcos that cooperate
to compile an input text. Each dco consists of one or more lexers, one or more
parsers and a parse graph, consisting of node objects. A dco can have multiple
parsers or lexers either to modularise the parser or lexer specification for
the dco or to reuse existing parser or lexer specifications. The interaction
between different parsers or lexers is achieved through parser delegation and
lexer delegation, respectively. In figure 3, an overview of a dco-based
compiler is shown. Each dco has a parse graph consisting of node objects. The
classes of the node objects are in the node space and the dco specifies which
node classes it requires.
The concept of delegating compiler objects makes use of parser delegation [3],
lexer delegation and parse graph node objects [6]. These techniques will be
discussed in the following sections.
4.2 Parser Delegation
Parser delegation is a mechanism that allows one to modularise and to reuse
[...] parsers and redirect the input token stream to the instantiated parser.
The instantiated parser will parse the input token stream until it reaches the
end of its syntax specification. It will subsequently return to the
instantiating parser, which will continue to parse from the point where the
subparser stopped. In case of reuse, the designer can specify for a new grammar
specification the names of one or more existing grammar specifications. The new
grammar is extended with the production rules and the semantic actions of the
reused grammar(s), but has the possibility to override and extend reused
production rules and actions.
We define a monolithic grammar as $G = (I, N, T, P)$, where $I$ is the name
of the grammar, $N$ is the set of nonterminals, $T$ is the set of terminals and
$P$ is the set of production rules. The set $V = N \cup T$ is the vocabulary of
the grammar. Each production rule $p \in P$ is defined as $p = (q, A)$, where
$q$ is defined as $q : n \to \alpha$ with $n \in N$ and $\alpha \in V^{*}$, and
$A$ is the set of semantic actions associated with the production rule $q$.
Different from most yacc-like grammar specifications (e.g. [1, 19]), a grammar
in this definition has a name, and the start symbol is simply denoted by the
production called start.
Parser delegation extends the monolithic grammar specification in several
ways to achieve reuse and modularisation. First, one can specify in a grammar
that other grammars are reused. In situations where a designer has to define a
grammar and a related grammar specification exists, one would like to reuse the
existing grammar and extend and redefine parts of it. If a grammar is reused,
all the production rules and semantic actions become available to the reusing
grammar specification. In our approach, reuse of an existing grammar is achieved
by creating an instance of a parser for the reused grammar upon instantiation of
the parser for the reusing grammar. The reusing parser uses the reused parser
by delegating parts of the parsing process to the reused parser.
When modularising a grammar specification, the grammar specification is
divided into a collection of grammar module classes. When a parser object
decides to delegate parsing, it creates a new parser object. The active parser
object parses the input token stream until it is finished and subsequently
returns control to the delegating parser object.
Instead of delegating to a different parser, the parser can also delegate
control to a new compiler object. A new dco is instantiated and the active dco
hands over control to the new dco. The delegated dco compiles its part of the
input syntax. When it is finished, it returns control to the delegating
compiler object.
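The control flow of delegation for modularisation can be sketched in a few lines of Python. This is a minimal, hand-written illustration and not the generated parser code of the approach: the toy language (`class NAME method NAME ... end`) and the names `TokenStream`, `ClassParser` and `MethodParser` are assumptions made only for this example.

```python
class TokenStream:
    """Input token stream shared by the delegating and the delegated parser."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, expected):
        token = self.next()
        if token != expected:
            raise SyntaxError(f"expected {expected!r}, got {token!r}")


class MethodParser:
    """Delegated parser: handles only 'method NAME', then returns control."""

    def parse(self, stream):
        stream.expect("method")
        return ("method", stream.next())


class ClassParser:
    """Delegating parser: handles 'class NAME ... end' and delegates methods."""

    def parse(self, stream):
        stream.expect("class")
        name = stream.next()
        members = []
        while stream.peek() == "method":
            sub_parser = MethodParser()               # instantiate a dedicated parser
            members.append(sub_parser.parse(stream))  # it parses its part of the stream
            # control is back here; parsing continues where the subparser stopped
        stream.expect("end")
        return ("class", name, members)


source = "class Stack method push method pop end".split()
tree = ClassParser().parse(TokenStream(source))
```

Delegation to a new compiler object follows the same pattern, with the difference that the delegated object would bring its own lexers, parsers and parse graph.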
To describe the required behaviour, the production rule of a monolithic parser
has been replaced with a set of production rule types. These production rule
types control the reuse of production rules from reused grammars and the
delegation to parser and compiler objects. The following production rule types
have been defined:
- Overriding: $n : v_1 v_2 \ldots v_m$, where $n \in N$, the set of all
nonterminals, and $v_i \in V$, the vocabulary of the grammar.
All productions $n$ from the reused grammars $G_{reused}$ are excluded from the
grammar specification and only the productions $n$ from the reusing grammar
$G_{reusing}$ are included. This is the overriding production rule type since
it overrides all productions $n$ from the reused grammars.
- Extending: a production $n : v_1 v_2 \ldots v_m$ marked as extending, where
$n \in N$ and $v_i \in V$.
The production rule $n : v_1 v_2 \ldots v_m$, if existing in $G_{reused}$, is
replaced by the specified production rule $n$. The extending production rule
type facilitates the definition of new alternative right hand sides for a
production $n$.
- Delegating: $n\,[id] : v_1 v_2 \ldots v_m$, where $n \in N$ and $v_i \in V$.
The element $id$ must contain the name of a parser class which will be
instantiated, and parsing will be delegated to this new parser. When the
delegated parser has finished parsing, it returns control to the delegating
parser. The results of the delegated parser are stored in the parse graph. When
a right hand side element $v_i$ is used as the identifier, $v_i$ must have a
valid parser class name as its value. The delegating production rule type
initiates delegation to another parser object.
- dco: $n\,[[id]] : v_1 v_2 \ldots v_m$, where $n \in N$, the set of all
nonterminals, and $v_i \in V$, the vocabulary of the grammar.
The element $id$ must contain the name of a delegating compiler object type
which will be instantiated, and the process of compilation will be delegated
to this new compiler object. When the delegated compiler object has finished
compiling its part of the program, it returns the control over the compilation
process to the originating compiler object. The originating compiler object
receives, as a result, a reference to the delegated compiler object, which
contains the resulting parse graph. The delegating parser stores the reference
to the delegated dco in the parse graph using a dco-node. Next to using an
explicit name for $id$, one can also use a right hand side element $v_i$ as the
identifier, in which case $v_i$ must have a valid compiler object class name as
its value. The dco production rule type causes the delegation of the
compilation process to another dco.
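The effect of the overriding and extending rule types on the rule set of a reusing grammar can be sketched as follows. The representation is an assumption made for illustration only: rules are `(nonterminal, right-hand side, action)` triples, and `compose` is a hypothetical helper, not part of any tool described here.

```python
def compose(reused, overriding=(), extending=()):
    """Build the rule table of a reusing grammar.

    Each rule is a (nonterminal, rhs, action) triple, where rhs is a tuple of
    grammar symbols. The result maps nonterminal -> {rhs: action}.
    """
    table = {}
    for n, rhs, action in reused:
        table.setdefault(n, {})[rhs] = action

    # Overriding: all reused productions for n are excluded; only the
    # productions given by the reusing grammar remain.
    for n in {n for n, _, _ in overriding}:
        table[n] = {}
    for n, rhs, action in overriding:
        table[n][rhs] = action

    # Extending: a reused production with the same right-hand side is
    # replaced; any other right-hand side becomes a new alternative.
    for n, rhs, action in extending:
        table.setdefault(n, {})[rhs] = action
    return table


reused_rules = [
    ("expr", ("NUM",), "make_number"),
    ("expr", ("expr", "+", "expr"), "make_add"),
    ("stmt", ("expr", ";"), "evaluate"),
]
table = compose(
    reused_rules,
    overriding=[("stmt", ("print", "expr", ";"), "emit_print")],
    extending=[
        ("expr", ("expr", "*", "expr"), "make_mul"),  # new alternative
        ("expr", ("NUM",), "make_checked_number"),    # replaces reused rule
    ],
)
```

After composition, `stmt` has only the overriding alternative, whereas `expr` keeps the reused addition rule, gains the multiplication alternative and has its `NUM` rule replaced.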
In figure 4 the process of parser delegation for modularising purposes is
illustrated. [...] instantiation of a new, dedicated parser object. In (3) the
control over the input token stream has been delegated to the new parser, which
parses its section of the token stream. In (4) the new parser has finished
parsing and it has returned the control to the originating parser. This parser
stores a reference to the dedicated parser as it contains the parsing results.
Note that the lexer and parse graph are not shown for space reasons.
We refer to [3, 6] for a more detailed discussion of parser delegation.
4.3 Lexer Delegation
The lexer delegation concept provides support for modularisation and reuse of
lexical analysis specifications. Especially in domains where applications change
regularly and new applications are often defined, modularisation and reuse are
very important features. Lexer delegation can be seen as an object-oriented
approach to lexical analysis.
A monolithic lexer can be defined as $L = (I, D, R, S)$, where $I$ is the
identifier of the lexer specification, $D$ is the set of definitions, $R$ is the
set of rules and $S$ is the set of programmer subroutines. Each definition
$d \in D$ is defined as $d = (n, t)$, where $n$ is a name,
$n \in \mathcal{N}$, the set of all identifiers, and $t$ is a translation,
$t \in \mathcal{T}$, the set of all translations. Each rule $r \in R$ is
defined as $r = (p, a)$, where $p$ is a regular expression,
$p \in \mathcal{P}$, the set of all regular expressions, and $a$ is an action,
$a \in \mathcal{A}$, the set of all actions. Each subroutine $s \in S$ is a
routine in the output language which will be incorporated in the lexer
generated by the lexer generator. Different from most lexical analysis
specification languages, a lexer specification in our definition has an
identifier, which will be used in later sections to refer to different lexical
specifications.
Lexer delegation, analogous to parser delegation, extends the monolithic lexer
specification to achieve modularisation and reuse. The designer, when defining
a new lexer specification, can specify the lexer specifications that should be
[...] the reusing lexer specification. In a lexical specification, the designer
is able to exclude or override definitions, rules and subroutines. Overriding a
reused definition $d = (n, t)$ is simply done by providing a definition for $n$
in the reusing lexer definition. One can, however, also extend the translation
$t$ for $n$ by adding a [...] behind the name of $d$. Extending a definition is
represented as:
[...]
The result of extending this definition is the following:
[...]
A reused rule $r = (p, a)$ can also be overridden by defining a rule
$r' = (p, a')$, i.e. a rule with the same regular expression $p$. One can
interpret extending a rule in two ways. The first way is to interpret it as
extending the action associated with the rule. The second way is to extend the
regular expression associated with an action. Both types of rule extensions are
supported by lexer delegation. Extending the regular expression is represented
as follows:
[...]
When a lexer specification is modularised, it is decomposed into smaller
modules that contain parts of the lexer specification. One of the modules is
the initial lexer, which is instantiated at the start of the lexing process. The
extensions for lexer modularisation consist of two new actions that can be used
in the action part of rules in the lexer specifications. Lexer delegation thus
occurs in the action part of the lexing rules. The semantics of these actions
are the following:
- Delegate(<lexer-class>): This action is part of the action part of a lexing
rule and is generally followed by a return(token) statement. The delegate
action instantiates a new lexer object of class <lexer-class> and installs the
lexer object such that any following token requests are delegated to the new
lexer object. The delegate action is then finished and the next action in the
action block is executed.
- Undelegate: The undelegate action is also contained in the action part of
a lexing rule. The undelegate action, as the name implies, does the opposite
of the delegate action. It changes the delegating lexer object such that the
next token request is handled by the delegating lexer object and delegation
is terminated. The lexer object does not contain any state that needs to be
stored for future reference, so the object is simply removed after finishing
the action block.
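The delegate and undelegate actions can be pictured as pushing and popping lexer objects on a stack of active lexers. The following Python sketch makes that reading explicit; it is an illustration under assumptions, not the generated lexer code: the `Scanner` class, the word-based input and the two lexer modules are hypothetical.

```python
class Scanner:
    """Routes every token request to the currently active lexer object."""

    def __init__(self, words, initial_lexer):
        self.words = list(words)
        self.pos = 0
        self.active = [initial_lexer]  # the last element handles requests

    def delegate(self, lexer):
        self.active.append(lexer)      # following requests go to the new lexer

    def undelegate(self):
        self.active.pop()              # the delegating lexer takes over again

    def next_token(self):
        if self.pos >= len(self.words):
            return None
        word = self.words[self.pos]
        self.pos += 1
        return self.active[-1].match(word, self)


class CommentLexer:
    """Hypothetical lexer module: everything up to '*/' is comment text."""

    def match(self, word, scanner):
        if word == "*/":
            scanner.undelegate()       # action part: undelegate, then return
            return ("COMMENT_END", word)
        return ("COMMENT_TEXT", word)


class MainLexer:
    """Hypothetical initial lexer, instantiated at the start of lexing."""

    def match(self, word, scanner):
        if word == "/*":
            scanner.delegate(CommentLexer())  # action part: delegate, then return
            return ("COMMENT_START", word)
        if word.isdigit():
            return ("NUMBER", word)
        return ("IDENT", word)


scanner = Scanner("x /* 42 */ 42".split(), MainLexer())
tokens = []
while (token := scanner.next_token()) is not None:
    tokens.append(token)
```

The same word `42` is classified differently depending on which lexer object is active, which is the point of modularising the lexical specification.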
4.4 Parse Graph Node Objects
In the delegating compiler object approach, an object-oriented, rather than a
functional approach, is taken to parse tree and code generation. Instead of
using passive data structures as the nodes in the parse tree, as was done in
the conventional approach, the dco approach uses objects as nodes. A node
object is instantiated by a production rule of the parser. Upon instantiation,
the node object also receives a number of arguments which it uses to initialise
itself. Another difference from traditional approaches is that, rather than
having a separate code generation function using the parse tree as data, the
node objects themselves contain knowledge for generating the output code
associated with their semantics.
A parse graph node object, or simply node object, contains three parts of
functionality. The first is the constructor method, which instantiates and
initialises a new instance of the node object class. The constructor method is
used by the production rules of the parser to create new nodes in the parse
graph. The second part is the code generation method, which is invoked during
the generation of output code. The third part consists of a set of methods that
are used to access the state of the node object, e.g. the name of an identifier
or a reference to another node object.
The grammar has facilities for parse graph node instantiation. An example
production rule could be the following:
[...]
The parse graph, generally, consists of a large number of node objects. There
is a root object that represents the point of access to the parse graph. When
the compiler decides to generate code from the parse graph, it sends a code
generation message to the root node object. The root node object will generate
some code and subsequently invoke its children parse nodes with a code
generation message. The children parse nodes will generate their code and
invoke all their children.
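The recursive code generation described above can be sketched with a handful of node classes. This Python fragment is only an illustration: the classes `Number`, `Add` and `Assign`, the method name `generate` and the C-like output are assumptions and do not reproduce the node classes of the LayOM compilers.

```python
class Node:
    """Parse graph node object: holds its state and emits its own code."""

    def generate(self):
        raise NotImplementedError


class Number(Node):
    def __init__(self, value):   # constructor used by a production rule
        self.value = value

    def generate(self):          # code generation method
        return str(self.value)


class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def generate(self):
        # A node generates its own code and invokes its children.
        return f"({self.left.generate()} + {self.right.generate()})"


class Assign(Node):
    def __init__(self, name, expr):
        self.name, self.expr = name, expr

    def target(self):            # state access method
        return self.name

    def generate(self):
        return f"{self.name} = {self.expr.generate()};"


# The parser's production rules would build this graph; here it is built by hand.
root = Assign("x", Add(Number(1), Number(2)))
code = root.generate()           # message sent to the root node object
```

Because each node class carries its own `generate` method, adding a language construct means adding a node class rather than editing a central code generation function.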
5 Tool Set
In order to be able to experiment with the concepts described in this paper,
an integrated tool, phest, has been developed. The phest tool provides the
functionality of the dco approach. It incorporates two previously developed
tools, i.e. the extended versions of yacc and lex that implement parser
delegation and lexer delegation. Another tool that is part of phest supports
the definition of parse graph node classes. The phest tool itself facilitates
the definition of a compiler by composing a cooperating set of dcos that, when
combined, provide the required functionality. A second aspect of the phest tool
is the composition of compiler objects based on the available lexers, parsers
and parse graph node classes.
In figure 5, the user interface of the phest tool is shown. A designer using
[...] can be opened and worked on using the tool. In the left subwindow, the
list of dcos contained in the current project (or compiler) is shown. One dco
is selected and, below the aforementioned window, information is shown on the
name of the grammar and lexer used by that dco. In this case, the grammar and
the lexer have the same name as the dco. In the upper right window, the list of
grammars contained in the project is shown. Note that the number of grammars
is larger than the number of dcos. The reason is that some grammars are used as
abstract classes, i.e. only used for reuse, but never instantiated in a dco.
The lower right window shows the lexers defined within the project.
When satisfied with the configuration, the designer can request the phest
tool to generate an executable compiler. For pragmatic reasons, phest makes use
of yacc for the actual parser generation. This is done by first converting a
grammar expressed in the extended grammar definition syntax for parser
delegation to an equivalent yacc specification. This specification is converted
[...] treated in an analogous fashion. For the implementation details of the
phest tool, we refer to [6, 18].
The phest tool has been used to construct two compilers for the LayOM object
model discussed in section 3. One compiler generates C++ output code for the
Sun Solaris environment. The second compiler generates Neuron C output code
for the LonWorks environment. The students that built the compilers noted that
the modularisation and reusability provided by the dco approach indeed
simplified compiler development.
6 Related Work
In [10, 11] a different approach to language engineering, TaLE, is presented.
Rather than using a meta-language like lex or yacc for specifying a language,
the user edits the classes that make up the implementation using a specialised
editor. TaLE is not immediately intended for the implementation of traditional
programming languages, but primarily for the implementation of languages that
have more dynamic characteristics, like application-oriented languages. Reuse is
one of the key requirements in TaLE and it is supported in three ways: first,
language structures are implemented by independent classes, leading to a
distributed implementation model; second, general language concepts can be
specialised for particular languages; third, the system supports a library of
standard language components.
The TaLE approach is different from the delegating compiler object (dco)
approach in at least two aspects. First, TaLE does not make use of
metalanguages like lex and yacc, whereas the dco approach took these
metalanguages as a basis and extended them. This property makes it more
difficult to compare the two approaches. Second, the classes in TaLE used for
language implementation seem only to be used for language parts at the level of
individual production rules, whereas dcos are particularly intended for,
possibly small, groups of production rules representing a major concept in the
language.
In [2] a mechanism for reuse of grammar specifications, grammar inheritance,
is described. It allows a grammar to inherit production rules from one or more
predefined grammars. Inherited production rules can be overridden in the
inheriting grammar, but exclusion of rules is not supported. Although
inheritance offers a mechanism to reuse existing grammar specifications, no
support for modularising a grammar specification is offered. Therefore, for
purposes of modularising a large grammar specification, we are convinced that
delegation is a better mechanism than inheritance. The rationale for this is
that delegation allows one to separate a grammar specification at the object
level, whereas inheritance would still require the definition of a monolithic
parser, although being composed of inherited grammar specifications. Also,
delegation offers a uniform mechanism for both reuse and modularisation of
grammar specifications.
In [8], a persistent system for compiler construction is proposed. The
approach [...] modules have a type description which is used to determine
whether components can be combined.
The approach proposed in [8] is different from the dco approach in the
following aspects. First, although the approach enhances the traditional
compiler modularisation, modularisation and reuse of individual modules, e.g.
the grammar specification, is not supported. Secondly, judging from the paper,
it does not seem feasible to have multiple compilers cooperating on a single
input specification, as in the dco approach.
The Mjølner Orm system [13, 14, 15] is an approach to object-oriented
compiler development that is purely grammar-driven. Different from the
traditional grammar-driven systems that generate a language compiler from the
grammar, Orm uses grammar interpretation. The advantage of the interpretive
approach is that changes to the grammar are immediately incorporated in the
language. A grammar in Orm is represented as an object and consists of four
parts: an abstract grammar defining the structure of the language; a concrete
grammar that defines the textual presentation of the language constructs; a
semantic grammar that defines the static semantics; and a code-generation
grammar that translates the language into an intermediate language. Orm may be
used to implement an existing language or for language prototyping, e.g. for
application-domain specific languages.
Although the researchers behind the Orm system do recognise the importance
of grammar and code reuse, see e.g. [16], this is deferred to future work.
The Orm system addresses the complexity of language implementation through,
among others, the decomposition of a grammar specification into an abstract
and concrete grammar. Extensibility and reusability are not addressed. Thus,
there are several differences between the Orm approach and the dco approach.
First, Orm takes the grammar-interpretive approach, whereas dcos extend the
conventional generative approach. Second, a language implementation can be
decomposed into multiple dcos, whereas an equivalent Orm implementation would
consist of a single abstract, concrete, etc. grammar, even when the size of the
language implementation would justify a structural decomposition. Furthermore,
the goals of the Orm system and the dco approach are quite different. The Orm
system aims at an interactive, incrementally compiling environment, whereas
dcos aim at improving the modularity and reusability of the traditional
compiler construction techniques. It is thus difficult to compare these
approaches.
7 Conclusion
The requirements on compiler construction techniques are changing due to
certain trends that one can recognise, e.g. application domain languages, fourth
generation languages and extensible language models. Due to this, the
traditional, functional approach to compiler construction has proven
insufficient [...] subcompilers, in addition to the traditional functional
decomposition into a lexer, parser and code generator, is required. In this
article, the notion of delegating compiler objects (dcos) is proposed as a
solution. The major difference with a traditional compiler is that an input
text, in the dco approach, is compiled by a cooperating group of compiler
objects rather than by a single, monolithic compiler. The result is a compiler
that is much more modular, flexible and extensible than the conventional,
monolithic compiler. The dco approach is based on the ability of compiler
objects to instantiate new compiler objects and delegate the compilation of
pieces of the input syntax to these specialised compiler objects.
The delegating compiler object approach builds on the parser delegation and
lexer delegation techniques. Traditional parsing approaches suffer from problems
related to complexity, extensibility and reusability. Parser delegation offers
an object-oriented approach to parsing which does not suffer from these
problems. The parser specification syntax has been extended to support reuse
and modularisation. Each grammar has a name and possibly a list of grammars it
reuses. The production rule syntax has been extended to support extension and
overriding of reused production rules.
Traditional lexing approaches, analogous to conventional parsing approaches,
suffer from problems related to complexity, flexibility, extensibility and
reusability. Lexer delegation is proposed as an object-oriented solution to
these problems that facilitates modularisation and reuse of lexical analysis
specifications. The syntax for lexical analysis specifications has been extended
with elements for the specification of modularisation and reuse and for the
extension and overriding of reused specifications.
The contribution of delegating compiler objects, parser delegation and lexer
delegation is that these techniques comprise a novel approach to compiler
construction which supports structural decomposition and reuse of existing
specifications. The complexity, maintainability, extensibility and reusability
of the resulting compiler are significantly better than with the conventional
approaches to compiler development.
Acknowledgements
Many thanks to the students that were involved in the construction of the phest
tool and the extended yacc and lex tools.
References
1. A.V. Aho, R. Sethi, J.D. Ullman, Compilers: Principles, Techniques, and Tools,
Addison-Wesley Publishing Company, March 1986.
2. M. Aksit, R. Mostert, B. Haverkort, Compiler Generation Based on Grammar
Inheritance, Technical Report, Department of Computer Science, University
of Twente, February 1990.
3. J. Bosch, Parser Delegation - An Object-Oriented Approach to Parsing, in
Proceedings [...]
4. J. Bosch, Relations as First-Class Entities in LayOM, accepted for publication
in Journal of Programming Languages, 1995.
5. J. Bosch, Abstracting Object State, submitted to Object Oriented Systems,
December 1994.
6. J. Bosch, Layered Object Model - Investigating Paradigm Extensibility,
Thesis (in preparation), Department of Computer Science, Lund University,
October 1995.
7. J. Bosch, Language Support for Design Patterns, to be published in proceedings
of TOOLS Europe '96.
8. A. Dearle, Constructing Compilers in a Persistent Environment, Technical
Report, Computational Science Department, University of St. Andrews.
9. B. Fischer, C. Hammer, W. Struckmann, ALADIN: A Scanner Generator for
Incremental Programming Environments, Software - Practice & Experience,
Vol. 22, November 1992.
10. E. Järnvall, K. Koskimies, Language Implementation Model in TaLE, Report,
Department of Computer Science, University of Tampere, 1993.
11. E. Järnvall, K. Koskimies, M. Niittymäki, Object-Oriented Language
Engineering with TaLE, to appear in Object Oriented Systems, 1995.
12. M.E. Lesk, Lex - A Lexical Analyzer Generator, Computing Science Technical
Report, AT&T Bell Laboratories, Murray Hill, 1975.
13. B. Magnusson, M. Bengtsson, L.-O. Dahlin, G. Fries, A. Gustavsson, G. Hedin,
S. Minör, D. Oscarsson, M. Taube, An Overview of the Mjølner/Orm Environment:
Incremental Language and Software Development, Report, Department of
Computer Science, Lund University.
14. B. Magnusson, The Mjølner Orm system, in: Object-Oriented Environments -
The Mjølner Approach, J. Lindskov Knudsen, M. Löfgren, O. Lehrmann Madsen,
B. Magnusson (eds.), Prentice Hall, 1994.
15. B. Magnusson, The Mjølner Orm architecture, in: Object-Oriented
Environments - The Mjølner Approach, J. Lindskov Knudsen, M. Löfgren,
O. Lehrmann Madsen, B. Magnusson (eds.), Prentice Hall, 1994.
16. S. Minör, B. Magnusson, Using Mjølner Orm as a Structure-Based Meta
Environment, to be published in Structure-Oriented Editors and Environments,
L. Neal and G. Szwillus (eds).
17. V. Paxson, Flex - Manual Pages, Public Domain Software.
18. Pheasant student project, Pheasant project documentation, University of
Karlskrona/Ronneby, 1995.
19. Sun Microsystems Inc., Yet Another Compiler Compiler, Programming Utilities
and Libraries, Solaris SMCC Release A AnswerBook, June 1992.
This article was processed using the LaTeX macro package with LLNCS style