An Object-Oriented Approach to Parsing
Jan Bosch*
Department of Computer Science and Business Administration
University of Karlskrona/Ronneby
S-372 25, Ronneby, Sweden
E-mail: Jan.Bosch@ide.hk-r.se
WWW: http://www.pt.hk-r.se/~bosch
Abstract
Conventional grammar specification and parsing is generally done in a
monolithic manner, i.e. the syntax and semantics of a grammar are
specified in one large specification. Although this might be sufficient
in static environments, a modular approach is required in situations
where the syntax or semantics of a grammar specification are subject to
frequent changes. The problems with monolithic grammars are related to
(1) dealing with the complexity, (2) extensibility and (3) reusability.
We propose the concept of parser delegation as a solution to these
problems. Parser delegation allows one to modularise and reuse grammar
specifications. To achieve this, the notion of a production rule is
specialised into (1) overriding, (2) extending and (3) delegating
production rule types. To experiment with parser delegation, we have
developed D-yacc, a graphical tool for defining grammars. Parser
delegation has been applied for constructing a translator for an
experimental language and is currently applied in other domains.
* This work has been supported by the Blekinge Forskningsstiftelse.
1 Introduction
Traditionally, compiler constructors have taken a functional approach to
parsing program text. Generally, the process consists of a lexical
analyser, converting program text into a token stream, and a parser,
converting the token stream into a parse tree. Both the lexical analyser
and the parser are monolithic entities in that only a single instance of
each exists in an application. Although this might be adequate for static
environments, a more modular approach is required in cases where the
syntax or semantic actions are subject to changes, or when one can
recognise a clear partitioning in the grammar for parts of the input
text.
The monolithic approach to grammar specification is becoming increasingly
problematic due to at least two trends one can recognise. First, the
emergence of special purpose languages. Whereas previously applications
were built using one of the few general purpose languages, nowadays we
often see that specialised, application (domain) specific languages are
used. Examples of this can be found in the fourth generation development
environments, e.g. Paradox^1, and formal specification environments,
e.g. SDL. Each of these environments defines its own language as an
interface to the user. The development of these special purpose languages
calls for modularisation and reuse of grammar specifications. A second
trend is the use of grammars to obtain structured input from
^1 Paradox is a trademark of Borland International, Inc.
phone exchange, the user can request services by dialing the digits
associated with a service. Example services are follow-me and
tele-conference. The digits are parsed by a parser and the requested
service is activated for the user. A second example is the use of
intermediate files to store application data. The retrieval of the data
requires a parser to parse the file to recreate the stored structures.
The described trends would benefit from a means to modularise and reuse
grammar specifications as it would reduce the effort put into the
construction of parsers.
In this paper we describe parser delegation, a novel mechanism that
allows one to modularise a syntax specification into multiple parsers.
The base parser can instantiate other parsers and redirect the input
token stream to the instantiated parser. The instantiated parser will
parse the input token stream until it reaches the end of its syntax
specification. It will subsequently return to the instantiating parser,
which will continue to parse from the point where the subparser stopped.
The parser delegation mechanism is not only used for modularisation, but
also for reusing existing grammar specifications. The designer can
specify, when defining a new grammar specification, that one or more
existing grammar specifications are reused. The new grammar is extended
with the production rules and the semantic actions of the reused grammar,
but has the possibility to override and extend reused production rules
and actions.
To evaluate the mechanism, we constructed a tool, D-yacc, for specifying
grammars and grammar delegation and reuse. D-yacc uses an extended yacc
specification syntax. The reason to use yacc is largely pragmatic. It is
available on numerous machines and using yacc allowed us to concentrate
on applying and evaluating parser delegation, rather than implementing
our own parser generator. D-yacc allows the user to work on multiple
grammar specifications simultaneously and to test parsers generated from
the grammars.
To illustrate the parser delegation mechanism, we use the layered object
model (LayOM), the object-oriented research language we are currently
syntax and semantics change frequently, although the changes are
generally small. However, to be able to experiment with the language, we
constructed a translator which translates LayOM program text, e.g. class
descriptions, into C++ code, e.g. C++ class descriptions. In LayOM an
object is encapsulated by so-called layers. These layers extend the
behaviour of the class in many ways. Each layer type defines a certain
type of functionality, but in general a layer defines a relation with
another object, e.g. an inheritance, a conditional delegation or an
application-defined relation, like works-for, or a constraint, e.g. a
concurrency or real-time constraint, on the behaviour of the object.
However, often a new layer type, e.g. an application-specific relation
type, with its associated syntax and semantics, is introduced or the
semantics or syntax of an existing layer are changed. If we used a
monolithic parser, we would have to change the parser very often.
Instead, we define a parser for each layer type and instantiate the layer
parsers from within the class parser.
The remainder of this paper is organised as follows. In the next section,
we briefly describe a number of problems with traditional, monolithic
grammar specification techniques. In section 3 we propose a solution to
these problems, i.e. the concept of parser delegation, which is explained
by using the layered object model example. In section 4, D-yacc, the tool
implementing parser delegation, is described and its use is illustrated
by using LayOM examples. Section 5 discusses work that is related to the
parser delegation concept. The last section contains conclusions and a
description of the future work.
2 The Problems of Monolithic Parsers
Traditionally, parsers are constructed as, often large, monolithic
entities. These parsers contain all the production rules for all of the
syntax and all of the semantic information that is required. Below we
describe three problems of these monolithic parsers.
Complexity: [...] grammar, the designer can `get lost' in the grammar.
For instance, when working on one aspect, he makes changes that influence
other parts of the grammar. Because conventional grammars do not provide
a modularisation mechanism, this problem is unavoidable unless one can
decompose a grammar specification into modules.
Extensibility: A problem resulting from the complexity is the
extensibility of a grammar. Software, being a model of a part of the real
world, changes regularly and these changes should be incorporated in a,
preferably, natural manner. However, a monolithic grammar specification
does not provide for extension of the grammar. Basically, extending a
grammar results in editing the original grammar specification. This,
however, easily results in, again, the complexity problems of editing a
grammar specification which was, possibly by someone else, defined some
time ago. It would be preferable to be able to define the extensions to a
grammar separate from the grammar itself.
Reusability: When constructing a new grammar while having a related
grammar available, one would like to reuse the existing grammar and
extend and redefine parts of it. However, as described above, apart from
copy-paste facilities, no mechanisms are available for reusing an
existing grammar specification. Again, it would be advantageous to
separate the reused grammar from the new grammar specifications.
In the categorisation above, we do not claim to be exhaustive in the
problem identification. We primarily make a case for a technique that
allows a designer to modularise, reuse and extend grammar specifications.
Others have also identified one or more of these problems, see e.g.
[Minor 94, Aksit 90]. In this paper we present the concept of parser
delegation as a solution to the identified problems.
3 Parser Delegation
The parser delegation concept, as mentioned before, aims at supporting
modularisation and reuse of grammar specifications. This is important for
systems that are likely to be extended or reused. Conventionally, a
grammar specification was a large monolithic description which was
difficult to extend or reuse. We believe that applying an object-oriented
approach, in particular the concept of parser delegation, to the
specification of grammars and the process of parsing will be very
beneficial. In the following, the layered object model example will be
introduced first. Then the parser delegation concept is described and
illustrated by using this example. Due to space constraints, we
concentrate on grammatical analysis and do not discuss lexical analysis,
nor the semantic actions associated with production rules.
In this paper, we apply bottom-up parsing, rather than top-down parsing.
The reason for this is twofold. First, bottom-up parsing handles a larger
class of grammars than top-down parsing and, second, most software tools
tend to use bottom-up methods. For example, yacc accepts LALR(1)
grammars, but also has disambiguating rules to handle higher-order LALR
grammars. However, the concept of parser delegation does not depend on
bottom-up parsing; it can also be applied to top-down parsing.
3.1 Example: Layered Object Model
The layered object model (LayOM) [Bosch 94b] is an extension^2 of the
conventional object model. The centre of the object consists of instance
variables and methods, like the conventional object model, but the object
is encapsulated by so-called layers. These layers specify additional
functionality of the object, generally in terms of relations with other
objects and constraints on the object itself. Messages sent to or by the
object have to pass the layers. These layers have the ability to
^2 In this paper, we use a simplified version of LayOM. The LayOM object
model supports many additional features, e.g. states, conditions and
dynamic layer creation. However, these aspects are not relevant for the
discussion in this paper.
send to and from the object. But a layer does not have to function in
response to the receipt of a message. It can also be an active object,
sending messages and monitoring the context it is placed in.
As each type of layer has its distinct functionality, it also has its own
specification syntax. Although layers can share large parts of their
specification syntax, most layers add some unique syntax elements. In
figure 1, an example LayOM class definition is shown.
In class WaterTemperatureSensor (see also figure 2) three layers are
defined, i.e. a partial inheritance layer pi_TS, a delegation layer d_WE
and a mutual exclusion layer cc. pi_TS defines a partial inheritance
relation with class TemperatureSensor. The star denotes that it will
inherit all methods from TemperatureSensor, but the third element, i.e.
`calibrate', indicates that the calibrate method is not inherited. Upon
creation of an instance of class WaterTemperatureSensor, the pi_TS layer
will also create an instance of class TemperatureSensor. All messages
sent to the object requesting a method implemented by class
TemperatureSensor will be sent to the instance created by pi_TS. The
second layer, d_WE, delegates messages requesting the methods checkSeal
and calibrateConstant to an external object called WaterEquipment. The
third layer, cc, defines a mutual exclusion constraint of `1', meaning
that not more than one thread is allowed to be active within the object.
The semantics of LayOM are only discussed very briefly here. We refer to
[Bosch 94a, Bosch 94b, Bosch 94c] for a detailed description.
Each layer type has its own, independent syntax and semantics. Often, a
new type of layer is added to the system. Also, the syntax and semantics
of the existing layer types are changed regularly. If the parser for
LayOM were implemented as a monolithic parser, we would have to change
the parser very often, with all the associated problems discussed in
section 2. A mechanism that supports modularisation and reuse of grammar
specifications would be extremely helpful in dealing with the complexity
of constructing and maintaining a parser and code generator for LayOM.
Figure 2: WaterTemperatureSensor object structure
3.2 Parser Delegation Concept
A monolithic grammar can be defined as $G = (I, N, T, P)$, where $I$ is
the name of the grammar, $N$ is the set of nonterminals, $T$ is the set
of terminals and $P$ is the set of production rules. The set
$V = N \cup T$ is the vocabulary of the grammar. Each production rule
$p \in P$ is defined as $p = (q, A)$, where $q$ is defined as
$q : x \to \alpha$ where $x \in N$ and $\alpha \in V^*$ and $A$ is the
set of semantic actions associated with the production rule $p$.
Different from most yacc-like grammar specifications (e.g. [Aho,
Sun 92]), a grammar in this definition has a name, and the start symbol
is simply denoted by the production called start.
The concept of parser delegation allows one to delegate parsing of
sections of the input token stream to parsers dedicated to that part of
the syntax. In the example described in the previous section, the layer
specifications are parsed by dedicated parsers of, respectively, the
PartialInheritance, Delegate and Mutex type. The base parser,
ClassSpecification, instantiates these layer parsers when it has parsed
the layer name and the layer type. Parser delegation can be used in two
ways, i.e. for reusing an existing grammar specification and for
partitioning a large grammar into a collection of smaller grammar
specifications.
    layers
        pi_TS : PartialInheritance(TemperatureSensor, (*), (calibrate))
        d_WE : Delegate(WaterEquipment, (checkSeal, calibrateConstant))
        cc : Mutex(1)        concurrency constraint
    variables
        voltage : Real
        calibrateValue : Real
    methods
        calibrate(p1, ..., pn) returns Boolean
            ... code to calibrate the sensor ...
        readTemperature() returns Integer
            ... code to calculate the temperature ...
    WaterTemperatureSensor

    Figure 1: Example Class WaterTemperatureSensor

A parser can be seen as an object and the input token stream can be
viewed as messages to the object. The right-hand sides of the production
rules are methods with method names equivalent to the associated grammar
specification. If, after the receipt of a token, a production rule can be
executed, the parser object will do so, otherwise the token is placed on
the stack. If a production rule matches, the parser object will (1)
execute the semantic actions associated with the production rule and (2)
subsequently replace the right-hand side elements with the left-hand side
nonterminal. This process of parsing continues until the top-level
production rule has been executed.
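The parser-as-object view described above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the D-yacc implementation: the class name `ParserObject`, the rule table `RULES` and the greedy "reduce whenever a rule matches" strategy are hypothetical, and a real LALR(1) parser would consult parse tables and a lookahead token instead.

```python
class ParserObject:
    """Toy bottom-up parser: tokens arrive as messages, rules act as methods."""

    def __init__(self, rules, start="start"):
        # rules: list of (lhs, rhs, action); rhs is a tuple of symbols and
        # action is called with the values matched by the right-hand side.
        self.rules = rules
        self.start = start
        self.stack = []  # (symbol, value) pairs

    def _reduce_once(self):
        for lhs, rhs, action in self.rules:
            n = len(rhs)
            symbols = tuple(sym for sym, _ in self.stack[-n:])
            if n and symbols == tuple(rhs):
                values = [val for _, val in self.stack[-n:]]
                result = action(*values)          # (1) semantic action
                del self.stack[-n:]
                self.stack.append((lhs, result))  # (2) rhs replaced by lhs
                return True
        return False

    def receive(self, symbol, value):
        # Simplification: the token is pushed first and the stack is then
        # reduced for as long as some rule matches.
        self.stack.append((symbol, value))
        while self._reduce_once():
            pass

    def parse(self, tokens):
        for symbol, value in tokens:
            self.receive(symbol, value)
        if len(self.stack) == 1 and self.stack[0][0] == self.start:
            return self.stack[0][1]
        raise SyntaxError(f"input not reduced to {self.start!r}: {self.stack}")


# Hypothetical rules for a single layer declaration such as "cc : Mutex".
RULES = [
    ("Layer", ("identifier", ":", "identifier"), lambda name, _, kind: (name, kind)),
    ("start", ("Layer",), lambda layer: layer),
]

layer = ParserObject(RULES).parse(
    [("identifier", "cc"), (":", ":"), ("identifier", "Mutex")]
)
print(layer)
```

Keeping the stack and the rules inside one object is what later makes it possible to create several independent parser instances in the same program.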
In this paper, we concentrate on applying the concept of delegation,
rather than inheritance, to achieving reuse in parsing. The rationale for
this is two-fold. First, others have defined inheritance mechanisms for
parsing, and we, therefore, prefer investigating an alternative approach.
Second, and more important, the concept of delegation provides a uniform
framework for both reusing and modularising grammar specifications,
whereas inheritance does not. We elaborate on this in section 5.
3.2.1 Reuse
In situations where a designer has to define a grammar and a related
grammar specification exists, one would like to reuse the existing
grammar and extend and redefine parts of it. If a grammar is reused, all
the production rules and semantic actions become available to the reusing
grammar specification. In our approach, reuse of an existing grammar is
achieved by creating an instance of a parser for the reused grammar upon
instantiation of the parser for the reusing grammar. The reusing parser
uses the reused parser by delegating parts of the parsing process to the
reused parser.
When reusing an existing grammar specification, a crucial aspect is the
ability to override and exclude production rules from the reused grammar.
To achieve this, the reusing parser must control the productions
performed by the reused parser. A second important aspect is that the
reusing grammar is able to extend productions declared at the reused
parser. To allow overriding and extension of reused production rules, two
types of production rule specification are defined. The first is the
overriding specification, denoted by ':', and the second is the extending
specification, denoted by '+:'. The semantics of the production rule
specification types are defined below.
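$n : v_1\ v_2\ \ldots\ v_m$, where $n \in N$ and $v_i \in V$
    All productions $n \to V^*$ from $G_{reused}$ are excluded from the
    grammar specification and only the productions
    $n \to v_1\ v_2\ \ldots\ v_m$ from $G_{reusing}$ are included.

$n \mathrel{+:} v_1\ v_2\ \ldots\ v_m$, where $n \in N$ and $v_i \in V$
    The production rule $n \to v_1\ v_2\ \ldots\ v_m$, if existing in
    $G_{reused}$, is replaced by the specified production rule
    $n \mathrel{+:} v_1\ v_2\ \ldots\ v_m$.

^3 The term `reusing parser' is a shorthand for the parser generated from
the reusing grammar specification. Similarly, `reused parser' refers to
the parser generated from the reused grammar specification.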
The production rule types do not allow the designer to specify exclusion
of production rules in the reused grammar. Exclusion is specified
together with the reuse specification, i.e. when the name of the grammar
is specified. Thus, when a grammar $G_1$ reuses a grammar $G_2$, $G_1$
can define a set $E_2 \subseteq N_2$ such that if $x \in E_2$ then
$x \notin N_1$.
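One way to read the overriding, extending and exclusion semantics together is as a computation of the rule set that is effectively visible to the reusing grammar. The sketch below is a hedged illustration of that reading; the function name `effective_productions`, the dictionary representation and the `MethodList` nonterminal are assumptions made for the example and do not come from D-yacc. Semantic actions are left out, so a "replaced" rule shows up only as an unchanged right-hand side.

```python
def effective_productions(reused, excluded, overriding, extending):
    """Combine reused productions with the rules of the reusing grammar.

    reused, overriding and extending map a nonterminal to a list of
    right-hand sides (tuples of symbols); excluded is a set of nonterminals.
    """
    result = {}
    for lhs, alternatives in reused.items():
        if lhs in excluded:      # excluded in the reuse specification
            continue
        if lhs in overriding:    # ':' drops every reused alternative of lhs
            continue
        result[lhs] = list(alternatives)
    for lhs, alternatives in overriding.items():
        result[lhs] = list(alternatives)
    for lhs, alternatives in extending.items():   # '+:' adds alternatives
        merged = result.setdefault(lhs, [])
        for rhs in alternatives:
            if rhs in merged:    # an identical reused rule is replaced
                merged.remove(rhs)
            merged.append(rhs)
    return result


# Hypothetical Inheritance grammar, extended as in the PartialInheritance case.
INHERITANCE = {
    "Layer": [("(", "ClassDeclaration", ")")],
    "ClassDeclaration": [("identifier",)],
}
partial = effective_productions(
    INHERITANCE,
    excluded=set(),
    overriding={},
    extending={"ClassDeclaration": [("identifier", ",", "MethodList")]},
)
print(partial["ClassDeclaration"])
```

With an entry in `overriding` instead of `extending`, the reused `("identifier",)` alternative would disappear altogether, which is the difference between the two rule types.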
In figure 3 the parser configuration for reuse is shown. The reusing
parser has a reference to the reused parser. The reused parser has
controlled access to the stack of the reusing parser. The access has to
be controlled in order to be able to exclude production rules defined in
the reused parser. As the production rules defined in the reusing grammar
are always tried before the parser attempts to execute the reused
production rules, no additional control for overriding productions is
required.
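The run-time configuration described above, where the reusing parser holds a reference to the reused parser and tries its own rules first, might look roughly as follows. The class `GrammarParser`, its method names and the `Comment` nonterminal are hypothetical; the sketch only models the lookup order and the exclusion filter, not a complete parser, and it looks only one level deep into the reuse chain.

```python
class GrammarParser:
    """Holds the productions of one grammar as (lhs, rhs) pairs."""

    def __init__(self, productions, reused=None, excluded=()):
        self.productions = productions
        self.reused = reused                 # reference to the reused parser
        self.excluded = frozenset(excluded)  # nonterminals hidden from reuse

    def own_match(self, stack, hidden=frozenset()):
        for lhs, rhs in self.productions:
            if lhs in hidden or not rhs:
                continue
            if tuple(stack[-len(rhs):]) == tuple(rhs):
                return lhs, rhs
        return None

    def find_reduction(self, stack):
        # Rules of the reusing grammar are always tried first ...
        found = self.own_match(stack)
        if found is None and self.reused is not None:
            # ... and only then the reused rules, minus the excluded ones.
            found = self.reused.own_match(stack, self.excluded)
        return found


inheritance = GrammarParser(
    [("ClassDeclaration", ("identifier",)), ("Comment", ("text",))]
)
partial = GrammarParser(
    [("ClassDeclaration", ("identifier", ",", "MethodList"))],
    reused=inheritance,
    excluded={"Comment"},
)
print(partial.find_reduction(["identifier", ",", "MethodList"]))
print(partial.find_reduction(["identifier"]))
print(partial.find_reduction(["text"]))
```

Because the reused rules are consulted only when no rule of the reusing grammar applies, no separate bookkeeping is needed for overriding; the exclusion set is the only extra control.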
3.2.2 Modularisation
In the preceding text we described how grammar specifications can be
reused. We will now describe how a grammar specification can be
modularised. When modularising a grammar specification, the grammar
specification is divided into a collection of grammar module classes, of
which one is the base grammar class. A third type of production rule
specification, called delegating production rule, has been defined to
coordinate between modules. When the parser object executes a delegating
production rule, it creates a new parser object. The new parser object is
an instance of the class specified by the production rule. The active
parser object delegates parsing to the new parser object, which will gain
control over the input token stream. The new parser object, now referred
to as the delegated parser, parses the input token stream until it is
finished and subsequently it returns control to the delegating parser
object.
The delegated parser starts with an empty token (or message) queue and an
empty stack. It parses the input token stream from the current location
$l_0$ until it reaches the end of its grammar specification, at which
point it will have reached location $l_1$. The delegating parser
continues parsing at location $l_1$ as if no tokens existed between $l_0$
and $l_1$. The result of the delegating production is the instantiated
parser object.
A delegating production rule is denoted by '[id]', where id refers to the
name of the delegated parser that is instantiated upon the actual
production. One, very useful, type of id is '$i', which refers to the
value of the i-th element at the right-hand side. The semantics of this
production rule type are shown below.

$n\ [id] : v_1\ v_2\ \ldots\ v_m$, where $n \in N$ and $v_i \in V$
    The element id must contain the name of a parser class which will be
    instantiated and parsing will be delegated to this new parser. When
    the delegated parser is finished parsing, it will return control to
    the delegating parser, which stores a reference to the delegated
    parser which contains the parsing result. When '$i' is used as an
    identifier, $v_i$ must have a valid parser class name as its value.
In figure 4 the process of parser delegation for modularising purposes is
illustrated. In (1) a delegating production rule is executed. This
results (2) in the instantiation of a new, dedicated parser object. In
(3) the control over the input token stream has been delegated to the new
parser, which parses its section of the token stream. In (4) the new
parser has finished parsing and it has returned the control to the
originating parser. This parser stores a reference to the dedicated
parser as it contains the parsing results.
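The four steps above can be sketched in a few lines. The example is written in a top-down, hand-coded style purely for brevity (the concept is stated to apply to top-down parsing as well); `TokenStream`, `MutexParser`, `DelegateParser`, `PARSER_CLASSES` and `parse_layer` are hypothetical names, and the argument syntax of the two layer types is deliberately simplified compared with figure 1.

```python
class TokenStream:
    """Shared input: whoever holds the stream controls the parsing position."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expect(self, wanted):
        token = self.next()
        if token != wanted:
            raise SyntaxError(f"expected {wanted!r}, got {token!r}")


class MutexParser:
    """Hypothetical dedicated parser for the section '( <number> )'."""

    def parse(self, stream):
        stream.expect("(")
        self.limit = int(stream.next())
        stream.expect(")")


class DelegateParser:
    """Hypothetical dedicated parser for the section '( <object name> )'."""

    def parse(self, stream):
        stream.expect("(")
        self.target = stream.next()
        stream.expect(")")


PARSER_CLASSES = {"Mutex": MutexParser, "Delegate": DelegateParser}


def parse_layer(stream):
    """Delegating rule in the spirit of: Layer [$3] : identifier ':' identifier."""
    name = stream.next()
    stream.expect(":")
    type_name = stream.next()                # (1) the value playing the role of $3
    if type_name not in PARSER_CLASSES:
        raise SyntaxError(f"no parser class named {type_name!r}")
    delegated = PARSER_CLASSES[type_name]()  # (2) instantiate the dedicated parser
    delegated.parse(stream)                  # (3) it consumes its own section
    return name, delegated                   # (4) control returns; keep the reference


stream = TokenStream(["cc", ":", "Mutex", "(", "1", ")", ";"])
name, mutex = parse_layer(stream)
print(name, mutex.limit, stream.peek())
```

After `parse_layer` returns, the stream position is just past the section consumed by the delegated parser, so the delegating parser continues as if those tokens had never been there. Adding a new layer type then amounts to registering one more class in `PARSER_CLASSES`.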
3.2.3
Here, we use the principle of parser delegation for the same reasons
delegation is used in object-oriented languages: reuse and
modularisation. Although these two goals are sometimes mixed, we have to
separate them clearly in grammar specification. Because of the extended
functionality of a grammar specification in D-yacc, we define a grammar
as $G = (I, R, N, T, P)$ where $I$ is the name of the defined grammar,
$R$ is the set of reused grammars, $N$ is the set of defined
nonterminals, $T$ the set of defined terminals and $P$ the set of defined
productions. $R$ is defined as $R = (r_1, r_2, \ldots, r_n)$, where
$r_i = (I_i, E_i)$. $E_i$ is the set of excluded nonterminals of the
grammar referred to by $I_i$. A grammar $G_d$ defined in D-yacc is
equivalent to a monolithic grammar $G_m$. In figure 5, the relation
between a D-yacc grammar $G_d$ and a monolithic grammar $G_m$ is defined.
Due to space and complexity reasons we have not included the delegated
grammars in the formal definition. The resulting definition is so large
and complex that it would not be supportive to this paper. The causes for
the complexity of expressing delegated grammars also in an equivalent
monolithic parser are the following:
- The production rules of the delegated grammar can only be used in the
  (monolithic) parsing process immediately after a corresponding
  delegating production has been executed.
- Similarly, after the delegated grammar is finished, the production
  rules in the delegated grammar cannot be used any more.
- Also, while the monolithic parser is parsing delegated production
  rules, the production rules originally part of the delegating parser
  are not allowed to be used.
- Lastly, if '$i' is used as the identifier in a delegating production
  rule, the delegated parser class is determined at run-time, i.e. when
  the input text is parsed. Therefore, it is not possible to define an
  equivalent monolithic parser in cases where '$i' is used.
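    $G_d = (I_d, R, N_d, T_d, P_d)$
    $R = (r_1, \ldots, r_n)$
    $r_i = (I_i, E_i)$ where $I_i$ refers to the reused grammar
    $G_i = (I_i, R_i, N_i, T_i, P_i)$
    has an equivalent monolithic grammar $G_m = (I_m, N_m, T_m, P_m)$,
    defined as
    $I_m = I_d$
    $N_m = N_d \cup N'_1 \cup \ldots \cup N'_n$, where
        $N'_i = N_i \setminus E_i$
    $T_m = T_d \cup T_1 \cup \ldots \cup T_n$
    $P_m = P_d \cup P'_1 \cup \ldots \cup P'_n$, where
        $P'_i = \{ p \in P_i \mid p = (q, A),\ q : x \to \alpha,\ x \notin E_i \}$

    Figure 5: A D-yacc grammar and its equivalent monolithic grammar

We believe that the difficulty of specifying an equivalent monolithic
grammar for a D-yacc grammar incorporating delegated grammars is a clear
indication that the expressiveness of D-yacc grammar specifications is
larger than the expressiveness of conventional, monolithic grammar
specifications.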
3.3 Illustrating the Concept of Parser Delegation
In this section we illustrate the concept of parser delegation by using
examples based on the LayOM object model described in section 3.1. We
first show an example of reusing a grammar and subsequently an example of
modularising a grammar specification.
3.3.1 Reuse
We will now illustrate the reuse of an existing grammar specification
with the example in figure 6. The PartialInheritance layer syntax has
much in common with the Inheritance layer syntax, but extends it to allow
exclusion of methods of the inherited class. In the grammar
specifications, first the Inheritance layer syntax is specified and
subsequently the PartialInheritance layer syntax. The latter reuses the
production rules of the former and extends the ClassDeclaration rule with
an additional right-hand side that specifies, next to the class name, the
methods that are to be inherited from the class. This allows the designer
to reuse only a subset of the methods of a class.
3.3.2 Modularisation
Next to using parser delegation for reusing existing grammar
specifications, it can also be used for modularising a grammar
specification. When the base parser executes a production of the
delegating rule type, it instantiates a parser of the specified parser
type. After instantiation the delegating parser stores a reference to the
delegated parser and subsequently hands it the control over the input
token stream. The delegated parser parses the input token stream until it
reaches the end of its grammar specification. It then returns control to
the delegating parser, which continues from the position in the input
token stream where the delegated parser stopped. Note that parser
delegation is transitive in that the delegated parser can in turn
instantiate a new parser and pass the control over the input token stream
to it. In figure 7, the section of the ClassSpecification grammar
specification for parsing layers is shown. In this example, when the
Layer rule matches, the action part is executed, which instantiates and
delegates to the new parser object of type '$3'.
Figure 6: Example of Parser Delegation for Grammar Reuse
    LayerDeclaration : 'layers' Layers
    Layers           : Layer '' Layers
                     | empty
    Layer [$3]       : identifier ':' identifier

    Figure 7: Example of Parser Delegation for Grammar Modularisation
4 Delegating Parser Tool
To investigate and apply the concept of parser delegation in real
applications, we have developed D-yacc^4, a graphical tool for specifying
grammars that can reuse from and delegate to other grammars. For
pragmatic reasons, we decided to make use of yacc for the actual parser
generation. By doing this we were not required to build a parser
generator, but could focus our effort on constructing a translator and a
C++ preprocessor. For space reasons, we do not discuss the lexical
analysis, nor the semantic actions, but concentrate on the grammatical
analysis.
^4 D-yacc is currently a prototype tool, but we plan to improve it and,
perhaps, make it publicly available.
D-yacc can be seen as being composed of three parts:

User Interface: The user interface allows the grammar designer to work on
the definition of grammar specifications. Grammars are specified in the
D-yacc syntax, which is an extension of the yacc specification syntax.
The full specification of the D-yacc syntax can be found in appendix A.
The tool allows the designer to work with multiple grammars and to test a
generated parser using test input and examining the resulting test
output.

D-yacc to yacc Translator: The grammars are specified by the designer
using the D-yacc syntax. As we make use of yacc, the input grammar has to
be compliant with it. The translator converts a D-yacc grammar
specification into a yacc grammar specification. For pragmatic reasons,
i.e. to be able to make use of yacc, the reused grammars and the selected
grammar are converted into one yacc grammar specification, rather than
implementing it as described in section 3.2.1. Note, however, that this
makes no difference from the user's perspective as the generation is done
automatically. This approach, therefore, does not suffer from the
problems described in section 2. Delegated grammars for modularisation
are implemented as described in section 3.2.2.

C++ preprocessor: yacc generates a C++ file containing the parser, but it
uses predefined function names. This predefined naming will cause name
conflicts when we have multiple parsers in one application. Therefore, we
preprocess the C++ files to rename all potentially conflicting names,
e.g. foo, into unique identifiers.
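The renaming step of the third part can be approximated with a simple textual substitution. The sketch below is an assumption about how such a preprocessor could work, not a description of the D-yacc preprocessor itself: the function name `rename_yacc_identifiers` and the per-grammar prefixes are hypothetical. It relies only on the convention that the names predefined by yacc begin with `yy` or `YY` (for example `yyparse`, `yylex`, `YYSTYPE`).

```python
import re


def rename_yacc_identifiers(source, prefix):
    """Replace the 'yy'/'YY' prefix of identifiers with a per-grammar prefix."""

    def substitute(match):
        head, rest = match.group(1), match.group(2)
        # Keep the upper-case convention for macro-like names such as YYSTYPE.
        new_head = prefix.upper() if head == "YY" else prefix
        return new_head + rest

    # \b anchors the match at the start of an identifier, so names that
    # merely contain "yy" somewhere in the middle are left untouched.
    return re.sub(r"\b(yy|YY)(\w*)", substitute, source)


generated = "int yyparse(void) { return yylex(); }"
print(rename_yacc_identifiers(generated, "mutex_"))
```

A purely textual rewrite like this also touches string literals and comments, so a production-quality preprocessor would need to tokenise the generated source first.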
In figure 8, the conceptual organisation of D-yacc and the components it
uses in its course of operation are shown. We will now describe what
happens if a user decides to test a grammar. The D-yacc environment
generally contains multiple grammars which have some relation with each
other. The selected grammar is analysed to determine what grammars it
uses. The selected grammar and all reused grammars are converted into one
yacc grammar specification. Each grammar used for modularisation
purposes, specified in D-yacc syntax, is translated into a corresponding
yacc specification. Each yacc specification is converted into a C++
program by yacc. The D-yacc preprocessor will convert the resulting C++
program into an equivalent C++ program in which all yacc-specific
identifiers are renamed to unique names. This last conversion is required
to have multiple parsers coexisting in a C++ application. The translator
has also generated a makefile at the generation of the yacc
specification. When it has converted, generated and preprocessed all the
required grammar parsers, D-yacc will start the make tool to compile and
link all necessary files and the result is an executable program
containing all the specified parsers. The user can now test the grammar
by typing text into the input text window and starting the executable
program. The [...] user interface. In figure 9 the user interface of the
tool is shown.
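The analysis step, determining which grammars end up in which yacc specification, can be sketched as a small graph traversal. The function `yacc_units` and the two tables `reuses` and `delegates` are hypothetical; in particular, the table of delegated grammars is assumed to be known up front, which does not hold when the delegated parser class is only determined at run-time.

```python
def yacc_units(selected, reuses, delegates):
    """Map each parser to build to the grammars merged into its yacc file.

    reuses[g] lists the grammars reused by g (merged into the same file);
    delegates[g] lists the grammars g delegates to (built as own parsers).
    """
    units = {}
    pending = [selected]
    while pending:
        root = pending.pop()
        if root in units:
            continue
        merged, todo = [], [root]
        while todo:                    # transitive closure over 'reuses'
            grammar = todo.pop()
            if grammar not in merged:
                merged.append(grammar)
                todo.extend(reuses.get(grammar, []))
        units[root] = merged
        for grammar in merged:         # delegated grammars get own parsers
            pending.extend(delegates.get(grammar, []))
    return units


# Relations taken from the LayOM example; the tables themselves are illustrative.
reuses = {"PartialInheritance": ["Inheritance"]}
delegates = {"ClassSpecification": ["PartialInheritance", "Delegate", "Mutex"]}
for parser, grammars in yacc_units("ClassSpecification", reuses, delegates).items():
    print(parser, "<-", grammars)
```

Each key of the result corresponds to one yacc run followed by one renaming pass, and the makefile would list one object file per key.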
5 Related Work
The need for grammar reuse and modularisation has been recognised by
several researchers. In [Minor 94], the authors mention the importance of
grammar reuse and draw the analogy between code reuse and grammar reuse,
but they defer this to future work. Instead they try to deal with the
complexity of grammar specification through the decomposition of a
grammar specification into an abstract and a concrete grammar.
Extensibility and reusability are not addressed.
In the field of attribute grammars, object-oriented concepts are, among
others, applied by [Hedin 89, Grosch 90, Kastens 92]. However, the
concepts are, in general, not applied to reuse and/or modularise grammar
specifications, but to deal with attributes and attribute computation.
Hedin [Hedin 89] describes an object-oriented notation for attribute
grammars where sub(child)productions are specified as subtypes of their
super(parent)productions. This results in the grammar specification being
represented analogous to a classification hierarchy. Although this
reduces the complexity of the grammar specification, little support is
offered for modularising, extending or reusing an existing grammar
specification. A similar approach is taken in [Grosch 90], where
attributes and attribute computations are inherited from the supertype.
Kastens and Waite [Kastens 92] take a slightly different approach by
defining attribution modules containing abstracted semantics, i.e.
attributes and attribute computations, that can be reused.
One approach to reuse of grammar specifications is grammar inheritance.
In [Aksit 90] a mechanism for grammar inheritance is described. It allows
a grammar to inherit production rules from one or more predefined
grammars. Inherited production rules can be overridden in the inheriting
grammar, but exclusion of rules is not supported. Although inheritance
offers a mechanism to reuse existing grammar specifications, no support
for modularising a grammar specification is offered. Therefore, for
purposes of modularising a large grammar specification, we are convinced
that delegation is a better mechanism than inheritance. The rationale for
this is that delegation allows one to separate a grammar specification at
the object level, whereas inheritance would still require the definition
of a monolithic parser, although being composed of inherited grammar
specifications. Also, delegation offers a uniform mechanism for both
reuse and modularisation of grammar specifications.
An approach to facilitate evolutionary parser development is described in
[Hucklesby 89]. They use a parsing library with classes that represent
nodes in the grammar specification. Extensibility is supported by
inserting intermediate superclasses of nodes. It, however, does not
support grammar decomposition, nor does it facilitate reuse of grammar
specifications as a whole. Another approach, based on the aforementioned
work, is described in [Grape 92] where the authors aim at a high degree
of separation between syntax and semantics as a means to support
extensibility. This is achieved by modelling syntax trees separately from
parse trees and by defining actions that convert the parse tree into a
syntax tree. The syntax tree is supposed to be more stable than the parse
tree, because the latter changes for every grammar change whereas the set
of keywords and operators, which makes up the syntax tree, tends to be
more stable. The work of [Grape 92] does not provide means for
modularising, reusing or extending grammar specifications.
In a way, one could view a preprocessor, e.g. the C++ preprocessor, as a
means to modularise and extend an existing grammar. A preprocessor can
process the input text and replace parts of the input text with text that
can be parsed by the parser. We consider this solution to be inferior to
parser delegation for, among others, the following reasons:
- It requires the semantics of the preprocessed input text to be
  expressible in terms of the grammar on which the parser is based.
- It provides no means for reusing, overriding or extending parts in the
  parser.
- It does not allow for changing the semantic actions, e.g. the way the
  parse tree is constructed, in the parser.
To the best of our knowledge, no approaches for
modularising grammar specifications have been
defined. Also, we believe the application of the
concept of delegation to the reuse and extension
of grammar specifications to be novel.
Conclusion and Future Work
The traditional, monolithic approach to grammar
specification has a number of problems when
the syntax and semantics are subject to frequent
changes. These problems are related to (1) dealing
with the complexity of a large grammar
specification, (2) the difficulty of extending a grammar
specification and (3) the impossibility of reusing a
grammar specification in a satisfying manner. In
this paper we have proposed parser delegation, a
novel concept for reusing and modularising
grammar specifications. The concept of parser
delegation provides a solution to the problems
associated with conventional, monolithic grammar
specification.
To investigate and apply parser delegation in
real applications, we have developed D-yacc, a
graphical tool for specifying grammars that can
reuse, and delegate to, other grammars. For
pragmatic reasons, this tool converts a grammar
specification in D-yacc into a yacc grammar
specification. D-yacc modifies the C code
generated by yacc to allow multiple parsers in
a single C application. D-yacc is currently
a restricted prototype, but we plan to improve it
and, perhaps, make it publicly available.
We have used parser delegation in the LayOM
translator. Currently, parser delegation is
being applied in the telecommunications domain.
A modern phone exchange offers many
services, which are requested by dialing digits, the
* and the #. These services change over time and,
if the parsing of the service requests were
done by a monolithic parser, this parser would
need to be updated regularly. By configuring each
type of service with its own parser, the base parser
can delegate the service requests to the parser
specific to that service type. We intend to apply the
concept of parser delegation in several other
application domains.
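The telecommunications scenario can be sketched as follows. This is a
minimal illustration under stated assumptions, not the system described
above: the request format '*<code>*<body>#', the service codes "21" and
"55", and all class names are hypothetical.

```python
class CallForwardParser:
    """Hypothetical service parser: the request body is the forwarding number."""

    def parse(self, body):
        if not body.isdigit():
            raise ValueError("forwarding number must consist of digits")
        return {"service": "call-forward", "target": body}


class WakeUpParser:
    """Hypothetical service parser: the request body is a time given as HHMM."""

    def parse(self, body):
        if len(body) != 4 or not body.isdigit():
            raise ValueError("time must be four digits (HHMM)")
        return {"service": "wake-up", "hour": int(body[:2]), "minute": int(body[2:])}


class BaseRequestParser:
    """Recognises '*<code>*<body>#' and delegates <body> to the parser for <code>."""

    def __init__(self):
        self.delegates = {}

    def register(self, code, parser):
        # Adding or replacing a service does not touch the base parser's logic.
        self.delegates[code] = parser

    def parse(self, request):
        if not (request.startswith("*") and request.endswith("#")):
            raise ValueError("request must start with '*' and end with '#'")
        code, sep, body = request[1:-1].partition("*")
        if not sep or code not in self.delegates:
            raise ValueError(f"unknown or malformed service request: {request!r}")
        return self.delegates[code].parse(body)


exchange = BaseRequestParser()
exchange.register("21", CallForwardParser())
exchange.register("55", WakeUpParser())

forward = exchange.parse("*21*0457123456#")
wake_up = exchange.parse("*55*0630#")
```

When a service is introduced or changed, only the corresponding delegate is
registered or replaced; the base parser itself stays as it is.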
Acknowledgements
I would like to thank Bertil Ekdahl, Michael Mattson,
Peter Molin, Lennart Ohlsson, Petra Sijtsema
and the anonymous referees for their comments on
earlier versions of this paper, and Jonas Nilsson for
implementing the prototype of the D-yacc tool.
References
[Aho86] A.V. Aho, R. Sethi, J.D. Ullman,
    Compilers: Principles, Techniques, and Tools,
    Addison-Wesley Publishing Company, March 1986.
    Haverkort, Compiler Generation
    Based on Grammar Inheritance,
    Technical Report, Department of Computer
    Science, University of Twente, February 1990.
[Bosch94a] J. Bosch, Paradigm, Language
    Model and Method, submitted
    to the ICSE-17 Workshop on
    Research Issues in the Intersection
    of Software Engineering and
    Programming Languages, November 1994.
    Also Research Report ?/94, Department of
    Computer Science and Business
    Administration, University of
    Karlskrona/Ronneby, November 1994.
[Bosch94b] J. Bosch, Relations as First-Class
    Entities in LayOM, submitted
    to ECOOP '95, November 1994.
    Also Research Report 9/94, Department of
    Computer Science and Business
    Administration, University of
    Karlskrona/Ronneby, November 1994.
[Bosch94c] J. Bosch, Abstracting Object State,
    submitted to Object-Oriented Systems,
    December 1994. Also Research
    Report 10/94, Department of
    Computer Science and Business
    Administration, University of
    Karlskrona/Ronneby, December 1994.
[Grape92] P. Grape, K. Walden, Automating
    the Development of Syntax
    Tree Generators for an Evolving
    Language, Proceedings of Technology
    of Object-Oriented Languages
    and Systems, pp. 1?5-195,
    Santa Barbara, Aug. 1992.
[Grosch90] J. Grosch, Object-Oriented
    Attribute Grammars, Report,
    Project Compiler Generation,
    GMD, August 1990.
[Hedin89] G. Hedin, An Object-Oriented
    Notation for Attribute Grammars,
    European Conference on Object-Oriented
    Programming, pp. 329-345,
    BCS Workshop Series, July 1989.
[Hucklesby89] P. Hucklesby, B. Meyer, The Eiffel
    Object-Oriented Parsing Library,
    Proceedings of Technology
    of Object-Oriented Languages and
    Systems, pp. 501-50?,
    Paris, Nov. 1989.
[Kastens92] U. Kastens, W.M. Waite, Modularity
    and Reusability in Attribute
    Grammars, Technical Report,
    University of Colorado at Boulder,
    September 1992.
[Minor94] S. Minor, B. Magnusson, Using
    Mjølner Orm as a Structure-Based
    Meta Environment, to be published
    in Structure-Oriented Editors
    and Environments, L. Neal
    and G. Swillus (eds), 1994.
[Sun92] Sun Microsystems Inc., Yet Another
    Compiler Compiler, Programming
    Utilities and Libraries,
    Solaris 1.1 SMCC Release A
    AnswerBook, June 1992.