Parser Delegation: An Object-Oriented Approach to Parsing


Jan Bosch*

Department of Computer Science and Business Administration
University of Karlskrona/Ronneby
S-372 25, Ronneby, Sweden
E-mail: Jan.Bosch@ide.hk-r.se
WWW: http://www.pt.hk-r.se/~bosch

Abstract

Conventional grammar specification and parsing is generally done in a monolithic manner, i.e. the syntax and semantics of a grammar are specified in one large specification. Although this might be sufficient in static environments, a modular approach is required in situations where the syntax or semantics of a grammar specification are subject to frequent changes. The problems with monolithic grammars are related to (1) dealing with the complexity, (2) extensibility and (3) reusability. We propose the concept of parser delegation as a solution to these problems. Parser delegation allows one to modularise and reuse grammar specifications. To achieve this, the notion of a production rule is specialised into (1) overriding, (2) extending and (3) delegating production rule types. To experiment with parser delegation, we have developed D-yacc, a graphical tool for defining grammars. Parser delegation has been applied for constructing a translator for an experimental language and is currently applied in other domains.

* This work has been supported by the Blekinge Forskningsstiftelse.

1 Introduction

Traditionally, compiler constructors have taken a functional approach to parsing program text. Generally, the process consists of a lexical analyser, converting program text into a token stream, and a parser, converting the token stream into a parse tree. Both the lexical analyser and the parser are monolithic entities in that only a single instance of each exists in an application. Although this might be adequate for static environments, a more modular approach is required in cases where the syntax or semantic actions are subject to changes, or when one can recognise a clear partitioning in the grammar for parts of the input text.

The monolithic approach to grammar specification is becoming increasingly problematic due to at least two trends. First, the emergence of special purpose languages. Whereas previously applications were built using one of the few general purpose languages, nowadays we often see that specialised, application (domain) specific languages are used. Examples of this can be found in the fourth generation development environments, e.g. Paradox¹, and formal specification environments, e.g. SDL. Each of these environments defines its own language as an interface to the user. The development of these special purpose languages calls for modularisation and reuse of grammar specifications. A second trend is the use of grammars to obtain structured input from

¹ Paradox is a trademark of Borland International, Inc.

phone exchange, the user can request services by dialing the digits associated with a service. Example services are follow-me and tele-conference. The digits are parsed by a parser and the requested service is activated for the user. A second example is the use of intermediate files to store application data. The retrieval of the data requires a parser to parse the file to recreate the stored structures. The described trends would benefit from a means to modularise and reuse grammar specifications, as it would reduce the effort put into the construction of parsers.

In this paper we describe parser delegation, a novel mechanism that allows one to modularise a syntax specification into multiple parsers. The base parser can instantiate other parsers and redirect the input token stream to the instantiated parser. The instantiated parser will parse the input token stream until it reaches the end of its syntax specification. It will subsequently return to the instantiating parser, which will continue to parse from the point where the subparser stopped.

The parser delegation mechanism is not only used for modularisation, but also for reusing existing grammar specifications. The designer can specify, when defining a new grammar specification, that one or more existing grammar specifications are reused. The new grammar is extended with the production rules and the semantic actions of the reused grammar, but has the possibility to override and extend reused production rules and actions.

To evaluate the mechanism, we constructed a tool, D-yacc, for specifying grammars and grammar delegation and reuse. D-yacc uses an extended yacc specification syntax. The reason to use yacc is largely pragmatic. It is available on numerous machines and using yacc allowed us to concentrate on applying and evaluating parser delegation, rather than implementing our own parser generator. D-yacc allows the user to work on multiple grammar specifications simultaneously and to test parsers generated from the grammars.

To illustrate the parser delegation mechanism, we use the layered object model (LayOM), the object-oriented research language we are currently

tax and semantics change frequently, although the changes are generally small. However, to be able to experiment with the language, we constructed a translator which translates LayOM program text, e.g. class descriptions, into C++ code, e.g. C++ class descriptions. In LayOM an object is encapsulated by so-called layers. These layers extend the behaviour of the class in many ways. Each layer type defines a certain type of functionality, but in general a layer defines a relation with another object, e.g. an inheritance, a conditional delegation or an application-defined relation, like works for, or a constraint, e.g. a concurrency or real-time constraint, on the behaviour of the object. However, often a new layer type, e.g. an application-specific relation type, with its associated syntax and semantics, is introduced, or the semantics or syntax of an existing layer are changed. If we used a monolithic parser, we would have to change the parser very often. Instead, we define a parser for each layer type and instantiate the layer parsers from within the class parser.

The remainder of this paper is organised as follows. In the next section, we briefly describe a number of problems with traditional, monolithic grammar specification techniques. In section 3 we propose a solution to these problems, i.e. the concept of parser delegation, which is explained by using the layered object model example. In section 4, D-yacc, the tool implementing parser delegation, is described and its use is illustrated by using LayOM examples. Section 5 discusses work that is related to the parser delegation concept. The last section contains conclusions and a description of the future work.

2 The Problems of Monolithic Parsers

Traditionally, parsers are constructed as often large, monolithic entities. These parsers contain all the production rules for all of the syntax and all of the semantic information that is required. Below we describe three problems of these monolithic parsers.

grammar, the designer can 'get lost' in the grammar. For instance, when working on one aspect, he makes changes that influence other parts of the grammar. Because conventional grammars do not provide a modularisation mechanism, this problem is unavoidable unless one can decompose a grammar specification into modules.

• Extensibility: A problem resulting from the complexity is the extensibility of a grammar. Software, being a model of a part of the real world, changes regularly and these changes should be incorporated in a, preferably, natural manner. However, a monolithic grammar specification does not provide for extension of the grammar. Basically, extending a grammar results in editing the original grammar specification. This, however, easily results in, again, the complexity problems of editing a grammar specification which was, possibly by someone else, defined some time ago. It would be preferable to be able to define the extensions to a grammar separately from the grammar itself.

• Reusability: When constructing a new grammar while having a related grammar available, one would like to reuse the existing grammar and extend and redefine parts of it. However, as described above, apart from copy-paste facilities, no mechanisms are available for reusing an existing grammar specification. Again, it would be advantageous to separate the reused grammar from the new grammar specifications.

In the categorisation above, we do not claim to be exhaustive in the problem identification. We primarily make a case for a technique that allows a designer to modularise, reuse and extend grammar specifications. Others have also identified one or more of these problems, see e.g. [Minor 94, Aksit 90]. In this paper we present the concept of parser delegation as a solution to the identified problems.

3 Parser Delegation

The parser delegation concept, as mentioned before, aims at supporting modularisation and reuse of grammar specifications. This is important for systems that are likely to be extended or reused. Conventionally, a grammar specification was a large monolithic description which was difficult to extend or reuse. We believe that applying an object-oriented approach, in particular the concept of parser delegation, to the specification of grammars and the process of parsing will be very beneficial. In the following, the layered object model example will be introduced first. Then the parser delegation concept is described and illustrated by using this example. Due to space constraints, we concentrate on grammatical analysis and do not discuss lexical analysis, nor the semantic actions associated with production rules.

In this paper, we apply bottom-up parsing, rather than top-down parsing. The reason for this is twofold. First, bottom-up parsing handles a larger class of grammars than top-down parsing and, second, most software tools tend to use bottom-up methods. For example, yacc accepts LALR(1) grammars, but also has disambiguating rules to handle higher-order LALR grammars. However, the concept of parser delegation does not depend on bottom-up parsing; it can also be applied to top-down parsing.

3.1 Example: Layered Object Model

The layered object model (LayOM) [Bosch 94b] is an extension² of the conventional object model. The centre of the object consists of instance variables and methods, like the conventional object model, but the object is encapsulated by so-called layers. These layers specify additional functionality of the object, generally in terms of relations with other objects and constraints on the object itself. Messages sent to or by the object have to pass the layers. These layers have the ability to

² In this paper, we use a simplified version of LayOM. The LayOM object model supports many additional features, e.g. states, conditions and dynamic layer creation. However, these aspects are not relevant for the discussion in this paper.

send to and from the object. But a layer does not have to function in response to the receipt of a message. It can also be an active object, sending messages and monitoring the context it is placed in.

As each type of layer has its distinct functionality, it also has its own specification syntax. Although layers can share large parts of their specification syntax, most layers add some unique syntax elements. In figure 1, an example LayOM class definition is shown.

In class WaterTemperatureSensor (see also figure 2) three layers are defined, i.e. a partial inheritance layer pi_TS, a delegation layer d_WE and a mutual exclusion layer cc. pi_TS defines a partial inheritance relation with class TemperatureSensor. The star denotes that it will inherit all methods from TemperatureSensor, but the third element, i.e. '(calibrate)', indicates that the calibrate method is not inherited. Upon creation of an instance of class WaterTemperatureSensor, the pi_TS layer will also create an instance of class TemperatureSensor. All messages sent to the object requesting a method implemented by class TemperatureSensor will be sent to the instance created by pi_TS. The second layer, d_WE, delegates messages requesting the methods checkSeal and calibrateConstant to an external object called WaterEquipment. The third layer, cc, defines a mutual exclusion constraint of '1', meaning that not more than one thread is allowed to be active within the object. The semantics of LayOM are only discussed very briefly here. We refer to [Bosch 94a, Bosch 94b, Bosch 94c] for a detailed description.
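The message routing described above can be sketched in a few lines. This is only an illustration under stated assumptions, not LayOM itself: the class names `PartialInheritance`, `Delegate` and `LayeredObject`, the `handles` and `send` methods, and the method bodies are all hypothetical, and the mutual exclusion layer is left out.

```python
class TemperatureSensor:
    def readTemperature(self):
        return 21

    def calibrate(self):
        return "inherited calibrate"


class WaterEquipment:
    def checkSeal(self):
        return True

    def calibrateConstant(self):
        return 0.5


class PartialInheritance:
    """Layer that creates its own instance of the inherited class (assumed design)."""

    def __init__(self, inherited_class, excluded):
        self.target = inherited_class()
        self.excluded = set(excluded)

    def handles(self, message):
        return message not in self.excluded and hasattr(self.target, message)


class Delegate:
    """Layer that forwards a fixed set of messages to an external object."""

    def __init__(self, target, messages):
        self.target = target
        self.messages = set(messages)

    def handles(self, message):
        return message in self.messages


class LayeredObject:
    def __init__(self, layers, own_methods):
        self.layers = layers
        self.own_methods = own_methods

    def send(self, message, *args):
        # Every message passes the layers before it reaches the object's centre.
        for layer in self.layers:
            if layer.handles(message):
                return getattr(layer.target, message)(*args)
        return self.own_methods[message](*args)


sensor = LayeredObject(
    layers=[
        PartialInheritance(TemperatureSensor, excluded=["calibrate"]),
        Delegate(WaterEquipment(), ["checkSeal", "calibrateConstant"]),
    ],
    own_methods={"calibrate": lambda: "own calibrate"},
)
```

Here `calibrate` falls through both layers and is answered by the object itself, because the partial inheritance layer excludes it.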

Each layer type has its own, independent syntax and semantics. Often, a new type of layer is added to the system. Also, the syntax and semantics of the existing layer types are changed regularly. If the parser for LayOM were implemented as a monolithic parser, we would have to change the parser very often, with all the associated problems discussed in section 2. A mechanism that supports modularisation and reuse of grammar specifications would be extremely helpful in dealing with the complexity of constructing and maintaining a parser and code generator for LayOM.

Figure 2: WaterTemperatureSensor object structure

3.2 Parser Delegation Concept

A monolithic grammar can be defined as $G = (I, N, T, P)$, where $I$ is the name of the grammar, $N$ is the set of nonterminals, $T$ is the set of terminals and $P$ is the set of production rules. The set $V = N \cup T$ is the vocabulary of the grammar. Each production rule $p \in P$ is defined as $p = (q, A)$, where $q$ is defined as $q : x \to \alpha$ where $x \in N$ and $\alpha \in V^*$, and $A$ is the set of semantic actions associated with the production rule $p$. Different from most yacc-like grammar specifications (e.g. [Aho 86, Sun 92]), a grammar in this definition has a name, and the start symbol is simply denoted by the production called start.
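The definition above maps directly onto a small data structure. The sketch below is an assumption-laden illustration: the class names `Production` and `Grammar`, the `is_well_formed` check, and the example grammar (including the ';' separator) are hypothetical and not taken from D-yacc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    lhs: str             # the left-hand side nonterminal
    rhs: tuple           # a possibly empty sequence of vocabulary symbols
    actions: tuple = ()  # semantic actions, kept opaque in this sketch


@dataclass
class Grammar:
    name: str            # the grammar name
    nonterminals: set    # N
    terminals: set       # T
    productions: list    # P

    def vocabulary(self):
        # V is the union of nonterminals and terminals.
        return self.nonterminals | self.terminals

    def is_well_formed(self):
        # Each left-hand side must be a nonterminal and each
        # right-hand side symbol must belong to the vocabulary.
        vocab = self.vocabulary()
        return all(
            p.lhs in self.nonterminals and all(s in vocab for s in p.rhs)
            for p in self.productions
        )


layer_grammar = Grammar(
    name="LayerDeclaration",
    nonterminals={"start", "Layers", "Layer"},
    terminals={"layers", "identifier", ":", ";"},
    productions=[
        Production("start", ("layers", "Layers")),
        Production("Layers", ("Layer", ";", "Layers")),
        Production("Layers", ()),
        Production("Layer", ("identifier", ":", "identifier")),
    ],
)
```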

The concept of parser delegation allows one to delegate parsing of sections of the input token stream to parsers dedicated to that part of the syntax. In the example described in the previous section, the layer specifications are parsed by dedicated parsers of, respectively, the PartialInheritance, Delegate and Mutex type. The base parser, ClassSpecification, instantiates these layer parsers when it has parsed the layer name and the layer type. Parser delegation can be used in two ways, i.e. for reusing an existing grammar specification and for partitioning a large grammar into a collection of smaller grammar specifications.

A parser can be seen as an object and the input token stream can be viewed as messages to the object. The right-hand sides of the production rules are methods with method names equivalent to the associated grammar specification. If, after the receipt of a token, a production rule can be executed, the parser object will do so, otherwise the token is placed on the stack. If a production rule matches, the parser object will (1) execute the semantic actions associated with the production rule and (2) subsequently replace the right-hand side elements with the left-hand side nonterminal. This process of parsing continues until the top-level production rule has been executed.

    layers
        pi_TS : PartialInheritance(TemperatureSensor, (*), (calibrate))
        d_WE : Delegate(WaterEquipment, (checkSeal, calibrateConstant))
        cc : Mutex(1)    concurrency constraint
    variables
        voltage : Real
        calibrateValue : Real
    methods
        calibrate(...) returns Boolean
            ... code to calibrate the sensor ...
        readTemperature() returns Integer
            ... code to calculate the temperature ...
    end WaterTemperatureSensor

Figure 1: Example class WaterTemperatureSensor
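The parsing process described above, where tokens are placed on a stack and matching right-hand sides are replaced by their left-hand side nonterminal, can be sketched as follows. This is a deliberately naive illustration: it reduces greedily without lookahead, so it is not the LALR algorithm that yacc uses, and the rule table and the `trace` list standing in for semantic actions are assumptions.

```python
# Rules are tried in order; longer right-hand sides come first.
RULES = [
    ("Layer", ("identifier", ":", "identifier")),
    ("Layers", ("Layers", "Layer")),
    ("Layers", ("Layer",)),
]


def parse(tokens, rules):
    stack, trace = [], []
    for token in tokens:
        stack.append(token)  # place the received token on the stack
        reduced = True
        while reduced:       # execute production rules while any matches
            reduced = False
            for lhs, rhs in rules:
                n = len(rhs)
                if n <= len(stack) and tuple(stack[-n:]) == rhs:
                    trace.append(lhs)  # (1) stand-in for the semantic action
                    del stack[-n:]     # (2) replace the right-hand side ...
                    stack.append(lhs)  #     ... with the left-hand side
                    reduced = True
                    break
    return stack, trace


stack, trace = parse(["identifier", ":", "identifier"] * 2, RULES)
```

After the run, the stack holds only the top-level nonterminal `Layers`, and `trace` records the order in which rules were executed.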

In this paper, we concentrate on applying the concept of delegation, rather than inheritance, to achieve reuse in parsing. The rationale for this is twofold. First, others have defined inheritance mechanisms for parsing, and we, therefore, prefer investigating an alternative approach. Second, and more important, the concept of delegation provides a uniform framework for both reusing and modularising grammar specifications, whereas inheritance does not. We elaborate on this in section 5.

3.2.1 Reusing Grammar Specifications

In situations where a designer has to define a grammar and a related grammar specification exists, one would like to reuse the existing grammar and extend and redefine parts of it. If a grammar is reused, all the production rules and semantic actions become available to the reusing grammar specification. In our approach, reuse of an existing grammar is achieved by creating an instance of a parser for the reused grammar upon instantiation of the parser for the reusing grammar. The reusing parser uses the reused parser by delegating parts of the parsing process to the reused parser.

When reusing an existing grammar specification, a crucial aspect is the ability to override and exclude production rules from the reused grammar. To achieve this, the reusing parser³ must control the productions performed by the reused parser. A second important aspect is that the reusing grammar is able to extend productions declared at the reused parser. To allow overriding and extension of reused production rules, two types of production rule specification are defined. The first is the overriding specification, denoted by ':', and the second is the extending specification, denoted by '+:'. The semantics of the production rule specification types are defined below.

• $x : v_1 v_2 \ldots v_n$, where $x \in N$ and $v_i \in V$. All productions for $x$ from the reused grammar are excluded from the grammar specification and only the productions for $x$ from the reusing grammar are included.

• $x +: v_1 v_2 \ldots v_n$, where $x \in N$ and $v_i \in V$. The production rule $x \to v_1 v_2 \ldots v_n$, if existing in the reused grammar, is replaced by the specified production rule.

The production rule types do not allow the designer to specify exclusion of production rules in the reused grammar. Exclusion is specified together with the reuse specification, i.e. when the name of the grammar is specified. Thus, when a grammar $G_1$ reuses a grammar $G_2$, $G_1$ can define a set $E_2 \subseteq N_2$ such that if $x \in E_2$ then $x \notin N_1$.

³ The term 'reusing parser' is a shorthand for the parser generated from the reusing grammar specification. Similarly, 'reused parser' refers to the parser generated from the reused grammar specification.

In figure 3 the parser configuration for reuse is shown. The reusing parser has a reference to the reused parser. The reused parser has controlled access to the stack of the reusing parser. The access has to be controlled in order to be able to exclude production rules defined in the reused parser. As the production rules defined in the reusing grammar are always tried before the parser attempts to execute the reused production rules, no additional control for overriding productions is required.
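The lookup order implied by this configuration can be sketched as below. It is a minimal sketch under assumptions: the class `ParserObject`, its `alternatives` method and the example rules are hypothetical, and an extending rule is modelled as contributing an additional right-hand side that is tried before the reused ones.

```python
class ParserObject:
    def __init__(self, overriding=None, extending=None, reused=None, excluded=()):
        self.overriding = overriding or {}  # rules that hide the reused ones
        self.extending = extending or {}    # rules tried before the reused ones
        self.reused = reused                # reference to the reused parser
        self.excluded = set(excluded)       # nonterminals hidden from reuse

    def alternatives(self, lhs):
        """Return the right-hand sides to try for lhs, own rules first."""
        if lhs in self.overriding:
            return list(self.overriding[lhs])
        inherited = []
        if self.reused is not None and lhs not in self.excluded:
            inherited = self.reused.alternatives(lhs)
        return list(self.extending.get(lhs, [])) + inherited


inheritance = ParserObject(
    overriding={
        "ClassDeclaration": [("identifier",)],
        "Trace": [("trace",)],
    }
)
partial_inheritance = ParserObject(
    overriding={"Methods": [("identifier",)]},
    extending={"ClassDeclaration": [("identifier", "(", "Methods", ")")]},
    reused=inheritance,
    excluded={"Trace"},
)
```

Because the reusing parser answers first and only then consults the reused parser, overriding needs no extra machinery, while exclusion is enforced at the single point where the reused parser is reached.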

3.2.2 Modularising Grammar Specifications

In the preceding text we described how grammar specifications can be reused. We will now describe how a grammar specification can be modularised. When modularising a grammar specification, the grammar specification is divided into a collection of grammar module classes, of which one is the base grammar class. A third type of production rule specification, called the delegating production rule, has been defined to coordinate between modules. When the parser object executes a delegating production rule, it creates a new parser object. The new parser object is an instance of the class specified by the production rule. The active parser object delegates parsing to the new parser object, which will gain control over the input token stream. The new parser object, now referred to as the delegated parser, parses the input token stream until it is finished and subsequently it returns control to the delegating parser object.

The delegated parser starts with an empty token (or message) queue and an empty stack. It parses the input token stream from the current location $l_i$ until it reaches the end of its grammar specification, at which point it will have reached location $l_j$. The delegating parser continues parsing at location $l_j$ as if no tokens existed between $l_i$ and $l_{j-1}$. The result of the delegating production is the instantiated parser object.

A delegating production rule is denoted by '[id]:', where id refers to the name of the delegated parser that is instantiated upon the actual production. One, very useful, type of id is $i, which refers to the value of the i-th element at the right-hand side. The semantics of this production rule type are shown below.

• $x\ [id] : v_1 v_2 \ldots v_n$, where $x \in N$ and $v_i \in V$. The element id must contain the name of a parser class which will be instantiated and parsing will be delegated to this new parser. When the delegated parser is finished parsing, it will return control to the delegating parser, which stores a reference to the delegated parser, which contains the parsing result. When $i is used as an identifier, $v_i$ must have a valid parser class name as its value.

In figure 4 the process of parser delegation for modularising purposes is illustrated. In (1) a delegating production rule is executed. This results (2) in the instantiation of a new, dedicated parser object. In (3) the control over the input token stream has been delegated to the new parser, which parses its section of the token stream. In (4) the new parser has finished parsing and it has returned the control to the originating parser. This parser stores a reference to the dedicated parser as it contains the parsing results.
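The four steps can be sketched with a token stream that is shared between the delegating and the delegated parser. The sketch makes several assumptions: it is written top-down for brevity (the mechanism is stated above to apply to top-down parsing as well), the classes `TokenStream`, `ClassParser`, `MutexParser` and `DelegateParser` are hypothetical, and the layer syntax, including the ';' separator, is simplified.

```python
class TokenStream:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.position = 0

    def peek(self):
        if self.position < len(self.tokens):
            return self.tokens[self.position]
        return None

    def next(self):
        token = self.peek()
        self.position += 1
        return token


class MutexParser:
    def parse(self, stream):
        stream.next()  # "("
        self.result = ("Mutex", int(stream.next()))
        stream.next()  # ")"


class DelegateParser:
    def parse(self, stream):
        stream.next()  # "("
        target = stream.next()
        stream.next()  # ")"
        self.result = ("Delegate", target)


class ClassParser:
    def __init__(self, parser_classes):
        self.parser_classes = parser_classes  # layer type name -> parser class
        self.layers = []

    def parse(self, stream):
        while stream.peek() is not None:
            name = stream.next()
            stream.next()                                  # ":"
            layer_type = stream.next()                     # (1) rule executed
            delegated = self.parser_classes[layer_type]()  # (2) instantiate
            delegated.parse(stream)                        # (3) hand over control
            self.layers.append((name, delegated))          # (4) keep a reference
            if stream.peek() == ";":
                stream.next()
        return self.layers


tokens = "cc : Mutex ( 1 ) ; d_WE : Delegate ( WaterEquipment )".split()
base = ClassParser({"Mutex": MutexParser, "Delegate": DelegateParser})
layers = base.parse(TokenStream(tokens))
```

The parser class is looked up from the value of the third right-hand side element at run time, which is the situation the `$i` form of the delegating rule covers.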

. .

s

Here, we use the principle of parser delegation for the same reasons delegation is used in object-oriented languages: reuse and modularisation. Although these two goals are sometimes mixed, we have to separate them clearly in grammar specification. Because of the extended functionality of a grammar specification in D-yacc, we define a grammar as $G_d = (I, R, N, T, P)$, where $I$ is the name of the defined grammar, $R$ is the set of reused grammars, $N$ is the set of defined nonterminals, $T$ the set of defined terminals and $P$ the set of defined productions. $R$ is defined as $R = (r_1, r_2, \ldots, r_n)$, where $r_i = (I_i, E_i)$. $E_i$ is the set of excluded nonterminals of the grammar referred to by $I_i$. A grammar $G_d$ defined in D-yacc is equivalent to a monolithic grammar $G_m$. In figure 5, the relation between a D-yacc grammar $G_d$ and a monolithic grammar $G_m$ is defined.
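One way to picture the equivalent monolithic grammar is as a flattening step over the reused grammars. The function below is a sketch under assumptions rather than the definition from figure 5: the dictionary layout, the `flatten` name and the example library are hypothetical, only overriding rules are modelled, and delegated grammars are left out.

```python
def flatten(name, library):
    """Merge a grammar with the grammars it reuses into one grammar."""
    grammar = library[name]
    nonterminals = set(grammar["nonterminals"])
    terminals = set(grammar["terminals"])
    productions = list(grammar["productions"])
    own_heads = {lhs for lhs, _ in productions}

    for reused_name, excluded in grammar.get("reuses", []):
        reused = flatten(reused_name, library)  # reuse may be nested
        nonterminals |= reused["nonterminals"] - set(excluded)
        terminals |= reused["terminals"]
        productions += [
            (lhs, rhs)
            for lhs, rhs in reused["productions"]
            # skip excluded rules and rules overridden by the reusing grammar
            if lhs not in excluded and lhs not in own_heads
        ]

    return {
        "name": name,
        "nonterminals": nonterminals,
        "terminals": terminals,
        "productions": productions,
    }


library = {
    "Inheritance": {
        "nonterminals": {"start", "ClassDeclaration", "Trace"},
        "terminals": {"identifier", "trace"},
        "productions": [
            ("start", ("ClassDeclaration",)),
            ("ClassDeclaration", ("identifier",)),
            ("Trace", ("trace",)),
        ],
    },
    "PartialInheritance": {
        "reuses": [("Inheritance", {"Trace"})],
        "nonterminals": {"ClassDeclaration", "Methods"},
        "terminals": {"identifier", "(", ")"},
        "productions": [
            ("ClassDeclaration", ("identifier", "(", "Methods", ")")),
            ("Methods", ("identifier",)),
        ],
    },
}
monolithic = flatten("PartialInheritance", library)
```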

Due to space and complexity reasons we have not included the delegated grammars in the formal definition. The resulting definition is so large and complex that it would not be supportive to this paper. The causes for the complexity of expressing delegated grammars in an equivalent monolithic parser are the following:

• The production rules of the delegated grammar can only be used in the (monolithic) parsing process immediately after a corresponding delegating production has been executed.

• Similarly, after the delegated grammar is finished, the production rules in the delegated grammar cannot be used any more.

• Also, while the monolithic parser is parsing delegated production rules, the production rules originally part of the delegating parser are not allowed to be used.

• Lastly, if $i is used as the identifier in a delegating production rule, the delegated parser class is determined at run-time, i.e. when the input text is parsed. Therefore, it is not possible to define an equivalent monolithic parser in cases where $i is used.

We believe that the difficulty of specifying an equivalent monolithic grammar for a D-yacc grammar incorporating delegated grammars is a clear indication that the expressiveness of D-yacc grammar specifications is larger than the expressiveness of conventional, monolithic grammar specifications.

$G_d = (I, R, N, T, P)$
$R = (r_1, \ldots, r_n)$
$r_i = (I_i, E_i)$, where $I_i$ refers to the reused grammar $G_i = (I_i, R_i, N_i, T_i, P_i)$
$G_d$ has an equivalent monolithic grammar $G_m = (I_m, N_m, T_m, P_m)$, defined as
$I_m = I$
$N_m = N \cup N'_1 \cup \ldots \cup N'_n$, where $N'_i = N_i \setminus E_i$
$T_m = T \cup T_1 \cup \ldots \cup T_n$
$P_m = P \cup P'_1 \cup \ldots \cup P'_n$, where $P'_i = \{\, p = (q, A) \in P_i \mid q : x \to \alpha,\ x \notin E_i \,\}$

Figure 5: A D-yacc grammar and its equivalent monolithic grammar

3.3 Illustrating the Concept of Parser Delegation

In this section we illustrate the concept of parser delegation by using examples based on the LayOM object model described in section 3.1. We first show an example of reusing a grammar and subsequently an example of modularising a grammar specification.

3.3.1 Grammar Reuse

We will now illustrate the reuse of an existing grammar specification with the example in figure 6. The PartialInheritance layer syntax has much in common with the Inheritance layer syntax, but extends it to allow exclusion of methods of the inherited class. In the grammar specifications, first the Inheritance layer syntax is specified and subsequently the PartialInheritance layer syntax. The latter reuses the production rules of the former and extends the ClassDeclaration rule with an additional right-hand side that specifies, next to the class name, the methods that are to be inherited from the class. This allows the designer to reuse only a subset of the methods of a class.

3.3.2 Grammar Modularisation

Next to using parser delegation for reusing existing grammar specifications, it can also be used for modularising a grammar specification. When the base parser executes a production of the delegating rule type, it instantiates a parser of the specified parser type. After instantiation the delegating parser stores a reference to the delegated parser and subsequently hands it the control over the input token stream. The delegated parser parses the input token stream until it reaches the end of its grammar specification. It then returns control to the delegating parser, which continues from the position in the input token stream where the delegated parser stopped. Do note that parser delegation is transitive in that the delegated parser can again instantiate a new parser and pass the control over the input token stream to it. In figure 7, the section of the ClassSpecification grammar specification for parsing layers is shown. In this example, when the Layer rule matches, the action part is executed, which instantiates and delegates to the new parser object of type $3.

Figure 6: Example of Parser Delegation for Grammar Reuse

    LayerDeclaration : 'layers' Layers
    Layers           : Layer '' Layers
                     | empty
    Layer [$3]       : identifier ':' identifier

Figure 7: Example of Parser Delegation for Grammar Modularisation

4 Delegating Parser Tool

To investigate and apply the concept of parser delegation in real applications, we have developed D-yacc⁴, a graphical tool for specifying grammars that can reuse from and delegate to other grammars. For pragmatic reasons, we decided to make use of yacc for the actual parser generation. By doing this we were not required to build a parser generator, but could focus our effort on constructing a translator and a C preprocessor. For space reasons, we do not discuss the lexical analysis, nor the semantic actions, but concentrate on the grammatical analysis.

⁴ D-yacc is currently a prototype tool, but we plan to improve it and, perhaps, make it publicly available.

D-yacc can be seen as being composed of three parts:

• User Interface: The user interface allows the grammar designer to work on the definition of grammar specifications. Grammars are specified in the D-yacc syntax, which is an extension of the yacc specification syntax. The full specification of the D-yacc syntax can be found in appendix A. The tool allows the designer to work with multiple grammars and to test a generated parser using test input and examining the resulting test output.

• D-yacc to yacc Translator: The grammars are specified by the designer using the D-yacc syntax. As we make use of yacc, the input grammar has to be compliant with it. The translator converts a D-yacc grammar into a yacc grammar specification. For pragmatic reasons, i.e. to be able to make use of yacc, the reused grammars and the selected grammar are converted into one yacc grammar specification, rather than implementing it as described in section 3.2.1. Do note, however, that this makes no difference from the user's perspective as the generation is done automatically. This approach, therefore, does not suffer from the problems described in section 2. Delegated grammars for modularisation are implemented as described in section 3.2.2.

• C preprocessor: yacc generates a C file containing the parser, but it uses predefined function names. This predefined naming will cause name conflicts when we have multiple parsers in one application. Therefore, we preprocess the C files to rename all potentially conflicting names, e.g. foo, into unique identifiers, e.g. foo.
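The renaming step of the third component can be sketched as follows. The sketch relies on the fact that yacc-generated parsers use identifiers beginning with `yy` or `YY` (for example `yyparse`, `yylex`, `YYSTYPE`); the function name `rename_yacc_identifiers` and the choice of a grammar-name prefix are assumptions, not the naming scheme D-yacc actually uses.

```python
import re

# Identifiers generated by yacc start with "yy" or "YY".
YACC_NAME = re.compile(r"\b(?:yy|YY)\w+")


def rename_yacc_identifiers(source, grammar_name):
    """Prefix every yacc-style identifier with the grammar name."""
    return YACC_NAME.sub(lambda m: f"{grammar_name}_{m.group(0)}", source)


generated = "int yyparse(void) { YYSTYPE v; return yylex(); }"
renamed = rename_yacc_identifiers(generated, "Mutex")
```

A purely textual substitution like this also rewrites matches inside string literals and comments, which a real preprocessor would need to handle. Running it twice is harmless, because the underscore after the prefix removes the word boundary in front of `yy`.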

In figure 8, the conceptual organisation of D-yacc and the components it uses in its course of operation are shown. We will now describe what happens if a user decides to test a grammar. The D-yacc environment generally contains multiple grammars which have some relation with each other. The selected grammar is analysed to determine what grammars it uses. The selected grammar and all reused grammars are converted into one yacc grammar specification. Each grammar used for modularisation purposes, specified in D-yacc syntax, is translated into a corresponding yacc specification. Each yacc specification is converted into a C program by yacc. The D-yacc preprocessor will convert the resulting C program into an equivalent C program in which all yacc-specific identifiers are renamed to unique names. This last conversion is required to have multiple parsers coexisting in a C application. The translator has also generated a makefile at the generation of the yacc specification. When it has converted, generated and preprocessed all the required grammar parsers, D-yacc will start the make tool to compile and link all necessary files and the result is an executable program containing all the specified parsers. The user can now test the grammar by typing text into the input text window and starting the executable program.

dow and starting the executable program. The

user interface. In figure 9 the user interface of the tool is shown.

5 Related Work

The need for grammar reuse and modularisation has been recognised by several researchers. In [Minor 94], the authors mention the importance of grammar reuse and draw the analogy between code reuse and grammar reuse, but they defer this to future work. Instead they try to deal with the complexity of grammar specification through the decomposition of a grammar specification into an abstract and concrete grammar. Extensibility and reusability are not addressed.

In the field of attribute grammars, object-oriented concepts are, among others, applied by [Hedin 89, Grosch 90, Kastens 92]. However, the concepts are, in general, not applied to reuse and/or modularise grammar specifications, but to deal with attributes and attribute computation. Hedin [Hedin 89] describes an object-oriented notation for attribute grammars where sub(child)productions are specified as subtypes of their super(parent)productions. This results in the grammar specification being represented analogous to a classification hierarchy. Although this reduces the complexity of the grammar specification, little support is offered for modularising, extending or reusing an existing grammar specification. A similar approach is taken in [Grosch 90], where attributes and attribute computations are inherited from the supertype. Kastens and Waite [Kastens 92] take a slightly different approach by defining attribution modules containing abstracted semantics, i.e. attributes and attribute computations, that can be reused.

One approach to reuse of grammar specifications is grammar inheritance. In [Aksit90] a mechanism for grammar inheritance is described. It allows a grammar to inherit production rules from one or more predefined grammars. Inherited production rules can be overridden in the inheriting grammar, but exclusion of rules is not supported. Although inheritance offers a mechanism to reuse existing grammar specifications, no support for modularising a grammar specification is offered. Therefore, for purposes of modularising a large grammar specification, we are convinced that delegation is a better mechanism than inheritance. The rationale for this is that delegation allows one to separate a grammar specification at the object level, whereas inheritance would still require the definition of a monolithic parser, albeit one composed of inherited grammar specifications. Also, delegation offers a uniform mechanism for both reuse and modularisation of grammar specifications.
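The separation "at the object level" argued for above can be illustrated with a small sketch. The Python fragment below is not D-yacc and not the mechanism of [Aksit90]; the class names (NumberParser, HexNumberParser, SumParser) and the token-list interface are assumptions made purely for illustration. It shows a parser for sums that holds a reference to a separate parser object for its operands, so the operand sub-language can be replaced without touching or recompiling the sum parser.

```python
# Illustrative sketch only: all names here are hypothetical, not part of D-yacc.

class ParseError(Exception):
    """Raised when a parser cannot accept the input at the given position."""


class NumberParser:
    """Stand-alone parser for the sub-language of unsigned decimal integers."""

    def parse(self, tokens, pos):
        if pos < len(tokens) and tokens[pos].isdigit():
            return int(tokens[pos]), pos + 1
        raise ParseError(f"number expected at token {pos}")


class HexNumberParser:
    """Alternative delegate that reads operands as hexadecimal numerals."""

    def parse(self, tokens, pos):
        try:
            return int(tokens[pos], 16), pos + 1
        except (IndexError, ValueError):
            raise ParseError(f"hex number expected at token {pos}") from None


class SumParser:
    """Parser for 'operand (+ operand)*' that delegates operand parsing."""

    def __init__(self, operand_parser):
        # The delegate is an ordinary object, so it can be exchanged freely.
        self.operand_parser = operand_parser

    def parse(self, tokens, pos=0):
        total, pos = self.operand_parser.parse(tokens, pos)
        while pos < len(tokens) and tokens[pos] == "+":
            value, pos = self.operand_parser.parse(tokens, pos + 1)
            total += value
        return total, pos


decimal_sums = SumParser(NumberParser())
hex_sums = SumParser(HexNumberParser())
print(decimal_sums.parse(["1", "+", "2", "+", "39"]))  # (42, 5)
print(hex_sums.parse(["ff", "+", "1"]))                # (256, 3)
```

With inheritance, the two variants would typically be expressed as two complete parser classes; in this sketch the two sum parsers differ only in the delegate object they were configured with.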

An approach to facilitate evolutionary parser development is described in [Hucklesby89]. The authors use a parsing library with classes that represent nodes in the grammar specification. Extensibility is supported by inserting intermediate superclasses of nodes. The approach, however, does not support grammar decomposition, nor does it facilitate reuse of grammar specifications as a whole. Another approach, based on the aforementioned work, is described in [Grape92], where the authors aim at a high degree of separation between syntax and semantics as a means to support extensibility. This is achieved by modelling syntax trees separately from parse trees and by defining actions that convert the parse tree into a syntax tree. The syntax tree is supposed to be more stable than the parse tree, because the latter changes with every grammar change, whereas the set of keywords and operators, which makes up the syntax tree, tends to be more stable. The work of [Grape92] does not provide means for modularising, reusing or extending grammar specifications.

In a way, one could view a preprocessor, e.g. the C preprocessor, as a means to modularise and extend an existing grammar. A preprocessor can process the input text and replace parts of the input text with text that can be parsed by the parser. We consider this solution to be inferior to parser delegation for, among others, the following reasons:

• It requires the semantics of the preprocessed input text to be expressible in terms of the grammar on which the parser is based.

• It provides no means for reusing, overriding or extending parts in the parser.

• It does not allow for changing the semantic actions, e.g. the way the parse tree is constructed, in the parser.

To the best of our knowledge, no approaches for modularising grammar specifications have been defined. Also, we believe the application of the concept of delegation for the reuse and extension of grammar specifications to be novel.

Conclusion and Future Work

The traditional, monolithic approach to grammar specification has a number of problems when the syntax and semantics are subject to frequent changes. These problems are related to (1) dealing with the complexity of a large grammar specification, (2) the difficulty of extending a grammar specification and (3) the impossibility of reusing a grammar specification in a satisfying manner. In this paper we have proposed parser delegation, a novel concept to deal with reusing and modularising grammar specifications. The concept of parser delegation provides a solution to the problems associated with conventional, monolithic grammar specification.

To investigate and apply parser delegation in real applications, we have developed D-yacc, a graphical tool for specifying grammars that can reuse from and delegate to other grammars. For pragmatic reasons, this tool converts a grammar specification in D-yacc into a yacc grammar specification. D-yacc modifies the C code generated by yacc to allow multiple parsers in a single C application. D-yacc is currently a restricted prototype, but we plan to improve it and, perhaps, make it publicly available.

We have used parser delegation in the LayOM translator. Currently, parser delegation is being applied in the telecommunications domain. A modern phone exchange offers many services, which are requested by dialing digits, the * and the #. These services change over time, and if the parsing of the service requests were done by a monolithic parser, this parser would need to be updated regularly. By configuring each type of service with its own parser, the base parser can delegate the service requests to the parser specific for that service type. We intend to apply the concept of parser delegation in several other application domains.
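The telephone exchange scenario can be sketched in a few lines. The Python fragment below is a hypothetical illustration and does not describe the actual telecommunications application: the request shape `*code*body#`, the service codes "21" and "55", and all class names are assumptions introduced for this example. The base parser recognises only the outer framing of a request and hands the body to whichever service parser is registered for the code.

```python
# Illustrative sketch only: request format, service codes and class names
# are invented for this example.

class ForwardingParser:
    """Hypothetical parser for a call-forwarding request body."""

    def parse(self, body):
        if not body.isdigit():
            raise ValueError("forwarding target must consist of digits")
        return {"service": "forward", "target": body}


class WakeUpParser:
    """Hypothetical parser for a wake-up call request body (HHMM)."""

    def parse(self, body):
        if len(body) != 4 or not body.isdigit():
            raise ValueError("wake-up time must be given as HHMM")
        return {"service": "wake-up", "hour": int(body[:2]), "minute": int(body[2:])}


class BaseParser:
    """Recognises '*code*body#' and delegates the body to a service parser."""

    def __init__(self):
        self.delegates = {}

    def register(self, code, parser):
        # Adding or replacing a service does not require changing BaseParser.
        self.delegates[code] = parser

    def parse(self, dialed):
        if not (dialed.startswith("*") and dialed.endswith("#")):
            raise ValueError("not a service request")
        code, separator, body = dialed[1:-1].partition("*")
        if not separator or code not in self.delegates:
            raise ValueError(f"unknown service code {code!r}")
        return self.delegates[code].parse(body)


exchange = BaseParser()
exchange.register("21", ForwardingParser())
exchange.register("55", WakeUpParser())
print(exchange.parse("*21*5551234#"))
print(exchange.parse("*55*0630#"))
```

When a service is introduced or changed in this sketch, only the corresponding delegate is registered or replaced; the base parser itself stays untouched, which is the property the paragraph above relies on.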

Acknowledgements

I would like to thank Bertil Ekdahl, Michael Mattson, Peter Molin, Lennart Ohlsson, Petra Sijtsema and the anonymous referees for their comments on earlier versions of this paper and Jonas Nilsson for implementing the prototype of the D-yacc tool.

References

[Aho86] A.V. Aho, R. Sethi, J.D. Ullman, Compilers: Principles, Techniques, and Tools, Addison Wesley Publishing Company, March 1986.

[Aksit90] M. Aksit, R. Mostert, B. Haverkort, Compiler Generation Based on Grammar Inheritance, Technical Report, Department of Computer Science, University of Twente, February 1990.

[Bosch94a] J. Bosch, Paradigm, Language Model and Method, submitted to the ICSE Workshop on Research Issues in the Intersection of Software Engineering and Programming Languages, November 1994. Also Research Report /94, Department of Computer Science and Business Administration, University of Karlskrona/Ronneby, November 1994.

[Bosch94b] J. Bosch, Relations as First-Class Entities in LayOM, submitted to ECOOP'95, November 1994. Also Research Report 9/94, Department of Computer Science and Business Administration, University of Karlskrona/Ronneby, November 1994.

[Bosch94c] J. Bosch, Abstracting Object State, submitted to Object-Oriented Systems, December 1994. Also Research Report 10/94, Department of Computer Science and Business Administration, University of Karlskrona/Ronneby, December 1994.

[Grape92] P. Grape, K. Walden, Automating the Development of Syntax Tree Generators for an Evolving Language, Proceedings of Technology of Object-Oriented Languages and Systems, pp. 1 5-195, Santa Barbara, Aug. 1992.

[Grosch90] J. Grosch, Object-Oriented Attribute Grammars, Report, Project Compiler Generation, GMD, August 1990.

[Hedin89] G. Hedin, An Object-Oriented Notation for Attribute Grammars, Proceedings of the European Conference on Object-Oriented Programming, pp. 329-345, BCS Workshop Series, July 1989.

[Hucklesby89] P. Hucklesby, B. Meyer, The Eiffel Object-Oriented Parsing Library, Proceedings of Technology of Object-Oriented Languages and Systems, pp. 501-50 , Paris, Nov. 1989.

[Kastens92] U. Kastens, W.M. Waite, Modularity and Reusability in Attribute Grammars, Technical Report C -C - - , University of Colorado at Boulder, September 1992.

[Minor94] S. Minor, B. Magnusson, Using Mjølner Orm as a Structure-Based Meta Environment, to be published in Structure-Oriented Editors and Environments, L. Neal and G. Swillus (eds), 1994.

[Sun92] Sun Microsystems Inc., Yet Another Compiler Compiler, Programming Utilities and Libraries, Solaris 1.1 SMCC Release A AnswerBook, June 1992.
