
Multilingual text generation from structured formal representations


Academic year: 2021



Data linguistica 23

Multilingual text generation from structured formal representations

by Dana Dannélls

Academic dissertation for the degree of Doctor of Philosophy in computational linguistics (språkvetenskaplig databehandling), which, by decision of the Faculty of Humanities board at the University of Gothenburg, will be publicly defended on Tuesday, 5 February 2013, at 10:15 in Lilla hörsalen, Humanisten.

Gothenburg 2012




Title: Multilingual text generation from structured formal representations

Language: English

Author: Dana Dannélls


This thesis aims to identify the optimal ways in which natural language generation techniques can be brought to bear upon the problem of processing a structured body of information in order to devise a coherent presentation of text content in multiple languages.

We investigate how chains of referential expressions are realized in English, Swedish and Hebrew, and suggest several coreference strategies that can be used to generate coherent descriptions about paintings. The suggested strategies focus on the need to produce paragraph-sized written natural language descriptions from formal structured representations presented in the Semantic Web.

We account for principles of coreference by introducing a new modularized approach to automatically generate chains of referential expressions from ontologies. We demonstrate the feasibility of the approach by implementing a system where a Semantic Web domain ontology serves as the background knowledge representation and where the language-specific coreference strategies are incorporated. The system uses both the principles of discourse structures and coreference strategies to guide the generation process. We show how the system successfully generates coherent, well-formed descriptions in multiple languages.



Keywords: Coherence, computational linguistics, coreference, corpus linguistics, discourse structure, knowledge representation, language technology, lexical semantics, linked open data, multilingual natural language generation, natural language processing, ontology, semantic web.



Distribution: Department of Swedish, University of Gothenburg, Box 200, SE-405 30 Gothenburg, Sweden

Data linguistica 23 ISSN 0347-948X ISBN 978-91-87850-48-6



Printed in Sweden by Ineko AB, Göteborg 2012

