Proceedings of the Symposium on Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018)


Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018)

Stockholm, 28–31 August 2018

Department of Mathematics and Department of Philosophy, Stockholm University, Stockholm, Sweden

Editors:

Krasimir Angelov, University of Gothenburg, Sweden
Kristina Liefke, Goethe University Frankfurt, Germany
Roussanka Loukanova, Stockholm University, Sweden
Michael Moortgat, Utrecht University, The Netherlands
Satoshi Tojo, School of Information Science, JAIST, Japan

Program Committee:

Lasha Abzianidze, Krasimir Angelov, Annie Foret, Harald Hammarström, Kristina Liefke, Roussanka Loukanova, Michael Moortgat, Sylvain Pogodalla, Kiyoaki Shirai, Elisa Sneed German, Lutz Straßburger, Satoshi Tojo, Adam Wilson, Christian Wurm

Publisher: Stockholm: Stockholm University, 2018, DiVA Portal for digital publications


I Abstracts of Talks and Short Biographies

Lasha Abzianidze . . . 7

Krasimir Angelov . . . 7

Rasmus Blanck and Aleksandre Maskharashvili . . . 7

Robin Cooper . . . 8

Hercules Dalianis . . . 9

Philippe de Groote . . . 10

Marie Duzi . . . 10

Tim Fernando . . . 11

Annie Foret . . . 11

Jonathan Ginzburg . . . 12

Justyna Grudzinska . . . 12

M. Dolores Jiménez López . . . 13

Ron Kaplan . . . 13

Yusuke Kubota . . . 14

Shalom Lappin . . . 14

Hans Leiß . . . 15

Zhaohui Luo . . . 15

Mehdi Mirzapour, Jean-Philippe Prost, and Christian Retoré . . . 16

Aarne Ranta . . . 17

Frank Richter . . . 17

Mehrnoosh Sadrzadeh . . . 18

Manfred Sailer . . . 18

Satoshi Tojo . . . 19

Adrià Torrens Urrutia . . . 19

Christian Wurm . . . 19

Yuan Xie . . . 20

Robert Östling . . . 20

II Contributed Papers

1 Rasmus Blanck and Aleksandre Maskharashvili
From TAG to HOL Representations of AMRs via ACGs . . . 23

2 Tim Fernando
Intervals and Events with and without Points . . . 34

3 Shalom Lappin
Towards a Computationally Viable Framework for Semantic Representation . . . 47


4 M. Dolores Jiménez López

Complexity, Natural Language and Machine Learning . . . 64

5 Zhaohui Luo

MTT-semantics in Martin-Löf's Type Theory with HoTT's Logic . . . 68

6 Mehdi Mirzapour, Jean-Philippe Prost, and Christian Retoré
Categorial Proof Nets and Dependency Locality: A New Metric for Linguistic Complexity . . . 73

7 Adrià Torrens Urrutia

A Proposal to Describe Fuzziness in Natural Language . . . 87

8 Yuan Xie

Referential Dependencies in Chinese: A Syntax-Discourse Processing


Part I

Abstracts of Talks and Short Biographies


Lasha Abzianidze (University of Groningen, Netherlands) Invited Talk
Compositional Semantics in the Parallel Meaning Bank

(joint work with Johan Bos)

Abstract: The Parallel Meaning Bank (PMB) is a corpus of translations annotated with shared, formal meaning representations. The principle of compositionality lies at the heart of the corpus, as it drives a derivation process of phrasal semantics and enables cross-lingual projection of meaning representations. The talk will present the PMB annotation pipeline and show how it leads to the formal, compositional semantics of translations. As a highlight, compositional treatment of several challenging semantic phenomena in English will be shown.

Short Biography: Lasha Abzianidze is a postdoc researcher at the University of Groningen. His research interests span meaning representations and natural language inference. Currently he works on the Parallel Meaning Bank project, where his research focuses on semantic annotation and compositional semantics for wide-coverage texts. He obtained his PhD, titled “A natural proof system for natural language”, at Tilburg University. In his PhD research, he developed a tableau-based theorem prover for a natural logic, which operates on and solves textual entailment problems.

Krasimir Angelov (University of Gothenburg and Digital Grammars AB, Sweden) Invited Talk

A Parallel WordNet and Treebank in English, Swedish and Bulgarian
Abstract: We present work in progress on a parallel WordNet-like lexicon and a treebank for English, Swedish and Bulgarian. The lexicon uses the Princeton WordNet senses but in addition incorporates detailed morphological and syntactic information. Words across languages that have the same sense, and are moreover frequent mutual translations, are grouped together via a language-independent identifier. These features make the lexicon directly usable as a library in GF applications. As part of the development we also converted all examples from WordNet to a treebank parsed with the GF Resource Grammars. Thanks to that, the examples are translated to Swedish and Bulgarian.

Short Biography: Krasimir Angelov is an Associate Professor in computer science at the University of Gothenburg. His interests are in formal and natural languages, functional programming, machine translation, and natural language parsing and generation. He is one of the developers of Grammatical Framework (GF), a programming language for developing hybrid rule-based and statistical natural language applications. He is also one of the founders of Digital Grammars AB, a company which offers reliable language technologies, i.e., solutions where quality is prioritized, usually at the expense of coverage.

Rasmus Blanck and Aleksandre Maskharashvili (CLASP, FLOV, University of Gothenburg, Sweden)

From TAG to HOL Representations of AMRs via ACGs

Abstract: We investigate the possibility of constructing an Abstract Categorial Grammar (ACG) that relates Tree Adjoining Grammar (TAG) and Higher Order Logic (HOL) formulas which encode Abstract Meaning Representations (AMRs). We also propose another ACG that relates TAG and HOL formulas expressing neo-Davidsonian event semantics. Both of these encodings are based on the already existing ACG encoding of the syntax-semantics interface where TAG derivations are interpreted as HOL formulas representing Montague semantics. In particular, both encodings share the same abstract language coming from the ACG encoding of TAG with Montague semantics, which is second-order. For second-order ACGs, the problems of parsing and generation are known to be of polynomial complexity. Thus we get natural language generation and parsing with TAGs and HOL formulas modeling AMR for free.

Robin Cooper (University of Gothenburg, Sweden) Invited Talk

How to Play Games with Types (joint work with Ellen Breitholtz)

Abstract: This talk will discuss how the kind of game theory (GT) presented in the course by Heather Burnett and E. Allyn Smith at ESSLLI 2017 (https://www.irit.fr/esslli2017/courses/6) and Burnett's paper “Signalling Games, Sociolinguistic Variation and the Construction of Style” (http://www.heatherburnett.net/uploads/9/6/6/0/96608942/burnett_smgs.pdf) could be connected to work on TTR, a type theory with records, and Ginzburg's KOS, a formal approach to conversational semantics. Here are some points I will consider:

1. Recasting GT in TTR. They both talk about types (of action), and when GT talks about possible worlds it is really what TTR would call types of situations. (The same holds for the use of the term “possible worlds” in probability theory.) I will sketch an example of how it might look.

2. But what might doing (1) add to a linguistic theory? KOS/TTR might provide a framework for dealing with issues like choosing which games to play, misunderstandings between two agents about what game is being played, or accommodating a game on the basis of another agent's behaviour. There is a notion of game in my paper “How to do things with types” (https://www.cisuc.uc.pt/ckfinder/userfiles/files/TR%202014-02.pdf). There is more detail in my book draft (https://sites.google.com/site/typetheorywithrecords/drafts) and also in Ellen Breitholtz's work on enthymemes and topoi in her thesis and book in preparation. Ginzburg's work on genre and conversation types is related. The games in this literature are very simple from the perspective of GT. They are defined in terms of a string type for a string of events on the gameboard which is traversed by an agent trying to realize the types. We have nothing to say about how you would make choices in a non-deterministic game, but GT would add that. It could be extremely productive to embed game theory in a theory of dialogue — one even begins to imagine metagames, games you play concerning which game to play. We can perhaps supply a way of connecting GT to dialogue and grammar in a formal setting.

3. We could view this as making a connection between games and a general theory of action along the lines of “How to do things with types”. The assumption seems to be that you compute utility and then perform the action that has highest utility for you. But you could think of other strategies: e.g. cooperative (make the move that has the highest utility irrespective of player), altruistic (maximize the utility of the other player). If you think of games as assigning utilities to event types at a given state of play, perhaps exploiting techniques from our work on probabilistic TTR (http://csli-lilt.stanford.edu/ojs/index.php/LiLT/article/view/52), you could have a superordinate theory of action which would tell you what you might do depending on which strategy you are using.

Short Biography: Robin Cooper is Senior Professor at the University of Gothenburg, where he was previously Professor of Computational Linguistics. He is currently conducting research within the Centre for Linguistic Theory and Studies in Probability (CLASP) at Gothenburg. He has an undergraduate degree from the University of Cambridge and a PhD in Linguistics from the University of Massachusetts at Amherst. He has taught previously at the following universities: Universität Freiburg, University of Texas at Austin, University of Massachusetts at Amherst, University of Wisconsin at Madison, Stanford University, Lund University and Edinburgh University. He has held a Mellon Postdoctoral Fellowship and a Guggenheim Fellowship and has been a fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford. He is a Fellow of the British Academy and of the Royal Society of Arts and Sciences in Gothenburg, and a member of Academia Europaea. He holds an honorary doctorate from Uppsala. His main research interests are semantics (both theoretical and computational), dialogue semantics and computational dialogue systems. Currently he is working on a type-theoretical approach to language and cognition.

Hercules Dalianis (DSV-Stockholm University, Sweden) Invited Talk

HEALTH BANK — A Workbench for Data Science Applications in Healthcare

Abstract: Healthcare has many challenges in the form of monitoring and predicting adverse events such as healthcare-associated infections or adverse drug events. All this can happen while treating a patient at the hospital for their disease. The research question is: when and how many adverse events have occurred, and how can one predict them? Nowadays all information is contained in the electronic patient records and is written both in structured form and in unstructured free text. This talk will describe the data used for our research in HEALTH BANK — the Swedish Health Record Research Bank, containing over 2 million patient records from 2007–2014. Topics are detection of symptoms, diseases, body parts and drugs from Swedish electronic patient record text, including deciding on the certainty of a symptom or disease and detecting adverse (drug) events. Future research includes detecting early symptoms of cancer and de-identification of electronic patient records for secondary use.

Short Biography: Hercules Dalianis has a Master of Science in engineering (civilingenjör) with a speciality in electrical engineering, graduated in 1984 from the Royal Institute of Technology, KTH, Stockholm, Sweden, and received his PhD (teknologie doktor) in 1996, also at KTH. Since 2011 he is Professor in Computer and Systems Sciences at Stockholm University, Sweden. Dalianis was a postdoc researcher at University of Southern California/ISI in Los Angeles 1997–98, and also a postdoc researcher (forskarassistent) at NADA, KTH, 1999–2003; moreover, he held a three-year guest professorship at CST, University of Copenhagen, during 2002–2005, funded by Norfa, the Nordic Council of Ministers. Dalianis was on a sabbatical stay at CSIRO/Macquarie University, Sydney, Australia, 2016–17, compiling a textbook with the title Clinical Text Mining: Secondary Use of Electronic Patient Records, to be published open access by Springer in April 2018. Dalianis works at the interface between industry and university, with the aim of making research results useful for society. He has specialized in the area of human language technology: making computers understand and process human language text, but also making computers produce text automatically. Currently Dalianis is working in the area of clinical text mining with the aim of improving healthcare in the form of better electronic patient record systems, better presentation of patient records, and extraction of valuable information both for clinical researchers and for lay persons such as patients.

Philippe de Groote (Directeur de Recherche, Inria, France) Invited Talk
New Progress in Continuation-Based Dynamic Logic

Abstract: In this talk, we revisit the type-theoretic dynamic logic introduced by de Groote (2006) and developed by Lebedeva (2012). We show how a slightly richer notion of continuation allows new dynamic connectives and quantifiers to be defined in a systematic way.

Short Biography: Dr. Philippe de Groote received his PhD degree in engineering from the Université Catholique de Louvain in March 1991. After a postdoc at the University of Pennsylvania, he joined Inria in September 1992, initially as Chargé de Recherche and then Directeur de Recherche. His research interests include mathematical logic, type theory, proof theory, computational linguistics, and natural language formal semantics.

Marie Duzi (VSB-Technical University of Ostrava, Czech Republic) Invited Talk
Negation, Presupposition and Truth-Value Gaps

Abstract: There are many kinds of negation and denial. Perhaps the most common is Boolean negation ‘not’, which applies to propositions-in-extension, i.e. truth-values. The others are, inter alia, the property of propositions of not being true, which applies to propositions; the complement function, which applies to sets; privation, which applies to properties; negation as failure, applied in logic programming; negation as argumentation ad absurdum; and many others. I am going to deal with negation of propositions that come attached with a presupposition that is entailed by the positive as well as the negated form of a given proposition. However, there are two kinds of negation, namely internal and external negation, which are not equivalent. I will prove that while the former is presupposition-preserving, the latter is presupposition-denying. This issue has much in common with the difference between topic and focus articulation within a sentence. Whereas articulating the topic of a sentence activates a presupposition, articulating the focus frequently yields merely an entailment. While the Russellian wide-scope (external) negation gets the truth-conditions of a sentence right for a subject occurring as a focus, Strawsonian narrow-scope (internal) negation is validly applicable for a subject occurring as the topic. My background theory is Transparent Intensional Logic (TIL). It is an expressive logic apt for the analysis of sentences with presuppositions, because in TIL we work with partial functions, in particular with propositions with truth-value gaps. Moreover, the procedural semantics of TIL makes it possible to uncover the hidden semantic features of sentences, and to make them explicit and logically tractable.
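The internal/external contrast can be made concrete with partial functions. The following sketch is only an illustration of the idea, not an implementation of TIL; the toy worlds and all function names are ours. A proposition carrying a presupposition is modeled as a partial function from worlds to truth values, with None marking a truth-value gap:

```python
# Toy model of truth-value gaps: a proposition is a partial function
# from worlds to {True, False}; None marks a gap caused by
# presupposition failure. Illustrative only, not TIL.

def proposition(presupposition, assertion):
    """Build a partial proposition: None signals a truth-value gap."""
    def p(world):
        if not presupposition(world):
            return None                  # presupposition fails: gap
        return assertion(world)
    return p

def internal_negation(p):
    """Presupposition-preserving (Strawsonian): a gap stays a gap."""
    def neg(world):
        v = p(world)
        return None if v is None else not v
    return neg

def external_negation(p):
    """Presupposition-denying (Russellian): 'it is not true that p'
    holds whenever p is not true, including on a gap."""
    def neg(world):
        return p(world) is not True
    return neg

# "The king of France is bald": presupposes that France has a king.
bald_king = proposition(lambda w: w["has_king"],
                        lambda w: w["king_bald"])

w_gap  = {"has_king": False, "king_bald": False}    # no king: gap
w_true = {"has_king": True,  "king_bald": True}

print(internal_negation(bald_king)(w_gap))    # None: gap preserved
print(external_negation(bald_king)(w_gap))    # True: gap denied
```

Internal negation keeps the presupposition alive (the negated sentence is still gappy when France has no king), while external negation comes out true on the gap precisely because the positive sentence fails to be true.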

Short Biography: Marie Duzi is a professor of Computer Science at VSB-Technical University of Ostrava. She graduated in mathematics, and her main professional interests concern mathematical logic, Transparent Intensional Logic and natural-language processing. She is also a visiting professor at the Faculty of Informatics, Masaryk University in Brno, where she closely cooperates with the group of computational linguists in the Centre for Natural Language Processing.

Tim Fernando (Trinity College Dublin, Ireland)

Intervals and Events with and without Points

Abstract: Intervals and events are examined in terms of strings, with and without the requirement that certain symbols occur uniquely. Allen interval relations, Dowty's aspect hypothesis and inertia are understood against strings, compressed into canonical forms, describable in Monadic Second-Order logic. See: https://www.scss.tcd.ie/Tim.Fernando/stock.pdf
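The string view can be illustrated with a small sketch. This is our own illustration under simplifying assumptions (closed intervals over the numbers, ad hoc function names), not code from the paper: two intervals are superposed into a string of snapshots, each snapshot being the set of events holding at that moment, with stuttering compressed away; the resulting canonical string determines the Allen relation.

```python
# Superpose two intervals into a compressed string of snapshots.
# Each snapshot is the set of event names holding at that moment.
# Assumptions (ours): closed intervals, numeric endpoints.

def snapshots(a, b):
    """Return the stutter-compressed string of snapshots for
    intervals a=(a1,a2) and b=(b1,b2)."""
    points = sorted({a[0], a[1], b[0], b[1]})
    # sample every boundary point and the midpoint of each gap,
    # so every qualitative configuration appears in the string
    samples = []
    for i, p in enumerate(points):
        samples.append(p)
        if i + 1 < len(points):
            samples.append((p + points[i + 1]) / 2)
    string = []
    for t in samples:
        s = frozenset(
            name for name, (lo, hi) in {"a": a, "b": b}.items()
            if lo <= t <= hi          # closed-interval membership
        )
        if not string or string[-1] != s:   # block compression
            string.append(s)
    return string

# a strictly precedes b: the canonical string is  a, (gap), b
print(snapshots((0, 1), (2, 3)))   # [{a}, {}, {b}]
# a overlaps b: the canonical string is  a, ab, b
print(snapshots((0, 2), (1, 3)))   # [{a}, {a,b}, {b}]
```

Each of Allen's thirteen relations corresponds to a distinct canonical string under this encoding, which is what makes the finite-state treatment possible.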

Short Biography: Tim Fernando is a lecturer in computer science at Trinity College Dublin. He is interested in semantics and particularly finite-state methods.

Annie Foret (IRISA - University of Rennes 1, France) Invited Talk
On Categorial Grammatical Inference and Logical Information Systems
Abstract: We shall consider several classes of categorial grammars and discuss their learnability. We consider learning as a symbolic issue in an unsupervised setting, from raw or from structured data, for some variants of Lambek grammars and of categorial dependency grammars. In that perspective, we discuss for these frameworks different type constructors and structures, some limitations (negative results) but also some algorithms (positive results) under certain hypotheses.

On the experimental side, we also consider the Logical Information Systems approach, which allows for navigation, querying, updating, and analysis of heterogeneous data collections where data are given (logical) descriptors. Categorial grammars can be seen as a particular case of Logical Information System.

Short Biography: Annie Foret is an associate professor of computer science at the University of Rennes 1, France. She belongs to the SemLIS research team (on “Semantics, Logics, Information Systems for Data-User Interaction”) in the Data and Knowledge Management department at IRISA. Her general research interests are in logic, language and computation. Her current research interests include grammatical inference and categorial grammars. Previously, she studied mathematics and computer science at the École normale supérieure, and non-classical logics and rewriting in her PhD under the supervision of G. Huet. She then joined IRISA and Rennes 1, where she completed her habilitation on “some classes of type-logical grammars that model syntax”.

Jonathan Ginzburg (Laboratoire de Linguistique Formelle, Université Paris-Diderot and Laboratoire d'Excellence LabEx-EFL, France) Invited Talk

Combining Verbal and Non-Verbal Interaction in Dialogue

Abstract: The talk will provide detailed motivation, contrary to received wisdom until recently, as to the mutual interaction between non-verbal social signals such as laughter, smiling, frowning, etc., and content emanating from verbal material. In particular, I will argue that such non-verbal social signals bear propositional content and can participate in own and other communication management (e.g., clarification requests and corrections). I will show how the content emanating from non-verbal social signals can be integrated in type-theoretic accounts of dialogue interaction by combining work in existing frameworks with psychological and computational approaches to emotion appraisal and to common sense reasoning.

Short Biography: Jonathan Ginzburg is Professor of Linguistics at Université Paris-Diderot (Paris 7). He has held appointments at the Hebrew University of Jerusalem and King's College, London. He is one of the founders and associate editors of the journal Dialogue and Discourse. His research interests include semantics, dialogue, language acquisition, and musical meaning. He is the author of Interrogative Investigations (CSLI Publications, 2001, with Ivan A. Sag) and The Interactive Stance: meaning for conversation (Oxford University Press, 2012).

Justyna Grudzinska (University of Warsaw, Poland) Invited Talk
Taking Scope with Continuations and Dependent Types (joint work with Marek Zawadowski)

Abstract: Dependent type theoretical frameworks have been used to model linguistic phenomena of central importance, e.g., unbound anaphora (Ranta 1994, Cooper 2004, Bekki 2014, Grudzinska et al. 2014), lexical phenomena such as selectional restrictions and coercions (Asher 2011, Luo 2012), adjectival and adverbial modification (Luo et al. 2017). Continuations have been used for an influential in situ analysis of quantifier scope ambiguities (Barker 2002). In my talk I will present a semantic system combining continuations and dependent types (joint work with Marek Zawadowski) that is sufficient to account for a broad range of existing readings for multi-quantifier sentences, including simple sentences and more complex syntactic environments such as inverse linking.

Short Biography: Justyna Grudzinska obtained a PhD in philosophy at the University of Warsaw. Her research interests are formal semantics and philosophy of language, and her current main focus is on the application of dependent type theories to the study of natural language semantics (plural unbound anaphora, long-distance indefinites, in situ semantics for scope ambiguities, possessive and Haddock definites). She is also a coordinator of the Cognitive Science Programme at the University of Warsaw.


M. Dolores Jiménez López (GRLMC - Research Group on Mathematical Linguistics, Universitat Rovira i Virgili, Tarragona, Spain) Invited Talk

Complexity, Natural Language and Machine Learning

Abstract: The talk focuses on linguistic complexity. Are all languages equally complex? Does it make sense to compare the complexity of languages? Can languages differ in complexity? Complexity is a controversial concept in linguistics. Until recently, natural language complexity had not been widely researched, and it is still not clear how complexity is to be defined and measured. It is necessary to provide an objective and meaningful method to calculate linguistic complexity. In order to reach this goal, an interdisciplinary solution — where computational models should be taken into account — is needed. Linguistics must propose tools for the analysis of natural language complexity, since the results obtained from these studies may have important implications both from a theoretical and from a practical point of view.

Short Biography: M. Dolores Jiménez-López is an Associate Professor at the Departament de Filologies Romàniques at the Universitat Rovira i Virgili, Tarragona, Spain. She has a PhD degree in linguistics. She worked for two years, as a pre-doctoral fellow, at the Computer and Automation Research Institute of the Hungarian Academy of Sciences in Budapest, Hungary. Her post-doctoral training includes a three-year stay at the Department of Computer Science at the University of Pisa, Italy. The application of formal models to natural language analysis is one of her main research topics.

Ron Kaplan (Stanford University, US) Keynote Talk
An Architecture for Structured Ambiguity Management

Abstract: A pipeline for full-fledged natural language understanding consists of components that deal with information at different levels of remove from the elements that make up an utterance. Computing across the full pipeline is difficult because complex patterns (at all levels) may overlap in different ways, giving rise to ambiguities that feed from one component to the next. A typical approach is to apply probabilistic or heuristic preferences within each component so as to reduce the number of candidates that it feeds forward to the next. This has an obvious disadvantage: ambiguity resolution based on local information may eliminate the only candidate that gives the best result when all later components are taken into account. An alternative approach is to organize representations so as to “manage” the way ambiguous structures are propagated rather than attempting to resolve ambiguity at each level. The final result can then be globally optimal with respect to the whole pipeline. The trick is to do this without blowing up the computation.
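The idea of propagating rather than resolving ambiguity can be sketched with a packed forest. This is an illustrative toy of ours, not Kaplan's actual system: shared OR-nodes represent alternatives compactly, so a later pipeline component can count (or score) all readings without ever enumerating them.

```python
# Illustrative sketch of packed ambiguity (not any specific system):
# a forest node is a leaf (str), an AND-node (tuple of children), or
# an OR-node (list of alternatives). Subtrees common to several
# analyses are shared instead of being copied per reading.

def count_readings(node):
    """Count distinct readings of a packed forest without unpacking it."""
    if isinstance(node, str):                    # leaf: one reading
        return 1
    if isinstance(node, tuple):                  # AND: counts multiply
        n = 1
        for child in node:
            n *= count_readings(child)
        return n
    if isinstance(node, list):                   # OR: counts add
        return sum(count_readings(child) for child in node)
    raise TypeError(f"unexpected node: {node!r}")

# "I saw the man with the telescope": the PP attaches low (to the noun)
# or high (to the verb). Both analyses share the PP subtree and are
# packed under one OR-node instead of two separate parse trees.
shared_pp = ("with", "the telescope")
forest = ("I", [("saw", ("the man", shared_pp)),    # low attachment
                (("saw", "the man"), shared_pp)])   # high attachment
print(count_readings(forest))
```

Because OR-nodes add and AND-nodes multiply, the same one-pass traversal generalizes to computing a globally best reading over the whole pipeline, which is the point of managing rather than locally resolving ambiguity.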

Short Biography: Ron Kaplan is an Adjunct Full Professor of Linguistics at Stanford University. He served previously as a Vice President of Amazon and the Chief Scientist of Amazon Search Technologies. He was founder and director of the Natural Language and Artificial Intelligence Laboratory at Nuance Communications, with a focus on dialog and the conversational user interface. Before Nuance, he managed the Semantic Initiatives and Natural Language Platform teams for the Bing search engine. He also served as Chief Technology Officer and Chief Scientific Officer at Powerset, a deep semantic-search company acquired by Microsoft. Powerset was a spin-out of the (Xerox) Palo Alto Research Center based on technology developed by the natural language and artificial intelligence research group that Ron directed at PARC for many years. He is known for his influential contributions to computational linguistics and linguistic theory, particularly for the development of Lexical Functional Grammar and for the mathematical underpinnings and implementation of finite-state morphology.

Ron is a past President and Fellow of the Association for Computational Linguistics, a co-recipient of the 1992 Software System Award of the Association for Computing Machinery, a Fellow of the ACM, and a Fellow of the Cognitive Science Society. He received his PhD in Social Psychology from Harvard University and was awarded an honorary doctorate by the Faculty of Humanities of Copenhagen University.

Yusuke Kubota (University of Tsukuba, Japan) Invited Talk
Type-Logical Grammar and Natural Language Syntax

Abstract: In this talk, I will first briefly sketch my recent work, which focused on developing a particular version of Type-Logical Grammar with emphasis on linguistic application. I will then speculate on what (I think) is still missing in my own research, what still needs to be done, and whether now is a good time to start addressing these issues seriously. While I believe that my previous work has revealed some interesting points of comparison between Type-Logical Grammar and mainstream Chomskian syntax, it has also raised (or at least made me aware of) many issues pertaining to the relationship between theoretical linguistics and computational linguistics. I will touch on these issues and speculate on future directions.

Short Biography: Yusuke Kubota received his PhD at the Department of Linguistics at Ohio State University in 2010 and is currently an Assistant Professor at the University of Tsukuba. His main research interests are natural language syntax and semantics and mathematical linguistics. Together with Robert Levine, he has been developing a version of Type-Logical Grammar called Hybrid Type-Logical Categorial Grammar. Some of the results of this work, mainly dealing with empirical issues in the domain of coordination and ellipsis, have recently appeared in major linguistics journals including Linguistics and Philosophy, Linguistic Inquiry, and Natural Language & Linguistic Theory.

Shalom Lappin (University of Gothenburg, Sweden) Invited Talk

Towards a Computationally Viable Framework for Semantic Representation
Abstract: Most formal semantic theories proposed since Montague (1974) employ possible worlds to model intensions and modality. Classical theories of knowledge representation also use worlds to represent epistemic states and reasoning. If worlds are construed as equivalent to ultrafilters in a lattice of propositions (maximal consistent sets of propositions), then they pose serious problems of tractable representability. In addition, traditional worlds-based semantic theories are unable to accommodate vagueness, which is a pervasive feature of predication. They also do not explain semantic learning, and it is not clear how they could be naturally extended to incorporate such an explanation. To offer a cognitively plausible system for interpreting expressions in natural language, a semantic theory should generate tractable representations, handle vagueness of predication, and provide the basis for an account of semantic learning. In this paper I discuss the problem of computational tractability of semantic representation. I suggest a probabilistic Bayesian alternative to classical worlds-based semantics, and I indicate how it can deal with intensions, modality, vagueness, epistemic states, and semantic learning.

Short Biography: available at

https://www.kcl.ac.uk/artshums/depts/philosophy/people/staff/associates/emeritus/lappin/index.aspx

Hans Leiß (Ludwig-Maximilians-Universität München, Germany) Invited Talk
Predication with Sentential Subject in GF

Abstract: The resource grammar library of the Grammatical Framework of Ranta et al. distinguishes binary or ternary verbs with nominal or prepositional objects from verbs whose objects have the form of a sentence, a question or an infinitive. No such distinction is made for the subject position of verbs. We introduce syntactic categories for verbs, adjectives and verb phrases with sentential subjects, and extend the predication grammar of Ranta (EACL, 2014) so that sentential subjects can only be combined with verb phrases of appropriate types (which may arise by passivizing verbs with sentential objects). We also report on the price in computational complexity that has to be paid for the gain in linguistic accuracy.

Short Biography: Hans Leiß studied mathematics and computer science at the University of Bonn and wrote his doctoral thesis in model theory. After a few years in theoretical computer science at the Technical University of Aachen, he joined Siemens AG in Munich, working on object-oriented programming, hardware verification, and parsing. He switched to computational linguistics (CIS) at the University of Munich (LMU) in 1990. His research interests are in formal language theory, type theories for programming languages, parsing and grammar development for natural languages, and the semantics of natural language. He retired from LMU in 2017.

Zhaohui Luo (Royal Holloway, University of London, UK) Invited Talk
Formal Semantics in Modern Type Theories: An Overview

Abstract: I’ll give an overview, and report some recent developments, of Formal Semantics in Modern Type Theories (MTT-semantics for short). MTT-semantics is a semantic framework for natural language, in the tradition of Montague’s semantics. However, while Montague’s semantics is based on Church’s simple type theory (and its models in set theory), MTT-semantics is based on dependent type theories, which we call modern type theories, such as Martin-Löf’s type theory (MLTT) and the Unifying Theory of dependent Types (UTT). Thanks to recent developments, MTT-semantics has become not only a full-blown alternative to Montague’s semantics, but also a very attractive framework with a promising future for linguistic semantics.


In this talk, MTT-semantics will be explicated, and its advantages explained, by focussing on the following:

1. The rich structures in MTTs, together with subtyping, make MTTs a nice and powerful framework for formal semantics of natural language.

2. MTT-semantics is both model-theoretic and proof-theoretic and hence very attractive, both theoretically and practically.

By explaining the first point, we’ll introduce MTT-semantics and, at the same time, show that the use and development of coercive subtyping play a crucial role in making MTT-semantics viable. The second point shows that MTTs provide a unique and nice semantic framework that was not available before for linguistic semantics. Being model-theoretic, MTT-semantics provides a wide coverage of various linguistic features. Being proof-theoretic, its foundational languages MTTs have proof-theoretic meaning theory based on inferential uses (appealing philosophically and theoretically), and it establishes a solid foundation for practical reasoning in natural languages based on proof assistants such as Coq (appealing practically). Altogether, this strengthens the argument that MTT-semantics is a promising framework for formal semantics, both theoretically and practically.

Short Biography: Zhaohui Luo is Professor of Computer Science at Royal Holloway, University of London. He is an expert in dependent type theory and its applications. In the last decade, he has worked on, among other things, formal semantics in modern type theories, applying type theory to linguistic semantics. His publications include “Computation and Reasoning”, a monograph on type theories ECC/UTT that was published by OUP in 1994, and “Formal Semantics in Modern Type Theories”, a forthcoming book (jointly with S. Chatzikyriakidis) to be published by Wiley/ISTE Science Publishing Ltd.

Mehdi Mirzapour, Jean-Philippe Prost, and Christian Retoré (LIRMM, Montpellier University, CNRS, 161 Rue Ada, France)

Categorial Proof Nets and Dependency Locality: A New Metric for Linguistic Complexity

Abstract: This work provides a quantitative computational account of why one sentence is harder to parse than another, or why one analysis of a sentence is simpler than another. We take for granted Gibson’s results on human processing complexity, and we provide a new metric which uses (Lambek) Categorial Proof Nets. In particular, we correctly model Gibson’s account in his Dependency Locality Theory. The proposed metric correctly predicts performance phenomena such as structures with embedded pronouns, garden pathing, the unacceptability of center embedding, the preference for lower attachment, and the acceptability of passive paraphrases. Our proposal extends existing distance-based proposals on Categorial Proof Nets for complexity measurement, while opening the door to including semantic complexity, thanks to the syntax-semantics interface in categorial grammars.


Aarne Ranta (University of Gothenburg and Digital Grammars AB, Sweden) Invited Talk

Concept Alignment for Compositional Translation

Abstract: Translation between natural languages is not compositional in a naive word-to-word sense. But many problems can be solved by using higher-level concepts, implementable as abstract syntax constructors in type theory together with compositional linearization functions in Grammatical Framework (GF). The question then arises: what are these constructors for a given set of languages? A whole spectrum of possibilities suggests itself: word senses (as in WordNet), multiword phrases (as in statistical machine translation), predication frames (as in FrameNet), syntactic deep structures (as in the GF Resource Grammar Library), and lexico-syntactic constructions (as in Construction Grammar). The talk will study the problem in the light of experiences from building a cross-lingual lexicon of concepts in the General Data Protection Regulation (GDPR) in five languages. We have identified over 3000 concepts of varying complexity. A lot of manual work has been needed in the process, but some ideas have emerged toward a computational approach that generates concept alignment candidates by automated analysis.

Short Biography: Aarne Ranta is Professor of Computer Science at the University of Gothenburg as well as CEO and co-founder of Digital Grammars AB. Ranta’s research was initially focused on constructive type theory and its applications to natural language semantics. It evolved gradually to computational applications, leading to the implementation of GF (Grammatical Framework). The mission of GF is to formalize the grammars of the world and make them available for computer applications. It enables the processing of natural language with the same precision as programming languages are processed in compilers.

Frank Richter (Goethe University Frankfurt a.M., Germany) Invited Talk

Computational Semantics: Representations and Reasoning

Abstract: Computing with classical meaning representations of formal semantics encounters two major problems (with many sub-problems): How do we compose logical representations for natural language expressions in a computationally feasible grammar, and how do we actually reason with the sophisticated logical representations that theoretical linguists devise? This talk revisits the construction of logical representations in a few empirically and theoretically challenging areas of grammar, and presents a treatment of formulae of higher-order logic which makes it possible to use first-order model builders and theorem provers to reason with them, with special attention to the emerging overall architecture.

Short Biography: Frank Richter is Privatdozent and senior lecturer at the Institut für England- und Amerikastudien at Goethe Universität Frankfurt a.M., Germany, since 2014. After studying general linguistics, computer science and psychology in Tübingen and a year at the University of Massachusetts at Amherst, he earned his PhD in general and computational linguistics at Tübingen University. He worked as researcher, lecturer and visiting professor at the University of Tübingen, University of Stuttgart and University of Düsseldorf.


His publications are on the formal foundations of constraint-based grammar; he is co-inventor of the framework of Lexical Resource Semantics (with Manfred Sailer), and he has published on sentential negation, negative concord, idiomatic expressions, polarity items, and syntactic and semantic grammar implementations.

Mehrnoosh Sadrzadeh (Queen Mary University of London, UK) Invited Talk

Lambdas, Vectors, and Dynamic Logic

(This is joint work with Reinhard Muskens and is supported by a Royal Society International Exchange Award.)

Abstract: Vector models of language are based on the contextual aspects of language: the distributions of words and how they co-occur in text. Truth conditional models focus on the logical aspects of language: compositional properties of words and how they compose to form sentences. In the truth conditional approach, the denotation of a sentence determines its truth conditions, which can be taken to be a truth value, a set of possible worlds, a context change potential, or similar. In the vector models, the degree of co-occurrence of words in context determines how similar the meanings of words are. In this talk, we put these two models together and develop a vector semantics based on the simply typed lambda calculus models of natural language. We provide two types of vector semantics: a static one that uses techniques familiar from the truth conditional tradition of Montague, and a dynamic one based on a form of dynamic interpretation inspired by Heim’s context change potentials. We show how the dynamic model gives rise to a dynamic logic whose implication can be applied to the admittance of a sentence by a corpus, and provide examples.

Short Biography: I got a BSc and an MSc from Sharif University, Tehran, Iran, and a PhD with joint supervision at UQAM and Oxford. I held an EPSRC PDRF, an EPSRC CAF, and a JRF at Wolfson College, Oxford; at the moment I am a senior lecturer at Queen Mary University of London, where I teach NLP and mathematics for engineers. I have worked on algebra and proof theory for multi-agent systems and on vector composition and algebras for distributional semantics. I have done recurrent PC and PC chair work in conferences and workshops of the field and have edited volumes.

Manfred Sailer (Goethe University Frankfurt a.M., Germany) Invited Talk

Constraint-Based Underspecified Semantic Combinatorics

Abstract: In this talk, I will review a number of challenges of the syntax-semantics interface for a standard concept of compositionality. Such phenomena include: scope ambiguity, negative concord, discontinuous semantic contribution, polyadic quantification, and incomplete utterances. I will argue that a constraint-based underspecified semantic combinatorics, as pursued in Lexical Resource Semantics (LRS), allows for a natural and interesting analysis of such phenomena. A system like LRS combines insights and techniques of computational and formal semantics and, as such, continues the tradition of fruitful interaction between computational and theoretical linguistics.

Short Biography: Manfred Sailer is professor of English Linguistics at Goethe University Frankfurt a.M. He studied general linguistics, computer science and psychology at Universität Tübingen (Master 1995, Promotion 2003) and received his postdoctoral degree (Habilitation) in English and General Linguistics at Göttingen University (2010). His main areas of research are the syntax-semantics interface, formal phraseology, negation, and the interaction of regularity and irregularity in language.

Satoshi Tojo (School of Information Science, Japan Advanced Institute of Science and Technology (JAIST), Japan) Invited Talk

Linear Algebraic Representation of Knowledge State of Agent

Abstract: We first propose a linear algebraic representation for the frame property, that is, the accessibility relation of possible worlds, as an adjacency matrix. We show that the product of an adjacency matrix and a column vector of valuations results in the possibility modality, and translate also the necessity modality, employing Boolean operations. Then, we apply the method to agent communication; we represent the belief change of agents by dynamic epistemic logic (DEL), and show that belief change can also be expressed by a sequence of linear transformations on the accessibility matrix. Finally, we discuss the requirements for the formal presentation of ‘who knows what at which time’.
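The matrix view of the modalities described above can be made concrete in a few lines. The following is a minimal sketch of ours (not code from the talk), over an invented 3-world Kripke frame: the possibility modality is the Boolean product of the accessibility matrix with the valuation vector, and necessity is obtained as ¬◇¬p.

```python
# A minimal sketch (ours, not from the talk) of possibility/necessity as
# Boolean matrix-vector products over a hypothetical 3-world Kripke frame.

def bool_matvec(R, v):
    """Boolean product: (R v)[i] = OR_j (R[i][j] AND v[j])."""
    return [any(R[i][j] and v[j] for j in range(len(v))) for i in range(len(R))]

# Accessibility: world 0 sees world 1, world 1 sees world 2, world 2 sees nothing.
R = [[False, True, False],
     [False, False, True],
     [False, False, False]]

p = [False, True, False]          # valuation: p holds only at world 1

diamond_p = bool_matvec(R, p)     # ◇p: some accessible world satisfies p
box_p = [not x for x in bool_matvec(R, [not x for x in p])]  # □p = ¬◇¬p

print(diamond_p)  # [True, False, False]
print(box_p)      # [True, False, True]  (vacuously true at the dead-end world 2)
```

Note that □p comes out true at world 2 because that world has no accessible worlds, matching the usual vacuous-truth reading of necessity.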

Short Biography: Satoshi Tojo received Bachelor of Engineering, Master of Engineering, and Doctor of Engineering degrees from the University of Tokyo, Japan. He joined Mitsubishi Research Institute, Inc. (MRI) in 1983, and the Japan Advanced Institute of Science and Technology (JAIST), Ishikawa, Japan, as associate professor in 1995, becoming professor in 2000. His research interest is centered on grammar theory and formal semantics of natural language, as well as logic in artificial intelligence, including knowledge and belief of rational agents. He has also studied the iterated learning model of grammar acquisition, and linguistic models of western tonal music.

Adrià Torrens Urrutia (Universitat Rovira i Virgili, Tarragona, Spain)

A Proposal to Describe Fuzziness in Natural Language

Abstract: In this presentation, we propose formal models that consider grammaticality as a gradient property, instead of the categorical view of grammaticality defended in theoretical linguistics. Given that deviations from the norm are inherent to the spontaneous use of language, linguistic analysis tools should account for different levels of grammaticality.

Christian Wurm (University of Düsseldorf, Germany) Invited Talk

Reasoning with Ambiguity

Abstract: Ambiguity is often considered to be a nemesis of logical reasoning. Still, when addressing natural language semantics with formal logic, we somehow have to address it: we can “lose it in translation” by saying all ambiguity is syntactic and interpreting unambiguous syntactic derivations; we can use meta-formalisms in order to represent it; but the fact remains that humans usually can reason perfectly well with ambiguous statements. Hence it seems to be an interesting idea to include ambiguity into logic itself. In this talk, I will present the results of my pursuit of this idea, which are partly very surprising and odd, but in the very end (I hope) provide us with a deeper understanding of ambiguity and maybe even the nature of meaning.

Short Biography: I completed my PhD in Bielefeld with Marcus Kracht and Greg Kobele, the topic being what I called “metalinguistics”, that is, the construction of language as an infinite object. My main interests are accordingly formal languages, automata, and substructural logic. Currently, I am a lecturer at the University of Düsseldorf and focus on the analysis of ambiguity, by means of logic but also machine learning techniques.

Yuan Xie (Utrecht University, The Netherlands)

Referential Dependencies in Chinese: A Syntax-Discourse Processing Model

Abstract: I propose a syntax-discourse processing model for the representation and interpretation of referential dependencies in Chinese. Chinese referentially dependent expressions (e.g. pronouns, reflexives, certain full noun phrases) are different from those in many Indo-European languages and rely more on discourse (e.g. using bare noun phrases to express definiteness, lacking an overt article the; the sentence-free reflexive ziji (self-N), referring to the speaker). For this reason, the model takes both the morphosyntactic and discourse features of referentially dependent expressions into consideration. It reflects the view that referentially dependent nominal expressions and their antecedents are information units stored in our working memory system, and that referential dependencies are established through the interactions of those information units in working memory.

Robert Östling (Stockholm University, Sweden) Invited Talk

Language Structure from Parallel Texts

Abstract: Some texts have been translated into thousands of languages, a fact that allows us to compare the structures of language from a bird’s-eye view. This information can then be used to study the evolutionary forces driving language change. I will discuss some of our results in this area, as well as current models for formalizing the phenomenon of human language on a global scale.

Short Biography: I am a researcher in computational linguistics, (un)focusing on a variety of topics including machine translation, language modeling, computational approaches to linguistic typology and sign language, multilingual natural language processing, and language learning. I am currently employed at the Department of Linguistics, Stockholm University, but have also worked at the University of Helsinki.


Part II


From TAG to HOL Representations of AMRs via ACGs

Rasmus Blanck and Aleksandre Maskharashvili

CLASP, FLOV, University of Gothenburg, Sweden

{Rasmus.Blanck and Aleksandre.Maskharashvili}@gu.se

Abstract. We investigate the possibility of constructing an Abstract Categorial Grammar (ACG) that relates Tree Adjoining Grammar (TAG) and Higher Order Logic (HOL) formulas encoding Abstract Meaning Representations (AMRs). We also propose another ACG that relates TAG and HOL formulas expressing neo-Davidsonian event semantics. Both of these encodings are based on the already existing ACG encoding of the syntax-semantics interface where TAG derivations are interpreted as HOL formulas representing Montague semantics. In particular, both of these encodings share the same abstract language, coming from the ACG encoding of TAG with Montague semantics, which is second-order. For second-order ACGs, the problems of parsing and generation are known to be of polynomial complexity. Thus we get natural language generation and parsing with TAGs and HOL formulas modeling AMR for free.

1 Introduction

Abstract Meaning Representations (AMRs) [2] have been subject to the interest of the computational linguistics community as they offer meaning representations of natural language expressions (sentences, noun phrases, etc.) without explicitly referring to morpho-syntactic features of a particular natural language. Several works have proposed to make use of AMRs for natural language (semantic) parsing [1] as well as for (sentence) generation [10]. To provide a logical setting for AMR semantics, two approaches were recently offered: [3], which provides translations of AMRs into First Order Logic (FOL) formulas, and [19], which translates AMRs into Higher Order Logic (HOL) formulas that express neo-Davidsonian event semantics.

It has been claimed that the Tree Adjoining Grammar (TAG) formalism [12] [13] is beneficial for modelling natural language syntax, as TAGs can express various phenomena (such as long-distance dependencies) while, at the same time, polynomial parsing algorithms exist for TAG. Various approaches have been developed for natural language parsing and generation using TAGs, not only at the sentential level [11] but also for discourse [13] [5].

Abstract Categorial Grammars (ACGs) [7] present a grammatical framework designed in the spirit of type-logical grammars. ACGs have proved capable of encoding various grammatical formalisms, including TAG. Moreover, ACGs allow one to model the syntax-semantics interface where the syntactic part comes from a TAG grammar [15]. Importantly, the ACGs constructed for encoding TAG with semantics belong to the class of ACGs that enjoy polynomial parsing and generation algorithms [14].

An approach with a compositional treatment of event semantics, obtained by interpreting syntactic trees of sentences as formulas expressing event semantics, is offered by [4]. In order to obtain event semantic interpretations from syntactic descriptions of sentences, ACGs were employed in [20]. Neither of these two works, however, uses TAGs for their syntax.

One of the main problems of a compositional approach to event semantics is related to quantification. Following [9], which studies interactions of quantifiers and events in a type-logical setting, (1) is a typical example challenging compositional approaches to event semantics: while the syntactic scope of every woman is inside that of kissed (i.e. of an event), a part of the semantic interpretation of every woman scopes over kissed (i.e., over ∃k.(kiss k)) while another part operates inside the scope of kissed (i.e., (arg1 k x)).

(1) John kissed every woman.

∀x (woman x ⊃ ∃k (kiss k) ∧ (arg0 k john) ∧ (arg1 k x))

The present work offers an approach to the syntax-semantics interface where the syntax comes from TAG and the semantics is neo-Davidsonian. We follow the ACG encoding of TAG with Montague semantics given in [17], but we provide neo-Davidsonian semantics instead of Montagovian.

2 AMR

Banarescu et al. [2] introduce AMRs as a means of representing the basic meaning of natural language phrases, to facilitate producing uniform semantic annotations across various languages. An AMR is a directed, acyclic graph, with a unique root and labelled nodes and edges. These graphs can be represented in PENMAN notation (2.a), or as a FOL formula (2.b).

(2) a. (w / want01
        : arg0 (b / boy)
        : arg1 (g / go01 : arg0 b))

b. ∃w ∃g ∃b (instance(w, want01) ∧ instance(g, go01) ∧ instance(b, boy) ∧ arg0(w, b) ∧ arg1(w, g) ∧ arg0(g, b))
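The correspondence between (2.a) and (2.b) can be made concrete with a small sketch of ours: the AMR is stored as labelled triples (our own simplification, not the PENMAN data structure itself) and flattened into the existentially closed conjunction of (2.b).

```python
# A hedged sketch (ours) of flattening the AMR in (2.a), stored as labelled
# triples, into the FOL conjuncts of (2.b).

amr = [
    ("w", "instance", "want01"),
    ("g", "instance", "go01"),
    ("b", "instance", "boy"),
    ("w", "arg0", "b"),
    ("w", "arg1", "g"),
    ("g", "arg0", "b"),
]

variables = sorted({s for s, _, _ in amr})   # the graph's node variables

def to_fol(triples, vs):
    """Existentially close all node variables over the conjunction of triples."""
    body = " ∧ ".join(f"{rel}({s}, {o})" for s, rel, o in triples)
    return "".join(f"∃{v} " for v in vs) + f"({body})"

print(to_fol(amr, variables))
# ∃b ∃g ∃w (instance(w, want01) ∧ instance(g, go01) ∧ instance(b, boy) ∧ ...)
```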

Graph nodes represent entities and events in a neo-Davidsonian style, while edges represent relations among them. Leaves can only be labelled with concepts, so that, for example, (b / boy) refers to an instance b of the concept boy. AMRs do not contain information about tense, aspect, number, articles, etc. AMRs do not express universal quantification either; rather, such quantifiers are treated as modifiers of the nouns they quantify over. To overcome these problems,1 Stabler [19] suggests an augmentation (AAMR) of AMR in which decorations such as want01.pres and boy.sg are used to express tense and number, and where quantification is given a more general treatment. AAMRs are mapped to HOL formulas, in which quantifiers always outscope the event existential quantifier, thus generating the basic surface order reading. To accomplish this, AAMR graphs are transformed into trees, where roles are encoded as node labels, contrary to the original AMR representation where they are encoded as arc labels. This allows AAMRs to have a standard term representation to which tree transducers can be applied, yielding HOL formulas encoding intended meanings that the initial AMRs are not able to express. For example, (3.a) is an AAMR in PENMAN notation, whereas (3.b) is its translation into HOL that represents event semantics for the sentence Most boys do not walk.

1 Bos [3] also deals with the restrictions of AMRs related to universal quantification, but he uses

(3) a. walk(: instance(walk01.pres),
        : arg0(b(: instance(boy.pl), : quant(most))),
        : polarity(−))

b. most(boy.pl, λb.¬∃w (walk01.pres(w) ∧ : arg0(w, b)))

3 Tree Adjoining Grammars (TAG)

TAG is a tree generating formalism. A TAG derived tree language is obtained by combining elementary trees, which are either initial or auxiliary. Conceptually, an initial tree models a domain of locality (e.g. verbs and their arguments), whereas auxiliary trees enable one to recursively expand a syntactic tree (e.g. with adverbs or adjectives). TAG expresses this by allowing initial trees to substitute only frontier nodes of a tree, whereas auxiliary trees can substitute internal nodes of a tree; the latter operation is called adjunction.2 A node that is being substituted or adjoined should have the same label (usually modelling a category such as NP, VP, S, etc.) as the root node of the substituted or adjoined tree. Such nodes are called substitution and adjunction sites of a tree. For example, γkissed, γJohn and γMary are initial trees, whereas γpassionately is an auxiliary one (see Figure 1). If we substitute γJohn and γMary into γkissed at the frontier nodes labeled with np, and adjoin γpassionately into γkissed at the node with label vp, we obtain the derived tree depicted in Figure 2(a).

Fig. 1. TAG trees: the initial trees γMary, γJohn and γkissed, and the auxiliary tree γpassionately (whose foot node is marked vp∗).

The process of producing the derived tree 2(a) is recorded by the corresponding derivation tree, which is represented as the tree in 2(b).
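The substitution and adjunction operations just described can be sketched in a few lines. This is our own illustration (not code from the paper): trees are (label, children) pairs, a node with an empty children list is a substitution site, and a label with a trailing ∗ marks the foot node of an auxiliary tree. Running it rebuilds the derived tree of Figure 2(a) from the elementary trees of Figure 1.

```python
# A sketch (ours, not from the paper) of TAG substitution and adjunction.

def leaf(w):
    return (w, [])

kissed = ("S", [("np", []), ("vp", [("v", [leaf("kissed")]), ("np", [])])])
mary = ("np", [leaf("Mary")])
john = ("np", [leaf("John")])
passionately = ("vp", [("Adv", [leaf("passionately")]), ("vp*", [])])

def substitute(tree, init):
    """Plug initial tree `init` into the first empty node with a matching label."""
    done = [False]
    def go(t):
        label, kids = t
        if not done[0] and label == init[0] and not kids:
            done[0] = True
            return init
        return (label, [go(k) for k in kids])
    return go(tree)

def adjoin(tree, aux):
    """Adjoin auxiliary tree `aux` at the first internal node matching its root;
    the node's children are re-attached under the foot node of `aux`."""
    root = aux[0]
    def plug_foot(t, orphans):
        label, kids = t
        if label == root + "*":            # the foot node adopts the orphans
            return (root, orphans)
        return (label, [plug_foot(k, orphans) for k in kids])
    def go(t):
        label, kids = t
        if label == root and kids:
            return plug_foot(aux, kids)
        return (label, [go(k) for k in kids])
    return go(tree)

# Mary (subject) and John (object) substitute into the np sites of γ_kissed;
# γ_passionately adjoins at the vp node.
derived = adjoin(substitute(substitute(kissed, mary), john), passionately)
print(derived)
```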

4 The Syntax-Semantics interface for TAG using ACGs

An abstract categorial grammar (ACG) defines two languages, the abstract and the object language (the tecto and pheno grammatical levels, respectively, à la Curry). The object language is a homomorphic image (a translation à la Montague) of the abstract one [7]. To

2 Since (by definition) an internal node n of a tree γ has children, they would be left orphaned as a result of adjoining an auxiliary tree β at n. TAG has a solution for this: any auxiliary tree β has a frontier node, marked with ∗, which has the same label as the root of β (and thus the same label as n). This frontier node, called the foot node of β, becomes the mother of the children of n in the tree resulting from adjoining β into γ at n.


Fig. 2. A TAG derived tree (a) and the corresponding derivation tree (b) for Mary passionately kissed John. The derivation tree has root γkissed with daughters γJohn, γMary and γpassionately.

define ACGs, let us first define the notion of a higher-order linear signature: it is a triple Σ = ⟨A, C, τ⟩, where A is a finite set of atomic types, C is a finite set of constants, and τ is a mapping from C to TA, where TA is the set of types built on A: TA ::= A | TA ⊸ TA. Λ(Σ) denotes the set of λ-terms3 built on Σ. To denote that M ∈ Λ(Σ) is of type α, we write M :Σ α, or just M : α.

Definition 1. An ACG is a quadruple G = ⟨Σ1, Σ2, L, s⟩ where:

1. Σ1 and Σ2 are two higher-order linear signatures, called the abstract vocabulary and the object vocabulary, respectively.

2. L : Σ1 −→ Σ2 is called the lexicon of the ACG G. L is a homomorphic mapping of types and terms built on Σ1 to types and terms built on Σ2, defined as follows:

(a) If α ⊸ β ∈ TΣ1 then L(α ⊸ β) = L(α) ⊸ L(β).

(b) If λx.M, (M K) ∈ Λ(Σ1) then L(λx.M) = λx.L(M) and L(M K) = L(M) L(K).

(c) For any constant c :Σ1 α of Σ1 we have L(c) :Σ2 L(α).

3. s ∈ TΣ1 (i.e., s is a type of the abstract vocabulary) is the distinguished type of the grammar G.

The abstract language of G is defined as A(G) = {M ∈ Λ(Σ1) | M :Σ1 s and M is closed}. The object language of G is O(G) = {M ∈ Λ(Σ2) | ∃u ∈ A(G). M =βη LG(u)}.
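Clause 2(a) makes the lexicon's action on types a homomorphism determined entirely by its values on atomic types. A small executable illustration of ours (the atomic interpretations in the table are invented for the example, not taken from [7]):

```python
# A sketch of Definition 1, clause 2(a): a lexicon on types is fixed by its
# action on atomic types. A type is a string (atomic) or a pair (a, b)
# standing for a ⊸ b. The ATOMIC table is illustrative only.

ATOMIC = {"np": "(e → t) → t", "S": "t"}

def L_type(ty):
    if isinstance(ty, str):          # atomic type: look up its interpretation
        return ATOMIC[ty]
    a, b = ty                        # L(a ⊸ b) = L(a) ⊸ L(b)
    return f"({L_type(a)}) ⊸ ({L_type(b)})"

# The abstract type np ⊸ np ⊸ S of a transitive-verb constant:
print(L_type(("np", ("np", "S"))))
# ((e → t) → t) ⊸ (((e → t) → t) ⊸ (t))
```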

ACGs enable one to encode TAG derivation trees within the grammar: they are modelled as the abstract language [7]. Derived trees are modelled as the object language. One defines the following signatures and lexicons: a signature ΣTAG, where TAG derivation trees are encoded; a signature Σtrees that encodes TAG derived trees; a lexicon Ld-ed trees : ΣTAG −→ Σtrees that maps derivation trees to derived trees; a signature ΣLog where one defines HOL terms encoding Montague semantics; and a lexicon LLog : ΣTAG −→ ΣLog that maps derivation trees to Montague semantics [15], [17].

ΣTAG: Its atomic types include S, vp, np, SA, vpA, . . . , where the X types stand for the categories (i.e. labels) X of the nodes where a substitution can occur, while the XA types stand for the categories X of the nodes where an adjunction can occur. For each elementary tree γlex. entry, ΣTAG contains a constant Clex. entry whose type encodes the adjunction and substitution sites of γlex. entry: every X-adjunction (resp. X-substitution) site is modelled by an argument of type XA (resp. X) of Clex. entry. ΣTAG additionally contains constants IX : XA that are meant to provide a fake auxiliary tree in the cases where no adjunction actually takes place in a TAG derivation. Since the arguments of a Clex. entry can only be atomic (any XA and/or X is atomic), ΣTAG is a second-order signature.

3 As a notational convention, we may use λx y.K instead of λx.λy.K. Instead of L(K) = M,

Here we are interested in semantic interpretations.4 Constants of the semantic vocabulary ΣLog are shown in Table 1. We have two atomic types in ΣLog: e for entities and t for propositions. The lexicon LLog from ΣTAG to ΣLog is provided in Table 2. The distinguished type of the ACGs for encoding TAG with semantics is S.

Table 1. Constants in the semantic vocabulary ΣLog for encoding Montague semantics:

john, mary : e
woman, important, walk : e → t
kiss, love : e → e → t
passionately, fast, certainly : t → t
¬ : t → t
⇒, ∨, ∧ : t → t → t
∃, ∀ : (e → t) → t

Table 2. Interpretations of the constants of ΣTAG by LLog:

Cwoman : nA ⊸ np := λD.λq.D (λx.woman x) q
Csmart : nA ⊸ nA := λD.λn.λq.D (λx.(important x) ∧ (n x)) q
Cevery, Ceach : nA := λP Q.∀x.(P x) ⊃ (Q x)
Csome, Ca : nA := λP Q.∃x.(P x) ∧ (Q x)
Cpassionately : vpA ⊸ vpA := λadvv pred.advv (λx.passionately (pred x))
Ckissed : SA ⊸ vpA ⊸ np ⊸ np ⊸ S := λadvs advv sbj obj.advs (sbj (λx.(obj (advv (λy.kiss x y)))))
IX : XA := λx.x
S := t

M0 = Ckissed IS (Cpassionately Ivp) CMary CJohn : S

Ld-ed trees(M0) = S2 (vp2 (np1 Mary) (vp2 passionately (v1 kissed))) (np1 John)

LLog(M0) = passionately (kiss mary john)

For instance, the term M0 models the TAG derivation tree in Figure 2(b). By mapping M0 with Ld-ed trees, one obtains the term representation of the derived tree shown in Figure 2(a); and by mapping M0 with LLog, one gets a Montague style HOL formula, which expresses the semantics of Mary passionately kissed John.
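The interpretation LLog(M0) can be replayed directly. The following is a hedged string-level rendering of ours (not the authors' code) of the Table 2 entries as Python functions; evaluating the term M0 then reproduces the Montague-style formula.

```python
# A string-level rendering (ours) of the LLog lexicon of Table 2: each abstract
# constant becomes a function building a formula string, and evaluating M0
# reproduces LLog(M0) = passionately (kiss mary john).

kiss = lambda x: lambda y: f"(kiss {x} {y})"
passionately = lambda p: f"(passionately {p})"

I = lambda x: x                                    # I_S, I_vp: fake adjunctions
C_mary = lambda P: P("mary")                       # proper names as λP.P c
C_john = lambda P: P("john")
C_passionately = lambda advv: lambda pred: advv(lambda x: passionately(pred(x)))
C_kissed = (lambda advs: lambda advv: lambda sbj: lambda obj:
            advs(sbj(lambda x: obj(advv(lambda y: kiss(x)(y))))))

M0 = C_kissed(I)(C_passionately(I))(C_mary)(C_john)
print(M0)  # (passionately (kiss mary john))
```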

5 From TAG Derivation trees to AMR Style Formulas

Our goal is to interpret terms modeling TAG derivation trees as HOL formulas that are close to the standard AMR representations. While we focus on declarative sentences and their interpretations, what we propose also includes a compositional approach to noun phrases and other expressions.5

4 For the details of mapping terms modelling TAG derivation trees into ones modelling derived

We add a variable for every verb that denotes an event. For instance, a predicate signalled by an intransitive verb, such as go, becomes λx.λh.∃g (go g) ∧ (arg0 g x) ∧ (h g) instead of λx.go x. The former term is of type e → (v → t) → t. This treatment is inspired by Champollion’s [4] approach to neo-Davidsonian semantic interpretations. While he chooses not to make a difference between arguments and adjuncts, we would like to encode arguments (of type e) within the semantics of an event predicate in order to be close to the AMR semantic representations, where core relations (arguments) are licensed by a verb frame. However, note that in AMRs there is only one kind of internal node, representing both nouns and verbs. To reflect this in our encodings, all of these entities should have the type v instead of type e, and this change makes predicates signalled by intransitive verbs and nouns of type v → (v → t) → t. This suggests that we should encode nouns as follows: λx.λf.∃n (noun n) ∧ (instance n x) ∧ (f n) instead of λx.noun x. This decision results in rather implausible interpretations such as ∃x ∃m (man m ∧ (instance m x) ∧ ∃g (go g) ∧ (arg0 g x)) corresponding to a man goes. Even more problematic to interpret would be the formula encoding the semantics of every man goes: ∀x (∃m (man m) ∧ (instance m x) ⊃ ∃g (go g) ∧ (arg0 g x)). We deal with this shortcoming in the next section.
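The continuation-passing verb entry just introduced can be sketched at the string level. This is our illustration, not the paper's code: the verb binds its event variable, hands it to a continuation h, and a constant True plays the role of the vacuous continuation that closes a sentence; an adverb threads an extra conjunct into the continuation before passing it on.

```python
# A string-level sketch (ours) of the entry λx.λh.∃g (go g) ∧ (arg0 g x) ∧ (h g)
# of type e → (v → t) → t, plus an adverb that consumes the event continuation.

TRUE = lambda e: "True"    # vacuous continuation: (True ∧ p) is equivalent to p

go = lambda x: lambda h: f"∃g ((go g) ∧ (arg0 g {x}) ∧ {h('g')})"

# The adverb adds a conjunct about the event and defers to the next continuation:
fast = lambda verb: lambda x: lambda h: verb(x)(lambda e: f"(fast {e}) ∧ {h(e)}")

print(go("john")(TRUE))        # ∃g ((go g) ∧ (arg0 g john) ∧ True)
print(fast(go)("john")(TRUE))  # ∃g ((go g) ∧ (arg0 g john) ∧ (fast g) ∧ True)
```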

In order to encode negation, note that in TAG it is modelled by an auxiliary tree that adjoins at the vp node of a tree. We model this fact by a constant of type vpA in the vocabulary ΣTAG, with the following interpretation: λV x h.¬(V x h). In words, this means that the negation scopes over the existentially closed event formula, but still allows for further continuations (which has been argued for in [4]).

Table 3. Constants in the semantic vocabulary ΣAMR:

john, mary : v
woman, important, walk : v → t
kiss, love : v → t
passionately, fast, certainly : t → t
¬ : t → t
⇒, ∨, ∧ : t → t → t
∃, ∀ : (v → t) → t
arg0, arg1, arg2 : v → v → t
True : v → t

We model trees anchored by nouns, adjectives, determiners (quantifier words and phrases), verbs, etc. as before in ΣTAG, but attribute different semantic interpretations to them. These new interpretations are shown in Table 4. To be precise, we create a new vocabulary ΣETAG for encoding derivation trees by adding to ΣTAG one more type T and a constant Closure of type S ⊸ T. We need them in order to close a sentence (S), i.e., to model that there is no more content to come (no more continuations). In the semantics, we interpret Closure by applying an interpretation of a sentence that is looking for a continuation to a vacuous continuation, which one models as True : v → t, where (True x) ∧ p is equivalent to p (for any x : v). Now, T will be our distinguished type. It is straightforward to map the new vocabulary, ΣETAG, to the old one, ΣTAG. We map T to S, and Closure to λx.x; the rest of ΣETAG is exactly the same as ΣTAG, i.e., we map any ξ (being a constant or a type) from ΣETAG to ξ in ΣTAG.

5 Pogodalla [16][17] shows how to encode other kinds of sentences in the same principled way

To define the semantic interpretation with events, we create a new signature, called ΣAMR, shown in Table 3. We construct the lexicon Ldere-amr : ΣETAG −→ ΣAMR, provided in Table 4.

In (4), we list the examples and their encodings in Λ(ΣETAG) that we use here and onwards to exemplify our interpretations; as we maintain ΣETAG as the abstract vocabulary, these terms will be reused again.

(4) a. Every smart woman walks.
M1 = Closure (Cwalks IS Ivp (Cwoman (Csmart Cevery))) : T

b. John does not walk.
M2 = Closure (Cwalks IS Cdoes not Cjohn) : T

c. Every smart woman walks fast.
M3 = Closure (Cwalks IS (Cfast Ivp) (Cwoman (Csmart Cevery))) : T

d. Certainly, every smart woman walks.
M4 = Closure (Cwalks (Ccertainly IS) Ivp (Cwoman (Csmart Cevery))) : T

For instance, consider (4)(a). It is modelled by the term M1 of type T, which we can then interpret using a lexicon. Table 5 shows its interpretation by Ldere-amr.

Table 4. Interpretations by Ldere-amr:

S := (v → t) → t
T := t
Closure := λP.P True : ((v → t) → t) → t
Cjohn := λP.P john
Cwalks := λadvs advv subj.advs (subj (advv (λx.λh.∃w.(walk w) ∧ (arg0 w x) ∧ (h w))))
Csmart := λD.λn.λq.λf.D (λx h.(n x h) ∧ (smart x)) q f
Cevery : nA := λp.λq.λf.∀x.(p x f) ⊃ (q x f)
Cwoman : nA ⊸ np := λD.D (λx h.(woman x ∧ h x))
Ccertainly : SA ⊸ SA := λm.λV.m (λh.V (λv.(certainly v) ∧ (h v)))
Cfast : vpA ⊸ vpA := λm.λV.m (λx.λh.V x (λv.(fast v) ∧ (h v)))
Cdoes not : vpA := λV x h.¬(V x h)

M1 := ∀x (woman x ∧ smart x ⊃ ∃w (walk w) ∧ (arg0 w x))

M2 := ¬∃w (walk w) ∧ (arg0 w john)

M3 := ∀x (woman x ∧ smart x ∧ fast x ⊃ ∃w (walk w) ∧ (arg0 w x) ∧ (fast w))

M4 := ∀x (woman x ∧ smart x ∧ certainly x ⊃ ∃w (walk w) ∧ (arg0 w x) ∧ (certainly w))

Table 5. Interpretations of M1, M2, M3 and M4 by L_dere-amr

As Table 5 shows, we obtain the desired interpretations for M1 and M2, but not for M3 and M4. This is due to the failure to distinguish between event entities and discourse entities. Under this uniform treatment, the event continuation is applied not only to the verb phrase but also to the noun phrase, and that is the source of the incorrect results (e.g. in M4 above, we obtain the subformula (certainly x), in words "certainly woman", which is clearly not what we want). One concludes that in event semantics with continuations, nouns and predicates should not be treated in the same manner: event continuations should not operate on discourse referents, but on event ones. In the next section, the current approach is modified to make a proper distinction between event entities and other components of discourse.
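The misplacement can be replayed with a small symbolic sketch. This is hypothetical Python that builds formulas as strings, invented here for illustration and not taken from the paper: when the noun, like the verb phrase, consumes a continuation, a modifier threaded through it deposits its conjunct on the discourse referent x.

```python
# Hypothetical sketch of the failure mode: formulas are built as strings.
# Under the uniform treatment the noun also takes a continuation h, so a
# modifier wrapped around it attaches "fast _" to the individual x rather
# than to an event variable.

def fast(pred):
    # wraps a continuation-taking predicate, conjoining "fast _" onto
    # whatever the continuation is eventually applied to
    return lambda arg, h: pred(arg, lambda v: "(fast " + v + ") ∧ " + h(v))

def woman(x, h):
    # uniform treatment: the noun consumes the continuation like a VP does
    return "(woman " + x + ") ∧ " + h(x)

top = lambda v: "True"
print(fast(woman)("x", top))  # (woman x) ∧ (fast x) ∧ True  -- wrong place
```

The spurious conjunct (fast x) is exactly the pattern seen in M3 and M4 of Table 5.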

6 From AMRs to Montague style HOL and to Neo-Davidson HOL

Thanks to the polynomial reversibility properties of second-order ACGs, we obtain, from the HOL formulas encoding AMRs, the ones encoding Montague semantics, i.e. HOL formulas that do not incorporate a notion of events. For that, we construct two ACGs sharing the abstract vocabulary encoding TAG derivation trees. In addition, in order to obtain a translation of HOL formulas encoding AMRs into HOL formulas encoding event semantics, we define yet another lexicon from Σ^E_TAG into a new signature Σ_evhol, shown in Table 6. Figure 3 shows the interpretations of Σ^E_TAG into Σ_evhol (event semantics).

john, mary : e
kiss, love : v → t
woman, important, walk : e → t
∃, ∀ : (e → t) → t
passionately, fast : v → t
¬ : t → t
⇒, ∨, ∧ : t → t → t
∃_v : (v → t) → t
arg0, arg1 : v → e → t
Arg0, Arg1 : v → v → t

Table 6. Constants in the semantic vocabulary Σ_evhol

While we interpret the constants encoding trees anchored by verbs and adverbs with the same terms as in the previous section, their types are now different. We denote this new lexicon by L_evhol : Σ^E_TAG → Σ_evhol. Moreover, constants modelling quantifiers (e.g. C_every) are now of type (e → t) → (e → (v → t) → t) → (v → t) → t. This means that our encoding is asymmetric, whereas the standard one is symmetric: its type is (e → t) → (e → t) → t. This is explained by our choice not to use event continuations for noun phrases but only for events. In this setting, we obtain the following, correct interpretations of both M3 and M4, which were problematic in the previous section:

L_evhol(M3) = ∀x (woman x ∧ smart x ⊃ ∃_v w (walk w) ∧ (arg0 w x) ∧ (fast w))

L_evhol(M4) = ∀x (woman x ∧ smart x ⊃ ∃_v w (walk w) ∧ (arg0 w x) ∧ (certainly w))
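The asymmetry can be sketched as a toy Python model (the domain and event tables below are invented for illustration; this is not the ACG lexicon itself): the restrictor P is a plain predicate over individuals, while only the nuclear scope Q threads the event continuation h.

```python
# Toy model of the asymmetric quantifier type
# (e -> t) -> (e -> (v -> t) -> t) -> (v -> t) -> t:
# the restrictor ignores events; only the scope threads the continuation h.

DOMAIN = ["mary", "sue"]
WALK_EVENTS = {"mary": "w1", "sue": "w2"}   # invented event table
FAST_EVENTS = {"w1", "w2"}

def every(p, q):
    # C_every := λP Q.λh.∀x (P x ⊃ Q x h)
    return lambda h: all((not p(x)) or q(x, h) for x in DOMAIN)

def woman(x):
    # restrictor: plain type e -> t, no event continuation
    return True

def walks(x, h):
    # scope: ∃_v w (walk w) ∧ (arg0 w x) ∧ (h w)
    w = WALK_EVENTS.get(x)
    return w is not None and h(w)

fast = lambda w: w in FAST_EVENTS   # event continuation contributed by "fast"
print(every(woman, walks)(fast))    # True: each walking event is fast
```

Because the continuation fast never sees the individual x, no spurious conjunct like (fast x) can arise, matching the corrected L_evhol(M3) above.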

Coreference: AMRs are useful for representing coreferences (for example, in the case of raising verbs such as wants, (5a)), but this property gets lost in Stabler's transformation of AMRs into trees. To encode coreferences, we follow [17]. In TAG, wants anchors an auxiliary tree, whereas to sleep anchors an initial tree. For instance, to derive the sentence (5a) in TAG, one substitutes the tree for John into the one for wants, and the resultant tree adjoins into the S-labeled node of the initial tree for to sleep.

Ω := (v → t) → t

C_john := λP. P john : (e → Ω) → Ω
C_walks := λadvs advv subj. advs (subj (advv (λx.λh.∃_v w. (walk w) ∧ (arg0 w x) ∧ (h w))))
C_kissed := λadvs advv subj obj. advs (subj (advv (λx. obj (λy.λh.∃_v w (kiss w) ∧ (arg0 w x) ∧ (arg1 w y) ∧ (h w)))))
C_every := λP Q.λh.∀x (P x ⊃ Q x h) : (e → t) → (e → Ω) → Ω
C_a := λP Q.λh.∃x (P x ∧ Q x h) : (e → t) → (e → Ω) → Ω
C_smart := λD.λn.λq.λf. D (λx. (n x) ∧ (smart x)) q f
C_woman := λD. D (λx. woman x)
C_certainly := λm.λV. m (λh. V (λv. (certainly v) ∧ (h v)))
C_fast := λm.λV. m (λx.λh. V x (λv. (fast v) ∧ (h v)))
C_does-not := λV x h. ¬(V x h)

nA := (e → t) → (e → Ω) → Ω
n := e → t
np := (e → Ω) → Ω
vpA := (e → Ω) → e → Ω
SA := Ω → Ω
S := Ω
T := t

Fig. 3. Interpretation of Σ^E_TAG by L_evhol
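As a quick illustration of the C_does-not entry, here is a toy Python model (with an invented event table, not the ACG term itself): negation wraps the whole event-quantified VP body, so ¬ outscopes the existential over events, as in M2.

```python
# Toy model of C_does-not := λV x h. ¬(V x h): the negation applies to the
# entire event-quantified body, so it outscopes ∃w.

WALK_EVENTS = {"john": "w1"}        # invented: only john has a walking event

def walk(x, h):
    # ∃_v w (walk w) ∧ (arg0 w x) ∧ (h w)
    w = WALK_EVENTS.get(x)
    return w is not None and h(w)

def does_not(vp):
    return lambda x, h: not vp(x, h)

true_cont = lambda w: True
print(walk("john", true_cont))            # True
print(does_not(walk)("john", true_cont))  # False
print(does_not(walk)("mary", true_cont))  # True: mary has no walking event
```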

So, we introduce the constants and then interpret them (see Figure 4). Note that even in the case of coreference and universal quantification, we obtain the desired results (e.g. (5b)).⁶

C_wants : SA ⊸ vpA ⊸ np ⊸ S′A
C_to-sleep : S′A ⊸ S

C_wants := λadvs advv subj.λPred. advs (subj (advv (λx h.
    ∃_v w ((want w) ∧ (h w) ∧ (arg0 w x) ∧ Pred (λQ. Q x) (λr. Arg1 w r)))))
C_to-sleep := λcont. cont (λsubj. subj (λx.λf.∃_v u. (sleep u) ∧ (arg0 u x) ∧ (f u)))

S′A := (((e → Ω) → Ω) → Ω) → Ω

Fig. 4. Types and constants for modeling raising verbs and their interpretations

(5) a. John wants to sleep.
       M5 = Closure(C_to-sleep (C_wants I_S I_vp C_john)) : T
       ∃_v w (want w) ∧ (arg0 w john) ∧ (∃_v u (sleep u) ∧ (Arg1 w u) ∧ (arg0 u john))
    b. Every boy wants to sleep.
       M6 = Closure(C_to-sleep (C_wants I_S I_vp (C_boy C_every))) : T
       ∀x (boy x ⊃ ∃_v w (want w) ∧ (arg0 w x) ∧ (∃_v u (sleep u) ∧ (Arg1 w u) ∧ (arg0 u x)))
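The subject sharing in (5a) can be sketched symbolically. This is hypothetical Python with invented string-building helpers, not the ACG terms of Figure 4: the embedded clause reuses the matrix subject x, and the sleep-event u is linked to the want-event w via Arg1.

```python
# Hypothetical sketch of the coreference in "John wants to sleep":
# the same subject name feeds both arg0(w, _) in the matrix clause and
# arg0(u, _) in the embedded clause, while Arg1 links the two events.

def wants(subj, embedded):
    inner = embedded(subj)          # embedded clause reuses the subject
    return ("∃w (want w) ∧ (arg0 w " + subj + ") ∧ (" +
            inner + " ∧ (Arg1 w u))")

def to_sleep(x):
    return "∃u (sleep u) ∧ (arg0 u " + x + ")"

print(wants("john", to_sleep))
# ∃w (want w) ∧ (arg0 w john) ∧ (∃u (sleep u) ∧ (arg0 u john) ∧ (Arg1 w u))
```

Replacing the proper name by a universally bound variable reproduces the pattern of (5b), where the shared subject x is bound once and occurs in both arg0 conjuncts.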

⁶ ACG files encoding the grammar and examples provided in Section 6 can be found at the following link: https://www.dropbox.com/s/g2c58yq0ulp7a3j/AMR-TAG_ACG.zip?dl=0

7 Future Work and Conclusion

To encode certain kinds of complex interactions between events and quantification, second-order ACGs may not suffice. For instance, consider (6) (from [20]). In semantics, everyday quantifies over times of kissing events, but in syntax, everyday is an S-modifier of a sentence. To model this kind of complex scope interaction, one may invent new arguments of verbs that act as their modifiers in syntax while playing special roles in semantics. However, such an approach would deviate from the generic approach of the ACG encoding of TAG. Another way is to use higher-order ACGs. [17] presents a generic way of overcoming scoping problems of a similar kind. His approach leads to third-order ACGs, for which one cannot guarantee the polynomial parsing property (there is a third-order ACG generating an NP-complete language [18]).

(6) John kisses Mary everyday.
    ∀x (day x) ⊃ ∃_v w (kiss w) ∧ (arg0 w John) ∧ (arg1 w Mary) ∧ (time w x)

Stabler suggests using de Groote and Lebedeva's approach [8] to pronouns and the definite article when dealing with AMRs in HOL. The dynamic setting in which their approach is developed is a type-logical one, very close to the idea of ACGs, since one can distinguish between two levels of grammars. This makes us believe that the encoding proposed in this paper could be beneficial for that purpose. In addition, ACGs have been employed to study some discourse formalisms based on TAGs [6]. Thus, further extending the current approach with the aim of integrating it into already existing discourse encodings of TAG could be done as future work.

[Diagram over the vocabularies Λ(Σ^E_TAG), Λ(Σ_TAG), Λ(Σ_trees), Λ(Σ_AMR), Λ(Σ_evhol), Λ(Σ_Log)]

Fig. 5. ACG architecture for a syntactic and several semantic interpretations

To sum up, the current work makes it explicit that one can obtain AMR-style semantic formulas compositionally from a grammar. With the same grammar, one obtains Montagovian HOL semantic representations. Again, the same grammar is employed to obtain HOL representations modelling neo-Davidsonian event semantics in which negations and quantifiers, including the universal one, interact with events so that one obtains correct interpretations. Since all these encodings are done within second-order ACGs, one can draw correspondences between these interpretations using an algorithm of polynomial complexity. This makes the ACG architecture we constructed (depicted in Figure 5) beneficial for natural language generation/parsing tasks with AMRs and TAGs.

References

1. Y. Artzi, K. Lee, and L. Zettlemoyer. Broad-coverage CCG semantic parsing with AMR. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1699–1710. Association for Computational Linguistics, 2015.
