Splitting rocks: Learning word sense representations from corpora and lexica

Academic year: 2021


Luis Nieto Piña / Splitting rocks: Learning word sense representations from corpora and lexica

30 • 2019


Data linguistica

ISBN 978-91-87850-75-2
ISSN 0347-948X

Meaning representation is a central problem in language technology. By assigning semantic representations to language units such as words or sentences, computer systems can integrate meaning into their processes. This is crucial in many of today's language applications, such as sentiment analysis and text summarization.

Current machine learning models tend to focus on representing word forms. This is problematic for words with more than one meaning, because all of a word's meanings are then collapsed into a single representation. Furthermore, these models usually learn semantics from large collections of text, so the word meanings they capture depend heavily on the chosen text.

In his doctoral thesis, Luis Nieto Piña presents three models that address these shortcomings. On the one hand, the focus is shifted from words to word senses, making it possible to obtain a representation for each meaning of a word. On the other hand, semantic data is obtained not only from text but also from lexica, which supply the model with curated information about the meanings of words and the relations between them. Luis shows that combining these two sources of data yields higher-quality word sense representations. In the evaluation of these models, he also demonstrates the utility of these representations, both in established applications and in the development of linguistic resources.
