
From Numerical Sensor Data to Semantic Representations:

A Data-driven Approach for Generating Linguistic Descriptions


To Sepideh

For her kindness and devotion, and for her endless support.


Örebro Studies in Technology 78

HADI BANAEE

From Numerical Sensor Data to Semantic Representations:

A Data-driven Approach for Generating Linguistic Descriptions


© Hadi Banaee, 2018

Title: From Numerical Sensor Data to Semantic Representations: A Data-driven Approach for Generating Linguistic Descriptions

Publisher: Örebro University 2018

www.publications.oru.se

Print: Örebro University, Repro 03/2018

ISSN 1650-8580

ISBN 978-91-7529-240-3


Abstract

Hadi Banaee (2018): From Numerical Sensor Data to Semantic Representations:

A Data-driven Approach for Generating Linguistic Descriptions.

Örebro Studies in Technology 78.

In our daily lives, sensor recordings are becoming more and more ubiquitous. With the increased availability of data comes an increased need for systems that can represent the data in human-interpretable concepts. In order to describe unknown observations in natural language, an artificial intelligence system must deal with several issues involving perception, concept formation, and linguistic description. These issues cover various subfields within artificial intelligence, such as machine learning, cognitive science, and natural language generation.

The aim of this thesis is to address the problem of semantically modelling and describing numerical observations from sensor data. This thesis introduces data-driven approaches to perform the tasks of mining numerical data and creating semantic representations of the derived information in order to describe unseen but interesting observations in natural language.

The research considers creating a semantic representation using the theory of conceptual spaces. In particular, the central contribution of this thesis is to present a data-driven approach that automatically constructs conceptual spaces from labelled numerical data sets. This constructed conceptual space then utilises semantic inference techniques to derive linguistic interpretations for novel unknown observations. Another contribution of this thesis is to explore an instantiation of the proposed approach in a real-world application. Specifically, this research investigates a case study where the proposed approach is used to describe unknown time series patterns that emerge from physiological sensor data. This instantiation first presents automatic data analysis methods to extract time series patterns and temporal rules from multiple channels of physiological sensor data, and then applies various linguistic description approaches (including the proposed semantic representation based on conceptual spaces) to generate human-readable natural language descriptions for such time series patterns and temporal rules.

The main outcome of this thesis is the use of data-driven strategies that enable the system to reveal and explain aspects of sensor data which may otherwise be difficult to capture by knowledge-driven techniques alone. Briefly put, the thesis aims to automate the process whereby unknown observations of data can be 1) numerically analysed, 2) semantically represented, and eventually 3) linguistically described.

Keywords: Semantic representations, Conceptual spaces, Natural language generation, Temporal rule mining, Physiological sensors, Health monitoring system.

Hadi Banaee, School of Science and Technology

Örebro University, SE-701 82 Örebro, Sweden, hadi.banaee@oru.se


Acknowledgements

Conceptual Space of Acknowledgement¹:

[Figure: a conceptual space of acknowledgements, with regions labelled Domain of Supervisors, Domain of Contributors, Domain of Friends, Domain of Social Life, and Domain of Family, each containing the names of the people acknowledged.]

¹ There are no linguistic descriptions generated for this conceptual space.


Contents

1 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Research Question
1.4 Contributions
1.5 Thesis Outline
1.6 Publications

I Creating Semantic Representations for Numerical Data

2 Background and Related Work
2.1 Semantic Representation
2.2 On the Theory of Conceptual Spaces
2.2.1 Identifying Quality Dimensions
2.2.2 Related Work on Conceptual Spaces and AI
2.3 Generating Linguistic Descriptions
2.3.1 Linguistic Descriptions of Data (LDD)
2.3.2 Natural Language Generation (NLG)
2.4 Conclusions

3 Data-Driven Construction of Conceptual Spaces
3.1 Domain and Quality Dimension Specification
3.1.1 Feature Subset Ranking
3.1.2 Feature Subset Grouping
3.2 Concept Representation
3.2.1 Convex Regions of Concepts
3.2.2 Context-dependent Weights of Concepts
3.3 Discussion


4 Semantic Inference in Conceptual Spaces
4.1 Symbol Space Definition
4.2 Inferring Linguistic Descriptions
4.2.1 Phase A: Inference in Conceptual Space
4.2.2 Phase B: Inference in Symbol Space
4.3 Discussion

5 Results and Evaluation: A Case Study on Leaf Data Set
5.1 Constructing a Conceptual Space of Leaves
5.1.1 Domain Specification for Leaf Data set
5.1.2 Concept Representation for Leaf Concepts
5.2 Semantic Inference for Unknown Leaf Samples
5.2.1 Inference in Conceptual Space of Leaves
5.2.2 Inference in Symbol Space of Leaves
5.3 Empirical Evaluation for Leaf Samples
5.3.1 Survey: Design and Procedure
5.3.2 Identifying Leaf Observations from Linguistic Descriptions
5.3.3 Rating Linguistic Descriptions of Leaf Observations
5.4 Discussion

II Physiological Sensor Data: From Data Analysis to Linguistic Descriptions

6 An Overview of Health Monitoring with Mining Physiological Sensor Data
6.1 Data Mining Tasks in Health Monitoring Systems
6.2 Data Mining in Health Monitoring Systems
6.2.1 Preprocessing
6.2.2 Feature Extraction/Selection
6.2.3 Modelling and Learning Methods
6.3 Data Sets: Acquisition and Properties
6.3.1 Sensor Data Acquisition
6.3.2 Sensor Data Properties
6.4 Discussion and Challenges

7 Physiological Time Series Data: Preparation and Processing
7.1 Input Time Series Sensor Data: Collection and Acquisition
7.1.1 Wearable Sensors, Non-clinical Data
7.1.2 Clinical Physiological Data
7.2 Trend Detection in Physiological Time Series
7.3 Pattern Abstraction in Physiological Data
7.3.1 Background on Pattern Abstraction
7.3.2 Prototypical Pattern Abstraction
vi CONTENTS

4 SemanticInference inConceptual Spaces 53 4.1 SymbolSpace Definition . . . . . . . . . . . . . . . . . . . . . . 54 4.2 InferringLinguistic Descriptions. . . . . . . . . . . . . . . . . . 56 4.2.1 PhaseA: Inferencein ConceptualSpace . . . . . . . . . . 57 4.2.2 PhaseB: Inferencein SymbolSpace . . . . . . . . . . . . 65 4.3 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 5 Resultsand Evaluation:A CaseStudy onLeaf DataSet 73 5.1 Constructinga ConceptualSpace ofLeaves . . . . . . . . . . . . 74 5.1.1 DomainSpecification forLeaf Dataset . . . . . . . . . . 75 5.1.2 ConceptRepresentation forLeaf Concepts . . . . . . . . 77 5.2 SemanticInference forUnknown LeafSamples . . . . . . . . . . 77 5.2.1 Inferencein ConceptualSpace ofLeaves . . . . . . . . . 78 5.2.2 Inferencein SymbolSpace ofLeaves . . . . . . . . . . . 78 5.3 EmpiricalEvaluation forLeaf Samples . . . . . . . . . . . . . . 78 5.3.1 Survey:Design andProcedure . . . . . . . . . . . . . . . 80 5.3.2 IdentifyingLeaf Observationsfrom LinguisticDescriptions 82 5.3.3 RatingLinguistic Descriptionsof LeafObservations . . . 83 5.4 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 II PhysiologicalSensor Data:From DataAnalysis toLinguis- ticDescriptions 89 6 AnOverview ofHealth Monitoringwith MiningPhysiological Sensor Data 91 6.1 DataMining Tasks inHealth MonitoringSystems . . . . . . . . 92 6.2 DataMining inHealth MonitoringSystems . . . . . . . . . . . 95 6.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . 96 6.2.2 FeatureExtraction/Selection . . . . . . . . . . . . . . . . 96 6.2.3 Modellingand LearningMethods . . . . . . . . . . . . . 97 6.3 DataSets: Acquisitionand Properties . . . . . . . . . . . . . . . 99 6.3.1 SensorData Acquisition . . . . . . . . . . . . . . . . . . 99 6.3.2 SensorData Properties . . . . . . . . . . . . . . . . . . . 10 0 6.4 Discussionand Challenges . . . . . . . . . . . . . . . . . . . . . 
10 1 7 PhysiologicalT imeSeries Data:Preparation andProcessing 105 7.1 InputT imeSeries SensorData: Collectionand Acquisition . . . 106 7.1.1 Wearable Sensors,Non-clinical Data . . . . . . . . . . . 10 6 7.1.2 ClinicalPhysiological Data. . . . . . . . . . . . . . . . . 10 7 7.2 Trend Detectionin PhysiologicalT imeSeries . . . . . . . . . . . 10 9 7.3 PatternAbstraction inPhysiological Data. . . . . . . . . . . . . 11 3 7.3.1 Backgroundon PatternAbstraction . . . . . . . . . . . . 11 3 7.3.2 PrototypicalPattern Abstraction. . . . . . . . . . . . . . 11 5

vi CONTENTS

4 Semantic Inference in Conceptual Spaces 53 4.1 Symbol Space Definition . . . . 54

4.2 Inferring Linguistic Descriptions . . . . 56

4.2.1 Phase A: Inference in Conceptual Space . . . . 57

4.2.2 Phase B: Inference in Symbol Space . . . . 65

4.3 Discussion . . . . 70

5 Results and Evaluation: A Case Study on Leaf Data Set 73 5.1 Constructing a Conceptual Space of Leaves . . . . 74

5.1.1 Domain Specification for Leaf Data set . . . . 75

5.1.2 Concept Representation for Leaf Concepts . . . . 77

5.2 Semantic Inference for Unknown Leaf Samples . . . . 77

5.2.1 Inference in Conceptual Space of Leaves . . . . 78

5.2.2 Inference in Symbol Space of Leaves . . . . 78

5.3 Empirical Evaluation for Leaf Samples . . . . 78

5.3.1 Survey: Design and Procedure . . . . 80

5.3.2 Identifying Leaf Observations from Linguistic Descriptions 82 5.3.3 Rating Linguistic Descriptions of Leaf Observations . . . 83

5.4 Discussion . . . . 85

II Physiological Sensor Data: From Data Analysis to Linguis- tic Descriptions 89 6 An Overview of Health Monitoring with Mining Physiological Sensor Data 91 6.1 Data Mining Tasks in Health Monitoring Systems . . . . 92

6.2 Data Mining in Health Monitoring Systems . . . . 95

6.2.1 Preprocessing . . . . 96

6.2.2 Feature Extraction/Selection . . . . 96

6.2.3 Modelling and Learning Methods . . . . 97

6.3 Data Sets: Acquisition and Properties . . . . 99

6.3.1 Sensor Data Acquisition . . . . 99

6.3.2 Sensor Data Properties . . . 100

6.4 Discussion and Challenges . . . 101

7 Physiological Time Series Data: Preparation and Processing 105 7.1 Input Time Series Sensor Data: Collection and Acquisition . . . 106

7.1.1 Wearable Sensors, Non-clinical Data . . . 106

7.1.2 Clinical Physiological Data . . . 107

7.2 Trend Detection in Physiological Time Series . . . 109

7.3 Pattern Abstraction in Physiological Data . . . 113

7.3.1 Background on Pattern Abstraction . . . 113

7.3.2 Prototypical Pattern Abstraction . . . 115



7.4 Discussion and Summary . . . 117

8 Mining and Describing Physiological Time Series Data 119

8.1 Mining Temporal Rules in Physiological Sensor Data . . . 120

8.1.1 Background on Temporal Rule Mining . . . 120

8.1.2 A New Approach for Temporal Rule Mining . . . 122

8.1.3 Temporal Rule Set Similarity . . . 124

8.1.4 Results: Distinctive Rules in Clinical Settings . . . 126

8.1.5 Evaluation of Rule Set Similarity in Clinical Conditions . . . 128

8.2 Linguistic Descriptions for Patterns and Temporal Rules . . . 131

8.2.1 Trend and Pattern Description . . . 132

8.2.2 Temporal Rule Representation . . . 133

8.3 Discussion and Summary . . . 136

9 Linguistic Descriptions for Patterns using Conceptual Spaces 139

9.1 Constructing Conceptual Space of Patterns . . . 140

9.1.1 Domain Specification for Time Series Pattern Data Set . . 141

9.1.2 Concept Representation for Pattern Concepts . . . 143

9.2 Semantic Inference for Unknown Patterns . . . 144

9.2.1 Inference in Conceptual Space of Patterns . . . 144

9.2.2 Inference in Symbol Space of Patterns . . . 145

9.3 Evaluation: Descriptions from Conceptual Spaces . . . 147

9.3.1 Survey: Design and Procedure for Pattern Data Set . . . 147

9.3.2 Identifying Pattern Observations from Linguistic Descriptions . . . 147

9.3.3 Rating Linguistic Descriptions of Pattern Observations . . . 149

9.4 Discussion and Summary . . . 150

10 Conclusions 153

10.1 Summary of Contributions . . . 153

10.1.1 Construction of Conceptual Spaces (C1) . . . 153

10.1.2 Semantic Inference in Conceptual Spaces (C2) . . . 154

10.1.3 Mining Physiological Sensor Data (C3) . . . 155

10.1.4 Linguistic Description by Semantic Representations (C4) . . . 155

10.2 Limitations . . . 156

10.3 Societal and Ethical Impacts . . . 158

10.4 Future Research Directions . . . 159

10.5 Final Words . . . 163

References 165


List of Figures

1.1 A part of the Rumi’s poem in Persian, together with an illustration of the story, adapted from [7]. . . . 1

1.2 A schematic overview of the tasks to be performed in both theoretical (inner box) and application (outer box) focuses of the thesis. . . . 6

1.3 Thesis MindMap, illustrating the appearance of the research tasks in the chapters, together with the contributions of this thesis. . . . 14

2.1 A schematic presentation of a conceptual space of fruits. . . . . 24

2.2 The architecture of data-to-text systems, proposed by Reiter. . . 31

3.1 Illustration of the main steps for constructing a conceptual space from a set of numeric data. . . . 37

3.2 Two phases of the domain and quality dimension specification, with input and output parameters of each phase. . . . 39

3.3 A weighted bipartite graph with two sets of vertices from the labels Y and the selected features F . . . 43

3.4 A bigraph graph and one selected biclique (blue edges) for the leaf example (explained in Example 3.5). . . . 45

3.5 A concept representation example in a conceptual space with domains $\delta_a$ and $\delta_b$ . . . 48

4.1 Illustration of the steps of inferring linguistic descriptions for an unknown observation via the constructed conceptual spaces and its corresponding symbol space. . . . 54

4.2 Schematic of a conceptual space and the coupled symbol space. . . . 55

4.3 Two phases of the semantic inference for generating linguistic descriptions, with the input and output parameters of each phase. . . . 57

4.4 An illustration of four different cases with respect to the various positions of an instance points within domains. . . . 60

