From Numerical Sensor Data to Semantic Representations:
A Data-driven Approach for Generating Linguistic Descriptions
To Sepideh
For her kindness and devotion, and for her endless support.
Örebro Studies in Technology 78
HADI BANAEE
From Numerical Sensor Data to Semantic Representations:
A Data-driven Approach for Generating Linguistic Descriptions
© Hadi Banaee, 2018
Title: From Numerical Sensor Data to Semantic Representations:
A Data-driven Approach for Generating Linguistic Descriptions
Publisher: Örebro University 2018
www.publications.oru.se
Print: Örebro University, Repro 03/2018
ISSN 1650-8580
ISBN 978-91-7529-240-3
Abstract
Hadi Banaee (2018): From Numerical Sensor Data to Semantic Representations:
A Data-driven Approach for Generating Linguistic Descriptions.
Örebro Studies in Technology 78.
In our daily lives, sensor recordings are becoming more and more ubiquitous. With the increased availability of data comes an increased need for systems that can represent the data in human-interpretable concepts. In order to describe unknown observations in natural language, an artificial intelligence system must deal with several issues involving perception, concept formation, and linguistic description. These issues cover various subfields within artificial intelligence, such as machine learning, cognitive science, and natural language generation.
The aim of this thesis is to address the problem of semantically modelling and describing numerical observations from sensor data. This thesis introduces data-driven approaches to perform the tasks of mining numerical data and creating semantic representations of the derived information in order to describe unseen but interesting observations in natural language.
The research considers creating a semantic representation using the theory of conceptual spaces. In particular, the central contribution of this thesis is to present a data-driven approach that automatically constructs conceptual spaces from labelled numerical data sets. This constructed conceptual space then utilises semantic inference techniques to derive linguistic interpretations for novel unknown observations. Another contribution of this thesis is to explore an instantiation of the proposed approach in a real-world application. Specifically, this research investigates a case study where the proposed approach is used to describe unknown time series patterns that emerge from physiological sensor data. This instantiation first presents automatic data analysis methods to extract time series patterns and temporal rules from multiple channels of physiological sensor data, and then applies various linguistic description approaches (including the proposed semantic representation based on conceptual spaces) to generate human-readable natural language descriptions for such time series patterns and temporal rules.
The main outcome of this thesis is the use of data-driven strategies that enable the system to reveal and explain aspects of sensor data which may otherwise be difficult to capture by knowledge-driven techniques alone. Briefly put, the thesis aims to automate the process whereby unknown observations of data can be 1) numerically analysed, 2) semantically represented, and eventually 3) linguistically described.
Keywords: Semantic representations, Conceptual spaces, Natural language
generation, Temporal rule mining, Physiological sensors, Health monitoring system.
Hadi Banaee, School of Science and Technology
Örebro University, SE-701 82 Örebro, Sweden, hadi.banaee@oru.se
Acknowledgements
Conceptual Space of Acknowledgement¹:
[Figure: a conceptual space with five domains; the labels recovered from the figure are listed per domain. The label "Nearby" also appears in the figure.]
Domain of Supervisors: Andrey, Erik, Mobyen, Amy
Domain of Contributors: AliS, Majan, Sahar, SaadM, AlbertG, YajiS, YannisK, Sepideh, EhudR, MehulB, PeterG, DimitraG, AntonioL, Amy, Erik, AMusic, Mohamad, Marcel
Domain of Friends: FarAway, Sina, Adrin, Saeedeh, AliS, Hessam, Here@Uni, AASS, Fabien, AliAb, Tomek, Yasha, Marjan, Iran, Sahar, Ranya, Houssam, Sepideh, MPI
Domain of Social Life: Svenskakurs, Lina, Marianne, Klasskamrater, Pizzakebabgang, Volleyball, Teammates, Amir, AliA, Sara, Accompaniers, BioRoxy, Musikhögskolan
Domain of Family: My brothers, My mom, Sepideh, Aref, My nephews, Martin, My nieces, Sepideh's family, Farhang, Hamid, Rouzbeh, Adel, Soheil, Mia, YannisP, Stella, Armin, Alireza
¹ There are no linguistic descriptions generated for this conceptual space.
Contents
1 Introduction 1
1.1 Motivation . . . . 2
1.2 Problem Statement . . . . 5
1.3 Research Question . . . . 6
1.4 Contributions . . . . 7
1.5 Thesis Outline . . . . 9
1.6 Publications . . . . 12
I Creating Semantic Representations for Numerical Data 17
2 Background and Related Work 19
2.1 Semantic Representation . . . . 19
2.2 On the Theory of Conceptual Spaces . . . . 20
2.2.1 Identifying Quality Dimensions . . . . 24
2.2.2 Related Work on Conceptual Spaces and AI . . . . 25
2.3 Generating Linguistic Descriptions . . . . 27
2.3.1 Linguistic Descriptions of Data (LDD) . . . . 27
2.3.2 Natural Language Generation (NLG) . . . . 29
2.4 Conclusions . . . . 33
3 Data-Driven Construction of Conceptual Spaces 35
3.1 Domain and Quality Dimension Specification . . . . 37
3.1.1 Feature Subset Ranking . . . . 39
3.1.2 Feature Subset Grouping . . . . 41
3.2 Concept Representation . . . . 44
3.2.1 Convex Regions of Concepts . . . . 46
3.2.2 Context-dependent Weights of Concepts . . . . 49
3.3 Discussion . . . . 49
4 Semantic Inference in Conceptual Spaces 53
4.1 Symbol Space Definition . . . . 54
4.2 Inferring Linguistic Descriptions . . . . 56
4.2.1 Phase A: Inference in Conceptual Space . . . . 57
4.2.2 Phase B: Inference in Symbol Space . . . . 65
4.3 Discussion . . . . 70
5 Results and Evaluation: A Case Study on Leaf Data Set 73
5.1 Constructing a Conceptual Space of Leaves . . . . 74
5.1.1 Domain Specification for Leaf Data set . . . . 75
5.1.2 Concept Representation for Leaf Concepts . . . . 77
5.2 Semantic Inference for Unknown Leaf Samples . . . . 77
5.2.1 Inference in Conceptual Space of Leaves . . . . 78
5.2.2 Inference in Symbol Space of Leaves . . . . 78
5.3 Empirical Evaluation for Leaf Samples . . . . 78
5.3.1 Survey: Design and Procedure . . . . 80
5.3.2 Identifying Leaf Observations from Linguistic Descriptions . . . . 82
5.3.3 Rating Linguistic Descriptions of Leaf Observations . . . . 83
5.4 Discussion . . . . 85
II Physiological Sensor Data: From Data Analysis to Linguistic Descriptions 89
6 An Overview of Health Monitoring with Mining Physiological Sensor Data 91
6.1 Data Mining Tasks in Health Monitoring Systems . . . . 92
6.2 Data Mining in Health Monitoring Systems . . . . 95
6.2.1 Preprocessing . . . . 96
6.2.2 Feature Extraction/Selection . . . . 96
6.2.3 Modelling and Learning Methods . . . . 97
6.3 Data Sets: Acquisition and Properties . . . . 99
6.3.1 Sensor Data Acquisition . . . . 99
6.3.2 Sensor Data Properties . . . 100
6.4 Discussion and Challenges . . . 101
7 Physiological Time Series Data: Preparation and Processing 105
7.1 Input Time Series Sensor Data: Collection and Acquisition . . . 106
7.1.1 Wearable Sensors, Non-clinical Data . . . 106
7.1.2 Clinical Physiological Data . . . 107
7.2 Trend Detection in Physiological Time Series . . . 109
7.3 Pattern Abstraction in Physiological Data . . . 113
7.3.1 Background on Pattern Abstraction . . . 113
7.3.2 Prototypical Pattern Abstraction . . . 115
vi CONTENTS
4 SemanticInference inConceptual Spaces 53 4.1 SymbolSpace Definition . . . . . . . . . . . . . . . . . . . . . . 54 4.2 InferringLinguistic Descriptions. . . . . . . . . . . . . . . . . . 56 4.2.1 PhaseA: Inferencein ConceptualSpace . . . . . . . . . . 57 4.2.2 PhaseB: Inferencein SymbolSpace . . . . . . . . . . . . 65 4.3 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 5 Resultsand Evaluation:A CaseStudy onLeaf DataSet 73 5.1 Constructinga ConceptualSpace ofLeaves . . . . . . . . . . . . 74 5.1.1 DomainSpecification forLeaf Dataset . . . . . . . . . . 75 5.1.2 ConceptRepresentation forLeaf Concepts . . . . . . . . 77 5.2 SemanticInference forUnknown LeafSamples . . . . . . . . . . 77 5.2.1 Inferencein ConceptualSpace ofLeaves . . . . . . . . . 78 5.2.2 Inferencein SymbolSpace ofLeaves . . . . . . . . . . . 78 5.3 EmpiricalEvaluation forLeaf Samples . . . . . . . . . . . . . . 78 5.3.1 Survey:Design andProcedure . . . . . . . . . . . . . . . 80 5.3.2 IdentifyingLeaf Observationsfrom LinguisticDescriptions 82 5.3.3 RatingLinguistic Descriptionsof LeafObservations . . . 83 5.4 Discussion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 II PhysiologicalSensor Data:From DataAnalysis toLinguis- ticDescriptions 89 6 AnOverview ofHealth Monitoringwith MiningPhysiological Sensor Data 91 6.1 DataMining Tasks inHealth MonitoringSystems . . . . . . . . 92 6.2 DataMining inHealth MonitoringSystems . . . . . . . . . . . 95 6.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . 96 6.2.2 FeatureExtraction/Selection . . . . . . . . . . . . . . . . 96 6.2.3 Modellingand LearningMethods . . . . . . . . . . . . . 97 6.3 DataSets: Acquisitionand Properties . . . . . . . . . . . . . . . 99 6.3.1 SensorData Acquisition . . . . . . . . . . . . . . . . . . 99 6.3.2 SensorData Properties . . . . . . . . . . . . . . . . . . . 10 0 6.4 Discussionand Challenges . . . . . . . . . . . . . . . . . . . . . 
10 1 7 PhysiologicalT imeSeries Data:Preparation andProcessing 105 7.1 InputT imeSeries SensorData: Collectionand Acquisition . . . 106 7.1.1 Wearable Sensors,Non-clinical Data . . . . . . . . . . . 10 6 7.1.2 ClinicalPhysiological Data. . . . . . . . . . . . . . . . . 10 7 7.2 Trend Detectionin PhysiologicalT imeSeries . . . . . . . . . . . 10 9 7.3 PatternAbstraction inPhysiological Data. . . . . . . . . . . . . 11 3 7.3.1 Backgroundon PatternAbstraction . . . . . . . . . . . . 11 3 7.3.2 PrototypicalPattern Abstraction. . . . . . . . . . . . . . 11 5
vi CONTENTS4 Semantic Inference in Conceptual Spaces 53 4.1 Symbol Space Definition . . . . 54
4.2 Inferring Linguistic Descriptions . . . . 56
4.2.1 Phase A: Inference in Conceptual Space . . . . 57
4.2.2 Phase B: Inference in Symbol Space . . . . 65
4.3 Discussion . . . . 70
5 Results and Evaluation: A Case Study on Leaf Data Set 73 5.1 Constructing a Conceptual Space of Leaves . . . . 74
5.1.1 Domain Specification for Leaf Data set . . . . 75
5.1.2 Concept Representation for Leaf Concepts . . . . 77
5.2 Semantic Inference for Unknown Leaf Samples . . . . 77
5.2.1 Inference in Conceptual Space of Leaves . . . . 78
5.2.2 Inference in Symbol Space of Leaves . . . . 78
5.3 Empirical Evaluation for Leaf Samples . . . . 78
5.3.1 Survey: Design and Procedure . . . . 80
5.3.2 Identifying Leaf Observations from Linguistic Descriptions 82 5.3.3 Rating Linguistic Descriptions of Leaf Observations . . . 83
5.4 Discussion . . . . 85
II Physiological Sensor Data: From Data Analysis to Linguis- tic Descriptions 89 6 An Overview of Health Monitoring with Mining Physiological Sensor Data 91 6.1 Data Mining Tasks in Health Monitoring Systems . . . . 92
6.2 Data Mining in Health Monitoring Systems . . . . 95
6.2.1 Preprocessing . . . . 96
6.2.2 Feature Extraction/Selection . . . . 96
6.2.3 Modelling and Learning Methods . . . . 97
6.3 Data Sets: Acquisition and Properties . . . . 99
6.3.1 Sensor Data Acquisition . . . . 99
6.3.2 Sensor Data Properties . . . 100
6.4 Discussion and Challenges . . . 101
7 Physiological Time Series Data: Preparation and Processing 105 7.1 Input Time Series Sensor Data: Collection and Acquisition . . . 106
7.1.1 Wearable Sensors, Non-clinical Data . . . 106
7.1.2 Clinical Physiological Data . . . 107
7.2 Trend Detection in Physiological Time Series . . . 109
7.3 Pattern Abstraction in Physiological Data . . . 113
7.3.1 Background on Pattern Abstraction . . . 113
7.3.2 Prototypical Pattern Abstraction . . . 115
vi CONTENTS
4 SemanticInference
inConceptual Spaces
53
4.1 SymbolSpace
Definition . . . . . . . . . . . . . . . . . . . . . . 54
4.2 InferringLinguistic
Descriptions.
. . . . . . . . . . . . . . . . . 56
4.2.1 PhaseA:
Inferencein ConceptualSpace
. . . . . . . . . . 57
4.2.2 PhaseB:
Inferencein SymbolSpace
. . . . . . . . . . . . 65
4.3 Discussion.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5 Resultsand
Evaluation:A CaseStudy onLeaf
DataSet 73
5.1 Constructinga
7.4 Discussion and Summary . . . 117
8 Mining and Describing Physiological Time Series Data 119
8.1 Mining Temporal Rules in Physiological Sensor Data . . . 120
8.1.1 Background on Temporal Rule Mining . . . 120
8.1.2 A New Approach for Temporal Rule Mining . . . 122
8.1.3 Temporal Rule Set Similarity . . . 124
8.1.4 Results: Distinctive Rules in Clinical Settings . . . 126
8.1.5 Evaluation of Rule Set Similarity in Clinical Conditions . . . 128
8.2 Linguistic Descriptions for Patterns and Temporal Rules . . . 131
8.2.1 Trend and Pattern Description . . . 132
8.2.2 Temporal Rule Representation . . . 133
8.3 Discussion and Summary . . . 136
9 Linguistic Descriptions for Patterns using Conceptual Spaces 139
9.1 Constructing Conceptual Space of Patterns . . . 140
9.1.1 Domain Specification for Time Series Pattern Data Set . . . 141
9.1.2 Concept Representation for Pattern Concepts . . . 143
9.2 Semantic Inference for Unknown Patterns . . . 144
9.2.1 Inference in Conceptual Space of Patterns . . . 144
9.2.2 Inference in Symbol Space of Patterns . . . 145
9.3 Evaluation: Descriptions from Conceptual Spaces . . . 147
9.3.1 Survey: Design and Procedure for Pattern Data Set . . . 147
9.3.2 Identifying Pattern Observations from Linguistic Descriptions . . . 147
9.3.3 Rating Linguistic Descriptions of Pattern Observations . . . 149
9.4 Discussion and Summary . . . 150
10 Conclusions 153
10.1 Summary of Contributions . . . 153
10.1.1 Construction of Conceptual Spaces (C1) . . . 153
10.1.2 Semantic Inference in Conceptual Spaces (C2) . . . 154
10.1.3 Mining Physiological Sensor Data (C3) . . . 155
10.1.4 Linguistic Description by Semantic Representations (C4) . . . 155
10.2 Limitations . . . 156
10.3 Societal and Ethical Impacts . . . 158
10.4 Future Research Directions . . . 159
10.5 Final Words . . . 163
References 165
List of Figures
1.1 A part of Rumi’s poem in Persian, together with an illustration of the story, adapted from [7]. . . . 1
1.2 A schematic overview of the tasks to be performed in both theoretical (inner box) and application (outer box) focuses of the thesis. . . . 6
1.3 Thesis MindMap, illustrating the appearance of the research tasks in the chapters, together with the contributions of this thesis. . . . 14
2.1 A schematic presentation of a conceptual space of fruits. . . . 24
2.2 The architecture of data-to-text systems, proposed by Reiter. . . . 31
3.1 Illustration of the main steps for constructing a conceptual space from a set of numeric data. . . . 37
3.2 Two phases of the domain and quality dimension specification, with input and output parameters of each phase. . . . 39
3.3 A weighted bipartite graph with two sets of vertices from the labels Y and the selected features F . . . 43
3.4 A bigraph and one selected biclique (blue edges) for the leaf example (explained in Example 3.5). . . . 45
3.5 A concept representation example in a conceptual space with domains δ_a and δ_b. . . . 48
4.1 Illustration of the steps of inferring linguistic descriptions for an unknown observation via the constructed conceptual spaces and its corresponding symbol space. . . . 54
4.2 Schematic of a conceptual space and the coupled symbol space. . . . 55
4.3 Two phases of the semantic inference for generating linguistic descriptions, with the input and output parameters of each phase. . . . 57
4.4 An illustration of four different cases with respect to the various positions of an instance point within domains. . . . 60