simulation in heavy-tailed settings
Thorbjörn Gudmundsson
In this thesis a method based on a Markov chain Monte Carlo (MCMC) algorithm is proposed to compute the probability of a rare event. The conditional distribution of the underlying process given that the rare event occurs has the probability of the rare event as its normalising constant. Using the MCMC methodology a Markov chain is simulated, with that conditional distribution as its invariant distribution, and information about the normalising constant is extracted from its trajectory.

The algorithm is described in full generality and applied to four different problems of computing rare-event probabilities. The first problem considers a random walk Y_1 + · · · + Y_n exceeding a high threshold, where the increments Y are independent and identically distributed and heavy-tailed. The second problem is an extension of the first one to a heavy-tailed random sum Y_1 + · · · + Y_N exceeding a high threshold, where the number of increments N is random and independent of Y_1, Y_2, . . .. The third problem considers a stochastic recurrence equation X_n = A_n X_{n−1} + B_n exceeding a high threshold, where the innovations B are independent and identically distributed and heavy-tailed. The final problem considers the ruin probability for an insurance company with risky investments.

An unbiased estimator of the reciprocal probability is constructed for each corresponding problem, whose normalised variance vanishes asymptotically. The algorithm is illustrated numerically and compared to existing importance sampling algorithms.
In this thesis a method based on MCMC (Markov chain Monte Carlo) is presented for computing the probability of a rare event. The conditional distribution of the underlying process, given that the rare event occurs, has the sought probability as its normalising constant. Using the MCMC methodology a Markov chain is constructed with the conditional distribution as its invariant distribution, and an estimate of the normalising constant is based on the simulated chain.

The algorithm is described in full generality and applied to four example problems. The first problem concerns a random walk Y_1 + · · · + Y_n exceeding a high threshold, where the steps Y are independent and identically distributed with a heavy-tailed distribution. The second problem is an extension of the first to a sum of a random number of terms. The third problem treats the probability that the solution X_n of a stochastic recurrence equation X_n = A_n X_{n−1} + B_n exceeds a high threshold, where the innovations B are independent and identically distributed with a heavy-tailed distribution. The last problem concerns the ruin probability of an insurance company with risky investments.

For each example problem an unbiased estimator of the reciprocal probability is constructed. The estimators are efficient in the sense that their normalised variance tends to zero. Moreover, the constructed Markov chains are uniformly ergodic. The algorithms are illustrated numerically and compared with existing importance sampling algorithms.
I want to express my deepest appreciation for the support and help that I have received from my supervisor Henrik Hult. I am truly grateful for being given the opportunity to work under his guidance.

I want to offer my special thanks to colleagues at the faculty for their advice and help, in particular Filip Lindskog and Tobias Rydén. I also want to thank my fellow Ph.D. students, Björn, Johan and Pierre, for countless discussions and practice sessions on the blackboard.

Finally, I want to thank my two special ones, Rannveig and Gyða, for their immense support and love.
1 Introduction
 1.1 Stochastic simulation
  1.1.1 Sampling a random variable
  1.1.2 Markov chain Monte Carlo
  1.1.3 Rare-event simulation
  1.1.4 Importance sampling
  1.1.5 Heavy-tailed distributions
 1.2 Markov chain Monte Carlo in rare-event simulation
  1.2.1 Formulation
  1.2.2 Controlling the normalised variance
  1.2.3 Ergodic properties
  1.2.4 Efficiency of the MCMC algorithm
 1.3 Outline and contribution of this thesis
2 General Markov chain Monte Carlo formulation
 2.1 Asymptotic efficiency criteria
3 Heavy-tailed Random Walk
 3.1 A Gibbs sampler for computing P(S_n > a_n)
 3.2 Constructing an efficient estimator
 3.3 Numerical experiments
4 Heavy-tailed Random Sum
 4.1 A Gibbs sampler for computing P(S_{N_n} > a_n)
 4.2 Constructing an efficient estimator
 4.3 Numerical experiments
5 Stochastic Recurrence Equations
 5.1 A Gibbs sampler for computing P(X_m > c_n)
 5.2 Constructing an efficient estimator
 5.3 Numerical experiments
6 Ruin probability in an Insurance Model with Risky Investments
 6.1 A Gibbs sampler for computing the ruin probability
 6.2 Constructing an efficient estimator of the reciprocal ruin probability
1 Introduction

Mathematical modelling of systems, for instance in the natural sciences, has been one of the key building blocks of scientific understanding. The system of interest may be the motion of the planets, the dynamic flow in a liquid, changes in stock prices or the total amount of insurance claims made in a year. Often the model involves the system's dynamic laws, long-time behaviour and different possible scenarios. Such models nearly always include a parameter, or a set of parameters, which, though unknown in advance, are still needed to calibrate the model to reality. Thus, in order to have a fully specified model capable of forecasting future properties or values, one needs to measure the values of the unknown parameters, thereby most likely introducing some measurement error. This error is assumed to be random, and thus the resulting forecast is the outcome of a stochastic mathematical model.
With the ever increasing computational capacity of recent decades, models are becoming more and more complex. Minor aspects that were ignored in the simpler models can now be included in the computations, with increasing complexity. Researchers and practitioners alike strive to enhance current models and introduce more and more details to them, in the hope of increasing their forecasting ability. Weather systems and financial processes are examples of models that today are so involved that it is becoming difficult to give analytical, closed-form answers to property and forecasting questions. This has given rise to an alternative approach to handling such complex stochastic models, namely stochastic simulation.

Briefly, simulation is the process of sampling the underlying random factors of a model to generate many instances of it, in order to make inferences about its properties. This has proved to be a powerful tool for computation in many academic fields such as physics, chemistry, economics, finance and insurance. Generating instances of even highly advanced stochastic models, multi-dimensional and non-linear, can be done in a few milliseconds. Stochastic simulation has thus played its part in the scientific progress of recent decades, and simulation itself has grown into an academic field in its own right.
In physics, hypotheses are often tested and verified via a number of experiments. One experiment is carried out after another, and if sufficiently many of the experiments support the hypothesis then it acquires a certain validity and becomes a theory. This was for instance the case at CERN in the summer of 2012, when the existence of the Higgs boson was confirmed through experiments which supported the old and well known hypothesis. However, one cannot always carry out experiments to validate hypotheses. Sometimes it is simply impossible to replicate the model in reality, as is the case when studying the effects of global warming. Obviously, since we can only generate a single physical instance of the Earth, any simulations need to be done via computer modelling. To better reflect reality, the resolution needs to be high and many different physical and meteorological factors need to be taken into account. The surface of the Earth is broken into 10 km by 10 km squares, each with its temperature, air pressure, moisture and more. The dynamics of these weather factors need to be simulated with small time steps, perhaps many years into the future. The Mathematics and Climate Research Network (MCRN) carries out extensive stochastic simulations, replicating the Earth using different types of models; such stochastic simulation is immensely computationally costly. This scientific work alone justifies the importance of continuing research and improvement in the field of stochastic simulation.
A subfield of stochastic simulation which deals with unlikely events of small probability is called rare-event simulation. Examples of rare-event simulation include calculating the capital requirements of a financial firm subject to Basel III regulations, or of an insurance company subject to Solvency II regulations. Natural catastrophes such as avalanches and volcanic eruptions, to name but a few, are also types of rare events which we are interested in analysing. This is of particular importance when it comes to computationally heavy models. That is because, if an event is rare, a computer needs many simulations to get a fair picture of its frequency and the circumstances in which it occurs. And if every simulation takes up a lot of computational time, then a thorough study would require a prohibitive amount of computer time. Therefore the development of efficient rare-event stochastic simulation is of high importance.
The effect of heavy tails in stochastic modelling is an important factor not to be overlooked. By heavy tails we mean essentially that there is a non-negligible probability of extreme outcomes that differ significantly from the average. Such extreme outcomes may have a considerable impact on a stochastic system. For instance, large claims due to a catastrophic event may arrive at an insurance company, causing serious financial distress for the company. Similarly, large fluctuations on the financial market may lead to insolvency of financial institutions. In data networks the arrival of huge files may cause serious delays in the network, and so on.
This thesis presents a new methodology in rare-event simulation based on the theory of Markov chain Monte Carlo. The general method presented in Section 2 makes very modest probabilistic assumptions, and in subsequent sections (random walk in Section 3, random sum in Section 4, stochastic recurrence equations in Section 5, ruin probability in Section 6) it is applied to a few concrete examples and shown to be efficient.
1.1 Stochastic simulation

In this section we introduce the basic tools of stochastic simulation, such as pseudo-random numbers, the inversion method and Monte Carlo. We present the Markov chain Monte Carlo methodology and briefly discuss ergodicity.
1.1.1 Sampling a random variable

In this section we present the foundations of stochastic simulation, namely the generation of a pseudo-random number by a computer and how it can be used to sample a random variable via the inversion method.

Most statistical software programs provide methods for generating a uniformly distributed pseudo-random number on the interval, say, [0, 1]. These algorithms are deterministic at their core, and can only imitate the properties and behaviour of a uniformly distributed random variable. The early designs of such algorithms showed flaws in the sense that the pseudo-random numbers generated followed a pattern which could easily be identified and predicted. Later designs produce pseudo-random numbers mimicking a true random number quite well. For the purposes of this thesis we assume the existence of an algorithm producing a uniformly distributed pseudo-random number, and ignore any deficiencies and errors arising from the algorithm. In short, we assume that we can sample a perfectly uniformly distributed random variable in some computer program. For a more thorough and detailed discussion we refer to [48].

Now consider a random variable X and denote by F its probability distribution. Say we would like, via some computer software, to sample the random variable X. One approach is the inversion method, which involves simply applying the quantile function to a uniformly distributed random variable. More formally the algorithm is as follows.
1. Sample U from the standard uniform distribution.

2. Compute Z = F^{−1}(U), where F^{−1}(p) = min{x | F(x) ≥ p}.

The random variable Z has the same distribution as X, as the following display shows:

P(Z ≤ x) = P(F^{−1}(U) ≤ x) = P(U ≤ F(x)) = F(x).

The method can easily be extended to sampling X conditioned on being larger than some constant c, meaning that we want to sample from the conditional distribution P(X ∈ · | X > c). The algorithm is formally as follows.

1. Sample U from the standard uniform distribution.

2. Compute Z = F^{−1}( (1 − F(c))U + F(c) ).

The distribution of Z is given by

P(Z ≤ x) = P( (1 − F(c))U + F(c) ≤ F(x) )
         = P( U ≤ (F(x) − F(c)) / (1 − F(c)) )
         = (F(x) − F(c)) / (1 − F(c))
         = P(c < X ≤ x) / P(X > c)
         = P(X ≤ x | X > c).

Thus the inversion method provides a simple way of sampling a random variable, conditioned on being larger than c, based solely on the generation of a uniformly distributed random number.
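As a concrete illustration (ours, not from the thesis), the two steps above take an explicit form for a Pareto distribution F(x) = 1 − x^{−α}, x ≥ 1, whose quantile function is available in closed form:

```python
import random

def pareto_inv(p, alpha=2.0):
    """Quantile function F^{-1}(p) = min{x : F(x) >= p} for the
    Pareto distribution F(x) = 1 - x^(-alpha), x >= 1."""
    return (1.0 - p) ** (-1.0 / alpha)

def sample_conditional(c, alpha, rng):
    """Sample X | X > c via Z = F^{-1}((1 - F(c)) U + F(c))."""
    u = rng.random()
    f_c = 1.0 - c ** (-alpha)          # F(c)
    return pareto_inv((1.0 - f_c) * u + f_c, alpha)

rng = random.Random(1)
samples = [sample_conditional(c=10.0, alpha=2.0, rng=rng) for _ in range(50_000)]
print(min(samples) > 10.0)             # every draw exceeds the threshold c
```

Since the threshold enters only through F(c), no draws are wasted below c, in contrast to naive accept-reject sampling from F.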
The most standard tool for stochastic simulation is the Monte Carlo technique. The power of Monte Carlo is its simplicity. Let X be a random variable and assume we want to compute the probability of {X ∈ A} for some Borel set A. The idea of Monte Carlo is to sample independent and identically distributed copies of the random variable, say X_1, . . . , X_n, and simply compute the frequency of hitting the set A. More formally, the Monte Carlo estimator of P(X ∈ A) is given by

p̂ = (1/n) Σ_{i=1}^{n} I{X_i ∈ A}.

While the procedure is easy and simple, there are drawbacks that will be discussed in Section 1.1.3.
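A minimal sketch of the estimator (our example, not the thesis's), using an exponential distribution so that P(X ∈ A) is known in closed form and the output can be checked:

```python
import random
import math

def monte_carlo(sample, in_A, n, rng):
    """Crude Monte Carlo estimator: frequency of {X in A} over n i.i.d. draws."""
    return sum(in_A(sample(rng)) for _ in range(n)) / n

rng = random.Random(0)
# Example: X ~ Exp(1), A = (1, inf), so P(X in A) = exp(-1) ≈ 0.368.
p_hat = monte_carlo(sample=lambda r: r.expovariate(1.0),
                    in_A=lambda x: x > 1.0,
                    n=100_000, rng=rng)
print(abs(p_hat - math.exp(-1)) < 0.01)
```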
1.1.2 Markov chain Monte Carlo

In this section we present a simulation technique called Markov chain Monte Carlo (MCMC) for sampling a random variable X despite only having limited information about its distribution.

MCMC is typically useful when sampling a random variable X having a density f that is only known up to a constant, say

f(x) = π(x)/c,

where π is known but c = ∫ π(x)dx is unknown. This may seem a strange setup at first, but once it is noted that the normalising constant c may be difficult to determine, say there is no known closed form for c, then this is a natural formulation. Examples of this type of setup can be found in Bayesian statistics and hidden Markov chains.

In short, the basic idea of sampling via MCMC is to generate a Markov chain (Y_t)_{t≥0} whose invariant density is the same as that of X, namely f. There exist plenty of MCMC algorithms, but we shall only name two in this thesis, the Metropolis-Hastings algorithm and the Gibbs sampler.
The method first laid out by Metropolis [41] and then extended by Hastings [26] is based on a proposal density, which we shall denote by g. Firstly the Markov chain (Y_t)_{t≥0} is initialised with some Y_0 = y_0. The idea behind the Metropolis-Hastings algorithm is to generate a proposal state Z using the proposal density g. The next state of the Markov chain is then assigned the value Z with the acceptance probability α; otherwise the next state of the Markov chain stays unchanged (i.e. retains the same value as before). More formally the algorithm is as follows.

Algorithm 1.1. Set Y_0 = y_0. For a given state Y_k, k = 0, 1, . . ., the next state Y_{k+1} is sampled as follows.

1. Sample Z from the proposal density g.

2. Let Y_{k+1} = Z with probability α(Y_k, Z), and Y_{k+1} = Y_k otherwise, where

α(y, z) = min{1, r(y, z)}, r(y, z) = π(z)g(z, y) / (π(y)g(y, z)).

This algorithm produces a Markov chain (Y_k)_{k≥1} whose invariant density is given by f. For more details on the Metropolis-Hastings algorithm we refer to [3] and [23].

Another method of MCMC sampling is the Gibbs sampler, which was originally introduced by Geman and Geman in [22]. If the random variable X is multi-dimensional, X = (X_1, . . . , X_d), the Gibbs sampler updates one component at a time by sampling from the conditional marginal distributions. Let

f_{k|≠k}(x_k | x_1, . . . , x_{k−1}, x_{k+1}, . . . , x_d), k = 1, . . . , d,

denote the conditional density of X_k given X_1, . . . , X_{k−1}, X_{k+1}, . . . , X_d. The Gibbs sampler can be viewed as a special case of the Metropolis-Hastings algorithm where, given Y_k = (Y_{k,1}, . . . , Y_{k,d}), one first updates Y_{k,1} from the conditional density f_{1|≠1}(· | Y_{k,2}, . . . , Y_{k,d}), then Y_{k,2} from the conditional density f_{2|≠2}(· | Y_{k+1,1}, Y_{k,3}, . . . , Y_{k,d}), and so on. For these updates the acceptance probability is always equal to 1, so no acceptance step is needed.

An important property of a Markov chain is its ergodicity. Informally, ergodicity measures how quickly the Markov chain mixes and thus how soon the dependency of the chain dies out. This is a highly desired property since good mixing speeds up the convergence of the Markov chain.
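A compact random-walk Metropolis sketch (our illustration; the target and tuning are arbitrary choices, not from the thesis) for a density known only up to its normalising constant. A Gibbs sampler would replace the proposal step by draws from the conditional densities and always accept:

```python
import random
import math

def metropolis(log_pi, y0, steps, step_size, rng):
    """Random-walk Metropolis: the Gaussian proposal is symmetric,
    so the Hastings ratio r(y,z) reduces to pi(z)/pi(y)."""
    chain, y = [], y0
    for _ in range(steps):
        z = y + rng.gauss(0.0, step_size)       # proposal state Z
        log_r = log_pi(z) - log_pi(y)
        if rng.random() < math.exp(min(0.0, log_r)):
            y = z                               # accept with probability alpha
        chain.append(y)                         # else the state stays unchanged
    return chain

rng = random.Random(42)
# Target known only up to a constant: pi(x) = exp(-x^2/2) (standard normal).
chain = metropolis(lambda x: -0.5 * x * x, y0=0.0,
                   steps=200_000, step_size=2.4, rng=rng)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
print(abs(mean) < 0.1, abs(var - 1.0) < 0.15)   # chain matches N(0,1) moments
```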
1.1.3 Rare-event simulation

In some specific cases we are interested in computing the probability of a rare event. This may be the probability of ruin of a financial company due to randomness in the future value of assets and liabilities. The multidimensional system of investments and bonds may be so complex that a simulation of the catastrophic event of a ruin may not be feasible. For another example, consider a graph of some sort and say we send out a particle on a random walk along the graph given some starting position. Computing the small, and quickly decreasing, probability of that particle returning to its starting position may be of interest, as it is an indicator of that graph's dimension. For these reasons and many others, the computation of the probability of a rare event is relevant.

Consider an unbiased estimator p̂ of the probability p and investigate its performance as the probability gets smaller, p → 0. A useful performance measure is the relative error:

RE(p̂) = Std(p̂)/p.

An estimator is said to have vanishing relative error if RE(p̂) → 0 as p → 0, and bounded relative error if RE(p̂) stays bounded as p → 0.

It is well known that the Monte Carlo estimator is inefficient for computing rare-event probabilities, as the following argument shows. Let X be a given random variable with distribution function F and say we would like to compute p = P(X ∈ A). We sample a number of i.i.d. copies of X, denoted by X_1, . . . , X_n, and compute

p̂ = (1/n) Σ_{i=1}^{n} I{X_i ∈ A}.

The variance of the estimator is

Var(p̂) = (1/n) p(1 − p),

which clearly tends to zero as n → ∞, but that is not the main concern here. What is more interesting is the relative error as the probability p tends to zero. The relative error is given by

Std(p̂)/p = sqrt( (1/n)(1/p − 1) ).

The relative error tends to infinity as p → 0, making the Monte Carlo algorithm very costly when it comes to rare-event simulation. For example, if a relative error of 1% is desired and the probability is of order 10^{−6}, then we need to take n such that sqrt((10^6 − 1)/n) ≤ 0.01. This implies n ≈ 10^{10}, which is infeasible on most computer systems.
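Solving the relative-error formula for n gives the required sample size directly; a small helper (our notation) makes the 10^{10} figure explicit:

```python
import math

def required_n(p, target_re):
    """Smallest n with sqrt((1/p - 1)/n) <= target_re: the crude
    Monte Carlo sample size needed for a given relative error."""
    return math.ceil((1.0 / p - 1.0) / target_re ** 2)

n = required_n(p=1e-6, target_re=0.01)
print(n)   # about 1e10 samples for a 1% relative error at p = 1e-6
```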
To improve on standard Monte Carlo, a control mechanism needs to be introduced that steers the samples towards the relevant part of the state space, thereby increasing the relevance of each sample. There are several ways to do this, for instance by importance sampling, described briefly below, or by splitting [14].
1.1.4 Importance sampling

The simulation method of importance sampling comes as a remedy to the problem arising in rare-event simulation. The underlying problem of Monte Carlo simulation for rare-event studies is the fact that we get too few samples in the important part of the output space, meaning that we get too few samples where {X ∈ A}. The basic idea of importance sampling is that instead of sampling from the original distribution F, the X_1, . . . , X_n are sampled from a so-called sampling distribution, say G. The sampling distribution G is chosen such that we obtain more samples where {X ∈ A}. The importance sampling estimator is then simply the average of hitting the event, weighted with the relevant Radon-Nikodym derivative:

p̂_IS = (1/n) Σ_{i=1}^{n} (dF/dG)(X_i) I{X_i ∈ A}.

This is an unbiased and consistent estimator since

E_G[p̂_IS] = ∫_A (dF/dG) dG = P(X ∈ A).
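A classic light-tailed illustration (ours, not from the thesis): estimating the normal tail probability P(X > 4) by sampling from the mean-shifted distribution G = N(4, 1), for which the weight dF/dG is explicit:

```python
import random
import math

def is_normal_tail(a, n, rng):
    """Importance sampling estimate of p = P(X > a) for X ~ N(0,1),
    sampling from G = N(a, 1); dF/dG(x) = exp(-a*x + a*a/2)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(a, 1.0)                    # sample under G
        if x > a:
            total += math.exp(-a * x + 0.5 * a * a)
    return total / n

rng = random.Random(7)
p_hat = is_normal_tail(a=4.0, n=100_000, rng=rng)
p_true = 0.5 * math.erfc(4.0 / math.sqrt(2.0))   # ≈ 3.17e-5
print(abs(p_hat / p_true - 1.0) < 0.05)
```

With 10^5 samples the relative error here is well under 1%, whereas crude Monte Carlo would need on the order of 10^9 samples for comparable accuracy.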
The main difficulty in importance sampling is to design the sampling distribution. Traditionally the functionality and reliability of new stochastic simulation algorithms is proved by running extensive numerical experiments. But numerical evidence alone is insufficient. There are numerous examples where the standard heuristics fail and the numerical evidence indicates that the algorithm has converged when, in fact, it is severely biased [24]. The limited evidence provided by simply running numerical experiments has generated the need for a deeper theoretical understanding and analysis of the performance of stochastic simulation algorithms. Over the last decade, mathematical tools from stability theory and control theory have been developed with the aim to theoretically quantify the performance of stochastic simulation algorithms for computing probabilities of rare events. In the context of importance sampling two main approaches have been studied: the subsolution approach, based on control theory, by Dupuis, Wang, and collaborators, see e.g. [18, 19, 17], and the approach based on Lyapunov functions and stability theory by Blanchet, Glynn, and others, see [5, 6, 7, 10].

In this theoretical work an importance sampling algorithm is said to be efficient if the relative error per sample, Std(p̂)/p, does not grow too rapidly as p ↓ 0.

1.1.5 Heavy-tailed distributions
In this thesis we consider in particular probability distributions F with heavy tails. The notion of heavy tails refers to the rate of decay of the tail F̄ = 1 − F of a distribution function F. A popular class of heavy-tailed distributions is the class of subexponential distributions. A distribution function F supported on the positive axis is said to belong to the subexponential distributions if

lim_{x→∞} P(X_1 + X_2 > x) / P(X_1 > x) = 2,

for independent random variables X_1 and X_2 with distribution F. A subclass of the subexponential distributions is the regularly varying distributions. F̄ is called regularly varying (at ∞) with index −α ≤ 0 if

lim_{t→∞} F̄(tx) / F̄(t) = x^{−α}, for all x > 0.

The heavy-tailed distributions are often described with the one big jump analogy, meaning that the event of a sum of heavy-tailed random variables being large is dominated by the case of one of the variables being very large whilst the rest are relatively small. This is in sharp contrast to the case of light tails, where the same event is dominated by the case of every variable contributing equally to the total. As a reference to the one big jump analogy we refer the reader to [28, 30, 15].
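The defining ratio of subexponentiality can be checked by simulation. In this sketch (ours, not from the thesis) the ratio for two Pareto(1.5) variables settles near 2 at a large threshold, whereas for Exp(1) the same ratio equals 1 + x exactly and grows without bound:

```python
import random

def ratio_two_terms(draw, x, trials, rng):
    """Monte Carlo estimate of P(X1 + X2 > x) / P(X1 > x)."""
    num = den = 0
    for _ in range(trials):
        x1, x2 = draw(rng), draw(rng)
        num += (x1 + x2) > x
        den += x1 > x
    return num / den

rng = random.Random(5)
pareto = lambda r: r.random() ** (-1.0 / 1.5)   # Pareto(1.5): subexponential
r = ratio_two_terms(pareto, x=30.0, trials=400_000, rng=rng)
# For Exp(1) the exact ratio at x = 30 would be 1 + x = 31: no single big jump.
print(1.5 < r < 3.0)   # ratio near 2: the one-big-jump regime
```

The finite-x estimate sits somewhat above 2 (the limit is approached from above for Pareto tails), but the contrast with the light-tailed case is already clear.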
This one big jump phenomenon has been observed in empirical data. For instance, when we consider stock market indices such as Nasdaq, Dow Jones etc., it turns out that the distribution of daily log returns typically has a heavy left tail, see Hult et al. in [29]. Another example is the well studied Danish fire insurance data, which consists of real-life claims caused by industrial fires in Denmark. While the arrival of claims is shown to be not far from Poisson, the claim size distribution shows clear heavy-tail behaviour. The data set is analysed by Mikosch in [43] and the tail of the claim size is shown to be fit well by a Pareto distribution.

Stochastic simulation in the presence of heavy-tailed distributions has been studied with much interest in recent years. The conditional Monte Carlo technique was applied in this setting by Asmussen et al. [2, 4]. Dupuis et al. [16] used an importance sampling algorithm in a heavy-tailed setting. Finally we mention the work of Blanchet et al. considering heavy-tailed distributions in [11, 8].
1.2 Markov chain Monte Carlo in rare-event simulation

In this section we describe a new methodology, based on Markov chain Monte Carlo (MCMC), for computing probabilities of rare events. A more general version of the algorithm, for computing expectations, is provided in Section 2 along with a precise asymptotic efficiency criterion.
1.2.1 Formulation

Let X be a real-valued random variable with distribution F and density f with respect to the Lebesgue measure. The problem is to compute the probability

p = P(X ∈ A) = ∫_A dF.    (1.1)

The event {X ∈ A} is thought of as rare in the sense that p is small. Let F_A be the conditional distribution of X given X ∈ A. The density of F_A is given by

(dF_A/dx)(x) = f(x) I{x ∈ A} / p.    (1.2)

Consider a Markov chain (X_t)_{t≥0} with invariant density given by (1.2). Such a Markov chain can be constructed by implementing an MCMC algorithm such as a Gibbs sampler or a Metropolis-Hastings algorithm, see e.g. [3, 23].
To construct an estimator for the normalising constant p, consider a non-negative function v which is normalised in the sense that ∫_A v(x)dx = 1. The function v will be chosen later as part of the design of the estimator. For any choice of v the sample mean

(1/T) Σ_{t=0}^{T−1} v(X_t) I{X_t ∈ A} / f(X_t)

can be viewed as an estimate of

E_{F_A}[ v(X) I{X ∈ A} / f(X) ] = ∫_A (v(x)/f(x)) (f(x)/p) dx = (1/p) ∫_A v(x)dx = 1/p.

Thus,

q̂_T = (1/T) Σ_{t=0}^{T−1} u(X_t), where u(X_t) = v(X_t) I{X_t ∈ A} / f(X_t),    (1.3)

is an unbiased estimator of q = p^{−1}. Then p̂_T = q̂_T^{−1} is an estimator of p.
The expected value above is computed under the invariant distribution F_A of the Markov chain. It is implicitly assumed that the sample size T is sufficiently large that the burn-in period, the time until the Markov chain reaches stationarity, is negligible, or alternatively that the burn-in period is discarded. Another remark is that it is theoretically possible that all the terms in the sum in (1.3) are zero, leading to the estimate q̂_T = 0 and then p̂_T = ∞. To avoid such nonsense one can simply take p̂_T as the minimum of q̂_T^{−1} and one.

There are two essential design choices that determine the performance of the algorithm: the choice of the function v and the design of the MCMC sampler. The function v influences the variance of u(X_t) in (1.3) and is therefore of main concern for controlling the rare-event properties of the algorithm. It is desirable to take v such that the normalised variance of the estimator, given by p² Var(q̂_T), is not too large. The design of the MCMC sampler, on the other hand, is crucial to control the dependence of the Markov chain and thereby the convergence rate of the algorithm as a function of the sample size. To speed up the simulation it is desirable that the Markov chain mixes fast, so that the dependence dies out quickly.
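A toy instance (ours, not from the thesis) where everything is explicit: X ~ Exp(1) and A = (a, ∞), so p = e^{−a}, F_A is a shifted exponential that can be sampled exactly, and choosing v equal to the conditional density (1.2) makes u constant, so q̂_T = e^{a} = 1/p with zero variance:

```python
import random
import math

# Toy setting: X ~ Exp(1), A = (a, inf), p = exp(-a), F_A sampled exactly.
a = 8.0                                   # rare event: p = e^-8 ≈ 3.4e-4
rng = random.Random(11)

def sample_F_A():
    """X | X > a for Exp(1) is a + Exp(1) (memoryless property)."""
    return a + rng.expovariate(1.0)

def u(x, v):
    """u(x) = v(x) I{x > a} / f(x), with f(x) = exp(-x)."""
    return v(x) / math.exp(-x) if x > a else 0.0

v_exact = lambda x: math.exp(-(x - a))    # v = conditional density (1.2)
T = 10_000
q_hat = sum(u(sample_F_A(), v_exact) for _ in range(T)) / T
print(abs(q_hat - math.exp(a)) < 1e-3)    # q_hat = e^a = 1/p, zero variance
```

In this toy we can sample F_A directly; in the problems of Sections 3-6 that is exactly what the Gibbs sampler provides, and v must instead be chosen as an approximation of (1.2).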
1.2.2 Controlling the normalised variance

This section contains a discussion on how to control the performance of the estimator q̂_T by controlling its normalised variance. For the estimator q̂_T to be useful it is of course important that its variance is not too large. When the probability p to be estimated is small, it is reasonable to ask that Var(q̂_T) is of size comparable to q² = p^{−2}, or equivalently, that the standard deviation of the estimator is roughly of the same size as p^{−1}. To this end the normalised variance p² Var(q̂_T) is studied.

Let us consider Var(q̂_T). With u(x) = v(x) I{x ∈ A} / f(x),

p² Var_{F_A}(q̂_T) = p² Var_{F_A}( (1/T) Σ_{t=0}^{T−1} u(X_t) )
  = p² [ (1/T) Var_{F_A}(u(X_0)) + (2/T²) Σ_{t=0}^{T−1} Σ_{s=t+1}^{T−1} Cov_{F_A}(u(X_s), u(X_t)) ].    (1.4)
Let us for the moment focus our attention on the first term. It can be written as

(p²/T) Var_{F_A}(u(X_0)) = (p²/T) ( E_{F_A}[u(X_0)²] − E_{F_A}[u(X_0)]² )
  = (p²/T) ( ∫ (v(x)/f(x))² I{x ∈ A} F_A(dx) − 1/p² )
  = (p²/T) ( ∫ (v²(x)/f²(x)) I{x ∈ A} (f(x)/p) dx − 1/p² )
  = (1/T) ( ∫_A (v²(x) p / f(x)) dx − 1 ).

Therefore, in order to control the normalised variance, the function v must be chosen so that ∫_A v²(x)/f(x) dx is close to p^{−1}. An important observation is that the conditional density (1.2) plays a key role in finding a good choice of v. Letting v be the conditional density in (1.2) leads to

∫_A v²(x)/f(x) dx = ∫_A f²(x) I{x ∈ A} / (p² f(x)) dx = (1/p²) ∫_A f(x)dx = 1/p,

which implies

(p²/T) Var_{F_A}(u(X)) = 0.

This motivates taking v as an approximation of the conditional density (1.2). This is similar to the ideology behind choosing an efficient importance sampling estimator.
If for some set B ⊂ A the probability P(X ∈ B) can be computed explicitly, then a candidate for v is

v(x) = f(x) I{x ∈ B} / P(X ∈ B),

the conditional density of X given X ∈ B. This candidate is likely to perform well if P(X ∈ B) is a good approximation of p. Indeed, in this case

∫_A v²(x)/f(x) dx = ∫_A f²(x) I{x ∈ B} / (P(X ∈ B)² f(x)) dx = (1/P(X ∈ B)²) ∫_B f(x)dx = 1/P(X ∈ B),

which will be close to p^{−1}.
Now, let us shift emphasis to the covariance term in (1.4). As the samples (X_t)_{t=0}^{T−1} form a Markov chain, the X_t's are dependent. Therefore the covariance term in (1.4) is non-zero and may not be ignored. The crude upper bound

Cov_{F_A}(u(X_s), u(X_t)) ≤ Var_{F_A}(u(X_0))

yields

(2p²/T²) Σ_{t=0}^{T−1} Σ_{s=t+1}^{T−1} Cov_{F_A}(u(X_s), u(X_t)) ≤ p² (1 − 1/T) Var_{F_A}(u(X_0))

for the covariance term. This is a very crude upper bound as it does not decay to zero as T → ∞. But, at the moment, the emphasis is on small p, so we will proceed with this upper bound anyway. As indicated above, the choice of v controls the term p² Var_{F_A}(u(X_0)). We conclude that the normalised variance (1.4) of the estimator q̂_T is controlled by the choice of v when p is small.

1.2.3 Ergodic properties
As we have just seen, the choice of the function v controls the normalised variance of the estimator for small p. The design of the MCMC sampler, on the other hand, determines the strength of the dependence in the Markov chain. Strong dependence implies slow convergence, which results in a high computational cost. The convergence rate of MCMC samplers can be analysed within the theory of φ-irreducible Markov chains. Fundamental results for φ-irreducible Markov chains are given in [42, 44]. We will focus on conditions that imply a geometric convergence rate. The conditions given below are well studied in the context of MCMC samplers. Conditions for geometric ergodicity in the context of Gibbs samplers have been studied by e.g. [12, 51, 52], and for Metropolis-Hastings algorithms by [40].

A Markov chain (X_t)_{t≥0} with transition kernel p(x, ·) = P(X_{t+1} ∈ · | X_t = x) is φ-irreducible if there exists a measure φ such that φ(·) ≪ Σ_t p^{(t)}(x, ·), where p^{(t)}(x, ·) = P(X_t ∈ · | X_0 = x) denotes the t-step transition kernel and ≪ denotes absolute continuity. A Markov chain with invariant distribution π is called geometrically ergodic if there exist a positive function M and a constant r ∈ (0, 1) such that

‖p^{(t)}(x, ·) − π(·)‖_TV ≤ M(x) r^t,    (1.5)

where ‖ · ‖_TV denotes the total-variation norm. This condition ensures that the distribution of the Markov chain converges at a geometric rate to the invariant distribution. If the function M is bounded, then the Markov chain is said to be uniformly ergodic. Conditions such as (1.5) may be difficult to establish directly and are therefore substituted by suitable minorisation or drift conditions. A minorisation condition holds on a set C if there exist a probability measure ν, a positive integer t_0, and δ > 0 such that

p^{(t_0)}(x, B) ≥ δ ν(B),

for all x ∈ C and Borel sets B. In this case C is said to be a small set. Minorisation conditions have been used for obtaining rigorous bounds on the convergence of MCMC samplers, see e.g. [49].

If the entire state space is small, then the Markov chain is uniformly ergodic. Uniform ergodicity does typically not hold for Metropolis samplers, see Mengersen and Tweedie in [40], Theorem 3.1. Therefore useful sufficient conditions for geometric ergodicity are often given in the form of drift conditions [12, 40]. Drift conditions, established through the construction of appropriate Lyapunov functions, are also useful for establishing central limit theorems for MCMC algorithms, see [34, 42] and the references therein.
1.2.4 Efficiency of the MCMC algorithm

Roughly speaking, the arguments given above lead to the following desired properties of the estimator.

1. Rare-event efficiency: Construct an unbiased estimator q̂_T of p^{−1} according to (1.3) by finding a function v which approximates the conditional density (1.2). The choice of v controls the normalised variance of the estimator.

2. Large sample efficiency: Design the MCMC sampler, by finding an appropriate Gibbs sampler or a proposal density in the Metropolis-Hastings algorithm, such that the resulting Markov chain is geometrically ergodic.
1.3 Outline and ontribution of this thesis
Theoutlineand ontributionofthethesisareasfollows.
a. General formulation of the algorithm in Se tion 2. In this se tion we
presenttheformalmethodology in howto set uptheMCMC simulation
fore ientrare-event omputation. Theprobabilisti assumptionsmade
aremildandthesettingis forinstan e notrestri tedtoheavy-tails. The
twoessentialdesign hoi esarehighlighted. Correspondingtorare-event
e ien yand largesamplee ien y.
b. Appli ationtoheavy-tailedrandomwalksinSe tion3. Inthisse tionthe
MCMCmethodologyisappliedto theproblem of omputing
p n = P(Y 1 + · · · + Y n > a n )
,where
a n → ∞
su ientlyfastsothattheprobabilitytendstozero. The in rementsY
areassumedtobeheavy-tailed. WepresentaGibbssampler toprodu eaMarkov hainwhoseinvariantdistributionisthe onditionaldistribution
P (Y 1 , . . . , Y n ) ∈ · | Y 1 + · · · + Y n > a n
.
TheMarkov hainisshowntopreservestationarityanduniformlyergodi ,
ensuring the largesample e ien y. Inaddition we designan estimator
for
1/p n
having vanishing normalised varian e. Numeri al experiments performed and omparison made between MCMC and best-performingexistingimportan esamplingestimatorsaswellasstandardMonteCarlo.
. Appli ationtoheavy-tailedrandomsumsinSe tion4. Inthisse tionthe
MCMCmethodologyisappliedto theproblem of omputing
p n = P(Y 1 + · · · + Y N n > a N n )
,where
N
isa randomvariable anda N → ∞
su iently fastso that theprobability tends to zero. The in rements
Y
are assumed to be heavy-tailed. We present a Gibbs sampler to produ e a Markov hain whose
invariantdistributionisthe onditionaldistribution
P (N, Y 1 , . . . , Y N ) ∈ · | Y 1 + · · · + Y N > a N
.
ensuring the largesample e ien y. Inaddition wedesignan estimator
for
1/p n
having vanishing normalised varian e. Numeri al experiments performed and omparison made between MCMC and best-performingexistingimportan esamplingestimatorsaswellasstandardMonteCarlo.
d. Application to stochastic recurrence equations in Section 5. In this section the MCMC methodology is applied to the problem of computing
\[
p_n = P(X_n > a_n), \quad \text{where } X_n = A_n X_{n-1} + B_n, \ X_0 = 0,
\]
and $a_n \to \infty$ sufficiently fast so that the probability tends to zero. The increments $B$ are assumed to be regularly varying of index $\alpha$, and $E[A^{\alpha+\epsilon}] < \infty$ for some $\epsilon > 0$. We present a Gibbs sampler to produce a Markov chain whose invariant distribution is the conditional distribution
\[
P\big((A_2, \ldots, A_n, B_1, \ldots, B_n) \in \cdot \mid X_n > a_n\big).
\]
The Markov chain is shown to preserve stationarity and to be uniformly ergodic, ensuring the large-sample efficiency. In addition we design an estimator for $1/p_n$ having vanishing normalised variance. Numerical experiments are performed and comparisons are made between MCMC, the best-performing existing importance sampling estimators, and standard Monte Carlo.

e. Application to computing the probability of ruin in an insurance model with risky investments in Section 6...
A paper titled Markov chain Monte Carlo for computing rare-event probabilities for a heavy-tailed random walk by Gudmundsson and Hult [25], based on Sections 2, 3, and 4 in the thesis, has been accepted for publication in the Journal of Applied Probability in June 2014.
2 General formulation of the algorithm
In this section the Markov chain Monte Carlo ideas are applied to the problem of computing an expectation. Here the setting is general; for instance, there is no assumption that densities with respect to Lebesgue measure exist.

Let $X$ be a random variable with distribution $F$ and let $h$ be a non-negative $F$-integrable function. The problem is to compute the expectation
\[
\theta = E[h(X)] = \int h(x)\, dF(x).
\]
In the special case when $F$ has density $f$ and $h(x) = I\{x \in A\}$, this problem reduces to the simpler problem of computing the probability in (1.1), illustrated in Section 1.2.
The analogue of the conditional distribution in (1.2) is the distribution $F_h$ given by
\[
F_h(B) = \frac{1}{\theta} \int_B h(x)\, dF(x),
\]
for measurable sets $B$. Consider a Markov chain $(X_t)_{t \ge 0}$ having $F_h$ as its invariant distribution. To define an estimator of $\theta^{-1}$, consider a probability distribution $V$ with $V \ll F_h$. Then it follows that $V \ll F$, and it is assumed that the density $dV/dF$ is known. Consider the estimator of $\zeta = \theta^{-1}$ given by
\[
\hat\zeta_T = \frac{1}{T} \sum_{t=0}^{T-1} u(X_t), \quad \text{where } u(x) = \frac{1}{\theta}\, \frac{dV}{dF_h}(x). \tag{2.1}
\]
Note that $u$ does not depend on $\theta$, because $V \ll F_h$ and therefore
\[
u(x) = \frac{1}{\theta}\, \frac{dV}{dF_h}(x) = \frac{1}{h(x)}\, \frac{dV}{dF}(x),
\]
for $x$ such that $h(x) > 0$. The estimator (2.1) is a generalisation of the estimator (1.3), where one can think of $v$ as the density of $V$ with respect to Lebesgue measure. An estimator of $\theta$ can then be constructed as $\hat\theta_T = \hat\zeta_T^{-1}$.
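To make the construction concrete, here is a minimal numerical sketch; the concrete distributions are our own assumptions, not from the thesis.

```python
import math
import random

# Toy illustration of the estimator (2.1); the concrete choices below are our
# own assumptions, not from the thesis.  X ~ Exp(1) and h(x) = I{x > a}, so
# theta = P(X > a) = exp(-a) and, by lack of memory, F_h is the law of
# a + Exp(1).  Taking V as the law of a + Exp(2), an approximation of F_h,
# gives u(x) = 2*exp(2a - x) for x > a.  For simplicity the "chain" draws
# independently from F_h, a trivial but valid sampler with invariant law F_h.

random.seed(0)
a, T = 3.0, 200_000
zeta_hat = 0.0
for _ in range(T):
    x = a + random.expovariate(1.0)          # X_t ~ F_h (exact, so no burn-in)
    zeta_hat += 2.0 * math.exp(2.0 * a - x)  # u(X_t)
zeta_hat /= T                                # estimates zeta = 1/theta = e^a
theta_hat = 1.0 / zeta_hat                   # estimates theta = e^(-a)
```

Under stationarity $E_{F_h}[u(X_t)] = \zeta$, so $\hat\zeta_T$ is unbiased for $\zeta$; with this choice of $V$ the per-sample relative standard deviation works out to $1/\sqrt{3}$, independently of $a$.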
The variance analysis of $\hat\zeta_T$ follows precisely the steps outlined in Section 1.2. The normalised variance is
\[
\theta^2 \operatorname{Var}_{F_h}(\hat\zeta_T)
= \frac{\theta^2}{T} \operatorname{Var}_{F_h}\big(u(X_0)\big)
+ \frac{2\theta^2}{T^2} \sum_{t=0}^{T-1} \sum_{s=t+1}^{T-1} \operatorname{Cov}_{F_h}\big(u(X_s), u(X_t)\big), \tag{2.2}
\]
where the first term can be rewritten, similarly to the display (1.4), as
\[
\frac{\theta^2}{T} \operatorname{Var}_{F_h}\big(u(X_0)\big)
= \frac{1}{T} \Big( E_V \Big[ \frac{dV}{dF_h} \Big] - 1 \Big).
\]
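As a one-line verification of the rewriting of the first term (our own derivation, using $\theta u = dV/dF_h$, $E_{F_h}[u(X_0)] = \zeta$ and $\theta\zeta = 1$):

```latex
\[
\frac{\theta^2}{T}\operatorname{Var}_{F_h}\big(u(X_0)\big)
  = \frac{1}{T}\bigg(\int \Big(\frac{dV}{dF_h}\Big)^2 dF_h - (\theta\zeta)^2\bigg)
  = \frac{1}{T}\bigg(\int \frac{dV}{dF_h}\, dV - 1\bigg)
  = \frac{1}{T}\Big(E_V\Big[\frac{dV}{dF_h}\Big] - 1\Big).
\]
```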
The analysis above indicates that an appropriate choice of $V$ is such that $E_V[dV/dF_h]$ is close to $1$. Again, the ideal choice would be taking $V = F_h$, leading to zero variance. This choice is not feasible, but it nevertheless suggests selecting $V$ as an approximation of $F_h$. As already noted, this is similar to the ideology behind choosing an efficient importance sampling estimator. The difference is that here $V \ll F$ is required, whereas in importance sampling $F$ needs to be absolutely continuous with respect to the sampling distribution. The crude upper bound for the covariance term in (2.2) is valid, just as in Section 1.2.
Asymptotic efficiency can be conveniently formulated in terms of a limit criterion as a large-deviation parameter tends to infinity. As is customary in problems related to rare-event simulation, the problem at hand is embedded in a sequence of problems, indexed by $n = 1, 2, \ldots$. The general setup is formalised as follows.

Let $(X^{(n)})_{n \ge 1}$ be a sequence of random variables with $X^{(n)}$ having distribution $F^{(n)}$. Let $h$ be a non-negative function, integrable with respect to $F^{(n)}$, for each $n$. Suppose
\[
\theta^{(n)} = E\big[h(X^{(n)})\big] = \int h(x)\, dF^{(n)}(x) \to 0,
\]
as $n \to \infty$. The problem is to compute $\theta^{(n)}$ for some large $n$.

Denote by $F_h^{(n)}$ the distribution with $dF_h^{(n)}/dF^{(n)} = h/\theta^{(n)}$. For the $n$th problem, a Markov chain $(X_t^{(n)})_{t=0}^{T-1}$ with invariant distribution $F_h^{(n)}$ is generated by an MCMC algorithm. The estimator of $\zeta^{(n)} = (\theta^{(n)})^{-1}$ is based on a probability distribution $V^{(n)}$, such that $V^{(n)} \ll F_h^{(n)}$, with known density with respect to $F^{(n)}$. An estimator $\hat\zeta_T^{(n)}$ of $\zeta^{(n)}$ is given by
\[
\hat\zeta_T^{(n)} = \frac{1}{T} \sum_{t=0}^{T-1} u^{(n)}(X_t^{(n)}),
\quad \text{where } u^{(n)}(x) = \frac{1}{h(x)}\, \frac{dV^{(n)}}{dF^{(n)}}(x).
\]
The heuristic efficiency criteria in Section 1.2 can now be rigorously formulated as follows:

1. Rare-event efficiency: Select the probability distributions $V^{(n)}$ such that
\[
(\theta^{(n)})^2 \operatorname{Var}_{F_h^{(n)}}\big(u^{(n)}(X)\big) \to 0, \quad \text{as } n \to \infty.
\]

2. Large-sample-size efficiency: Design the MCMC sampler, by finding an appropriate Gibbs sampler or a proposal density for the Metropolis-Hastings algorithm, such that, for each $n \ge 1$, the Markov chain $(X_t^{(n)})_{t \ge 0}$ is geometrically ergodic.
Remark 2.1. The rare-event efficiency criterion is formulated in terms of the efficiency of estimating $(\theta^{(n)})^{-1}$ by $\hat\zeta_T^{(n)}$. If one insists on studying the mean and variance of $\hat\theta_T^{(n)} = (\hat\zeta_T^{(n)})^{-1}$, then the effects of the transformation $x \mapsto x^{-1}$ must be taken into account. For instance, the estimator $\hat\theta_T^{(n)}$ is biased and its variance could be infinite. The bias can be reduced, for instance, via the delta method illustrated in [3, p. 76]. We also remark that even in the estimation of $(\theta^{(n)})^{-1}$ by $\hat\zeta_T^{(n)}$ there is a bias coming from the fact that the Markov chain is not perfectly stationary.
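The effect of the transformation $x \mapsto x^{-1}$ can be seen in a small simulation (our own toy, not from the thesis): the sample mean below is unbiased for $\zeta$, yet its reciprocal overestimates $\theta = 1/\zeta$ on average, consistent with Jensen's inequality; in this exponential example the inflation factor is exactly $T/(T-1)$.

```python
import random

# Toy illustration (not from the thesis): zeta_hat is an unbiased estimator of
# zeta, but theta_hat = 1/zeta_hat is biased upward for theta = 1/zeta, since
# x -> 1/x is convex (Jensen's inequality).
random.seed(0)
zeta, T, R = 20.0, 10, 50_000    # true zeta, sample size, number of repetitions
theta = 1.0 / zeta
acc = 0.0
for _ in range(R):
    # unbiased estimate of zeta from T iid Exp(mean zeta) draws
    zeta_hat = sum(random.expovariate(1.0 / zeta) for _ in range(T)) / T
    acc += 1.0 / zeta_hat        # the reciprocal, an estimate of theta
mean_theta_hat = acc / R         # close to theta * T/(T-1), about 11% high
```

The delta method quantifies exactly this first-order inflation, which is how the bias reduction mentioned above proceeds.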
The MCMC methodology presented in Section 2 is here applied to compute the probability that a random walk $S_n = Y_1 + \cdots + Y_n$, where $Y_1, \ldots, Y_n$ are non-negative, independent and heavy-tailed, exceeds a high threshold $a_n$. This problem has received some attention in the context of conditional Monte Carlo algorithms [2, 4] and importance sampling algorithms [35, 16, 11, 8].
In this section a Gibbs sampler is presented for sampling from the conditional distribution $P((Y_1, \ldots, Y_n) \in \cdot \mid S_n > a_n)$. The resulting Markov chain is proved to be uniformly ergodic. An estimator for $(p^{(n)})^{-1}$ of the form (2.1) is suggested, with $V^{(n)}$ taken as the conditional distribution of $(Y_1, \ldots, Y_n)$ given $\max\{Y_1, \ldots, Y_n\} > a_n$. The estimator is proved to have vanishing normalised variance when the distribution of $Y_1$ belongs to the class of subexponential distributions. The proof is elementary and is completed in a few lines. This is in sharp contrast to efficiency proofs for importance sampling algorithms for the same problem, which require more restrictive assumptions on the tail of $Y_1$ and tend to be long and technical [16, 11, 9]. The section is concluded with numerical experiments comparing the method with existing importance sampling algorithms and standard Monte Carlo.
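For orientation, the standard Monte Carlo baseline is easy to sketch (our own illustration with Pareto increments; the thesis's numerical setup may differ): each sample costs $n$ draws, and the relative error of the hit-frequency estimator grows like $1/\sqrt{p N}$ as $p \to 0$, which is the motivation for the MCMC and importance sampling alternatives.

```python
import random

# Crude Monte Carlo for p = P(S_n > a_n) with iid Pareto(alpha) increments
# (x_m = 1), an illustrative heavy-tailed choice; not the thesis's exact setup.
random.seed(0)
n, a_n, alpha, N = 5, 300.0, 1.5, 200_000
hits = 0
for _ in range(N):
    s = sum(random.paretovariate(alpha) for _ in range(n))  # one random walk
    hits += s > a_n
p_mc = hits / N    # on the order of 1e-3 here; relative error ~ 1/sqrt(p*N)
```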
3.1 A Gibbs sampler for computing $P(S_n > a_n)$
Let $Y_1, \ldots, Y_n$ be non-negative, independent and identically distributed random variables with common distribution $F_Y$ and density $f_Y$ with respect to some reference measure $\mu$. Consider the random walk $S_n = Y_1 + \cdots + Y_n$ and the problem of computing the probability $p^{(n)} = P(S_n > a_n)$, where $a_n \to \infty$ sufficiently fast that $p^{(n)} \to 0$ as $n \to \infty$.

It is convenient to denote by $Y^{(n)}$ the $n$-dimensional random vector $Y^{(n)} = (Y_1, \ldots, Y_n)^\top$, and to write $A_n = \{y \in \mathbb{R}^n : \mathbf{1}^\top y > a_n\}$, where $\mathbf{1} = (1, \ldots, 1)^\top \in \mathbb{R}^n$ and $y = (y_1, \ldots, y_n)^\top$. With this notation,
\[
p^{(n)} = P(S_n > a_n) = P(\mathbf{1}^\top Y^{(n)} > a_n) = P(Y^{(n)} \in A_n).
\]
The conditional distribution $F_{A_n}^{(n)}(\cdot) = P(Y^{(n)} \in \cdot \mid Y^{(n)} \in A_n)$ has density
\[
\frac{dF_{A_n}^{(n)}}{d\mu}(y_1, \ldots, y_n)
= \frac{\prod_{j=1}^n f_Y(y_j)\, I\{y_1 + \cdots + y_n > a_n\}}{p^{(n)}}. \tag{3.1}
\]
The first step towards defining the estimator of $p^{(n)}$ is to construct the Markov chain $(Y_t^{(n)})_{t \ge 0}$ whose invariant density is given by (3.1), using a Gibbs sampler. In short, the Gibbs sampler updates one element of $Y_t^{(n)}$ at a time, keeping the other elements constant. Formally, the algorithm proceeds as follows.
Algorithm 3.1. Start at an initial state $Y_0^{(n)} = (Y_{0,1}, \ldots, Y_{0,n})^\top$ where $Y_{0,1} + \cdots + Y_{0,n} > a_n$. Given $Y_t^{(n)} = (Y_{t,1}, \ldots, Y_{t,n})^\top$, for some $t = 0, 1, \ldots$, the next state $Y_{t+1}^{(n)}$ is sampled as follows:

1. Draw $j_1, \ldots, j_n$ from $\{1, \ldots, n\}$ without replacement and proceed by updating the components of $Y_t^{(n)}$ in the order thus obtained.

2. For each $k = 1, \ldots, n$, repeat the following.

(a) Let $j = j_k$ be the index to be updated and write $Y_{t,-j} = (Y_{t,1}, \ldots, Y_{t,j-1}, Y_{t,j+1}, \ldots, Y_{t,n})^\top$. Sample $Y'_{t,j}$ from the conditional distribution of $Y$ given that the sum exceeds the threshold. That is,
\[
P(Y'_{t,j} \in \cdot \mid Y_{t,-j}) = P\Big(Y \in \cdot \,\Big|\, Y + \sum_{k \ne j} Y_{t,k} > a_n\Big).
\]

(b) Put $Y'_t = (Y_{t,1}, \ldots, Y_{t,j-1}, Y'_{t,j}, Y_{t,j+1}, \ldots, Y_{t,n})^\top$.

3. Draw a random permutation $\pi$ of the numbers $\{1, \ldots, n\}$ from the uniform distribution and put $Y_{t+1}^{(n)} = (Y'_{t,\pi(1)}, \ldots, Y'_{t,\pi(n)})^\top$.

Iterate steps (1)-(3) until the entire Markov chain $(Y_t^{(n)})_{t=0}^{T-1}$ is constructed.

Remark 3.2. (i) In the heavy-tailed setting the trajectories of the random walk
is onstru ted.Remark3.2. (i)Intheheavy-tailedsettingthetraje toriesoftherandomwalk
leading to the rare event are likelyto onsist of one largein rement (the big
jump)whiletheotherin rementsareaverage.Thepurposeofthepermutation
step is to for e the Markov hain to mix faster by moving the big jump to
dierentlo ations. However,thepermutationstepinAlgorithm3.1isnotreally
needed when onsidering the probability
P(S n > a n )
. This isdue to thefa tthatthesummationisinvariantoftheorderingofthesteps.
(ii)Thealgorithmrequiressamplingfromthe onditionaldistribution
P(Y ∈
· | Y > c)
for arbitraryc
. This is easy wheneverinversionis feasible, see [3,p. 39℄,ora eptan e/reje tionsampling anbeemployed. Thereare,however,
situations where sampling from the onditional distribution
P(Y ∈ · | Y > c)
maybedi ult,see[33,Se tion2.2℄.
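As an illustration, Algorithm 3.1 together with the estimator of Section 2 can be sketched as follows for Pareto-distributed increments (our own illustrative choice; `ALPHA`, `a_n` and the chain length are assumptions, not values from the thesis). The conditional draw in step 2(a) uses inversion, since for a Pareto($\alpha$) variable with $x_m = 1$ we have $P(Y > y \mid Y > c) = (c/y)^\alpha$ for $y \ge c \ge 1$. The estimator takes $V^{(n)}$ as the law of $(Y_1, \ldots, Y_n)$ given $\max_j Y_j > a_n$, for which $u(y) = I\{\max_j y_j > a_n\}/P(\max_j Y_j > a_n)$ on $A_n$.

```python
import random

ALPHA = 1.5          # Pareto tail index (illustrative assumption)

def cond_pareto(c):
    """Inversion sampling from P(Y in . | Y > c) for Y ~ Pareto(ALPHA), x_m = 1."""
    c = max(c, 1.0)                       # below the support the condition is void
    u = 1.0 - random.random()             # u in (0, 1]
    return c * u ** (-1.0 / ALPHA)

def gibbs_sweep(y, a_n):
    """One iteration of Algorithm 3.1 (steps 1-3), updating the state y in place."""
    n = len(y)
    s = sum(y)
    for j in random.sample(range(n), n):  # step 1: random update order
        rest = s - y[j]                   # step 2(a): resample y[j] given the rest
        y[j] = cond_pareto(a_n - rest)    # keeps the sum above a_n
        s = rest + y[j]
    random.shuffle(y)                     # step 3: random permutation
    return y

random.seed(0)
n, a_n, T = 5, 300.0, 20_000
p_max = 1.0 - (1.0 - a_n ** -ALPHA) ** n  # P(max_j Y_j > a_n), known in closed form
y = [a_n] + [1.5] * (n - 1)               # initial state in A_n
acc = 0.0
for _ in range(T):
    y = gibbs_sweep(y, a_n)
    acc += (max(y) > a_n) / p_max         # u(Y_t) for the "big jump" choice of V
zeta_hat = acc / T                        # estimates 1/p^(n)
p_hat = 1.0 / zeta_hat                    # estimates p^(n) = P(S_n > a_n)
```

With these (assumed) parameters $p^{(n)}$ is of order $10^{-3}$, and the chain-based estimate of $(p^{(n)})^{-1}$ concentrates quickly because $V^{(n)}$ puts its mass on the dominant big-jump trajectories.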
The following proposition confirms that the Markov chain $(Y_t^{(n)})_{t \ge 0}$, generated by Algorithm 3.1, has $F_{A_n}^{(n)}$ as its invariant distribution.

Proposition 3.3. The Markov chain $(Y_t^{(n)})_{t \ge 0}$, generated by Algorithm 3.1, has the conditional distribution $F_{A_n}^{(n)}$ as its invariant distribution.

Proof. The goal is to show that each updating step (Steps 2 and 3) of the algorithm preserves stationarity. Since the conditional distribution $F_{A_n}^{(n)}$ is permutation invariant, it is clear that Step 3 preserves stationarity. Therefore it is sufficient to consider Step 2 of the algorithm.
Let $P_j(y, \cdot)$ denote the transition probability of the Markov chain $(Y_t^{(n)})_{t \ge 0}$ corresponding to the $j$th component being updated. It is sufficient to show that, for all $j = 1, \ldots, n$ and all Borel sets of product form $B_1 \times \cdots \times B_n \subset A_n$, the following equality holds:
\[
F_{A_n}^{(n)}(B_1 \times \cdots \times B_n) = E_{F_{A_n}^{(n)}}\big[P_j(Y, B_1 \times \cdots \times B_n)\big].
\]
Observe that, because $B_1 \times \cdots \times B_n \subset A_n$,
\begin{align*}
F_{A_n}^{(n)}(B_1 \times \cdots \times B_n)
&= E\Big[\prod_{k=1}^n I\{Y_k \in B_k\} \,\Big|\, S_n > a_n\Big] \\
&= \frac{E\big[I\{Y_j \in B_j\}\, I\{S_n > a_n\} \prod_{k \ne j} I\{Y_k \in B_k\}\big]}{P(S_n > a_n)} \\
&= \frac{E\big[E[I\{Y_j \in B_j\} \mid Y_j > a_n - S_{n,-j},\, Y_{-j}^{(n)}]\, P(Y_j > a_n - S_{n,-j} \mid Y_{-j}^{(n)}) \prod_{k \ne j} I\{Y_k \in B_k\}\big]}{P(S_n > a_n)} \\
&= \frac{E\big[P_j(Y^{(n)}, B_1 \times \cdots \times B_n) \prod_{k \ne j} I\{Y_k \in B_k\}\big]}{P(S_n > a_n)} \\
&= E\big[P_j(Y^{(n)}, B_1 \times \cdots \times B_n) \mid S_n > a_n\big] \\
&= E_{F_{A_n}^{(n)}}\big[P_j(Y, B_1 \times \cdots \times B_n)\big],
\end{align*}
with the conventional notation of writing $Y^{(n)} = (Y_1, \ldots, Y_n)^\top$, $S_n = Y_1 + \cdots + Y_n$, $Y_{-j}^{(n)} = (Y_1, \ldots, Y_{j-1}, Y_{j+1}, \ldots, Y_n)^\top$ and $S_{n,-j} = Y_1 + \cdots + Y_{j-1} + Y_{j+1} + \cdots + Y_n$.

As for the ergodic properties, Algorithm 3.1 produces a Markov chain which is uniformly ergodic.
Proposition 3.4. For each $n \ge 1$, the Markov chain $(Y_t^{(n)})_{t \ge 0}$ is uniformly ergodic. In particular, it satisfies the following minorisation condition: there exists $\delta > 0$ such that
\[
P(Y_1^{(n)} \in B \mid Y_0^{(n)} = y) \ge \delta F_{A_n}^{(n)}(B),
\]
for all $y \in A_n$ and all Borel sets $B \subset A_n$.

Proof. Take an arbitrary $n \ge 1$. Uniform ergodicity can be deduced from the following minorisation condition (see [44]): there exist a probability measure $\nu$, $\delta > 0$, and an integer $t_0$ such that
\[
P(Y_{t_0}^{(n)} \in B \mid Y_0^{(n)} = y) \ge \delta \nu(B),
\]
for every $y \in A_n$ and Borel set $B \subset A_n$. Take $y \in A_n$ and write $g(\cdot \mid y)$ for the density of $P(Y_1^{(n)} \in \cdot \mid Y_0^{(n)} = y)$. The goal is to show that the minorisation condition holds with $t_0 = 1$, $\delta = p^{(n)}/n!$, and $\nu = F_{A_n}^{(n)}$. For any $x \in A_n$ there exists an ordering $j_1, \ldots, j_n$ of the numbers $\{1, \ldots, n\}$ such that