Introduction to Monte Carlo Techniques in High Energy Physics

(1)

CERN Summer Student Lecture Part 1, 19 July 2012

Introduction to

Monte Carlo Techniques in High Energy Physics

Torbj¨ orn Sj¨ ostrand

How are complicated multiparticle events created?

How can such events be simulated with a computer?

(2)

Lectures Overview

today: Introduction the Standard Model Quantum Mechanics

the role of Event Generators Monte Carlo random numbers

integration simulation tomorrow: Physics hard interactions

parton showers

multiparton interactions hadronization

Generators Herwig, Pythia, Sherpa MadGraph, AlpGen, . . . common standards

(3)

The Standard Model

Matter particles:

type (shorthand) generation charge

1 2 3

up-type quarks (q) u c t +2/3

down-type quarks (q) d s b −1/3

neutrinos (ν) ν_e ν_µ ν_τ 0

charged leptons (`⁻) e µ τ −1 each with its antiparticle (q, ν, `⁺)

Interactions:

interaction mediator

strong g(gluon)

electromagnetic γ (photon)

weak W⁺, W⁻, Z⁰

mass generation H(Higgs)

Hadrons: mesons qq bound by strong interactions

baryons qqq (confinement; gluon self-interactions) Partons: quarks and gluons bound in a hadron

(4)

Feynman Diagrams

incoming quarks

outgoing quarks exchanged

gluon propagator

vertex

time space

Introduce kinematics-dependent factors for incoming, outgoing and exchanged particles, and couplings for vertices:

together they give the amplitude for the process.

(5)

Quantum Mechanics

A given initial and final state typically can be related via several separate intermediate histories, e.g.

qg → qg

(t) (s) (u)

Cross section σ ∝ |At+ As+ Au|²6= |A_t|²+ |As|²+ |Au|². Interference ⇒ not possible to know which path process took.

If one amplitude dominates then approximate simplifications (e.g. At dominates for scattering angle → 0).

Trick : σ_t ∝ |A_t+ A_s+ A_u|² |A_t|²

|A_t|²+ |As|²+ |Au|²

(6)

Fluctuations

0 20 40 60 80 100 120 140 160 180n

nP

10-6

10-5

10-4

10-3

10-2

10-1

1 10 102

103 CMS Data

PYTHIA D6T PYTHIA 8 PHOJET

4) 7 TeV (x10

2) 2.36 TeV (x10

0.9 TeV (x1)

| < 2.4

|η > 0 pT

CMS NSD (a)

Wide distribution of the number of charged particles in an event, each particle with continuum of possible momenta.

Combination of QM processes at play.

So an infinity of final states, with a probabilistic spread of properties.

(7)

Complexity

Impossible to predict complete distribution of events from theory:

Strong interactions not solved (e.g. bound hadron states).

Even if, then production of ∼ 100 particles computationally impossible to handle.

Some simple tasks still ∼ solvable, e.g. qq → Z⁰ → `⁺`⁻.

But a quark/gluon shows up as ajet

= a spray of hadrons.

Ill-defined borders + underlying activity

⇒ difficult interpretation.

(8)

Search for New Physics: an Example

SM process e.g. gg → tt → bW⁺bW⁻→ b`⁺νbqq

= lepton + missing p⊥ + 4 jets

Need to understand background to look for signal.

(9)

Event Generator Philosophy

Divide et impera(Divide and conquer/rule; Latin proverb)

Way forward:

Accept approximate framework.

Evolution in “time”: one step at a time.

Each step “simple”, e.g. n-particle → (n + 1)-particle.

Diffferent mechanisms at different “time” epochs.

Computer algorithms for physics and bookkeeping.

Generate samples of events, just like experimentalists do.

Strive to predict/reproduce average behaviour & fluctuations.

Random numbers represent quantum mechanical choices.

(10)

Welcome to Monte Carlo!

(11)

Event Generators

Three general-purpose generators:

Herwig Pythia Sherpa

Many others good/better at some specific tasks.

Generators to be combined with detector simulation (Geant) accelerator/collisions ⇔ event generator

detector/electronics ⇔ detector simulation to be used to • predict event rates and topologies

• simulate possible backgrounds

• study detector requirements

• study detector imperfections

(12)

The Main Physics Components (in Pythia)

More tomorrow!

(13)

How to Compose a Complete Dinner

1 pick main course (≈ hard process = ME ⊕ PDF)

2 pick matching first course(≈ ISR)

3 pick matching dessert(≈ FSR)

4 pick side dishes and drinks (≈ MPI)

5 pick coffee/tea & cookies(≈ hadronization)

6 pick after-dinner snacks (≈ decays)

7 pick plates, cutlery, table setting (≈ administrative structure) thousands of possible (published) recipes

uncountable combinations

never identical results (meat, spices, temperature, . . . ) Having a Higgs event ≈ having beef for dinner.

(Don’t look down on the work of the chef!)

(14)

Monte Carlo Methods

(15)

Random Numbers

Truly random R : • uniform distribution 0 < R < 1

• no correlations in sequence

Example: radioactive decay

Event generation + detector simulation voracious users

⇒ need pseudorandom computer algorithms Deterministic: in simplest form Ri = f (Ri −1)

more sophisticated R_i = f (R_{i −1}, R_{i −2}, . . . , R_{i −k})

Examples:

name k period

oldtimers 1 ∼ 10⁹

L’Ecuyer 3 ∼ 10²⁶

Marsaglia-Zaman 97 ∼ 10¹⁷¹ Mersenne twister 623 ∼ 10⁶⁰⁰

(16)

The Marsaglia Effect

2D-array: white if R < 0.5, black if R > 0.5:

Marsaglia: recursion ⇒ multiplets (Rmi, Rmi +1, . . . , Rmi +m−1), i = 1, 2, . . ., fall on parallel planes in m-dimensional hypercube. A small m spells disaster. Don’t play on your own!

(17)

The Marsaglia Effect

2D-array: white if R < 0.5, black if R > 0.5:

Marsaglia: recursion ⇒ multiplets (Rmi, Rmi +1, . . . , Rmi +m−1), i = 1, 2, . . ., fall on parallel planes in m-dimensional hypercube.

A small m spells disaster. Don’t play on your own!

(18)

Spatial vs. Temporal Problems

“Spatial” problems: no memory

1 What is the land area of your home country?

2 What is the integrated cross section of a process?

“Temporal” problems: has memory

1 Traffic flow: What is probability for a car to pass a given point at time t, given traffic flow at earlier times?

(Lumping from red lights, antilumping from finite size of cars!)

2 Radioactive decay: what is the probability for a radioactive nucleus to decay at time t, gven that it was created at time 0? In reality normally combined into multidimensional problems

1 What is traffic flow in a whole city?

2 What is the probability for a radioactive nucleus to decay sequentially at several different times, each time into one of several possible channels?

(19)

Spatial vs. Temporal Problems

2 Radioactive decay: what is the probability for a radioactive nucleus to decay at time t, gven that it was created at time 0?

In reality normally combined into multidimensional problems

(20)

Spatial vs. Temporal Problems

2 Radioactive decay: what is the probability for a radioactive nucleus to decay at time t, gven that it was created at time 0?

In reality normally combined into multidimensional problems

(21)

Spatial methods

In practical applications often need not only value of integral, but also sample of randomly distributed points inside “area”:

represents quantum mechanical spread, like real data;

allows separation of messy multidimensional problems.

pq p_q

γ^∗/Z⁰

M_γ²_∗_/Z₀ = (pq+ p_q)²

p^"_q p^"_q

p^"_f

θ

Example: qq → γ^∗/Z⁰→ ff is 2 → 2 but can be split into steps, that consecutively provide more information on the event:

1 production qq → γ^∗/Z⁰, notably choice of mass M_γ^∗_/Z⁰;

2 choice of final flavour f = d, u, s, c, b, t, e⁻, νe, µ⁻, νµ, τ⁻, ντ;

3 decay γ^∗/Z⁰ → ff, notably choice of rest frame polar angle θ;

4 further steps, up to and including detector cuts.

(22)

Simple Integration

(flat Earth approximation)

1 Pick x coordinate at random between horizontal limits.

2 Pick y coordinate at random between vertical limits.

3 Find whether point is inside Swiss border.

4 Repeat many times and keep statistics.

Area = width × height ×#inside

#tries

(23)

Integration of a function

Assume function f (x), x = (x₁, x₂, . . . , x_n), n ≥ 1, where xi ,min< xi < xi ,max, and 0 ≤ f (x ) ≤ f_max. Then

Theorem

An n-dimensional integration ≡ an n + 1-dimensional volume Z

f (x1, . . . , xn) dx1. . . dxn≡

Z Z f (x1,...,xn) 0

1 dx1. . . dxndxn+1

So Monte Carlo integration of a function is a simple generalization of area calculation.

(24)

Hit-and-miss Monte Carlo

If f (x ) ≤ fmax in xmin < x < xmax

useinterpretation as an area

1 select

x = xmin+ R (xmax− x_min)

2 select y = R f_max (new R!)

3 while y > f (x ) cycle to 1 Integral as by-product:

I = Z xmax

xmin

f (x ) dx = fmax(xmax− x_min)Nacc

Ntry

= Atot

Nacc

Ntry

Binomial distribution with p = N_acc/N_try and q = N_fail/N_try, so error

δI

I = Atotpp q/N_try Atotp =

r q

p Ntry

=

r q

Nacc

−→ 1

√Nacc

for p 1

(25)

Analytical Solution

Same probability per unit area

⇒ area to right of selected x is uniformly distributed fraction of whole area:

Z x xmin

f (x⁰) dx⁰= R Z xmax

xmin

f (x⁰) dx⁰

If know primitive function F (x ) and know inverse F⁻¹(y ) then F (x ) − F (xmin) = R(F (xmax) − F (xmin))= RAtot

=⇒x = F⁻¹(F (xmin)+ RAtot) Example:

f (x ) = 2x , 0 < x < 1, =⇒ F (x ) = x²

F (x ) − F (0) = R (F (1) − F (0)) =⇒ x²= R =⇒x =√ R

(26)

Importance Sampling

Improved version of hit-and-miss:

If f (x ) ≤ g (x ) in x_min < x < x_max

andG (x ) =R g (x⁰) dx⁰ is simple andG⁻¹(y ) is simple

1 select x according to g (x ) distribution

2 select y = R g (x ) (new R!)

3 while y > f (x ) cycle to 1

Further extensions: stratified sampling multichannel

variable transformations . . .

(27)

Multidimensional Integrals

In practice almost always multidimensional integrals Z

V

f (x) dx =

Z

V

g (x) dx

×N_acc Ntry

gives error ∝ 1/√

N irrespective of dimension

but constant of proportionality related to amount of fluctuations.

Contrast with normal integration methods:

trapezium rule error ∝ 1/N² → 1/N^2/d in d dimensions, Simpson’s rule error ∝ 1/N⁴→ 1/N^4/d in d dimensions so Monte Carlo methods always win in large dimensions

(28)

Temporal Methods: Radioactive Decays – 1

Consider “radioactive decay”:

N(t) = number of remaining nuclei at time t

but normalized to N(0) = N₀ = 1 instead, so equivalently

N(t) = probability that (single) nucleus has not decayed by time t P(t) = −dN(t)/dt = probability for it to decay at time t

Naively P(t) = c =⇒ N(t) = 1 − ct.

Wrong! Conservation of probability driven by depletion:

a given nucleus can only decay once Correctly

P(t) = cN(t) =⇒ N(t) = exp(−ct) i.e. exponential dampening

P(t) = c exp(−ct)

(29)

Radioactive Decays – 2

For radioactive decays P(t) = cN(t), with c constant, but now generalize to time-dependence:

P(t) = −dN(t)

dt = f (t) N(t) ; f (t) ≥ 0 Standard solution:

dN(t)

dt = −f (t)N(t) ⇐⇒ dN

N = d(ln N) = −f (t) dt ln N(t)−ln N(0) = −

Z t 0

f (t⁰) dt⁰ =⇒ N(t) = exp

− Z t

0

f (t⁰) dt⁰

F (t) = Z t

f (t⁰) dt⁰ =⇒ N(t) = exp (−(F (t) − F (0))) N(t) = R =⇒ t = F⁻¹(F (0) − ln R)

(30)

The Veto Algorithm

What now if f (t) has no simple F (t) or F⁻¹, but f (t) ≤ g (t), with g “nice”?

The veto algorithm

1 start with i = 0 and t0= 0

2 + + i (i.e. increase i by one)

3 t_i = G⁻¹(G (t_{i −1}) − ln R), i.e t_i > t_{i −1}

4 y = R g (t)

5 while y > f (t) cycle to 2

That is, when you fail, you keep on going from the time when you failed, anddo notrestart at time t = 0. (Memory!)

(31)

The Veto Algorithm: Proof

define S_g(t_a, t_b) = exp

−Rtb

tag (t⁰) dt⁰

P₀(t) = P(t = t₁) = g (t)S_g(0, t)f (t)

g (t)= f (t)S_g(0, t) P₁(t) = P(t = t₂) =

Zt 0

dt₁g (t₁)S_g(0, t₁)

1 −f (t₁)

g (t₁)

g (t)S_g(t₁, t)f (t) g (t)

= f (t)S_g(0, t) Z t

0

dt₁(g (t₁) − f (t₁)) = P₀(t) I_{g −f}

P₂(t) = · · · = P0(t) Z t

0

dt₁(g (t₁) − f (t₁)) Z t

t1

dt₂(g (t₂) − f (t₂))

= P₀(t) Z t

0

dt₁(g (t₁) − f (t₁)) Z t

0

dt₂(g (t₂) − f (t₂)) θ(t₂− t₁)

= P₀(t)1 2

Zt 0

dt₁(g (t₁) − f (t₁))

2

= P₀(t)1 2I_{g −f}² P(t) =

∞

X

i =0

P_i(t) = P₀(t)

∞

X

i =0

I_{g −f}ⁱ

i ! = P₀(t) exp(I_{g −f})

= f (t)exp

− Z t

0

g (t⁰)dt⁰

exp

Zt 0

dt1(g (t1)−f (t1))

= f (t)exp

− Z t

0

f (t⁰)dt⁰

(32)

The winner takes it all

Assume “radioactive decay” with two possible decay channels 1&2 P(t) = −dN(t)

dt = f₁(t)N(t) + f₂(t)N(t) Alternative 1:

use normal veto algorithm with f (t) = f1(t) + f2(t).

Once t selected, pick decays 1 or 2 in proportions f₁(t) : f₂(t).

Alternative 2:

The winner takes it all

select t₁ according to P₁(t₁) = f₁(t₁)N₁(t₁) and t₂ according to P₂(t₂) = f₂(t₂)N₂(t₂), i.e. as if the other channel did not exist.

If t₁ < t₂ then pick decay 1, while if t₂ < t₁ pick decay 2.

(33)

The winner takes it all: proof

P1(t) = (f1(t) + f2(t)) exp

− Z t

0

(f1(t⁰) + f2(t⁰)) dt⁰

f1(t) f₁(t) + f₂(t)

= f1(t) exp

− Z t

0

(f1(t⁰) + f2(t⁰)) dt⁰

= f1(t) exp

− Z t

0

f1(t⁰) dt⁰

exp

− Z t

0

f2(t⁰) dt⁰

Algorithm especially convenient when temporal and/or spatial dependence of f1 and f2 are rather different.

(34)

Summary

Nature is Quantum Mechanical.

LHC events contain infinite variability.

Use random numbers to pick among possible outcomes, to give complete computer-generated LHC events, hopefully predicting/reproducing average and spread of any observable quantity.

Tomorrow:

A closer look at some of the key physics components of generators.

A survey of existing generators.