
A Machine Learning Approach for Comprehending Cosmic Expansion

Constraining the Redshift-Distance Test Using a Cosmological Pinhole Camera and a Deep Convolutional VAE with VampPrior

LUDVIG DOESER

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ENGINEERING SCIENCES

Author: Ludvig Doeser, doeser@kth.se
Engineering Physics; Theoretical Physics
KTH Royal Institute of Technology
Stockholm, Sweden

Examiner:
Josefin Larsson, KTH Royal Institute of Technology

Supervisors:
Jens Jasche, Stockholm University
Felix Ryde, KTH Royal Institute of Technology
M.Sc. Thesis

A Machine Learning Approach for Comprehending Cosmic Expansion

Constraining the Redshift-Distance Test Using a Cosmological Pinhole Camera and a Deep Convolutional VAE with VampPrior

Ludvig B. Doeser – doeser@kth.se
Supervisor: Jens Jasche

11 June 2021


Abstract

This thesis aims at using novel machine learning techniques to test the dynamics of the Universe via the cosmological redshift-distance test. Currently, one of the most outstanding questions in cosmology is the physical cause of the accelerating cosmic expansion observed with supernovae. Simultaneously, tensions in measurements of the Hubble expansion parameter $H_0$ are emerging. Measuring the Universe expansion with next-generation galaxy imaging surveys, such as provided by the Vera Rubin Observatory, offers the opportunity to discover new physics governing the Universe dynamics. In this thesis, with the long-term goal to unravel these peculiarities, we create a deep generative model in the form of a convolutional variational auto-encoder (VAE), trained with a "Variational Mixture of Posteriors" prior (VampPrior) and high-resolution galaxy images from the simulation project TNG-50.

Our model is able to learn a prior on the visual features of galaxies and can generate synthetic galaxy images which preserve the coarse features (shape, size, inclination, and surface brightness profile), but not finer morphological features, such as spiral arms. The generative model for galaxy images is applicable to uses outside the scope of this thesis and is thus a contribution in itself. We next implement a cosmological pinhole camera model, taking angular diameter changes with redshift into account, to forward-simulate the actual observation on a telescope detector. Building upon the hypothesis that certain features of galaxies should be of proper physical sizes, we use probabilistic triangulation to find the comoving distance $r(z, \Omega)$ to these in a flat ($K = 0$) Universe. Using a sample of high-resolution galaxy images from redshifts $z \in [0.05, 0.5]$ from TNG-50, we demonstrate that the implemented Bayesian inference approach successfully estimates $r(z)$ within $1\sigma$ errors ($\sigma_{r,\text{est}} = 140\,(580)$ Mpc for $z = 0.05\,(0.5)$). Including the surface brightness attenuation and utilizing the avalanche of upcoming galaxy images could significantly lower the uncertainties. This thesis thus shows a promising path forward utilizing novel machine learning techniques and massive next-generation imaging data to improve and generalize the traditional cosmological angular-diameter test, which in turn has the potential to increase our understanding of the Universe.


Sammanfattning

(Swedish abstract, translated.) This thesis aims to use novel machine learning techniques to test the dynamics of the Universe via the cosmological redshift-distance test. Currently, one of the most outstanding questions in cosmology is the physical cause of the accelerating cosmic expansion observed with supernovae. At the same time, tensions are emerging in measurements of the Hubble expansion parameter $H_0$. Measuring the expansion of the Universe with next-generation galaxy surveys, such as those to be carried out by the Vera Rubin Observatory, offers the opportunity to discover new physics governing the dynamics of the Universe. In that spirit, we create in this thesis a deep generative model in the form of a convolutional variational auto-encoder (VAE), trained with a "Variational Mixture of Posteriors" prior (VampPrior) and high-resolution galaxy images from the simulation project TNG-50. Our model can learn a prior on the visual features of galaxies and can generate synthetic galaxy images that preserve the coarse features (shape, size, inclination, and surface brightness profile), but not finer morphological features, such as spiral arms. The generative model for galaxy images is applicable to uses outside the scope of this thesis and is thus a contribution in itself. We then implement a cosmological pinhole camera model, which takes changes in angular size with redshift into account, to forward-simulate the actual observation on a telescope detector. Starting from the hypothesis that galaxies should fundamentally share features of similar physical sizes, we use probabilistic triangulation to find the comoving distance $r(z, \Omega)$ to them in a flat ($K = 0$) Universe. Using a sample of high-resolution galaxy images from redshifts $z \in [0.05, 0.5]$ from TNG-50, we show that the implemented Bayesian inference method successfully estimates $r(z)$ within $1\sigma$ error margins ($\sigma_{r,\text{est}} = 140\,(580)$ Mpc for $z = 0.05\,(0.5)$). Including the surface brightness attenuation with redshift and using the massive amount of upcoming galaxy images could significantly reduce the obtained uncertainties. This thesis thus shows a promising way forward with novel machine learning techniques and the enormous amounts of upcoming galaxy images to improve and generalize the traditional cosmological angular-diameter test, which in turn has the potential to increase our understanding of the Universe.


Acknowledgements

First of all, I would like to thank the COPS group at Stockholm University, who warmly welcomed me to the group and made me feel included despite the difficulties in socializing due to COVID-19. Thanks to Adam, Eleni, and Raffaella for the weekly planning meetings and for asking relevant questions as well as answering questions I had. Especially, thanks to my supervisor Jens, who has been supporting, helpful, and inspiring; "Science is not about finishing something, it is about continuing something. The best research ends with a new question. We don't want to close a chapter, we want to open a whole book."

Thanks also to my supervisor Felix Ryde and examiner Josefin Larsson at KTH for help with the administrative parts of the thesis. Thanks to Dylan Nelson, contact person for the TNG-Illustris project, for answering questions related to the TNG-50 simulation dataset.

A huge thanks also to my friends and family. A special thanks to my close friends Daniel, Erik, Ida, and Marcus for daily company and discussions for the duration of the thesis. Also, thanks to Anna, Filip, and Petter for contributing with valuable inputs and ideas. I'm also grateful for my friends Frida, Hanna, John, and Tarek, who have all been supportive and shown an interest in my work. An extra big thank you to John for suggesting a folding ruler to measure the distance to galaxies; "The trick is to unfold the ruler". Lastly, a big thanks to my parents and my sister for all the support during the thesis as well as during my full-time studies at KTH.

This research utilized Google Colab for implementing the deep learning algorithms in Python. Packages used include Astropy, Matplotlib, Numpy, and Tensorflow.


List of Acronyms

ACS Advanced Camera for Surveys
CCD charge-coupled device
CMB Cosmic Microwave Background
CNN convolutional neural network
EB Empirical Bayes
ELBO Evidence Lower Bound
FLRW Friedmann–Lemaitre–Robertson–Walker
FP fundamental plane
GAN Generative Adversarial Network
GSS golden section search
HST Hubble Space Telescope
JWST James Webb Space Telescope
KL Kullback-Leibler
ML Machine Learning
MSE mean squared error
NN neural network
SB surface brightness
TFR Tully-Fisher relation
VAE Variational Auto-Encoder
NIRCam Near-Infrared Camera


Contents

1 Introduction
  1.1 State of Cosmology
  1.2 Friedmann Model of a Homogeneous Isotropic Universe
  1.3 Cosmological Redshift
  1.4 Distance Measurements
    1.4.1 Angular Diameter Distance
  1.5 Cosmological Tests using Astronomical Data
  1.6 Hierarchical Structure Formation
  1.7 Galaxy Observations and Properties
    1.7.1 Scaling Relations as Distance Indicators
  1.8 Upcoming Astronomical Surveys
  1.9 What is Machine Learning?

2 Distance Estimation
  2.1 Motivation for using Galaxy Images
  2.2 Probabilistic Triangulation with Generative Model
  2.3 Cosmological Pinhole Camera Model
    2.3.1 Distance Optimization: r-step
    2.3.2 Latent Space Optimization: z-step

3 Generative Model for Galaxy Images
  3.1 Learning with Variational Auto-Encoder
    3.1.1 Maximizing the ELBO using Bayesian Statistics
    3.1.2 Loss Function using the Reparametrization Trick
  3.2 Variational Mixture of Posteriors Prior

4 Proof of Concept Demonstration
  4.1 Description of Training Data
  4.2 Architecture of VAE
  4.3 Generation of Artificial Galaxy Images
  4.4 Application of Camera Model
  4.5 Distance Estimations
    4.5.1 Result for Single Galaxies
    4.5.2 Redshift-Distance Relation

5 Summary and Conclusion
  5.1 Future Outlook

Bibliography

A Cosmological Dimming
B Posterior over Latent Space
C Effect of z-step
D Distance Distributions


Chapter 1

Introduction

Cosmology is the branch of physics concerned with the origin, evolution, and structure of the Universe. Although recent observational results, such as provided by the Planck satellite mission (Ade et al., 2016; Aghanim et al., 2020), have strengthened our current understanding of the Universe as described by the standard model of cosmology, unresolved observational tensions are challenging it. In the coming sections, the current state of cosmology will be discussed, addressing the following questions: What do we know? What are the emerging observational tensions? What cosmological tests can help confirm/deny the currently accepted cosmological model and/or unravel new physics?

1.1 State of Cosmology

According to the current paradigm, the Universe began 13.8 billion years ago in a hot Big Bang scenario, where the basic elements of the Universe were created together with the initial seed fluctuations, from which cosmic structures formed (see e.g. Mo et al. (2010)). This hypothesis is partly rooted in the discovery by Hubble (1929), who found a linear dependence between the distances and the recession velocities of galaxies¹ – the further away, the faster a galaxy moves away from us. This is captured by Hubble's law

$$v = H_0 D, \tag{1.1}$$

¹ Galaxies are the essential observable objects, consisting of millions to billions of stars, tracing the matter distribution of the Universe (see more in section 1.7).


where $v$, $H_0$, and $D$ are the velocity, the Hubble constant of today, and the distance, respectively. Hubble's discovery indicates an expanding Universe, which after extrapolation backwards in time leads to a denser Universe and eventually to the singularity where space and time started, the Big Bang.

In addition, an accelerated expansion of the Universe has been observed, both by the Supernova Cosmology Project (Perlmutter et al., 1999) and the High-z Supernova Team (Riess et al., 2001). The currently best description of this observation is a cosmological constant, i.e. a dark energy component (see more in section 1.2), which, however, is currently not explained by the general corpus of physics; in fact, it leads to the largest misfit in physics today, called the cosmological constant problem (for a review see Padilla (2015)). The problem lies in the inconsistency between general relativity (the observed vacuum energy density through the cosmological constant) and quantum mechanics / quantum field theory (the prediction of the vacuum energy density). A potential resolution to this problem may involve new physics, requiring a modification of either the theory of gravity or the standard model of particle physics.

Increased interest in understanding the cosmic expansion history is further driven by the so-called $H_0$-tension (see e.g. Bernal et al. (2016)). The Planck measurements (Aghanim et al., 2020) of the Cosmic Microwave Background (CMB), which is faint electromagnetic radiation filling all of space and thought to be relic radiation from the Big Bang, yield a value that is statistically different from the one derived from Cepheid variable stars (see more in section 1.4) and supernovae (Riess et al., 2016).

1.2 Friedmann Model of a Homogeneous Isotropic Universe

The physical description of the Universe builds upon Einstein’s theory of general

relativity, which tells us that the geometrical space-time structure of the Universe

is caused by the matter distribution within it. This interaction between space-time and matter is encapsulated by Einstein's field equations (Einstein, 1916). In the specific case when assuming the cosmological principle, which states that the Universe on larger scales is spatially homogeneous – looks the same in all locations – and isotropic – looks the same in all directions – these equations have a solution. The cosmological principle of a simple matter distribution, which has been strengthened by measurements of the CMB (see e.g. Clifton et al. (2012)), is captured by the Friedmann–Lemaitre–Robertson–Walker (FLRW) metric², which reads

$$ds^2 = c^2 dt^2 - dl^2 = c^2 dt^2 - a^2(t)\left[\frac{dr^2}{1 - Kr^2} + r^2\left(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2\right)\right]. \tag{1.2}$$

Here, $ds$ is the line element between two points in space-time separated temporally by $dt$ (weighted by the speed of light $c$) and spatially by $dl$, which in turn is defined through changes in the spherical comoving³ coordinates $(r, \vartheta, \varphi)$. The FLRW metric provides an explanation for Hubble's discovery through the time-dependent scale factor $a(t)$, which accounts for the expansion (or possibly contraction) of the Universe: If the separation at time $t_0$ between any pair of points (fixed in the cosmological background) is $r_0$, then the separation will be

$$r = r_0\,\frac{a(t)}{a(t_0)} \tag{1.3}$$

at time $t$. The time derivative (with respect to $t$) of the separation is then

$$\dot r = \frac{d}{dt}\left(r_0\,\frac{a(t)}{a(t_0)}\right) = r_0\,\frac{\dot a(t)}{a(t_0)} = \frac{\dot a(t)}{a(t)}\, r_0\,\frac{a(t)}{a(t_0)} = \frac{\dot a(t)}{a(t)}\, r \equiv H(t)\, r, \tag{1.4}$$

where $H(t)$ is the Hubble constant⁴. Noting that $H_0 \equiv H(t_0) = \dot a_0/a_0$, where subscript 0 means present time, and $v = \dot r$, we recognize Eq. (1.4) as Hubble's law Eq. (1.1), and we see that $\dot a(t) > 0$ ($\dot a(t) < 0$) corresponds to an expanding (shrinking) Universe. A related quantity is the deceleration parameter $q_0 = -\ddot a_0 a_0/\dot a_0^2 = -\left(1 + \dot H/H^2\right)\big|_{t_0}$, so that $q_0 < 0$ ($q_0 > 0$) means that the rate of expansion accelerates (decelerates).

² The metric, or distance function, of a manifold explains how the distance between two points on the manifold is measured.
³ Comoving means that these coordinates are at rest in a frame that evolves (expands/contracts) with the dynamics of space-time. An observer at those coordinates is said to be freely falling since no force, but only the curvature of space-time, is acting on it.
⁴ Unfortunate nomenclature, as this is a constant that changes with time.

The scale factor alone cannot, however, capture the current, past, and future geometry and dynamics of space-time. The other needed quantity is the curvature signature $K$ ($= -1$, $0$, or $+1$) of the Universe. These values ensure that space-time is maximally symmetric, as assumed by the cosmological principle, and correspond to an open (hyperbolic), a flat (Euclidean), and a closed (spherical) Universe, respectively. The curvature can, in turn, be expressed through another useful cosmological parameter, the mean density $\Omega_0$ of today, as $\Omega_{K,0} \equiv -\frac{Kc^2}{H_0^2 a_0^2} = 1 - \Omega_0$. The mean density further describes the energy density of the Universe, which is usually said to consist of four main components:

• baryonic matter, that is neutrons, protons, and electrons, which make up the visible Universe,

• dark matter, a form of matter that interacts with visible matter but not the electromagnetic field (and therefore is invisible),

• dark energy, a form of energy that drives the cosmic expansion, and

• radiation, either in the form of photons or relativistic particles.

The matter content is usually seen as a linear combination of these, $\Omega_0 = \Omega_{\Lambda,0} + \Omega_{m,0} + \Omega_{r,0}$. Here, $\Lambda$, $m$, and $r$ stand for the vacuum/dark energy density (also referred to as the cosmological constant), the non-relativistic (dark and baryonic) matter, and the relativistic matter (e.g. radiation), respectively⁵. The deceleration parameter is related to these parameters through $q_0 = \frac{\Omega_{m,0}}{2} + \Omega_{r,0} - \Omega_{\Lambda,0}$.

The cosmic expansion history can be understood by solving for the scale factor $a(t)$ in the FLRW metric by using Einstein's equations. The solution is given by the Friedmann equations (Friedman, 1922), which are

$$\left(\frac{\dot a}{a}\right)^2 = H^2 = \frac{8\pi G}{3}\rho - \frac{Kc^2}{a^2} \tag{1.5a}$$

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) \tag{1.5b}$$

⁵ The interested reader is recommended Mo et al. (2010) for a more thorough discussion.


where $G$ is the universal gravitational constant and where we introduced a perfect fluid⁶, with energy density $\rho$ and pressure $p$, to describe the energy and momentum of the matter-fields. These are assumed to be related through the equation of state $p = w\rho$, where $w$ is a dimensionless parameter that adopts different values for different matter components.

Despite the unknown nature of both dark matter and dark energy, they are thought to jointly make up more than 95% of the total energy density of the Universe (Mo et al., 2010). Cosmological models differ in (i) the nature of dark matter and dark energy, and (ii) the relative contributions of the different energy components (baryonic matter, dark matter, and dark energy). The concordance⁷ $\Lambda$CDM model of the Universe is a flat Universe ($K = 0$) with $\sim 75\%$ dark energy (denoted $\Lambda$), $\sim 21\%$ (cold) dark matter (hence CDM), and $\sim 4\%$ baryonic matter, while the radiation content today is negligible. From the most recent Planck Collaboration (Aghanim et al., 2020), the following parameter values have been found: $H_0 = (67.4 \pm 0.5)\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$, $\Omega_{K,0} = 0.001 \pm 0.002$, $\Omega_{m,0} = 0.315 \pm 0.007$.

⁶ A perfect fluid is isotropic in its rest-frame since it lacks any heat flow and viscosity.
⁷ Sufficiently many independent observations and subsequent data analyses are coherent with this cosmological model for it to be called the concordance model (see e.g. Tegmark et al. (2006)).
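As a quick numerical check of the relation $q_0 = \Omega_{m,0}/2 + \Omega_{r,0} - \Omega_{\Lambda,0}$ with the Planck values quoted above, consider the minimal sketch below; the flatness assumption $\Omega_{\Lambda,0} = 1 - \Omega_{m,0}$ and the neglect of radiation are illustrative choices made here, not values taken from the text.

```python
# Deceleration parameter q_0 = Omega_m/2 + Omega_r - Omega_Lambda (section 1.2),
# evaluated with the Planck 2018 values quoted above. Assuming a flat Universe
# with negligible radiation, Omega_Lambda = 1 - Omega_m (assumption for this example).
Omega_m0 = 0.315
Omega_r0 = 0.0              # radiation negligible today
Omega_L0 = 1.0 - Omega_m0   # flatness assumption

q0 = Omega_m0 / 2 + Omega_r0 - Omega_L0
print(f"q_0 = {q0:.3f}")    # ~ -0.53 < 0: accelerating expansion
```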

1.3 Cosmological Redshift

Due to the previously described space-time expansion (also called the Hubble flow), the wavelength $\lambda_{\text{em}}$ of emitted photons (the elementary particles of light) will be stretched, yielding the so-called cosmological redshift of light. Defining the scale factor of today as $a_0 = 1$ and $a_{\text{em}} \equiv a(t_{\text{em}})$, we have from Eq. (1.3) that the observed wavelength will be:

$$\lambda_{\text{obs}} = \frac{\lambda_{\text{em}}}{a_{\text{em}}}. \tag{1.6}$$

For convenience, we characterize the wavelength change with a redshift parameter,

$$z = \frac{\lambda_{\text{obs}}}{\lambda_{\text{em}}} - 1 = \frac{1}{a_{\text{em}}} - 1 \iff a_{\text{em}} = \frac{1}{1 + z}. \tag{1.7}$$

The observed redshift $z$ and the scale factor $a$ describing space-time expansion are thus intrinsically related. As $\dot a(t) > 0$ in an expanding Universe, $a$ was smaller in the past, so that a larger $z$ corresponds to light emission further back in time. The stretching-of-space interpretation of the cosmological redshift is, however, not necessarily the sole interpretation. Historically, the Doppler effect was thought to explain the discovery made by Hubble (1929), which is why it is usually brought up in conjunction with the cosmological redshift. The Doppler effect explains at least the peculiar motion of objects with respect to the expansion (which is usually negligible for large objects), but it could also explain the cosmological redshift⁸. The non-relativistic ($v \ll c$) Doppler effect⁹ reads

$$\nu_{\text{em}} = \left(1 + \frac{\mathbf{v}\cdot\hat{\mathbf{r}}}{c}\right)\nu_{\text{obs}}, \tag{1.8}$$

where $\mathbf{v}$ is the velocity of the emitting source, $\hat{\mathbf{r}}$ is the unit vector in the direction from the observer to the source, and $\nu_{\text{obs}}$ ($\nu_{\text{em}}$) is the observed (emitted) frequency of the light. This effect is a consequence of Einstein's postulate that the speed of light $c$ is constant with respect to all inertial frames (Einstein, 1905). Although the speed of light is independent of the motion of the light source, the wavelength/frequency is not. As a result, we have two cases:

• Blueshift: If the light source is moving toward us, the observed frequency (wavelength) will be higher (shorter), $\nu_{\text{obs}} > \nu_{\text{em}}$ ($\lambda_{\text{obs}} < \lambda_{\text{em}}$), and we therefore say that the light has been blueshifted. A contracting Universe also contributes with a blueshift, since the light wave is contracted.

• Redshift: Conversely, if the source is receding from us, $\nu_{\text{obs}} < \nu_{\text{em}}$ ($\lambda_{\text{obs}} > \lambda_{\text{em}}$), and we say that it has been redshifted. An expanding Universe also contributes with a redshift, since the wavelength is stretched.

By combining Eq. (1.7) and Eq. (1.8), yielding the radial velocity $v_r = cz$, and then inserting this into Hubble's law, we obtain

$$D = \frac{cz}{H_0}. \tag{1.9}$$

⁸ In short, it depends on the observer, the choice of coordinates, and how distant the object is. The interested reader is directed to Francis et al. (2007); Bunn and Hogg (2009); Lewis (2016).
⁹ Note that the wavelength and the frequency are related through $\nu = c/\lambda$.
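A minimal numerical illustration of the low-redshift distance-redshift relation of Eq. (1.9); the sample redshifts below are arbitrary choices.

```python
# Low-redshift distance from Eq. (1.9), D = c z / H_0, valid only for z << 1.
# H_0 from Planck 2018 (see section 1.2).
c = 299792.458   # speed of light [km/s]
H0 = 67.4        # Hubble constant [km/s/Mpc]

for z in (0.01, 0.05, 0.1):
    D = c * z / H0   # distance [Mpc]
    print(f"z = {z:4.2f}  ->  D = {D:7.1f} Mpc")
```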


Although this redshift-distance relation is only valid for small redshifts (as the redshift-recession velocity relation depends on the cosmic expansion through the cosmological parameters), it does highlight the importance of redshift surveys for studying the evolution of the Universe; an object that is further away from us will have a higher redshift. This, together with the fact that the speed of light is constant, means that the higher the redshift we observe, the younger the Universe was at the time of emission of the light that we observe today.

1.4 Distance Measurements

Distance estimations are complicated by the fact that the Universe is expanding, which means that the more distant objects we observe, the more careful we need to be with the concept of distance. For nearby objects, we can use the trigonometric parallax method, which is a triangulation technique based on the apparent shift in position of an object in the sky when seen from two different lines of sight¹⁰. Using the diameter of Earth's orbit around the Sun as baseline, we can infer the distance through

$$d = \frac{A}{\theta}, \tag{1.10}$$

where $A = 1\,\mathrm{AU}$ is the semimajor axis of Earth's elliptical orbit around the Sun and $\theta$ is half the angular shift of the object. The special case when $\theta = 1''$ (arcsecond, defined as 1/3600 of a degree) defines the length unit 1 pc (parsec), which is equal to 3.26 light-years. With this method we have been able to measure distances up to a couple of kpc (Mo et al., 2010). The parallax method is very important as other methods of the cosmic distance ladder are calibrated through it and, thus, ultimately rely on it¹¹.

For distant objects, cosmic expansion becomes relevant and several distinct, but related, concepts of distance emerge¹². Related to the comoving coordinates, the comoving distance $D_C$ is a distance concept which accounts for the cosmic expansion;

¹⁰ As we humans have two eyes, this is how we're able to infer distances in everyday life.
¹¹ See cosmic distance ladder (link to Wikipedia) for an overview.
¹² Note that in the nearby universe, they all yield the same distance.


two nearby objects that move with the expansion will always be separated by the same comoving distance. To obtain an expression for the comoving distance, one can integrate the radial component (disregarding the scale factor) of the FLRW metric,

$$D_C = \int_0^{r_1} \frac{dr}{\sqrt{1 - Kr^2}} = \frac{c}{H_0}\int_0^{z} \frac{dz'}{E(z')}, \tag{1.11}$$

where the last expression originates from an alternative formulation of the Hubble constant, derived from Eq. (1.5a) by relating the density $\rho$ with the matter components,

$$H(z) = H_0\sqrt{\Omega_M(1+z)^3 + \Omega_K(1+z)^2 + \Omega_\Lambda} \equiv H_0 E(z). \tag{1.12}$$

The solution to Eq. (1.11) tells us how the comoving distance $D_C$ is related to the comoving coordinate $r$ and depends on the curvature $K$ of space-time as

$$D_C(r) = \begin{cases} \sin^{-1} r & (K = +1)\\ r & (K = 0)\\ \sinh^{-1} r & (K = -1). \end{cases} \tag{1.13}$$

In a flat universe ($\Omega_{m,0} + \Omega_{\Lambda,0} = 1$, $\Omega_K = 0$, $D_C = r$), we insert Eq. (1.12) into Eq. (1.11), yielding

$$r = \frac{c}{H_0}\int_0^{z} \frac{dz'}{\left[\Omega_{\Lambda,0} + \Omega_{m,0}(1+z')^3\right]^{1/2}}, \tag{1.14}$$

which can only be solved numerically. By performing a Taylor expansion one can show (see e.g. Mo et al. (2010)) that

$$r \approx \frac{c}{H_0}\left[z - \frac{1}{2}z^2(1 + q_0) + \dots\right] \approx \frac{cz}{H_0} + \dots, \tag{1.15}$$

which we recognize from Eq. (1.9) and which indicates that the different distance concepts indeed yield the same expression ($d = \frac{cz}{H_0}$) in the nearby ($z \ll 1$) Universe.
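Since Eq. (1.14) can only be solved numerically, the sketch below integrates it and compares the result with the leading Taylor term of Eq. (1.15); the use of scipy.integrate.quad and the chosen redshifts are illustrative assumptions, not part of the thesis pipeline.

```python
import numpy as np
from scipy.integrate import quad

# Comoving distance in a flat Universe, Eq. (1.14): numerical integration of
# 1/E(z') with E(z) = sqrt(Omega_L + Omega_m (1+z)^3). Planck 2018 parameters
# (section 1.2); radiation neglected.
c = 299792.458   # [km/s]
H0 = 67.4        # [km/s/Mpc]
Om, OL = 0.315, 0.685

def E(z):
    return np.sqrt(OL + Om * (1 + z)**3)

def comoving_distance(z):
    integral, _ = quad(lambda zp: 1.0 / E(zp), 0.0, z)
    return c / H0 * integral   # [Mpc]

for z in (0.05, 0.5, 1.5):
    r = comoving_distance(z)
    taylor = c / H0 * z        # leading term of Eq. (1.15)
    print(f"z = {z:4.2f}: r = {r:7.1f} Mpc,  cz/H0 = {taylor:7.1f} Mpc")
```

At $z = 0.05$ the two agree closely, while at higher redshift the linear Hubble-law term overestimates the distance, as expected from Eq. (1.15).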

If we, on the other hand, could freeze the expansion and lay out a ruler instantly, then we would measure the proper distance $D_P$, given by

$$D_P = a(t)\,D_C = \frac{D_C}{1 + z}, \tag{1.16}$$


where we see that the comoving distance of today ($z = 0$) corresponds to the proper distance of today. We can, however, measure neither the proper nor the comoving distance to an object, as no observable corresponds to where the object is right now, only to where it was when the light was emitted. Two directly observable distances are instead the luminosity distance $d_L$ and the angular diameter distance $d_A$, which are often used in conjunction with the concepts of standard candles and standard rulers.

Cepheid variables, i.e. stars that exhibit periodical changes in brightness, were the first objects to be used as standard candles, which are objects thought to have a fixed intrinsic luminosity; one can thus estimate the distance through the brightness attenuation. More recently, Type Ia supernovae and luminous radio galaxies have been used as such candles, see e.g. Perlmutter et al. (1999). Similarly, standard rulers are objects that are thought to be of a fixed physical length, which means that the distance can be estimated by trigonometry and the observed angular size.

1.4.1 Angular Diameter Distance

Letting the intrinsic physical diameter of the object be $D$ and the angular size be $\vartheta$, we can define the angular diameter distance as

$$d_A = \frac{D}{\vartheta}. \tag{1.17}$$

In the FLRW metric, the angular diameter distance $d_A$ approximates the proper distance that light has travelled since its emission. To derive an expression for $d_A$ we first define the proper physical size $D$ as the proper distance between two light signals emitted on either side of the object. These signals have been emitted at the same radial coordinate $r = r(t_{\text{em}})$ at the same cosmic time $t_{\text{em}}$ with scale factor $a = a(t_{\text{em}})$. The expression for $D$ is then given by the integral of $dl$ in Eq. (1.2) over the transverse direction,

$$D = ar\int d\vartheta = ar\vartheta, \tag{1.18}$$

which inserted into Eq. (1.17) gives

$$d_A = ar = \frac{a_0\, r}{1 + z} = \frac{r}{1 + z}, \tag{1.19}$$


Figure 1.1: Different cosmological distances vs redshift. Both the comoving and luminosity distance are monotonically increasing functions of the redshift, while there is a turnover for the angular diameter distance. This means that the same object becomes slightly larger in the sky for $z > z_t$ if brightness attenuation is neglected, as shown in the images. Cosmological parameters were adopted from Planck 2018 (Aghanim et al., 2020) (see section 1.2).

where $a_0$ is again set to 1. In the case of a flat ($K = 0$) Universe, the angular diameter distance is related to the comoving distance through

$$d_A = \frac{D_C}{1 + z}. \tag{1.20}$$

The relationship between the comoving distance and the angular diameter distance is shown in Fig. 1.1, with the numerical solution to Eq. (1.14) shown. We see that the angular diameter distance, due to the $\frac{1}{1+z}$-factor, has a maximum value and then decreases with increasing redshift, which means that the angular size of an object of fixed size will increase with increasing redshift. In essence, if attenuation of the surface brightness, which quantifies the apparent brightness of a spatially large object in the sky, were to be neglected, the same object would become larger in the sky after the turnover redshift $z_t \approx 1.5$. This turnover is a consequence of objects appearing smaller because they are far away being counteracted by objects appearing larger because they were closer in the past (see e.g. Davis and Lineweaver (2004)). The Universe was namely smaller at the time of emission, and the emitting object was thus relatively larger in the Universe than it is today.
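A short sketch locating this turnover by a brute-force grid search over $d_A(z) = D_C(z)/(1+z)$; the grid range and resolution are arbitrary choices.

```python
import numpy as np
from scipy.integrate import quad

# Angular diameter distance d_A = D_C / (1+z) (Eq. 1.20, flat Universe) and its
# turnover redshift z_t, found by a simple grid search. Planck 2018 parameters.
c, H0, Om, OL = 299792.458, 67.4, 0.315, 0.685

def D_C(z):
    integral, _ = quad(lambda zp: 1.0 / np.sqrt(OL + Om * (1 + zp)**3), 0.0, z)
    return c / H0 * integral

zs = np.linspace(0.01, 5.0, 500)
d_A = np.array([D_C(z) / (1 + z) for z in zs])
z_t = zs[np.argmax(d_A)]
print(f"turnover at z_t = {z_t:.2f}, d_A(z_t) = {d_A.max():.0f} Mpc")
# close to the z_t ~ 1.5 quoted in the text
```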

The angular diameter distance can furthermore be shown to be related to the luminosity distance via Etherington's reciprocity relation (Etherington, 1933),

$$d_L = \sqrt{\frac{L}{4\pi F}} = D_C(1 + z) = d_A(1 + z)^2, \tag{1.21}$$

where $L$ is the luminosity and $F$ is the flux. As seen in Fig. 1.1, the luminosity distance lacks a turnover and instead increases quickly with the redshift. One can show that Eq. (1.19) and (1.21) together lead to the surface brightness $I$ attenuating as $I \propto (1 + z)^{-4}$, which is discussed more in Appendix A.

1.5 Cosmological Tests using Astronomical Data

To put constraints on the cosmological parameters and on cosmological models, we need to measure the large-scale structures and geometrical properties of the Universe. This can be achieved by finding two independent observables that are related to each other through these parameters. Several methods have been proposed:

• the Hubble diagram (or redshift-magnitude relation), $m = m(M, z, \Omega)$,

• the angular diameter test (or redshift-angle relation), $\vartheta = \vartheta(D, z, \Omega)$,

• the Hubble test (or redshift-count relation), $N = N(n, z, \Omega)$,

• the Alcock-Paczynski test (or redshift-deformation relation), $\Delta z/(z\,\Delta\vartheta) \equiv k = k(z, \Omega)$.

In all these cases, some observable ($m$, $\vartheta$, $N$, $k$) of a reference standard in magnitude $M$ (standard candle), size $D$ (standard rod), or density $n$ is measured for different redshifts and compared with model predictions. Note that the Hubble diagram and the angular diameter test are both redshift-distance tests, since the distance is related to the observables through the luminosity distance and the angular diameter distance, respectively. Since the various distance measures are related and together constitute a unique concept, we have

$$d(z, \Omega) = \frac{cz}{H_0}\left[1 + F_d(z, \Omega)\right], \tag{1.22}$$

where $F_d$ is a function of the cosmological parameters that depends on the cosmological model, and $\Omega = \{H_0, \Omega_m, \Omega_{r,0}, \Omega_{\Lambda,0}\}$ denotes the set of cosmological parameters. Note that Eq. (1.15) follows this form.

The Hubble diagram version of the distance-redshift relation was used in 1998 with Type Ia supernovae to show that two of the cosmological parameters follow the approximated joint probability distribution given by $0.8\,\Omega_M - 0.6\,\Omega_\Lambda \approx -0.2 \pm 0.1$, for $\Omega_M \lesssim 1.5$ (Perlmutter et al., 1999). One can show that this implies $q_0 < 0$, i.e. that the expansion of the Universe is currently accelerating. The angular diameter test version is one of the most important probes of cosmology through the baryon acoustic oscillations in galaxies, and is in high agreement with the $\Lambda$CDM predictions from CMB data (Eisenstein et al., 2005). This test has also been applied to quasars, radio galaxies, and galaxy clusters; Kapahi (1987) arrived at no conclusive conclusion, Gurvits et al. (1999) obtained $q_0 = 0.21 \pm 0.30$, and Wei et al. (2014) obtained results consistent with $\Lambda$CDM. The underlying reason for galaxies to contain cosmological information will be investigated next.

1.6 Hierarchical Structure Formation

In a $\Lambda$CDM scenario, cosmic structures originate from primordial perturbations in the density field, which created over-dense regions that started to accumulate dust, gas, and dark matter through gravity. Such a formation process is denoted hierarchical or bottom-up, as larger objects are formed by the merging of smaller progenitors, as shown in Fig. 1.2. Eventually, these isolated systems of baryonic and dark matter, referred to as halos, became the basic units of the cosmic structure and accumulated enough gravitationally bound matter to form deep gravitational potentials and become decoupled from the cosmic expansion.


Figure 1.2: Hierarchical formation process. Smaller progenitors merge to form larger objects, halos, that eventually become hosts for galaxies. These massive objects form deep gravitational potentials and decouple themselves from the cosmic expansion.

These halos have continued to grow in both mass and size, either by merging with other halos or by accretion of gas and dust. Every halo can furthermore contain several distinct clumps and peaks of dark matter, which are called subhalos. The smaller halos orbit within the gravitational potential of the host halo. Above a certain threshold mass, each halo and subhalo is expected to host a galaxy (Wechsler and Tinker, 2018). In this hierarchical fashion, the large-scale structures of the Universe look very similar in all directions, and due to the scaling relations of this structure-formation build-up, one can imagine that there are shared intrinsic features of galaxies. Due to the universal presence of gravity, observationally found scaling relations for galaxy properties contain essential cosmological information.

1.7 Galaxy Observations and Properties

Formed by the gravitational aggregation of intergalactic gas particles, galaxies grow to host millions to billions of stars. The set of observed galaxies does, however, constitute a very diverse class of objects, indicating that the characterization of galaxies demands the use of many parameters. Enabled by the sufficiently high-resolution observations at the time, Hubble contributed with the Hubble Sequence (Hubble, 1926), which is a classification of galaxies dependent on their morphology, that is, their visual appearance, as presented in Figure 1.3.

Figure 1.3: The Hubble sequence. Galaxies can take many different visual forms, from various elliptical types (left) to spirals and irregulars with/without a centered bar (bottom/top). Credit: Figure 3 from Cui et al. (2014).

Tremendous progress, both in the development of optical instruments and in observational techniques, has been made since the beginning of the 20th century. Not only have we observed nearby galaxies at high resolution in all wavebands (e.g. optical and infrared), we have also performed redshift surveys with hundreds of thousands of galaxies and managed to unveil the high-redshift population of galaxies (see e.g. Abazajian et al. (2009)). In total, we have performed deep, high-quality spectroscopy¹³ for millions of galaxies.

Some of the most prominent properties of galaxies are: morphology/visual appearance, luminosity, total stellar mass, size, shape, surface brightness, gas mass fraction, color, environment, nuclear activity, and redshift (see more in Mo et al. (2010)).

¹³ A primary research tool in astronomy is the spectrograph, which spreads out the observed light into a spectrum (like a rainbow) using a fine grating or prism. This decoding of light into different wavelengths, and the subsequent study of the shifts from the expected lines in the spectra, enables astronomers to find the redshift of celestial objects. Apart from spectroscopic surveys, astronomers also perform photometric surveys. During these, light from a source is filtered to only let through a certain color (wavelength span) and then captured on an image sensor, such as a charge-coupled device (CCD) with a finite number of pixels. Such an image will then display the surface brightness distribution of the observed object.


In the pursuit of unravelling the mysteries behind galaxy formation, much effort has been put into understanding the joint probability distribution function of these properties. The Hubble sequence is a simple classification scheme, but there are more visual distinctions than just the disc/elliptical/irregular and bar/non-bar segmentations, see e.g. the Galaxy Zoo 2 project (Willett et al., 2013). One can, for example, further classify elliptical galaxies depending on boxy or disky shapes, and disk galaxies depending on whether they have a bulge, whether they are seen edge-on or face-on, and whether they have grand-design spirals or flocculent spirals (Mo et al., 2010).

Size and shape are two other fundamental properties. The size distribution of galaxies with a given luminosity $L$ has been shown in Shen et al. (2003) to approximately follow a log-normal distribution of the form

$$P(R\,|\,L)\,dR = \frac{1}{\sqrt{2\pi}\,\sigma_{\ln R}}\exp\left[-\frac{\ln^2(R/\bar R)}{2\sigma_{\ln R}^2}\right]\frac{dR}{R}, \tag{1.23}$$

where $\sigma_{\ln R}$ is the dispersion and $\bar R$ is the median radius, which is shown to increase with the luminosity. Related to the size is the observed shape, which can be ascribed to several components, for example the intrinsic shape and the weak gravitational lensing shear (see e.g. Kogai et al. (2021)).
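Eq. (1.23) is the standard log-normal density in $\ln R$ with median $\bar R$, so drawing galaxy sizes reduces to a log-normal sample, as the sketch below shows; the parameter values are illustrative placeholders, not the Shen et al. (2003) fits.

```python
import numpy as np

# Drawing galaxy radii from the log-normal size distribution of Eq. (1.23):
# ln R ~ Normal(ln R_bar, sigma_lnR^2). R_bar and sigma_lnR below are
# illustrative assumptions only.
rng = np.random.default_rng(0)
R_bar = 3.0       # median radius [kpc] (assumed)
sigma_lnR = 0.5   # dispersion in ln R (assumed)

R = rng.lognormal(mean=np.log(R_bar), sigma=sigma_lnR, size=10_000)
print(f"sample median = {np.median(R):.2f} kpc (should be ~{R_bar})")
```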

Another important structural parameter is the surface brightness (SB) profile, which depends on the surface luminosity density and is measured in magnitudes per square arcsecond in a particular filter band. The observed image of an extended astronomical object thus reflects the SB distribution, and isophotal contours (level curves of constant surface brightness) can be used to define the size of the object. Elliptical/spheroidal galaxies' SB profiles are often well fitted by the one-dimensional Sérsic profile (Sérsic, 1963),

$$\ln I(R) = \ln I_0 - kR^{1/n}, \tag{1.24}$$

where $I(R)$ is the SB, $R$ is the radial distance from the object's center, and $I_0 = I(R = 0)$. The special case when $n = 4$, which is called De Vaucouleurs's law, is often suitable. Disk galaxies are, however, often decomposed into several contributions,


one from the bulge, which is often fitted by a Sérsic profile, and one from the disc, which is often an exponential fit,

$$I(R) = I_0\exp\left(-R/R_d\right), \tag{1.25}$$

where $R_d$ is the exponential scale length.
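A minimal evaluation of the two profile families of Eqs. (1.24) and (1.25); all parameter values below are illustrative assumptions.

```python
import numpy as np

# Surface brightness profiles from section 1.7: the Sersic profile of Eq. (1.24),
# ln I(R) = ln I_0 - k R^(1/n), and the exponential disc of Eq. (1.25),
# I(R) = I_0 exp(-R/R_d).
def sersic(R, I0=1.0, k=1.0, n=4.0):
    # n = 4 recovers De Vaucouleurs's law
    return I0 * np.exp(-k * R**(1.0 / n))

def exponential_disc(R, I0=1.0, R_d=3.0):
    return I0 * np.exp(-R / R_d)

R = np.linspace(0.0, 10.0, 6)   # radial distance [kpc]
print(sersic(R))                # bulge-like profile
print(exponential_disc(R))      # disc-like profile
```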

1.7.1 Scaling Relations as Distance Indicators

Other well-known scaling relations include the fundamental plane (FP) of elliptical galaxies and the Tully-Fisher relation (TFR) for spiral galaxies, both of which can be used for distance estimations. The FP¹⁴ is an empirical relation that relates the central velocity dispersion $\sigma_0$, the effective radius $R_0$, and the renormalized surface brightness $I_0$ of the galaxy through

$$\log_{10}(R_0) = a\cdot\log_{10}(\sigma_0) + b\cdot\log_{10}(I_0) + c, \tag{1.26}$$

where $a$, $b$, $c$ are fitting parameters. As the effective radius is the only distance-dependent quantity, the observed radius can be compared with the FP-predicted one to estimate the distance. Distance estimations can also be found through two-dimensional projections of the FP, such as the Faber–Jackson relation (Faber and Jackson, 1976).

Even though spiral galaxies span a wide range of different types, the intrinsic luminosity or mass has been shown to be related to the rotation velocity¹⁵ through

$$L = AV_{\max}^{\alpha}, \tag{1.27}$$

where $A$ and $\alpha$ are the zero-point and slope, respectively (Tully and Fisher, 1976). This TFR has been an essential part of the cosmic distance ladder since luminosity and luminosity distance are related as $L \propto d_L^2$.

¹⁴ See Saulder et al. (2013) and Singh et al. (2020) for a more thorough explanation.
¹⁵ This is usually taken as the maximum velocity $V_{\max}$ of the rotation curve, in which the orbital speed of visible stars or gas is plotted against their radial distance from the center of the galaxy.
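A schematic sketch of how the TFR yields a distance: Eq. (1.27) fixes the intrinsic luminosity, and the measured flux then gives $d_L$ through Eq. (1.21). The calibration constants, measured values, and units below are invented for illustration only.

```python
import numpy as np

# Tully-Fisher distance sketch: L = A * V_max^alpha (Eq. 1.27), then
# F = L / (4 pi d_L^2) inverted for d_L (Eq. 1.21). All numbers are
# illustrative assumptions, not a real calibration.
A, alpha = 3.0e5, 4.0   # assumed TFR zero-point and slope (L in L_sun, V in km/s)
V_max = 220.0           # measured rotation velocity [km/s]
F = 2.2e10              # measured flux [L_sun / Mpc^2] (assumed)

L = A * V_max**alpha                 # intrinsic luminosity from the TFR
d_L = np.sqrt(L / (4 * np.pi * F))   # luminosity distance [Mpc]
print(f"d_L = {d_L:.1f} Mpc")        # ~ 50 Mpc with these assumed numbers
```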


1.8 Upcoming Astronomical Surveys

The next generation of astronomical instruments will generate enormous amounts of image data. Until today, the Sloan Digital Sky Survey (SDSS) has produced the most detailed 3D maps of the Universe, including multi-color images of one third of the sky (see e.g. Abazajian et al. (2009)). With an expected operation start in 2023 (Acquaviva, 2019), the Legacy Survey of Space and Time (LSST), undertaken at the Vera C. Rubin Observatory, a ground-based observatory in Chile, will produce the deepest and widest images of the Universe, capturing more than 30 billion objects, of which $\sim 18$ billion will be galaxies (Robertson et al., 2019; Acquaviva, 2019). The main objectives are to study galaxy formation and evolution, and dark energy. Other large galaxy surveys are also coming through Euclid (launching in 2023), DESI (2021), the Nancy Grace Roman Space Telescope (formerly WFIRST), the SPHEREx mission, and the James Webb Space Telescope (JWST).

1.9 What is Machine Learning?

Historically, and for long, astronomers had to study the world above us without access to high-resolution telescopes and detectors, large data-volume storage, and data-reduction techniques. The management of large datasets, which previously was tedious and limited due to manual work, is nowadays less complicated and less time-demanding. Facilitated by computers, we can now handle larger quantities of data streaming in from an extended coverage of the light spectrum, alongside improvements in depth and resolution (see e.g. the Hubble Deep Field in Figure 1.4). Recent years have shown a concerted research effort in leveraging machine learning techniques to handle the immense quantities of data that naturally follow the advancements in observational astronomy.

Figure 1.4: A section of the Hubble Ultra Deep Field (2006). After a total exposure time of over 11 days, the Hubble Space Telescope (HST) and its ACS/WFC camera had captured almost 10,000 galaxies from different cosmological distances and epochs. The field of view displayed is a mere fraction, $\sim 1/30{,}000{,}000$, of the whole sky; in other words, the amount of galaxy data that can be gathered is enormous. Credit: NASA, ESA, and S. Beckwith (STScI) and the HUDF Team.

The quest for finding patterns in, and extracting information from, data is a cornerstone of science. The specific field of pattern recognition deals with the automatic discovery of regularities, which in turn can be used, for example, to classify the data according to various features (see more in Bishop (2006)). Machine Learning (ML) is a branch of data analysis, artificial intelligence, and statistics, which is concerned with developing systems that with minimal human intervention can learn to recognize features and patterns from example data, and based on this, make informative decisions. In general, ML algorithms often make effective use of big data to increase the performance. ML is, however, a quite broad term, and can therefore be divided into several sub-categories:

• Supervised learning deals with the case when the machine/algorithm is provided with input examples with corresponding labelled outputs.

• Unsupervised learning deals with the case when the machine is free to find patterns and clusters in the data on its own.

• Deep learning, which can be both supervised or unsupervised, is a branch of

ML that builds on the use of artificial neural networks with the goal to uncover

the underlying features in the data.


Figure 1.5: An artificial neural network. The nodes $n^{(i)}_{1:N}$ in layer $i$ are connected with the nodes in the adjacent layers through learnable weights. This type of NN, in which all nodes from adjacent layers are connected, is called a multilayer perceptron (MLP), and the layers are called dense layers.

An artificial neural network (NN), which is a network built up of many nodes (called neurons) aggregated into layers that interact with each other, can show up in all of these categories. As in Fig. 1.5, there is always an input and an output layer of a NN, while the number of so-called hidden layers can vary from 0 to many (in the latter case, we go into the regime of deep learning). More specifically, the NN consists of a system of linear combinations of the form

$$x_i^{(j)} = f\left(\sum_{k=1}^{K} w_{k,i}^{(j-1)}\, x_k^{(j-1)}\right), \tag{1.28}$$

where $x_i^{(j)}$ denotes node $i$ in layer $j$, and $w_{k,i}^{(j-1)}$ are learnable weights between node $k$ in layer $j-1$ and node $i$ in layer $j$. In addition, $f$ is a non-linear activation function, usually a ReLU, sigmoid, or tanh, which is needed to introduce non-linearity into the network. The learnable weights are learned by backpropagation through the network, using stochastic gradient descent on some loss function. The use of neural networks is motivated by the universal approximation theorem, which states that "all continuous functions defined on a compact set can be arbitrarily well approximated with a neural network with one hidden layer" (Csáji, 2001).
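A minimal NumPy forward pass implementing Eq. (1.28) layer by layer with a ReLU activation; the layer sizes and weight initialization are arbitrary, and no training (backpropagation) is included.

```python
import numpy as np

# Dense-layer forward pass, Eq. (1.28): x_i^(j) = f( sum_k w_{k,i}^(j-1) x_k^(j-1) ).
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def forward(x, weights):
    # weights[j] has shape (nodes in layer j, nodes in layer j+1);
    # the matrix product computes Eq. (1.28) for a whole layer at once.
    for W in weights:
        x = relu(x @ W)
    return x

layers = [8, 16, 16, 4]   # input, two hidden layers, output (arbitrary sizes)
weights = [rng.normal(0, 0.1, size=(m, n)) for m, n in zip(layers, layers[1:])]
print(forward(rng.normal(size=8), weights))   # output-layer activations
```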


A particular family of methods within deep learning is deep generative models, where the framework of deep neural networks is utilized to create a generative model. This unsupervised machine learning technique has been tremendously popular lately¹⁶, where the Variational Auto-Encoder (VAE) (Kingma and Welling, 2014) and the Generative Adversarial Network (GAN) (Goodfellow et al., 2014) are among the most widely used and most efficient approaches. With the objective to comprehend the true data distribution, one utilizes neural networks to train the machine on examples. By doing so, complex systems can be represented without having to explicitly write down the mathematical expressions that describe them. In turn, the hope is that the approximate model distribution captures the essential features to the extent that new, artificial data resembling the example data can be generated. These essential features are often contained within a low-dimensional latent space $Z$, with latent variables $z \in Z$ from which the data $x$ can be generated through the generative model $D_g$, as $x = D_g(z)$.

The abundance of galaxy imaging data has been used for a wide range of applications, of which galaxy morphology classification was one of the first to be tested (De La Calleja and Fuentes, 2004; Dieleman et al., 2015). More recently, the usefulness of generative models has been demonstrated in several areas, and not only for generic image simulations that can augment real galaxy datasets (Fussell and Moews, 2019; Smith and Geach, 2019; Buncher et al., 2021). Complex tasks like deblending (Reiman and Göhre, 2019; Arcelin et al., 2020), deconvolution (Schawinski et al., 2017), denoising (Lanusse et al., 2021), and generation of high-resolution galaxy images have also been successfully implemented (Fussell and Moews, 2019).

High-resolution images can be used for multiple purposes, such as for calibration of galaxy shape measurement algorithms (which in turn can be used in weak-lensing measurements) (Ravanbakhsh et al., 2016).

¹⁶ There has, in fact, been an exponential growth in the number of published papers in astronomy/cosmology with the words Machine Learning in the title or abstract (Acquaviva, 2019).


Chapter 2

Distance Estimation

A cosmological redshift-distance test requires both the redshift and the distance. Taking the redshift as given (having been measured by spectrographs), there is a need to develop an estimate of the distance. Assuming a flat Universe ($K = 0$) and using recorded images of galaxies, we seek to develop a Bayesian inference approach to infer their distances. Then, having $r(z, \Omega)$ and $z$, we can constrain cosmological parameters through Eq. (1.22). The method should make use of two main effects: intrinsic features of galaxies that correspond to specific physical sizes, which will be discussed in section 2.1, and the projection of photons, emitted by galaxies, onto the photo detector (CCD). The distance estimation algorithm will be presented in section 2.2, and the camera model used in this thesis for projections will be discussed in section 2.3.

2.1 Motivation for using Galaxy Images

Based on the discussion on the cosmological model of the Universe in section 1.2 and on hierarchical structure formation in section 1.6, we restate:

• The physics in the Universe should everywhere be the same.

• When a massive object forms, such as a galaxy, it creates a deep gravitational potential and becomes decoupled from the background cosmological evolution.

• Since all galaxies form from fixed physical laws in such potentials they should all share similar intrinsic features, which contain cosmological information.


Although no two galaxies look exactly alike, they share morphological features. By unravelling the underlying joint distribution over all features, we shall be able to infer the distance to galaxies with uncertainties. In essence, we will use probabilistic triangulation with galaxies treated as standard rulers, but with the additional flexibility that we allow these rulers to vary in size and other visual properties according to a prior distribution over these features.

As agnostically as possible, we seek to extract all relevant features of galaxies that could be used to determine their distances. Rather than going in by hand and asking what the specific length-scale features of galaxies are, we claim that this statistical information is imprinted into the images of galaxies. We can advantageously use an unsupervised machine learning approach, in the form of training a deep generative model, to find the intrinsic features of galaxies for us, thus evading the explicit inclusion of various scaling relations. It is expected that a much more detailed extraction of the low-parameter relations presented in section 1.7, such as the Sérsic profile, the fundamental plane, the Tully-Fisher relation, and the size distribution function, can be obtained through ML.

From both state-of-the-art simulations and observations we have access to images of galaxies, and soon we will have access to an even larger amount of such image data (see section 1.8). These large datasets can be used for two purposes:

(i) To improve the training of deep generative models that infer the distributions of intrinsic features of galaxies.

(ii) To reduce the uncertainty in average distance estimations to galaxies of various redshifts.

2.2 Probabilistic Triangulation with Generative Model

We seek to jointly infer from recorded, coarse galaxy images $y$ an image of higher resolution $x$ and the distance $r$ to the galaxy. The corresponding posterior distribution is

$$\pi(x, r\,|\,y) = \frac{\pi(x, r)\,\pi(y\,|\,x, r)}{\pi(y)} = \frac{\pi(x)\,\pi(r)\,\pi(y\,|\,x, r)}{\pi(y)}, \tag{2.1}$$


where we will assume the following:

$$\begin{cases} \pi(r) = 1, & \text{a uniform prior,}\\[4pt] \pi(x), & \text{a high-resolution galaxy image prior}^1,\\[4pt] \pi(y\,|\,x, r) = \dfrac{\exp\left(-\frac{1}{2}\sum_p \frac{(y_p - C_p(x, r))^2}{\sigma_p^2}\right)}{\prod_p \sqrt{2\pi\sigma_p^2}}, & \text{the likelihood of a galaxy imaging observation,} \end{cases} \tag{2.2}$$

where $C_p(x, r)$ is a camera model taking the high-resolution image $x$ and the comoving distance $r$ as inputs, and returning the $p$-th pixel of the CCD (camera sensor). To be agnostic, we set the prior distribution over the distance $r$ to a uniform distribution. For the image prior we have

$$\pi(x) = \int dz\, \pi(z)\, \delta_k\big(x - D_g(z)\big), \tag{2.3}$$

with $D_g(z)$ being a generative model, and $z$ being the $k$-dimensional latent vector from which high-resolution images can be generated (this will be discussed in detail in Chapter 3). In this way, $\pi(z)$ encodes the intrinsic feature distribution in a lower-dimensional space than $\pi(x)$. By writing the posterior as $\pi(x, z, r\,|\,y)$ and marginalizing out $x$ (see Appendix B), we turn the posterior into

$$\pi(z, r\,|\,y) \propto \pi(z)\,\frac{\exp\left(-\frac{1}{2}\sum_p \frac{(y_p - C_p(D_g(z), r))^2}{\sigma_p^2}\right)}{\prod_p \sqrt{2\pi\sigma_p^2}}. \tag{2.4}$$

Since we are not interested in the latent vector $z$ we can marginalize it out, which gives

$$\pi(r\,|\,y) = \int dz\, \pi(z, r\,|\,y) \propto \int dz\, \pi(z)\,\frac{\exp\left(-\frac{1}{2}\sum_p \frac{(y_p - C_p(D_g(z), r))^2}{\sigma_p^2}\right)}{\prod_p \sqrt{2\pi\sigma_p^2}}. \tag{2.5}$$

We can then make a Monte Carlo approximation $\pi(z) \approx \frac{1}{M}\sum_{m=1}^{M} \delta_k(z_m - z)$, which inserted into the integral Eq. (2.5) yields

$$\hat\pi(r\,|\,y) = \frac{1}{M}\sum_m \frac{\exp\left(-\frac{1}{2}\sum_p \frac{(y_p - C_p(r, D_g(z_m)))^2}{\sigma_p^2}\right)}{\prod_p \sqrt{2\pi\sigma_p^2}}, \tag{2.6}$$

¹ This is the intrinsic feature distribution.


where $\hat\pi \approx \pi$ has a dependence on the selected $z$-ensemble $\{z_{1:M}\}$. Finding the optimal distance can then be achieved by optimizing the posterior with respect to $r$ and $z_{1:M}$,

$$r_{\text{est}} = \arg\max_r \hat\pi(r\,|\,y). \tag{2.7}$$

Using an ideal noise-free camera model (introduced in section 2.3), the loss between $y$ and $C(x, r)$ can be taken as their resemblance through the pixel-wise mean squared error (MSE), that is

$$\mathcal{L}(y, x, r) = \frac{1}{P}\sum_{p=1}^{P}\big(y_p - C_p(x, r)\big)^2, \tag{2.8}$$

which corresponds to $\sigma_p = \sqrt{P/2}$ in Eq. (2.6), so that

$$\hat\pi(r\,|\,y) \approx \frac{1}{M}\sum_m \frac{e^{-\mathcal{L}(r, z_m)}}{(\pi P)^{P/2}} \propto \sum_m e^{-\mathcal{L}(r, z_m)}. \tag{2.9}$$
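A compact sketch of the Monte Carlo posterior of Eqs. (2.8)-(2.9) evaluated on a distance grid; `camera` and the generated images stand in for the camera model $C$ and the decoder $D_g$, which are hypothetical placeholders here.

```python
import numpy as np

# Monte Carlo estimate of the distance posterior, Eqs. (2.8)-(2.9):
# pi_hat(r|y) ∝ sum_m exp(-L(r, z_m)), with L the pixel-wise MSE between the
# observation y and the camera projection C(D_g(z_m), r).
def mse_loss(y, x_proj):
    return np.mean((y - x_proj)**2)   # Eq. (2.8)

def posterior_on_grid(y, gals_hi, camera, r_grid):
    # gals_hi: list of generated high-resolution images x_m = D_g(z_m);
    # camera(x, r): placeholder projection to distance r.
    log_post = np.array([
        [-mse_loss(y, camera(x, r)) for r in r_grid]   # -L(r, z_m)
        for x in gals_hi
    ])
    post = np.exp(log_post - log_post.max()).sum(axis=0)  # sum over ensemble
    return post / post.sum()   # normalized pi_hat(r|y) on the grid

# r_est then follows from Eq. (2.7):
# r_est = r_grid[np.argmax(posterior_on_grid(y, gals_hi, camera, r_grid))]
```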

The distance estimation algorithm for obtaining $r$ in Eq. (2.7) is summarized in Algorithm 1. After randomly sampling latent variables $z_1, \dots, z_N$ and generating the corresponding $N$ artificial high-resolution galaxies using a deep generative model (which will be introduced in Chapter 3), we project these to different distances until we find the $r_{i,\text{opt}}$ that minimizes the loss $\mathcal{L}_i$. In order to project the generated galaxies to different distances and optimize over $r$, there is a need for a camera/telescope model, which we will introduce next in section 2.3.

As a first step of optimizing over the $\{z_{1:M}\}$-ensemble we then perform sieving. This means that we pick out the $M \ll N$ generated galaxies that yield the lowest loss. The reason for keeping several galaxies is that there is no unique standard size of a galaxy; there can be multiple high-resolution images that, after projection to different distances, match well with the observed galaxy. Subsequently, we alternately optimize over $r_i$ and $z_i$ (which can be done with the methods introduced in sections 2.3.1 and 2.3.2) for each sieved galaxy and then construct the joint distribution $\hat\pi(r\,|\,y)$, whose maximum yields the estimated distance $r_{\text{est}}$. As long as $\Delta r_{\text{est}} \equiv |r_{\text{est},j} - r_{\text{est},j-1}| > r_{\text{tol}}$ between iterations $j$ and $j-1$, the process continues.


Algorithm 1: Distance Estimation for $y$ using generative model $D_g$.

1.  Randomly sample $z_i$, $i = 1, \dots, N$.
2.  Generate high-resolution galaxies $x_i$, $i = 1, \dots, N$, using $x_i = D_g(z_i)$.
3.  while $\Delta r_{\text{est}} > r_{\text{tol}}$ do
4.      for each $x_i$ do
5.          r-step: Find $r_{i,\text{opt}}$ so that $\mathcal{L}_i(y, x_i, r_{i,\text{opt}})$ is minimized.
6.          z-step: Optimize $z_i$ using the current $r_{i,\text{opt}}$. Then re-generate $x_i$ with $D_g(z_i^{\text{new}})$.
7.      end
8.      if first iteration then
9.          Sieve: only keep the $M \ll N$ galaxies with lowest loss $\mathcal{L}_i(y, x_i, r_{i,\text{opt}})$ for future iterations.
10.     end
11.     $r_{\text{est}} \leftarrow \arg\max_r \hat\pi(r\,|\,y)$.
12. end
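Below is a control-flow sketch of Algorithm 1 under stated assumptions: `decoder`, `camera`, `optimize_r`, and `optimize_z` are hypothetical callables standing in for $D_g$, $C$, the r-step (e.g. a golden section search), and the z-step; only the loop structure follows the algorithm.

```python
import numpy as np

# Sketch of Algorithm 1. Assumed interfaces (placeholders, not the thesis code):
#   decoder(z) -> high-resolution image x            (the generative model D_g)
#   optimize_r(y, x, camera) -> (r_opt, loss)        (r-step)
#   optimize_z(y, z, r_opt, decoder, camera) -> z    (z-step)
def estimate_distance(y, decoder, camera, optimize_r, optimize_z,
                      N=1000, M=20, latent_dim=32, r_tol=1.0):
    rng = np.random.default_rng(0)
    zs = [rng.normal(size=latent_dim) for _ in range(N)]   # 1: sample z_i
    xs = [decoder(z) for z in zs]                          # 2: generate x_i

    r_est, first = np.inf, True
    while True:                                            # 3: until converged
        rs, losses = [], []
        for i, x in enumerate(xs):
            r_opt, loss = optimize_r(y, x, camera)                 # 5: r-step
            zs[i] = optimize_z(y, zs[i], r_opt, decoder, camera)   # 6: z-step
            xs[i] = decoder(zs[i])                                 # re-generate
            rs.append(r_opt); losses.append(loss)
        if first:                                          # 8-9: sieve best M
            keep = np.argsort(losses)[:M]
            zs = [zs[i] for i in keep]; xs = [xs[i] for i in keep]
            first = False
        # 11: stand-in for r_est = argmax_r pi_hat(r|y), see Eq. (2.7)
        r_new = rs[int(np.argmin(losses))]
        if abs(r_new - r_est) <= r_tol:                    # Delta r_est <= r_tol
            return r_new
        r_est = r_new
```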

2.3 Cosmological Pinhole Camera Model

The pinhole model is one of the simplest camera models and consists of a hole through which light rays from an object can pass. This results in an inverted image of the light source behind the hole, and it can therefore be more convenient to work with a virtual image plane located in front of the hole, as shown in Fig. 2.1. Despite its simplicity (e.g. lacking a lens or system of lenses), the pinhole model often provides a valid approximation of the true imaging process (Forsyth and Ponce, 2003) and was especially developed as a model for CCD-like sensors (Hartley and Zisserman, 2001). This camera model is used in this work for demonstration purposes and can be extended to more complex models in future work.

The use of a deterministic and controllable camera model enables us to keep track of the transformations from physical sizes to the recorded sizes on the CCD detector. The orientation and location of the camera reference frame can be assumed to be aligned with the world reference frame, so that no extrinsic parameters need to be considered. Intrinsic parameters, on the other hand, are needed to link the pixel coordinates to the coordinates in the camera reference frame.

Figure 2.1: Pinhole camera model. Despite its simplicity, this camera model approximates CCD-like sensors well, and works by letting light pass through a small hole. The galaxy is captured on an image plane behind the hole, but a virtual image plane in front of the hole is often more convenient to work with.

A 3D point $P$ can be written as $[X, Y, Z, 1]$, where $Z = r$ (note that we can always align the Cartesian coordinate system so that the $Z$-axis points directly towards the observed galaxy, with radial comoving coordinate $r$) in a homogeneous coordinate system, and its image $p$ as $[x', y', z']$, with the 2D image point being given by

$$(x, y) = \left(\frac{x'}{z'}, \frac{y'}{z'}\right). \tag{2.10}$$

As a consequence, we have a projective equivalence relation through $(x, y) = (x, y, 1) = (kx, ky, k)\ \forall k \in \mathbb{R}\setminus\{0\}$. The projected coordinates, also called the perspective projection equations, derived from the similar-triangles rule, are

$$x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}, \tag{2.11}$$

where $f$ is the focal length of the camera/telescope. Now, in a cosmological context, we need to account for the changes in angular diameter with redshift $z$, which affects the transverse observed size of an astronomical object (remember section 1.4.1). Since $Z = r = D_C$ (remember section 1.4) is the comoving distance, but we observe the angular size corresponding to the angular diameter distance $d_A = r/(1+z)$ (Eq. 1.19), the perspective projection has to be performed with $Z$ replaced by $d_A$.
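The source text breaks off in the middle of this derivation, so the following is one plausible reading of the cosmological correction it sets up: project with the angular diameter distance $d_A = r/(1+z)$ in place of $Z$ in Eq. (2.11). The focal length, pixel size, and example numbers are assumptions for illustration.

```python
import numpy as np

# Perspective projection of Eq. (2.11), x = f X/Z, y = f Y/Z, with the
# cosmological correction of section 1.4.1: an object at comoving distance r
# subtends the angle of an object at d_A = r/(1+z), so Z -> d_A (assumed
# reading of the truncated derivation). f and pixel_size are illustrative.
def project(X, Y, r, z, f=10.0, pixel_size=1e-5):
    d_A = r / (1.0 + z)                 # Eq. (1.19)/(1.20), flat Universe
    x, y = f * X / d_A, f * Y / d_A     # Eq. (2.11) with Z replaced by d_A
    return x / pixel_size, y / pixel_size   # detector (pixel) coordinates

# Example: a 10 kpc transverse offset at r = 1000 Mpc, z = 0.25 (units in Mpc).
print(project(X=0.01, Y=0.0, r=1000.0, z=0.25))
```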
