
Department of Science and Technology
Linköping University
SE-601 74 Norrköping, Sweden

LiU-ITN-TEK-A--18/034--SE

Interactive out-of-core rendering and filtering of one billion stars measured by the ESA Gaia mission

Adam Alsegård


Master's thesis in Media Technology carried out at the Institute of Technology, Linköping University

Adam Alsegård

Supervisor: Emil Axelsson

Examiner: Anders Ynnerman


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/


Linköping University | Department of Science and Technology
Master's Thesis | Media Technology and Engineering
Spring 2018 | LiU-ITN-TEK-A--18/034--SE


Linköping University, SE-601 74 Norrköping, Sweden, 013-28 10 00, www.liu.se


Abstract

The purpose of this thesis was to visualize the 1.7 billion stars released by the European Space Agency as the second data release (DR2) of its Gaia mission in the open source software OpenSpace, at interactive framerates, and to be able to filter the data in real-time. An additional implementation goal was to streamline the data pipeline so that astronomers could use OpenSpace as a visualization tool in their research.

An out-of-core rendering technique has been implemented where the data is streamed from disk during runtime. To be able to stream the data it first has to be read, sorted into an octree structure and then stored as binary files in a preprocess. The results of this report show that the entire DR2 dataset can be read from multiple files in a folder and stored as binary values in about seven hours. This step determines which values the user will be able to filter by, and it only has to be done once for a specific dataset. An octree can then be created in about 5 to 60 minutes, where the user can define whether the stars should be filtered by any of the previously stored values. Only values used in the rendering will be stored in the octree. If the created octree fits in the computer's working memory, the entire octree will be loaded asynchronously on start-up; otherwise only a binary file with the structure of the octree will be read during start-up, while the actual star data will be streamed from disk during runtime.

When the data has been loaded it is streamed to the GPU. Only stars that are visible are uploaded, and the application also keeps track of which nodes have already been uploaded, to eliminate redundant updates. The inner nodes of the octree store the brightest stars among all their descendants as a level-of-detail cache that can be used when the nodes are small enough in screen space.

The previous star rendering in OpenSpace has been improved by dividing the rendering phase into two passes. The first pass renders into a framebuffer object, while the second pass performs a tone-mapping of the values. The rendering can be done either with billboard instancing or with point splatting; the latter is generally the faster alternative. The user can also switch between VBOs and SSBOs when updating the buffers. The latter is faster but requires OpenGL 4.3, which Apple products do not currently support.

The rendering runs at interactive framerates on both flat and curved screens, such as domes/planetariums. The user can also switch datasets during rendering, as well as render technique, buffer objects, color settings and many other properties. It is also possible to turn time on and see the stars move with their calculated space velocity, or transverse velocity if the star lacks radial velocity measurements. The calculations omit gravitational rotation.

The purpose of the thesis has been fulfilled, as it is possible to fly through the entire DR2 dataset on a moderate desktop computer and filter the data in real-time. However, the main contribution of the project may be that the groundwork has been laid in OpenSpace for astronomers to actually use it as a tool for visualizing their own datasets, and for continuing to explore the coming Gaia releases.

Keywords: out-of-core rendering, large-scale visualization, hierarchical octree, GPU streaming, real-time filtering.


Acknowledgments

This thesis would not have been the same without the help and enthusiasm of a number of people, to whom I would like to express my gratitude.

First and foremost I have to thank my supervisor Emil Axelsson for all of our technical discussions, for letting me exploit your illustration skills, for being my travel companion and especially for the last few weeks of the project when we worked hard to get the rendering to work in a dome. I owe you a lot!

Secondly I would like to thank Anders Ynnerman for proposing the project in the first place and for enabling me to go and meet astronomers both in Vienna and in New York.

I would also like to thank Jacqueline Faherty and Brian Abbott at the American Museum of Natural History (AMNH) for inviting me to New York, for your willingness to use OpenSpace for the Gaia Sprint event at AMNH, and for being fantastically enthusiastic and generous hosts; João Alves and Torsten Möller at the University of Vienna for your initial input to the project and for sharing your enthusiasm for Gaia and for the possibility of getting a new visualization tool to work with; Marie Rådbo for teaching me about space in general and for cheering me on; Alexander Bock and the rest of the OpenSpace developer team for helping me prepare for the event at AMNH and for letting me be a part of the team; and Patrik Ljung, Karljohan Lundin Palmerius, Kristofer Krus and everybody else at the Visualization Center C in Norrköping for your input and for treating me as your colleague for the last five months.

Finally I would like to thank my partner Rebecca and all the rest of my friends and family who have supported me during this time. I love you all!

Norrköping, June 15, 2018

Adam Alsegård


Contents

Acronyms
List of Figures
List of Tables
1 Introduction
1.1 Background
1.1.1 OpenSpace
1.1.2 The Gaia mission
1.2 Objective
1.2.1 Implementation goals
1.2.2 Research questions
1.3 Limitations
1.4 Delimitations
2 Related work
2.1 Large-scale simulations
2.2 Visualization techniques
2.3 Visualization software
3 Visualize the Gaia mission
3.1 The mission
3.2 The spacecraft
3.3 The orbit
3.4 The releases
3.4.1 Conversions
3.4.2 Calculating velocity
4 Render 1.7 billion stars
4.1 System overview
4.2 Read the data
4.2.1 Read a single file
4.2.2 Read multiple files
4.2.3 Pre-process the data
4.3 Construct an Octree
4.3.1 Octree structure
4.3.2 Offline filtering
4.4 Data streaming
4.4.1 Update the buffers
4.4.2 VBO and SSBO
4.4.3 Removing chunks from the buffer
4.4.4 Rebuilding the buffer
4.4.5 Stream from files
4.5 Render techniques
4.5.1 Render modes
4.5.2 Billboard instancing
4.5.3 Point splatting
4.5.4 Real-time filtering
4.5.5 Render in a dome
5 Results
5.1 Reading from multiple files
5.2 Construction of the octree
5.3 Rendering
5.4 NYC Gaia Sprint 2018
6 Analysis and Discussion
6.1 Method
6.2 Implementation
6.3 Results
6.4 Software comparison
6.5 Source criticism
6.6 The work in a wider context
7 Conclusion
7.1 Future work


Acronyms

AABB Axis-Aligned Bounding Box
AMNH The American Museum of Natural History
AU Astronomical Unit
CCA Center for Computational Astrophysics
CPU Central Processing Unit
CSV Comma-Separated Value
Dec Declination
DPAC Gaia Data Processing and Analysis Consortium
DR1 Gaia Data Release 1
DR2 Gaia Data Release 2
ESA European Space Agency
FBO Framebuffer Object
FITS Flexible Image Transport System
FPS Frames Per Second
GPU Graphics Processing Unit
ICRS International Celestial Reference System
L2 Second Lagrange point
LiU Linköping University
LOD Level-Of-Detail
MPI Message Passing Interface
NASA National Aeronautics and Space Administration
NYU New York University
RA Right Ascension
RAM Random-Access Memory
SCI University of Utah Scientific Computing and Imaging Institute
SGCT Simple Graphics Cluster Toolkit
SSBO Shader Storage Buffer Object
TGAS Tycho-Gaia Astrometric Solution
UBO Uniform Buffer Object


List of Figures

2.1 The TGAS dataset of 2 million stars visualized in 3D with TOPCAT.
2.2 The TGAS dataset of 2 million stars mapped as the night sky with TOPCAT.
3.1 Illustration of how a parallax angle is determined.
3.2 An artist's rendition of the Gaia spacecraft for the Paris Air Show 2013.
3.3 The Gaia spacecraft model rendered in OpenSpace. The model has been rotated before this screenshot so that the sun will brighten up the instrument.
3.4 Trail lines of the Gaia spacecraft rendered in OpenSpace. The shown trajectory is with respect to Earth's position in space.
3.5 Illustration of how the space velocity vector can be broken up into a transverse velocity and a radial velocity.
4.1 Illustration of the data pipeline when reading a dataset from multiple files, such as the full DR2.
4.2 Illustration of the data pipeline when reading a dataset from a single file.
4.3 Illustration of how the size of the octree depends on the maximum number of stars in each node and the initial extent of the octree.
4.4 Illustration of which nodes are eligible for streaming to the GPU as the camera rotates. The red (striped) nodes are already uploaded to the GPU and will not be updated. The blue (clear) nodes are no longer visible and will be removed from the GPU, with their indices being returned to the index stack on the next render call. The green (circle) nodes become visible and will be uploaded to the GPU. If the node with buffer index 88 is smaller in screen space than a set threshold, it will return its LOD cache instead of traversing any further.
4.5 Illustration of how the SSBO buffers are updated in a single draw call. The traversal adds a node with buffer index 2 and removes the node with index 3. First the index buffer is updated linearly. The index buffer keeps track of the accumulated sum of stars in the data buffer. The numbers of new and removed stars are added and propagated through the remaining buffer. Thereafter the data buffer is updated with the actual new data. The buffer index of the node is used to determine where in the buffer the data should be written.
4.6 Illustration of which nodes will be fetched around the camera initially. The cyan nodes are children of the neighboring nodes on the same level as the inner node (red) that contains the camera. The blue nodes are children of neighboring nodes of the second parent, while orange nodes are the same for the third layer of neighboring parent nodes. By default this means that 632 nodes will be fetched around the camera, with the possibility for the user to add more layers.
4.7 The TGAS subset rendered as Static. Here all stars are assumed to have the same luminosity.
4.8 The TGAS subset rendered as Color. The luminosity is calculated from the stars' absolute magnitude.
4.9 Extreme over-exposure effect while rendering billboards.
4.10 Stars rendered as billboards with an excessive initial size and close-by stars boosted even further.
4.11 Stars rendered as points with a filter size of 19 and a sigma of 2.0, which gives an excessive effect.
4.12 The radial velocity subset with 7.2 million stars rendered in Static mode with all stars enabled.
4.13 The radial velocity subset rendered in Static mode with all stars without parallax filtered away.
4.14 Darkening effect when rendering in fish-eye mode (i.e. on a curved screen).
4.15 Illustration of how the scale factors for σ_major and σ_minor are calculated.
5.1 Render statistics with 1.1 million stars visible and a screen resolution of 1280x720.
5.2 Render statistics for the radial velocity dataset with a screen resolution of 1920x1200.
5.3 Performance for different filter sizes with and without scaling of the filter enabled. 2 million stars visible on a flat screen with 1920x1200 resolution.
5.4 Photograph taken at the Gaia Sprint show at AMNH.
6.1 Block structure that can occur when storing the brightest stars as LOD cache.
6.2 Highlighting the visual artifact when rendering the full DR2 dataset of 1.7 billion stars.
6.3 Gaia Sky running their largest dataset with a background image of the Milky Way, with maximum star brightness used.
6.4 Gaia Sky running their largest dataset without a background image of the Milky Way, with maximum star brightness used.
6.5 61 million visible stars rendered in OpenSpace with a background image of the Milky Way.
6.6 61 million visible stars rendered in OpenSpace without a background image of the Milky Way.


List of Tables

5.1 Statistics while running ReadFitsTask for the full DR2 with 1.7 billion stars.
5.2 Statistics while running ReadFitsTask for a random subset from DR2 with 42.9 million stars.
5.3 Statistics while running ReadFitsTask for the radial velocity subset from DR2 with 7.2 million stars.
5.4 Statistics while running ConstructOctreeTask for three different DR2 datasets.
5.5 Statistics while running ConstructOctreeTask for DR2, filtering both bright and dim stars with a parallax error less than 0.9.
5.6 Performance while rendering a subset with 618 million stars on a flat screen with 1920x1200 resolution.
5.7 Statistics while rendering different large datasets on a flat 1920x1200 screen.


Chapter 1

Introduction

Humanity’s drive to explore goes back to the beginning of our civilization. It began with exploring our closest environment but expanded quickly, and once most of the Earth had been discovered we turned our gaze to the stars. While there is still a lot we do not understand about our own planet, there is even more knowledge hidden in space.

This thesis is part of that drive to explore. The main focus of the project is to work with a dataset of about 1.7 billion stars released by the European Space Agency (ESA) as part of their Gaia mission and to develop tools that astronomers around the world can use to explore this dataset. With help from these tools the astronomers are hopefully able to discover new knowledge for us all to share.

This report will discuss which optimization techniques can be used to work with, and render, a dataset of that magnitude in real-time. The techniques were implemented in the open source software OpenSpace1.

1.1

Background

This thesis project stems from a collaboration between professors at Linköping University and at the University of Vienna. The scientists in Vienna were already working with data from ESA's Gaia mission but were interested in better visualization tools to explore the dataset. At the same time, a group at Linköping University had been working for a few years on software to visualize the cosmos, but had not yet incorporated data from the Gaia mission. That is where this thesis comes into play.

1.1.1

OpenSpace

The visualization software developed partly at Linköping University is called OpenSpace, an open source interactive data visualization software designed to visualize the entire known universe and portray humanity's ongoing efforts to investigate the cosmos [11]. It is designed to support both personal computers and domes/planetariums that use a cluster of computers and projectors. The aim is both to serve as a platform for scientists chasing new discoveries and for museums working with public outreach. The software is implemented in C++17 and OpenGL 3.3 and above. OpenSpace supports multiple operating systems and interactive presentations of dynamic data, and enables simultaneous connections across the globe. The development is a collaboration between Linköping University (LiU), The American Museum of Natural History (AMNH), the National Aeronautics and Space Administration (NASA), New York University (NYU) and the University of Utah Scientific Computing and Imaging Institute (SCI).

1 http://openspaceproject.com/

1.1.2

The Gaia mission

On December 19, 2013, ESA launched the Gaia instrument into space with the objective to "measure the position, distances, space motions and many physical characteristics of some one billion stars in our Galaxy and beyond" [1]. About two years later, in September 2016, ESA released Gaia Data Release 1 (DR1) to the public, with 1.1 billion point sources based on observations during the first 14 months of the mission. Out of those sources "only" 2 million counted as part of the Tycho-Gaia Astrometric Solution (TGAS), or were referred to as primary sources. The rest were merely seen as placeholders until the next release, because they either lacked a number of parameters, including parallax, or had high uncertainties in the measurements.

On April 25, 2018, Gaia Data Release 2 (DR2) went public. In this update, 1.7 billion point sources have their measured galactic position in space and G-band magnitude. A large portion also have photometry, parallaxes and proper motions, as well as radial velocities for about 7.2 million stars2. A third release will happen in 2020, before the final release of the catalogue at the end of 2022. A more in-depth explanation of the Gaia mission and its releases can be found in Chapter 3.

1.2

Objective

The work of this thesis has two main objectives regarding the Gaia mission. The first is to enable interaction with the Gaia DR2 dataset for public outreach purposes, and the second is to develop tools that scientists can use in their research while exploring the same dataset. Each objective can be divided into a few sub-goals.

1.2.1

Implementation goals

The following are the implementation goals for the public outreach objective:

• Visualize the Gaia mission and how the instruments measure the stars.

• Display the stars at their measured position in 3D space and render them physically correct with regard to brightness and size.

• Be able to switch rendering technique during runtime so the full extent of the measurements can be appreciated.

• Put the new dataset into context with previous knowledge, such as the constellations, and integrate it with the rest of the OpenSpace software.

The implementation goals for the research-focused objective are as follows:

• Be able to load the full DR2 dataset, or a chosen subset, and render it in 3D with interactive framerates.

• Be able to re-load a different subset during runtime. The subset could have been created in a third party software.


• Be able to filter which stars to render by their spatial information as well as photometric properties, or by the estimated error in the measurements.

• Render stars differently depending on magnitude and photometry to make them easier to classify.

• Read velocity where possible and be able to "turn back time" to see how the stars move in space.

1.2.2

Research questions

Part of the implementation process is to figure out which techniques to use. The main question this thesis poses is: What combination of optimization techniques can enable real-time rendering of one billion stars?

As this is a fairly big question, it can be divided into several smaller questions within different areas. For example, as the DR2 dataset will be much too large to store in a computer's working memory, or Random-Access Memory (RAM), some sort of streaming technique from disk or network will have to be researched and implemented. An optimization technique for uploading the data to the Graphics Processing Unit (GPU) will then have to be implemented as well, as the memory on the GPU is even smaller than that available to the Central Processing Unit (CPU). Finally there is the issue of rendering the particles to the screen. Taking these insights into consideration, the research questions become:

• What optimization technique will enable streaming of data dynamically to the CPU during runtime?

• How should the data be structured to limit the uploading of data to the GPU?

• What rendering technique is optimal for rendering as many stars as possible?

• What rendering technique is optimal for improving the readability when exploring the dataset?

• What tools can OpenSpace offer to astronomers that are not already available?

1.3

Limitations

The limitations this project faces are mostly connected to OpenSpace and how it is used. As OpenSpace is used in planetariums around the world, the techniques must work on a cluster of computers, most with only a single GPU per computer and possibly a limited amount of available RAM. Thus solutions for supercomputers are not feasible for this project. However, if the user has better hardware, the algorithms should be able to scale accordingly.

OpenSpace also supports Windows, Linux and macOS, with multiple users on each operating system. Thus, to the extent that it is possible, the implementation should strive to work on all operating systems. This is most notable for macOS, which so far only supports OpenGL versions up to 4.1 [6], while Windows and Linux support up to 4.6, which was released in July 2017 and is still the latest version as of June 2018 [19]. To run OpenSpace, users are required to have at least OpenGL 3.3, but several modules require a later version to work properly.


1.4

Delimitations

There are many aspects of the Gaia DR2 dataset that this thesis will not take into consideration. The release contains information about many different celestial objects, such as exoplanets, asteroids, quasars, supernovae and variable stars, that will not be mentioned further in this report. Only stars will be visualized, and only a subset of the information about each star will be presented. Up to 95 values were released for each star, but only the 24 values used for either rendering or filtering (see Sections 4.5 and 4.3.2) will be explained in this report.


Chapter 2

Related work

As computing hardware capabilities grow, so does the amount of scientific data that the hardware is able to produce. To visualize these larger amounts of data, new techniques have to be developed that can handle the increased requirements in memory and bandwidth. This chapter will briefly go through a few examples of recent large-scale simulations and novel algorithms developed to visualize the simulated data, as well as visualization software that can be used to explore huge datasets in real-time. Even though the following sections mainly discuss astronomical data, the presented techniques should work for other kinds of large particle datasets as well, such as molecular structures.

2.1

Large-scale simulations

Gaia is not the only instrument that generates large quantities of data. Most of the other large datasets are, however, based on simulations instead of real measurements. In astrophysics these are often referred to as N-body simulations, as they try to solve how n bodies interact with each other gravitationally. These simulations have grown rapidly both in size and fidelity in the recent past, due to improvements in hardware and algorithms as well as in observation techniques that can validate the results. A few recent examples of N-body simulations are the Millennium Run in 2005, consisting of 10 billion particles [13]; a 6 billion year simulation of the Milky Way on a supercomputer with 51 billion particles in 2014 [9]; the Millennium XXL project in 2010, which simulated over 300 billion particles for more than 13 billion years [5]; the Q Continuum simulation in 2015, which harnessed the power of GPUs in supercomputers and simulated 550 billion particles [17]; and the Dark Sky simulations, which had a first data release in 2014 with 1.07 trillion particles [31].

The growth of these simulations is outpacing Moore's Law, which so far has been a good indicator of the growth of hardware capabilities. Besides position, these simulations often produce velocity and other properties per star, not unlike the measurements provided by Gaia. This increase in data poses a challenge for all visualization tools, and new algorithms for data management and rendering have to be developed to be able to work with such huge datasets.

2.2

Visualization techniques

The goal of most visualization tools is to achieve good enough framerates that the user can interact with the data in real-time. To do that, the bottlenecks in the rendering pipeline have to be optimized; currently these are usually reading large amounts of data from disk, streaming data from the CPU to the GPU, and inefficient rendering techniques.

Because the memory footprint of the particle data often exceeds the amount of available RAM in a single computer, one either has to divide the data into numerous files, require a computer with large enough RAM to fit the whole dataset, compute the renderings offline, or make use of a cluster of machines to achieve interactive framerates.

After the particle data has been read, it should be reorganized into an ordered data structure as a preprocess, to optimize the disk operations during runtime. The most popular structure for particle data in recent research is an octree ([27], [26], [29] etc.). The octrees are mainly used to optimize what data to load into memory, or what to stream to the GPU. Another way to reduce the amount of data to stream is to use Level-Of-Detail (LOD), which stores several layers of data of increasing complexity. During rendering, one of the less complex levels can then be fetched if the node is far away, thus reducing the amount of data to stream.
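The octree-plus-LOD combination can be sketched in a few lines: each inner node keeps a copy of the brightest stars among its descendants, so a traversal can stop early and return that cache instead of recursing. This is only a minimal illustration of the general idea; the node capacity, star layout and the caller-supplied visibility predicate are assumptions for the example, not any particular paper's exact scheme:

```cpp
#include <algorithm>
#include <array>
#include <memory>
#include <vector>

struct Star { float x, y, z, magnitude; }; // lower magnitude = brighter

constexpr std::size_t kMaxStarsPerNode = 4; // illustrative capacity

struct OctreeNode {
    float cx = 0, cy = 0, cz = 0, halfSize = 1; // axis-aligned cube
    std::vector<Star> stars;  // leaf payload, or LOD cache in inner nodes
    std::array<std::unique_ptr<OctreeNode>, 8> children;
    bool isLeaf = true;

    int childIndexFor(const Star& s) const {
        return int(s.x > cx) | (int(s.y > cy) << 1) | (int(s.z > cz) << 2);
    }

    void insert(const Star& s) {
        if (isLeaf && stars.size() < kMaxStarsPerNode) {
            stars.push_back(s);
            return;
        }
        if (isLeaf) subdivide();
        // Inner nodes keep the brightest descendants as a LOD cache.
        stars.push_back(s);
        std::sort(stars.begin(), stars.end(), [](const Star& a, const Star& b) {
            return a.magnitude < b.magnitude;
        });
        if (stars.size() > kMaxStarsPerNode) stars.resize(kMaxStarsPerNode);
        child(childIndexFor(s))->insert(s);
    }

private:
    void subdivide() {
        isLeaf = false;
        // Push the existing payload down; the copies that remain in `stars`
        // become this inner node's LOD cache.
        for (const Star& s : std::vector<Star>(stars)) {
            child(childIndexFor(s))->insert(s);
        }
    }
    OctreeNode* child(int i) {
        if (!children[i]) {
            float h = halfSize * 0.5f;
            auto c = std::make_unique<OctreeNode>();
            c->cx = cx + ((i & 1) ? h : -h);
            c->cy = cy + ((i & 2) ? h : -h);
            c->cz = cz + ((i & 4) ? h : -h);
            c->halfSize = h;
            children[i] = std::move(c);
        }
        return children[i].get();
    }
};

// Collect stars for rendering: if an inner node is deemed small enough on
// screen (approximated here by a caller-supplied predicate), its LOD cache
// is returned instead of recursing into the subtree.
void gather(const OctreeNode& n, bool (*smallOnScreen)(const OctreeNode&),
            std::vector<Star>& out) {
    if (n.isLeaf || smallOnScreen(n)) {
        out.insert(out.end(), n.stars.begin(), n.stars.end());
        return;
    }
    for (const auto& c : n.children) {
        if (c) gather(*c, smallOnScreen, out);
    }
}
```

In a real renderer the predicate would compare the node's projected screen-space size against a threshold, and the gathered stars would be uploaded to the GPU rather than copied into a vector.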

The last area that researchers are focusing on improving is rendering techniques. Some of the most common techniques today are direct rendering of particles, volume rendering, distributed rendering and CPU-based ray-casting. Direct rendering is also known as point splatting [18], which often makes use of geometry instancing in geometry shaders (also sometimes referred to as billboarding or point sprites) to reduce the complexity of the data uploaded to the GPU.

One group that made use of a computer cluster was Rizzi et al. [27] who were able to run a dataset of 32 billion particles using a cluster of 128 GPUs. They used a hierarchical octree structure and a distance-based LOD and made use of a Message Passing Interface (MPI) to read the data in parallel before rendering it with a parallel point sprite algorithm.

If you instead have large amounts of RAM, you can use the technique presented by Wald et al. [34] to render the dataset on a single CPU. They used a purely CPU-based ray-tracing algorithm to render a dataset of a billion particles at interactive framerates, using a 72-core CPU with 3 TB of RAM. The data structure was a balanced k-d tree, and in contrast to most other implementations no LOD was used, yet the approach was still competitive with many GPU-based techniques.

If you do not have access to a supercomputer or special hardware with terabytes of RAM, as is the case with this thesis project, there are still a couple of techniques available. One makes use of CUDA-accelerated wavelet compression to further reduce bandwidth requirements when streaming from disk, and manages to render the 10 billion particles simulated by the Millennium Run with point splatting [26]. Another combines LOD and billboarding to render a dataset of 10 billion particles of molecular data on a single GPU [20]. Another example of rendering the Millennium Run is Fraedrich et al. [13], who used an adaptive octree structure with continuous LOD and saved subsets of the octree to files. All nodes with the same parent (i.e. a subtree) were packed together in a file, and only files that were visible were loaded during a fly-through.

There are several other examples of implementations where parts of the octree were saved to files and streamed dynamically during rendering, such as Lukac [21], who claims to be able to render up to 10 billion particles in HD resolution on a single desktop with only 4 GB of RAM. However, almost 90% of the particles were culled during rendering.
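The subtree-per-file streaming idea recurring in these papers can be illustrated with a small sketch: each subtree is written to one binary file, and at render time only the files whose bounding sphere lies within the viewing distance of the camera are loaded. The naming scheme, the file layout and the distance-based visibility test below are all simplifying assumptions for illustration, not the scheme of any cited implementation:

```cpp
#include <cmath>
#include <fstream>
#include <string>
#include <vector>

struct Star { float x, y, z, magnitude; };

// One file per subtree: a bounding sphere plus a flat array of records.
struct SubtreeFile {
    std::string filename;
    float cx, cy, cz, radius; // bounding sphere of the subtree
};

// Build a filename from the child indices along the path from the root,
// e.g. {3, 0, 5} -> "octree_3_0_5.bin" (naming scheme is an assumption).
std::string subtreeFilename(const std::vector<int>& path) {
    std::string name = "octree";
    for (int c : path) name += "_" + std::to_string(c);
    return name + ".bin";
}

void writeSubtree(const SubtreeFile& f, const std::vector<Star>& stars) {
    std::ofstream out(f.filename, std::ios::binary);
    out.write(reinterpret_cast<const char*>(stars.data()),
              static_cast<std::streamsize>(stars.size() * sizeof(Star)));
}

// Load only subtrees whose bounding sphere lies within `viewDist` of the
// camera; everything else stays on disk.
std::vector<Star> loadVisible(const std::vector<SubtreeFile>& files,
                              float camX, float camY, float camZ,
                              float viewDist) {
    std::vector<Star> visible;
    for (const SubtreeFile& f : files) {
        float dx = f.cx - camX, dy = f.cy - camY, dz = f.cz - camZ;
        float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
        if (dist - f.radius > viewDist) continue; // subtree not visible
        std::ifstream in(f.filename, std::ios::binary);
        Star s;
        while (in.read(reinterpret_cast<char*>(&s), sizeof(s))) {
            visible.push_back(s);
        }
    }
    return visible;
}
```

The point of the pattern is that visibility is decided per file from a few bytes of metadata, so invisible subtrees never touch RAM at all.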

Whereas most of the mentioned projects have used point splatting or ray-casting, there have also been developments in the field of volume rendering. Scherzinger et al. [30] came up with a novel merger-tree approach that combined volume ray-casting of a volumetric resampling with direct visualization of halo overlays and their evolution over time. The paper won the IEEE Scientific Visualization Contest in 2015 and was implemented in the framework Voreen.

Finally there is a novel hybrid method presented by Schatz et al. [29] that renders a dataset of a trillion particles from the Dark Sky simulation on a single computer. They make use of a dual-GPU configuration that splits the data depending on type: particles are rendered with geometry instancing on one GPU for details, while volume-based ray-casting of a density volume on the other GPU provides context. Their approach is to use an octree structure which stores all leaf nodes as files and then only load nodes in the closest vicinity of the camera. By limiting the streaming of data they only need about 5 GB of RAM for the particle data during rendering.

2.3 Visualization software

There is a wide range of existing visualization software packages focusing on astronomy. While astronomers often are more comfortable with low-level tools like TOPCAT1 and GlueViz2, which are great for selecting subsets and flexible linked views, several higher-level tools exist as well. To give some context of what astronomers are used to looking at, Figure 2.1 shows the TGAS dataset of 2 million stars being visualized in TOPCAT, while Figure 2.2 shows the same dataset mapped as the night sky on a sphere. Together they illustrate the difficulty of getting a sense of scale and structure, which is why most astronomers so far work with subsets of 100,000 stars or less.

Figure 2.1: The TGAS dataset of 2 million stars visualized in 3D with TOPCAT.

Figure 2.2: The TGAS dataset of 2 million stars mapped as the night sky with TOPCAT.

Another type of software is the commercial planetarium-focused kind, such as Uniview3 and Digistar4 (developed by Sciss and Evans & Sutherland respectively). They incorporate a lot of visualizations of the cosmos but also produce other content for planetariums as well as full-dome solutions. Then there are the open-source software packages, all with different strengths and focuses. ParaView5 can for example visualize star particle data in 3D but is essentially focused on distributed-memory computing and scaling up to bigger clusters or supercomputers. It can visualize large datasets from any scientific area but therefore does not contain as many features specific to the cosmos.

Partiview6 and Celestia7 on the other hand have similar features as OpenSpace regarding the known cosmos. Partiview covers much of the same regarding the outer universe, while Celestia is more focused on the solar system. Partiview was developed by Brian Abbot at AMNH, who in turn created the Digital Universe catalogue that both Partiview and OpenSpace use. OpenSpace does have quite a few advantages in terms of globe browsing [10] and visualization of specific missions, however. Neither Partiview nor Celestia has made any effort to incorporate data from the Gaia mission as of yet.

1. http://www.star.bris.ac.uk/~mbt/topcat/
2. http://www.glueviz.org/en/stable/
3. http://sciss.se/uniview
4. https://www.es.com/digistar/
5. https://www.paraview.org/
6. http://virdir.ncsa.illinois.edu/partiview/
7. https://celestia.space/

The software that this thesis project has most in common with is Gaia Sky8, which is promoted by the official Gaia website. Gaia Sky is a real-time 3D visualization tool that focuses on the Gaia mission and its data. The project started at the end of 2014, about the same time as OpenSpace, and has since been in continuous development, with version 2.0 released on the same day as DR2 and with several subsets of the data already preprocessed and ready for download.

Gaia Sky does have a couple of features that OpenSpace lacks, such as screen-space picking of objects and showing information about the picked objects in the user interface. Several features that were requested by the astronomers in Vienna, such as selecting subsets by region and measuring distances between objects, are still absent from both OpenSpace and Gaia Sky. Much like OpenSpace, Gaia Sky's main objective appears to be public outreach rather than use as a research tool. A more in-depth comparison with Gaia Sky follows in Section 6.4.


Chapter 3

Visualize the Gaia mission

The first implementation goal of this thesis was to visualize the Gaia mission. The main objective of the (still ongoing) mission is to measure about 1% of the stars in our Milky Way. To reach that goal a spacecraft has been placed in an orbit around the Sun where it keeps pace with the Earth. The following sections describe the mission, spacecraft and orbit, as well as explain what data have been released and how they should be converted before they can be used in OpenSpace.

3.1 The mission

Figure 3.1: Illustration of how a parallax angle is determined. Source: Srain @ Wikipedia by PD [32]

Astrometry, the discipline of accurately measuring the positions of celestial objects, has a long history. The earliest star catalogue dates back to 190 BC and contained at least 850 stars and their positions. The discipline had an obvious surge after the invention of the telescope in the 17th century, but it took until the 19th century before astronomers were able to figure out how to accurately measure the distance to the stars. The method they came up with was to use the parallax angle.

The method is similar to holding a finger in front of your face, closing one eye at a time and observing how the finger moves against a static background. When measuring stars, the orbit of the Earth is used instead of the distance between your eyes. A photo is taken every six months in the same direction and the observed displacement of close stars is used to determine their parallax angle. That angle is in turn used together with the known distance between the Sun and the Earth (1 Astronomical Unit (AU) or 150 million kilometers) to determine the distance to that star using simple trigonometry. Figure 3.1 illustrates the method.

However, observing the parallaxes from Earth proved to be quite difficult because of disturbances in the atmosphere, and by the mid-1990s only about 8000 stars had accurate parallaxes. That changed in 1997 when the findings of ESA's Hipparcos satellite (or High Precision Parallax Collecting Satellite) were released, which measured parallaxes with high precision for 117,955 objects [3].


Gaia (or Global Astrometric Interferometer for Astrophysics) is meant to be the successor of Hipparcos and the science project started in 2000. The construction of the instrument was approved in 2006 and it was launched on December 19 2013. The mission started with four weeks of ecliptic-pole scanning and subsequently transferred into full-sky scanning. The mission is to measure different properties regarding positions (astrometry), flux/intensity (photometry) and electromagnetic radiation (spectrometry) of about one percent (or two billion) of all stars in the Milky Way with high precision. Gaia will also be able to discover new asteroids, comets, exoplanets, brown dwarfs, white dwarfs, supernovae and quasars, but the main objective is to clarify the origin and history of our galaxy [14].

3.2 The spacecraft

The Gaia spacecraft is comprised of three major functional modules: the payload module, the mechanical service module and the electrical service module. To simplify, one can say that the payload module is constructed to capture and process the data, the mechanical service module controls the navigation system and operates the instruments, while the electrical service module controls the power and communications with the Earth.

The payload module carries two identical telescopes pointing in different directions. Three instruments are connected to these telescopes. The first is an astrometric instrument that measures stellar positions on the sky. By combining several measurements of the same star it is also possible to deduce its parallax, distance and velocity across the sky.

The second is a photometric instrument that provides colour information by generating two low-resolution spectra, one red and one blue. These are used to determine properties such as mass, temperature and chemical composition. The third instrument is a radial velocity spectrometer that calculates the velocity in depth by measuring Doppler shifts of absorption lines in a high-resolution spectrum in a specific wavelength range. How the instruments work will not be further explained or visualized in this thesis project. More information can instead be found on Gaia's official website1.

However, the Gaia spacecraft itself is visualized in OpenSpace. An artist's rendition of the spacecraft can be seen in Figure 3.2. An open-source model of the spacecraft (produced by the University of Heidelberg) was imported into OpenSpace and is rendered at its correct position in space with respect to a specific time and date. A render of the model in OpenSpace is shown in Figure 3.3.

Figure 3.2: An artist’s rendition of the Gaia spacecraft for the Paris Air Show 2013. Source: Pline @ Wikipedia by CC BY-SA 3.0 [25]

Figure 3.3: The Gaia spacecraft model rendered in OpenSpace. The model has been rotated before this screenshot so that the sun will brighten up the instrument.



3.3 The orbit

A few weeks after the launch at the end of 2013 the Gaia instrument arrived at its operation point, the Second Lagrange point (L2) of the Sun-Earth-Moon system, which is about 1.5 million km from the Earth. Lagrange points are positions in an orbital configuration of large bodies where the gravitational forces of the larger objects will maintain a smaller object's position relative to them. Or as Mignard puts it: "The region around L2 is a gravitational saddle point, where spacecraft can be maintained at roughly constant distance from the Earth for several years by small and cheap manoeuvres." [22]. At L2 Gaia keeps pace with the Earth's orbit while enjoying a less obstructed view of the cosmos than an orbit around the Earth would provide. However, in a region directly around L2 the Sun is always eclipsed by the Earth, and thus the solar panels on Gaia would not receive enough sunlight. Gaia was therefore placed in a large Lissajous orbit around L2 to ensure it stays out of the eclipse zone for at least six years [22], long enough to complete its mission.

Figure 3.4: Trail lines of the Gaia spacecraft rendered in OpenSpace. The shown trajectory is with respect to Earth's position in space.

The orbit is visualized in OpenSpace by rendering trail lines of its accurate journey to, and orbit around, the L2 point with respect to Earth's position. Figure 3.4 shows Gaia's trajectory from launch up until 22 May 2018. Data of the trajectory was obtained from the HORIZONS Web-Interface2 hosted by the Jet Propulsion Laboratory at the California Institute of Technology. To keep fidelity, the model in OpenSpace has been rotated so its sun-shield is always facing the Sun. A correct rotation around its own axis is not yet in place, however, as no real data of the rotation could be found.

OpenSpace already had techniques implemented to render trail lines, but a new translation interface was implemented for this project to read the text format exported from Horizons, which was used to place the instrument at its correct position as well as to show the trail up until the end of the nominal mission. The position for past dates is based on measurements, while for future dates it is an approximation. If the user uses the menu in OpenSpace to turn back time (or set it to a future date) where no position data exists then the model will simply remain at the last known position.

One drawback of using Horizons is that the data is static; it is only accurate up to the date it was generated. Many other satellites have their data released in so-called SPICE kernels. If a kernel file is updated then that change will be read on the next start-up and no manual update to the data has to be made. However, Gaia has not been released as a SPICE kernel as of yet [2], but the implementation should be updated to use one when or if it is released.

3.4 The releases

The measurements from the Gaia spacecraft will be released in four batches. DR1 was released on September 14 2016, DR2 was released on April 25 2018, the third release will happen in late 2020 and the final release of the nominal mission will be at the end of 2022.

A short summary is that each release will contain measurements of more stars as well as better measurements of previously released stars. DR2, which is the release this project focuses on, contains about 1.7 billion point sources, with about 1.3 billion of them having measurements for parallax and proper motion in addition to their position on the night sky. Proper motion tells us the transverse velocity of the star across the sky. The position and proper motion of non-solar system objects are expressed in the International Celestial Reference System (ICRS) in terms of the equatorial angles Right Ascension (RA) and Declination (Dec). However, OpenSpace uses a Galactic Coordinate System3 with angles in galactic latitude and longitude. A conversion of the positions had already been done by the Gaia Data Processing and Analysis Consortium (DPAC) and both equatorial and galactic angles were released in DR2 [12]. The measurements of proper motion, however, had not been converted, and thus a conversion had to be done before proper motion could be used to calculate the velocity.

3.4.1 Conversions

The conversions used are the same as in the documentation for DR2, which made use of a simple matrix multiplication system [4]. A point in ICRS and the Galactic Coordinate System can be expressed as the vectors

\[ \mathbf{r}_{\mathrm{ICRS}} = \begin{pmatrix} X_{\mathrm{ICRS}} \\ Y_{\mathrm{ICRS}} \\ Z_{\mathrm{ICRS}} \end{pmatrix} = \begin{pmatrix} d \cos\alpha \cos\beta \\ d \sin\alpha \cos\beta \\ d \sin\beta \end{pmatrix} \quad (3.1) \]

and

\[ \mathbf{r}_{\mathrm{Gal}} = \begin{pmatrix} X_{\mathrm{Gal}} \\ Y_{\mathrm{Gal}} \\ Z_{\mathrm{Gal}} \end{pmatrix} = \begin{pmatrix} d \cos l \cos b \\ d \sin l \cos b \\ d \sin b \end{pmatrix} \quad (3.2) \]

where $d$ is the distance to the star, $\alpha$ and $\beta$ are the equatorial angles RA and Dec, while $b$ and $l$ are galactic latitude and longitude. Distances in space are often expressed in parsec [pc]. The reason behind this is tied to how parallax angles are expressed. When using very small angles, such as the parallax angle to a distant star, 360 degrees are simply not enough to express them. Instead every degree is divided into 60 arcminutes, and one arcminute is in turn divided into 60 arcseconds. If the parallax angle of a star is exactly one arcsecond then the distance to that star is one parsec (or $3.08 \times 10^{16}$ meters). The relationship is given by $d = 1/p$ where $p$ is the parallax angle. The angles in DR2 are given in milliarcseconds and as such the distances are expressed in kiloparsecs. The conversion from ICRS to Galactic is then obtained by

\[ \mathbf{r}_{\mathrm{Gal}} = A'_G \, \mathbf{r}_{\mathrm{ICRS}} \quad (3.3) \]

where

\[ A'_G = R_z(-l_\Omega) R_x(90^\circ - \delta_G) R_z(\alpha_G + 90^\circ) \quad (3.4) \]

\[ = \begin{pmatrix} -0.0548755604162154 & -0.8734370902348850 & -0.4838350155487132 \\ +0.4941094278755837 & -0.4448296299600112 & +0.7469822444972189 \\ -0.8676661490190047 & -0.1980763734312015 & +0.4559837761750669 \end{pmatrix} \quad (3.5) \]

3. http://astronomy.swin.edu.au/cosmos/G/Galactic+Coordinate+System
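As a concrete illustration, the relation $d = 1/p$ and the rotation of Eq. 3.3 can be sketched in a few lines of Python. The function names are illustrative only; the actual OpenSpace implementation is in C++.

```python
import math

# Fixed orthogonal rotation matrix A'_G from Eq. 3.5 (ICRS -> Galactic).
A_G = [
    [-0.0548755604162154, -0.8734370902348850, -0.4838350155487132],
    [+0.4941094278755837, -0.4448296299600112, +0.7469822444972189],
    [-0.8676661490190047, -0.1980763734312015, +0.4559837761750669],
]

def parallax_to_distance_kpc(parallax_mas):
    """d = 1/p: distance in kiloparsec from a parallax angle in milliarcseconds."""
    return 1.0 / parallax_mas

def icrs_to_galactic(ra_deg, dec_deg, parallax_mas):
    """Cartesian Galactic position [kpc] from RA, Dec [deg] and parallax [mas]."""
    a, b = math.radians(ra_deg), math.radians(dec_deg)
    d = parallax_to_distance_kpc(parallax_mas)
    r_icrs = [d * math.cos(a) * math.cos(b),   # Eq. 3.1
              d * math.sin(a) * math.cos(b),
              d * math.sin(b)]
    # Eq. 3.3: rotate the ICRS vector into Galactic coordinates.
    return [sum(A_G[i][j] * r_icrs[j] for j in range(3)) for i in range(3)]
```

Since $A'_G$ is orthogonal the rotation preserves the distance to the star, which provides a simple sanity check on the matrix values.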


is a fixed orthogonal matrix that represents the rotations around the three axes. The proper motion angles pmra and pmdec can be expressed as the components $(\mu_\alpha^\star, \mu_\beta)$, with the corresponding values in Galactic angles being $(\mu_l^\star, \mu_b)$, where $\mu_\alpha^\star = \mu_\alpha \cos\beta$ and $\mu_l^\star = \mu_l \cos b$.

For the conversion of the proper motion angles four auxiliary unit vectors are required;

\[ \mathbf{p}_{\mathrm{ICRS}} = \begin{pmatrix} -\sin\alpha \\ \cos\alpha \\ 0 \end{pmatrix}, \quad \mathbf{q}_{\mathrm{ICRS}} = \begin{pmatrix} -\cos\alpha \sin\beta \\ -\sin\alpha \sin\beta \\ \cos\beta \end{pmatrix} \quad (3.6) \]

and

\[ \mathbf{p}_{\mathrm{Gal}} = \begin{pmatrix} -\sin l \\ \cos l \\ 0 \end{pmatrix}, \quad \mathbf{q}_{\mathrm{Gal}} = \begin{pmatrix} -\cos l \sin b \\ -\sin l \sin b \\ \cos b \end{pmatrix} \quad (3.7) \]

which represent unit vectors in the directions of increasing $\alpha$ and $\beta$ (or $l$ and $b$). The Cartesian components of the proper motion vectors can then be expressed as

\[ \boldsymbol{\mu}_{\mathrm{ICRS}} = \mathbf{p}_{\mathrm{ICRS}} \, \mu_\alpha^\star + \mathbf{q}_{\mathrm{ICRS}} \, \mu_\beta \quad (3.8) \]

and

\[ \boldsymbol{\mu}_{\mathrm{Gal}} = \mathbf{p}_{\mathrm{Gal}} \, \mu_l^\star + \mathbf{q}_{\mathrm{Gal}} \, \mu_b \quad (3.9) \]

with the conversion being

\[ \boldsymbol{\mu}_{\mathrm{Gal}} = A'_G \, \boldsymbol{\mu}_{\mathrm{ICRS}} \quad (3.10) \]

where $A'_G$ is the same as in Eq. 3.5. This means that even though a conversion of the positions had already taken place in DR2, OpenSpace can now read angles for both position and proper motion and convert them to the Galactic Coordinate System if need be.
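The proper motion rotation of Eqs. 3.6, 3.8 and 3.10 can be sketched in the same illustrative way (not OpenSpace's actual code; note that pmra as released in DR2 already includes the $\cos\beta$ factor):

```python
import math

def proper_motion_icrs_to_galactic(ra_deg, dec_deg, pmra, pmdec, A_G):
    """Rotate the proper motion components (mu_alpha*, mu_delta) into Galactic
    Cartesian components. A_G is the 3x3 rotation matrix of Eq. 3.5."""
    a, b = math.radians(ra_deg), math.radians(dec_deg)
    # Auxiliary unit vectors of Eq. 3.6.
    p = [-math.sin(a), math.cos(a), 0.0]
    q = [-math.cos(a) * math.sin(b), -math.sin(a) * math.sin(b), math.cos(b)]
    # Eq. 3.8: Cartesian proper motion vector in ICRS.
    mu_icrs = [p[i] * pmra + q[i] * pmdec for i in range(3)]
    # Eq. 3.10: rotate into the Galactic frame.
    return [sum(A_G[i][j] * mu_icrs[j] for j in range(3)) for i in range(3)]
```

Because p and q are orthonormal and $A'_G$ is orthogonal, the magnitude of the resulting vector equals the total proper motion $\sqrt{\mu_\alpha^{\star 2} + \mu_\beta^2}$.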

3.4.2 Calculating velocity

One of the implementation goals of the project was to read velocity where possible and get the stars to move. To get the space velocity you need two vectors, a transverse velocity vector and a radial velocity vector (see Figure 3.5). To calculate the transverse velocity you need the proper motion and the distance to the star. Proper motion is, as mentioned, the transverse motion across the sky and is expressed in milliarcseconds per year in DR2. To convert it to m/s one can use the same relationship as when calculating the distance from the parallax angle. An angle of $\mu$ arcsec at a distance of $r$ pc corresponds to a separation of $r\mu$ AU [23]. In our case the proper motion parameters are the angles and the distance is obtained from the parallax. To convert the separation/transverse velocity from AU/year to m/s one can use the following equation.



Figure 3.5: Illustration of how the space velocity vector can be broken up into a transverse velocity and a radial velocity. Source: Brews ohare @ Wikipedia by CC BY-SA 3.0 [24]

\[ 1\,\mathrm{AU/year} = \frac{\sim 1.5 \times 10^{11}\,\mathrm{m}}{\sim 3 \times 10^{7}\,\mathrm{s}} = 4.74 \times 10^{3}\,\mathrm{m/s} \quad (3.11) \]

The last parameter needed to calculate the space velocity is the radial velocity, which is the velocity by which stars move towards or away from the Sun. In DR2 around 7.2 million stars were released with a radial velocity (in km/s). The radial velocity vector is calculated by using Eq. 3.2 with the radial velocity as distance. The space velocity vector is then obtained by combining the two velocity vectors. This vector can later be used to simulate the movement of the stars. It is, however, only an instantaneous velocity vector. To obtain the true motion one has to incorporate the gravity of nearby stars and the rotation around the galaxy's center, which has not yet been implemented in OpenSpace.
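Putting the relations together, the transverse speed of a star follows from its proper motion and parallax. A hypothetical helper (illustrative units bookkeeping, not OpenSpace's actual code):

```python
AU_YEAR_TO_MS = 4.74e3  # Eq. 3.11: 1 AU/year expressed in m/s

def transverse_speed_ms(pm_mas_per_year, parallax_mas):
    """Transverse speed in m/s from proper motion [mas/year] and parallax [mas].

    A proper motion of mu arcsec/year at a distance of r pc corresponds to a
    separation of r*mu AU/year, which Eq. 3.11 converts to m/s.
    """
    r_pc = 1000.0 / parallax_mas          # d = 1/p, with p in mas giving kpc
    mu_arcsec = pm_mas_per_year / 1000.0  # mas/year -> arcsec/year
    return r_pc * mu_arcsec * AU_YEAR_TO_MS
```

For example, a star with a parallax of 10 mas (100 pc away) and a proper motion of 10 mas/year moves 1 AU/year across the sky, i.e. about 4.74 km/s.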


Chapter 4

Render 1.7 billion stars

The main part of this project was to import and display the stars released in DR2. The full release contains about 1.2 TB of raw data, which is too much for most computers to handle. Therefore an out-of-core rendering technique had to be implemented. The research presented in Section 2.2 concluded that the main bottlenecks when rendering large datasets usually are the I/O operations, the transfer of data to the GPU and too many or too inefficient shader calls. This chapter presents how these bottlenecks have been addressed in OpenSpace.

4.1 System overview

As nothing had previously been implemented in the OpenSpace pipeline to handle such large particle datasets, most of the data pipeline had to be implemented from scratch. This section gives a short overview of the pipeline and lets the following sections describe each step in more detail.

First off, the pipeline differs depending on whether the dataset is stored in one file or in several files. If the dataset is stored in one file we assume that it can fit in RAM. This assumption stems from the subsets that astronomers produced for this project; all the subsets were relatively small and stored in a single file, and according to the astronomers this was how they were used to working. Therefore, if a single file is read initially then the "single file format" will be kept through the entire pipeline, even if files are produced in intermediate steps, which in turn means that the dataset cannot be streamed from disk during rendering and thus has to fit entirely in RAM.

Before OpenSpace can start processing the data the file(s) have to be in a format OpenSpace can read. The steps are then basically to read the raw data, sort the stars into an octree structure, upload stars that are visible to the GPU and finally render them to the screen. The steps can be broken up into separate tasks, as shown in Figure 4.1, or the whole process, or parts of it, can be done during start-up for smaller single-file datasets, as illustrated by Figure 4.2.

4.2 Read the data

There are multiple formats that can be used to store star data on disk, for example Flexible Image Transport System (FITS)1, SPECK, VOTable and Comma-Separated Values (CSV). DR1 and the TGAS subset were both released in CSV, FITS as well as VOTable. The University of Vienna also uses the FITS format while AMNH uses SPECK, which implied that OpenSpace had to be able to support at least both those formats.

1. https://heasarc.gsfc.nasa.gov/docs/heasarc/fits.html



Figure 4.1: Illustration of the data pipeline when reading a dataset from multiple files, such as the full DR2.

Figure 4.2: Illustration of the data pipeline when reading a dataset from a single file.

Because DR2 had not been released at the start of this thesis, a single FITS file with the TGAS subset was used during the majority of the implementation period. The full DR2 was later released as 61,234 separate files. Thus both reading of a single file and of multiple files had to be implemented. The following sections describe the differences between the techniques.

4.2.1 Read a single file

Reading of single SPECK files had already been implemented in OpenSpace, but reading FITS tables had not. FITS files can contain either image data or table data; in the case of Gaia they store tabular data for all the stars. A FITS file reader module was therefore implemented in OpenSpace where the user can define which file to read from along with which columns and rows. For the I/O operations the module uses CCfits2, which in turn builds upon cfitsio3.

The file can either be read on start-up or in a separate preprocess. The latter is called a TaskRunner in OpenSpace and can be run independently from the main process. The reason for implementing the reading as a task is that reading an ASCII file such as SPECK or FITS can take quite a long time, especially if the file is big. A ReadSpeckTask and a ReadFitsTask were therefore implemented that read a single text file and output a binary file with only the star data we are interested in, as it is much faster to read a binary file during start-up.
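The essence of such a preprocess is writing the per-star values as packed binary floats so that start-up becomes a single sequential read. A minimal sketch in Python (the field layout, header and file names are illustrative; OpenSpace's actual binary format is implemented in C++ and differs):

```python
import struct

VALUES_PER_STAR = 8  # x, y, z position; x, y, z velocity; magnitude; color

def write_binary(path, stars):
    """Write star rows (each a sequence of 8 floats) as packed 32-bit floats."""
    with open(path, "wb") as f:
        f.write(struct.pack("<i", len(stars)))  # small header: number of stars
        for star in stars:
            f.write(struct.pack("<%df" % VALUES_PER_STAR, *star))

def read_binary(path):
    """Read the stars back in one sequential pass (no ASCII parsing needed)."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<i", f.read(4))
        size = struct.calcsize("<%df" % VALUES_PER_STAR)
        return [list(struct.unpack("<%df" % VALUES_PER_STAR, f.read(size)))
                for _ in range(n)]
```

The fixed record size also makes it trivial to read a subrange of stars by seeking, which an ASCII table does not allow.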

2. https://heasarc.gsfc.nasa.gov/fitsio/CCfits/
3. https://heasarc.gsfc.nasa.gov/fitsio/fitsio.html


To be fair, FITS files can store tables in binary as well, but the files released by ESA were stored in ASCII. However, even if the tables had been binary it would still be faster to read a preprocessed file because then the values would already be ordered by star. SPECK files are ordered in row-major fashion, which means that one row contains data for one star, so the stars can be read sequentially. CCfits on the other hand reads the table by column, which requires more memory and additional loops to order the data by star and store it in the correct binary order.

The ReadSpeckTask only takes paths to the input and output files as arguments, while ReadFitsTask also accepts optional parameters for the first and last row to read as well as which additional columns to read. The columns needed for default rendering and filtering will always be read, but the user can define additional filter parameters. Reading additional columns slows down the process tremendously, however, so it is actually preferable to add new columns directly in the code instead.

4.2.2 Read multiple files

The same ReadFitsTask can be used to read multiple FITS files from a folder. This is required if the dataset is too large for the RAM in the computer. The reason why only FITS is supported is that it was the fastest format to read for a single file out of those that DR1 was released in. As it later turned out, the DR2 dataset was initially only released as compressed CSV files. To read the DR2 dataset all 61,234 files first have to be downloaded from the Gaia archive, unzipped and then converted to FITS. The conversion was done with a Python script using Astropy4. All the files in the specified folder are thereafter read in multiple threads. However, CCfits prevents the usage of more than one I/O driver at a time, so what is actually threaded is the processing of the data and the writing to binary files. To avoid using too much RAM in the next step the data is split into eight initial octants, defined by the main Cartesian axes. The writing is done in batches, so when the number of values in an octant exceeds a pre-defined threshold the values in that octant are appended to the corresponding file. If more values are stored per star, or if the available RAM is too low to handle eight files, it is possible to divide the data into $8^n$ binary files instead by going down more levels in the octree.
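The octant splitting with batched appends can be sketched as follows. The threshold value and the in-memory stand-in for the octant files are illustrative; the real task appends batches to eight binary files on disk:

```python
def octant_index(x, y, z):
    """Index 0-7 of the initial octant, from the signs of the Cartesian axes."""
    return (x < 0) | ((y < 0) << 1) | ((z < 0) << 2)

BATCH_THRESHOLD = 4  # flush an octant when it holds this many stars (tiny for illustration)

class OctantWriter:
    def __init__(self):
        self.batches = {i: [] for i in range(8)}
        self.flushed = {i: [] for i in range(8)}  # stands in for the per-octant files

    def add(self, star):
        x, y, z = star[:3]
        idx = octant_index(x, y, z)
        self.batches[idx].append(star)
        if len(self.batches[idx]) >= BATCH_THRESHOLD:
            self.flush(idx)

    def flush(self, idx):
        # In the real pipeline this appends the batch to the octant's binary file.
        self.flushed[idx].extend(self.batches[idx])
        self.batches[idx].clear()
```

Batching keeps at most a threshold's worth of stars per octant in memory, which is what bounds the RAM usage regardless of the total dataset size.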

When reading from a folder the ReadFitsTask also takes an additional optional parameter which defines how many threads to use for the reading.

4.2.3 Pre-process the data

The most fundamental thing needed to render anything is the object's position in 3D space. In our case we are also interested in the velocity of a star and some parameters for how the star should look. This amounts to eight parameters in the end: [x,y,z] position, [x,y,z] velocity, magnitude and color. How these values are used in the rendering is explained later in Section 4.5. Because one implementation goal was to filter the data, a couple of parameters for filtering may also be of interest. How the filtering works is explained in Sections 4.3.2 and 4.5.4.

To be able to store the eight basic rendering parameters, all measurements needed for calculating them have to be read from the file(s). In some cases the parameters may already be calculated correctly, but if the data is for example read directly from DR2 they have to be calculated from RA, Dec, parallax, proper motion and radial velocity. For these calculations the equations presented in Section 3.4 were used.

If a star is missing a measurement it will be set to a default value. For example, DR2 contains 1.7 billion stars but only 1.3 billion of them have any parallax angle. Thus the distance for those stars was set to a user-defined constant, which could easily be filtered away later.



4.3 Construct an Octree

Once the data has been read it needs to be structured in a way that optimizes how much data is streamed from disk and/or streamed to the GPU. Schatz et al. [29] and several others ([27], [26], [33], [13]) suggest that a version of an octree structure is the best way to go. In this work the subdivision of the octree is based on the spatial information of the stars. When the number of stars in a leaf node exceeds a user-defined constant, that node is subdivided into eight new leaf nodes of equal size and redefines itself as an inner node (parent). Figure 4.3 illustrates how a 2D representation of the octree might subdivide.

Figure 4.3: Illustration of how the size of the octree depends on the maximum number of stars in each node and the initial extent of the octree.

The depth of the tree depends on the initial extent of the tree and how many stars are stored in each node. If the initial extent is small it will generate a shallow tree, which is preferable as it speeds up the traversals. However, the stars that fall outside of the initial extent still have to be stored in the octree. If the outermost node is not able to contain all the outliers, the memory stack will overflow and the process will crash during the construction. A higher number of stars per node will also generate fewer total nodes, which is both faster to construct and to traverse, but may require more data to be streamed to the GPU later on as the frustum culling gets coarser. Bigger nodes also imply that fewer nodes can fit in the buffer in the next step. If many of the nodes in the tree are underutilized it can be quite bad for performance as well, due to how the buffers are updated. When constructing the octree the inner nodes will keep a copy of the brightest stars in all their descendants as a LOD cache that will be used later when traversing the octree. Therefore a higher number of stars per node also means that more data will be duplicated. The user has to find a balance between the two properties to make sure that both the construction time and render performance are at acceptable levels. A little bit of trial-and-error is necessary, as there is no general guideline for how to set the properties; it depends heavily on the characteristics of the specific dataset.
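The subdivision rule can be sketched as follows. This is a simplified, illustrative version: the LOD caching of the brightest stars in inner nodes is omitted, and the names are not OpenSpace's actual ones.

```python
MAX_STARS_PER_NODE = 2  # user-defined subdivision threshold (tiny for illustration)

class Node:
    """Axis-aligned cube: center (cx, cy, cz) and a half-dimension."""
    def __init__(self, cx, cy, cz, half):
        self.center = (cx, cy, cz)
        self.half = half
        self.stars = []
        self.children = None  # becomes a list of 8 Nodes when subdivided

    def insert(self, star):
        if self.children is None:       # leaf: store here, split if over threshold
            self.stars.append(star)
            if len(self.stars) > MAX_STARS_PER_NODE:
                self._subdivide()
            return
        self._child_for(star).insert(star)

    def _subdivide(self):
        cx, cy, cz = self.center
        h = self.half / 2.0
        self.children = [Node(cx + dx * h, cy + dy * h, cz + dz * h, h)
                         for dz in (-1, 1) for dy in (-1, 1) for dx in (-1, 1)]
        stars, self.stars = self.stars, []
        for s in stars:                 # push the stored stars down to the children
            self._child_for(s).insert(s)

    def _child_for(self, star):
        cx, cy, cz = self.center
        x, y, z = star[:3]
        return self.children[(x >= cx) + 2 * (y >= cy) + 4 * (z >= cz)]
```

In the real implementation a node that subdivides keeps the brightest of its stars as a LOD cache instead of emptying itself completely.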

The construction of the octree can either be done during start-up or in a TaskRunner. The ConstructOctreeTask can process either a single binary file or the eight binary octant files produced by reading from multiple files. The process in the two versions is similar. The stars are read one-by-one from the binary file, checked against the filters and, if they pass all of them, inserted into the octree. The difference is that the single-file version saves the octree in a single binary file, while the multiple-file version stores the structure of the octree, without any data, in a binary index file and then stores the data of the nodes in one file per node. The node files are named after their position index in the octree, which is based on Morton order (or Z-order curve) [8]. This will be important later when accessing neighboring nodes (Section 4.4.5) as it preserves data locality.
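A Morton (Z-order) index interleaves the bits of the three grid coordinates, so that nodes close in space tend to end up close in file order as well. A generic sketch of the idea (not necessarily the exact naming scheme used for the node files):

```python
def morton3(ix, iy, iz, bits):
    """Interleave the bits of three grid coordinates into a Z-order (Morton) index."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)      # x bit goes to position 3b
        code |= ((iy >> b) & 1) << (3 * b + 1)  # y bit to position 3b + 1
        code |= ((iz >> b) & 1) << (3 * b + 2)  # z bit to position 3b + 2
    return code
```

Sorting nodes by this code yields the Z-order curve, which is why neighbors in the octree usually map to nearby files on disk.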

When constructing the octree from several files only one octant is processed at a time. It would be possible to read several octants at the same time in different threads, but that would require more memory than the computer used for this project possessed; additional threading could be added if the hardware can handle it. As of now one branch of the octree is constructed at a time; after all stars have been inserted the branch is written to multiple files, after which all data is cleared and the memory deallocated. The user can choose to perform the writing in a different thread, as writing thousands of files otherwise slows down the process. This makes the construction require more RAM, as the reading of the next octant begins before the data has been cleared from the previous branch. When all octants have been read and all nodes have been stored in separate files, the structure of the octree is saved to a binary index file. The structure tells us all we need to know to be able to load the files later, which is the number of stars stored and whether each node is a leaf or not.

4.3.1 Octree structure

The octree used in the implementation is a pointer-based octree with every node represented as an Axis-Aligned Bounding Box (AABB) with a Vec3 center and a floating point half-dimension. Every node has a pointer to each of its eight children but none to its parent. Every node also keeps track of how many stars it contains, whether it is a leaf, whether it is loaded into RAM, whether it has any descendants that are loaded into RAM, its index in the streaming buffer, its position in the octree and containers with render values for all its stars.

4.3.2 Offline filtering

During the construction of the octree the stars can be filtered by a number of parameters. The default filter parameters from DR2 are position (x, y, z), velocity (x, y, z), magnitude (mean band of photometry G, Bp and Rp), color (the difference between two photometric bands, i.e. Bp-Rp, Bp-G and G-Rp), RA, Dec, parallax, proper motion for RA and Dec, radial velocity and the errors for the last six values. In theory all the released values could be stored and filtered by, given sufficient hardware, but to limit the file sizes these 24 values were chosen. They are also the filter values suggested by the astronomers in Vienna.

The user can set min and max values for each of these parameters as input. If both min and max are set to the same value, all stars with that specific value are filtered away. It is possible to set the minimum to negative infinity and the maximum to positive infinity.
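The filter rule above can be sketched as a small predicate (a simplified sketch; names are assumptions):

```cpp
// Sketch of the per-value filter rule described above.
// If min == max, stars with exactly that value are filtered away;
// otherwise values inside the closed interval [min, max] pass.
// Infinities work naturally with the comparisons below.
bool passesFilter(float value, float minVal, float maxVal) {
    if (minVal == maxVal) {
        return value != minVal;  // filter away this exact value
    }
    return value >= minVal && value <= maxVal;
}
```

A star is inserted into the octree only if it passes this check for every one of the 24 filter parameters.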

4.4 Data streaming

The next step in the pipeline is to find out which nodes are eligible for rendering, load them into RAM if they are not already loaded, and then stream their data to the GPU. This section first explains the algorithms used when the entire octree fits in memory and then expands those techniques to streaming from files.

4.4.1 Update the buffers

If the entire dataset fits in memory it will be asynchronously loaded into RAM on start-up without locking the application, regardless of whether it is stored in a single file or in multiple files. When the data is loaded into working memory we need to find out which nodes should be streamed to the GPU. During each render call the octree is traversed in pre-order fashion. The node's AABB is used to determine whether the node intersects the view frustum. If it does not, there is no need to keep going down that branch. Another culling technique had already been implemented in OpenSpace by Bock et al. [7], which in theory removes the near and far planes and enables seamless space travel


over 60 orders of magnitude. However, even though that technique culls objects that are not visible, it takes effect later in the OpenSpace pipeline; to be able to optimize the streaming we need to determine which nodes to upload at an earlier stage.

If the node is visible and is a leaf, the data in that node should be uploaded. If the node is visible but is an inner node, the traversal continues, unless the node is smaller in screen space than a user-defined threshold. In that case the stored LOD cache is uploaded instead and the traversal stops going down that branch.
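The traversal described above can be sketched roughly as follows. The frustum test and screen-space size estimate are passed in as callbacks since they depend on the camera state; all names are assumptions, not the actual OpenSpace code:

```cpp
#include <functional>
#include <vector>

// Simplified node for the traversal sketch; AABB and star data omitted.
struct Node {
    bool isLeaf = false;
    std::vector<Node> children;  // eight children for inner nodes
};

// Pre-order traversal that gathers the nodes whose data (leaf stars or
// LOD cache) should be resident on the GPU, as described in the text.
void collectForUpload(const Node& node,
                      const std::function<bool(const Node&)>& intersectsFrustum,
                      const std::function<float(const Node&)>& screenSpaceSize,
                      float lodThreshold,
                      std::vector<const Node*>& out)
{
    if (!intersectsFrustum(node)) {
        return;  // cull the entire branch
    }
    if (node.isLeaf || screenSpaceSize(node) < lodThreshold) {
        out.push_back(&node);  // upload leaf data or the stored LOD cache
        return;                // stop going down this branch
    }
    for (const Node& child : node.children) {
        collectForUpload(child, intersectsFrustum, screenSpaceSize,
                         lodThreshold, out);
    }
}
```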

However, to limit the streaming to the GPU we only want to update nodes that are not already uploaded. An index stack therefore keeps track of all the free spots in the buffer. The maximum size of the stack is determined by how much dedicated video memory the GPU has. The user can also limit the maximum usage with a max-percent property.

Each node then has a buffer index with that node's placement in the buffer, if uploaded. When a node should be uploaded it first checks if its index is the default value; if not, it already exists in the buffer. Otherwise it acquires the top index of the stack, if the stack is not empty, and the data of the node is inserted at the acquired position in the buffer. Figure 4.4 illustrates how nodes in a similarly structured quadtree might be updated.

Figure 4.4: Illustration of which nodes are eligible for streaming to the GPU as the camera rotates. The red (striped) nodes are already uploaded to the GPU and will not be updated. The blue (clear) nodes are no longer visible and will be removed from the GPU, with their indices returned to the index stack on the next render call. The green (circle) nodes become visible and will be uploaded to the GPU. If the node with buffer index 88 is smaller in screen space than a set threshold it will return its LOD cache instead of traversing any further.
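The index-stack bookkeeping described above can be sketched as a small class (a simplified sketch under the stated assumptions; names are not the actual OpenSpace identifiers):

```cpp
#include <stack>

// The stack is pre-filled with every chunk index the video memory budget
// allows; nodes acquire an index when uploaded and release it when evicted.
class IndexStack {
public:
    explicit IndexStack(int maxChunks) {
        for (int i = maxChunks - 1; i >= 0; --i) {
            _free.push(i);  // pushed in reverse so index 0 is on top
        }
    }

    // Returns a free buffer index, or -1 if the buffer is full.
    int acquire() {
        if (_free.empty()) {
            return -1;
        }
        int idx = _free.top();
        _free.pop();
        return idx;
    }

    // Called when a node's chunk is removed from the GPU.
    void release(int idx) { _free.push(idx); }

private:
    std::stack<int> _free;
};
```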

4.4.2 VBO and SSBO

Two different techniques for updating the buffers have been implemented: one using the standard Vertex Buffer Object (VBO) and one using the newer Shader Storage Buffer Object (SSBO), which requires OpenGL 4.3. When using VBOs the render values for the stars are sent as fixed-size attributes, while with SSBOs they are sent as variable-sized arrays. The main reason why SSBO is used


instead of the similar Uniform Buffer Object (UBO) is that the size of the latter is limited to 64 KB, or even 16 KB on some GPUs, whereas SSBOs guarantee at least 128 MB and are in most cases only limited by the available video memory on the GPU.

Because the number of stars differs between leaf nodes and cannot be calculated before a node is uploaded, we assume that all nodes are of equal size. That way we can easily calculate the offset for a certain node from its buffer index. The data of one node will hereafter be referred to as a "chunk".
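With equally sized chunks, the offset calculation reduces to a single multiplication. A sketch, assuming the data is stored as floats; the parameter names are assumptions about the layout, not OpenSpace identifiers:

```cpp
#include <cstdint>

// Byte offset of a node's chunk in the streaming buffer, given a fixed
// maximum number of stars per chunk and a fixed number of floats per star.
constexpr std::int64_t chunkOffsetBytes(int bufferIndex,
                                        int maxStarsPerChunk,
                                        int floatsPerStar)
{
    return static_cast<std::int64_t>(bufferIndex) * maxStarsPerChunk
         * floatsPerStar * static_cast<std::int64_t>(sizeof(float));
}
```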

The main difference between VBO and SSBO is that VBOs are unaware of how many stars there are in each chunk. They assume that all chunks are filled, and the vertex shader is therefore called MaxStarsPerNode * NumberOfChunks times each frame. This also means that we have to fill up the chunks with zeros so that old values are not accidentally rendered.

With SSBOs, on the other hand, we can keep track of the exact number of stars in each chunk in an index buffer and send that to the shader as an additional variable-sized array. The shader then uses a binary-search-based algorithm to find the correct index of the star to process. This means that the shaders are only called for the exact number of stars that will be rendered and that there is no need to upload any extra zeros to overwrite old values. Figure 4.5 illustrates how the beginning of the SSBO buffers would be updated for the camera movement illustrated in Figure 4.4. The index buffer is updated with a single glBufferData call and will thus not increase the bandwidth significantly.

Figure 4.5: Illustration of how the SSBO buffers are updated in a single draw call. The traversal adds a node with buffer index 2 and removes the node with index 3. First the index buffer is updated linearly. The index buffer keeps track of the accumulated sum of stars in the data buffer. The counts of new and removed stars are added and the change is propagated through the remaining buffer. Thereafter the data buffer is updated with the actual new data. The buffer index of the node determines where in the buffer the data is written.
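The binary search the shader performs over the accumulated index buffer can be mirrored on the CPU as follows (the actual shader is GLSL; this C++ sketch only illustrates the logic, and the names are assumptions):

```cpp
#include <vector>

// accumulated[i] holds the running total of stars up to and including
// chunk i, so the chunk containing the global star id `gid` is the first
// entry whose accumulated sum exceeds gid.
int chunkForStar(const std::vector<int>& accumulated, int gid) {
    int lo = 0;
    int hi = static_cast<int>(accumulated.size()) - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (accumulated[mid] <= gid) {
            lo = mid + 1;  // gid lies in a later chunk
        }
        else {
            hi = mid;      // gid lies in this chunk or an earlier one
        }
    }
    return lo;
}
```

For an index buffer {3, 5, 9} (three chunks with 3, 2 and 4 stars), global star ids 0-2 map to chunk 0, ids 3-4 to chunk 1 and ids 5-8 to chunk 2.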

There are two ways to update parts of a buffer in OpenGL: glMapBufferRange and glBufferSubData. Both were implemented in OpenSpace with a technique called buffer re-specification, or buffer orphaning, which prevents synchronization between render calls and thereby improves performance. Both methods worked fine, but glBufferSubData turned out to be the faster option and was therefore the only one kept in the long run.
