Today's Space Weather in the Planetarium: visualization and feature extraction pipeline for astrophysical observation and simulation data

Department of Science and Technology


LIU-ITN-TEK-A-19/054--SE

Today's Space Weather in the Planetarium

Sovanny Huy Nikkilä

Axel Kollberg

Master's thesis in Media Technology, carried out at the Institute of Technology, Linköping University

Supervisor: Emil Axelsson

Examiner: Anders Ynnerman


Copyright

This document is made available on the Internet – or its possible future replacement – for a considerable time from the date of publication, provided that no exceptional circumstances arise.

Access to the document implies permission for anyone to read, download, and print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. Subsequent transfers of copyright cannot revoke this permission. All other use of the document requires the author's consent. To guarantee authenticity, security, and accessibility, there are solutions of a technical and administrative nature.

The author's moral rights include the right to be mentioned as the author to the extent required by good practice when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author's literary or artistic reputation or character.

For additional information about Linköping University Electronic Press, see the publisher's website:

http://www.ep.liu.se/

Copyright

The publishers will keep this document online on the Internet - or its possible

replacement - for a considerable time from the date of publication barring

exceptional circumstances.

The online availability of the document implies a permanent permission for

anyone to read, to download, to print out single copies for your own use and to

use it unchanged for any non-commercial research and educational purpose.

Subsequent transfers of copyright cannot revoke this permission. All other uses

of the document are conditional on the consent of the copyright owner. The

publisher has taken technical and administrative measures to assure authenticity,

security and accessibility.

According to intellectual property law, the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, see the publisher's website: http://www.ep.liu.se/


Today’s Space Weather in the Planetarium

A visualization and feature extraction pipeline for astrophysical observation and simulation data

Sovanny Huy Nikkilä, Axel Kollberg

Supervisor

Emil Axelsson

A thesis presented for the degree of

Master of Science in Media Technology and Engineering

Faculty of Science and Engineering, Linköping University

Sweden, 27 November 2019


Abstract

This thesis describes the work of two students in collaboration with OpenSpace and the Community Coordinated Modeling Center (CCMC). The need expressed by both parties is a more accessible way to visualize space weather data from the CCMC in OpenSpace.

Firstly, space weather data is preprocessed for downloading and visualizing, a process that involves reducing the size of the data whilst keeping important features.

Secondly, a pipeline is created for dynamically fetching the time varying data from the web during runtime of OpenSpace. A sliding window technique is employed to manage the downloading of the data.

The results show a complete and working system for downloading data during runtime. Measurements of the performance of running the space weather visualizations with dynamic downloading, versus running them locally, show that the new system impacts the frame time only marginally. The results also show a visualization of space weather data with enhanced features, which facilitates the exploration of the data and creates a more comprehensible representation of it. The data is originally kept in a tabular FITS file format, and file sizes after data reduction and feature extraction are approximately 3% of the original file sizes.


Acknowledgements

Thanks to Emil Axelsson for all your continued guidance and support, and to the rest of the OpenSpace team, who, despite the time difference, answered questions about problems stumbled upon late in the evening of your timezone. This work would not have been completed without your help.

Thank you Masha and the entire staff at the CCMC, for never hesitating to explain concepts and for aiding computer engineers in the jungle that is astro- and heliophysics. A special thanks to Leila Mays and Peter MacNeice for digesting our perspective and goals in this project, and for helping us convey the essence of our work to other collaborators.

Furthermore, thanks to Nick C. Arge and Samantha Wallace for seeing the potential and use of our work, and for bringing it to a level of relevance that would not have been achieved without your help and enthusiasm.

Also, a big thanks to all our roommates at Wakana’s house. Our DC experience would not have been the same without Adam, James, Nick and the rest of the gang.

Lastly, thank you Rick for the help with setting up API endpoints on iSWA, and for all the fun lunch breaks with board games and ventures outside of Goddard.


Contents

1 Introduction
  1.1 Motivation
  1.2 Aim
  1.3 Delimitations
2 Related work
  2.1 OpenSpace
  2.2 Space weather rendering in OpenSpace
  2.3 AF-GEOSpace software
3 Space weather data
  3.1 Space Weather
  3.2 Magnetograms and FITS
  3.3 The WSA model
  3.4 Properties of the field line tracing data
4 Data reduction and feature extraction
  4.1 Reducing WSA field line data
  4.2 Performing feature extraction and data reduction of field lines
  4.3 Using the synoptic magnetogram as a texture
5 User driven data requests
  5.1 OpenSpace and server communication
  5.2 Sliding windows
  5.3 Updating active data set based on time
  5.4 Implementing user-driven data requests
6 Results
  6.1 User-driven data requests
  6.2 Synoptic magnetogram as a texture
  6.3 Field lines feature extraction
  6.4 Data reduction
7 Discussion
  7.1 Performance of client server pipeline
  7.2 Benchmarking in OpenSpace
  7.3 Magnetograms as textures
  7.4 Field lines feature extraction
  7.5 Polar pinching
  7.6 Data reduction
  7.7 Future work


1 Introduction

OpenSpace is an open source software and project, with the goal of portraying and visualizing space data and space missions to educate the general public [8], and to function as a tool for experts to explore their data. OpenSpace has an ongoing collaboration with the National Aeronautics and Space Administration (NASA), and this thesis was carried out at NASA Goddard Space Flight Center. It was the combined needs of the OpenSpace team and the Community Coordinated Modeling Center (CCMC), at NASA, that led to the work of this thesis. The motivation is to further expand the capabilities of OpenSpace and to satisfy the ambition of being able to utilize OpenSpace as a tool for scientific exploration and storytelling in the field of space weather. The CCMC is a division at NASA that collects and stores data from different observatories and spacecraft around the world. The data is used to construct models to simulate reality, with the ambition of being able to forecast space weather developments [13].

1.1 Motivation

Previous theses and work on OpenSpace have enabled the possibility to visualize space weather data in OpenSpace in a number of ways: as field line renderings, as volume density renderings, and as texture mapping on 3D objects [4, 12, 10]. Visualizing models in the environment of OpenSpace facilitates the possibility to further explore and to gain a deeper insight into the models as well as the underlying data. The ability to explore space weather models in OpenSpace also provides a new approach to finding ways to tweak and improve the simulation models to more accurately depict reality, which is an ongoing challenge at the CCMC.

The data sets representing space weather have historically been large and hard to acquire, and need preprocessing to be compatible with the OpenSpace software. The work of previous theses focused on bringing space weather models into OpenSpace has resulted in preprocessing portions of data surrounding particular events. While these space weather event visualizations are a valuable way of showcasing the capabilities of OpenSpace, as well as explaining important episodes in space history, there is potential for a better user experience. Most of the data developed and processed is still quite large and hard to acquire for a general OpenSpace user. To improve upon this, the idea of providing data directly from the servers at the CCMC into OpenSpace was proposed.

1.2 Aim

The aim of this thesis is to create and implement a solution to make space weather visualizations in OpenSpace more accessible. This can be achieved by making the data available online and downloading it during runtime of OpenSpace, as opposed to having to have the data prepared on the machine prior to running the software. The goal of this solution is to have a positive impact on the user experience and on the accessibility of the data, while having little to no impact on the performance of OpenSpace. A decline in performance would affect the user experience negatively, and would therefore force a user to choose between accessibility and user experience when exploring space weather visualizations.

To achieve this aim, and at the same time expand the library of supported space weather models in OpenSpace, an additional goal is to bring the Wang-Sheeley-Arge (WSA) model into OpenSpace. The WSA model is a compact, and therefore more suitable, model for runtime downloading than the currently supported models in OpenSpace. Furthermore, a tracing algorithm to produce field line data was already in place. While the WSA field line files are already relatively small compared to models previously used in OpenSpace, it is required to reduce the size of the data even further to enable it to be downloaded during the runtime of OpenSpace. As reducing the data involves the removal of data, it is important to initially analyze and extract interesting features from the data, so as not to unintentionally remove anything of importance. The extraction of features is not only useful to reduce the amount of data, but is also a method of providing a more understandable representation, which encourages exploration and analysis of the data. The steps toward reaching the aim of this thesis are formulated in the following research questions:

• What is a solution to forming a server-client pipeline for fetching data to a real time rendering software, while keeping the impact on performance to a minimum?

• What is a suitable way to reduce field line data, to enable internet downloading and rendering of the data in a real-time application?

• What features can be extracted from field line data and be combined with magnetogram imagery, to create a comprehensible representation of the WSA space weather model for scientific exploration and analysis?

1.3 Delimitations

While it would have been possible to introduce new ways of rendering space weather data during this thesis, it has been decided to rely on those methods that are already in place in OpenSpace. Instead, the focus is on creating a pipeline to get the data into a state where it is usable by the existing rendering methods.


2 Related work

Several space weather models are supported in OpenSpace, and several methods of visualization are available within the software. There is an existing tool for exploring WSA data, which is comparable to the work done in this thesis.

2.1 OpenSpace

The OpenSpace project stems from a collaboration that started in 2014 between Linköping University, the American Museum of Natural History, and the CCMC at NASA Goddard Space Flight Center. An earlier academic collaboration between the two former parties had led to the creation of Uniview, a software that visualizes the universe [15]. Later, the development of OpenSpace began; it received funding from NASA, which extended the collaboration to other institutions. The CCMC has been a close collaborator from the start of the OpenSpace project, with the goal of visualizing space weather forecasting.

The OpenSpace software is open-source and freely available. It is a three-dimensional visualization software that can portray the entire known universe, observed and simulated data, as well as space mission data [7, 6]. It can render the data on vastly different scales; one can zoom in on the Earth's surface as well as zoom out to view galaxy clusters [2]. It avoids floating-point precision limitations by using a method that utilizes a dynamically assigned frame of reference to provide the highest possible numerical precision for all objects in a scene graph [3]. OpenSpace has the ability to run in several different environments: on personal computers, on touch boards, and in multi-screen environments like planetariums.

2.2 Space weather rendering in OpenSpace

The possibility to view and interact with time-varying space weather data in OpenSpace exists in the form of field line renderings, volume density renderings, and texturing on 3D objects [9]. Carlbaum and Novén [10] introduced a way to render time-varying field line sequences. This work included developing a new auxiliary format for storing field line data compatible with OpenSpace. The format is named OpenSpace Field Line Sequence (osfls), and it enables any space weather model with relevant field line data to be viewed in OpenSpace, if formatted according to its structure. The models that have previously been converted to osfls and viewed in OpenSpace are ENLIL, PFSS, and BATSRUS. Previous methods for formatting these models into osfls have generally been manual, and have therefore resulted in only limited sequences of data being converted.

Most 3D objects in OpenSpace can have a texture applied, and the work of Berg and Grangien [5] utilized this feature to apply magnetogram imagery as a texture on the Sun. Their work also included adding the Magnetohydrodynamics Around a Sphere (MAS) model as a volumetric density rendering in OpenSpace. Volumetric density renderings are not used in this thesis.

2.3 AF-GEOSpace software

AF-GEOSpace is a tool able to visualise WSA model data in the form of field lines and magnetogram textures [14]. This software is currently used for scientific exploration at the Heliophysics department at NASA GSFC. Features of the software include three-dimensional movement, rotation, and scaling. It also has a feature for selecting field lines with the mouse cursor. The software currently only works on Linux, and has the precondition of having all the model data locally in the Flexible Image Transport System (FITS) file format [19]. When the data is loaded into the software, the user has the ability to specify how many field lines to load, by selecting every nth field line to include in the visualization. The software has trouble reaching a stable frame rate above 60 frames per second during real-time rendering, which is a problem for its users. The work of this thesis aims to attain and improve upon the capabilities of exploring the WSA model found in AF-GEOSpace, by implementing new features in OpenSpace.


3 Space weather data

The type of data that is the subject of reduction and visualization in this thesis is field line and magnetogram data, both part of the WSA model simulation. The WSA model simulates the magnetic field between the solar surface and an outer bounding sphere.

3.1 Space Weather

Space weather is a part of space physics related to the varying conditions in the solar system that can affect space-borne and ground-based technological systems and can endanger human life or health [17]. In this thesis, space weather refers to the heliophysics part of space weather, which is the conditions related to the Sun and the reach of its emitted charged particles.

The Sun's magnetic field is an important part of space weather visualizations; it reaches beyond the edge of the solar system. The heliosphere is the balloon-like region that encapsulates the reach of the Sun's solar wind. The solar wind is a stream of electrically charged particles propagating out of the solar corona. It may have varying density and velocity, affecting the magnetic fields of Earth and all other planets in the solar system.

The photosphere is the innermost layer of the Sun's atmosphere; it is the part of the atmosphere where the density is so high that it is not possible to see any further toward the Sun's core. The photosphere is what may be referred to as the Sun's "surface". The outermost layer of the Sun's atmosphere is called the corona, described as an aura of hot plasma, which is visible during solar eclipses.

3.2 Magnetograms and FITS

A magnetogram is a representation of the Sun's magnetic field at the photosphere in the form of an image. It is usually represented in grayscale, where white and black regions indicate strong magnetic fields of different polarities. An example is shown in Figure 1. Merged magnetograms, or synoptic magnetograms, can be created by combining series of images and/or predictions, resulting in 360-degree images of the Sun's magnetic field at the photosphere, i.e. an equirectangular image projection. A synoptic magnetogram is shown in Figure 2. In this report, the term magnetogram is sometimes used to describe the synoptic magnetogram. Synoptic magnetograms are often represented with a resolution of every, or every other, longitude-latitude coordinate of the entire photosphere, which results in a resolution of either 360 x 180 or 180 x 90 pixels.

Magnetogram files are saved in the Flexible Image Transport System (FITS) format. FITS files are commonly used in science and can store large arrays, tables of data, or images similar to those used in the synoptic magnetograms [19]. FITS files are useful for scientific purposes because of the possibility of storing multiple fields of metadata, which may be used to provide a readable context to the data contained in the arrays and tables. It is not uncommon that FITS files have multiple layers of arrays or tables, to provide more information for a specific data set.

Since the synoptic magnetogram is an equirectangular image projection, it is possible to use the image data as a texture on a 3D sphere depicting the Sun. In order to preserve the raw values of the magnetogram, the actual magnetic field strength values, it is not suitable to save the magnetograms in a lossy format.


Figure 1: A magnetogram of the Sun, viewed from Earth. Captured with the Helioseismic and Magnetic Imager (HMI) instrument, which is located in front of Earth. Image taken from the Solar Dynamics Observatory (SDO) data archive.

Figure 2: A synoptic magnetogram: an equirectangular projection of the entire photosphere. Image taken from Global Oscillation Network Group (GONG) data archive.

3.3 The WSA model

WSA is a model developed by Neil R. Sheeley Jr., Yi-Ming Wang and Nick Arge. Its use is to derive the speed of solar winds from magnetogram imagery [16]. The model is based on the idea that the solar wind far away from the Sun depends on the origin of the wind in the inner corona. The path from the inner photosphere to the corona is derived from the potential field source surface (PFSS) model, which is an approximation of the inner magnetic field based on the synoptic magnetogram. The PFSS model terminates at 2.5 solar radii (1 solar radius is 695,700 km) and serves as input to what is called the Schatten Current Sheet (SCS). The Schatten Current Sheet is a model describing how the magnetic field propagates throughout the heliosphere in three dimensions, and it provides a more realistic magnetic field topology in the upper corona. The Schatten Current Sheet could in theory stretch out to infinity, but the WSA model used in this thesis reaches out to 21.5 solar radii. The boundaries are illustrated in Figure 3.

3.3.1 Output files

When the WSA simulation model is run, a number of FITS files are produced. The primary file is called the WSA output file, containing multiple nested layers of data. It contains the results of the simulation, in the form of arrays of coefficients that describe the magnetic field. It also contains layers with measurements and calculations used in the simulation. An example of such a layer is a map of the coronal magnetic field at 21.5 solar radii, which has the same resolution as the synoptic magnetogram.

Another output file is the velocity file. It contains the solar wind velocity derived through the model at the outer boundary sphere, 21.5 solar radii. The number of data points in the file corresponds to the number of field lines generated.

Lastly, there are three files that contain the field line data from four different tracings through the WSA model.

3.3.2 Availability of WSA output data

The simulation can be run from synoptic magnetograms dated from 2006 until now. The magnetogram data is captured every hour, but the cadence of the model run can be set to be sparser if so desired. For the thesis work, it is assumed that there will be data available to fetch from the server at least four times per day. Since magnetograms and field line data have the same cadence, they will have the same timestamps. Ultimately, the magnetograms and processed field line data will be served on the same server and endpoints. This way, the user will be able to fetch a complete set of space weather data for one timestamp.

3.4 Properties of the field line tracing data

The three field line files contain four different sets of tracing data, traced in different ways using the WSA model output data. The tracings are done as part of the model run at the CCMC and existed prior to the work of this thesis. The three field line tracing files are subject to data reduction and feature extraction, since the files are large and contain an abundance of field lines.

3.4.1 Field line tracing

The first tracing originates at the photospheric surface and goes out to 2.5 solar radii; this tracing is referred to as in-to-out. The second is traced every other degree from a theoretical boundary sphere at 21.5 solar radii in the Schatten Current Sheet, in through the PFSS model, ending at a footprint on the photospheric surface. This tracing is referred to as out-to-in. The out-to-in tracing is executed in two parts, one for the Schatten Current Sheet and one for the PFSS model. This division of model parts is illustrated in Figure 3.


Figure 3: WSA sphere boundaries and field line tracing directions.

The last tracing is similar to the out-to-in, since it is also traced from 21.5 solar radii down to the photospheric surface, but rather than starting from points every other degree on an outer sphere, the field lines are traced from a satellite position. This tracing is referred to as sub-satellite points. The position of the satellite is an estimate, and the data therefore contains three values for every longitude value, to depict the uncertainty of the measurements. Along the latitude it is traced every other degree, and what distinguishes this tracing is that it contains historically measured values, creating a historical track of field lines.

In summary, the traced field lines can be divided into four sets of data:

• The PFSS in-to-out set
• The PFSS out-to-in set
• The SCS out-to-in set
• The sub-satellite points set

The sub-satellite set is stored as extra data points within the two out-to-in files. There are three FITS files that together contain the four data sets.

As the tracings of the field lines are stored in the FITS file format, they must be contained in fixed N-dimensional arrays. Both the in-to-out and the out-to-in data are traced every other degree around the entire globe of the Sun, which results in 16200 field lines. The length of each field line may vary. Since the arrays in the FITS file format need a fixed size, the field lines are allowed a maximum length of 300 points; if a line is shorter than 300 points, padding values are put into the file to fill up the fixed 300-point length.


Each point on a field line consists of six values: the first three are the coordinates in three-dimensional space, and the three last values correspond to the magnetic field strength vector in spherical coordinates at that point in space.

When fetching the field lines over the internet, files are not stored in FITS format, but instead in the auxiliary OpenSpace format osfls, because FITS files cannot currently be used to display field lines in OpenSpace. The supported file formats for field lines are .osfls, .cdf, and .json. Furthermore, the FITS file of the traced field lines, with 300 data points per line and 64-bit precision, would be approximately 230 MB per time step (16200 lines · 300 points · 6 values · 8 bytes) for one complete data set. If the cadence of the WSA model is every four hours, one day of data would exceed 1 gigabyte of storage. Such an amount of data is unfeasible to fetch over the internet in real time; therefore a reduction of the data is necessary.
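The size estimate above can be checked with a few lines of arithmetic (the variable names are illustrative):

```python
lines = 16_200        # field lines per complete tracing set
points = 300          # fixed, padded length of each line
values = 6            # x, y, z plus three magnetic field components
bytes_per_value = 8   # 64-bit floating point

size_mb = lines * points * values * bytes_per_value / 1e6
print(f"{size_mb:.0f} MB per time step")

runs_per_day = 6      # a 4-hour model cadence
print(f"{size_mb * runs_per_day / 1e3:.1f} GB per day")
```

This confirms the roughly 230 MB per time step and more than a gigabyte per day quoted in the text.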


4 Data reduction and feature extraction

In order to prepare the data for downloading and visualization, data reduction and feature extraction are done. Feature extraction is a measure to reduce data and to enhance its perception. It is the idea of collecting and analyzing data to find out where the points of interest in the data are.

4.1 Reducing WSA field line data

There are two major operations to reduce the data contained in the field line FITS files. The first is reducing from 64-bit precision to 32-bit precision: for the use within OpenSpace during the transport phase, there is no reason to store a position with double precision, as the measurements in the magnetogram do not have such high precision. If higher precision is required during rendering, the data can be converted on the fly. The second is removing padding: since every field line has a fixed length of 300 points, removing all the padding values significantly reduces the amount of data. These two reductions result in approximately a fourth of the original file size.
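The two reductions can be sketched as follows. This is a minimal illustration, not the thesis code: the thesis does not state which padding value the FITS files use, so NaN is assumed here as a hypothetical sentinel.

```python
import numpy as np

PAD_IS_NAN = True  # assumption: padded points are filled with NaN


def reduce_field_line(line_f64: np.ndarray) -> np.ndarray:
    """Take a (300, 6) float64 field line, drop padded points,
    and downcast to 32-bit precision."""
    # a row is padding when all six of its values are the sentinel
    valid = ~np.isnan(line_f64).all(axis=1)
    # float32 alone halves the size; dropping padding shrinks it further
    return line_f64[valid].astype(np.float32)
```

For a line with, say, 100 real points out of 300, the two steps together leave roughly a sixth of the original bytes; averaged over a whole file, the thesis reports about a fourth.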

To reduce the data even further, it is necessary to examine which field lines are interesting to save, and which could be discarded. Keeping all 16200 field lines makes the visualization cluttered and not useful for exploration. Deciding which field lines to save may differ depending on what a user is interested in examining. In order to determine which features to keep and to extract, collaboration with end users and experts on the WSA model is required.

The out-to-in data set only contains field lines originating from coronal holes or open field regions, which implies that all its field lines are open. This is derived from the fact that the tracing starts at 21.5 solar radii, and only open field lines reach out that far. The PFSS in-to-out data set is traced starting from the photospheric surface, and contains field lines originating both from open regions and closed regions [1]. Combining and comparing these two data sets can be used to outline the open field regions, a feature that describes the general magnetic field structure. An example of this structure is shown in Figure 4.

The open field regions are those of weaker magnetic field strength, while field lines of strong magnetic field strength are located in the closed regions of the magnetic field. The strongest magnetic field lines usually originate from sunspots or active regions. Since these are located in the closed field regions, the active regions are extracted from the in-to-out data set. Combining field lines from active regions with field lines outlining coronal holes adds another layer to the general structure of the solar magnetic field; these are the two features that will be focused on.

4.2 Performing feature extraction and data reduction of field lines

Feature extraction and data reduction of the field lines are performed as a combined step, as the procedures are closely related to one another. Feature extraction is done partly in order to visualize interesting features and partly in order to reduce the amount of data. The procedures are done between reading the FITS field line tracing data after a WSA model run and outputting it in the auxiliary binary format osfls.

4.2.1 Determining polarity and open- or closedness

To determine polarity, and consequently give each field line point a value that represents a negative or positive field, the sign of the sixth coordinate in each point is extracted. This represents the direction of the magnetic field strength at that point, and thus the polarity.

Figure 4: The PFSS field line model. The blue lines are the last closed field lines, green and red lines are open field lines, and the colors depict different polarities. Image retrieved from iSWA.

In order to determine which field lines are close to the coronal hole boundaries, it has to be determined whether a field line is open or closed. This is done by comparing the signs of the magnetic field strength, i.e. the polarity, in the first and last points of a field line. If the sign changes, the field line is closed. If the sign is the same, the field line is open.
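The open/closed test can be expressed directly on the point arrays. The (N, 6) layout and the use of the sixth value (index 5) as the polarity carrier both follow the description in the text; the function name is illustrative.

```python
import numpy as np

def is_open(line: np.ndarray) -> bool:
    """line: (N, 6) array of field line points, padding already removed.
    The sixth value of each point carries the sign of the magnetic field,
    i.e. the polarity. A line is open when the polarity is the same at
    both footpoints, and closed when the sign changes."""
    return np.sign(line[0, 5]) == np.sign(line[-1, 5])
```

A closed loop leaves and re-enters the photosphere, so its two footpoints sit in regions of opposite polarity, which is exactly the sign change this test detects.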

4.2.2 Extracting field lines close to coronal hole boundaries

In the out-to-in set, all lines are open, since they are traced from 21.5 solar radii and inwards. To determine which field lines make up boundaries around coronal holes, the information from the open- and closedness extraction is used. In coronal holes, only open field lines propagate out, so the boundaries are where open and closed field lines meet. By comparing the position of the last point of each open field line with the first and last points of the set of closed field lines, the coronal hole boundaries can be found. The comparison of footpoints is illustrated in Figure 5.

The comparisons are made by iterating through all the closed field lines for every open field line. The lines are stored in lists. For every iteration, two comparisons are made: one between the seeding point of the closed field line and the footpoint of the open field line, and one between the endpoint of the closed field line and the footpoint of the open field line. Each comparison consists of calculating the distances between the polar and azimuthal angles of the two points. If the distances are below the threshold, the lines are classified as boundary lines.


Figure 5: The yellow lines represent closed field lines and the blue lines open field lines. If an open line's footpoint is a distance less than √2 degrees from a closed line's footpoint, it is classified as a boundary line.

The threshold for how close the coordinates need to be in order to be classified as boundary lines is set by the resolution of the seed points. The field lines are traced from the centers of the 2x2-degree grid cells that every other longitude and latitude coordinate make up. The threshold for being near a closed line is set to the length from the center to the corner of a grid cell, which is the square root of two. If the footpoint of the open line is within a distance of √2 degrees in longitude and latitude of either footpoint of a closed field line, the lines are classified as boundary lines.
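A concrete sketch of the footpoint comparison is given below. The names and the list-of-tuples layout are illustrative, not from the thesis, and this version requires both angular offsets to be within the threshold (one possible reading of the text); longitude wrap-around at 0/360 degrees is not handled.

```python
import math

THRESHOLD = math.sqrt(2)  # degrees: center-to-corner of a 2x2 degree seed cell

def is_boundary_line(open_footpoint, closed_lines):
    """open_footpoint: (lat, lon) of the open line's photospheric footpoint.
    closed_lines: iterable of (seed_point, end_point) pairs, each point a
    (lat, lon) tuple in degrees. Returns True if the open line lies on a
    coronal hole boundary, i.e. close to either footpoint of a closed line."""
    lat_o, lon_o = open_footpoint
    for seed, end in closed_lines:
        for lat_c, lon_c in (seed, end):
            # near a closed footpoint when both angular offsets are inside
            # the center-to-corner distance of the seeding grid cell
            if abs(lat_o - lat_c) <= THRESHOLD and abs(lon_o - lon_c) <= THRESHOLD:
                return True
    return False
```

As in the text, this is an all-pairs scan: every open line is tested against both footpoints of every closed line.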

4.2.3 Field line concentration based on field strength

In order to reduce the number of field lines in the in-to-out data set, field lines are selected based on magnetic field strength. Regions with higher activity are regions where the magnetic field strength is relatively high. In order to have a denser concentration of lines in these regions, a threshold is used to filter lines, in combination with a Monte Carlo method.

The threshold is the value two standard deviations above the mean. If that value is lower than 15 gauss, 15 gauss is used as the threshold instead. This was decided by trying different values with the algorithm on various data sets, from both solar maximum and solar minimum time periods, where the values differ greatly.

All the field lines are compared to the threshold, and the ones above it are kept. The value used in the comparison is the line's magnetic field strength at the photospheric surface, defined as the magnitude of the magnetic field vector at the first point of the line.

Saving only the field lines above the magnetic field strength threshold makes the visualization sparse, which can give the impression that there is no other activity on the photospheric surface. Therefore, the lines with magnetic field strengths lower than the threshold are used to create a probability distribution array. The array is populated by adding the index of a line as many times as the integer value of the field strength of that line. For instance, if a line has a field strength of 2.4 gauss, its index is added 2 times to the array.


A Monte Carlo approach is then used, where elements from the distribution array are selected at random a set number of times [18]. Once a line has been picked it cannot be picked again. The picked lines are more likely to originate from a strong magnetic field region, and are added to the final data set.
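The thresholding and Monte Carlo selection above can be sketched as follows. All names are illustrative, and the population standard deviation is an assumption (the text does not specify which estimator is used).

```python
import random
import statistics

def select_lines(field_strengths, n_extra, seed=None):
    """Keep lines whose photospheric field strength exceeds the
    threshold (mean + 2 standard deviations, but at least 15 gauss),
    then pick n_extra of the remaining lines at random, weighted by
    the integer part of their field strength. Returns indices into
    field_strengths."""
    mean = statistics.mean(field_strengths)
    std = statistics.pstdev(field_strengths)
    threshold = max(mean + 2.0 * std, 15.0)

    kept = [i for i, b in enumerate(field_strengths) if b > threshold]

    # Probability distribution array: each weak line's index is
    # repeated int(field strength) times, e.g. 2.4 gauss -> 2 copies.
    pool = []
    for i, b in enumerate(field_strengths):
        if b <= threshold:
            pool.extend([i] * int(b))

    rng = random.Random(seed)
    picked = set()
    while pool and len(picked) < n_extra:
        i = rng.choice(pool)
        picked.add(i)
        pool = [j for j in pool if j != i]  # a picked line cannot be picked again
    return kept + sorted(picked)
```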

4.2.4 Extracting the current sheet

The current sheet is the skirt-like sheet around the Sun where the magnetic field changes from positive to negative polarity. To identify the lines that are in this sheet of two layers, where the negative and positive lines meet, another data set from the WSA output data can be used. One of the layers of the WSA output file is the coronal magnetic field at the outer boundary of the domain, which can be seen in Figure 6a. In the figure, the x axis shows the indices of the longitude coordinate and the y axis the latitude coordinate (there is a value for every other integer coordinate of the sphere).

The layer is an image with one value per field line: the magnetic field strength at 21.5 solar radii, including the polarity of the field. The field lines correspond to the field lines in the out-to-in field line set. The lines where the polarity changes can be found using image operations with an image kernel. The kernel consists of three elements, arranged as in Figure 6b.

(a) Coronal field at outer boundary (b) Image kernel

Figure 6: Fig. (a) shows the magnetic field strength at 21.5 solar radii. The dark areas are negative and the light areas are positive magnetic flux strengths, measured in gauss. The image kernel in (b) is used for finding the current sheet boundary, the transition from negative to positive values.

The kernel is moved across the image, and a comparison is made at every step. If any of the elements covered by the kernel have different signs, those elements are saved. These elements represent the field lines in the current sheet.

Since the kernel spans two elements in both the horizontal and the vertical direction, the irregular case of a completely vertical sheet is also taken into consideration.
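The kernel pass can be sketched as follows: for every pixel, the sign is compared with the neighbour to the right and the neighbour below, matching the three-element kernel of Figure 6b. The function and argument names are illustrative.

```python
def current_sheet_mask(field, wrap_longitude=True):
    """Mark the pixels taking part in a sign change as belonging to the
    current sheet. `field` is a 2D list indexed [latitude][longitude]
    of signed field strengths at the outer boundary."""
    rows, cols = len(field), len(field[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Neighbour along longitude (wraps around the sphere).
            c2 = (c + 1) % cols if wrap_longitude else c + 1
            if c2 < cols and field[r][c] * field[r][c2] < 0:
                mask[r][c] = mask[r][c2] = True
            # Neighbour along latitude; catches a vertical sheet.
            if r + 1 < rows and field[r][c] * field[r + 1][c] < 0:
                mask[r][c] = mask[r + 1][c] = True
    return mask
```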

4.2.5 Data types and format

The final set of reduced field line data is saved in a binary file format, osfls, which has better I/O performance than structured file formats like FITS and JSON, as shown by Carlbaum and Novén [10]. The coordinates, which were originally spherical, are converted to Cartesian coordinates to match the existing field line rendering logic in OpenSpace. As aforementioned, single precision is used instead of double precision floating point format. The describing properties, such as polarity and open- and closedness, are saved as an array of extra values for each point in each field line, also with single precision.

The reason why single precision is used even for binary properties is the design of the field line rendering module in OpenSpace. The module only accepts float types for field line point properties, in order to enable filtering based on values and, most importantly, coloring lines based on floating point values. The same reason explains why the property value is added to each point, as opposed to each line: to be able to color the lines vertex-wise.

If a field line set has 2000 lines, and those lines have 300000 points in total, it will have 300000 · 3 = 900000 values that describe its positions. If that set has two describing properties, it has 300000 · 2 = 600000 extra values, and 1500000 values in total. In addition, each file has some metadata, such as field line start indices and the names of the extra properties.
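The value count above follows directly from the layout; a minimal sketch (metadata not counted):

```python
def osfls_value_count(n_points, n_extra_properties):
    """Total number of float values stored for one field line set:
    three position components per point, plus one value per point for
    each extra property."""
    return n_points * 3 + n_points * n_extra_properties
```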

4.3 Using the synoptic magnetogram as a texture

In concert with field lines, magnetograms are used to visualize the space weather data. When using the synoptic magnetogram as a texture, the values are enhanced to create a better visualization.

4.3.1 Offsetting textures values for leading edge longitude

Many types of synoptic magnetograms start (have their leading edge) where the most recent image of the Sun was captured. For the type used in this thesis, the first 120 degrees from the left originate from the most recent image, with the central meridian 60 degrees in, which is the Earth's view of the Sun's meridian at that time.

In OpenSpace, the point where a texture starts to map to a renderable sphere object is at longitude zero of the sphere. Furthermore, the rotation of the sphere to correspond to that of the Sun is handled by transformation matrices. In other words, in OpenSpace the rotation of the Sun should not be handled by rotating its texture; the longitude of the texture at the left edge should always be zero, to match the longitude on the sphere. Information about which longitude is at the leading edge of the synoptic magnetogram is available as metadata in the header of the FITS file. This is extracted and used to shift all the image values.
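The shift can be sketched as a circular rotation of each magnetogram row. The shift direction, the 2 degrees per pixel resolution, and the way the leading-edge longitude is obtained from the FITS header are assumptions for illustration.

```python
def shift_to_zero_longitude(row, leading_edge_deg, deg_per_pixel=2.0):
    """Rotate one magnetogram row so that longitude zero ends up at the
    left edge. `leading_edge_deg` would come from the FITS header (the
    exact keyword depends on the magnetogram product)."""
    shift = int(round(leading_edge_deg / deg_per_pixel)) % len(row)
    return row[shift:] + row[:shift]
```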

4.3.2 Normalizing and enhancing values

The values in the synoptic magnetogram are in the unit gauss, with an average magnitude of 1 gauss, and do not have any set lower or upper limit. The magnitude of the values in a low activity region can range from close to zero gauss up to around 10 gauss, and the values in high activity regions are in the hundreds, or in some cases thousands [11], depending on the current state of the Sun. Furthermore, the values are both negative and positive.

The values are normalized to the range [0, 1]. To make it easier to spot different intensities in the texture, the values are then modified.


The new intensity value is set using a logarithmic function, Equation 1, in order to enhance low values. In the equation, d is a damper value and m is a multiplier applied to the old intensity value i.

i′ = d · log10(1 + m · i)    (1)

The default values for the damper and the multiplier are 1.0 and 20.0. With these values, the output saturates at an input value of around 0.45, as can be seen in Figure 7. Such values most probably belong to an active region, since, as mentioned earlier, the average is 1 gauss and the maximum value is multiple orders of magnitude larger.

Figure 7: Logarithmic function used to calculate the new intensity value. The x-axis shows the original value and the y-axis the new value.
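Equation 1 with the default parameters can be sketched as follows; clamping the output to 1 is an assumption, made because the text describes the output as saturating.

```python
import math

def enhance(intensity, damper=1.0, multiplier=20.0):
    """Equation 1: boost low normalized magnetogram values with a
    logarithmic curve, clamped to [0, 1]."""
    return min(damper * math.log10(1.0 + multiplier * intensity), 1.0)
```

With d = 1.0 and m = 20.0 the output reaches 1 at i = 0.45, since log10(1 + 20 · 0.45) = log10(10) = 1, matching the saturation point mentioned above.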

To set the color of the texture using these intensity values, the three color channels are configured. Depending on the desired background color, that is, the color for values around 0 gauss or low activity regions, the channels are set differently. In Table 1, the different color channel settings are listed, where i is the calculated intensity value.

Color scheme: Black background, blue color for positive active regions and red for negative active regions
  Positive: R = 0, G = 0, B = i
  Negative: R = i, G = 0, B = 0

Color scheme: White background, blue color for positive active regions and red for negative active regions
  Positive: R = 1 - i, G = 1 - i, B = 1
  Negative: R = 1, G = 1 - i, B = 1 - i

Color scheme: Gray background, white color for positive active regions and black for negative active regions
  Positive: R = G = B = i · 0.5 + 0.5
  Negative: R = G = B = 1 - (i · 0.5 + 0.5)

Table 1: Color schemes for magnetograms, using the color intensity value i
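The channel settings of Table 1 can be sketched as a single mapping function. The scheme labels are illustrative, and the positive/negative assignment of the gray scheme follows the caption (white for positive, black for negative).

```python
def magnetogram_color(i, positive, scheme="gray"):
    """Map an enhanced intensity i in [0, 1] and the sign of the field
    value to an (R, G, B) tuple, per the three color schemes."""
    if scheme == "black":
        return (0.0, 0.0, i) if positive else (i, 0.0, 0.0)
    if scheme == "white":
        return (1.0 - i, 1.0 - i, 1.0) if positive else (1.0, 1.0 - i, 1.0 - i)
    # Gray background: white for positive, black for negative regions.
    g = i * 0.5 + 0.5
    return (g, g, g) if positive else (1.0 - g, 1.0 - g, 1.0 - g)
```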


5 User driven data requests

The magnetogram and field line files need to be downloaded during the runtime of OpenSpace, but even after the number of field lines has been reduced, the files are still too large to download in a matter of milliseconds over a high-speed internet connection. This introduces a problem in a real-time application.

5.1 OpenSpace and server communication

OpenSpace has a virtual clock that by default progresses at the pace of one second per second. The pace can be changed by the user. When launching OpenSpace, the virtual clock is set to the current date and time minus one day, mainly because some map services will not have all maps ready in real time. When downloading field lines during runtime, the downloader could send a request to the API for a file corresponding to the current OpenSpace time. However, files typically have a cadence of two hours, and the precision of the timestamp is down to sub-milliseconds. Therefore, a file requested with the exact current OpenSpace time is unlikely to exist on the server; even if there is a file approximately corresponding to the OpenSpace time, it is likely to be missed because of the sub-millisecond precision.

The server that provides the files therefore needs more functionality than merely serving them. One option is for the server to provide the file that is closest in time to what the user is requesting. Another is to provide the client with a list of available files and their corresponding timestamps, so that the client can decide which file to download. Either the client or the server has to manage what to download. The latter would also imply that every client needs to send information about its current state for the server to make a proper decision. For example, if a client is browsing in OpenSpace between the timestamps of two files, the server would need to know in which direction the client is progressing time to be able to send the more suitable file of the two.
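The first option, letting the server resolve a request to the file closest in time, can be sketched as follows; the timestamp representation and names are illustrative.

```python
import bisect

def closest_file(timestamps, urls, t):
    """Given a sorted list of file timestamps (e.g. seconds since epoch)
    and the corresponding URLs, return the URL of the file closest in
    time to the requested time t."""
    k = bisect.bisect_left(timestamps, t)
    if k == 0:
        return urls[0]
    if k == len(timestamps):
        return urls[-1]
    before, after = timestamps[k - 1], timestamps[k]
    return urls[k] if after - t < t - before else urls[k - 1]
```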

Assuming a way of downloading the correct file is in place, the next problem is knowing when to download it. A user wants the correct data to be ready at the OpenSpace time when that data was recorded. While the goal is obvious, the problem does not have a trivial solution. Factors like limiting the number of requests and not reducing the performance of OpenSpace have to be taken into consideration.

With inspiration from how video streaming works, a possible solution to this problem is proposed. Streaming video differs from sending large chunks of data sets, but the general problem can be framed in a similar way. When streaming video over the internet, a user wants to see the video right away, not wait for the entire video to be downloaded first. To achieve this, the video is divided into smaller pieces that are sent to the client and can be viewed separately. This concept is suitable to adapt for sending field line data and images from server to client.

5.2 Sliding windows

Sliding windows is a concept used in different fields for batch processing. The general mechanism of a sliding window technique is that a fixed-size window iterates over a larger array in order to extract a subset of data to be processed. This principle is adapted in this thesis. The larger array in this case is a list of available files to download, received from the server. It is suitable to download the available files in chunks, starting with the chunk most relevant for the user and then proceeding to buffer a few timestamps forward, to have data ready ahead of time. The sliding window, or chunk, is in this report denoted the small window, and the large array is denoted the big window.

5.3 Updating active data set based on time

Previous work in OpenSpace introduced a method to dynamically swap the active data set shown in OpenSpace, based on the timestamps connected to the data [10]. This method is based on extracting the timestamps of every data set available on disk, saving them in a sorted list, and continuously checking whether the currently active timestamp is the right one compared to the OpenSpace time. As soon as it is not, the method checks which timestamp in the list corresponds to the client's OpenSpace time. By building the sliding windows on top of this already functioning system, and adding additional checks to the existing continuous checking structure, a life cycle foundation for the sliding windows is in place.

5.4 Implementing user-driven data requests

In order to apply a sliding window technique, several parts are needed to create a full pipeline for downloading and managing the data. The parts include a test server and a window-worker system.

5.4.1 Local server for delivery of data

The use of a local server, which may be customized and experimented with to meet a desired requirement, is essential for developing a functioning sliding windows system. This is mainly because the precise behaviour might be unknown at an early stage of development, so being able to progressively fine-tune the logic of the API is necessary. At an early stage it is also important to work with dummy data, to reduce the risk of deleting files of importance. During the development of this project, a NodeJS server was set up with several endpoints serving a RESTful API.

To implement a sliding windows technique, two main endpoints are required. The first provides a list of available files in a time range specified in the client's request. The second sends a file specified by the client, if that file exists. The two endpoints interact in such a way that the list sent back from the first endpoint contains complete links to the second endpoint, which is responsible for the actual delivery of files.
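The core logic of the first endpoint can be sketched as follows, independently of the server framework. The URL layout, names, and timestamp representation are illustrative; the actual server was written in NodeJS.

```python
def list_available(files, base_url, start, end):
    """Filter the files on disk to the requested time range and return
    sorted (timestamp, download URL) pairs, where each URL points at
    the file-delivery endpoint. `files` maps timestamps to file names."""
    return sorted(
        (t, f"{base_url}/files/{name}")
        for t, name in files.items()
        if start <= t <= end
    )
```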

5.4.2 Two windows: one big and one small

In this work, the concept of sliding windows includes two windows: one big window and a smaller one that slides over the big window. The size of these windows refers to the amount of data they hold. The big window has two purposes in this system: to receive a list of available files from the server, and to continuously check whether the list it holds is still relevant to the OpenSpace time.

The big window saves the response from the server as pairs in a container, containing the trigger time of the data and the URL to the file of the corresponding timestamp. The data structure of this container has to meet some requirements. It has to be able to vary in size, as the server may respond with lists of different sizes. It must allow access to both the first and the last element in constant time, because every render update loop the big window checks whether it is still relevant, by comparing the OpenSpace time against the timestamps of the first and last available files. If it is not, it requests a new list of files from the server.

The list that the big window holds has to be sorted; if the response from the server is not sorted, it must be sorted when inserted into the container. Lastly, the big window container has to support iteration through the list, since the smaller window has to be able to extract content from the big window. An example of a big window is illustrated in Figure 8.

Figure 8: The big window holds a list of available file URLs, and their corresponding timestamps
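The container requirements above can be met, for example, with a double-ended queue; the sketch below is illustrative (the actual implementation lives in OpenSpace's C++ code base).

```python
from collections import deque

class BigWindow:
    """A deque of (timestamp, url) pairs: O(1) access to the first and
    last element, supports iteration, and can vary in size with each
    server response."""

    def __init__(self):
        self.entries = deque()

    def replace(self, pairs):
        # The server response may be unsorted; sort on insertion.
        self.entries = deque(sorted(pairs))

    def is_relevant(self, openspace_time):
        """Checked every render update: is the current time still inside
        the window? If not, a new file list should be requested."""
        if not self.entries:
            return False
        return self.entries[0][0] <= openspace_time <= self.entries[-1][0]
```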

The small window's purpose is to slide over the big window and extract a smaller frame of data sets that is currently the most suitable, based on the time in OpenSpace. This smaller frame of data is later what is downloaded by the worker. An example of the small window extracting pairs from the big window is illustrated in Figure 9.

The small window has a fixed maximum size, so that the big window can be aware of when the small window might try to read values that are out of bounds of the big window. In those cases the big window needs to request a new list of files.


Figure 9: A small window sliding over the large window, extracting timestamps and corresponding download URLs to deliver to the worker for download
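The extraction sketched in Figure 9 can be expressed as follows; names and the out-of-bounds convention are illustrative.

```python
def extract_small_window(big, current_time, max_size=5):
    """Take at most `max_size` entries of the sorted big window,
    centered on the entry closest to the current OpenSpace time.
    `big` is a list of (timestamp, url) pairs. Returns
    (window, out_of_bounds), where out_of_bounds signals that the big
    window should request a new file list from the server."""
    center = min(range(len(big)), key=lambda k: abs(big[k][0] - current_time))
    half = max_size // 2
    lo, hi = center - half, center + half + 1
    out_of_bounds = lo < 0 or hi > len(big)
    return big[max(lo, 0):min(hi, len(big))], out_of_bounds
```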


The two windows are in place to have a robust way of making sure that the client always refers to the correct files, while minimizing the number of requests to the server during runtime. The actual execution of downloading, processing, and updating data is carried out by the worker.

5.4.3 The life of a worker

Internet requests in the execution pipeline are asynchronous, which means that a request that is sent will not get its response within the same rendering loop. Therefore, every rendering loop has to check whether the asynchronous call has received a response. When a response is received, the worker thread it was executed on is joined.

When the worker is given a new small window to download, it expects that the small window is centered around the current OpenSpace time, meaning that the timestamp and file in the middle of the window are the ones currently most relevant for the user to see. With this assumption, the worker starts by downloading that file, first checking whether it already exists. The worker keeps track of everything it has downloaded during a session, in order to not download the same files again.

After the worker has downloaded the most relevant file, it proceeds to download the files forward from the center of the small window, based on the assumption that the default behaviour is moving forward in time. Once all the files ahead of the OpenSpace time have been downloaded, it proceeds with downloading all the files before the center of the small window. An illustration of the order in which the worker downloads files from the small window is shown in Figure 10.


Figure 10: The order in which the worker picks a file to download from the small window
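The download order of Figure 10 can be sketched as follows (names are illustrative):

```python
def download_order(window):
    """Order in which the worker fetches the files of a small window:
    the center file first, then all files after it in chronological
    order, then the files before it, closest first. `window` is a list
    of (timestamp, url) pairs sorted by timestamp."""
    center = len(window) // 2
    order = [window[center]]                  # most relevant file first
    order += window[center + 1:]              # forward in time
    order += list(reversed(window[:center]))  # then backwards, closest first
    return order
```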

There are two instances when the worker emits notifications that new data is ready: once when the first, most relevant data set has finished downloading, and once when the entire small window has been downloaded. This structure allows the most relevant data to be seen quickly, while less relevant data is downloaded in the background. An illustration of the complete pipeline for the client-server communication using sliding windows is shown in Figure 11.


Figure 11: The complete client-server pipeline using sliding windows. 1. The big window requests a list of the files available in a specific time period from the server. 2. The small window extracts a number of files surrounding the current OpenSpace time. 3. The worker picks up a file from the small window, to check if it already exists locally. 4. The worker requests a specific file from the server, and marks it as downloaded.


6 Results

To measure the performance impact of the addition of sliding windows to OpenSpace, a series of benchmarks has been carried out.

6.1 User-driven data requests

The implementation of the sliding windows method resulted in a client-server system enabling data sets to be downloaded during runtime while the user browses around in OpenSpace. To measure the impact of this system on the user experience, the stability of the frame time during runtime is used as a metric. Benchmarks have been carried out to see how the implemented system performs compared to running field lines from local data sets, and compared to running completely without field lines.

Throughout all benchmarks, the settings have been kept the same, except for how the field lines are added to the scene. The camera is fixed at 25 solar radii from the Sun, the scene is loaded with all the planets in the solar system and with digital stars in the background, and all four field line data sets are loaded into the scene: PFSS in-to-out, PFSS out-to-in, SCS out-to-in, and the sub-satellite connection. The OpenSpace time has been set to 10 September 2017, 04:00 for all benchmarks, and the resolution has been set to a default of 1280 × 720. Magnetograms are not loaded; since they are available at the same cadence and are considerably smaller in size, similar results can be expected.

There are two sets of benchmarks to test the performance. Figure 12 shows the results from the first benchmark, a three minute long run in OpenSpace with field lines. Figure 13 shows the results from the second benchmark, a three minute run with an advancement forward in OpenSpace time by one day at the 60 second mark.


Figure 12: Line plot of OpenSpace's frame time in milliseconds, during a 180-second run. The plot contains data from three separate runs.



Figure 13: Line plot of OpenSpace's frame time in milliseconds, during a 180-second run with an advancement in OpenSpace time. The plot contains data from three separate runs.

The runs without field lines serve as a reference for OpenSpace performance in general on the specific machine. The results of the first run show peaks in frame time at the 25, 60 and 90 second marks for the runtime downloaded field line files. The second run shows a more consistent frame time for both local and runtime downloaded field lines, but a frame time increase during the advancement in time at the 60 second mark for all three settings. There is a minor difference in frame times between running field lines from local files and downloading them during runtime. The average frame times for all runs are shown in Table 2.

Run Type                        | 180 seconds | 180 seconds with time progression
No field lines                  | 4.78 ms     | 4.80 ms
Local field lines               | 5.24 ms     | 5.25 ms
Runtime downloaded field lines  | 5.30 ms     | 5.32 ms

Table 2: Average frame time for each run and setting

6.2 Synoptic magnetogram as a texture

Using the different color schemes presented in Table 1, the following results were achieved. The black and white color schemes can be seen in Figure 14 and the grayscale scheme in Figure 15. Images of the texture with and without value enhancement, using the logarithmic function presented in section 4.3.2, are shown in Figure 15. The images are from the same timestep.


Figure 14: Left shows black-red-blue and right white-red-blue color schemes.

Figure 15: Gray color scheme. Left is with enhanced values and right without.

6.3 Field lines feature extraction

The in-to-out data set, on which data reduction in the form of field strength based selection was made, is seen in Figure 16. In the figure, the lines are colored based on polarity. Since the selection of lines is based on magnetic field strength, there is a higher concentration of lines where there are active regions on the magnetogram.


Figure 16: PFSS in-to-out field lines. The concentration of field lines is denser where the magnetic field strength is higher.

On the out-to-in data set, the data reduction and selection are based on whether a line is close to a coronal hole boundary. Besides those lines, a set of random sparse lines is added to the data set to fill out the non-boundary areas. This data set can be seen in Figure 17a. In the figure, the lines are colored based on polarity, which was also extracted. In Figure 17b, the second part of the out-to-in set is displayed, showing the lines that go out from 2.5 solar radii to 21.5 solar radii.

(a) PFSS out-to-in field lines. (b) SCS out-to-in field lines.

Figure 17: Coronal hole outline based selection is used. Such an outline is visible in the image (a). The lines in (b) are directly connected to the lines in (a). The PFSS set is not present in (b), creating


Feature extraction based on membership of the current sheet, described in section 4.2.4, was also made on the out-to-in data set. The results can be seen in Figure 18. Displayed in Figure 19 is the sub-satellite track, where the satellite is the Earth. The colors are based on polarity in both figures.

Figure 18: The current sheet, a subset of the out-to-in data sets. The current sheet is where negative and positive charge meet.

Figure 19: The sub-satellite track, where the satellite is the Earth. The lines represent where the Earth is connected to the Sun, stepwise from the current timestep and one revolution back in time.


6.4 Data reduction

Table 3 shows the file sizes of the different field line data sets before and after feature extraction and data reduction. The files are for one timestamp, from September 2017. The number of points differs between data sets, since the coronal boundary line picking and the field strength based picking result in a different number of lines every time. The total size has not been recorded to exceed 25 MB.

Data set                                  | Before reduction | After reduction | Reduction factor (1/x)
PFSS in-to-out                            | 241.1 MB         | 125 KB          | 1 928.8
PFSS out-to-in                            | 241.1 MB         | 7.2 MB          | 33.5
SCS out-to-in                             | 241.1 MB         | 9.5 MB          | 25.4
Sub-satellite (extracted from out-to-in)  | -                | 4 MB            | -
Total size                                | 723.3 MB         | 21 MB           | 34.4

Table 3: File sizes before and after feature extraction and data reduction


7 Discussion

The addition of the sliding windows pipeline has affected the performance of OpenSpace compared to running local field lines. The methods used to benchmark the performance may have to be revised to give better confidence in the results, and to discover circumstances where the sliding pipeline might even show better performance than local field lines.

7.1 Performance of client server pipeline

The results from the benchmarks indicate that downloading and providing field line data during runtime does affect the performance of OpenSpace compared to having local data. This is a reasonable result, considering that the sliding windows system implements mechanisms on top of running local field lines, continuously checking whether new data should be requested from the server. However, the average frame times during the three minute long runs with local data sets are 5.24 and 5.25 milliseconds, while the averages for runtime downloaded data sets in the same runs are 5.30 and 5.32 milliseconds, respectively. The difference corresponds to less than three frames per second for both runs, which is about a 1.3% increase in average frame time. The difference should be considered minor, and would likely not be noticed by a user of OpenSpace.

During the first run, there are three major frame time peaks, at the 25, 60 and 90 second marks. The peaks might have occurred when a window of field lines was completely downloaded and the worker signaled that there were new field lines to display, causing the field line rendering loop to add those files to the list of field lines ready to be displayed. The same phenomenon should have occurred during the second run, but it did not. Either the worker never finished downloading a complete window during the second run, or the frame time peaks during the first run were caused by something other than the client-server pipeline.

7.2 Benchmarking in OpenSpace

The goal when measuring performance in these benchmarks is to keep every run exactly the same, except for the variable of how field lines are provided. This goal presents a challenge in the environment of OpenSpace, where a large number of auxiliary background objects and processes are active in the scene and may affect the performance of the benchmarks. This is the reason for starting the benchmarks at exactly the same time in OpenSpace and, once started, not modifying the view at all during the run.

This problem eliminates the possibility of more advanced benchmarks that might produce a different result. An instance where OpenSpace might benefit from runtime downloaded data sets is when a large number of local data sets is in place and all data has to be held in memory by OpenSpace. Downloading the data as the user browses in OpenSpace would instead result in low memory consumption at the start, slowly accumulating data up until the point of holding the same amount as in the local data set run. This type of benchmark would be hard to execute in OpenSpace, where the navigations and actions would all have to be manual, and executing manual actions in a longer run imposes the risk of not being able to maintain consistency throughout all the runs. Having a consistent benchmarking suite would improve the credibility of the benchmarks.

The second benchmark was a reach for a more advanced test, by progressing time during the run. Doing this clearly affects the frame time for all three settings at the point where the progression in time was made. This exhibits the instability of the frame time when performing navigations in OpenSpace, which is not caused by the variable that was to be measured.

7.3 Magnetograms as textures

Through continuous collaboration with users, it was concluded that the grayscale version is the most useful, as the black one blends into the dark background and the white version causes the visibility of the field lines to deteriorate, due to an additive blending feature.

The texture before modifications is relatively monotone compared to the enhanced version of the same timestep, presented in Figure 15. In the enhanced version it is easier to spot areas with higher activity. However, modifying the values makes the texture scientifically incorrect. Hence it is reasonable to leave the enhancement as an option for the user to switch on.

Another reason to leave the enhancement feature as an option is that if there is very low activity on the Sun, such that there are no outlying values, the texture can appear noisy and distorted. This was discovered after the product had been tested on more data.

7.4 Field lines feature extraction

As the feature extraction procedure has been refined with the help and feedback of the potential users of the product at CCMC and NASA, the product in its current state is satisfactory for scientific exploration. However, the associates at NASA have expressed that the ability to slightly modify the way field lines and features are extracted would result in even higher usability, as the features extracted in this work might not be the only ones a user is interested in seeing and examining.

One of the goals of this work has been to reach a state where OpenSpace, combined with the client-server delivery of the WSA data, is a useful tool for scientific exploration. To achieve this, opinions and feedback from peers at NASA have heavily influenced the progression of the work. While this has resulted in a more desirable product, it has not been possible to fulfill all requests, such as letting the user tweak the feature extraction process. Satisfying such requests would require building an interface to the server at NASA, where the user could customize the input to the preprocessing script that converts FITS to osfls. It would also imply storing the data in its non-reduced raw FITS format, since the feature extraction and data reduction operate on the raw files, and storing that amount of non-reduced data would not be feasible on the CCMC servers.

There is a fundamental tension in trying to satisfy two groups of users. One group is the general public, with little to no knowledge about heliophysics and space weather; for them, an automated script that simplifies and emphasizes prominent features is an excellent way to get a quick impression of what solar activity looks like. The other group is experts and scientists trying to explore and understand every part of the data; for them, the ability to customize and tailor the extraction to their needs takes priority over general impressions. A customizable system tailored to the scientists' needs would be barely usable for the general public: without knowledge of the data, they would find it difficult to set parameters that produce any output at all.


7.5 Polar pinching

In the feature extraction process, when finding the field lines close to the coronal hole boundaries, the method used for comparing distances can be improved. When polar and azimuthal angles are compared directly, a side effect called polar pinching can appear, since a fixed difference in azimuthal angle corresponds to a smaller distance near the poles. Field lines near the poles tend to be open, and no artifacts of polar pinching have been observed so far. The comparison can be made robust by using the great-circle, or orthodromic, distance instead.
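The improvement suggested above can be sketched as follows. This is an illustrative comparison, not code from the thesis: the naive metric treats angle differences as a flat 2D distance, while the great-circle version uses the standard haversine formula.

```python
import math

def angular_separation_naive(lat1, lon1, lat2, lon2):
    # Naive comparison: treat differences in polar (latitude) and
    # azimuthal (longitude) angle as a flat 2D distance. Near the
    # poles a fixed longitude difference corresponds to a much
    # smaller physical distance, causing "polar pinching".
    return math.hypot(lat2 - lat1, lon2 - lon1)

def angular_separation_great_circle(lat1, lon1, lat2, lon2):
    # Great-circle (orthodromic) central angle via the haversine
    # formula; all angles in radians, result in radians.
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + \
        math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * math.asin(math.sqrt(a))

# Two points near the pole, 90 degrees apart in longitude:
# the naive metric grossly overestimates their separation.
lat = math.radians(89.0)
naive = angular_separation_naive(lat, 0.0, lat, math.pi / 2)
great = angular_separation_great_circle(lat, 0.0, lat, math.pi / 2)
print(naive, great)
```

At 89 degrees latitude the naive separation is roughly 60 times larger than the true great-circle angle, which is exactly the distortion that could misclassify field lines near coronal hole boundaries at the poles.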

7.6 Data reduction

Data reduction has been closely related to the feature extraction work, since the idea of feature extraction is to save only what is important from a data set. The goal has been to reduce the data to a size that is manageable to download during runtime, which is a vague specification: it has been difficult to define a maximum acceptable file size, since transfer speeds depend heavily on the internet connection.

The focus has rather been on how much can be discarded from the data while still keeping everything worth keeping. The data sets hold different amounts of data worth keeping, with the in-to-out data set being by far the smallest, since the most interesting features to view in this set are the strong field lines. During periods of low solar activity the number of strong field lines is very low, and even in more active periods strong field lines are not common. In contrast, the out-to-in data set for the Schatten Current Sheet holds information about the coronal boundaries and the general structure of the outer parts of the field lines. This set is the largest, partly because it has the longest field lines.

The primary measurement for the data reduction is the total size of all combined data sets. As presented in the results, the raw FITS files for one time step of field lines amount to about 720 MB, while the reduction and extraction work brought the combined data sets down to 20-25 MB, only about 3-3.5 % of the original size. No strict maximum file size was set at the start of the project, but one restriction that emerged during development was the storage capacity of the CCMC servers. When this result was presented, the reduction to roughly 3 % of the original file size was deemed acceptable and no further reduction was made.

It was theorized early on that another representation of the field lines could further reduce the file sizes, as the current file format stores raw float values as 3D points in space in the OpenGL vec3 format. Considering that the change along one field line is small, the lines could in theory be approximated by a function rather than stored as raw values. This would allow OpenSpace to extract points where desired and could provide a dynamic resolution: high resolution at points of interest and low resolution in areas of less interest. Although discussed, this idea was abandoned due to lack of time, and it would also have required implementing an additional feature for reading field line data in OpenSpace.
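The discontinued idea can be sketched roughly as follows. This is a hypothetical minimal version, not a design from the thesis: a subset of a field line's vertices acts as control points, and the client re-interpolates the line at whatever resolution it needs. A production version would more likely fit splines and choose control points adaptively.

```python
import numpy as np

def decimate_field_line(points, keep_every=4):
    """Keep every k-th vertex (plus the endpoint) of a polyline.

    Hypothetical sketch: since a field line changes slowly along its
    length, a subset of its vec3 vertices can serve as control
    points from which the full line is reconstructed.
    """
    idx = np.arange(0, len(points), keep_every)
    if idx[-1] != len(points) - 1:
        idx = np.append(idx, len(points) - 1)
    return points[idx]

def reconstruct(control, samples_per_segment=4):
    """Linearly re-interpolate a polyline from its control points."""
    out = []
    for a, b in zip(control[:-1], control[1:]):
        t = np.linspace(0.0, 1.0, samples_per_segment, endpoint=False)
        out.append(a + t[:, None] * (b - a))
    out.append(control[-1:])
    return np.vstack(out)

# A smooth helical "field line" of 201 3D points.
s = np.linspace(0, 4 * np.pi, 201)
line = np.stack([np.cos(s), np.sin(s), 0.1 * s], axis=1)
control = decimate_field_line(line, keep_every=4)
approx = reconstruct(control)
print(len(control), "control points instead of", len(line))
```

Keeping every fourth vertex cuts the stored points to about a quarter while the reconstructed helix stays within a small deviation of the original, illustrating why a functional representation could trade storage for client-side resolution.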

7.7 Future work

The result of the feature extraction and the data delivery pipeline shows that OpenSpace has reached a level of usability comparable to the existing GEOSpace software. The main differences are OpenSpace's feature extraction, OpenSpace's dynamic fetching of data, and GEOSpace's possibility of selecting field lines to view their data. Although the latter is a feature OpenSpace lacks, the data delivery pipeline and feature extraction have made exploring the WSA model more productive than in GEOSpace.

The missing ability to select a field line and extract data about it is a deficiency, and it is the only piece missing for the users at the Heliophysics department to completely abandon the GEOSpace software. When this came to attention, it was explored whether such a feature would be feasible to implement within the time frame of the project. It was ruled out, since no similar system existed in OpenSpace, and it would therefore require building a completely new addition to the software.

An improvement that would give a better user experience is visual feedback in the GUI indicating that files are being downloaded. In the current state of the software there is no indicator that a download is in progress, which can cause confusion if a data set from a previous time step is still showing. Consider the case where the user jumps to another point in OpenSpace time: the old field line set is still showing, but the magnetogram has updated. There is then a dissonance between the data sets, with no indication that the visible field line set is incorrect. The same applies if data is missing between two already downloaded, distant points in time: a data set will be visible but will not correspond to that particular time point. A solution would be visual feedback in the GUI showing the timestamp of the currently displayed data sets.
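The suggested feedback could be driven by a simple state check such as the sketch below. This is a hypothetical illustration, not OpenSpace code: the function name, the three states, and the one-hour tolerance are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

def dataset_status(current_time, dataset_time,
                   tolerance=timedelta(hours=1)):
    """Classify whether the shown data set matches OpenSpace time.

    Hypothetical sketch: compare the timestamp of the field line set
    on screen with the current simulation time and return a state the
    GUI could display, instead of silently showing a stale set.
    """
    if dataset_time is None:
        return "downloading"      # nothing loaded for this timepoint yet
    if abs(current_time - dataset_time) > tolerance:
        return "stale"            # visible set does not match this timepoint
    return "current"

now = datetime(2019, 7, 1, 12, 0)
print(dataset_status(now, datetime(2019, 7, 1, 11, 40)))  # matches
print(dataset_status(now, datetime(2019, 6, 28, 0, 0)))   # old data showing
```

Exposing this state, together with the timestamp of the displayed set, would resolve the dissonance described above for both the mid-download case and the missing-data case.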
