
Engineering Degree Project

Snow depth measurements and predictions
– Reducing environmental impact for artificial grass pitches at snowfall

Authors: Lars Petter Ulvatne, Findlay Forsblom
Supervisors: Fredrik Ahlgren, Martin Tiinus
Examiner: Jonas Lundberg
Semester: VT 2020


Abstract

Rubber granulates, used at artificial grass pitches, pose a threat to the environment when leaking into nature. As the granulates leak into the environment through rain water and snow clearances, they can be transported by rivers and later end up in marine life. Reducing snow clearances to a minimum is therefore important. If the snow clearance problem is minimized or even eliminated, this will have a positive impact on the surrounding nature. The objective of this project is to propose a method for deciding when to remove snow and to automate the dispersal of information upon clearing or closing a pitch. This includes finding low-powered sensors to measure snow depth, finding a machine learning model to predict upcoming snow levels and creating an application with a clear and easy-to-use interface to present weather information and disperse information to the responsible persons. Controlled experiments are used to find the models and sensors that are suitable for solving this problem. The sensors are tested on a single snow quality, where the ultrasonic and infrared sensors are found suitable. However, fabricated tests for newly fallen snow questioned the possibility of measuring snow depth with the ultrasonic sensor in the general case.

Random Forest is presented as the machine learning model that predicts future snow levels with the highest accuracy. From a survey, there are indications that the web application fulfills the intended functionalities, with some improvements suggested.

Keywords: artificial grass, rubber granulate pollution, snow depth measurement, snow level prediction, temperature index, energy index, energy balance model, machine learning, Python, Random Forest, ultrasonic sensor, infrared sensor, LoRaWAN, Pycom LoPy4, MicroPython, Arduino Uno, web application, JavaScript, Node.js


Preface

We would like to thank Fredrik Ahlgren at Linnaeus University for his continuous effort in evolving the project through his feedback on methods and report writing. We would also like to thank Martin Tiinus and Fredrik Alserin at dizparc and the Växjö county staff for assigning us this project. Lastly, we would like to thank the Swedish Meteorological and Hydrological Institute (SMHI) for providing us with the datasets needed and for their fast and informative responses.


Contents

1 Introduction 1

1.1 Background . . . 1

1.2 Related work . . . 2

1.2.1 Devices for measuring snow . . . 2

1.2.2 Snow level prediction with machine learning . . . 3

1.3 Problem formulation . . . 3

1.4 Motivation . . . 4

1.5 Objectives . . . 4

1.6 Scope/Limitation . . . 5

1.7 Target group . . . 5

1.8 Outline . . . 6

2 Theory 7

2.1 Snow measurement techniques . . . 7

2.2 The growth of Internet of Things . . . 7

2.3 Technical information for distance measurement sensors . . . 7

2.4 Overview and technical information of single-board microcontrollers . . . 9

2.5 Transmitting wireless sensor data . . . 9

2.6 Web application environments and persistent data storage . . . 10

2.7 Traditional snowmelt modelling . . . 11

2.8 Machine learning algorithms . . . 12

2.9 Training and evaluation of machine learning models . . . 14

3 Method 15

3.1 Literature review . . . 15

3.2 Creating the snowmelt prediction algorithm . . . 16

3.2.1 Important libraries and algorithms . . . 16

3.2.2 Dataset . . . 16

3.2.3 Data pre-processing . . . 18

3.2.4 Model selection . . . 18

3.2.5 Model evaluation . . . 18

3.3 Preparing the IoT device . . . 19

3.3.1 Sensor assembly and library codes . . . 19

3.3.2 Connecting the IoT device to the internet . . . 22

3.3.3 Snow depth measurements . . . 22

3.3.4 Sensor data processing and presentation with Matlab . . . 25

3.3.5 Web application evaluation poll . . . 26

3.4 Reliability and validity . . . 26

3.5 Ethical considerations . . . 27

4 Implementation 28

5 Results 30

5.1 Snow depth measurement . . . 30


6 Analysis 39

6.1 Snow depth measurement . . . 39
6.2 Machine learning model selection and evaluation . . . 40
6.3 Web application poll . . . 40

7 Discussion 42

7.1 Snow depth measurements and wireless transmissions . . . 42
7.2 Snow level prediction . . . 43
7.3 Data presentation and information dispersing . . . 44

8 Conclusion 46

8.1 Future work . . . 46

References 48

A Appendix 1 A

A.1 The Things Network decoder . . . A
A.2 Poll questions and results . . . A


1 Introduction

Microplastics in the environment pose a threat to both the surrounding nature and marine life around the world [1]. Rubber granulates, which are used on football pitches made of artificial turf, are categorized as microplastics. As the granulates do not decompose naturally and can contain toxic chemicals, they should be kept at the pitch site to reduce the environmental impact [2].

Since plastics leak into the environment through rain water, they can be transported by rivers and later end up in marine life. To reduce the risk of leakage from artificial turf pitches in Växjö (Sweden), the county has included a goal of reducing granulate leakage in its chemical plan from 2017. By the year 2024, the goal is to keep the refilling volume of new granulate close to zero at the pitches [3]. To reach this goal, so-called granulate traps have been installed to collect and stop the microplastics from leaking into nature. Växjö county has also discovered that keeping snow clearance to a minimum reduces the leakage of granulates [4]. This project addresses this problem and proposes a solution that restricts the number of snow clearances through automated snow depth measurements and snow depth predictions, and that presents decision proposals to clear or close the pitch at snowfall.

The project was appointed as a thesis project through dizparc. Dizparc is responsible for the digital lab in Växjö county's European project, DIACCESS, where one of the tasks is to contribute to the public digital and innovation climate in the county [5]. Implementations of Internet of Things are included, by connecting the sensors to the internet using Växjö's existing LoRaWAN. The project also touches other areas, such as web programming, machine learning, meteorology, physics and mathematics, when implementing an algorithm and data presentations.

1.1 Background

Växjö has numerous artificial grass pitches using rubber granulates around the county. The rubber granulates pose a threat to the surrounding environment when leaking into nature, causing pollution. As granulate leakage was partly found to occur upon snow clearance, clearing the pitch of snow should be kept to a minimum. Therefore, pitches should only be cleared when the snow depth exceeds a specific threshold of 2 cm and one additional criterion is fulfilled: a lack of snowfall in the near future, or no possibility that the snow will melt in the next couple of days. Otherwise, the pitch should be closed. Växjö county staff currently have to physically visit the pitches to make a decision on snow clearance, and they manually make phone calls or send emails to notify the concerned person/team upon closing the pitch. Therefore, the objective of this project was to propose a more effective method for deciding when to remove snow and to automate the information dispersal. This method should include automated measurements and investigations of current and upcoming snow depths and weather conditions.

The traditional method used by the Swedish Meteorological and Hydrological Institute (SMHI) to measure snow depth is manual measurement, using a ruler or measurement stick. However, there are also automated techniques, where ultrasonic sensors and laser emitting sensors are the most commonly used methods [6]. This knowledge, along with the findings for infrared (IR) sensors explained in Section 1.2, laid the ground for the choice of sensors tested in this project.


When calculating the snowmelt process, different physical models are often applied. According to the findings of P. Larsson at Uppsala University, the temperature index model and the energy balance model are the two most commonly used [7]. Machine learning models, on the other hand, are used in numerous prediction tasks, where the models are trained on datasets. There are many different machine learning algorithms, along with methods to test their predictive performance on unseen data. Predicting snow levels in the near future by applying machine learning algorithms to datasets built from the factors in the physical models could therefore be considered a potential approach.

Data are often made available for parsing and analysis in web applications. A web application can be a powerful tool when customized programs need to be available online. There are numerous frameworks and modules that ease the creation of a web application and help fulfill the desired functionality. In this project, a web application with Express.js as the framework, along with different modules, will be used to handle sensor data, run machine learning algorithms, render views to present data and disperse information through different communication channels.

1.2 Related work

The following related works were obtained through literature research. The most useful findings are listed here and formed the basis of the methods used in this project.

1.2.1 Devices for measuring snow

Andrés Varhola et al. presented and described the development of a prototype to measure snow depth with a low-cost ultrasonic sensor (LOCUS-2), using the LV-MaxSonar EZ1, a low-power sensor (2 mA typical current draw). The resulting prototype was described as follows: "low-cost units built from inexpensive sensors and hardware available in the market offer a remarkable solution to compete with costly brand-name instruments" [8]. The LV-MaxSonar EZ1 is in a similar power consumption and price range as the HC-SR04 used in this project.

In the winter of 2006-2007, 17 sites around the United States tested automated snow measurement systems using the SR-50 ultrasonic sensor (peak power consumption of 250 mA). Wendy A. Ryan et al. examined and presented the results in their article, where the accuracy was reported to be within 2 cm. However, this was stated with some uncertainty in the manual measurement data [9].

K. Yoshihisa et al. described characteristics of winter life influenced by snow, such as "quietness" and "difficulty of conversation". Therefore, measurements of the acoustical properties of snow were carried out. The results of these tests concluded that glass wool has sound-absorbing characteristics similar to newly fallen snow [10]. Due to the low probability of snowfall in the time frame of this project, Yoshihisa's findings were used to test the ultrasonic sensor's potential for measuring the depth of newly fallen snow.

Jeffrey S. Deems et al. aimed to map snow depths for snow hydrology and avalanche applications using LIDAR sensors. Tests were made with both airborne and ground-based LIDAR sensors to measure snow depth at terrestrial sites. They mention in their abstract that the ground-based sensors can provide millimeter-scale accuracy [11]. This laid the ground for testing a micro LIDAR sensor in this project.

Harold W. O'Brien and Richard S. Munis studied the spectral reflectance of snow in the range of 600 nm to 1400 nm. They reported a large increase in the reflection of light waves off the snow surface in the 600 nm to 1400 nm spectrum, compared to waves above 1400 nm. They also showed that the natural aging of snow decreases the reflectance of the surface, as an effect of the natural increase in density and hardness that comes with the aging process [12].

Chris Nafis measured snow depths with the Sharp GP2D12 infrared light emitting sensor in his tests. His words on the background of the project were the "inconsistency of how people/agencies measure snowfall". This was not retrieved from any scientific resource database, but was useful to this project, since the Sharp GP2D12 uses triangulation to determine the distance to a target, like the GP2Y0A21YK0F IR sensor used in this project [13]. Both H. W. O'Brien et al.'s and C. Nafis's findings increased the interest in testing an IR sensor.

1.2.2 Snow level prediction with machine learning

Previous work related to machine learning for predicting the change in snow depth was not found in the literature research. Meteorological and hydrological models were often the preferred approach. However, machine learning has been used for similar weather predictions.

M. Zickus et al. compared four machine learning methods of different complexity to predict PM10 levels, and to get a better understanding of which factors influenced the PM10 levels. The machine learning methods/algorithms tested were logistic regression, decision tree, multivariate adaptive regression splines and neural network. In their test, all except the decision tree performed well and had similar results [14].

In their article, O. R. Dolling and E. A. Varas used Artificial Neural Networks (ANN) to predict monthly streamflow in mountain watersheds, subject to rainfall and snowmelt, under conditions of scarce hydrologic information. Their findings showed that flows calculated by neural network models perform better than alternative procedures [15].

1.3 Problem formulation

To automate snow depth measurements, the problem was to find low-power and low-cost sensors that could measure snow depth within the requested accuracy of ±0.6 cm. An appropriate microcontroller should be used to process and send data over Växjö's existing LoRaWAN.

Another problem was to find which factors, along with the measured data, can be used to build an effective algorithm that predicts the change in snow depth. The algorithm should contain a model for snow level prediction for the next two days. Since machine learning has proven to be powerful for these kinds of predictions, machine learning algorithms will be tested to predict the change in snow depth.

With the measured data and algorithm in place, an application is needed to present the measured data along with the algorithm output in a clear way and to automate the information spreading. The idea was to build an application that presents the measured data and snow depth predictions to the appropriate user. Based on the information presented, the user should decide to either clear or close the pitch. Therefore, the data should be easy to parse, and the application should be usable by anyone without particular computer knowledge. Upon closing the pitch, the application will disperse the information to the booked teams.


Problems to solve:

• Find low-power and low-cost sensors that can measure snow depths with ±0.6 cm accuracy.

• Investigate which physical factors affect the snowmelt process and can be used to predict snow depth in the coming days.

• Investigate which method, among machine learning algorithms and physical models, to use when predicting snow level changes, based on the factors found above.

• Create an application with a clear and easy-to-use interface to present the data (sensor and algorithm output) as a base for decisions to clear or close pitches, along with automated information spreading upon a confirmed decision.

1.4 Motivation

The environmental impact of rubber leaking into nature has scientific and societal significance. If the snow clearance problem were minimized or even eliminated, this would have a positive impact on the surrounding nature. As the weather conditions during the off-season months in Sweden are not beneficial for grass to grow, the availability of artificial grass pitches is important for football teams in their off-season training. This can also be viewed as an asset for society, as the pitches serve as a health and social resource for those who use them. If the environmental impact on the surrounding nature were close to none, the benefits of having artificial grass pitches would most likely outweigh the risk of pollution. Not only are the pitches an asset for society, they also serve as a second market for the rubber material as an already used resource.

The tasks of freeing up and maximizing resources are important matters from an industrial perspective, which also applies to counties with strict budget conditions. Therefore, the implementation of automated snow depth measurements, along with predictive algorithms, is of industrial interest, as the monitoring and decision making will be made online. If the technique works well, it could also be implemented for snow clearing decisions in other parts of the county.

The sensor technique could also be used in many different distance or level measurement tasks, such as water level monitoring, farming food dispensers, etc. The application for presenting data could also connect to other sensor devices, to provide monitoring systems in different areas.

1.5 Objectives

The aim of this project was to build a prototype that measures snow depth, using distance measurement sensors along with long-range communication. A server-side algorithm should be formed to predict snow levels in the near future. The data should be presented to the county in a web application, along with decision proposals for closing or clearing a pitch. The decision should be presented to the responsible persons through email. The objectives for obtaining the resulting prototype can be viewed in Table 1.1.


Table 1.1: The different objectives of this project.

O1 Research/testing of laser, IR and ultrasonic sensors for measuring snow depth. Testing with snow if possible; otherwise fabricated tests will be made.

O2 Research for the algorithm, for both the temperature index method and machine learning. Gather the datasets needed for testing.

O3 Create the algorithm, with initial tests on fabricated data for both test methods.

O4 Test the algorithm with real sensor data (fabricated if bad conditions).

O5 Set up the server.

O6 Connect the sensor device with the server.

O7 Code the back-end data handling and information dispersing.

O8 Create the user interface for decision confirmation and front-end data presentation.

O9 Bug/stability testing of the implementation.

1.6 Scope/Limitation

The project testing was limited to the period of March-April, which reduced the availability of snow. Tests were made at a local ski slope, where snow production was possible at night for a short period of days. The tests were only allowed when the ski slope was closed, and testing was limited to a single day. Therefore, the scope of the tests and results on real snow was limited to 1.5 days old snow. The sensors used in this project were limited to low-power sensors. This was to investigate whether sensing snow depth was possible with such sensors, in relation to the more power-consuming sensors used at professional automated weather stations.

Testing the temperature index model and the energy balance model was not achievable, because the temperature was warmer than on an average winter day and no snow was obtainable. The weather also limited testing of the machine learning models in real-world usage.

Neural networks and deep learning are considered powerful approaches to these kinds of prediction tasks; however, they were not implemented in this project due to time limits and a lack of knowledge in that area.

The main objective for the application was to present measured data and algorithm output, along with information dispersing. Due to time limitations, important security work and other implementations not related to the problem itself were left out for future work. The application as a whole was created for the specific problem in this project, where integrating the application with the IBGO database, a platform used by Växjö county to book sport facilities, was requested by the assignees. However, many libraries and/or parts of the code could be reused for similar purposes in other scopes than this project.

1.7 Target group


The machine learning model was designed specifically to handle the task at hand, but it could still be of interest to data analysts or data scientists, and might give them more ideas of where machine learning can be used. The application serves the whole chain from sensor measurement collection and machine learning prediction to data presentation and information dispersal, which could serve as inspiration for software system developers on many platforms.

1.8 Outline

The chapters that follow are Theory, Method, Implementation, Results, Analysis, Discussion and Conclusion.

The Method chapter provides information on how the different parts of the project were formed, i.e. the methods used. This includes the algorithm and sensor methods, along with the reliability and validity of the methods used. Ethical considerations will also be presented. In the Implementation chapter, the process of putting the prototype together is presented. This includes sending sensor data through LoRaWAN to the server, performing predictions, and the implementation of the web application and its different parts. Results will contain the presentation of the results of the sensor measurements, the machine learning model testing and the web application poll.

The following chapter, Analysis, presents an analysis of the results. In the Discussion chapter, the results and findings of the project are discussed, along with other problems discovered during the project. The results are also compared to the research questions found in the problem formulation. The final part of this report is the Conclusion chapter, where conclusions are drawn from the findings. This chapter also presents the relevance of the project along with future work.


2 Theory

In this project, different measuring techniques, hardware, software, communication platforms, and machine learning and physical models are used. This section presents information on these subjects at a deeper explanatory level.

2.1 Snow measurement techniques

The traditional method of measuring snow depth is manual measurement with a ruler or measurement stick. The reports of snow depths from observation sites made by SMHI are solely from manual measurements. However, there are also automated techniques used at other locations around the world, where ultrasonic sensors and laser emitting sensors are the most commonly used methods. Ultrasonic sensors have been used for a longer time than laser sensors, but laser sensors are increasingly being adopted, as they are often more precise than ultrasonic devices. The main problem with the automated techniques is that they only measure over a small area. One device would be required for each test point, whereas manual measurements only require one visit for multiple test points at each site. However, using automated methods would allow measurements at a higher frequency [6].

2.2 The growth of Internet of Things

As the availability of the internet for different devices, such as sensors, cameras and smart home devices, has grown, the Internet of Things (IoT) has emerged. IoT has been defined as a global infrastructure for the society to connect devices (physical and virtual things) based on evolving information and communication technologies [16]. IoT has been integrated into several areas of society, such as homes, industry and infrastructure. Installations of different techniques in those areas have contributed to the growth of IoT, where common techniques are embedded systems, wireless sensor networks, control systems and automation [17]. IoT devices have shown a year-over-year growth of 31 %, and the estimated number of devices in 2020 is 20.4 billion [18, 19].

2.3 Technical information for distance measurement sensors

The ultrasonic (HC-SR04), laser (VL53L0X) and IR (GP2Y0A21YK0F) sensors were chosen to measure snow depth in this project, and the RHT03 sensor was used to measure relative humidity and temperature. The technical information for the sensors can be viewed in Table 2.2, along with an image of the sensors in Fig. 2.1.


Figure 2.1: Overview of the sensors used in this project. From the left: HC-SR04 ultrasonic sensor, VL53L0X laser sensor, GP2Y0A21YK0F IR sensor and RHT03 relative humidity and temperature sensor.

Table 2.2: Technical information for the sensors used in this project when measuring snow depths.

Sensor | Measures | Technique | Frequency/wavelength | Working current
HC-SR04 | Distance | Ultrasound | >20 kHz | 15 mA
VL53L0X | Distance | Laser | 940 nm | 10 mA
GP2Y0A21YK0F | Distance | Infrared light | 870 nm | 30 mA
RHT03 | Humidity/temperature | Humidity capacitor, thermistor | – | 1.5 mA

The ultrasonic sensor transmits an eight-cycle sonic burst, and the time between transmitting the burst and receiving the sound waves reflected off an object is measured. The distance from the sensor to the object can then be derived from the measured time. The sensor has a measurement angle of 15° and can measure distances from 2 cm to 400 cm with ±3 mm accuracy [20].
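As an illustration, the time-to-distance conversion reduces to a few lines of Python. The sketch below is not the project's sensor library; it uses the common linear approximation for the temperature dependence of the speed of sound, the kind of temperature compensation discussed later in Section 3.3.1.

```python
def speed_of_sound(temp_c):
    # Common linear approximation of the speed of sound in air (m/s)
    # as a function of air temperature (deg C)
    return 331.3 + 0.606 * temp_c

def echo_to_distance_cm(echo_time_s, temp_c):
    # The measured time covers the out-and-back travel of the burst,
    # hence the division by two; result in cm
    return echo_time_s * speed_of_sound(temp_c) * 100 / 2

# Example: a 4.7 ms round trip at 5 deg C is roughly 78.6 cm
print(echo_to_distance_cm(0.0047, 5.0))
```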

The laser sensor measures distance from the time the emitted light takes to travel from the sensor to an object and reflect back, which is called Time of Flight (ToF). Depending on the reflectance of the target surface, the accuracy ranges from 3 % to 12 % [21].

The infrared light sensor uses triangulation for distance measurement. The sensor's transmitter emits infrared light, and upon reflection the light hits the sensor's receiver at an angle that depends on the distance to the target. The sensor measures the angle of the incoming reflected light and then calculates the distance by triangulation [22].
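Because the sensor reports distance as an analog voltage, the reading must be converted in software. The sketch below is a hedged illustration rather than the library used in this project; the power-law constants are a commonly cited curve fit for the GP2Y0A21YK0F and would in practice be replaced by values calibrated against a ruler.

```python
def ir_distance_cm(adc_value, vref=5.0, adc_max=1023):
    # Convert a 10-bit Arduino ADC reading to a voltage, then to a
    # distance. The constants 27.86 and -1.15 are a commonly cited
    # power-law fit for this sensor, used here as placeholders for
    # calibrated values.
    volts = adc_value * vref / adc_max
    return 27.86 * volts ** -1.15

print(ir_distance_cm(512))  # a 2.5 V reading corresponds to roughly 10 cm
```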

The RHT03 sensor measures relative humidity with an accuracy of ±2 % and temperature with an accuracy of ±0.5 °C. The relative humidity and temperature are read from the sensor by measuring the pulse width of each high pulse. From the pulse widths, bytes are derived, and the humidity and temperature can be read from those bytes [23].


2.4 Overview and technical information of single-board microcontrollers

Single-board microcontrollers have been available since 1979. They consist of a microcontroller built onto a single circuit board with the circuitry needed for programming and connecting external devices, such as a microprocessor, internal memory (RAM) and I/O circuits [24]. Different components can be wired to the microcontroller input pins, and shields can be mounted onto most microcontrollers for additional features. There are many different single-board microcontrollers available on the market, such as the Arduino, Raspberry Pi, Dwengo and LoPy4. The single-board microcontrollers used in this project were the Arduino Uno and the Pycom LoPy4 mounted onto the Expansion Board 3.0, which can be viewed in Fig. 2.2.

Figure 2.2: Overview of the microcontrollers and antenna used in this project. From the left: LoRa antenna, Pycom LoPy4 and Arduino Uno.

The Arduino Uno is a microcontroller board with an ATmega328P and 14 digital and 6 analog I/O pins. The clock speed is 16 MHz, and 2 kB of SRAM is built into the ATmega328P chip [25]. Different wireless protocols can be used with the Arduino Uno by connecting modules or shields.

The Pycom LoPy4 is a MicroPython-programmable microcontroller. It uses an ESP32 chipset and has 8 analog I/O and 24 GPIO pins. The internal memory consists of 4 MB of RAM, and there is 8 MB of external flash memory. Since the LoPy4 has low power consumption and can handle multiple wireless protocols, it is considered suitable for IoT implementations [26].

2.5 Transmitting wireless sensor data

Both microcontrollers in this project had the ability to use different wireless protocols to connect to networks using Wi-Fi, LoRa, etc. A wireless network is a computer network whose nodes communicate wirelessly using radio waves [27]. There are different types of wireless network protocols, such as Wireless Local Area Network (WLAN) and Wide Area Network (WAN). Wi-Fi, a WLAN technology built on the IEEE 802.11 standards, offers link rates between 1 Mbit/s and 9608 Mbit/s, depending on which Wi-Fi generation is used, and operates on either the 2.4 GHz or 5 GHz frequency band [29].

To send data over longer distances, WAN protocols are used, which include techniques such as 3G, 4G, 5G and LoRa. LoRaWAN (Long Range Wide Area Network) is a low-power wide-area network protocol built on the LoRa radio technique developed by Semtech. In Europe, LoRa uses the license-free 868 MHz band and is used for connecting IoT devices that send smaller chunks of data [30]. The data travels from a node to a gateway, and the gateway forwards the data to the appropriate destination, which in this project was The Things Network.

The Things Network (TTN) is a platform for connecting IoT devices to the internet using LoRaWAN and is an open network for building IoT applications at a low cost, using 128-bit AES encryption for security [31]. When a LoRa node sends data over LoRaWAN, a gateway receives the data, where the gateway can handle the data itself or act as a packet forwarder. In this project, TTN was used to ease the process of making the data available for the server on the internet, by letting connected gateways forward data packets to TTN. LoRa node devices added to a TTN application can communicate with gateways connected to TTN, which forward the data to the TTN application. The code provided at the payload formatting page parses the incoming data and returns it as programmed. Communication can be made through two different protocols: ABP (Activation By Personalization) or OTAA (Over The Air Activation). OTAA is considered the most secure way to connect with TTN, as the devices negotiate and assign the security keys, along with a dynamic device address, in the connection establishment process. ABP hard-codes the security keys to the device. Therefore, no join procedure is needed, as the keys are pre-defined on the device [32].
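As an illustration of the join procedure, the following sketch uses Pycom's documented LoRaWAN API for the LoPy4. The AppEUI and AppKey values are placeholders for keys registered in a TTN application, and the payload is an arbitrary example.

```python
# Sketch of an OTAA join and uplink on the LoPy4, following Pycom's
# documented LoRaWAN API; not the project's production code.
import socket
import binascii
from network import LoRa

lora = LoRa(mode=LoRa.LORAWAN, region=LoRa.EU868)  # 868 MHz band in Europe

app_eui = binascii.unhexlify('0000000000000000')                  # placeholder
app_key = binascii.unhexlify('00000000000000000000000000000000')  # placeholder

# OTAA: security keys are negotiated in the join procedure
lora.join(activation=LoRa.OTAA, auth=(app_eui, app_key), timeout=0)
while not lora.has_joined():
    pass  # wait for the join-accept from the network

s = socket.socket(socket.AF_LORA, socket.SOCK_RAW)
s.setsockopt(socket.SOL_LORA, socket.SO_DR, 5)  # data rate 5 (SF7, 125 kHz)
s.send(b'\x00\x2a')  # payload bytes, parsed by the TTN payload decoder
```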

2.6 Web application environments and persistent data storage

Numerous web applications written in JavaScript are built with Node.js. Node.js is an asynchronous, event-driven JavaScript runtime environment based on Chrome's V8 JavaScript engine, written in C++ [33]. Node.js is designed for building scalable web applications, as several connections can be handled concurrently [34]. Express.js is a web application framework designed to ease the development of web applications through its set of features, and it provides a standard way to build web applications [35, p. 2].

A view engine, or template engine, provides the possibility of using static template files. The view engine injects real values into the template at runtime, and the template is then transformed into HTML. The view engine used in this project was Express-hbs (Handlebars), which is based on the Mustache engine [35, p. 71][36].

To save persistent data, such as user login details, pitch details and session storage, databases are used. The two database platforms used in this project were Redis and MongoDB. Redis is an open source, in-memory data structure store, which can be used as a database, cache and message broker [37]. In this web application, Redis was used for session variable storage. MongoDB is a general-purpose, document-based distributed database. The data is stored as JSON-like documents, along with ad hoc queries and indexing [38]. In the application, MongoDB was used to store user information, pitch information for the available artificial grass pitches, and information from data collected upon snow depth threshold detections.


2.7 Traditional snowmelt modelling

When modelling snowmelt runoff, the traditional approach is to use hydrological methods. The temperature index model, also known as the degree-day approach, is a hydrological snowmelt modelling method that simulates snowmelt using air temperature. There are some variants of the temperature index model, but in its simplest form the model can be described by Equation 1.

$$M = P_{CFMAX} \cdot (T - P_{TT})$$
$$R = P_{CFR} \cdot P_{CFMAX} \cdot (P_{TT} - T) \quad (1)$$

where

$M$ = the rate of snowmelt (mm/day)
$P_{CFMAX}$ = degree-day factor (mm/(°C·day))
$T$ = temperature (°C)
$P_{TT}$ = threshold temperature (°C)
$R$ = the rate of snow refreezing (mm/day)

Both the degree-day factor ($P_{CFMAX}$) and the threshold temperature ($P_{TT}$) in Equation 1 are user-defined variables. The threshold variable is usually set to 0 °C [39], and the degree-day factor varies between 1.8 and 3.7 mm/(°C·day) when no rain is present [40]. In Sweden this factor is generally set to 2.0 for forest locations and 3.5 for open ground [40].
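For concreteness, Equation 1 translates directly into code. The sketch below is illustrative rather than taken from the project; the refreezing coefficient $P_{CFR}$ is not specified in the text, so the default used here is an assumption borrowed from common HBV-type model settings.

```python
def degree_day(temp_c, cfmax=3.5, tt=0.0, cfr=0.05):
    """Temperature index (degree-day) model of Equation 1.

    temp_c -- daily mean air temperature (deg C)
    cfmax  -- degree-day factor P_CFMAX; 3.5 mm/(deg C * day) is the
              value generally used for open ground in Sweden
    tt     -- threshold temperature P_TT (deg C), usually 0
    cfr    -- refreezing coefficient P_CFR (assumed value; not given
              in the text)

    Returns (melt, refreeze) in mm/day; at most one is non-zero.
    """
    if temp_c > tt:
        return cfmax * (temp_c - tt), 0.0
    return 0.0, cfr * cfmax * (tt - temp_c)

melt, refreeze = degree_day(2.0)  # a +2 deg C day melts 7.0 mm on open ground
```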

Another physical approach to modelling snowmelt is the energy balance equation. Although more accurate, it is used less, due to its requirement of more parameters, which in some cases are not obtainable [41]. The energy balance is described in Equation 2.

$$\frac{\partial U}{\partial t} = Q_{sn} + Q_{li} + Q_p + Q_g - Q_{le} + Q_h + Q_e - Q_m \quad (2)$$

where

$Q_{sn}$ = net shortwave radiation
$Q_{li}$ = incoming longwave radiation
$Q_p$ = advected heat from precipitation
$Q_g$ = ground heat flux
$Q_{le}$ = outgoing longwave radiation
$Q_h$ = sensible heat flux
$Q_e$ = latent heat flux due to sublimation/condensation
$Q_m$ = advected heat removed by meltwater


The temperature index model can also be elaborated by combining it with an energy index, as shown in research by F. Cazorzi and G. Dalla Fontana [42]. The combination of the two models can be described by Equation 3.

$$M = P_{CMF} \cdot EI \cdot T \quad (3)$$

where

$M$ = the rate of snowmelt (mm/h)
$P_{CMF}$ = combined melt factor (mm/(°C·EI·day))
$EI$ = the energy index of the element (MJ/(m²·day))
$T$ = temperature (°C)

2.8 Machine learning algorithms

Machine learning is the art and science of programming computers to learn from data, and it is viewed as a subset of artificial intelligence [43, 44]. Machine learning algorithms build mathematical models based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks [44]. In this project, machine learning was used to predict snowmelt by analyzing data containing previous weather observations. The sections below give a short description of each algorithm, followed by a description of some common terms and techniques used in machine learning.

Linear regression is a machine learning algorithm used for finding a linear relationship between a target variable and one or more predictors [45]. In regression in general, the letter $Y$ is used to denote the target variable, while $X$ denotes the predictors. There are different variants of linear regression, but in its simplest form there is only one predictor and a y-intercept, known as $\beta_0$. In this project, the tested variants of linear regression were multiple linear regression and polynomial regression. In polynomial regression the features or variables are of a polynomial degree, while in linear regression no changes are performed on the features [46]. The objective in both models is to find the combination of $\beta_0, ..., \beta_n$ that minimizes the cost function, which is the squared difference between the predicted $Y$ and the actual $Y$. The equation for multiple linear regression is depicted in Equation 4.

$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_n X_n \quad (4)$$

A polynomial regression of degree 2 can be depicted as in Equation 5.

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 \quad (5)$$
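A minimal example of both variants in scikit-learn, the library used in this project (see Section 3.2.1): polynomial regression is expressed as a feature expansion followed by an ordinary least-squares fit. The data here is synthetic toy data, not the project dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic toy data with a quadratic relation plus noise, standing in
# for the weather features used in the project
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 + rng.normal(0, 0.1, 100)

# Equation 5 as a pipeline: expand the features to degree 2, then fit an
# ordinary least-squares model (Equation 4) on the expanded features
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
```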

K-nearest neighbours (KNN) is a machine learning algorithm used in both regression and classification problems. KNN uses feature similarity to predict the values of new data points [47]. When KNN is used for regression, the algorithm finds the k nearest points with features similar to the new, unseen data point and calculates the average of their dependent variables, which is then assigned to the new point.

Support Vector Machine (SVM) is a machine learning algorithm used in both regression and classification tasks. When used for classification, each data point is mapped into an n-dimensional feature space, where n is the number of features. The algorithm computes a hyperplane that maximizes the margin between the two classes, while at the same time minimizing the number of classification errors [48]. In other words, it tries to fit the largest possible street between the two classes while limiting margin violations, which is illustrated in Fig. 2.3.

Figure 2.3: Intuition of the SVM classifier.

When SVM is used for regression, the objective is reversed. The algorithm tries to fit as many instances as possible on the street while limiting margin violations. For regression tasks, the margin violations are the total distances of the instances outside the street borders [43].

SVM is versatile, as it can handle both regression and classification tasks. Not only that, it can also deal with both linear and nonlinear regression and classification tasks. In nonlinear tasks, the algorithm uses something known as the kernel trick. When the data is not linearly separable in the current dimension or input space, the kernel trick transforms the data to a higher-dimensional feature space, where it can be separated by a hyperplane. This concept is illustrated in Fig. 2.4. There are several different kernels available, and each kernel does the computation in a different way. The most common are the linear, poly, rbf and sigmoid kernels, which are tested in this project.


Figure 2.4: Illustrative explanation of the kernel trick. The image is licensed under the Creative Commons Attribution-Share Alike 4.0 International license [49].

Just like SVM, Decision Trees are versatile in that they can handle both regression and classification tasks. A Decision Tree is a flowchart-like tree structure. Each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label (classification) or a value (regression) [50].

Random Forest is an ensemble method that consists of many Decision Trees [43]. A common problem with Decision Trees is that, if allowed to grow deep, they tend to overfit the data [48]. Another common problem with Decision Trees is their sensitivity to the training data, which makes them error-prone on the test dataset. These problems are often reduced when ensemble methods such as Random Forest are used.
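In scikit-learn, the contrast between a single tree and the ensemble is a one-line change, as the sketch below shows on synthetic toy data (illustrative parameters, not the project's configuration).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 4))  # placeholder features
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(0, 0.1, 200)

# A single deep tree can fit the noise in the training data (overfitting)
tree = DecisionTreeRegressor(random_state=42).fit(X, y)

# The forest averages many trees trained on bootstrap samples, which
# reduces variance; 100 estimators is sklearn's default
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X, y)
```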

2.9 Training and evaluation of machine learning models

In machine learning, K-Fold Cross Validation (K-Fold CV) is a method for evaluating the performance of a machine learning model [51]. This is done by splitting the data (usually the training data) into K folds (sections); in each iteration, one of the folds is used for testing while the rest are used for training [51]. At the end, a combined average score over the folds is computed. Using this approach, the final score is not biased, since every part of the data is used for both training and testing.

Grid search is a method for performing hyperparameter tuning. A hyperparameter is a parameter whose value is used to control the learning process of a machine learning algorithm. Grid search is used to find the hyperparameters of a model that give the most accurate predictions [52].

In machine learning, a common problem that plagues every model is overfitting. Overfitting occurs when a model learns the details and noise of the training data too closely, making its future predictions on unseen data less reliable. This can be mitigated by using regularization, a technique for tuning the function by adding an extra penalty term [53].


3 Method

To solve the tasks mentioned in the problem formulation, different research, tests and data gathering were carried out. To measure snow depth, three different sensors were tested, along with a sensor for humidity and temperature measurements. Datasets were gathered to test different machine learning algorithms, to find the most suitable model for predicting upcoming snow levels. To gain knowledge in those fields and select which methods to use, different literature researches were made. With data from the sensors and predictions from the machine learning algorithm, a web application was built to present the data and provide a platform for decision making and information dispersing.

3.1 Literature review

To gain knowledge in the matters of the problem formulation, research was made through Google Scholar and the university library's all-in-one search service, OneSearch. Each search tag found in Tables 3.3 and 3.4 was used on both engines, which mostly gave similar results. However, OneSearch was mainly used to gain access to paid resources, using the university license.

A part of the problem was to find which sensors to use for measuring snow depth. At first, more general tag searches (see the first three tags in Table 3.3) were made to get an overview of which sensors might be suitable for this project. From the research in both scientific and non-scientific resources, the ultrasonic, laser and IR sensor techniques were chosen for deeper investigation. To validate the chosen sensor techniques, further searches were made with tags directed towards each technique. Those searches were made solely in scientific databases, to validate the possibility of interaction with snow. As some searches gave a large number of results, only the most relevant resources were read (i.e. the top 2-3 pages of search results). The searches for snow depth measurement and the environmental impact of rubber granulates can be viewed in Tables 3.3 and 3.4, respectively.

Table 3.3: Information from the sensor literature research. The tags shown in this table are searches where information was extracted from the resulting resources.

Tags | Results | Language | Search Engine
"measure" "snow" "depth" "sensor" | 15 800 000 | English | Google
"Mäta snödjup" (measure snow depth) | 39 900 | Swedish | Google
"measure" "snow" "depth" "sensor" | 145 000 | English | Google Scholar
"snow" "depth" "low" "powered" "sensors" | 34 000 | English | Google Scholar
"ultrasonic" "waves" "snow" | 13 600 | English | Google Scholar
"snow surface" "ultrasonic" "wave" "reflection" | 307 | English | Google Scholar
"snow depth" "ultrasonic" "wave" "reflection" | 332 | English | Google Scholar
"sound reflection" "snow" | 581 | English | Google Scholar
"snö" "ultraljud" (snow, ultrasound) | 157 | Swedish | Google Scholar
"infrared" "light" "snow" | 185 000 | English | Google Scholar
"time of flight" "sensor" "snow" "depth" | 3 400 | English | Google Scholar


Table 3.4: Information from the literature research about the environmental impact of artificial grass. The tags shown in this table are searches where information was extracted from the resulting resources.

Tags | Results | Language | Search Engine
"artificial" "turf" "fields" "crumb rubber material" | 99 | English | Google Scholar
"konstgräs" "miljöfara" (artificial grass, environmental hazard) | 6 | Swedish | Google Scholar

3.2 Creating the snowmelt prediction algorithm

As mentioned earlier, there are two main methods of modelling snowmelt: the temperature index model and the energy balance model [4, 42, 54]. Although more accurate than the temperature index model, the energy balance model is rarely used, since it contains variables that are harder to obtain [42].

Since the time frame of the project was in the spring, the weather conditions were not ideal for the physical models. The temperature was warmer than on an average winter day, so testing the equations derived from the temperature index model was not achievable. To tackle that problem, machine learning was used instead. With machine learning, one can build a model and also evaluate the model's performance; Section 3.2.5 describes in further detail how this was done. Most of the input variables, or features, of the machine learning models were based on the variables from the temperature index model. A supervised learning approach was used when creating the different models. Supervised learning algorithms are designed to learn by example, meaning the training data consists of inputs paired with the correct outputs [55]. All models were trained using a single target variable. Section 3.2.2 gives an in-depth explanation of the variables used.

3.2.1 Important libraries and algorithms

Python was the main programming language, used with Scikit-learn (sklearn), a machine learning library containing various classification, regression and clustering algorithms [56, 57]. The Python version used was 3.8.0, and the sklearn version was 0.22.1. Six different machine learning algorithms were tested using sklearn: Linear Regression, Polynomial Regression, SVM, KNN Regression, Decision Tree Regression and Random Forest Regression. Some of these were also tested with regularization techniques to prevent overfitting.

3.2.2 Dataset

The datasets used in this experiment were obtained from the Swedish Meteorological and Hydrological Institute (SMHI). SMHI provides different weather observations gathered from stations across the country. The weather observations used in this project were obtained from stations located in the Swedish city of Växjö. Each observation type, such as snow depth, was its own dataset, where some observations contained information dating back to the 1890s. All datasets are available in the project's GitHub repository, 2DT00E-DegreeProject [58], and the variables of the final dataset can be found in Table 3.5. The following observations were downloaded from SMHI:


• Lufttemperatur - dygn, (Air temperature - Daily values)

• Nederbördsmängd - dygn, (Precipitation amount - Daily values)

• Nederbördstyp - dygn, (Precipitation - Daily values)

• Snödjup och markytans tillstånd - dygn, (Snow depth - Daily values)

• Relativ luftfuktighet - h, (Relative humidity - Hourly values)

Table 3.5: List of all features of the dataset used in the project.

No. | Variable | Unit | Dataset | Observations
1 | Snow depth | m | Snödjup (Snow depth) | 22 544
2 | Snow depth +day1 | m | Snödjup (Snow depth) | 22 544
3 | Temperature | °C | Lufttemperatur (Air temperature) | 18 438
4 | Temperature +day1 | °C | Lufttemperatur (Air temperature) | 10 168
5 | Humidity | % | Relativ luftfuktighet (Rel. humidity) | 10 168
6 | Precipitation amount | m | Nederbördsmängd (Precip. amount) | 56 295
7 | Light snow mixed rain | binary | Nederbördstyp (Precipitation) | 11 228
8 | Drizzle | binary | Nederbördstyp (Precipitation) | 11 228
9 | Hail | binary | Nederbördstyp (Precipitation) | 11 228
10 | Ice pellets | binary | Nederbördstyp (Precipitation) | 11 228
11 | Ice needles | binary | Nederbördstyp (Precipitation) | 11 228
12 | Corn snow | binary | Nederbördstyp (Precipitation) | 11 228
13 | Rainfall | binary | Nederbördstyp (Precipitation) | 11 228
14 | Rain showers | binary | Nederbördstyp (Precipitation) | 11 228
15 | Small hail | binary | Nederbördstyp (Precipitation) | 11 228
16 | Snowfall | binary | Nederbördstyp (Precipitation) | 11 228
17 | Small mixed rain | binary | Nederbördstyp (Precipitation) | 11 228
18 | Cloudy light snow | binary | Nederbördstyp (Precipitation) | 11 228
19 | Snow grains | binary | Nederbördstyp (Precipitation) | 11 228
20 | Undercooled rainfall | binary | Nederbördstyp (Precipitation) | 11 228

The precipitation types were the variables with binary units, meaning the values zero or one: zero for off (that particular precipitation type did not occur) and one for on.

The Dataset column shows which dataset the variable values were obtained from, and the Observations column shows the number of rows that dataset contained. The number of observations varied between datasets, mainly because the date when the measurements began differed for each observation type. For example, some started in the 1890s, like snow depth, while others started much later. After merging the datasets, as discussed in Section 3.2.3, the final dataset contained 8809 observations (rows), of which 5872 were used for training and 2937 for testing.

The variable Snow depth +day1 in the dataset was the target variable, and the remaining variables were used as predictors. In production, variables 1, 3 and 5 were obtained from the sensors on the IoT device.


3.2.3 Data pre-processing

Most of the observations had both hourly and daily values, but some, such as the relative humidity observations, had only hourly values. The relative humidity observations had to be recalculated to daily values by taking the average of the hourly values for each day, which was done using the Python library Pandas.

Since each observation of snow depth, temperature, precipitation, precipitation amount and humidity was a dataset of its own, they had to be merged into a single dataset to be ready for machine learning. The merge was done using the datetime index column. After merging the data into a single dataset, some missing data occurred, which had to be handled before being passed to the different machine learning algorithms. Since temperature was an important factor, only the rows containing temperature values were kept. The precipitation amount and snow depth datasets did not have any missing data. The precipitation dataset only had values for days when precipitation occurred, so when merged with the columns from the other datasets, this led to missing data in some rows. Since the values were binary (see Table 3.5), the rows with missing data were set to 0. For the humidity dataset, the missing rows were replaced with the average humidity.
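A condensed Pandas sketch of these pre-processing steps is shown below, using toy stand-ins for the SMHI datasets; the column names are illustrative, and the real processing code is in the project repository.

```python
import pandas as pd

# Toy stand-ins for the SMHI datasets; column names are illustrative
hours = pd.date_range('2020-01-01', periods=48, freq='H')
humidity = pd.DataFrame({'humidity': 80.0}, index=hours)

days = pd.date_range('2020-01-01', periods=2, freq='D')
snow = pd.DataFrame({'snow_depth': [0.10, 0.08]}, index=days)
temp = pd.DataFrame({'temperature': [-1.2, 0.5]}, index=days)

# Hourly humidity -> daily averages, then merge on the datetime index
humidity_daily = humidity.resample('D').mean()
merged = snow.join([temp, humidity_daily])

# Handle missing data: keep only rows with a temperature value, fill
# humidity gaps with the average (binary precipitation columns would be
# filled with 0 in the same way)
merged = merged.dropna(subset=['temperature'])
merged['humidity'] = merged['humidity'].fillna(merged['humidity'].mean())
```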

Using the train_test_split method from sklearn.model_selection, the data was divided into a training set and a test set, with a test size of 1/3. The train_test_split method makes the split random by default, but in order to make the comparison between the different models as fair as possible, a random_state variable was passed to the train_test_split method. The random_state variable serves as a random seed and makes the method generate the same test and training sets for all models.
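A minimal sketch of the split, on placeholder data; the random_state value here is illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(9, 3)  # placeholder feature matrix
y = np.random.rand(9)     # placeholder target (Snow depth +day1)

# test_size=1/3 gives the 2/3 train, 1/3 test split used in the project;
# a fixed random_state makes every model see the identical split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=42)
```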

3.2.4 Model selection

When selecting an appropriate machine learning model, it is important to find an optimal balance between robustness and flexibility [14]. If a model is too flexible, it tends to capture all the data points and errors in the dataset. This is known as overfitting, which often leads to high variance. On the other hand, if a model is too simple, its predictive performance on complex problems will be low, and the model is said to be underfitting the data.

In order to estimate the performance of the different models, a technique known as cross validation was used. There are different types of this technique, but the one used in all tests was k-fold cross validation. K-fold cross validation was first used together with GridSearchCV to obtain the best hyperparameters for each model, and then used separately to compute the validation error and validation score using the chosen parameters output by the grid search. K-fold cross validation was used solely on the training set.
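The sketch below illustrates this two-step procedure with scikit-learn's GridSearchCV and cross_val_score on placeholder data; the parameter grid is hypothetical, not the one used in the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score

rng = np.random.default_rng(0)
X_train = rng.random((120, 5))  # placeholder training features
y_train = rng.random(120)       # placeholder target

# Hypothetical grid; the grids actually used are in the project repository
param_grid = {'n_estimators': [50, 100, 200], 'max_depth': [5, 10, None]}

# Step 1: grid search with 5-fold CV on the training set picks the
# hyperparameters
search = GridSearchCV(RandomForestRegressor(random_state=42), param_grid,
                      cv=5, scoring='neg_mean_squared_error')
search.fit(X_train, y_train)

# Step 2: a separate k-fold run with the chosen parameters gives the
# validation score
scores = cross_val_score(search.best_estimator_, X_train, y_train, cv=5)
```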

3.2.5 Model evaluation

After evaluating the models and choosing a final model, its predictive power, also known as the generalization error, had to be computed. This was done by setting aside a test set and computing the Mean Squared Error (MSE) on that set. The generalization error shows how well the model generalizes and gives an idea of how well it performs on unseen data, which also indicates how well it will perform in deployment.
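Illustrated with scikit-learn on placeholder data, the evaluation amounts to the following sketch.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X, y = rng.random((150, 5)), rng.random(150)  # placeholder data

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=42)

final_model = RandomForestRegressor(random_state=42).fit(X_train, y_train)

# Generalization error: MSE on the held-out test set, which is only
# touched once, after model selection is finished
mse = mean_squared_error(y_test, final_model.predict(X_test))
```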


3.3 Preparing the IoT device

To make the IoT device ready for implementation as a prototype, libraries for each sensor were needed for the microcontrollers, followed by sensor tests. A connection to The Things Network through LoRaWAN was also needed. The libraries and code were uploaded to the thesis project's GitHub repository, 2DT00E-DegreeProject [58].

3.3.1 Sensor assembly and library codes

Initially, code had to be written for the Pycom LoPy4 device for each sensor. The libraries were coded with help from the sensor datasheets, suggestions from forums and by studying libraries for each sensor in other languages. The sensors were then tested by putting objects in front of the sensor at different distances next to a ruler, to compare the measured distance to the actual distance.

The ultrasonic sensor MicroPython code for the LoPy4 was written with the sensor datasheet as inspiration, using the timing diagram provided. However, the datasheet did not provide a solution for including the temperature impact on the speed of sound, which was implemented with information from educational resources [59]. The Arduino library code was retrieved from GitHub, provided by the user enjoyengineering79 in the repository HCSR04 [60]. Some minor changes were made to the library to fit the needs of the tests made in this project. The laser sensor code for the LoPy4 was inspired by the user IoTMaker on the forums at pycom.io [61]. The forum thread was intended for the Pycom WiPy2 device, but proved successful for the LoPy4, with minor changes in the code to fit the needs of this project. The Arduino code used contained a library provided by Adafruit Industries, written by Limor Fried (Ladyada) [62]. The IR sensor was only tested with the Arduino Uno. The library was found at Makerguides.com, where the user Benne de Bakker provided a step-by-step tutorial with library code included [63]. Minor changes were made to fit the needs of this project. Code was initially written for the LoPy4 based on this tutorial, as no libraries were found for this particular IR sensor. However, this proved unsuccessful for measuring distances. Due to time limits, this code was not developed further, and the sensor's capabilities were only tested with the Arduino library. The RHT03 sensor code for the LoPy4 was developed by parsing the output data along with the information from the sensor datasheet. The sensor output was compared with measurements from another device on site to confirm the data accuracy. The Arduino code was provided by SparkFun Electronics in the GitHub repository SparkFun_RHT03_Arduino_Library [64]. The pin connections for each sensor to the microcontrollers can be viewed in Fig. 3.5-3.8 and Tables 3.6-3.9.
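As an illustration of the timing-diagram logic described above, the following hedged MicroPython sketch reads the HC-SR04 on the LoPy4, assuming machine.time_pulse_us is available in the firmware and using the pin assignment from Table 3.6. It is not the project library.

```python
# Hedged MicroPython sketch of an HC-SR04 reading on the LoPy4; pins
# follow Table 3.6. Only an illustration of the timing logic.
import time
import machine
from machine import Pin

trig = Pin('P19', mode=Pin.OUT)
echo = Pin('P20', mode=Pin.IN)

def read_distance_cm(temp_c=20.0):
    # A 10 us pulse on trig starts the sensor's eight-cycle burst
    trig(0)
    time.sleep_us(2)
    trig(1)
    time.sleep_us(10)
    trig(0)
    # echo is held high for the round-trip time of the burst
    duration_us = machine.time_pulse_us(echo, 1, 30000)
    if duration_us < 0:
        return None  # timed out, no echo received
    speed = 331.3 + 0.606 * temp_c  # temperature-compensated speed of sound (m/s)
    return duration_us / 1_000_000 * speed * 100 / 2
```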


Figure 3.5: Connection overview of the HC-SR04 ultrasonic sensor with both Arduino Uno and Pycom LoPy4. Fritzing images are under the Creative Commons (CC-BY-SA) license.

Table 3.6: Pin connections for the HC-SR04 ultrasonic sensor with Pycom LoPy4 and Arduino Uno.

Microcontroller (MC) | Sensor pin | MC pin | MC information
Pycom LoPy4 | VCC | Vin | 5V
Pycom LoPy4 | trig | Pin 19 | Digital out
Pycom LoPy4 | echo | Pin 20 | Digital in
Pycom LoPy4 | GND | GND |
Arduino Uno | VCC | 5V |
Arduino Uno | trig | Pin 9 | OUTPUT
Arduino Uno | echo | Pin 8 | INPUT
Arduino Uno | GND | GND |

Figure 3.6: Connection overview of the VL53L0X LIDAR sensor (ToF) with both Ar- duino Uno and Pycom LoPy4. Fritzing images are under Creative Commons (CC-BY-SA) license.


Table 3.7: Pin connections for the VL53L0X LIDAR sensor (ToF) with Pycom LoPy4 and Arduino Uno.

Microcontroller (MC) | Sensor pin | MC pin | MC information
Pycom LoPy4 | Vin | Vin | 5V
Pycom LoPy4 | GND | GND |
Pycom LoPy4 | SCL | Pin 10 | SCL
Pycom LoPy4 | SDA | Pin 9 | SDA
Arduino Uno | Vin | 5V |
Arduino Uno | GND | GND |
Arduino Uno | SCL | SCL |
Arduino Uno | SDA | SDA |

Figure 3.7: Connection overview of the GP2Y0A21YK0F IR sensor with Arduino Uno. A 10 µF capacitor was connected in parallel between 5V and GND in order to stabilize the power supply line. Fritzing images are under the Creative Commons (CC-BY-SA) license.

Table 3.8: Pin connections for the GP2Y0A21YK0F IR sensor with Arduino Uno.

Microcontroller (MC) | Sensor pin | MC pin | MC information
Arduino Uno | V0 | A0 | Analog input
Arduino Uno | Vin | 5V |
Arduino Uno | GND | GND |


Figure 3.8: Connection overview of the RHT03 relative humidity and temperature sensor. Fritzing images are under the Creative Commons (CC-BY-SA) license.

Table 3.9: Pin connections for the RHT03 relative humidity and temperature sensor.

Microcontroller (MC) | Sensor pin | MC pin | MC information
Pycom LoPy4 | VDD | Vin | 5V
Pycom LoPy4 | GND | GND |
Pycom LoPy4 | DATA | Pin 21 | Open drain
Arduino Uno | VDD | 5V |
Arduino Uno | GND | GND |
Arduino Uno | DATA | Pin 4 |

3.3.2 Connecting the IoT device to the internet

To make the data available for the application server, a connection was set up to The Things Network (TTN). The Pycom LoPy4 was set up as a LoRa node, where boilerplate code for LoRa nodes was fetched from the Pycom website, and both the OTAA and ABP protocols were tested. The security aspect of OTAA laid the ground for choosing that protocol. OTAA also benefited from less packet loss for both uplink and downlink messages. When a connection was made with a nearby gateway, the data was processed on the TTN application website by writing code that decodes the received bytes into a JavaScript object, presented in Appendix A.1. The data sent from the IoT device was in string format and was encoded to bytes before sending a message over LoRaWAN to the server. An acknowledgement method was written on both ends to ensure data delivery to and from the server.

3.3.3 Snow depth measurements

Before the snow depth tests were made, the test tower shown in Fig. 3.11 was built to make measurements from four different heights: 20 cm, 40 cm, 60 cm and 80 cm. Also, a box with the dimensions 50 × 50 × 5 cm was built to hold the materials to be tested. The dimensions of the box were based on the maximum detection angle from 80 cm height for the ultrasonic sensor, as it had the widest detection area. The height in every test was validated by measuring the real distance from the sensor to the target with a ruler.
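As a rough sanity check of this dimension (assuming the roughly ±15° beam half-angle commonly quoted for the HC-SR04; the exact angle used in the project is not restated here), the detection footprint at 80 cm height is about 2 · 80 cm · tan 15° ≈ 43 cm across, which the 50 cm wide box covers.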


The first sensor tests were made on glass wool, which, as K. Yoshihisa et al. implied in their article, has sound propagation characteristics similar to newly fallen snow [10]. The glass wool came in smaller pieces, which were put into the box to fill all gaps in the measurement area, as viewed in Fig. 3.9.

Figure 3.9: Testing ultrasonic sensor on glass wool with the Arduino Uno.

The next test was made on ice from the local ice hockey rink, since snow was unavailable for most of the time frame of this project (March-June). The ice was grainy, as seen in Fig. 3.10, and nearly impossible to flatten out.

The last tests were made on 1.5-day-old artificial snow. The local ski slope was able to manufacture snow during one night with temperatures below 0 °C. As the ski slope was open, the tests could not be made before the staff closed the facility, which meant that only one day of testing was possible. The snow obtained had been exposed to sunlight for one day, resulting in a packed snow quality with minor grains, see Fig. 3.10. The air temperature measured at the site was 6.1 °C to 7.4 °C.


Figure 3.10: Snow quality shown from the local ice hockey rink (left) and the local ski slope (right).

First, the distance was calibrated for each sensor to the masonite board on which the box was placed when measuring snow depth, see Fig. 3.11. The snow was easy to form into a flat surface in the box. The real height was measured with a ruler, from the sensor to the surface of the snow. Tests were made both on flattened snow in the box (Fig. 3.11) and on untouched snow at the ski slope (Fig. 3.12). For the tests on untouched snow, only the real height to the snow was measured; these tests were made in both direct sunlight and shadowed areas.

Figure 3.11: Tests made on packed snow in the box.


Figure 3.12: Tests made on untouched snow at the local ski slope. Only distance to target was measured in these tests.

3.3.4 Sensor data processing and presentation with Matlab

The data collected from the measurements were processed in Matlab to present the results. Boxcharts were chosen as the data presentation model, as they show the dispersion of the data. Matlab boxcharts have different entities, shown in Fig. 3.13: outliers, upper/lower adjacents, upper/lower quartiles and the median.

The median is the middle value of the array of values sorted from low to high. The upper/lower quartiles are the 75th/25th percentiles (0.75/0.25 quantiles) of the sorted array. The percentiles are the indexed values, v[x], where the indices are calculated as x = 0.75 * l and x = 0.25 * l, with l being the length of the sorted array. The values between the quartiles contain the midmost 50 % of the sorted values, presented as the blue box in Fig. 3.13. The upper/lower adjacents are the extremes which are not outliers. These extremes are calculated from the interquartile range (IQR), see equation 6.

As the data input to the boxchart was a vector, the interquartile range was calculated as the difference between the 75th and the 25th percentiles [65], as described in the equation below.

The upper adjacent is the closest value within 1.5 * IQR above the upper quartile, and the lower adjacent is the closest value within 1.5 * IQR below the lower quartile. All values outside the adjacents are labeled as outliers.

IQR = Q_upper − Q_lower    (6)

where

Q_upper = upper quartile
Q_lower = lower quartile
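For readers without Matlab, the same quantities can be reproduced with NumPy; the sketch below is an illustrative reimplementation, not the project's code, and NumPy's default percentile interpolation may differ marginally from Matlab's.

# Illustrative NumPy version of the boxchart quantities described above:
# median, quartiles, IQR (equation 6), adjacents and outliers.
import numpy as np

def boxchart_stats(values):
    v = np.sort(np.asarray(values, dtype=float))
    q_lower, median, q_upper = np.percentile(v, [25, 50, 75])
    iqr = q_upper - q_lower                  # equation (6)
    upper_fence = q_upper + 1.5 * iqr
    lower_fence = q_lower - 1.5 * iqr
    # Adjacents: the most extreme observations still inside the fences.
    upper_adjacent = v[v <= upper_fence].max()
    lower_adjacent = v[v >= lower_fence].min()
    outliers = v[(v > upper_fence) | (v < lower_fence)]
    return median, (q_lower, q_upper), (lower_adjacent, upper_adjacent), outliers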


Figure 3.13: A Matlab boxchart where outliers, adjacents, quartiles and the median are highlighted. This boxchart type is used in the results, see Section 5.

3.3.5 Web application evaluation poll

The web application was demonstrated to representatives from dizparc, Wexnet and Växjö county. To evaluate the web application with regard to the problem formulation, a poll was created through Google Docs and requests were sent to all participants in the demonstration of the application. The questions, which can be viewed in their original form in Appendix A.2, were based on the parts of the problem formulation concerning the web application. Out of nine sent requests, five participated in the poll. The poll questions and answers were written in Swedish and were translated into English when presenting the results.

3.4 Reliability and validity

As the sensor tests were made in the southern parts of Sweden in March-April, there was little access to snow. The snow from the ski slope was very different from the ice from the ice hockey rink; the ice was grainy and could not be fully flattened out, which affected the results significantly. Therefore, the ice results are not included in the results section.

In the tests made on glass wool, the ultrasonic sensor could not detect any reflected waves. This indicates that the snow quality and/or the characteristics of the surface had a large impact on the results. Therefore, the reliability of the sensor testing can be questioned for snow qualities other than the 1.5-day-old snow. In Fig. 3.11 the tests were made on packed snow, where snow was moved from the ground to the box and flattened out.

As mentioned above, the quality of the snow had a large impact on the results. This also raises the question of how differently packed snow would affect the results, in contrast to the untouched-snow tests shown in Fig. 3.12.

When it comes to snow level prediction, even if the datasets used were not retrieved from the same stations, one should be able to obtain similar results in the algorithm and machine learning sections, as long as the same pre-processing steps are followed and the datasets are obtained from SMHI.


3.5 Ethical considerations

The application built in this project does not prevent attacks such as XSS (Cross-Site Scripting) and CSRF (Cross-Site Request Forgery), where hostile code is injected into the web site code or unauthorized commands are transmitted from a trusted user to the web application. Therefore, it is not suitable for production in its current state. The application was built with focus on the requested functionality, not on security, and further security implementations are mandatory before putting the application into production. The application does not follow the GDPR regulations, as it is only in the development phase. On the other hand, private information, such as passwords, is hashed to maintain some security regarding privacy.

The application poll was made anonymous; therefore, no personal data was collected from the participants, thus preserving their privacy. No further documented interviews or surveys involving sensitive or personal data were conducted.


4 Implementation

With the IoT device and machine learning algorithm in place, the prototype could be implemented. The flowchart presented in Fig. 4.14 gives an overview of the process. After the sensor measurements arrived at the server from The Things Network, the web application took over. As snow was not obtainable when testing the prototype, fabricated data had to be used when sending data over LoRaWAN.

Figure 4.14: Project flowchart of the process for the resulting prototype.

When the sensor device detected a snow depth above the threshold of 2 cm, it sent data including the measured temperature, relative humidity, snow depth, the pitch ID (IBGO facility ID) and a message ID (used for acknowledgements). The data was sent via LoRaWAN to The Things Network, where the installed application translated the data bytes to JSON format and pushed the data to the application server.
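A hypothetical device-side loop tying these steps together is sketched below. It assumes the distance_cm() routine and the LoRa socket s from the earlier sketches; the helper read_rht03(), the baseline distance, the pitch ID and the sampling interval are illustrative assumptions, not the project's exact values.

# Hypothetical glue loop: measure, apply the 2 cm threshold, uplink.
import time

PITCH_ID = 'IBGO-0000'  # placeholder facility ID
BASELINE_CM = 80.0      # calibrated sensor-to-ground distance (assumption)
THRESHOLD_CM = 2.0      # snow depth threshold named above
msg_id = 0

while True:
    temperature, humidity = read_rht03()            # placeholder RHT03 helper
    depth = BASELINE_CM - distance_cm(temperature)  # depth above ground plane
    if depth > THRESHOLD_CM:
        msg_id += 1
        payload = '{};{};{};{};{}'.format(PITCH_ID, msg_id, temperature,
                                          humidity, round(depth, 1))
        s.send(payload.encode())                    # uplink over LoRaWAN
    time.sleep(600)                                 # 10 min interval (assumption)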

The measured temperature, humidity and snow depth were sent to the machine learning algorithm along with weather forecast API data, to predict the snow depth for the next two days. When the prediction was made, an email was sent to the pitch authority to inform them that a decision might be needed. The authority would then enter the page, shown in Fig. 4.15, and perform the necessary task.

The machine learning models were saved to a file on the server using the Python library Joblib. Since the server-side code of the application was written in JavaScript and the machine learning models in Python, a pipeline was created between the two languages. This was done by spawning a Python script from JavaScript with the help of the Node.js module child_process. When spawning the Python script, a JSON array containing the values to be predicted was passed as an argument. The results were then printed to standard output and returned to the JavaScript side.
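The Python end of that pipeline can be sketched as follows; the file name model.joblib and the feature layout are assumptions for illustration, not the project's exact script.

# predict.py -- sketch of the Python end of the pipeline described above;
# 'model.joblib' and the feature layout are illustrative assumptions.
import sys
import json
import joblib

model = joblib.load('model.joblib')     # Random Forest model saved with Joblib
features = json.loads(sys.argv[1])      # JSON array passed in by child_process
prediction = model.predict([features])  # predict from one row of feature values
print(json.dumps(prediction.tolist()))  # stdout is read back on the Node.js side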

As human interaction was requested by the project assignees in the decision making for the artificial grass pitches at snowfall, a clear presentation of data, weather forecasts and snow level predictions was found necessary. This motivated an application with a lucid and easy-to-use interface, to present the data and provide a basis for the decision of whether a pitch should be cleared or closed. A confirmed decision should then be automatically dispersed to the responsible persons.

The application code was written in JavaScript using Node.js, with Express.js as the web framework. The engine for rendering views was Express-hbs (Handlebars). MongoDB and Redis were used as databases to store user resources, data measurement resources and session data. Fig. 4.15 shows the web application page, where sensor data and algorithm predictions were presented in the left column under the sections "Mätdata" (measurement data) and "Förväntat snödjup" (expected snow depth), respectively. The bookings and contact information were fetched from the IBGO API, which manages all bookings for Växjö's sport facilities, and were presented below the data presentation row. The weather forecast presented in the right column covered the current date (when entering the page) and two days forward, containing forecast values of temperature, humidity and weather fetched from SMHI's API. Decisions were made by pressing one of the buttons: the yellow button informed the person responsible for clearing the pitch, using contact information fetched from the MongoDB database, while the red button informed the teams that the pitch would be closed, using the contact information found in the bookings section. All emails were sent using nodemailer, a Node.js module for sending emails from a web application with high security in focus. The source code for the application was uploaded to the thesis project repository, 2DT00E-DegreeProject, on GitHub [58].
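The application performs the forecast fetch in JavaScript; purely for illustration, an equivalent request in Python against SMHI's open point-forecast endpoint could look like the sketch below. The endpoint path and the parameter names 't' (air temperature) and 'r' (relative humidity) reflect SMHI's public API documentation but should be verified against the current API version, and the coordinates are an approximation for Växjö.

# Illustrative Python equivalent of the application's JavaScript forecast
# fetch; verify endpoint and parameter names against SMHI's documentation.
import requests

URL = ('https://opendata-download-metfcst.smhi.se/api/category/pmp3g/'
       'version/2/geotype/point/lon/14.806/lat/56.879/data.json')

forecast = requests.get(URL, timeout=10).json()
for entry in forecast['timeSeries'][:3]:
    params = {p['name']: p['values'][0] for p in entry['parameters']}
    # 't' = air temperature (deg C), 'r' = relative humidity (%)
    print(entry['validTime'], params.get('t'), params.get('r'))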

Figure 4.15: Application page presenting data and decision options.


5 Results

5.1 Snow depth measurement

A part of the problem formulation was to investigate which low powered sensors could be used to measure snow depth. The results of the snow depth tests on 1.5-day-old snow at the heights of 40 cm and 60 cm can be viewed in Fig. 5.16 and 5.17. The charts show the deviance from the real snow depth for tests on packed snow, as shown in Fig. 3.11. The number of test samples for each sensor in each test was 60.

[Figure 5.16 boxchart: deviance from real depth (cm), ranging from -5 to 3 cm, for Ultrasonic (Pycom), Ultrasonic (Arduino), Laser (Pycom), Laser (Arduino) and IR (Arduino); chart title: "Snow depth centered around 0 cm (40 cm test height)".]

Figure 5.16: Snow depth tests made at 40 cm height, showing deviance from real depth for packed snow.

Table 5.10: Extreme values from Fig. 5.16 at 40 cm test height. The median and upper/lower adjacents are presented in centimeters, as deviances from the real snow depth. The number of test samples for this test was 60.

Sensor (MC)            Median   Upper adjacent   Lower adjacent   Number of outliers
Ultrasonic (Pycom)     -1.09    -0.96            -1.65            0
Ultrasonic (Arduino)   -0.24    -0.16            -0.35            11
Laser (Pycom)          -2.85    -1.9             -4.1             5
Laser (Arduino)        -2.4     -1.8             -2.9             0
IR (Arduino)           0.25     1.02             -0.61            3
