
Data Quality Study of AMR Systems


Academic year: 2021


UPPTEC IT 15 009

Degree project, 30 credits

August 2015

Data Quality Study of AMR Systems

Adam Viklund


Faculty of Science and Technology, UTH unit. Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0. Postal address: Box 536, 751 21 Uppsala. Telephone: 018 – 471 30 03. Fax: 018 – 471 30 00. Website: http://www.teknat.uu.se/student

Abstract


Energy metering is a constantly changing field with increasing demands to get more measurement data. The implications are systems that are evolving and improving. It is important for data to be of high quality in these systems.

This thesis set out to investigate data quality in automatic meter reading (AMR) systems that are used by energy companies in Sweden today. In order to investigate data quality, a definition was suggested. The definition was used as a basis for interviewing users of AMR systems to figure out the user experience of data quality and to understand what features improve data quality. The interviews were conducted with six different users working at companies that distribute electricity and/or district heating to companies and consumers.

The features improving data quality were used to assess data quality in the open source AMR system called Gurux. A redesign was proposed to improve data quality in Gurux. The data quality parameter that needed to be improved the most was data accessibility. The thesis concludes that, according to the perspectives given by the interviewees, there are many systems where data quality can be improved. Gurux is a system that can help improve data quality if the changes suggested in this thesis are made.


Popular Science Summary

Over the past 20 years, the energy sector in Sweden has undergone major changes. The deregulation in 1996 allowed electricity customers to choose their own electricity supplier. Since then, reforms have been introduced with the aim of making companies and private individuals more aware of their energy consumption and more active in the electricity market. The reforms include a requirement that grid companies base the invoice on actual consumption. If an electricity customer today wants an electricity price contract based on hourly consumption, the grid company must measure this for the customer and forward the measurement values to the electricity supplier. Similar tendencies exist in the district heating sector. On 1 January 2015, a requirement was introduced for district heating companies to invoice based on actual consumption, which means that district heating companies need to read the meters at least once a month.


Contents

1 Introduction
  1.1 Problem Description
  1.2 Limits
  1.3 Thesis Outline

2 Background
  2.1 Purpose of Measuring Energy

3 Method
  3.1 Work Structure
  3.2 Literature Study
  3.3 Interviews
  3.4 Data Quality Assessment of Gurux

4 Literature Study
  4.1 Data
  4.2 Data Quality
    4.2.1 Defining Data Quality
    4.2.2 Formal Data Quality Definition

5 Remote energy meter reading infrastructure
  5.1 Common terms
    5.1.1 Automatic Meter Reading
    5.1.2 Advanced Metering Management
    5.1.3 Advanced Metering Infrastructure
  5.2 Communication
    5.2.1 Communication Structure
    5.2.2 Communication types
    5.2.3 Topologies
  5.3 Communication Protocols
  5.4 Integrations

6 Data Quality of AMR Systems of EDC’s
  6.1 Introductory Remarks
  6.2 The Interview Subjects’ Thoughts on Data Quality
  6.3 Division of labor
  6.4 Data Quality Parameters
    6.4.1 Accuracy
    6.4.2 Timeliness
    6.4.3 Completeness
    6.4.4 Consistency
    6.4.5 Accessibility
    6.4.6 Interpretability
    6.4.7 Reliability

7 Gurux: Assessment of Data Quality
  7.1 Description of Gurux
    7.1.1 Open Source
  7.2 Implementation of Gurux
  7.3 Data Quality Features in Gurux
    7.3.1 Validation
    7.3.2 Templates
    7.3.3 Regular Collection of Data
    7.3.4 Reports
    7.3.5 Time Stamps
    7.3.6 Meter Status After Creation
    7.3.7 Deviation Code
    7.3.8 Requesting missing measurement values
    7.3.9 Interpolation of consumption
    7.3.10 Integrations
    7.3.11 Communication protocol implementations
    7.3.12 Search and filter
    7.3.13 Error messages and meter alerts
    7.3.14 A view for meters not communicating
    7.4.1 System Infrastructure
    7.4.2 Value Handling
    7.4.3 Presentation
    7.4.4 Commands
    7.4.5 Service database interface
    7.4.6 Web Service
    7.4.7 Background service
    7.4.8 Integrations
    7.4.9 Web user interface

8 Discussion and Conclusion
  8.1 Data Quality Definition
  8.2 Data Quality Results from Interviews
    8.2.1 Accuracy
    8.2.2 Timeliness
    8.2.3 Completeness
    8.2.4 Consistency
    8.2.5 Accessibility
    8.2.6 Interpretability
    8.2.7 Reliability
    8.2.8 Summary
  8.3 Data Quality of Gurux
  8.4 Open Source
  8.5 Conclusion


List of Tables

4.1 Dimensions cited from [1]. Numbers denote the importance from the data consumer viewpoint, with 1 being the most important of the 20 dimensions and 20 the least important.

4.2 Flow of dimensions for the definition.

6.1 Data quality features identified in the interviews.

7.1 Export fields

7.2 Summary of changes


List of Figures

2.1 Timeline of Energy metering

7.1 New system communication infrastructure. Dashed rectangles represent new parts to create.

7.2 The new dependency graph of libraries. Dashed rectangles

Chapter 1

Introduction

The term data quality has an intuitive appeal. The first things that come to mind are probably correctness and completeness of data. Although these are very important dimensions of data quality, in most cases they do not constitute a sufficiently complete definition. Whether data can be accessed easily and quickly enough is another example of what a data quality definition should cover.

With the amount of data and data sources increasing more than ever, it is important to measure the quality of data. A strong case for that is made by [2], which states that poor data quality has been costing U.S. businesses more than $600 billion annually.

The purpose of this thesis is to analyze the data quality of AMR systems¹ in use by companies and other organizations in Sweden. These systems communicate with meters to collect data about the energy usage of the consumer. This data is stored and/or sent to other systems that require it. An important role of the operator of an AMR system is to figure out what measurement data is not being sent to the system. In this thesis, data quality will be analyzed from the perspective of the operators of AMR systems. This implies the need to assess not only the quality of measurement values, but also the quality of other data that is important to the operators. This data includes information about, for example, malfunctioning meters and coupling meters with correct properties.

The energy metering domain has for some time undergone big changes. Between 2003 and 2009, almost all electricity meters in Sweden were changed [3] to introduce monthly meter readings. This has changed the environment of meter reading quickly and the trend is currently towards consumption by hour. These changes come with challenges. Data volumes are increasing, amplifying the challenge of communication with the meters. There is a need to keep track of meter failures with shorter intervals.

¹ Automatic Meter Reading (AMR) Systems are used for collecting data from meters, either

Developers of AMR systems have focused mostly on the technical aspects of the systems, including communication and protocols. As a result, the user aspect has been somewhat neglected. The systems have mostly been stable and communication has worked, but failures have not been reported in a satisfactory way. Some of this is changing as more suppliers develop AMR systems exclusively, without manufacturing their own meters.

The number of actors developing AMR systems able to communicate with multiple brands of meters is increasing. Most of these systems are proprietary, and they solve the problem of having to use multiple AMR systems to communicate with different brands of meters. There are also open source applications and libraries. Gurux is an example of an open source system containing a wide array of applications and libraries designed to communicate with energy meters. Being open source, it has a community of contributors and is actively developed.

Data quality is a domain-specific problem [4] and studies usually define the term when assessing data quality in a specific field. These definitions can differ widely depending on the domain and scope of the study. This thesis starts from theories about data quality and how data quality can be defined. Based on a definition that suits AMR systems, interviews are conducted to assess the data quality of AMR systems used in Sweden in general. The interview results provide tools for assessing what data quality Gurux can deliver.

1.1 Problem Description

As explained above, the energy metering climate has changed drastically over the last ten to fifteen years, with a focus on increasing the resolution of energy usage information. There is a need to focus more on what the data users, i.e. the operators of the AMR systems, want and need in order to increase the quality of data and ultimately the quality of these systems. This is one goal of this thesis.

The AMR system is the most important of all the systems in the chain from meter to invoice or visualization. In many cases the system has to implement several communication protocols, and it must identify measurement values correctly so that the systems further down the chain get the correct values for the correct energy property. The system is also important for providing the user with information about meter communication.

Another problem is how to design a system for data quality. Seeing as there already are open source systems available, a good first step is to assess such a system for data quality. This is another goal of this thesis.


• How should data quality be defined for AMR systems?

• What conclusions can be drawn about data quality on AMR systems that Swedish energy distribution companies² use today?

• How can open source in general and Gurux in particular help improve data quality?

1.2 Limits

It is common to make data quality studies for a whole organization, looking at all the data chains. For this thesis, however, data quality is only measured on the AMR system.

The question of how data quality parameters are satisfied today is a very broad question that cannot be answered in a complete manner in the scope of this thesis. It can, however, be partially answered for certain situations. For this thesis it will be answered based on the interviews done with the individuals involved with the AMR system. Furthermore, some data quality parameters that would suit the domain are not chosen, because other methods and more in-depth work are needed to assess them.

Gurux has a few different programs and services available. This work focuses on only one of these programs, based on the source code committed by Gurux at a certain date.

1.3 Thesis Outline

In this chapter, the introduction and goals of the thesis were presented. The next chapter presents the background to energy metering, with emphasis on Sweden.

The method used in this thesis is presented in Chapter 3.

In Chapter 4, literature from the data quality field is presented. This part results in a data quality definition used for the rest of the thesis.

Chapter 5 presents an overview of the infrastructure of remote meter reading systems.

In Chapter 6, the results of the interviews are presented.

² An energy distribution company (EDC) is a company or department that has the

Chapter 2

Background

In order to measure electricity consumption, it has been common to make a prognosis of how much energy will be consumed during the year. The invoice was based on the prognosis and the meter was read regularly to correct the payments. The interval with which meters were read ranged from a few months to a year.

Reading meters manually can be more costly compared to alternatives, such as an AMR system. Using an AMR system also reduces the risk of a surprise in the case where the prognosis is wrong.

The manual work can be reduced with so-called walk-by or drive-by [5] AMR (automatic meter reading) systems. This improves on the model, but some manual work is still required.

The first technology for automatic meter reading was Power Line Communication (PLC), invented in the 1920s [6]. It is a technology using the power line for communication, taking advantage of already existing infrastructure. Today, there are many PLC meters used by energy distribution companies (EDC’s) in Sweden. There are both advantages and disadvantages to this type of communication. The advantages include long distance stability and, compared to technologies using radio, not being disrupted by radio signals. However, depending on the setup, the communication can be disrupted by home appliances that create frequencies on the power line. Using lower frequencies diminishes this problem at the cost of longer transfer times, e.g. a system where messages are successfully sent after 27 hours.

Nowadays, more sophisticated communication methods are used for the transmission of measured energy consumption. Examples include GPRS and mesh radio technology.

Directive 2006/32/EC of the European Parliament and of the Council of 5 April 2006 on energy end-use efficiency and energy services and repealing Council Directive 93/76/EEC emphasizes a few recommendations. In summary, the recommendations are about improving energy efficiency through several means. These means include the following [7]:

• The exchange of information, experience and best practices at all levels.

• Increasing the availability of and demand for energy services, or other energy efficiency improvement measures.

• To have a widespread use of cost-effective technological innovations, such as electronic metering.

These EU directives were made in 2006. In Sweden this development can be said to have started with the deregulation in 1996.¹ In the early stages of deregulation, consumers had to buy the metering equipment themselves, which led to low participation in the deregulated market. The government added a price cap of 2500 SEK for the metering equipment, but participation was still low. In 1999 the rules were changed so that an estimated consumption profile could be used, allowing customers (in specific geographical areas) to participate in the deregulated energy market [7].

This did not incentivize energy consumers to reduce electricity consumption in on-peak or high-peak periods. In 2002, the Swedish energy agency recommended that the government introduce mandatory monthly meter readings [7]. In 2003 the electricity meter reading reform began, and a large part of all meters in Sweden had been replaced by 2009 [3].

Mandates are now in force that require EDC’s to report consumption by hour if it is required. There are also talks about mandating 15-minute values.

2.1 Purpose of Measuring Energy

There are a few different uses for measured energy. The most obvious use is for the invoice. This is where it is the most important that data are correct and of high quality. If the customer pays according to the spot prices on Nord Pool², it is important to have detailed measurements per hour.

¹ The deregulation meant that competition was introduced in the electricity trading market. Privately owned companies are allowed to buy and sell energy. The market still has quite a few regulations.

² Nord Pool is the electricity market place that is operating in Sweden, Norway, Denmark,

A second use is to help the customer make better decisions by visualizing the energy usage. How effective this is depends, among other things, on the resolution of energy usage data. To make smarter decisions about energy consumption, a high resolution is better, e.g. hourly. The effectiveness of visualization also depends on the timing of the visualization, i.e. how soon after consumption the consumer sees the data.


Chapter 3

Method

This chapter describes the methods that have been used in this data quality study. First, an overview of the work structure is presented; then the different methods are described in more detail.

There is a study in the data quality area that uses similar methods as this thesis. That paper creates a data quality definition and uses interviews to investigate data quality of a system [8].

3.1 Work Structure

The work in this thesis was structured in five stages. The first stage was to gather background information about energy metering, with a focus on Sweden. This sets the context of this thesis and provides background knowledge. This was done by searching for studies and documents that describe changes in the field of energy metering and in legislation.

The second stage was the literature study, giving the theoretical framework for the thesis. The results of this literature study were intended to answer the first question. The purpose was to investigate the concept of data quality and to find parameters as the basis for evaluating data quality in AMR systems.

The third stage involved becoming familiar with AMR systems. Common terms were explained and communication infrastructures were researched. The purpose was to make the complexity of the problem clear and to get an idea of what processes provide good or bad data quality in AMR systems.

the purpose was to investigate data quality of AMR systems used in Sweden today.

In the fifth stage, data quality was analyzed in the AMR system, Gurux. This was done by implementing Gurux for an EDC and comparing what features can be found in Gurux with what data quality features were the result from the interviews. This stage answered the third and last question of the thesis. The purpose was to concretize what can be done to improve data quality in this field.

3.2 Literature Study

The literature study has been an ongoing process throughout the work of this thesis. There are three areas where knowledge had to be acquired through literature. These are the background to energy metering, the infrastructure of energy metering and data quality.

The background part mostly covers the development of the energy metering business in general and in Sweden in particular. It was written by looking at resources describing the development and the changes in legislation, such as thesis papers and web pages.

In the literature study of data quality, the first step was to compile several articles, papers and books to get an understanding of the subject. The theoretical framework of this thesis is based on this literature study. After many sources on data quality definitions and parameters had been studied, a definition for data quality could be created. This definition has been used throughout the rest of the thesis project, and allowed interview questions to be structured.

3.3 Interviews

Conducting semi-structured interviews can provide reliable, comparable qualitative data. The interviewer follows an interview guide but allows for the conversation to stray away from the guide in order for the interview subject to freely express ideas that might be of interest [9]. A downside to this type of research method is that the presence of an interviewer may result in biased data.

Six semi-structured interviews were conducted to collect qualitative data about data quality of current AMR systems. The focus point was the users’ personal experience of data quality in the systems.

goal was to interview some with a comparatively small number of meters and some with a large number of meters. The organizations of three interview subjects controlled more than 100,000 meters, two controlled around 25,000 meters and one around 10,000. With this selection, it is possible to get a broader picture of data quality in AMR systems. The number of AMR systems the interview subjects used also differed, ranging from one to four. Two of the interview subjects worked only with electricity metering, one worked only with district heating metering and three worked with both.

Four of the interviews were conducted face-to-face and two were conducted via telephone. The interview guide that was used for the interviews can be found in Appendix B (Appendix A for Swedish). A list of the interview subjects can be found in Appendix C. At the beginning of each interview, the purpose of the thesis was explained. All the interviews were recorded, transcribed and summarized. From the summaries, all the important points were extracted and the results are presented in Chapter 6.

3.4 Data Quality Assessment of Gurux

In order to answer the third question of the thesis, data from the interviews was used to assess data quality in Gurux. The interview results included concrete features that have an impact on data quality. Gurux was analyzed on the basis of these features.

The first step in analyzing data quality in Gurux was to implement it in a real environment, at an EDC which will use Gurux for collecting measurement values. This implementation provided knowledge about what features Gurux has and how to work with it in production. This step gave an idea of what is needed to make sure that Gurux is deployable, and what features the EDC will be able to use in the scope of this project. It also confirmed that Gurux could be used for the purpose of collecting measurement values from energy meters for an EDC in Sweden.

In order to do the implementation, information about how Gurux works was gathered. This was done by visiting their home page, testing the software, including the example meters, building the Gurux programs from the source code, and communicating via the Gurux community forums. Then, requirements were gathered from the EDC to know what minimum features were necessary in order for them to use the system. Some alterations had to be made to the Gurux source code before it could be implemented and used for communicating with meters in production.


Chapter 4

Literature Study

This chapter provides the results from the literature study about data quality. Commonly cited literature is presented and discussed. Towards the end of the chapter, a definition of data quality will be given. Lastly, the definitions are given for the data quality parameters.

4.1 Data

This thesis investigates the concept of data quality. For clarity, it is worth going deeper into the concept of data.

There are many definitions for data in the research literature. One definition is “data is the raw material for information” [10]. However, [10] uses a definition of data that consists of two interrelated concepts, “data models” and “data values”. The “data models” describe what the data means so that context is given to the “data values”.

Data can be said to be a representation of real world objects that can be stored and retrieved. It can be applied to a large number of phenomena, such as measurements, events, characteristics of things and so on [11].

The terms data and information are often used synonymously. Managers, however, use the term information for data that has been processed in some manner [12]. It is also possible to say that data is one kind of information, where other types of information include what is written on paper or spoken aloud [11].

to create information products [12]. Information manufacturing is a processing system acting on raw data to produce information products [12]. Considering how data is used means that it is important to consider how it is presented and what activities the data is involved in. That is, the data models, data values and the presentation are important for the definition of data.

4.2 Data Quality

A widespread general definition of data quality is “fitness for use” [13][14]. This definition implies that data quality is a domain specific problem, due to the fact that data that is fit for one use might not be fit for another use. It also implies that, in order to analyze data quality, you have to understand and anticipate what the data will be used for and how it is going to be used [14].

It is common to think of data quality simply as a problem of accuracy of data. Common errors that occur in information systems are typos or other processes that corrupt the data. However, the problem of data quality is much bigger than data accuracy [11, p. 4–5]. Other dimensions have to be considered, e.g. data has to be understood by the user.

There are three approaches to data quality used in the literature: an intuitive, a theoretical and an empirical approach. The intuitive approach is when data quality attributes are selected by the researchers based on experience or understanding about what attributes are “important”. This approach is the most commonly used in data quality studies. In the theoretical approach, focus is on how data may become deficient during the data manufacturing process. Although often recommended, it is seldom used in research. The empirical approach to data quality collects data from data consumers to determine how they assess whether data are fit for use in their specific tasks [1]. This study will use an intuitive approach.

Table 4.1: Dimensions cited from [1]. Numbers denote the importance from the data consumer viewpoint, with 1 being the most important of the 20 dimensions and 20 the least important.

Rank  Name of dimension
1     Believability
2     Value-added
3     Relevancy
4     Accuracy
5     Interpretability
6     Ease of understanding
7     Accessibility
8     Objectivity
9     Timeliness
10    Completeness
11    Traceability
12    Reputation
13    Representational consistency
14    Cost-effectiveness
15    Ease of operation
16    Variety of data and data sources
17    Concise
18    Access security
19    Appropriate amount of data
20    Flexibility

second category of parameters that are subjective and user centered. Examples of these desired features are interpretability and relevancy [10, p. 74].

The view that data quality can be divided into two categories is shared by [15]. It states that “companies must deal with both the subjective perceptions of the individuals involved with the data, and the objective measurements based on the data set in question” [15, p. 211]. It also confirms that data quality is a multidimensional concept [15, p. 211].

In [11], Batini and Scannapieco describe four common data quality dimensions. These dimensions are accuracy, completeness, currency (and other time-related dimensions) and consistency [11, p. 20]. These attributes are what [10, p. 74] would call free of defects.

4.2.1 Defining Data Quality


others. Consider integer values that are supposed to be summed together. If one or more values are missing, it is a problem with completeness. But the sum will also be wrong, i.e., there is also a problem with accuracy.
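The summation example can be sketched in a few lines of Python. This is only an illustration under assumed values: the series and the helper name `sum_readings` are hypothetical and do not come from any AMR system.

```python
from typing import Optional, Sequence, Tuple


def sum_readings(values: Sequence[Optional[int]]) -> Tuple[int, int]:
    """Return (sum of the values that are present, number of missing values)."""
    present = [v for v in values if v is not None]
    return sum(present), len(values) - len(present)


# Hypothetical series: the third value was never delivered.
expected = [4, 7, 5, 6]
received = [4, 7, None, 6]

total, missing = sum_readings(received)
true_total = sum(expected)

# One missing value (a completeness problem) makes the sum differ
# from the true total (an accuracy problem).
print(total, missing, true_total)
```

Reporting the number of missing values together with the sum lets a consumer of the sum see that the accuracy problem is caused by incomplete data.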

In choosing data quality parameters, we use the four common parameters in [11] as a starting point. They are often cited in the literature and they are intuitively important. The parameters are:

• Accuracy

• Completeness

• Timeliness (related to currency)

• Consistency

In choosing the rest of the parameters, the reasoning is based on the parameters in Table 4.1. The results of the reasoning are illustrated in Table 4.2.

Table 4.2: Flow of dimensions for the definition.

Base Parameters                     Resulting Parameters
Access security                     Accessibility
Flexibility                         Accessibility
Relevancy                           Accessibility
Accessibility                       Accessibility
Ease of operation                   Accessibility
Interpretability                    Interpretability
Concise                             Interpretability
Ease of understanding               Interpretability
Accuracy                            Accuracy
Believability                       Reliability
Objectivity                         Reliability
Reputation                          Reliability
Timeliness                          Timeliness
Completeness                        Completeness
Appropriate amount of data          Completeness
Traceability                        Completeness
Representational consistency        Consistency
Cost-effectiveness                  Excluded
Value-added                         Excluded
Variety of data and data sources    Excluded

It is important that data be trustworthy, dependable and able to be relied upon. This is commonly called reliability [14]. In this parameter we also include a) believability, because if data is reliable it is also believable, b) objectivity, which is the degree to which data is objectively true, and c) reputation, that tells us whether the data consumers feel that data is reliable.

Accessibility is another important parameter. Even more parameters could be merged into that parameter, such as relevancy, ease of operation, access security and flexibility. Relevancy is included in accessibility because it is about accessibility to relevant data. Ease of operation is about what you can do with accessible data and that it is easily accessible, so for this thesis it is included in the accessibility parameter. Access security makes sure that the data is accessible only to the correct data users, and that accessibility is not compromised. Flexibility can be seen as a part of accessibility because it is the degree to which accessible data can be manipulated.

It is also important that data is easily understood by data consumers. Ease of understanding means that data is easily interpretable. Further, for data to be interpretable, it needs to be concise enough and not overwhelming.

Traceability is a special case of completeness, because whether data is traceable depends on the scope of the information. Appropriate amount of data is also merged with completeness because both are about the scope of the data. Hence, the last parameters become:

• Accessibility

• Reliability

• Interpretability

In conclusion, the parameters are chosen so that there is one quite widely inclusive parameter for each area. For example, everything having to do with time in data quality is considered under timeliness. This simplifies the definition for the thesis. Furthermore, value-added and cost-effectiveness need other ways of measuring and are not directly related to the user of the system in this case.
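The merging described in this section can also be written down as a simple lookup structure. The sketch below is only an illustration of Table 4.2: the names `DIMENSION_TO_PARAMETER` and `group_by_parameter` are hypothetical, and `None` is used here as an assumed marker for excluded dimensions.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Base dimension -> resulting parameter (None marks an excluded dimension).
DIMENSION_TO_PARAMETER: Dict[str, Optional[str]] = {
    "Access security": "Accessibility",
    "Flexibility": "Accessibility",
    "Relevancy": "Accessibility",
    "Accessibility": "Accessibility",
    "Ease of operation": "Accessibility",
    "Interpretability": "Interpretability",
    "Concise": "Interpretability",
    "Ease of understanding": "Interpretability",
    "Accuracy": "Accuracy",
    "Believability": "Reliability",
    "Objectivity": "Reliability",
    "Reputation": "Reliability",
    "Timeliness": "Timeliness",
    "Completeness": "Completeness",
    "Appropriate amount of data": "Completeness",
    "Traceability": "Completeness",
    "Representational consistency": "Consistency",
    "Cost-effectiveness": None,
    "Value-added": None,
    "Variety of data and data sources": None,
}


def group_by_parameter(mapping: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Collect the base dimensions that were merged into each parameter."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for dimension, parameter in mapping.items():
        if parameter is not None:
            grouped[parameter].append(dimension)
    return dict(grouped)


grouped = group_by_parameter(DIMENSION_TO_PARAMETER)
excluded = [d for d, p in DIMENSION_TO_PARAMETER.items() if p is None]
print(sorted(grouped))  # the seven resulting parameters
```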

4.2.2 Formal Data Quality Definition

Data quality is the degree to which data is fit for its intended use, free of defects and satisfactory for the data consumer, where the following parameters satisfy those constraints:

• Timeliness (objective)

• Consistency (objective)

• Accessibility (subjective)

• Reliability (subjective)

• Interpretability (subjective)

In this definition, data that is of high or good quality contains the properties necessary to have a high degree of data quality. In contrast, low or bad data quality means that the data has a low degree of data quality. High and good data quality will be used interchangeably throughout the thesis. The same goes for low and bad data quality.

4.2.3 Data Quality Parameters

In this section, the data quality parameters in Section 4.2.2 are defined and described. It is noteworthy that there is generally no commonly accepted definition of the different parameters. A few definitions have been found and the one best suited for this thesis has been picked out.

4.2.3.1 Accuracy

Accuracy is the degree to which data corresponds with the real world. It is defined as the closeness between a value and its real-world value that it is meant to represent [11]. As an example, if the real consumption of energy is 2.65 kWh and the measurement value shows 2.67 kWh, while not completely correct there is some degree of closeness and depending on the application, it may or may not be accurate enough.

Similarly, a database contains a large set of values. These values are meant to represent the real world in some manner. But this database might be known to contain fields that are erroneous and it is only 85% accurate in aggregate. This, in analogy with the example above, might be accurate enough for one application but not for another [16].

There are many examples where data accuracy can be lacking for this thesis. E.g., it is common that connections to meters fail. If it turns out that no data is showing that this happened, then data about the status of the meter is inaccurate.

4.2.3.2 Timeliness

Timeliness can be expressed both as how current data is for the task at hand and as a relation between the degree of currency and volatility. Currency describes how promptly data are updated and volatility is the frequency with which data vary in time [11, p. 28–30].

AMR systems need to run often in order to fetch measurement values and to get knowledge about malfunctioning meters. If measurement values are not collected in time for invoicing, this implies bad timeliness of data.
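The relation between currency and volatility can be illustrated with a small sketch. It assumes one formulation that is often used in the data quality literature, timeliness = max(0, 1 − currency/volatility), where currency is taken as the age of the data and volatility is interpreted as the length of time the data stays valid. The function name and the example numbers are hypothetical and are not taken from any AMR system.

```python
def timeliness(currency_hours: float, volatility_hours: float) -> float:
    """Return a score in [0, 1]; 1 means fully current, 0 means outdated.

    Assumed formulation: max(0, 1 - currency / volatility).
    """
    if volatility_hours <= 0:
        raise ValueError("volatility_hours must be positive")
    return max(0.0, 1.0 - currency_hours / volatility_hours)


# Hypothetical example: a value that is 6 hours old and stays valid for
# 24 hours, compared with a value that is 30 hours old.
fresh = timeliness(6, 24)
stale = timeliness(30, 24)
```

Under this assumption, a measurement value that arrives after the period in which it is needed (for example after invoicing) gets a score of zero, regardless of how accurate it is.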

4.2.3.3 Completeness

Completeness can be defined as the extent to which data is of sufficient breadth, depth and scope for the task at hand [1].

Three types of completeness can be identified. Schema completeness is the degree to which concepts and their properties are not missing from the schema. The measure of the missing values for a specific property or column in a table is called column completeness. There is also population completeness, which evaluates missing values with respect to a reference population [11, p. 23–24]. An example from AMR systems is the list of malfunctioning meters: it is crucial that all meters that lost communication are in it. Another example is a measurement value series that does not contain all values, i.e. an hour value is missing.
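Column completeness for a measurement value series can be sketched as follows. The example is only an illustration: the data layout (a dictionary from timestamp to kWh value), the function names and the numbers are assumptions and do not describe any particular AMR system.

```python
from datetime import datetime, timedelta


def column_completeness(values):
    """Share of non-missing entries in a column (None marks a missing value)."""
    if not values:
        return 1.0
    present = sum(1 for v in values if v is not None)
    return present / len(values)


def missing_hours(series, start, hours):
    """Return the expected hourly timestamps that are absent from the series."""
    expected = [start + timedelta(hours=h) for h in range(hours)]
    return [t for t in expected if series.get(t) is None]


# Hypothetical hourly series (kWh) for four hours with one value missing.
start = datetime(2015, 1, 1, 0)
series = {
    start: 1.2,
    start + timedelta(hours=1): 1.1,
    start + timedelta(hours=3): 1.4,
}
gaps = missing_hours(series, start, 4)
ratio = column_completeness(
    [series.get(start + timedelta(hours=h)) for h in range(4)]
)
```

Here the third hour is reported as a gap and the series is 75% complete for the period.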

4.2.3.4 Consistency

Consistency is the degree to which data is represented in the same way in different views or in different systems. Also, if the data is incorrect in the other view or system, there is a problem with consistency.

Consistency can also be called logical consistency.

There might be inconsistencies to the system that the gathering system sends its data to. The result can be that consumption values are switched between two customers of a company and ultimately the wrong measurements are the basis of the invoice.

4.2.3.5 Accessibility

has access to relevant data for the task. The ease of retrieval also extends to the way in which data is represented and how suitable the medium is through which the data is accessed [17].

It is important that accessible data is relevant for the task at hand. When data is scattered or hard to filter, accessibility is decreased.

4.2.3.6 Interpretability

Interpretability is the “extent to which data is in appropriate languages, symbols, and units, and the definitions are clear” [15].

Interpretability simply means that data is possible to understand for the data user in the context. Data should be defined clearly and represented appropriately [18].

4.2.3.7 Reliability

Reliability is the extent to which data is dependable, trustworthy and can be relied upon [18]. It is linked with believability and the reputation that the data has. If data is reliable, data users also believe the data and it would also follow that data gets a good reputation.

Chapter 5

Remote energy meter reading infrastructure

In order to get a clear picture of what data is needed and where things can go wrong, it is a good idea to look at what the communication infrastructure looks like in energy metering. Common terms will be explained and the different types of possible communications will be walked through.

5.1 Common terms

There are a few common terms that appear when it comes to reading energy meters remotely.

5.1.1 Automatic Meter Reading

Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic and status data from an energy meter. That data is transferred to a central system where the EDC can use it for billing and analyzing [7].

5.1.2 Advanced Metering Management

remote changes in contracted power or price schemes. This can enable more advanced price models and actions related to demand-response measurements [7].

5.1.3 Advanced Metering Infrastructure

Advanced metering infrastructure (AMI) includes additional features to AMM. It includes advanced metering devices, communication, advanced metering management and customer oriented systems. It refers to both hardware and software.

5.2 Communication

5.2.1 Communication Structure

There are four interesting components to look at when it comes to communicating consumption into the central AMR system.

• Meter
• Data Concentrator
• Communication medium
• Central AMR system

The meter is the device responsible for measuring energy consumption for the energy property and transmitting it using the communication medium. It is usually structured by having a meter module and a communication module. The meter module measures consumption and keeps all data about the meter and the communication module is responsible for data transmission. Some meters only support one-way communication but there is an increase in meters supporting two-way communication.

Before sending to and receiving from the central AMR system, it is common to have a data concentrator, sometimes referred to as a communication master. Each concentrator communicates with multiple meters in a given area, depending on the communication structure. The reason for using a concentrator is that the communication infrastructure gets cheaper, as it is expensive to have direct communication with all meters. It also gets easier to manage. On properties with high consumption it is common to use direct communication instead of having a data concentrator.

system. Examples of communication media include power line, radio, phone line and GPRS.

The main role of the central AMR system is to collect consumption from meters and send that data to the systems that require it for their task. The system connects to data concentrators and in some cases directly to the meters. The AMR system has certain control over meters and concentrators, like setting the time, turning off the meter, among other things. Many AMR systems include reporting of meter failures and other things that can go wrong in the overall system. It also sometimes includes validation of the energy consumption, in cases where meters send obviously wrong consumption data. Data is regularly collected from the meters by the central AMR system.

5.2.2 Communication types

This section describes communication types used in meter communication and focuses on the physical layer of the OSI-model.

5.2.2.1 Power-Line Communication

Power-line communication (PLC) means that transfer of data is made on existing infrastructure, namely the power line supplying electricity. It can be divided into two main categories: Narrowband Power Line (NPL) and Broadband Power Line (BPL). These refer to different frequency bands to send data over. For NPL the frequency range is between 3 kHz and 148.5 kHz in Europe. The range for BPL is between 2 MHz and 30 MHz. Other differences between the two include information capability, required power and attenuation (affecting propagation of waves and signals) [7].

A data concentrator is always used when using power-line communication. The concentrator, in turn, uses another communication type to communicate with the central AMR system.

Power-line communication has both advantages and disadvantages. One advantage is that it can transfer data over a long distance, making it common to have PLC to meters in the countryside. Another advantage is that it is not disrupted by radio signals, making it easier to communicate in the center of a big city, for example. The technology is also cheap, as only small additions of infrastructure are needed for the communication to work. It is possible for utilities other than electricity to use power-line communication as well; district heating is an example.

27 hours in some cases.

In Europe, a wide range of projects are actively rolling out systems based on PLC, like DLMS-PRIME [5, p. 45] and Meters And More [19][20]. In spite of this, for EDC’s in Sweden the communication problems mostly occur where PLC is used and the future plans are mostly to phase out communication based on PLC [EDC1][EDC2].

5.2.2.2 Wireless / Radio

The license-free radio ISM bands are used for communication with wireless technology. These radio bands are typically 433 MHz, 868 MHz, 915 MHz and 2.4 GHz[5].

Radio communication provides a good infrastructure to use the mesh topology (discussed below). It makes it easier for nodes to communicate directly with each other.

In Sweden there is generally a movement towards radio technology [EDC1][EDC2]. It is considered to be quite a stable technology and it produces very few errors.

5.2.2.3 GPRS

The mobile telecommunications technology, GPRS, allows for direct communication to the central AMR system, either via a concentrator or directly to the meter. It is a common technology to use when communicating directly with the energy meters.

Communication over mobile phone technology is quite rarely used. It is mostly used only where the consumption is high enough to justify the cost [EDC1]. GPRS is the most common technology when it comes to energy data over mobile telecommunications networks. It is considered reliable and the biggest argument against it is the cost.

5.2.3 Topologies

There are different demands on the AMR system depending on the topologies of the communication system. In this section four different topologies are described.

In a bus network, nodes are connected in a daisy chain by a linear sequence of buses. This is a common topology when using PLC.

A tree topology is essentially a combination of the bus topology and the star topology. The nodes of the bus topology are replaced by star networks.

A mesh networking topology is where all nodes in the network are interconnected with each other. Each node in the network relays data. It is common to use this topology when using radio transmission, and the communication to the AMR system goes via a communication master (or data concentrator). This system tolerates communication master failures, because meters find other lines of communication.
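How a mesh can tolerate the failure of a communication master can be illustrated with a simple route search. This is a sketch under stated assumptions: the network layout, the node names and the use of a breadth-first search are invented for illustration, and real mesh protocols use their own routing mechanisms.

```python
from collections import deque


def find_route(links, source, target, failed=frozenset()):
    """Breadth-first search for a path from source to target, skipping failed nodes."""
    if source in failed or target in failed:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in links.get(node, ()):
            if neighbour not in visited and neighbour not in failed:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical mesh: two meters, two communication masters, one central system.
links = {
    "meter_a": ["meter_b", "master_1"],
    "meter_b": ["meter_a", "master_2"],
    "master_1": ["meter_a", "amr"],
    "master_2": ["meter_b", "amr"],
    "amr": ["master_1", "master_2"],
}
normal = find_route(links, "meter_a", "amr")
rerouted = find_route(links, "meter_a", "amr", failed={"master_1"})
```

When `master_1` is marked as failed, the data from `meter_a` is relayed through `meter_b` and `master_2` instead.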

5.3 Communication Protocols

This section describes some common communication protocols used in AMR systems.

The DLMS/COSEM (IEC 62056) specification contains a set of standards for energy metering data exchange. It was developed as a way for smart metering to have a communication standard [21]. DLMS is an abbreviation of Device Language Message Specification [21]. COSEM is an abbreviation of Companion Specification for Energy Metering. It sets the rules for data exchange with energy meters [21].

Another common protocol for meter communication is M-Bus (Meter-Bus). It is mainly used for the remote reading of gas and electricity meters, but is also usable for other types of consumption meters. There is a wireless standard based on M-Bus, called Wireless M-Bus.

ZigBee is a protocol providing the network infrastructure required for wireless sensor network applications. It is an interoperable protocol, meaning that different applications can work together. ZigBee provides low-cost, low-power, wireless mesh network technology. The specification is a suite of high level protocols, based on wireless personal area networks (WPANs). The devices have to conform to the IEEE 802.15.4 standard, which contains specifications on the physical layer and the data link layer [5].

5.4 Integrations

There are at least two other uses for consumption data. The operators of the energy grid may gain some value from data in the AMR system. An example is to detect when a phase in the electric power goes down in the distribution of electricity to a property. Another use is to export measurement values for presentation to the energy customer.

Chapter 6

Data Quality of AMR Systems of EDC’s

The following chapter presents the results of the interviews. Problems, examples of good data quality and suggestions for improvements are identified, all relating to data quality. This chapter is the first part of answering the second question of the thesis: “What conclusions can be drawn about data quality on systems Swedish energy distribution companies use today?”

The answers are categorized under the parameters of the data quality definition in section 4.2.2. There are, however, many issues discussed that could be categorized under multiple parameters, but they will not be repeated. The answers are categorized under the parameter that is best suited.

All the interviews were conducted in Swedish. The quotes are translated from Swedish to English with the intent that the meaning of the quote is correct. The interviews are referred to in the format [EDCx], where x is a number. The size of the company for each interview subject is stated in appendix C.

6.1 Introductory Remarks

All interview subjects have explained that they work with what does not work in the system, i.e. when measurement values are not entering the system. If the meters, the communication and the AMR system are functioning correctly, the system collects data and makes exports of measurement values. Operators¹ need to be able to track malfunctioning meters. This makes it important to identify e.g. how reports work, how communication failures are presented and how missing measurement values are presented.

¹ An operator is a user of an AMR system. Their responsibility is to make sure that

“[The AMR systems] are doing their task. But you should be able to see what is missing in a simple and comprehensive manner, when you get to work.” [EDC2]

Many interview subjects chose meters that were supposed to manage the requirements implied by the legislation between 2006 and 2009. In many cases, the meters had a minimum of features to obey the law, but not much more. As the years went by, they figured out more about what is important to think about and can see the mistakes that were made [EDC3].

During the period between 2006 and 2009, many decisions were made in panic, as one interview subject noted [EDC4]. For this reason, there has not been much focus on user interfaces and quality of data (other than measurement values). Another issue that EDC’s are becoming increasingly aware of is vendor lock-in. EDC’s work to increase their options, not being dependent on a single entity for energy metering.

A few interview subjects’ EDC’s are setting higher requirements on AMR systems. The requirements increase data quality in the AMR systems, especially when it comes to availability of meters [EDC5][EDC6].

6.2 The Interview Subjects’ Thoughts on Data Quality

When asked what data quality means to the interview subjects, they thought about quality of measurement data. The following are some examples of how they defined it.

• That measurement values have the correct status [EDC1].
• That correct measurements come in with the correct time stamp [EDC4].
• That measurement values are collected [EDC2].
• That measurement values are reliable [EDC5].
• That the correct measurement from the correct meter is inserted at the right “place”² [EDC6].

While measurement values were their focus at the beginning of the conversation, after more discussion and consideration, other data was talked about.

6.3 Division of labor

In smaller EDC’s, it is common that one person works with the AMR system. All tasks from changing meter to checking measurement values are done by this person [EDC2][EDC1].

Larger EDC’s usually involve more operators in their team. How tasks are divided depends on the situation with the system. One interview subject’s team divided their meters among different operators. Two operators worked with hourly measured meters, and six to seven operators worked with monthly measured meters.³ Within these groups, the tasks are divided further, having operators responsible for certain tasks. These tasks include making tariff changes and collecting a specific meter reading (e.g. when a customer changes electricity trading company) [EDC5].

If an EDC has multiple AMR systems, the operator team can be divided among the systems, having one or more operators responsible for each system [EDC4].

6.4 Data Quality Parameters

This section presents the results of the interviews, categorized under each parameter defined in 4.2.2. Examples of both high data quality and low data quality are identified and presented.

6.4.1 Accuracy

As defined in section 4.2.3.1, accuracy measures the degree to which data corresponds to the real world.

6.4.1.1 Ensuring measurement value accuracy

Most interview subjects talked about correctness of measurement values when asked about accuracy. Measurement values in the AMR system corresponding to values in the meter are not necessarily correct. Wrong measurement values may enter either the meter (through a measuring error) or the AMR system (due to e.g. communication failure).

According to one interview subject, when their validation process⁴ detects erroneous measurements, they are blocked from transferring to other systems and the operators are notified. This improves accuracy for measurement values [EDC5].

³ When properties consume a high amount of energy, it is usually measured per hour. This requires more workload as the requirements are higher for those meters.

⁴ The process of detecting errors in measurement value series is called “validation” and is

Validation has advantages beyond detecting meter and communication failures. Meters can be tied to the wrong property in the AMR system, for a variety of reasons. This alters the consumption of the property and is identified through validation [EDC5].

“We have seen meters sending massive errors in measurement values. Some meters that should have sent around 10 MWh sent 1 GWh instead.” [EDC1]
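A validation rule of the tolerance-interval kind can be sketched as below. The rule, the default tolerance of ±50% and all names are assumptions made for illustration; the validation processes described by the interview subjects are not specified in this detail.

```python
def validate(values_mwh, expected_mwh, tolerance=0.5):
    """Split measurement values into accepted and blocked lists.

    A value is blocked when it deviates from the expected consumption by more
    than the given fraction (0.5 means +/- 50 %). Both the rule and the
    default tolerance are assumptions of this sketch.
    """
    low = expected_mwh * (1 - tolerance)
    high = expected_mwh * (1 + tolerance)
    accepted, blocked = [], []
    for value in values_mwh:
        (accepted if low <= value <= high else blocked).append(value)
    return accepted, blocked


# A meter expected to report about 10 MWh that reports 1 GWh (1000 MWh) once.
accepted, blocked = validate([9.8, 10.4, 1000.0], expected_mwh=10.0)
```

In this sketch the 1 GWh value is blocked, which corresponds to keeping it from being transferred to other systems until an operator has looked at it.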

Errors are corrected where they occur, according to one interview subject. An error with the value in the meter is corrected in the AMR system. If the error occurred in a system that the AMR system is integrated with, it is corrected in that system [EDC4].

One interview subject said that they can fine tune the consumption patterns⁵ in their validation process in order to improve accuracy [EDC6].

6.4.1.2 Manual error

A manual error mentioned among interview subjects is setting up the wrong consumption constant⁶ for a meter.

“Maybe if you would enter the wrong transformer constant or something like that. But it is very rare, we are very strict when we do these things.” [EDC3]

“We have been working a lot with documentation of our electricity properties for a very very long time and seen the significance of how important it is that data is accurate.” [EDC3]

rules include a tolerance interval (too high consumption) and analyses of normal consumption behaviour. The rules vary depending on the energy type. District heating is more predictable when comparing the consumption to the outside temperature.

⁵ A consumption pattern is used as a basis for validation to determine whether the measurement values are correct. An example of a consumption pattern is a yearly profile, month by month, used in order to check if the measurement values deviate too much.

⁶ Transformers are used when the voltage is too high for the meter to measure the

6.4.1.3 Templates

In some systems, interview subjects are using meter profiles (or templates) when adding or changing meters in the AMR system. These profiles include meter manufacturer and other parameters that are suitable for the meter type. This procedure helps ensure that the information about each individual meter is correct [EDC3].

6.4.2 Timeliness

Timeliness is the extent to which data is current for the task at hand. It can be expressed as a relation between the degree of currency and volatility.

6.4.2.1 Late measurement values

The AMR systems discussed in the interviews collect data from the meters every night. If the communication medium and AMR system are up and running, everything is presented to the operator in the morning.

“We are collecting measurement data daily so everything gets delivered every day. There might, of course, be some meters where there is a day they did not deliver measurement values, but that will always happen for some meters.” [EDC2]

Working with large sets of meters increases the regularity of meter failure. It is common for some meters not to communicate during a given night. Some EDC’s have a limit on the number of days a meter may fail to communicate before the problem is analyzed [EDC6]. This limit depends on the property, e.g. the size of the main fuse and the energy contract the customer has on the property. If only monthly readings are of interest, a longer time can pass before the problem is corrected. It is more urgent if the customer has a payment model based on hourly consumption [EDC1].

“In some industries, it is sometimes hard to set a good consumption so that it varies between specific measurement values.” [EDC1]

How long before measurement values enter the AMR system depends on the communication type in some cases [EDC1].

In some systems it is possible to see when the measurement value entered.

“You can click on a measurement value and see when it came in.” [EDC1]

Turtle⁷ meters are always one day late with the measurement data. This is expected as Turtle meters communicate over the power line on low frequency to increase stability [EDC4].

In no system could operators see the time of the last connection attempt, as it is assumed that connection attempts are made during the nightly data collection. There is, however, a time stamp on the latest measurement value, allowing for operators to conclude when the latest successful data collection occurred. Sometimes nothing is updated during a nightly data collection, or nothing is collected from one of the data concentrators [EDC4].
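Using the time stamp of the latest measurement value to decide which meters to investigate can be sketched as follows. The limits per metering type, the meter identifiers and the function name are hypothetical; the interview subjects only state that such a limit exists and that it depends on the property.

```python
from datetime import datetime, timedelta

# Hypothetical limits: days without data before a meter is investigated.
LIMIT_DAYS = {"hourly": 1, "monthly": 7}


def meters_to_investigate(last_values, now):
    """Return ids of meters whose latest value is older than their limit.

    last_values maps meter id -> (metering type, timestamp of latest value).
    """
    overdue = []
    for meter_id, (metering_type, last_seen) in sorted(last_values.items()):
        if now - last_seen > timedelta(days=LIMIT_DAYS[metering_type]):
            overdue.append(meter_id)
    return overdue


now = datetime(2015, 3, 10, 8, 0)
last_values = {
    "m1": ("hourly", datetime(2015, 3, 10, 0, 0)),   # collected last night
    "m2": ("hourly", datetime(2015, 3, 8, 0, 0)),    # silent for two nights
    "m3": ("monthly", datetime(2015, 3, 5, 0, 0)),   # within the monthly limit
}
overdue = meters_to_investigate(last_values, now)
```

Only the hourly measured meter that has been silent for two nights is flagged, while the monthly measured meter stays within its longer limit.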

6.4.2.2 Meter status

One interview subject mentions the desire to be notified whether or not meter communication works after setting up a new meter. If there was a failure, however, it is seen the day after in the report of missing measurement values [EDC2].

“It would be good if there was a check immediately without an extra step required by the user.” [EDC2]

One interview subject mentions setting deviation codes when there are meter failures. This deviation code is removed when the AMR system notices that the communication to the meter is working again, keeping data up to date [EDC5].

6.4.3 Completeness

Completeness is the extent to which data is of sufficient breadth, depth and scope for the task at hand.

6.4.3.1 Incomplete measurement value series

Sometimes there are missing values in the measurement value series. The AMR system of one interview subject requests these missing values, in addition to allowing the user to request values manually [EDC5].

⁷ Turtle is a meter brand used by many EDC’s in Sweden. They communicate over the

When hourly values are missing and the AMR system is unable to retrieve them, it is possible to interpolate the consumption from meter readings. When there are meter readings per day and multiple hourly values are missing, the values are calculated with the meter reading and the rest of the hourly values as a basis [EDC3][EDC5].
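The calculation described above can be sketched as follows. The daily consumption is taken as the difference between two meter readings, and the part not covered by known hourly values is spread evenly over the missing hours. The even distribution is a simplifying assumption of this sketch; the systems used by the interview subjects may weight the hours differently.

```python
def fill_missing_hours(hourly_kwh, reading_start_kwh, reading_end_kwh):
    """Fill None entries so that the hours sum to the daily consumption.

    The daily consumption is the difference between two meter readings. The
    part not covered by known hourly values is spread evenly over the missing
    hours, which is a simplifying assumption of this sketch.
    """
    daily = reading_end_kwh - reading_start_kwh
    known = sum(v for v in hourly_kwh if v is not None)
    missing = sum(1 for v in hourly_kwh if v is None)
    if missing == 0:
        return list(hourly_kwh)
    share = (daily - known) / missing
    return [share if v is None else v for v in hourly_kwh]


# Hypothetical example with four hours, two of which are missing.
filled = fill_missing_hours([2.0, None, None, 3.0], 100.0, 110.0)
```

The filled series sums to the 10 kWh given by the meter readings, so the total consumption stays consistent with the meter even though the individual hours are estimates.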

The AMR system, in use by one interview subject, creates a performance report at 4 am each morning, called “P4”. This report contains a percentage indicating how many of the meters are working. That report is used as a basis for deciding what to start troubleshooting [EDC5].

6.4.4 Consistency

Consistency is the degree to which data is represented in the same way in different views or in different systems.

In order for the system receiving measurement values to identify what meter or property the data is tied to, there is one or more identification numbers transferred with the data. The interview subjects said that the AMR systems are doing this by exporting text files.

6.4.4.1 Troubleshooting

The team that one interview subject is involved with works with the meter data management system most of the time. This system is where errors are first noticed, after which troubleshooting continues in the AMR system [EDC6].

6.4.4.2 Duplicate manual work

When doing meter changes, one interview subject needs to work with both the meter data management system and the AMR system manually, making the same change twice. This increases the risk of breaking logical consistency because of the increased chance of manual error [EDC6].

by an operator and a message is sent to the customer information system. The AMR system notices that the meter no longer communicates and indicates that new information is needed in the AMR system [EDC4].

6.4.4.3 Grid performance

Operators of the energy grid have uses for the data collected by the AMR system, for example controlling what temperature to use for the water in a district heating grid [EDC6].

The company of one interview subject is conducting a project for smart grid applications. The project idea is to measure electricity quality (e.g. the voltage) which is collected by the AMR system and transferred to the system that operators of the electricity grid use [EDC4].

One interview subject mentions that there is increased interest in the data from their AMR system, especially for operating the energy grid [EDC3].

6.4.4.4 System differences

It is common to add transformer constants in the AMR system. In cases where this is not done, the constant is added in the meter data management system [EDC2].

6.4.4.5 Double security

If there is validation in several systems, an error is less likely to pass through. The different systems use different algorithms and it is more likely that errors will be found [EDC4].

6.4.5 Accessibility

Accessibility is the extent to which data is available, or easily and quickly retrievable.

The question of Accessibility is divided into three parts. What data is available, how is it presented and what can be done with available data?

6.4.5.1 Search and filter information

According to one interview subject, meters can be searched for with filter functions, including filters based on errors. When searching for meters, there is a lot of data not needed for the specific task, but it is easy to filter as the data is structured with columns and rows, just like a database [EDC5].

For the AMR system of one interview subject, the system administrators change what data the reports contain [EDC3].

“For example missing measurement values, we want to know what hourly measured meters are malfunctioning every day. But if we get the monthly measured meters in the same alert system, then we would get alerts about a lot of meters that were malfunctioning.” [EDC3]

One interview subject has to create an export from the customer information system in order to get a list of meters with missing measurement values [EDC1].

“I wish that there would be a page in the AMR system, for example missing meters, that you immediately get a list of those values that did not arrive. Then you would not have to create an export in the customer information system.” [EDC1]

The same interview subject states that alerts can be better structured [EDC1].

“It would have been good with an alerts page to configure what you can see. Then I could configure what I want these categories to show when it comes to alerts. That would improve the overview in the system.” [EDC1]

“We don’t care about all the alerts you can send if you can’t present them to me.” [EDC4]
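The kind of configurable alert view that the interview subjects ask for can be sketched as a simple filter. The alert fields, the category names and the function name are hypothetical and only illustrate the idea of separating hourly measured meters from monthly measured meters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    meter_id: str
    metering_type: str   # e.g. "hourly" or "monthly"
    category: str        # e.g. "missing_values" or "voltage"


def select_alerts(alerts, metering_types, categories):
    """Keep only alerts matching the operator's configured view."""
    return [
        a for a in alerts
        if a.metering_type in metering_types and a.category in categories
    ]


alerts = [
    Alert("m1", "hourly", "missing_values"),
    Alert("m2", "monthly", "missing_values"),
    Alert("m3", "hourly", "voltage"),
]
# Daily view: only hourly measured meters with missing measurement values.
daily_view = select_alerts(alerts, {"hourly"}, {"missing_values"})
```

With such a view, the monthly measured meters do not drown out the hourly measured meters that need attention every day.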

One interview subject talked about a system they were looking at that solves the problem with alerts in a better way. Before the meters send out alerts, they communicate with each other to find out if other meters have the same error. This decreases the amount of alerts sent to the AMR system [EDC4].

Some AMR systems include errand handling [EDC1]. Other AMR systems integrate with errand handling systems [EDC5].

One interview subject can sort data, build groups and create specific criteria for presentation [EDC3].

6.4.5.2 Missing information or features

In most cases, the interview subjects could not think of that much information that was missing in the systems. But some improvements could be identified when the interview subjects thought about how to better present malfunctioning meters and other problems that could occur.

A missing feature that is mentioned is a map of the meters in the AMR system. If many meters in the same area stop working at the same time, it is quicker to locate the problem if the meters are seen on a map. It is probable that meters malfunctioning in the same area stopped working for the same reason [EDC5][EDC4].

Another desired feature is a dashboard. This would immediately give an overview of how well the overall system is working. The dashboard would make it easier to identify how much work is needed during the day [EDC4].

According to another interview subject, important data include alerts. The systems that this interview subject is using do not have built-in alert functions in their AMR systems [EDC2].

Another interview subject says that seeing malfunctioning meters as early as possible is most important [EDC3].

6.4.5.3 Reading from multiple meters

Some interview subjects’ EDC’s are only using one AMR system, either by limiting themselves to only one or two meter brands, or by having systems that can communicate with two or more brands. Most of the interview subjects that do not already have only one system to work with are striving to have only one system.

One interview subject that combined an AMR system with a specific meter brand said that it works really well when everything works, but as soon as there is a failure, it is hard to troubleshoot. The system did not recognize how to present the failure in a way that was aligned with the meter type [EDC4].

A few interview subjects mentioned that openness is a concept that gets introduced more and more into the field. According to one interview subject, what is meant by openness is not very clear [EDC4].

6.4.5.4 Application usage

An interview subject talked about an AMR system accessed with a client that needed to be accessed by logging in to a server [EDC4].

There is an advantage using a web interface in that you do not need licenses for the installed clients [EDC3].

“You do not need to install a bunch of clients and buy licenses for them. It is free to create users and add access rights.” [EDC3]

6.4.5.5 Accessibility of measurement values

To increase the availability of measurement values, the interview subjects and their organizations are working with replacing meters and changing the way that they communicate.

When it comes to measuring district heating, there are more parameters of the measurements that need to be sent. These include two temperatures, meter reading of flow and meter reading of energy [EDC6].

6.4.5.6 System stability

The interview subjects stated that their systems are stable, have good uptime and are accessible most of the time.

6.4.5.7 Provider support and upgrades

6.4.5.8 Accessibility from competence

One EDC is creating reports and visualizations with custom tools, interfacing with the database [EDC4].

When there are voltage interruptions on phases, some systems display this data. It is shown as an event and is read from a specific register in the meter [EDC1].

6.4.6 Interpretability

Interpretability is the extent to which data is in appropriate language, symbols, and units, and the definitions are clear.

6.4.6.1 Experience

Experience makes data in the AMR system more easily interpretable.

“We have had these systems for a few years so we know most things, when there are errors and services that go down that needs restarting.” [EDC2]

In general, the interview subjects understand the data presented to them. They have worked with these types of systems for a long time which gives them the expertise that they need.

An example of knowledge an operator needs is the difference between a time tariff and a simple tariff, and how consumption is measured in relation to each. With a simple tariff, the energy customer pays the same price for consumption regardless of the time of day. With a time tariff, customers pay more for consumption during certain hours of the day. When a time tariff is used, measurement data are stored in two different registers, depending on whether the consumption is high price or low price. With a simple tariff, measurement values are stored in only one register [EDC5].
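The register behaviour described above can be illustrated with a minimal Python sketch. The register names (`total`, `high`, `low`), the `Meter` class and the high-price window are assumptions made for illustration; they are not taken from any AMR system discussed in the interviews.

```python
from dataclasses import dataclass, field

# Hypothetical high-price window; real time-tariff hours are set by the grid company.
HIGH_PRICE_HOURS = range(6, 22)  # hours 06 through 21


@dataclass
class Meter:
    """Illustrative meter that accumulates consumption in tariff registers."""
    time_tariff: bool
    registers: dict = field(default_factory=dict)

    def record(self, hour: int, kwh: float) -> None:
        """Add the consumption of one hour of the day (0-23) to the right register."""
        if not self.time_tariff:
            name = "total"   # simple tariff: a single register
        elif hour in HIGH_PRICE_HOURS:
            name = "high"    # time tariff, high-price hours
        else:
            name = "low"     # time tariff, low-price hours
        self.registers[name] = self.registers.get(name, 0.0) + kwh
```

With a time tariff, consumption at hour 3 and hour 12 ends up in two different registers, whereas a simple-tariff meter adds both to the same register.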

6.4.6.2 Presentation of data


Only one interview subject had been using a system that presented meters on a map [EDC4]. This makes the data easier to understand in a quick way, instead of having to search using external tools for specific addresses [EDC5].

6.4.6.3 Alerts and error messages

Alerts (that meters send to the AMR system) and error messages are sometimes hard to understand, according to some of the interview subjects. When alerts and error messages are not understood, colleagues are asked or the support is contacted [EDC4][EDC5].

“You can get alerts, sometimes, if you are trying to do some things that you don’t really understand. But then you ask the system provider and you get support.” [EDC3]

6.4.6.4 Coherence

When one system is used for different meter brands, it is easier to use and understand: everything looks the same regardless of meter brand, which increases familiarity with the system [EDC3].

One interview subject thinks that having one system for multiple meter brands increases interpretability, because familiarity with the system is increased [EDC3].

The same interview subject thinks that more insight into how the system works is gained with the new systems [EDC3].

“Thanks to the new systems, I feel like we have more insight. It isn’t as much black box, so to speak, before you get a measurement value. You understand the route in a better way.” [EDC3]

6.4.6.5 Simplicity

One interview subject’s EDC is constantly working on simplification [EDC3].


6.4.7 Reliability

Reliability is the extent to which data is dependable, trustworthy and can be relied upon.

6.4.7.1 Measurement values and validation

The interview subjects felt that the data in the AMR systems are reliable. For some interview subjects, the reliability of measurement values is assured by the validation process, which is considered to catch errors and to give good notifications.

“We trust our measurement values. There is also a status parameter conveying if the value is calculated or if there was a power outage that hour. When this happens, it is mandatory to investigate because there are non-approved measurement values.” [EDC1]

“I think that these validation rules are correct and good. If a value is not approved for some reason, I can find it and correct it. I have never experienced a case where there has been unreliable or non-validated measurement values that have been transferred to the customer.” [EDC5]
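The status parameter mentioned by [EDC1] can be sketched as follows. This is only an illustration under stated assumptions: the status names, the `Reading` type and the `needs_investigation` function are hypothetical and do not reflect the data model of any specific AMR system.

```python
from enum import Enum
from typing import NamedTuple


class Status(Enum):
    """Hypothetical status values attached to each hourly measurement value."""
    MEASURED = "measured"          # read from the meter, approved
    CALCULATED = "calculated"      # value was calculated, not measured
    POWER_OUTAGE = "power_outage"  # a power outage occurred during the hour


class Reading(NamedTuple):
    meter_id: str
    hour: int
    kwh: float
    status: Status


def needs_investigation(readings):
    """Return the readings whose status marks them as non-approved."""
    return [r for r in readings if r.status is not Status.MEASURED]
```

An operator view built on such a filter would list only the values that have to be investigated before they are passed on to the customer.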

In one AMR system, measurements that are “too high” or “too low” get a status, to detect erroneously measured consumption. Setting limits for this status is harder on properties where the consumption fluctuates too much [EDC1].

6.4.7.2 System change

According to [EDC3], when switching systems (which many of the interview subjects are currently doing) it can be hard to trust the new system immediately. Reliability is assured by running the two systems in parallel and comparing the results [EDC3].
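A comparison of this kind could look like the minimal Python sketch below. It assumes that both systems can export measurement values keyed by meter and timestamp; the function name `compare_systems`, the key format and the tolerance are assumptions made for illustration, not a description of how [EDC3] performs the comparison.

```python
def compare_systems(old, new, tolerance=0.001):
    """Compare measurement values exported from two AMR systems.

    Both arguments map (meter_id, timestamp) to a measurement value.
    Returns a dict of key -> (old_value, new_value) for every key where
    the values differ by more than the tolerance or where one system
    lacks the value (reported as None).
    """
    diffs = {}
    for key in sorted(old.keys() | new.keys()):
        a, b = old.get(key), new.get(key)
        if a is None or b is None or abs(a - b) > tolerance:
            diffs[key] = (a, b)
    return diffs
```

An empty result over a trial period would support trusting the new system, while any reported key points to a meter and hour worth investigating.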

6.5 Summary of data quality features


Table 6.1: Data quality features identified in the interviews.

| Feature | Data Quality Parameter(s) |
|---|---|
| Validation | Accuracy, Reliability |
| Templates | Accuracy |
| Regular Collection of Data | Timeliness |
| Reports | Completeness, Accessibility |
| Time Stamps | Timeliness |
| Meter Status After Creation | Timeliness |
| Deviation Code | Timeliness, Accessibility |
| Requesting missing measurement values | Completeness |
| Interpolation of consumption | Completeness |
| Integrations | Consistency, Accuracy |
| Communication protocol implementations | Consistency, Accessibility, Interpretability |
| Search and filter | Accessibility |
| Error messages and meter alerts | Accessibility, Interpretability |
| A view for meters not communicating | Accessibility |
| Errand handling | Accessibility |
| Map | Accessibility, Interpretability |
| Dashboard | Accessibility |


Chapter 7

Gurux: Assessment of Data Quality

This chapter presents an assessment of the data quality provided by Gurux. The results, which are intended to answer the third and final question of this thesis, are presented in section 7.3. First, Gurux and the programs it includes are described briefly. Secondly, the features already provided and the changes necessary to start the project are described. Thirdly, the features of Gurux are compared with the features that increase data quality, identified in chapter 6.

7.1 Description of Gurux

There are a few software projects of Gurux that are found on Github (https://github.com/Gurux). Gurux Device Suite is a high-end product that basically includes three major interfaces described as:

Gurux Device Editor is software for creating meter templates that correspond to a specific version of a meter. Templates are created in one of two ways: by adding the correct objects, tables and registers to the template, or by automatically fetching all the registers from the meter by communicating with it. This template is later used to add meters using other Gurux software.
