
Academic year: 2022


Centralized log management for complex computer networks

Marcus Hanikat

KTH Royal Institute of Technology

School of Electrical Engineering and Computer Science


Abstract

In modern computer networks, log messages produced on different devices throughout the network are collected and analyzed. The data from these log messages gives the network administrators an overview of the network's operation, allows them to detect problems with the network, and helps them block security breaches. In this thesis, several different centralized log management systems are analyzed and evaluated to see if they match the requirements for security, performance and cost that were established. These requirements are designed to meet the stakeholder's requirements for log management and to allow for scaling along with the growth of their network. To prove that the selected system meets the requirements, a small-scale implementation of the system was created as a "proof of concept". The conclusion reached was that the best solution for the centralized log management system was the ELK Stack, which is based upon the three open-source software packages Elasticsearch, Logstash and Kibana. The small-scale implementation of the ELK Stack showed that it meets all the requirements placed on the system. The goal of this thesis is to help develop a greater understanding of some well-known centralized log management systems and why their usage is important for computer networks. This is done by describing, comparing and evaluating some of the functionalities of the selected centralized log management systems. This thesis can also provide people and entities with guidance and recommendations for the choice and implementation of a centralized log management system.

Keywords

Logging; Log management; Computer networks; Centralization; Security


Abstract (Swedish)

In modern computer networks, logs are produced on different devices in the network and then collected and analyzed. The data in these logs helps the network administrators get an overview of how the network operates, allows them to detect problems in the network, and lets them block security holes. In this project, several relevant systems for centralized logging are analyzed against the established requirements for security, performance and cost. These requirements are set to meet the stakeholder's requirements for log management and to allow for scaling alongside the growth of their network. To prove that the chosen system fulfills the established requirements, a small-scale implementation of the chosen system was also set up as a "proof of concept". The conclusion drawn was that the best centralized logging system given the stated requirements was the ELK Stack, which is based on three different open-source software systems called Elasticsearch, Logstash and Kibana. The small-scale implementation of this system also showed that the chosen logging system meets all the requirements placed upon it. The goal of this project is to help develop knowledge about some well-known systems for centralized logging and why their use is of great importance for computer networks. This is done by describing, comparing and evaluating the selected systems for centralized logging.

The project can also help people and organizations with guidance and recommendations for the choice and implementation of a centralized logging system.

Keywords (Swedish)

Logging; Log management; Computer networks; Centralization; Security


Table of contents

1 Introduction ... 1

1.1 Background ... 2

1.2 Problem ... 3

1.3 Purpose ... 4

1.4 Goals ... 4

1.4.1 Benefits for society, ethics and sustainability ... 5

1.5 Research Methodology ... 5

1.6 Stakeholder ... 7

1.7 Delimitations ... 7

1.8 Disposition ... 8

2 Log management ... 10

2.1 Log source groups ... 10

2.2 Log severity levels ... 11

2.3 Log processing pipeline ... 12

2.4 Centralized log management structures ... 13

2.5 Logging policy ... 16

2.6 Related work ... 16

3 Research methodologies and methods ... 19

3.1 System development methodologies ... 19

3.2 Research phases ... 20

3.3 Data collection... 20

3.4 Setting up system requirements ... 21

3.5 System selection process ... 21

4 System requirements ... 24

4.1 Logging policy ... 24

4.1.1 Log generation... 24

4.1.2 Log transmission ... 25

4.1.3 Log storage and disposal ... 26

4.1.4 Log analysis ... 27

4.2 Requirements ... 28

5 System selection ... 30

5.1 Splunk ... 30

5.2 ELK Stack ... 32

5.3 Graylog ... 37

5.4 System selection ... 40

5.4.1 Stack Overflow ... 40

5.4.2 Cost of implementation ... 41

5.4.3 Scalability ... 42

5.4.4 Open source ... 44

5.4.5 Criterion summarization... 44

6 System implementation ... 46

6.1 Network topology ... 46

6.2 Parsing log data ... 49

6.3 Encryption, Authentication and Integrity ... 51


6.4 Data persistency and availability ... 53

6.5 Scaling the system ... 55

6.6 Generating alerts and X-Pack ... 56

6.7 Kibana visualization ... 57

7 Results... 59

7.1 Logging policy and system requirements ... 59

7.2 System selection ... 60

7.3 Proof of concept implementation ... 60

7.3.1 Log collection and processing ... 60

7.3.2 Log transportation and persistency ... 63

7.3.3 Log data visualization ... 65

7.3.4 Alerting and X-Pack ... 68

8 Discussion ... 70

8.1 Research and system development methods... 70

8.2 Logging policy ... 70

8.3 System selection ... 71

8.4 System implementation... 72

8.5 Visualization of data ... 73

9 Conclusion ... 74

10 Future work ... 76

References ... 77

Appendix A – Logging policy questionnaire ... 86

Appendix B – Elasticsearch configuration ... 87

Appendix C – Kibana configuration ... 88

Appendix D – Logstash configuration ... 89

Appendix E – Logstash input configuration ... 90

Appendix F – Logstash filter configuration ... 91

Appendix G – Logstash output configuration ... 92


List of figures

Figure 1: Empirical research cycle ... 6

Figure 2: A two-layered centralized log management system ... 13

Figure 3: A four-layered centralized log management system ... 15

Figure 4: Architecture of the ELK Stack system ... 33

Figure 5: An implementation of the ELK Stack in a complex network ... 36

Figure 6: A basic view of a multi-node setup of a Graylog system ... 38

Figure 7: Network topology which the ELK Stack system was deployed in ... 48

Figure 8: The Logstash pipeline ... 49

Figure 9: Parsing of log data using a Grok filter ... 50

Figure 10: The authentication page in the Kibana web interface ... 53

Figure 11: A parsed syslog message using a Grok pattern ... 62

Figure 12: A packet capture of HTTP communication ... 64

Figure 13: A standard log message display in Kibana ... 65

Figure 14: A line chart visualization created within Kibana ... 66

Figure 15: A pie chart visualization created within Kibana ... 67

Figure 16: The X-Pack monitoring feature for an Elasticsearch node ... 68

Figure 17: The X-Pack monitoring feature for Logstash nodes ... 68


List of tables

Table 1: A basic description of the basic log source groups ... 10

Table 2: The severity levels from the syslog standard ... 11

Table 3: Pipeline stages which log messages are put through ... 12

Table 4: Criteria ratings received and the summarization received ... 45


List of acronyms and abbreviations

SOHO  Small Office/Home Office
JSON  JavaScript Object Notation
XML   Extensible Markup Language
SIEM  Security Information and Event Management
PuL   Personuppgiftslagen
TCP   Transmission Control Protocol
TLS   Transport Layer Security
RELP  Reliable Event Logging Protocol
IPS   Intrusion Prevention System
VLAN  Virtual Local Area Network
UDP   User Datagram Protocol
HTTP  Hypertext Transfer Protocol
LDAP  Lightweight Directory Access Protocol
APM   Application Performance Monitoring
AI    Artificial Intelligence


1 Introduction

The number of devices connected to the internet and to computer networks around the world is constantly growing [1]. As the number of devices connected to a network increases, so do the amount of traffic and the complexity of the network. Most devices in a network run some sort of application or service which produces log messages. These log messages can contain vital information about the system or software execution, and information about security threats to a device or the network.

A log file is a collection of messages created when events occur during a program's execution. These events can be anything from a failed user authentication to an error condition within the program. A log message is created to describe the event that occurred and is then stored within the program's log file. These log files can then be parsed for information about how the program is executing or about any security incidents.

Nowadays, a computer network commonly consists of many different devices. Since there are often several programs executing on each device, each device will in most cases produce multiple log files. As the number of devices within a network increases, it can become difficult to manage all the log files they produce. For the person or entity hosting these devices, the log messages can provide crucial information for improving a device's or network's operation and strengthening its security. Because of this, a good centralized log management system is becoming a crucial part of any larger network today. The centralized log management system is used to collect and store the logs from different devices and allows easier access to the log data.

Log collection and management is nowadays a must in almost all larger networks. It can be used in small office/home office (SOHO) networks with just a couple of devices as well as in larger enterprise networks with possibly thousands of devices. All these networks can benefit from log management to different extents. As more devices are added to a network, the implementation of a centralized log management system becomes more and more attractive and, at some point, even necessary. A centralized log management system can help to reduce the time spent organizing and analyzing log messages [2]. One of the disadvantages is that it can have a steep learning curve, and the setup might be very time consuming. Also, depending on the chosen system for centralized log management, the time required for the setup process as well as the cost of the system ranges from low to very high [3].


1.1 Background

The number of log messages produced in computer networks is constantly growing [4]. Alongside this growth, the effort required to handle, store and analyze these log messages increases. Log messages are important since they tell the reader a lot about how an application or device is currently operating. For example, the log messages can contain information about severe problems or security issues. Collecting and analyzing these log messages is very important but can also be a time-consuming task. In a relatively large computer network with over 1 000 endpoints, the data produced by log messages could reach about 190 GB per day [5]. When the amount of data produced reaches these levels, the administrating entity must rely on help from computers to parse the log data efficiently. A computer can sort out which information is relevant, might be interesting, or requires further investigation.

It is often the network and system administrators who are responsible for analyzing log messages from devices within an entity's network. But in today's computer networks it is unfeasible to employ enough administrators to analyze all the log messages. Instead, the administrators often deploy and use systems where all log messages pass through a filtering and analysis process before they are reviewed or stored. This process can use data mining to extract the valuable information from the log messages [6].

Since the endpoints and devices within the network run a wide variety of operating systems and often produce differently formatted log messages, it would be very impracticable and time consuming to analyze the log messages locally on every device. A more common solution is to have each endpoint and device within the network send its log messages to one, or several, designated log parsers and analyzers [7]. These designated log parsers and analyzers can parse the data from the log messages and then analyze it to see if the information is relevant or redundant. The data can then beneficially be stored in a common format such as JavaScript Object Notation (JSON) [8] or eXtensible Markup Language (XML) [9], so that all log messages are stored in the same format. JSON and XML are two different ways of formatting data into a standardized structure for storage or transport. Within JSON and XML files, data such as timestamps and IP addresses can be standardized fields, searchable across different log formats.
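As an illustration of this standardization step, the following Python sketch parses a raw syslog-style line into a JSON document with standardized, searchable fields. Both the example line and the regular expression are simplified illustrations for this thesis, not the full syslog grammar used by real parsers.

```python
import json
import re

# A hypothetical raw syslog-style line.
raw = "Feb 11 14:32:07 web01 sshd[2412]: Failed password for root from 192.168.1.77 port 55872"

# A simplified pattern with named groups for the standardized fields.
pattern = re.compile(
    r"(?P<timestamp>\w{3}\s+\d+\s[\d:]+)\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>\w+)\[(?P<pid>\d+)\]:\s"
    r"(?P<message>.*)"
)

record = pattern.match(raw).groupdict()

# Lift any IP address out of the free-text message into its own
# standardized, searchable field.
ip = re.search(r"\d{1,3}(?:\.\d{1,3}){3}", record["message"])
record["source_ip"] = ip.group(0) if ip else None

print(json.dumps(record, indent=2))
```

Once stored in this form, a query for a timestamp or an IP address works the same way regardless of which device or application produced the original message.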

When the log messages have been parsed and analyzed, they are stored for a period of time or until they have been reviewed. Today many entities use a Security Information and Event Management (SIEM) system to help with the review process [10]. These tools allow for searching and correlation of log messages and can be used to monitor the log information and alert administrators when a problem occurs. They also allow presentation of the information in a more human-readable format, such as graphs, charts or tables. This can save a large amount of the time spent on analyzing log messages and opens new possibilities for correlating and finding relationships between different events throughout the network.
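A toy example of the kind of correlation such tools perform is counting repeated failed authentications from the same source and raising an alert above a threshold. The event records and the threshold below are hypothetical illustrations, not the output or configuration of any particular SIEM product.

```python
from collections import Counter

# Hypothetical parsed log events; in a real deployment these would come
# from the log management system's storage, not a hard-coded list.
events = [
    {"type": "auth_failure", "source_ip": "10.0.0.5"},
    {"type": "auth_failure", "source_ip": "10.0.0.5"},
    {"type": "auth_failure", "source_ip": "10.0.0.5"},
    {"type": "auth_success", "source_ip": "10.0.0.9"},
]

THRESHOLD = 3  # an arbitrary example value; real thresholds are policy decisions

# Count failed authentications per source address and flag any source
# that reaches the threshold.
failures = Counter(e["source_ip"] for e in events if e["type"] == "auth_failure")
alerts = [ip for ip, count in failures.items() if count >= THRESHOLD]

print(alerts)  # a real SIEM would notify an administrator instead
```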


1.2 Problem

There are several different reasons for storing and analyzing the log messages produced within a computer network. For example, in Sweden internet service providers are required to keep the dynamically assigned IP addresses of their clients for a period of time due to the IPRED law [11]. With the help of this law the authorities can request information about a specific IP address to see who or what device it belonged to at a given time. If a crime has been committed using a leased IP address, the authorities can trace the IP address to a specific person. These log messages are specified by law to be kept for a specific period before they are removed. Other log messages, such as informational and debug log messages, do in most cases not require storing for longer periods of time. Instead they can be deleted within a couple of days, or as soon as the problem described within them has been resolved.

An evolving problem related to computer log management is how to distinguish the valuable information from the non-valuable. It might be feasible to extract information from a single log message or a couple of them, but extracting information from all the log messages produced within a large network is a tough task. To have network administrators handle all the information produced by devices in larger networks, companies would have to hire a lot of personnel just to parse through all these log messages. This would introduce a huge cost for entities maintaining larger networks. Because of this, it is in their interest to introduce automatic parsing systems for log messages. These parsing systems should display only the valuable information to the person reviewing the log messages. A common term within the industry is SIEM [10], software that can provide analysis of log messages in real time. This software often supplies the network administrators with tools to better visualize the log data and to help with correlation, alerting and analysis.

The focus of this thesis is on exploring and finding a viable and scalable centralized log management system for large and complex networks. Centralized log management is an important part to implement in larger networks and comes with many benefits. There is currently a large number of different systems available for centralized log management, with a wide variety of features and functionalities, and it can be hard to tell which one is suitable for a given network and its requirements. So, from all these different systems, which ones are suitable to use for centralized log management and visualization in a large and complex network? A thorough investigation into the most well-known systems on the market will be conducted in this thesis to answer this question.


1.3 Purpose

This thesis aims at presenting the most appropriate solution for a centralized log management system for the stakeholder's, EPM Data AB's, network, and at exploring what benefits can be reaped from this system. The reason this problem needs to be solved is that without a proper log management system, managing their network and its security will become increasingly difficult. With a functional and easy-to-manage log management system, the time spent on network management will be greatly reduced. Security issues such as intrusions or malware will also become a lot easier to detect and manage. This will save the stakeholder both time and effort while maintaining full control of their network.

In this thesis several different log management systems will be presented and evaluated. This will help the reader develop a greater understanding of these centralized log management systems and their benefits. Centralized log management is a quickly rising concern within network management, and this thesis aims to shed some light on possible issues and solutions within this field.

Before this thesis was started, the stakeholder was using Splunk [12] as their centralized log management system. Hopefully, the system presented in this thesis will be able to offer an improved implementation or an alternative system.

1.4 Goals

The goal of this thesis is to propose a centralized log management system which can be beneficially used in the stakeholder's and other complex networks. The selected system must meet the requirements established for the project. To achieve this goal the following sub-tasks must be fulfilled:

1. Construction of a basic logging policy based on the stakeholder’s requirements.

2. Use the logging policy and information from background research to set the system requirements for the log management system.

3. Selection of centralized log management system which meet the requirements established.

4. Implementation and configuration of a small-scale solution which meets the project requirements and follows the logging policy.

The results presented by this project will be a comparison of several well-known centralized log management systems as well as recommendations for implementing the chosen system for this project. This will help the stakeholder by giving them a proposed and proven small-scale system for further deployment within their network. Since there are several different systems discussed within this thesis, it can also be used as a guideline for selection of log management system within a wide variety of networks. Because of this,


other companies and entities should also have an interest in the results presented by this thesis.

At the end of this thesis the goal is to present a small-scale system which is designed to comply with the crafted logging policy. The configuration of this system will be produced to serve as a guideline for a full-scale implementation within the stakeholder’s network and other networks in the future.

1.4.1 Benefits for society, ethics and sustainability

As the security threats to networks and devices increase, so do the demands for a system to help sort out and present the information produced by networks or devices. This thesis aims to help the reader better understand the requirements for collection, storage and analysis of log data. The implementation of a centralized log management system can help to increase the visibility and correlation of attacks and other threats to networks and devices. This helps society by allowing for better understanding and more thorough investigations when these attacks occur.

Some ethical problems arise when data collection is involved. It is important to make sure that no personal data is stored within log messages, since this can violate a person's privacy and integrity. It is the responsibility of the entity collecting the log data to make sure that no personal data is captured and stored, or that the personal data is treated according to applicable laws. Some countries have laws to protect people's personal data. For example, in Sweden there is the PuL (Personuppgiftslagen 1998:204) [13] law, which states how personal data should be handled. It is very important for entities collecting personal data to make sure that laws such as this are upheld.
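As a minimal sketch of how such a safeguard might look in practice, the following Python function masks two common kinds of potentially personal data (email addresses and IPv4 addresses) before a log message is stored. The patterns are deliberately simple illustrations; what counts as personal data, and whether masking is sufficient, must be decided by the logging policy and the applicable law.

```python
import re

# Illustrative patterns only; real redaction rules must follow the
# applicable law and the entity's logging policy.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

def redact(message: str) -> str:
    """Mask potentially personal data in a log message before storage."""
    message = EMAIL.sub("<email>", message)
    message = IPV4.sub("<ip>", message)
    return message

print(redact("Login by anna@example.com from 192.168.1.10"))
```

Running such a filter at the log collection stage means personal data never reaches long-term storage at all, which is usually easier to defend than deleting it afterwards.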

This project also helps to address some of the ethical aspects of computer security and the morality of hacking and other computer crimes. Computer security is very important if an entity is storing personal or other sensitive data within its network. A centralized log management system can help with investigations of potential crimes, also called computer forensics. It can also help to detect ongoing attacks using visualization and correlation methods, and it is also possible to strengthen security with the help of the data from the collected log messages.

1.5 Research Methodology

There are two distinct groups of research methods: quantitative and qualitative research. Quantitative research is concerned with measurements of data; as the name suggests, it tries to quantify things [14]. The research methods which belong to the quantitative group, such as experimental and deductive research [15], are concerned with gathering and generalizing data from, for example, surveys. The qualitative research methods, on the other hand, focus on the quality of information and perception. They also aim at providing a detailed description of what is observed or experienced [14]. Applied research and conceptual research are examples of qualitative research methods [15].

This thesis relies on empirical and applied research methods, both of which fall under the qualitative group of research methods. The aim of the empirical research method is to develop a greater understanding of practices and technologies through collection and analysis of data and experiences [15]. Using experiments, observations and experience as proof, the empirical research method draws conclusions about the researched topic. The applied research method is often based upon existing research and uses real work to solve problems [15].

In this thesis the empirical research method will be used during the investigation of the possible solutions for the centralized log management system. Empirical research uses observation of, for example, experiments to gain knowledge of the area of investigation [15]. When the system to implement has been chosen, empirical research will also be used to set up a small-scale implementation of the chosen system. Finally, with the help of experiments and observations, the aim is to prove that the chosen system is sufficient and meets the requirements.

Figure 1: This figure gives a visual representation of A.D. de Groot's empirical research cycle. The cycle describes the empirical research process and its different phases. From Wikimedia Commons. Licensed under CC for free sharing [16].


In figure 1 above, a visual representation of A.D. de Groot's empirical research cycle can be seen. This cycle describes the workflow of empirical research [17]. The research starts with the observation phase, where the problem is identified and information collected. It continues with the induction phase, where the hypothesis is derived from the observations made. The deduction phase is then used to set up how the hypothesis should be tested. In the following testing phase the hypothesis is put to the test. Finally, in the evaluation phase the results are interpreted and evaluated.

The applied research method will then be used during the implementation of the proposed system to investigate whether it meets the set requirements and solves the problems addressed in this thesis. The applied research method is concerned with answering specific questions regarding a set of circumstances [15]. Applied research often builds on former research and uses data from real work to solve a problem. The results from applied research are often related to a certain situation, which in the case of this thesis is the stakeholder's network. Although the investigation is aimed at this particular situation, the results and conclusions might be applicable to other situations.

1.6 Stakeholder

This thesis was produced in cooperation with the company EPM Data AB. EPM Data is a company focused on providing management of modern IT services. The services provided by EPM Data range from high-availability hardware and servers to cloud desktops with pre-installed applications which can be reached from anywhere. Among the services offered by EPM Data are hosting, virtualization and cloud desktops. The research done in this thesis is supposed to help EPM Data find alternatives to their currently implemented log management system. Since EPM Data focuses on providing worry-free and secure services for their customers, they need a good and reliable log management system to achieve this.

EPM Data has several important requirements which the chosen system and implementation must fulfill. The requirements placed upon the system are designed to allow for expansion of EPM Data's network in the future. Because of this, the requirements are set for a larger network than the one currently deployed by the stakeholder, EPM Data. The requirements placed upon the system will be further discussed in section 4.

1.7 Delimitations

There are many different systems available which can provide centralized log management and visualization. To be able to get a deeper understanding of some of them, only the most well-known systems will be compared, and only one of the compared systems will be implemented and tested. An in-depth performance comparison between the different systems is instead left as a possible thesis proposal for future work.


This thesis will only analyze centralized log management systems which also incorporate visualization and analysis tools. There are several possibilities for collecting and visualizing log data; in this thesis, only the systems which have the functionality to collect, store, analyze and visualize log data will be examined.

In this thesis a basic logging policy will be established, with foundations built on the paper "Guide to Computer Security Log Management" by Kent and Souppaya [18]. This logging policy will serve as a foundation during the selection and implementation of the log management system. However, this thesis will not go into depth about how a good logging policy should be formed, since this is greatly dependent on the network or entity it is created for.

The solution presented in this thesis is only aimed at fulfilling the requirements the stakeholder placed upon the system. This means that the solution might, or might not, be applicable to other networks. If such an attempt should be made, it is strongly recommended that a new logging policy be created for the network and a comparison made to see if the solution is still viable.

The primary focus of this thesis is on centralized log management systems. This means that the discussion and implementation of the visualization and correlation tools that come along with these systems will be limited. These tools will be covered briefly, but the primary focus is on finding a system for centralized log management purposes.

1.8 Disposition

Section 2 of this thesis discusses and presents background information relevant to the work performed within this thesis, including a description of the requirements placed on the sought system, an introduction to logging policies and other related work.

Section 3 presents the methodologies used to solve the stated problems. It also presents some modeling of the network, an analysis of the estimated logging-related traffic within the network, and the system development methods used.

Section 4 discusses and produces a logging policy. Drawing from the logging policy and the results of the background and literature study, the requirements placed upon the centralized log management system are stated.

Section 5 discusses the different viable systems which meet the requirements for this project. An analytical in-depth comparison of the proposed systems is made, and the most suitable system is selected for implementation.


In section 6 the configuration and implementation process of the proof of concept system is described and some of the functionalities are further discussed.

Section 7 will present the results from the work done within the thesis. This includes the results from the logging policy, the system selection and the system implementation.

In section 8 a discussion of the results presented by this thesis will be conducted.

Section 9 will give a quick recap of the work done in the thesis and some of the results presented. Further, section 9 will present some of the conclusions that can be drawn from the results of this thesis.

Finally, section 10 will give recommendations on how the work presented within this thesis can be used for further research into the area.


2 Log management

In this section some of the most distinct characteristics of log messages and log management systems are presented. To begin with, section 2.1 presents common terminology for different log source groups. In section 2.2 an explanation of log severity levels will be given. Section 2.3 presents terminology regarding the log processing pipeline. In section 2.4 differences in centralized log management structures will be discussed. Section 2.5 will introduce logging policies and explain why they might be useful. Finally, in section 2.6 the related work to this thesis will be presented.

2.1 Log source groups

A log is a collection of events that occur during the operation of a device, application or service. Log messages can come from many different sources and have many distinct characteristics. The log messages produced within a network can come from a wide variety of sources, with different operating systems, applications and services installed throughout the network. Some log messages might be more interesting than others to an administrator investigating events in the network, depending on which sources the log messages originate from. There are three primary groups of log messages which are commonly seen in computer networks [18]:

Computer security logs: These log messages contain information about possible attacks, intrusions, viruses or authentication actions against a device.

Operating system logs: These log messages contain information about operating system related events, such as system events, authentication attempts or file accesses on a system.

Application logs: These log messages contain information from applications installed on devices throughout the network, for example web or database servers.

Table 1: This table gives a basic description of the basic log source groups. Most log sources can be divided into one of these groups [18].

These log source groups can be used to filter and divide the log data during searches and analysis to remove irrelevant information. This can help network administrators to save time since the number of log messages that are required to be searched and analyzed is decreased. These groups can also be helpful when establishing logging policies, which will be discussed later in the thesis.


2.2 Log severity levels

Log severity levels are used to show the severity of the occurred event described within the data of a log message. By using log severity levels uninteresting events can be filtered out. The severity level of a log message can also tell a lot about the device operation and if it is in acute need of oversight.

In most cases, the lower its severity level, the more important an event is. If a log message has a low severity level, there might be a severe incident in the network, or on a device, and the network administrators might need to be alerted to the problem. On the other hand, if the log message has a high severity level, the data within it might be deemed unnecessary by the entity's logging policy and discarded rather than stored.

There are several different standards for log severity levels, depending on which operating system, application or device produced the log message. For network and Linux operating system logs, one of the most common standards is the one used by the syslog protocol [19]. Syslog uses 8 different log severity levels, each with its own meaning and numerical code. The numerical code represents the severity of the log message and ranges from 7, the lowest severity, to 0, the highest severity.

Numerical code   Severity
0                Emergency: System is unusable
1                Alert: Action must be taken immediately
2                Critical: Critical conditions
3                Error: Error conditions
4                Warning: Warning conditions
5                Notice: Normal but significant condition
6                Informational: Informational messages
7                Debug: Debug-level messages

Table 2: In this table the severity levels documented in the syslog standard can be seen. These severity levels are used to represent how important the log message is [19].

Table 2 above provides an overview of each log level that syslog uses. These numerical log severity levels are nowadays used by many different software products and other producers of log messages, not only by applications and devices capable of using the syslog protocol.
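The numeric codes in table 2 can be used programmatically to filter out low-priority events. A minimal sketch in Python; the event structure and the threshold of 4 are illustrative assumptions, not part of the syslog standard beyond the numeric codes themselves:

```python
# Syslog severity codes: 0 = Emergency (most severe) ... 7 = Debug (least severe).
SEVERITY_NAMES = {
    0: "Emergency", 1: "Alert", 2: "Critical", 3: "Error",
    4: "Warning", 5: "Notice", 6: "Informational", 7: "Debug",
}

def filter_by_severity(events, threshold=4):
    """Keep events whose numeric severity code is at or below the threshold
    (a lower number means a more severe event, per the syslog convention)."""
    return [e for e in events if e["severity"] <= threshold]

events = [
    {"severity": 2, "msg": "disk failure on /dev/sda"},
    {"severity": 6, "msg": "user logged in"},
    {"severity": 4, "msg": "certificate expires in 7 days"},
]
kept = filter_by_severity(events)
print([SEVERITY_NAMES[e["severity"]] for e in kept])  # prints ['Critical', 'Warning']
```

Note that the comparison is inverted relative to everyday intuition: "keep the important events" means keeping the *low* numeric codes.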


2.3 Log processing pipeline

When log messages are processed and analyzed they pass through what could be described as a pipeline. This pipeline covers all the scenarios a log is put through during its lifetime. The log passes through four phases, ending with the log being disposed of. These four phases can be described as follows [18]:

• Processing – The log is parsed, filtered or aggregated with other events. This phase often changes the log message's appearance to match a more uniform log template.

• Storage – Handles the storing of the log messages and is responsible for actions such as log rotation, compression, archiving and, in some cases, integrity checking.

• Analysis – The information from the log messages is reviewed, sometimes with the aid of tools for correlation and relationship finding between log messages.

• Disposal – When the log messages are no longer needed they reach the end of the pipeline and are disposed of.

Table 3: This table displays the pipeline stages which log messages are put through during their lifetime in a centralized log management system [18].

Together, these four stages represent the lifetime of the log messages from the moment they enter the log management system until they are deleted. Different log management systems will perform different operations within each stage of the pipeline. For example, one system might only sort out the events with high severity in the processing stage. Meanwhile, another system might parse the data from the log in the processing stage and then store it in another format.

The way that the log messages are handled and processed in each stage of the pipeline is therefore dependent on the log management system they pass through.
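The four stages can be sketched as a small Python pipeline. The concrete operations chosen below (dropping debug events, storing as JSON lines, counting per severity, keeping the most recent entries) are illustrative choices, not a prescription from [18]:

```python
import json

def process(raw_events, max_severity=6):
    """Processing: parse/filter events; here, drop debug-level (code 7) events."""
    return [e for e in raw_events if e["severity"] <= max_severity]

def store(events):
    """Storage: serialize each event into a uniform format (JSON lines)."""
    return [json.dumps(e, sort_keys=True) for e in events]

def analyze(stored):
    """Analysis: a toy analysis that counts events per severity level."""
    counts = {}
    for line in stored:
        sev = json.loads(line)["severity"]
        counts[sev] = counts.get(sev, 0) + 1
    return counts

def dispose(stored, keep=1000):
    """Disposal: retain only the `keep` most recent entries."""
    return stored[-keep:]

raw = [{"severity": 7, "msg": "debug trace"}, {"severity": 3, "msg": "error"}]
stored = store(process(raw))
print(analyze(stored))  # prints {3: 1} — the debug event was filtered out
```

A real system would perform far richer operations in each stage, but the stage boundaries stay the same.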


2.4 Centralized log management structures

Almost every device that produces log messages, and that has not been configured for any other solution, stores the log messages it produces in its own storage pool. In a centralized log management system there are different kinds of structures used to achieve centralized log management, and each comes with its own benefits. Perhaps the most common is a single device running one instance of the chosen centralized log management system. All log messages are sent to this device, where they are processed, stored and analyzed. An example is rsyslog-enabled devices, which can be configured to send their log messages to a single rsyslog server [20]. This produces a log management structure with two distinct layers: one layer with clients and one layer with the syslog server. Management structures with a low number of layers have several benefits, such as being easy to set up and manage. However, they do not scale very well compared to structures implemented across several layers.

Figure 2: This figure displays a two-layered centralized log management system where all clients send their log messages to a central rsyslog server. Figure drawn by the author using the tools at https://www.draw.io.
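A client in the two-layered structure of figure 2 can typically be pointed at the central server with a single rsyslog forwarding rule. A minimal sketch of such a rule follows; the hostname is a placeholder and the file path is one common convention, not a requirement:

```
# /etc/rsyslog.d/forward.conf (client side)
# Forward all facilities and severities ("*.*") to the central server.
# "@@" selects TCP transport; a single "@" would use UDP instead.
*.*  @@logserver.example.com:514
```

Restarting the rsyslog service after adding the rule makes the client start forwarding its log messages to the central server.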

Nowadays it is common to find systems where the log messages are not stored as the original message, but instead processed into a unified template before they are stored. These systems are much more resource intensive [21], and it can therefore be a good idea to split them into several layers to allow for better scaling. In some cases a three-layered structure is a viable solution. In a three-layered solution an extra layer has been added between the production and storage of the log data, compared to the structure seen in figure 2. This extra layer processes the data within the log messages and in most cases changes it to a different format, for example JSON (JavaScript Object Notation). This enables the centralized log management system to read log messages with different formatting and from different sources and transform them into a uniform log format. An example of a solution that can implement this strategy is the use of Logstash [22] and Elasticsearch [23].
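The kind of transformation performed by the processing layer can be illustrated in a few lines of Python. The input line format and field names below are assumptions for demonstration only; they do not reflect Logstash's actual configuration syntax:

```python
import json
import re

# Assumed raw line format: "<timestamp> <host> <program>: <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<program>[^:]+): (?P<message>.*)$"
)

def to_json(raw_line):
    """Parse one raw log line into a uniform JSON document."""
    match = LINE_RE.match(raw_line)
    if match is None:
        return None  # a real pipeline would route unparsable lines elsewhere
    return json.dumps(match.groupdict(), sort_keys=True)

doc = to_json("2019-04-01T12:00:00Z web01 sshd: Accepted publickey for admin")
print(doc)
```

Once every source's log lines have been mapped into the same set of fields, the storage layer can index them uniformly regardless of origin.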


Multiple servers of the same type can be used for redundancy and load balancing in case of hardware failure or at peak load times. This increases scalability, since nodes can easily be added to increase the storage or processing capacity of the structure. On the other hand, such a structure is more difficult to manage and set up compared to the single log server, since the number of devices in the structure increases.

Another solution is one where a visualization tool is used to help with correlation and analysis of data. These visualization tools often come incorporated in SIEM solutions. Visualization tools retrieve the log data from the log storage servers and analyze it to present valuable visualizations of the events contained in the log data. With the help of these tools, correlation of logged events becomes much easier. It is also common that these tools serve as an administration point for the remainder of the implemented system. An example of a system which incorporates such a visualization tool is Splunk [12]. In figure 3 below a four-layered centralized log management system with a visualization tool can be seen.


Figure 3: This figure displays a four-layered centralized log management and visualization system. The clients send their log messages to the log processing servers. The log processing servers process the messages and then forward the data to the log storage servers. The visualization tool is used to extract the data from the storage servers. Figure drawn by the author using the tools at https://www.draw.io.

One of the benefits of a multilayered log management structure is that it often allows for horizontal scaling. Horizontal scaling is when more devices are added to share the load, whereas vertical scaling is when the hardware is upgraded to meet the load [24]. In most cases horizontal scaling is the most cost-effective way of scaling a system, because the cost-to-performance ratio is in most cases lower for cheaper hardware. Therefore, scaling the system horizontally with cheaper hardware is often more cost-efficient than scaling it vertically with more costly hardware.
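The cost argument can be made concrete with a toy calculation; the prices and throughput figures below are invented purely for illustration and do not describe any real hardware:

```python
def cost_per_unit(price, throughput):
    """Price paid per unit of log throughput (e.g. per event/second)."""
    return price / throughput

# Hypothetical figures: two mid-range servers vs. one high-end server
# delivering the same total throughput of 20 000 events/second.
horizontal = cost_per_unit(price=2 * 1000, throughput=2 * 10_000)  # two cheap nodes
vertical = cost_per_unit(price=3500, throughput=20_000)            # one big node

print(horizontal, vertical)  # prints 0.1 0.175
```

With these (made-up) numbers the horizontal option costs less per event/second, which is the pattern the cost-to-performance argument above relies on.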

In this thesis the focus is on finding a four-layered centralized log management system with a visualization tool, because of the benefits this structure provides for scalability and visualization. The proof of concept presented within this thesis will have a structure similar to the system shown in figure 3 above.


2.5 Logging policy

Before a log management system is implemented in any network, it is a good idea to create a logging policy which governs the choice of the system and how it is implemented [25]. A logging policy is created by the owners or administrators of a network and is used to describe the required behavior of the log management system within the network. The logging policy states which log messages contain interesting information, from which sources these log messages originate, for how long they should be stored, who is responsible for analyzing them, and so on.

The logging policy governs all actions performed on the log messages. It starts out by stating which types of devices are required to save their log messages and which log messages should be kept from each device. The logging policy also touches upon how log messages should be treated during transmission, including whether the log messages should be encrypted and hashed to protect their confidentiality and integrity. Further, a logging policy should state how log messages are stored and whether they are required to be encrypted and hashed during storage. Another important topic a logging policy should cover is for how long each log type should be stored and how the logs should be disposed of. Finally, a logging policy should state who should be able to access the log messages, how they are analyzed and how often they should be analyzed.

The stakeholder does not have an established logging policy, so as a foundation for this thesis a basic logging policy will be set up using the guidelines presented in Kent and Souppaya's paper [18] under section 4.2.

Worth noting is that this policy is not by any means supposed to be used in full-scale deployments. The policy set up within this thesis is purely for demonstration and to help stake out the requirements of the centralized log management system which will be implemented.

2.6 Related work

Within this section relevant work to this thesis will be discussed. The previous work and findings will be presented here as a foundation for this thesis to build upon.

Many devices today support the syslog protocol for transferring their log messages within networks. The original syslog protocol has several flaws: for example, it does not support encryption during transport and does not allow for reliable delivery methods. The rsyslog software was produced to solve some of these flaws and is now commonly used in networks for centralized storage of log data. In September 2009 Peter Matulis published a technical white paper named “Centralized Logging With rsyslog” [20] which focused on the functionality and implementation of the rsyslog software. Rsyslog is open-source software which is built on syslog but extends it with some key functionalities. These include reliable transport with TCP (Transmission Control Protocol), encryption with TLS (Transport Layer Security), support for the Reliable Event Logging Protocol (RELP) and disk buffering of event messages during peak load times. This paper presents a good introduction and explanation of the key features of the rsyslog software. It also shows how some of them are implemented.

Creating and maintaining a logging policy is a good idea when dealing with larger and complex networks. It allows the entity managing the network to set up a framework for how log messages should be handled and processed within the network. In the end, the goal of a logging policy is to make sure that log messages are treated in the same way on different devices throughout the network. The paper “Guide to Computer Security Log Management” [18], presented by Karen Kent and Murugiah Souppaya in September 2006, gives a good questionnaire which can be used for creating logging policies. The paper aims at spreading knowledge and understanding about the need for computer security log management. It also provides guidance on how entities can develop, implement and maintain log management systems within their networks. The paper presented by Kent and Souppaya provides a good starting point for entities looking to implement or improve their implementation of a log management system. It provides a good introduction to logging policies, why they are required and how a basic logging policy can be constructed. This logging policy can then, if necessary, be extended and used to dictate the choice of log management system.

The largest benefits of centralized log management do not seem to come from the centralized storage of the log messages. In 2005, Robert Rinnan concluded within his master thesis “Benefits of centralized log file correlation” [2] that centralized log management by itself does not achieve much more than the convenience of storing all the information in one place. Instead, the true benefits of centralized log management come from the ability to visualize and correlate log data. To be able to offer improvements in security and reduce time spent on analysis of log messages, some kind of visualization tool is required. Because of this, finding a system which provides a strong visualization tool together with a centralized log management system can help with correlation between logged events. Thereby, the time spent in this process and the response time when a serious event occurs can be reduced. In the end this will lead to increased security throughout the network, since the data provided can be more easily accessed and assessed. Therefore, a good log management system should include a visualization and analysis tool.

Comparing log management systems is a difficult task since there are many different features to investigate and requirements to meet. In the master thesis “Application Log Analysis” [26], Júlia Murínová discusses the implementation of a log management system for web-based applications.

Various systems were compared and analyzed to find the optimal system. In the end she reached the conclusion that the ELK Stack system was the best suited system for the task. This was because of the custom log-format processing capabilities of Logstash together with the rich filtering and search capabilities of Elasticsearch. Although the results reached in Júlia Murínová's thesis are not directly applicable to the area concerned by this thesis, it brings valuable information, recommendations and opinions. This information can help with the setup of the system selection process within this thesis.


3 Research methodologies and methods

In this section a brief explanation of the methods and methodologies used within this thesis will be given. The work and research process will also be described, together with the techniques used to evaluate and select the solution for a centralized log management system. Section 3.1 will explain the system development methodologies used for implementation of the proof of concept system. In section 3.2 an explanation of how the research within this thesis was done will be given. Section 3.3 describes how the data used within this thesis was collected. In section 3.4 the process of setting up the system requirements is explained. Finally, section 3.5 explains how the system selection process was done.

3.1 System development methodologies

Since the time available to evaluate, develop and present the chosen system is limited, this thesis uses a modified version of SCRUM [27]. SCRUM is an iterative and incremental framework for developing, delivering and sustaining complex products. This modified version of SCRUM was adapted to better fit a shorter sprint and development time by removing some of the overhead. This allows more time to be spent on developing the system rather than producing documents which would have many similarities to the information presented in this thesis.

This thesis will not apply the distinct roles of the “Scrum Team”, since they would in this case introduce a lot of extra and unnecessary complexity. Each requirement placed upon the system will be broken down into its basic parts, which are used as the product backlog items. Some of these items are then selected for implementation at the start of each sprint. Each sprint in this thesis is one week long, which amounts to a total of eight sprints for the project.

By the end of the last sprint, the proof of concept implementation should be done, and conclusions can be drawn.

The proof of concept implementation will be done iteratively using the methods described above. This means that in each sprint, one of the requirements will be targeted. The required functionality will be experimentally implemented to prove that the chosen centralized log management system meets the requirement. This means that for each sprint that passes the proof of concept system should meet at least one more of the requirements that are placed upon it. After the eighth sprint the proof of concept system should meet all the requirements placed upon the system.


3.2 Research phases

The foundations of research are based upon the gathering, understanding and correlation of information. The gathered information is examined and processed to reach a conclusion. This conclusion can then be used as a foundation and further developed by other researchers. This thesis will go through a series of different research phases which together will build up to the results and conclusion of the thesis. These phases can be divided into the following:

1. Thesis problem formulation
2. Set up thesis goals
3. Selection of research methods
4. Collection of information
5. Analyze gathered information
6. Apply gathered information to select a system for implementation
7. Implementation of chosen system
8. Draw conclusions from research

Together, these phases allow the work to flow through the empirical research cycle that was presented in section 1.5 and can be seen in figure 1. Phases 1, 2 and 4 are equivalent to the observation and induction phases. Phases 5 and 6 represent the phase of deduction. Phase 7 is used as the testing phase, and finally phase 8 is the evaluation phase of the empirical research cycle. The results presented at the end of the thesis should be repeatable and the conclusions well founded. In the end, this should allow future research to use and build upon the conclusions presented within this thesis.

3.3 Data collection

Data collection is the process where the information that the research is based upon is collected. This information must come from relevant and trustworthy sources to serve as valid support for claims and conclusions presented within the thesis. The most trustworthy information research can use is self-collected data gathered using valid research methods. This data can be collected during the research using, for example, interviews or surveys. For all information collected during the research, an in-depth analysis of the source must be done to verify that the information is valid. Research built upon unverified or false information will not lead to any new trustworthy results or conclusions.

This thesis does not collect any new data using, for example, interviews or surveys. Instead, it uses existing information to find a solution to the presented problem. The data collected and used within this thesis was retrieved from several different sources. To find the different sources of information presented within this thesis, “Academic and Scholar Search Engines and Sources” [28], written by Marcus P. Zillman, was used. This document lists a large number of search engines for scholarly and academic sources. Several of the search engines presented by Marcus P. Zillman were used to find the sources presented within this paper. Information about features and functionalities of the different centralized log management systems discussed within this thesis was taken directly from their respective websites. This helps ensure that the information retrieved about these systems is correct and not colored by the opinions of reviewers or other people.

3.4 Setting up system requirements

To be able to select the most appropriate software system to use for centralized log management it is important to clearly define the requirements placed upon the system. In this thesis the requirements are created with the help of the stakeholder to cope with the demands of large data networks. The requirements placed on the system will be slightly higher than necessary, to allow for good scaling of the system when future expansions of the stakeholder's network are considered.

To set up the requirements of the system, the first task within this thesis is to create a logging policy. The logging policy will provide a better understanding of the requirements of the system. This logging policy was created together with the stakeholder, to make sure it complied with their data treatment policies, by answering the questions found in the questionnaire in appendix A.

The logging policy presented in this thesis does not touch upon all questions of log management. Some questions, such as who should analyze the log messages and how often they should be analyzed, will not be answered. These questions were deemed unnecessary for this thesis but should otherwise be included when creating a logging policy.

From the created logging policy, the requirements for the system were derived. These requirements will make sure that the selected and implemented system delivers the desired functionality.

3.5 System selection process

Selecting the optimal centralized log management system for the stakeholder's network is a tough task. The selection process will be based upon the requirements and logging policy laid out within section 4 of this thesis.

Since several centralized log management systems may match the requirements, a sorting process is used to decide which system is the most suitable for the stakeholder's network. In case multiple centralized log management systems meet the requirements, these software systems must be ranked using reasonable criteria.


In this thesis the criteria used for ranking the suitable centralized log management systems will be:

• Number of threads on Stack Overflow [29].

• Cost of implementation

• Open source

• Scalability

Each centralized log management system will be given a rating of one to ten for each criterion, depending on how well it meets that criterion. The centralized log management system with the highest average rating will be the system chosen for implementation as the proof of concept model.
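The ranking step described above amounts to taking a simple average over the four criteria. A short Python sketch; the system names and scores are placeholders, not the thesis's actual ratings:

```python
# Hypothetical scores (1-10) per criterion, in the order:
# community threads, cost of implementation, open source, scalability.
ratings = {
    "System A": [8, 6, 10, 9],
    "System B": [7, 8, 5, 6],
}

def best_system(ratings):
    """Return the system with the highest average rating across all criteria."""
    return max(ratings, key=lambda name: sum(ratings[name]) / len(ratings[name]))

print(best_system(ratings))  # prints System A (average 8.25 vs 6.5)
```

A weighted average would be a natural extension if some criteria (for example scalability) were deemed more important than others.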

The number of threads on Stack Overflow was used as a ranking criterion because it provides an overview of how widespread knowledge of the system is within the community. The community knowledge can help with support for troubleshooting and setup of the system. A larger number of threads on websites such as Stack Overflow [29] shows the breadth of the community knowledge for the specific system. This makes the setup of the system and its features easier, since a great deal of frequently asked questions, solved problems and answers are available. Therefore, if a problem is encountered during the setup process of a system, a solution is more likely to be available if the system is well known within the community.

The cost of implementation is a very important criterion to consider during the selection process. The cost can vary greatly between different systems, mainly because there are many open source systems available as well as systems that are developed by companies and sold for profit. It cannot simply be assumed that open source systems always have a lower cost just because they are often free to use. Many open source systems have a very steep learning curve and can be difficult to implement and configure to meet the requirements placed upon them. This can introduce a large cost for the entity implementing the system. In some cases it might be cheaper for an entity to implement a paid system instead of using an open source system and implementing it themselves, because paid systems might offer a simpler configuration and setup process. Therefore, the cost of implementation criterion will only consider the estimated time it takes to implement the system. To handle the differences in cost between open source and paid systems, the open source criterion is introduced.

Open source is usually a good thing to look for. Open source systems tend to have better and wider support, and open source can help ensure that development is continued for a longer period of time. Another benefit is that the goal of an open source system is not financial gain but rather providing functionality to the developers and community surrounding it. Also, open source systems do not require a license to be installed and used.


Scalability might be the single most important factor when considering larger network deployments. If the system does not scale well, several problems may arise. For example, an entity might have to implement several separate systems across its network, which would leave the network without a truly centralized log management system and instead with a decentralized one. Another problem might be that the system works well when it is first deployed, but as the entity grows and an expansion of the system becomes necessary, a great number of complications could arise.


4 System requirements

In this section the requirements that are placed upon the system will be derived.

This will be done by taking into consideration the recommendations from the stakeholder on what requirements should be placed upon the system, by creating a logging policy. The requirements will also be deduced from the background and literature study of this thesis. In section 4.1 the logging policy is created together with the stakeholder using the questionnaire that can be seen in appendix A. Section 4.2 stakes out the requirements placed upon the centralized log management system.

4.1 Logging policy

A logging policy is a vital part of the planning and maintenance phases of an entity's network. It is recommended that every entity that deploys and maintains a larger network create a logging policy [25]. The logging policy is useful for declaring which information is important for the security and maintenance of the network, for example how the log data should be transported and stored to comply with the security requirements placed upon the system. The logging policy also describes things like by whom and how the log data should be analyzed.

There are several different ways to create a logging policy and there is no fixed set of rules for what it should contain. This thesis will use the questionnaire presented in appendix A, which is derived from Kent and Souppaya's paper “Guide to Computer Security Log Management” [18]. The logging policy will provide a clear view of the functionality that the system chosen for implementation is required to have. These requirements will be the foundation of the selection process for the system to be implemented in the stakeholder's network.

To create the logging policy there are several questions that need to be answered [18]. These questions can be seen in detail within appendix A of this thesis. Below follows a summary of the answers to these questions. The questions have been answered to comply with the stakeholder's network. The statements presented in the subsections below make up the logging policy created for this thesis.

4.1.1 Log generation

The questions answered from appendix A under the log generation subject are used to determine which devices are required to send their log messages to the centralized log management server. Using the statements presented below it is possible to tell which devices are required to log data within the network. These statements also tell which events are required to be logged on each device. The following devices and events should generate log data within the network:
