
Linköping University
IEI - Department of Management and Engineering
Master's in Industrial Engineering and Management
Master's thesis, 30 credits
Spring 2021

The Use of Big Data in Process Management: A Literature Study and Survey Investigation

Ekow Esson Ephraim (essep111)

Sanel Sehic (sanse523)

Supervisor: Peter Cronemyr
Examiner: Mattias Elg

LINKÖPING 2021
Linköpings universitet, 581 83 Linköping


ABSTRACT

In recent years there has been increasing interest in understanding how organizations can use big data in their process management to create value and improve their processes. This interest stems from new challenges for process management arising from increasing competition and, owing to technological advancements, the growing complexity of large data sets. Scholars describe these large data sets as big data: data so complex that traditional data analysis software cannot adequately manage or analyze them.

Because of the complexity of handling such great volumes of data, there are few practical examples of organizations that have incorporated big data in their process management. To fill the relevant gaps and contribute to advancements in this field, this thesis explores how big data can contribute to improved process management. The aim of the thesis was therefore to investigate how, why, and to what extent big data is used in process management, as well as to outline the purposes and challenges of using big data in process management.

This was accomplished through a literature review and a survey, respectively, in order to understand how big data had previously been used to create value and improve processes in organizations.

From the extensive literature review, an analysis matrix of how big data is used in process management is provided through the intersections between big data and process management dimensions. The analysis matrix showed that most of the instances in which big data was used in process management were in process analysis & improvement and process control & agility. Simply put, organizations used big data in specific activities involved in process management but not in a holistic manner.

Furthermore, the limited findings from the survey indicate that the main challenges and purposes of big data use in Swedish organizations are the complexity of handling data and making statistically better decisions, respectively.

Keywords: big data, process management, process mapping and development, process analysis and improvement, process control and agility, data strategy, data quality and governance, database management, data generation and integration.


Acknowledgment

We would like to thank our supervisor, Associate Professor Peter Cronemyr at the Department of Management and Engineering (IEI) at LiU. Peter provided guidance and support from the beginning of the thesis right up to the end, steering us in the right direction when necessary. His constant supervision and extremely helpful feedback were all the support we could have asked for, and we are extremely grateful and indebted to him for that.

We also give thanks to our examiner, Professor Mattias Elg of the IEI department at LiU for his valuable comments on this thesis.

We thank our opponents, Eduardo García Suárez and Jeeveet Ramesh, for their valuable feedback during the thesis seminar sessions, which contributed immensely to improving the quality of the report.

We would also like to thank Peter Cronemyr, Mattias Elg, and Hendry Raharjo, who served as our interview participants and helped redefine our first research question by sharing with us their expertise and knowledge on the studied topic. We also thank all those who helped us directly or indirectly during the entire span of the master's thesis.

Finally, we thank our families and loved ones for standing behind us throughout our studies at LiU, up until the very end. We could not have done it without you. Thank you!



List of abbreviations

AC/TC - Asset and Capability creation/Transformation and Competition
AI - Artificial Intelligence
AWS - Amazon Web Services
BASU - Business Analytics Service Unit
BDA - Big Data Analytics
BI - Business Intelligence
BPM - Business Process Management
BPMS - Business Process Management Systems
CPM - Customer Process Management
DBMS - Database Management System
DIKW - Data-Information-Knowledge-Wisdom
DMAIC - Define, Measure, Analyze, Improve and Control
GBAS - Global Business Analytics Service
HDFS - Hadoop Distributed Filesystem
IoT - Internet of Things
MySQL - My Structured Query Language
NoSQL - Non-structured Query Language
O&G - Oil and Gas
R&D - Research & Development
RDBMS - Relational Database Management Systems
SaaS - Software as a Service
SPC - Statistical Process Control
TQM - Total Quality Management
VRMDB - Vehicle Relationship Management Database


TABLE OF CONTENTS

ABSTRACT
Acknowledgment
List of abbreviations
1. Introduction
1.1 Problem Background
1.2 Purpose
1.3 Research questions
1.4 Delimitation
2. Methodology
2.1 Research Strategy
2.2 Research Approach
2.3 Research Design
2.4 Data Collection
2.4.1 Interviews
2.4.2 Literature review
2.4.3 Survey
2.5 Data Analysis
2.5.1 Interview
2.5.2 Literature review
2.5.3 Survey
2.6 Research Validity and Reliability
2.7 Ethical Considerations
3. Theoretical framework
3.1 Characteristics of Big Data
3.2 Small Data versus Big Data
3.3 Big Data Analytics and Management
3.4 Process Management
4. Literature review
4.1 Value creation through big data
4.2 Customer Process Management
4.4 Ambidextrous organization and agility in big data era – The role of business process management systems (BPMS)
4.5 A recipe for big data value creation
4.6 Big data analytics: transforming data to action
4.7 Literature review summary
5. Empirical findings
5.1 Demographics
5.2 Using Big Data
5.3 Not using big data
5.4 Using Big Data in process management
5.5 Survey comments
6. Analysis
6.1 Analysis of interviews
6.2 Literature review analysis
6.2.1 Big data in process mapping & development
6.2.2 Big data in process analysis & improvement
6.2.3 Big data in process control & agility
6.3 Survey analysis
7. Conclusions
7.1 Conclusions
7.2 Limitations
7.3 Future scope
References
Appendices


TABLE OF FIGURES

Figure 1. Data management process, loosely adapted from "Introduction to Big Data" (Li H., 2015)
Figure 2. Process Management 1-2-3 (Cronemyr & Danielsson, 2013)
Figure 3. Customer Process Management Framework
Figure 4. BDA-capable BPMS (Rialti, Marzi, Silic, & Ciappei, 2018)
Figure 5. AC/TC process model for big data value creation (Ylijoki & Porras, 2019)
Figure 6. Gender distribution of survey respondents
Figure 7. Age distribution of survey respondents
Figure 8. Organization size of survey respondents
Figure 9. Big data experience of survey respondents
Figure 10. Organizational use of big data
Figure 11. Purpose of using big data
Figure 12. Challenges when using big data
Figure 13. Organizational expertise with big data
Figure 14. Top management commitment to big data activities
Figure 15. Usage of big data in process management
Figure 16. Reasons why organizations don't use big data
Figure 17. Possible reasons for using big data in the future
Figure 18. Usage of big data in operational processes
Figure 19. Usage of big data to analyze and improve processes
Figure 20. Usage of big data to control processes

LIST OF TABLES

Table 1. The five V's of big data, loosely adapted from Wamba et al. (2015)
Table 2. Comparison between small and big data (Kitchin & Lauriault, 2014)
Table 3. Various projects in which the CPM framework was applied
Table 4. Literature review summary
Table 5. Job roles of survey respondents and their organizations


1. Introduction

This chapter provides a basic description of the thesis report and a brief introduction to the selected research topic. Furthermore, the purpose of the report is described along with the research questions investigated. Finally, the delimitations of the thesis are outlined.

Big data is revolutionizing entire industries and changing human behavior in terms of how we work, learn, and view society (Jin, Wah, Cheng, & Wang, 2015). Big data can be defined as large data sets of such high complexity that they are difficult, or sometimes even impossible, to process using traditional methods (Chen, et al., 2013). Although data has always been used as input for various activities in process management, such as the development, mapping, and control of processes, significant technological growth since the 1990s has made data-processing software more powerful and flexible, resulting in new ways of using data in process management (Vera-Baquero, Colomo-Palacios, & Molloy, 2013).

In the literature on process management, there have been many different views on how to properly define, utilize, and implement process management. Cronemyr and Danielsson (2013), however, propose a three-step maturity model and diagnostics tool based on previous research on process management. Its steps are process development, process improvement, and process control, in which various data concerning customer needs are utilized to achieve a high level of customer satisfaction and organizational growth (Cronemyr & Danielsson, 2013).

To achieve the outcomes expected from effective process management, big data can help managers address a wide range of business activities, from product development to customer experience, by providing an opportunity to improve performance on objective measures of financial and operational results (McAfee, Brynjolfsson, & Davenport, 2012).


1.1 Problem Background

There is a great deal of knowledge about big data and process management individually; however, there are not enough studies that bind the two themes together and investigate how big data can properly be implemented in process management. The lack of research on the relationship between process management and big data poses a problem both for researchers interested in studying this topic and for practitioners, managers, and organizations trying to incorporate big data into their organizations' processes. By recognizing how big data can be beneficial when implemented in the process management of organizations, and by providing a conceptual framework explaining how big data has previously been used, we gain academic and managerial advancements in the exploration of this topic.

Even though big data is considered essential for unlocking effective process management and developing organizations, it does not come without challenges (Chen, et al., 2013). Because big data volumes are increasing at a fast pace, organizations will find it challenging to store the data effectively. The challenge, however, is not only storage but also curation, in other words organizing the data in a way that enables meaningful analysis and use (Wamba, Akter, Edwards, Chopin, & Gnanzou, 2015).

Linked to the technical challenges, big data brings considerable expenses that organizations must cover to properly integrate it into their business activities: the costs of new hardware and new employees, and of the software development and cloud services needed for big data analysis. In the long term, companies also have to account for scalable future expansion, to ensure that the growth of big data does not get out of hand and cost them a fortune (Jin, Wah, Cheng, & Wang, 2015).

Aside from the technical difficulties associated with big data, there are also managerial challenges concerning how it can be properly implemented in process management. McAfee and Brynjolfsson (2012) highlight five management challenges as key factors in the successful implementation of big data in an organization: leadership, talent management, technology, decision-making, and company culture. These management challenges show the significant role of leaders in using big data to make data-driven decisions that can lead to better results. For this to happen, companies have to find a way of combining domain expertise with big data to achieve effective process management (McAfee, Brynjolfsson, & Davenport, 2012).


1.2 Purpose

The purpose of this report is to explore and theoretically investigate how, why, and to what extent big data is used in various process management activities, viewing the topic as part of a larger societal development. Furthermore, the purposes and challenges of using big data in process management are at the core of the analysis. The recipients of this report are researchers in the fields of process management and quality engineering who may seek to validate our findings.

Hence, the study aims to draw generalized conclusions about how, why, and to what extent big data is used in process management, as well as to outline the purposes and challenges of using big data in process management.

1.3 Research questions

To fulfill the purpose of this thesis, the following research questions will be explored:

RQ1: How is Big Data used in Process Management according to literature?

RQ2: What are the purposes and challenges of using Big Data in Process Management according to a survey of Swedish organizations?

1.4 Delimitation

The delimitations of the project are outlined below:

• This thesis was limited to investigating the use of big data in only the following three process management activities: development, mapping, and control of processes.
• The survey was limited to investigating only the purposes and challenges of using big data in Swedish organizations.


2. Methodology

In this chapter, the methodology applied in this thesis is presented and explained in greater detail. Initially, the research strategy is explained; afterwards, the research approach is outlined, followed by a thorough explanation of the adopted research design. Furthermore, the data collection methods are described in depth and the data analysis methods are explained. Lastly, the reliability and validity of the research are discussed.

2.1 Research Strategy

The research strategy describes how data is to be collected. For this thesis, a literature review and a survey strategy were chosen. Rocco and Plakhotnik (2009) discuss the functions of the literature review in exploratory research: to build a foundation, demonstrate how the study advances knowledge, conceptualize the study, assess research design, and provide a reference point for the interpretation of findings.

Research question 1 reflected the descriptive and exploratory nature of the research problem and thus focused on understanding, defining, and refining the theory concerning the studied phenomenon. The literature review was therefore the strategy adopted for answering this question. The theoretical framework for the literature review was centered on relevant big data and process management theory and bridged a gap between the two fields of study. In addition to the literature review, informal semi-structured interviews were conducted with researchers in the Quality Engineering department of Linköping University with the purpose of further refining the research question.

For the second research question, a survey was the chosen method. Malhotra and Grover (1998) describe survey research as having three characteristics: the collection of information in a structured format, standardized information to describe variables, and the gathering of information from a sample population. According to Malhotra and Grover (1998), there are two types of survey research, exploratory and explanatory. Exploratory research aims to investigate and increase understanding of a topic, while explanatory research aims to find causal relationships among variables. Since the core of our thesis revolves around describing and investigating the research topic based mainly on a literature study, our report can be classified as exploratory. Exploratory survey research leads to results that can be refined to identify new possibilities and areas of interest (Malhotra & Grover, 1998). As such, the aim of using exploratory surveys is to search for patterns among variables and thereby build theory.


2.2 Research Approach

In conducting research, two different approaches can be employed: the inductive theory approach and the deductive theory approach. The deductive approach involves deducing a hypothesis (or hypotheses) based on what is known about a particular domain, which is then subjected to empirical scrutiny (Bryman & Bell, 2018). In other words, the deductive approach starts from the theory and ends at testing or revising the theory. The inductive approach moves in the opposite direction: researchers infer the implications of their research for the theory that initiated the research (Bryman & Bell, 2018). For this thesis, an inductive approach was deemed suitable since the nature of the project was explorative and required further investigation into the subject area. We used this approach by starting from our findings and moving towards the exploration of the theory. The inductive approach is more suitable here because it enables our research findings to be fed back into the stock of theory and to be associated with a certain domain of inquiry (Bryman & Bell, 2018). Data analysis was also inductive, since it enabled us to build patterns, categories, and themes by organizing data into useful units of information (Creswell, 2009). This also meant that the deductive approach, in which researchers begin by testing existing hypotheses, was not suited to this explorative study, since there was no hypothesis to test in the first place.


2.3 Research Design

This report comprises a literature review and survey research carried out mainly in a qualitative manner, by studying existing literature, surveying Swedish organizations, and interviewing researchers in the quality management and logistics department at Linköping University.

The chosen research design for answering the first research question was a literature review, for a few compelling reasons. Firstly, descriptive and exploratory research designs using these investigative methods enabled us to gain a large amount of data from the representative literature for a detailed analysis (Rocco & Plakhotnik, 2009). Secondly, the theoretical review enabled us to provide an overview and synthesis of the applicable sources explored for the study. Its purpose was to review, critique, and synthesize existing theories on how big data is used in process management in order to generate new frameworks and/or perspectives on the topic (Rocco & Plakhotnik, 2009). In doing this we aimed to establish what practices already exist with regard to big data in all types of process management used in organizations globally. Lastly, this method served as a necessary precursor to answering the second research question, since it provided a reference point for the interpretation of the findings (Rocco & Plakhotnik, 2009).

The research design applied to the second research question was cross-sectional survey research based on Swedish organizations. Cross-sectional survey research is a kind of observational research that analyzes data on variables gathered at a particular point in time across a sample population (Malhotra & Grover, 1998). Even though this type of research does not include experimentation, it is often used to understand outcomes in the physical and social sciences and in many business industries (Malhotra & Grover, 1998). A cross-sectional survey design was suitable for answering the second question since it enabled us to analyze differences between Swedish organizations regarding their purposes for using big data and their specific challenges with its usage. The goal of this research question was to generate results that researchers can further investigate to validate or refute, meaning that we mainly focused on defining the researchable problem and theorizing about it. Hence, our findings for this question will provide researchers with an opportunity to test and evaluate solutions in reality as well as to design a concept model.


2.4 Data Collection

The process of data collection was mapped out after identifying the aim of the study and what was intended to be achieved by it. The problem statement in this study concerned the theoretical issue of how big data is used in process management, and as such mostly required qualitative data expressed in words and analyzed through interpretation and categorization, as opposed to quantitative data measured as numerical values. Even though our analysis was carried out qualitatively, we used a mixed-method approach to collecting data based on our research questions. The first research question aimed to assess how big data is used in process management across industries according to the literature, while the second aimed to gather meaningful feedback on the purposes and challenges of big data use in process management. Therefore, a mixed-method approach was adopted, using a survey and a literature review to collect quantitative and qualitative data, respectively.

For this purpose, it was necessary to define our primary and secondary data. Primary data can be defined as data obtained directly by the researchers during their research, while secondary data has been recorded and documented by other researchers. Primary data for this study was gathered through the survey and interviews, while secondary data was collected from articles, books, and other sources gathered through the literature review.

The order in which we collected data was to first conduct interviews with researchers to redefine the first research question, followed by a literature review to answer that question. Finally, we conducted the survey to answer the second research question and to compare the survey results with the findings from the literature review.

2.4.1 Interviews

In research methodology, interviews are conducted by researchers to gain insights into, and understand, the opinions, attitudes, experiences, processes, behaviors, or predictions of participants (Rowley, 2012). Rowley (2012) proposes important questions to consider when designing and planning an interview for research purposes; these include:

• Why should I choose interviews for my research?
• Which type of interview is best?
• How do I decide the questions to ask?
• How long should the interviews be? And how many interviews do I need to conduct?
• How do I select and enlist potential interviewees?

Interviews were chosen to redefine the research topic and to gain an in-depth understanding of important perceptions and opinions from experienced practitioners regarding the studied area. We conducted three semi-structured interviews with researchers experienced in using either limited or big data. Semi-structured interviews were suitable since they mainly followed a pre-set questionnaire, see Appendix A, but also allowed for slightly modified follow-up questions depending on the interviewee and their experience. The questions were formulated to enable the interviewees to properly express their opinions and knowledge about big data. Since we only had eight questions, the interviews lasted approximately half an hour each and were conducted individually so that each researcher could freely express their opinions and experience without being influenced by others. Lastly, since the purpose of the interviews was to redefine the research questions, we did not need more than a few interviewees. The researchers we approached were suggested to us by our supervisor Peter Cronemyr. Many of the researchers we contacted felt unqualified to participate because of a lack of knowledge about big data; however, we still managed to find three participants. The first interview was conducted with Peter Cronemyr, an associate professor at Linköping University currently engaged in research concerning big data in process management. Coincidentally, he was the supervisor for this thesis and as such has an interest in further researching the topic depending on its findings. The second interview was conducted with Hendry Raharjo, an associate professor in quality management at Chalmers University. Prof. Raharjo had no experience working with big data but had experience working with limited data in quality management with the Swedish Institute for Quality, developing processes for the healthcare sector. The third and final interviewee was Mattias Elg, the examiner of this thesis and a professor at Linköping University. Prof. Elg had experience working with data in large volumes in industrial processes, such as in healthcare.

2.4.2 Literature review

To answer the first research question, we collected literature to develop a conceptual framework from earlier research and practice. The purpose was, aside from comparing previous findings and identifying potential areas for research, to identify knowledge gaps and establish the need for further research.

Snyder (2019) describes three broad types of review strategies for answering a particular research question: the systematic review, the semi-systematic review, and the integrative review. All three strategies are appropriate for different kinds of research, depending on the methodology needed to achieve the purpose of the review. It can be quite difficult to choose which approach is best suited for the review, and careful deliberation must be made based on the research question and the specific purpose of conducting the review. For this research it is appropriate to use an integrative review, since the aim is to contribute to the advancement of knowledge and theoretical frameworks, potentially leading to the generation of new conceptual frameworks or theories (Snyder, 2019) concerning how big data has been used in process management. Systematic reviews are intended to identify all empirical evidence that meets pre-specified criteria and as such do not fit this research, since the research question covers far too broad a scope concerning the use of big data in process management. Semi-systematic reviews are designed for topics that have been conceptualized differently by researchers from diverse backgrounds and as such require a meta-narrative approach to synthesize findings (Wong, Greenhalgh, Westhorp, Buckingham, & Pawson, 2013).

The process of conducting a literature review can follow different approaches; Snyder (2019), however, developed a synthesis of the various standards and guidelines suggested for literature reviews. This process involves four phases: designing the review, conducting the review, analysis, and writing the review (Snyder, 2019).

After the review approach has been decided, researchers must consider a search strategy for identifying relevant literature, which involves selecting search terms and appropriate databases and deciding on inclusion and exclusion criteria (Snyder, 2019). The search terms were mainly the phrases "Process Management" and "Big Data", since these two phrases capture the concepts directly related to the research question. Due to the broad nature of the research question, limitations were put in place so as not to select articles and journals that do not cover either of these two key concepts. Closely related areas, such as quality management, were also considered since they are connected to the concepts studied in this research. Credible databases were chosen from the LiU collection of databases, mainly Emerald and the search engine UniSearch, with additional articles and journals chosen from Google Scholar as long as they came from credible, peer-reviewed academic journals. Inclusion criteria for the review comprised all conceptual articles found when searching for the two key phrases mentioned above, written in English, and published in 2010 or later. Exclusion criteria comprised articles in which the key concepts of the research were not closely related to the studied phenomenon. Conducting the review required two reviewers to select the articles considered suitable, to ensure the quality and reliability of the search process. Research methodology offers multiple ways of conducting the review, including reading each article or journal in full, focusing on the research method and findings, or reading abstracts first to decide whether the literature should be selected for the review (Snyder, 2019). For this literature review, article selection was based on reading the abstract and research findings of each article before proceeding to read the article in full.

2.4.3 Survey

To understand the general opinions and experiences of people who work with big data in process management and other fields, as well as of people who lack experience working with big data, we designed a survey aimed at answering the second research question, see Appendix C. According to Creswell (2009), survey methodology entails distributing a list of questions to a sample population in the hope of understanding similarities and differences regarding challenges and purposes based on their past experiences. Furthermore, Creswell (2009) explains that surveys provide quantitative or numeric descriptions of trends, attitudes, or opinions of a population.

In this thesis a cross-sectional survey was used, meaning that the data was collected from many different individuals at a single point in time and the studied variables were observed without influencing them (Creswell, 2009). Finding one specific company or group of people qualified to answer the survey proved to be a challenge; therefore, the survey had to be made open to the general populace in the hope of getting an adequate number of responses. The survey was distributed by Peter Cronemyr through his personal and professional channels such as LinkedIn, Facebook, and LiU contacts. In the end, only nine responses were recorded in total, but the survey still proved beneficial since we received responses from managers, researchers, and Ph.D. students relevant to the second research question. This means that the findings from the survey are limited; generalizable conclusions cannot be drawn from it, but it can point to areas where more research should be conducted.

The survey was split into three different parts depending on the experience of the respondents, see Appendix C. Since the respondents were anonymous, the survey began with a demographic part answered by all respondents, giving us stratification variables describing how our sample population was divided. Then, respondents with experience of working with big data got the questions under "using big data", while those with no such experience got the questions under "not using big data". Respondents who had not only worked with big data but had also used it in process management got the questions under "big data in process management".

2.5 Data Analysis

Data analysis entails analyzing the findings to make sense of them. In our project, the findings consisted of the data collected through the interviews, the literature study, and the survey.

2.5.1 Interview

In research methodology, interviews are analyzed in an iterative manner described by Rowley (2012) as four key components:

• Organizing the data set,
• Getting acquainted with the data,
• Classifying, coding, and interpreting the data, and
• Presenting and writing up the data.

In organizing the data set, transcripts of the interviews were recorded in MS Word so that all text relating to the answers to specific questions could easily be found. To get acquainted with the data, we read through the transcripts thoroughly to easily identify key themes or anything of interest. Additionally, the interviews we had permission to record were recorded and watched again to make sure nothing was misinterpreted. Thirdly, while we did not formally classify and code the data, we did interpret the answers through discussion and by looking for common patterns or themes across the different interviews, since a recurring theme indicated not just the need of one specific researcher but rather a collective issue that needed to be explored further. Lastly, according to Rowley (2012), takeaways from the interviews should be presented in the report in a manner that reflects how the analysis was conducted. As such, the understandings from the interviews are presented in the analysis chapter.

2.5.2 Literature review

The critical analysis of the literature was conducted by following the research methodology described by Snyder (2019) and Torraco (2016) for integrative reviews. After the review has been conducted, researchers proceed to the third phase of the review process by considering how the articles should be used to perform an appropriate analysis. The analysis should involve a standardized approach to abstracting appropriate information from the final sample of selected articles. This abstracted data can take the form of descriptive information, including the authors, year of publication, topic, and type of study, or the form of research findings and effects (Snyder, 2019). Abstracted data can also be conceptualizations of a specific idea or theoretical perspective. This phase essentially involves four guidelines on how to conduct the data abstraction and analysis (Snyder, 2019):

• Ensuring the data abstracted from the article follows the overall purpose of the review,
• Describing accurately the process for abstracting data,
• Taking proper measures to ensure the quality of data abstraction, and
• Choosing a data analysis technique appropriate for the research question and the abstracted data.

Since the research question involves investigating how big data is used in process management, the abstracted data should consist of research findings covering topics connected to the concepts of big data and process management. Relevant data also included conceptualizations and frameworks of different methods related to the use of big data in business processes. The process of abstracting the data involved using a table to track which keywords and databases did or did not lead to relevant literature (Torraco, 2016). It was also necessary to discuss how the main ideas and themes from the literature were identified and analyzed, to ensure that the research methods for data abstraction and analysis were of good quality and transparent to the reader.

For an integrative review, appropriate techniques for the critical analysis should involve ways of describing strengths and weaknesses, identifying contradictions and inconsistencies, and highlighting areas of the studied phenomenon that may be poorly represented in the literature (Torraco, 2016). This was done using an analysis matrix to visually represent the main ideas and the conceptual relationships between big data and process management. In theoretical work, it is important to use logical and conceptual reasoning to conduct a synthesis derived from the analysis, in which researchers integrate existing concepts and ideas to create new ways of thinking about the topic. According to research methodology, this should take the form of a classification scheme of constructs, propositions for further research, a reconceptualization of the topic, a meta-analysis, or a metatheory (Torraco, 2005).

2.5.3 Survey

The analysis of the survey followed the six-step approach proposed by Creswell (2009), which covers the complete data analysis procedure for surveys. The steps are:

• Reporting information about the number of respondents who did or did not return the survey.
• Ensuring that response bias is determined in the survey.
• Providing a descriptive analysis for all variables in the study.
• Deciding whether the survey contains an instrument with scales and identifying the statistical procedure, i.e., factor analysis.
• Identifying the statistics and the statistical computer program for testing the major inferential research questions or hypotheses in the study.
• Presenting the results in tables or figures and interpreting the results from the statistical test.

For the first step, since our survey was open to the public and did not have a selected sample group, it is impossible to report the number of respondents who did or did not return the survey, or to determine whether our response rate is good. Creswell (2009) suggests that response bias can be determined using wave analysis, i.e., examining returns on select items weekly to determine whether the average responses change. This was also not possible, since our survey was open and did not have a defined sample population. The third step was accomplished by using bar charts and tables to represent the descriptive statistics of all variables in the study. Since the questions were multiple choice with the option of selecting more than one answer, the only viable method was to use bar charts, without means, standard deviations, or range scores for the variables. For the fourth step, no instrument with scales was included, since this did not apply to the type of questions in the survey. In the fifth step, the main statistical program used was Microsoft Excel. Finally, the interpretations of the results are presented in the analysis chapter, see Chapter 6.
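As an illustration of the third step, the descriptive analysis could also be reproduced outside of Excel. The following Python sketch assumes a hypothetical CSV export of the responses in which multiple-choice answers are stored as semicolon-separated strings; the file name and the "challenges" column are invented for the example.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Hypothetical export of the survey responses
    responses = pd.read_csv("survey_responses.csv")

    # Multiple-choice answers are assumed to be stored as semicolon-separated
    # strings, so one respondent can contribute several options to a question
    counts = (responses["challenges"]
              .str.split(";")
              .explode()
              .str.strip()
              .value_counts())

    # Descriptive statistics as a bar chart, mirroring the Excel analysis
    counts.plot(kind="bar")
    plt.ylabel("Number of respondents")
    plt.tight_layout()
    plt.show()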


2.6 Research Validity and Reliability

In determining how credible research is, its quality is measured by its degree of validity and reliability (Merriam, 1994). Validity is further divided into internal, external, and construct validity. Internal validity measures whether the research measures what it is intended to measure, i.e., the procedures of the study and how well they are performed. External validity refers to the extent to which the results of the research can be generalized to various contexts. Construct validity is the extent to which results obtained from a study are generalizable in relation to the theoretical framework of the report. Lastly, reliability is a measure of the consistency of the research, in other words, whether it is possible to reproduce the same results under the same conditions.

Internal validity was achieved by posing various questions on the same topic to ensure that there were no contradictions in the answers gained from the survey and interviews. Conducting a thorough literature study enabled us to understand the underlying phenomena of the research topic and ensured that we measured what was intended to be measured. To ensure high external validity, we studied various organizations in different fields and their challenges with using big data, which increased the possibility of obtaining generalizable findings. The questionnaire items were developed carefully based on relevant existing knowledge concerning big data and process management; by framing questions that lead to applicable and useful answers for our research, we aimed to achieve a high level of construct validity. Finally, reliability was ensured by using consistent research methods to obtain findings that were also consistent across data sources. Also, since the thesis was carried out in pairs, all views and agreements on coding, interpretation, and findings were consistent between both parties. Lastly, peer reviews during different stages of the thesis kept reliability at an optimal level thanks to the feedback gained from the peer review sessions.

Every researcher brings a certain degree of bias into their study; however, according to Creswell (2009), performing a self-reflection is necessary for ensuring objectivity, because these biases can influence the way researchers analyze and interpret their data and experiences. In this study, the research was conducted by two researchers, so the influence of individual biases, values, and personal backgrounds decreased, since the findings were obtained, viewed, and analyzed objectively in pairs. Furthermore, a supervisor and a peer group were involved in the work process, which ensured that bias was limited in the study.


2.7 Ethical Considerations

It is important to consider the ethical aspects of the research conducted, especially in projects involving interaction with businesses or members of society who act as participants or respondents. Even if researchers do not intend to harm participants in any way, there is the possibility that interactions with them might cause psychological, financial, or social harm (Polonsky, 2005).

According to Creswell (2014), researchers need to consider the impacts of the research problem with regard to who stands to benefit from it. For a research problem to be ethical, it should be beneficial to others besides the researcher conducting the study. In our research, we investigate how big data is used in process management for the benefit of researchers, organizations, practitioners, and managers. Researchers interested in the advancement of knowledge concerning the studied phenomenon will find it beneficial when conducting future research, while organizations, practitioners, and managers will be able to understand how big data can be used in process management and apply it in their daily operational processes. To avoid deception, it is also important to be clear to the participants about the purpose and intended outcome of the research (Creswell, 2009). For instance, when the survey was distributed, it included a descriptive text explaining who we are, the purpose of the study, and how the answers would be used. Additionally, when conducting the interviews, participants were asked for permission to record the interviews for our research. Allowing participants to keep their anonymity is important; therefore, the names of survey participants are not included in this research. For the interviews, however, permission was asked to include the identities of the participants when writing the report.

Writing the report also requires that the language used is not biased against anyone regardless of gender, sexual orientation, racial or ethnic group, age, etc. (Creswell, 2009). Additionally, findings are not to be falsified to meet the researchers' needs. This was fulfilled by transparently including all data collected from the interviews, survey, and review throughout the report, in the empirical findings and appendices.


3. Theoretical framework

This chapter presents the theoretical basis on which the report is based.

Initially, a definition of the term big data is provided and the characteristics of big data are presented, followed by a comparison between small and big data. Afterward, insight is provided into how big data analytics is performed, together with its underlying technologies. Finally, process management is presented as a whole by describing its key activities in detail.

3.1 Characteristics of Big Data

When defining big data, it is hard to find one common definition; therefore, scholars and practitioners have developed the notion of the 'V' to define big data. According to Wamba et al. (2015), one might define big data with either 3, 4, or 5 Vs, the five Vs being Volume, Variety, Velocity, Veracity, and Value. All definitions, however, include the following three Vs, which describe the core characteristics of big data: Volume, Velocity, and Variety.

Volume refers to data that either takes up a large amount of space or is made up of a large number of records. Variety means that the data is generated from different sources and formats and accommodates multidimensional data fields. Velocity is the speed/frequency at which data is generated and delivered. The veracity of big data highlights the inherent unpredictability of the data, which requires thorough analysis to reduce data replication problems. Finally, the value of big data can be measured by how much the data generates economically and by the benefits gained through its extraction and transformation. Based on these five Vs, Wamba et al. (2015) propose a concrete definition of big data as a "holistic approach to manage, process and analyze 5 Vs (i.e., volume, variety, velocity, veracity, and value) to create actionable insights for sustained value delivery, measuring performance and establishing competitive advantages."


Table 1. The five V's of big data, loosely adapted from Wamba et al. (2015)

Volume: The size of data that either takes up a large amount of space or is made up of a large number of records. Example: Facebook's Hive data warehouse held 300 petabytes (PB) in total, with a daily intake of 600 terabytes (TB), in April 2014 (Vagata & Wilfong, 2014).

Variety: The data is generated from different sources and formats and accommodates multidimensional data fields. Example: big data sources and formats include text, audio, video, PDF, social media, and more.

Velocity: The speed/frequency at which data is generated and delivered. Example: businesses like Amazon are now able to track and update customer behavior in near real time (Davenport, 2006).

Veracity: The unpredictability of the data, leading to a high requirement of ensuring data is not replicated. Example: companies like eBay Inc. face serious data replication problems, with the need to filter out similar versions of the same data daily.

Value: How much the data generates economically, and the benefits gained through extraction and transformation. Example: Match.com increased its revenue by 50% from 2009 to 2011, mostly through big data analytics.

3.2 Small Data versus Big Data

According to Kitchin and Lauriault (2014), data were hardly considered in terms of "small" or "big" before 2008; instead, all data, regardless of size, were considered what we today refer to as small data. Small data is normally characterized by its limited volume, bounded variety, and non-continuous collection, and by being tailored to answer specific questions. Some small data sets can still be very large, however, and therefore for data to be considered big it has to have multiple characteristics from the five Vs. Big data has emerged due to the advancement of technologies, infrastructure, software, processes, and techniques that enable data to be used in the day-to-day activities of businesses and society as a whole. These enabling factors include the Internet of Things (IoT), the creation of new databases such as non-structured query language (NoSQL) databases, social media, and new kinds of data analytics designed to handle large data sets (Kitchin & Lauriault, 2014). In contrast, traditional small data analytics only requires the extraction, gathering, and analysis of data to form a decision model. For a more detailed comparison between small and big data, see Table 2.

Table 2. Comparison between small and big data (Kitchin & Lauriault, 2014)

Characteristic | Small data | Big data
Volume | Limited to large | Very large
Exhaustivity | Samples | Entire populations
Resolution and indexicality | Coarse and weak to tight and strong | Tight and strong
Relationality | Weak to strong | Strong
Velocity | Slow | Fast
Variety | Limited to wide | Wide
Flexible and scalable | Low to middling | High

3.3 Big data Analytics and Management

Organizations have always relied on some form of data analytics to run and improve their operations through newly acquired knowledge and insights, and big data analytics (BDA) ultimately has the same business goal as traditional data analytics. The real difference lies in the approach. BDA has been described as "proactively learning and understanding the customers, their needs, behaviors, experience, and trends in near real-time and 24x7" (Li H., 2015). Something as powerful and useful as BDA is naturally applied across many different areas and industries. Examples include marketing, where software-as-a-service (SaaS) companies like Salesforce use cloud technology to improve customer relations, and aviation, where sophisticated data analysis, machine learning algorithms, and pattern matching are used to improve airline ETAs.

Before these vast amounts of data can be analyzed so that businesses can gain a competitive advantage, adequate measures must be in place to ensure that businesses use the supporting technologies and management techniques needed to fully exploit modern-day BDA. These key factors include data management, cloud computing, IoT, machine learning and artificial intelligence (AI), and predictive analysis.

To bring out the full potential of big data, it must be managed properly. This involves not only the management of the database; rather, it is a "systematic process of capturing, delivering, operating, protecting, enhancing, and disposing of the data cost-effectively" (Li H., 2015). Effective data management is a vital part of making use of modern computer technology for business managers, corporate executives, and others, bringing about the best of data analysis and supporting operational decision-making and strategic planning (Stedman & Vaughn, 2019). Data management practices include a combination of different functions collectively working to ensure that corporate systems provide high accuracy, high availability, and high accessibility of data.

The key parts of the data management process, i.e., an organization's data strategy, begin with ensuring that a data architecture is designed and deployed with the database systems, data warehouses, and other stores which house all data belonging to the organization. Primarily, databases are managed by a database management system (DBMS), software that acts as an interface between the databases and the database administrators, end users, and interconnected applications which have access to them (Stedman & Vaughn, 2019). DBMS technology comprises relational database management systems, which work with more structured data, and NoSQL databases, which are used for storing unstructured and semi-structured data (Li H., 2015; Stedman & Vaughn, 2019). Secondly, for stored data to be organized and retrieved, there have to be various techniques for handling databases or schemas. The three leading database or schema systems today are MySQL (My Structured Query Language), SQL (Structured Query Language), and Oracle Database (Database Guide, 2016). All of these databases require database management methods based on the type of data model they run on, namely the hierarchical, network, entity-relationship, or relational model (Gupta, 2020); various other categories of data models exist depending on the case scenario. These mathematical data models are created to help with the analysis and visualization of data so that databases can be easily organized to meet business needs, and a wide variety of such models make use of different technologies, such as machine learning algorithms which make use of Apriori for building models (Li H., 2015).
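To make the relational side of this concrete, the sketch below uses Python's standard-library sqlite3 module as a stand-in DBMS; the orders table and its columns are invented for illustration, and a production system would of course use a server-based RDBMS such as MySQL or Oracle Database.

    import sqlite3

    # In-memory relational database; the schema below is invented for illustration
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL,
            amount   REAL
        )
    """)

    # The DBMS mediates all reads and writes against the structured store
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("ACME", 129.50))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("ACME", 70.50))

    for row in conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer"):
        print(row)  # ('ACME', 200.0)

    conn.close()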

The third key aspect is to generate, process, and store data in database systems, cloud storage services, and other data repositories (Stedman & Vaughn, 2019). Concurrently, as data is generated it must be integrated from different operational and analytical sources into data warehouses and data lakes. Data warehouses are considered the more traditional method of handling structured data, while data lakes are a more advanced option that stores pools of big data for predictive analysis and modeling, machine learning, and other advanced analytics (Stedman & Vaughn, 2019). Data lakes are mostly built on a Hadoop cluster, an ecosystem of a wide variety of platforms combined in the data lake environment. Apache Hadoop, for instance, is designed to scale up from single servers to thousands of machines, each offering local computation and storage (The Apache Software Foundation, 2021). This level of connectivity also highlights the importance of the Internet of Things (IoT) and cloud computing in getting the most out of big data applications, since IoT, big data, and cloud computing are considered, respectively, the source of the data, the analytics platform, and the location providing storage, scale, and speed of access (McKenna, 2021).
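As a sketch of what such integration might look like in practice, the following PySpark snippet reads semi-structured event logs from a hypothetical HDFS data lake path, derives a simple summary, and writes the curated result back to the lake; all paths and column names here are assumptions made for illustration.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("data-lake-sketch").getOrCreate()

    # Ingest semi-structured JSON event logs from the raw zone of the data lake
    events = spark.read.json("hdfs:///datalake/raw/process_events/")

    # Curate: count events per process as a simple integrated summary
    summary = events.groupBy("process_id").count()

    # Write the curated result to the analytics zone in a columnar format
    summary.write.mode("overwrite").parquet("hdfs:///datalake/curated/event_counts/")

    spark.stop()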

Lastly, it is important to have data quality and data governance practices in place to ensure the quality of an organization's data (Li H., 2015). Data quality has been described by scholars as covering properties of information such as accuracy, timeliness, completeness, consistency, relevance, and fitness for use (Janssen, Voort, & Wahyudi, 2017). Techniques for ensuring data quality include data profiling to identify outlier values, data cleansing to fix errors by modifying or deleting data, and data validation, which checks gathered data against pre-set quality rules (Stedman & Vaughn, 2019). To achieve a desirable level of data quality and governance, Janssen et al. (2017) suggest that effective contractual and relational governance mechanisms for managing the big data chain must be put in place, for instance through a big data analytics department working in close collaboration with the data quality department.
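To illustrate the three techniques just mentioned, here is a minimal pandas sketch; the file name, column names, and the 3-sigma profiling rule are all assumptions standing in for whatever rules an organization actually defines.

    import pandas as pd

    df = pd.read_csv("measurements.csv")  # hypothetical process measurement export

    # Profiling: flag outliers more than three standard deviations from the mean
    z = (df["cycle_time"] - df["cycle_time"].mean()) / df["cycle_time"].std()
    outliers = df[z.abs() > 3]

    # Cleansing: drop exact duplicates and rows missing mandatory fields
    clean = df.drop_duplicates().dropna(subset=["process_id", "cycle_time"])

    # Validation: check the cleansed data against a pre-set quality rule
    assert (clean["cycle_time"] > 0).all(), "cycle times must be positive"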

Figure 1. Data management process, loosely adapted from "Introduction to Big Data" (Li H., 2015)

An example architecture provided by Vera-Baquero et al. (2013) is designed to integrate big data analytics with business process management (BPM) in a distributed environment to effectively analyze the execution outcomes of business processes. This big-data-driven approach to process management is based on a framework which involves:

• A data strategy involving a cloud-based infrastructure that allows each organizational unit of a company to handle its local business analytics service unit (BASU). The cloud-based infrastructure is based on Apache Hadoop (The Apache Software Foundation, 2021), which is designed to provide high performance and scalability for organizations in need of big data analytical services.


• Database management is run through their local event repository, HBase (The Apache Software Foundation, 2021), which is an open-source distributed database system developed as part of Apache Hadoop. It is a NoSQL, versioned, column-oriented data storage system that provides random real-time read/write access to big data tables and runs on top of HDFS (Hadoop Distributed Filesystem); a brief sketch of interacting with such an event repository follows below.

• Data Generation and Integration are provided by an additional component of the Apache Hadoop project known as Hive (The Apache Software Foundation, 2014), which is an open-source data warehouse system. This warehouse system provides data summarization, queries, and analysis of large data sets.

• Lastly, data quality and governance in this framework are assured through the global business analytics service (GBAS), which allows organizations to integrate a diverse set of heterogeneous systems in identifying the sequence flow of the processes that run through their systems (Vera-Baquero, Colomo-Palacios, & Molloy, 2013).

Overall, such an architecture provides a business the opportunity to implement a big-data-driven process management system with opportunities for business analytics, real-time business intelligence (BI), business activity/process monitoring, collaborative analytics, and simulation engines (Vera-Baquero, Colomo-Palacios, & Molloy, 2013).
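
As a brief illustration of the event-repository idea in this architecture, the sketch below writes and scans process events in HBase through the happybase Python client, which connects via HBase's Thrift gateway. The host name, table, row keys, and columns are assumptions made for the example, not details of Vera-Baquero et al.'s implementation.

# Hedged sketch of a local event repository on HBase using happybase.
# Host, table, row keys, and column names are hypothetical examples.
import happybase

connection = happybase.Connection("hbase-gateway.example.com")
events = connection.table("process_events")

# Store one process execution event; HBase rows are keyed byte strings and
# columns are grouped into column families (here the family "data").
events.put(b"order-4711#2021-04-01T10:15:00", {
    b"data:process_step": b"credit_check",
    b"data:duration_ms": b"8432",
})

# Scan all events of one process instance to reconstruct its sequence flow.
for key, row in events.scan(row_prefix=b"order-4711"):
    print(key, row)

connection.close()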

3.4 Process Management

Process management is a key aspect of quality management, but to speak about process management, one must first understand and define what a process is. In the relevant literature, a process has been defined by different scholars in various ways. The definition chosen for this research is that given by Bergman and Klefsjö (2010, p. 456): "a process is a network of activities that are repeated in time, whose objective is to create value to external or internal customers." From this definition, it can be noticed that the key aspect is to create value that leads to customer satisfaction, which falls under the cornerstone model of total quality management. However, processes are not an isolated cornerstone but are linked to all cornerstones with the sole aim of satisfying customers. To improve processes and create better results for customers and the whole organization, a process-oriented management method such as process management is essential. According to Palmberg (2009), process management can be defined as "a structured systematic approach to analyze and continually improve the process". Palmberg (2009) states multiple purposes of adopting process management, such as:

• To remove barriers between functional groups and strengthen organizational bonds.
• To control and improve the processes of an organization.

• To improve the quality of products and services.

• To identify opportunities for outsourcing and the use of technology for organizational support.


• To align the business process with strategic objectives and customer needs.
• To improve organizational effectiveness and improve business performance.

Process management has a wide scope of activities involved in its implementation; however, Cronemyr & Danielsson (2013) propose a three-step model for process management involving three main activities, see Figure 2. These include process mapping & development, process analysis & improvement, and process control & agility.

Figure 2. Process Management 1-2-3. (Cronemyr & Danielsson, 2013)

Process mapping & development is considered the first step in process management, where the process owners are established within top management and process teams are appointed with their respective process team leaders. In this early stage of process management, the process is mapped out and developed in preparation for its implementation. The process owner is responsible for making strategic decisions concerning the process and overall process improvement, while the process team leader is in charge of the daily operations of the process, working with managers, team members, and other employees to ensure that the process is mapped out and developed properly. This stage requires everyone involved in the process to have a similar understanding of it, which makes process mapping a good way of achieving that goal. When processes are mapped out properly, the process team can identify value-adding activities as well as non-value-adding activities that can be removed from the process. This increases the process orientation of the organization by getting everyone involved and identifying what is optimal for a process.

For the second stage of process management, i.e., process analysis & improvement, it is necessary to understand the three categorizations of activities within a process: value-adding activities, which create value for the customer; necessary non-value-adding activities, which do not create value in themselves but support value-adding work; and, lastly, work that neither adds value nor aids value-adding work, which is considered waste. In quality management, there are various tools and methods used to analyze processes in order to improve upon them. The Six Sigma methodology, for instance, incorporates most of these methods in a data-driven, simple, and logical problem-solving approach (Brook, 2020). The structured approach of Six Sigma has five phases: Define, Measure, Analyze, Improve, and Control (DMAIC). This iterative approach makes use of various techniques during a Six Sigma project to solve a problem within a process by analyzing and improving upon that specific process. For example, brainstorming within the Analyze phase is used to come up with ideas as to where the potential root causes of a problem in a process could stem from.
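
As a small illustration of the data-driven tools used in the Analyze phase, the sketch below computes a Pareto breakdown of defect causes with pandas, separating the "vital few" causes from the "trivial many"; the defect categories and counts are invented.

# Hedged sketch of a Pareto analysis, a common Analyze-phase tool.
# The defect categories and counts are invented examples.
import pandas as pd

defects = pd.Series({
    "scratches": 48, "misalignment": 27, "wrong_label": 12,
    "discoloration": 8, "other": 5,
}).sort_values(ascending=False)

cumulative_pct = defects.cumsum() / defects.sum() * 100
pareto = pd.DataFrame({"count": defects, "cumulative_%": cumulative_pct.round(1)})
print(pareto)  # the top causes account for most of the defects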

Process control & agility is the third and final step in process management, where the aim is to measure the outcome and control the variables which affect the process. In this stage statistical tools are used to control and steer the process in order to obtain a process that is in control and stable. These tools are collectively known as Statistical Process Control (SPC) and are based on mathematical statistics, closely related to the quality principle of "basing decisions on facts", i.e., one of the cornerstones of Total Quality Management. When working with SPC it is important to be aware that variation occurs in all processes, but not all variation has a big effect on the process. According to Deming, there are two types of variation: common cause variation and special cause variation. Common cause variations are all the numerous small variations that can be predicted to fall within μ ± 3σ, whilst special cause variations are the few larger variations of some variables considered to be unpredictable surprises and abnormal variations, meaning that the process output will fall outside μ ± 3σ. SPC is used in organizations with the main purpose of achieving process stability and improving process capability by reducing variation. Adopting a statistical-thinking strategy means improving the performance of a process by ensuring that the process is in control (stable) and that its performance against customer specifications (capability) is improved. An SPC analysis normally includes the following steps: checking for normality and distribution, plotting time series and control charts, conducting a capability analysis, and finally interpreting and evaluating the results of the analysis.
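
The following sketch shows the core of such an SPC check for an individuals control chart: the process standard deviation is estimated from the average moving range, control limits are placed at μ ± 3σ, and points outside the limits are flagged as special cause variation. The measurements are invented, and a full analysis would also include the normality check and capability analysis described above.

# Hedged sketch of an individuals (I-MR) control chart check. Sigma is
# estimated from the average moving range (divided by the d2 constant
# 1.128 for subgroups of size 2). The measurements are invented.
import numpy as np

x = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 13.5, 10.1, 9.7, 10.0])

mu = x.mean()
moving_range = np.abs(np.diff(x))
sigma_hat = moving_range.mean() / 1.128

ucl, lcl = mu + 3 * sigma_hat, mu - 3 * sigma_hat
special_cause = np.where((x > ucl) | (x < lcl))[0]

print(f"UCL = {ucl:.2f}, LCL = {lcl:.2f}")
print(f"Points signalling special cause variation: {special_cause}")  # -> [6]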


4. Literature review

In this chapter relevant findings from academic journals and literature are presented to come up with new understandings and insights to answer the first research question regarding the usage of big data in process management.

This literature review delves into the topics and areas included in the theoretical framework to find the relevant academic journals, literature, books, etc., needed to answer the first research question on how big data is used in process management.

For each article in this chapter, a summary of the most relevant findings is presented in detail under each subchapter heading. At the end of this chapter, a literature review summary is presented in the form of an analysis matrix that shows how the contents of each article connect process management and big data.

4.1 Value creation through big data

In the article "Value creation through big data application process management: the case of the oil and gas industry" written by Sumbal et al. (2019), the authors describe how big data was used in the oil and gas (O&G) sector to create value. In their study, they interviewed top- and middle-level managers and experts from different companies around the world with relevant knowledge of the use of big data in the O&G sector (Sumbal, et al., 2019). Their findings revealed that big data is used primarily to enhance internal processes, such as improving the performance of different operations, increasing the efficiency of different mechanisms, and optimizing the costs of new technologies (Sumbal, et al., 2019).

One participant revealed how O&G companies use big data to develop a catalyst to be used in the gas liquefaction process, i.e., the condensation of gas. By comparing large volumes of data, the participant's company was able to determine the most efficient combination of parameters for catalyst development. This helped reduce the development time of the catalyst from many years to just 13 months.

Another example of how big data was used was in the maintenance processes of companies in the O&G sector. They were able to conduct predictive maintenance of machines by generating a variety of data through sensors placed on equipment such as turbines, pumps, and compressors. Analyzing the sensor data and comparing various machine parameters enabled predictions of which parts of the machines required maintenance. This was beneficial since it saved time and reduced the cost of repairs.
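
A hedged sketch of this kind of sensor-based predictive maintenance analysis is given below using scikit-learn's IsolationForest to flag anomalous operating states; the sensor features, readings, and failure pattern are invented stand-ins for the proprietary data described by the participants.

# Hedged sketch: flagging anomalous sensor readings for maintenance with
# an IsolationForest. All features and readings are invented examples.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=42)

# Simulated normal operation: (vibration [mm/s], bearing temperature [C]).
normal = rng.normal(loc=[2.0, 60.0], scale=[0.3, 2.0], size=(500, 2))
model = IsolationForest(contamination=0.01, random_state=42).fit(normal)

# New readings from a pump; the last one drifts toward a failure pattern.
new_readings = np.array([[2.1, 61.0], [1.9, 59.5], [4.8, 78.0]])
flags = model.predict(new_readings)  # -1 marks an anomaly

for reading, flag in zip(new_readings, flags):
    status = "schedule maintenance" if flag == -1 else "normal"
    print(reading, "->", status)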

One of the interviewed participants, working as a senior IT manager, stated that 15 big data projects had been initiated within his organization. They had been using satellite images to make decisions related to pollution. The idea behind this approach was to monitor the leakage of oil pumps in order to check for pollution from their installations under water. The participant also stated that by using satellite images it was possible to detect leakage automatically.

Aside from the practical applications of big data in process management to create value, Sumbal et al. (2019) highlight the main issues and challenges concerning its implementation. These included top management's orientation towards big data, cyber-attacks & security issues, quality & integration of data sources, IT infrastructure, and the lack of data scientists.

The interviews showed that in some cases executives were neither aware of nor concerned about big data, only about budgets, costs, and other high-level decisions. Other companies had executives who were more knowledgeable about and generally supportive of big data, but not of large-scale deployment of the technology behind it. Another participant revealed that top management viewed data management as a function of IT, when that was not the case.

Cyber-attacks and security issues were a concern for one participant's company, where technology was the main issue: the company did not trust cloud-based systems to be safe for storing data since it did not own them. Another company had the instrumentation necessary for establishing a cyber link to its big data but feared giving access to the data because of the risk of cyber-attacks.

Concerning the quality and integration of data sources, the O&G companies recognized the challenge these posed since their exploration operations were very data-intensive. They required an investment of time, money, and expertise to integrate the huge volumes of data generated through a variety of equipment, instruments, and sensors. Participants from two different companies revealed that integrating data from different databases in order to extract knowledge from it was their main issue, and that they needed data warehouse solutions for this. Another challenge was ensuring that data was looked into, sampled, and searched for problems, and that master data management systems and programs were set up to improve data quality. Specifically, they needed recorded data to be confirmative data (explicit knowledge) that had been trended over a long period in order to make decisions (tacit knowledge).

The research also revealed that the IT infrastructure of O&G companies was not a challenge for companies looking to perform BDA. IT infrastructure was considered to be very robust, with multiple alternative solution providers, and most companies were investing in infrastructure to store their data for faster real-time connections between rigs and centers. The major concern for these companies was the lack of data analysts and data scientists who clearly understood data mining, data analytics, programming, statistics, and business.
