
Improving the Quality Attributes of a Monolithic Messaging Gateway



Address: P.O. Box 883, SE-721 23 Västerås, Sweden

Address: P.O. Box 325, SE-631 05 Eskilstuna, Sweden

E-mail: info@mdh.se

Web: www.mdh.se


Daniel Brahneborg has worked at Infoflex Connect AB in Stockholm, Sweden for close to 20 years. Most of this time has been spent on the company's flagship product, the SMS gateway EMG. He received an M.Sc. degree in computer science from Umeå University in 2015, after completing his master's thesis on the evaluation of an alternative software architecture for EMG. In 2017 he joined the ITS ESS-H research school as an industrial doctoral student, with a research focus on distributed systems in general and messaging systems in particular.


Mälardalen University Press Licentiate Theses No. 290

IMPROVING THE QUALITY ATTRIBUTES

OF A MONOLITHIC MESSAGING GATEWAY

Daniel Brahneborg 2020

School of Innovation, Design and Engineering


Copyright © Daniel Brahneborg, 2020

ISBN 978-91-7485-465-7

ISSN 1651-9256

Printed by E-Print, Stockholm, Sweden


Ambiguous dedication.


Abstract

All software communicates, either with the operating system, with other software running on the same machine, or over a network. Typically some middleware is used to facilitate this communication, providing protocol conversion between otherwise incompatible senders and recipients, as well as handling buffering for optimized bandwidth usage. This sender-middleware-recipient model holds for all abstraction levels, from the physical link layer up to the application layer, with differences only in the details.

In this thesis we focus on the application layer, in particular on the group of middleware called “messaging gateways”. For the validation of our contributions, we use an existing, real-world messaging gateway deployed worldwide for forwarding mobile text messages. As these messages have a per-message cost, this gateway also has a module for credit management, used when billing the senders for their traffic. Our overall goal is to find ways to improve the quality attributes of such a gateway, particularly concerning message delivery throughput and reliability.

First, we wanted a better understanding of how the round-trip times for outgoing requests varied, in order to correctly detect abnormal delays. This resulted in a new variant of exponential smoothing which we used in a novel algorithm to detect anomalies. This algorithm was then validated in a case study, resulting in a log file analysis tool now used in production.

Second, the requirements for high throughput and strong reliability in some ways contradict each other. Strong reliability requires messages to be replicated to one or more other nodes, resulting in extra processing and network traffic, which lowers the throughput. We addressed this conundrum by first writing a problem formulation on how the quality attributes of a messaging gateway would be affected by a multi-node configuration, resulting in a review of the state of the art and state of practice for multi-node systems. Next, we developed a data replication algorithm, and validated it in a controlled experiment. Its proof-of-concept implementation showed that even in a geo-distributed configuration, replication throughput can scale with the number of nodes.


Finally, in order to ensure we were solving the right problems going forward, we also performed an architecture analysis of the messaging gateway based on its quality requirements. From this exploratory case study, we deduced a somewhat unexpected plan for migrating the balance management module to a set of microservices, thereby providing higher throughput for most users of our messaging gateway.
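As a rough illustration of the round-trip time anomaly detection mentioned in this abstract, the sketch below applies plain, textbook exponential smoothing to a stream of round-trip times. It is not the new smoothing variant developed in the thesis, whose details are not given here. The class name `EwmaDetector`, the parameters `alpha`, `threshold`, and `warmup`, and their default values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EwmaDetector:
    """Textbook exponential smoothing used as a simple anomaly detector.

    Assumption: this is the standard technique, not the variant from the
    thesis. All names and default values here are hypothetical.
    """

    alpha: float = 0.2       # smoothing factor, 0 < alpha <= 1
    threshold: float = 3.0   # allowed multiple of the smoothed deviation
    warmup: int = 10         # samples to absorb before flagging anything
    mean: Optional[float] = None
    deviation: float = 0.0
    seen: int = 0

    def observe(self, rtt_ms: float) -> bool:
        """Feed one round-trip time in ms; return True if it looks abnormal."""
        self.seen += 1
        if self.mean is None:
            # The first sample only initializes the smoothed mean.
            self.mean = rtt_ms
            return False
        error = abs(rtt_ms - self.mean)
        # Judge the sample against the state from before it is absorbed.
        abnormal = (
            self.seen > self.warmup
            and error > self.threshold * self.deviation
        )
        # Update the smoothed mean and the smoothed absolute deviation.
        self.mean += self.alpha * (rtt_ms - self.mean)
        self.deviation += self.alpha * (error - self.deviation)
        return abnormal


if __name__ == "__main__":
    detector = EwmaDetector()
    for rtt in [20, 22] * 5 + [21.5, 500.0]:
        print(rtt, detector.observe(rtt))
```

In this sketch an abnormal sample is still folded into the smoothed state, so one large spike widens the accepted range for a while. Whether that is desirable depends on the traffic. Skipping the update for flagged samples is an equally simple alternative, at the cost of never adapting to a permanent shift in response times.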


Sammanfattning (Summary)

All software communicates in one way or another, either with the operating system, with other software running on the same computer, or over a network. Usually some form of middleware is used to facilitate the communication, provide protocol conversion where needed, and optimize bandwidth usage by means of buffering. The sender-middleware-recipient model is useful at all levels, from the physical link layer to the application layer, where the differences mainly concern details.

This thesis focuses on the application layer, above all on the group of middleware called “messaging gateways”. To validate our results we used an existing piece of software created specifically for forwarding SMS across the world. Because SMS is charged per message, this software also has a module for credit management, which provides the basis for invoicing the traffic. Our overall goal is to identify different ways to improve the quality attributes of such software, in particular regarding performance and reliability.

To begin with, we wanted a better understanding of the variation in response times from the mobile operators, with the goal of being able to identify abnormal behavior. The work to reach this understanding resulted in a new variant of exponential smoothing and an algorithm for anomaly detection. The algorithm was then validated in a case study.

Furthermore, the requirements for high performance and reliability conflict with each other, since high reliability requires messages to be replicated to one or more other nodes, which results in increased processing and network traffic and thereby affects performance negatively. We addressed this problem by first writing a problem formulation for how the quality attributes would be affected in a configuration with multiple nodes, which resulted in a review of current research and practice. We then developed a new data replication algorithm, which was validated in a controlled experiment. The results from the experiment showed that even in a geographically dispersed configuration, performance can increase with the number of nodes.

Finally, to ensure that the solutions we create going forward give significant improvements, we also performed an architecture analysis of the SMS software. This case study led to the somewhat unexpected insight that migrating the credit management to a set of microservices would result in improved performance for most of the system's different user categories.

Popular summary

As individuals, we can choose between a plethora of systems for sending short messages. For messages sent from companies to their customers, such as meeting reminders, tickets, and authentication codes, traditional text messages are still commonly used as this is proven technology which works on all mobile phones. The companies usually send these messages via SMS brokers, who in turn forward them to each recipient’s mobile operator. Because the brokers charge the senders per message, they want to be able to handle a large number of messages for this traffic to be profitable. They also want to be sure the senders are charged the correct amount. Senders, for their part, want to be able to trust that their messages will reach the customers.

One of the software products that handle this kind of data traffic, which has some unusual features and quality requirements, is the Enterprise Messaging Gateway (EMG) from Infoflex Connect (ICAB). In this thesis Daniel Brahneborg, in a collaboration between Mälardalen University and ICAB, has built on current research to find better ways to meet the sometimes conflicting requirements of both good performance and high reliability. This has resulted in a new algorithm for finding deviations in response times, which can vary from a few milliseconds to several seconds and still be considered normal. It has also provided a more efficient technique to keep data safe when using geographically dispersed computers. A thorough analysis of EMG’s architecture finally showed how its balance management could be changed to handle the steadily increasing traffic volumes of both larger and smaller SMS brokers.


Populärvetenskaplig sammanfattning (Popular science summary)

Private individuals can today choose among many different systems for sending short messages to each other. For messages sent from companies to their customers, such as meeting reminders, tickets, and login codes, traditional SMS is nevertheless still very common because the technology is well proven and works on all mobile phones. The companies usually send these via SMS brokers, who in turn forward them to each recipient's mobile operator. Because the brokers charge the senders a fee per message, they want to be able to handle large traffic volumes for this to be profitable. They also want to be sure that the senders are charged the correct amount. The senders, for their part, want to be able to trust that their messages reach the customers.

One of the software products available for handling this kind of data traffic, which has both special characteristics and quality requirements, is the Enterprise Messaging Gateway (EMG) from Infoflex Connect (ICAB). In a collaboration between Mälardalen University and ICAB, Daniel Brahneborg has in this thesis built on current research to find better ways to meet the sometimes hard-to-reconcile requirements of both good performance and high reliability. The work has resulted in a new algorithm for finding deviations in response times, even though these can vary from a few milliseconds to several seconds and still be considered normal. It has also yielded a more efficient method for backing up data between geographically separated computers. An in-depth analysis of EMG's architecture finally showed how its balance management could be changed to handle the steadily increasing traffic volumes of both larger and smaller SMS brokers.

Acknowledgments

This thesis would not exist had it not been for the teachers at MDH. After putting my academic studies on hold for 20 years, in 2015 I finally completed my master’s thesis. Studying turned out to be more fun than I remembered, probably because after having been programming for a living for some time, I now had concrete use cases where the new knowledge could be applied. So, I kept taking courses here and there. One of these courses was on software testing, at MDH. It turned out that testing was actually a science which could be approached methodically to get significantly better results, and not a competition in who can find the best pills with funny colors. Who knew? Thank you Daniel, Wasif, Adnan, and Eddie for turning my world upside-down.

Until this point, doctoral studies had never even crossed my mind. Then somebody at MDH flipped my world (again) by mentioning the concept “industrial PhD”. By doing publishable research and taking courses which would be beneficial for the company, everybody wins. The company gets better products, and if everything goes well the student eventually gets a PhD. Cool! Thank you Stefan for agreeing that this would be a good idea for Infoflex Connect. There have been some changes in the set of advisors over the years, which can be seen in the lists of co-authors. Daniel, Wasif, Adnan, Mats and now most recently Saad have all helped in making this thesis better in various ways. Thank you all.

A special thanks also goes to the other PhD students at MDH, in particular the rest of you in the ITS ESS-H industrial graduate school, for inspiration. We mostly all do our own stuff, but the number of “oh, that was a neat way of doing $thing” moments in both papers and presentations is still quite large.

Though most of you are anonymous, I must also thank everybody who has reviewed and commented on our submitted papers over the years. Getting a “-2: Reject” is never fun, but eventually the associated comments helped make the papers included in this thesis better. Both “+3: Strong Accept” (on a scale from -3 to +3) and “Originality: 6” (on a scale from 1 to 6) are however obviously more appreciated.


Above all, thanks to Mia for support, numerous discussions and tons of links to useful articles and videos helping me understand what I was trying to achieve, and for spending countless hours suggesting cleaner, clearer, and more concise English expressions. You are the bestest.

Daniel Brahneborg Västerås and Stockholm, 2020


List of Publications

I am the main author of all papers listed below. I also made the implementations for Paper A and Paper D. The co-authors all contributed with valuable discussions before and during each project, and various additions, adjustments and clarifications of the texts. The included papers have been reformatted to comply with the thesis layout.

Papers included in the thesis

Paper A: Daniel Brahneborg, Wasif Afzal, Adnan Čaušević, Daniel Sundmark, and Mats Björkman. Round-Trip Time Anomaly Detection.

In the ACM/SPEC International Conference on Performance Engineering (ICPE). ACM, 2018.

Paper B: Daniel Brahneborg, Wasif Afzal, Adnan Čaušević, Mats Björkman. Towards a More Reliable Store-and-forward Protocol for SMS.

In the Workshop on Advanced tools, programming languages, and PLatforms for Implementing and Evaluating algorithms for Distributed systems (ApPLIED), part of the ACM Symposium on Principles of Distributed Computing (PODC). ACM, 2018.

Paper C: Daniel Brahneborg, Wasif Afzal. A Lightweight Architecture Analysis of a Monolithic Messaging Gateway.

In the IEEE International Conference on Software Architecture (ICSA). IEEE, 2020.

Paper D: Daniel Brahneborg, Wasif Afzal, Adnan Čaušević, Mats Björkman. Superlinear and Bandwidth Friendly Geo-replication for Store-And-Forward Systems.

In the International Conference on Software Technologies (ICSOFT). INSTICC, 2020.


Papers not included in thesis

Paper X: Daniel Brahneborg, Wasif Afzal, Adnan Čaušević. A Pragmatic Perspective on Regression Testing Challenges.

In the International Conference on Software Quality, Reliability & Security (QRS). IEEE, 2017.

Paper Y: Daniel Brahneborg, Wasif Afzal, Adnan Čaušević. A Black-Box Approach to Latency and Throughput Analysis.

In the International Conference on Software Quality, Reliability & Security (QRS). IEEE, 2017.

This is an early version of Paper A.

Paper Z: Daniel Brahneborg. Doctoral Symposium: Leaderless Replication and Balance Management of Unordered SMS Messages.

In the International Conference on Distributed and Event-based Systems (DEBS). ACM, 2019.


Acronyms

The acronyms used in this thesis are provided here for easy reference.

ATAM Architectural Trade-off Analysis Method, a way to analyze a software architecture, which is focused on quality attributes.

EMG Enterprise Messaging Gateway, our demonstration system.

GSM Global System for Mobile communications, a digital system for mobile telephony.

IA5 International Reference Alphabet number 5, a 7-bit character encoding scheme often used for mobile text messages.

ICAB Infoflex Connect AB, the company that owns EMG.

HTTP HyperText Transfer Protocol, the primary communication protocol used for web traffic.

MPS Messages Per Second, the unit we use for measuring throughput of a messaging system.

NoSQL A common name for databases which do not use the relational paradigm of SQL databases.

PDU Protocol Data Unit, a single data packet used for SMS traffic. Each PDU contains a login request with a username and password, a single SMS of up to 160 characters, or the acknowledgement of a received message.

QUIC A general-purpose networking protocol used instead of TCP for HTTP/3.

RTT Round-Trip Time, the time between sending a request to a remote system and getting an acknowledgement back.

SMPP Short Message Peer-to-Peer, a communication protocol used for SMS.


SMS Short Message Service, the traditional text messages in the GSM, 3G, and 4G networks.

SQL Structured Query Language, the standard language for accessing relational databases.

TCP Transmission Control Protocol, providing reliable and ordered delivery of network packets over IP networks.

UCP Universal Computer Protocol, a communication protocol used for SMS.

UCS Universal Character Set, a way to encode character codes. Common variants are UCS-2, which uses 2 bytes per character, and UCS-4, which uses 4 bytes per character.

Contents

I Thesis

1 Introduction
1.1 System Model
1.2 Proof-of-Concept System
1.3 Motivation
1.4 Thesis Goal
1.5 Thesis Outline

2 Background & Related Work
2.1 SMS Protocols
2.2 SMS Gateway Log Files
2.3 Round-Trip Time Distribution
2.4 Architectural Approaches
2.5 Related work

3 Research Summary
3.1 Challenges
3.2 Papers
3.3 Contributions
3.4 Process

4 Conclusions
4.1 Future Work

II Included Papers

Paper A: Round-Trip Time Anomaly Detection
5.1 Introduction
5.2 Background and Terminology
5.3 Related work
5.4 Approach
5.5 Case Study Design
5.6 Case Study Results
5.7 Validity Threats
5.8 Conclusions and Future Work

Paper B: Towards a More Reliable Store-and-forward Protocol for SMS
6.1 Introduction
6.2 System Model
6.3 Requirements
6.4 Solution Space
6.5 Related Work
6.6 Summary

Paper C: A Lightweight Architecture Analysis of a Monolithic Messaging Gateway
7.1 Introduction
7.2 Method
7.3 Results
7.4 Discussion
7.5 Threats to Validity
7.6 Conclusions and Future Work

Paper D: Superlinear and Bandwidth Friendly Geo-replication for Store-And-Forward Systems
8.1 Introduction
8.2 Proposed solution
8.3 Experiment
8.4 Results
8.5 Discussion
8.6 Related Work
8.7 Conclusions and Future Work

Part I

Thesis


Chapter 1

Introduction

Communication has been an important concept in computer science for a long time, even though it did not always involve networking. In earlier days the communication was mainly between the applications and what is now called the operating system kernel [27]. Over time, applications started communicating with each other, both within the same computer and across a network. This communication would sometimes require a separate software component sitting between the communication endpoints, providing protocol conversion when one or both applications could not be changed [7]. In other cases, such a component could provide a bridge between new applications and legacy databases when there was a mismatch in the data [37], e.g. whether prices are specified with or without tax, or whether locations use town names or postal codes. The names used for the software in the middle have varied, but “middleware” [27], “gateways” [7] and “mediators” [37] seem to be the most common. In this thesis, we will use “gateways”.

Many gateways use a variant of the store-and-forward architecture [11] in order to isolate producers and consumers of data from each other, leading to a more resilient system than if all operations were done in lockstep. Furthermore, by storing data in the gateway for a little while before forwarding it, the data could be split and/or joined to utilize the outgoing bandwidth more effectively. The store-and-forward architecture is also useful when there is a human on at least one end, e.g. for email and instant messaging [5], as this allows the human’s computer or mobile phone to temporarily be switched off. We will use the term “messaging gateway” for such systems, even though this term is also used for application-to-application gateways.

Gateways specialized in mobile text messages, often referred to as SMS (Short Message Service), are of particular interest in this thesis. SMS is still popular despite being relatively old technology, as an SMS can be both sent and received by all mobile phones without any additional software installed. SMS is therefore frequently used world-wide by companies for sending meeting reminders, tickets, authentication codes, and more, to their customers. It is also useful for communication in the other direction, e.g., when the audience can send votes during a TV show. In 2019, on average about 300 000 text messages were sent every second¹. This traffic volume keeps increasing.

Sending an SMS directly over the internet to the network operators is surprisingly non-trivial. First, the right operator must be selected for each message. This could previously be done by just checking the first few digits of the phone number, but due to number portability this is now more complex. Next, the operators use systems with different communication protocols and can have very specific requirements on the traffic.

In the spirit of “encapsulating the concept that varies” [12], the complexity of communicating with the operators is normally contained within a gateway designed to handle SMS traffic. Such a gateway, simply referred to as an SMS gateway, is often run by a group of companies known as SMS brokers. The gateways and the brokers both offer a simplification for the senders, the gateway on a technical level and the brokers on a business level. This provides added value which senders are willing to pay for, and creates many business opportunities to provide additional services. This thesis is centered around messaging gateways in general and SMS gateways in particular, exploring ways to make these gateways faster and more reliable, i.e. more profitable.

1.1 System Model

We define our system model as comprising one or more entities sending messages to a messaging gateway. This gateway stores the messages and sends back acknowledgements for each one. The messages get picked up from the message storage, sent to a selected recipient, and deleted from the storage when the acknowledgement from the recipient comes back. There are no end-to-end acknowledgements, and the senders cannot resend lost messages. All message flows are independent and asynchronous, and all communication to and from the gateway is carried out using standard communication protocols which cannot be modified. All remote systems are authenticated and well behaved, so there are no denial-of-service attacks or byzantine failures [24].
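The message life cycle in this model can be illustrated with a minimal sketch. It is not taken from EMG: the class name, the method names, and the use of an in-memory dictionary as message storage are all assumptions made for illustration, and the actual network transfer to the recipient is left out.

```python
from collections import OrderedDict
from itertools import count


class StoreAndForwardGateway:
    """Toy in-memory model of the store, acknowledge, forward, delete cycle.

    Hypothetical names throughout; this is not how EMG is implemented.
    """

    def __init__(self):
        self._storage = OrderedDict()  # message id -> (recipient, body)
        self._ids = count(1)

    def receive(self, recipient, body):
        """Store an incoming message, then acknowledge it to the sender."""
        msg_id = next(self._ids)
        self._storage[msg_id] = (recipient, body)
        return msg_id  # stands in for the acknowledgement

    def pending(self):
        """Messages stored but not yet acknowledged by their recipient."""
        return list(self._storage.items())

    def on_recipient_ack(self, msg_id):
        """Delete a message once its recipient has acknowledged it."""
        self._storage.pop(msg_id, None)


gw = StoreAndForwardGateway()
first = gw.receive("operator1", "hello")
second = gw.receive("operator2", "world")
to_forward = gw.pending()      # a real gateway would send these out
gw.on_recipient_ack(first)     # only the first recipient has answered
remaining = [msg_id for msg_id, _ in gw.pending()]
```

In this sketch a message leaves the storage only when the recipient acknowledges it, so a message whose acknowledgement never arrives stays available for another delivery attempt.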

As all messages are stored, fetched and deleted individually in this model, the performance of these operations on the message storage significantly affects the number of messages the gateway can process per unit of time. Using contemporary hardware, there are three main classes of storage options. First, we have RAM based storage. RAM is typically fast, leading to throughput in the order of tens of thousands of messages per second (MPS). It is however volatile², which means messages are lost when the application is restarted. Second, we have disk based storage. When messages are stored on disk they are no longer lost during normal operations, even after application restarts. As disks are slower than RAM, the throughput for the gateway is reduced to typically a few thousand MPS. Third, we have cluster based storage. A server cluster requires additional network communication as well as coordination and is therefore typically slower than local disks, further reducing potential message throughput to a few hundred MPS. However, messages would not be lost even during critical failures of a limited number of servers.

¹ https://www.visualcapitalist.com/what-happens-in-an-internet-minute-in-2019

When this model is implemented in the SMS context as an SMS gateway, the message senders are typically other companies, and the recipients are mobile network operators, as shown in Figure 1.1. The senders pay the SMS brokers for forwarding the traffic, so an SMS gateway requires a credit management module which can keep track of the messages sent by each company and reject traffic when prepaid balances are depleted. The reliability of the message storage is business critical for the SMS brokers due to the per message cost.

Figure 1.1: Companies sending text messages via an SMS broker to Mobile Network Operators. (Diagram nodes: Company 1, Company 2, Broker, Operator 1, Operator 2.)
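The credit management mentioned above can be sketched as a prepaid balance check. This is only an illustration under assumed rules: the class and method names are hypothetical, balances are counted in whole credits, and each message is assumed to cost a fixed price that is charged before the message is accepted.

```python
class CreditManager:
    """Hypothetical prepaid balance tracker, one balance per sending company."""

    def __init__(self, balances):
        self._balances = dict(balances)  # company name -> remaining credits

    def try_charge(self, company, price=1):
        """Deduct the price and return True, or return False to reject."""
        balance = self._balances.get(company, 0)
        if balance < price:
            return False  # prepaid balance depleted: reject the message
        self._balances[company] = balance - price
        return True

    def balance(self, company):
        return self._balances.get(company, 0)


credits = CreditManager({"company1": 2})
accepted = [credits.try_charge("company1") for _ in range(3)]
unknown = credits.try_charge("company2")
```

With a starting balance of two credits, the third message from the same company is rejected, and so is any message from a company without a registered balance.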

1.2 Proof-of-Concept System

Some of the SMS brokers develop their own software, while others prefer to use existing third party solutions. One of these third party products is the Enterprise Messaging Gateway (EMG) from Infoflex Connect AB (ICAB). EMG is an SMS gateway matching our system model, and is used as a proof-of-concept messaging gateway and an industrial use case in this thesis. EMG handles the “soft mismatch” [7] case, as not all attributes exist or have the same values in all SMS protocols. The protocols are however similar enough for an SMS gateway to be able to provide meaningful conversions in most practical cases. We will use the terms “messaging gateway”, “SMS gateway” or “EMG” throughout the thesis, depending on which abstraction level is the most appropriate.

² At the time of this writing, non-volatile RAM is starting to become a reality, but is not yet publicly available.

The strengths of EMG particularly appreciated by the SMS brokers concern a) high throughput, b) good compatibility, and c) flexible extensibility. The higher the throughput the SMS gateway supports, the more traffic from the senders can be processed, thereby giving more revenue for the broker. The compatibility is mainly with other systems, both those sending messages to EMG and those to which EMG sends messages. The flexible extensibility means the SMS brokers can add their own business logic and communication protocols when needed. When EMG is deployed on multiple servers, there is some room for improvement regarding the total system throughput.

1.3 Motivation

As the number of messages companies want to send keeps increasing, the SMS brokers need their SMS gateways to support a higher message throughput. With a higher throughput there are typically more messages in flight between the senders and the recipients, emphasizing the importance of reliability in the store part of our store-and-forward architecture. In our system model, increased reliability using disk or cluster storage for the messages not yet sent typically leads to lower throughput, so we identified a need to better understand the trade-offs between these conflicting requirements.

Using throughput and reliability as the starting point, we widened our domain to cover quality requirements (QR) [13] of messaging gateways in general. Quality requirements are sometimes called “non-functional” requirements, though that may not be the best name as they often border on functionality [10]. In some situations, there is even no separation at all between functional and quality requirements, for example regarding the maximum time a message is allowed to stay in the SMS gateway before it is forwarded. An SMS containing an authentication code must be delivered well within one minute, or the user will have already requested a new code, rendering the first code invalid and the first message thereby meaningless. In 2019, there was a more severe incident, where 168 149 text messages were delayed from February 14th to November 7th due to a failed server. By the time their messages were delivered, some senders had already died, which understandably caused confusion and distress for the recipients³. It would clearly have been better if these messages had been replicated to another server which then could have delivered them as soon as it noticed that the original server was no longer active. Lacking such a failover mechanism, the messages should just have been dropped when the server was reactivated.

1.4 Thesis Goal

The overall goal in this thesis is to improve the measurable attributes related to the quality requirements of a messaging gateway consistent with our system model. Primarily, this means increasing the throughput, measured as the number of processed messages per second, and the reliability, represented by the number of messages which would be lost in case of a server failure. To minimize risk and development costs, the improvements should be achieved while keeping adjustments of the existing system architecture to a minimum. The list of addressed quality requirements is shown in Table 1.1, grouped by the paper in which they are particularly significant.

Table 1.1: The most important quality requirements (QR) in each paper.

Paper  QR               Description
A      Latency          Clients should get acknowledgements for sent messages without unnecessary delays.
B      Scalability      It should be possible to run the messaging gateway in parallel on multiple machines for a higher total system throughput.
       Maintainability  It must be possible to track each processed message and troubleshoot connections to clients and operators.
C      Availability     Clients should always be able to connect to the messaging gateway and send messages.
       Efficiency       Throughput should be high, even on moderately powerful hardware.
D      Reliability      Received and acknowledged messages should not be lost.

³ https://www.theverge.com/2019/11/7/20953422/text-messages-delayed-received-overnight-valentines-day-delay

1.5 Thesis Outline

The rest of the thesis is structured as follows. Chapter 2 contains further background information, and elaborates on the motivation behind this thesis. Chapter 3 contains a summary of the research. Chapter 4 concludes the thesis, discussing future work. Part II contains the included papers.

Chapter 2

Background & Related Work

In this chapter, we describe important concepts relevant for this thesis and summarize relevant related work. All papers in this thesis discuss SMS traffic to some degree, so in Section 2.1 we describe the communication protocols used for such traffic. Log files can be very helpful in many applications for analyzing which actions the application has taken, and the log files used in Paper A are described in Section 2.2. Our initial observations regarding the round-trip times for outgoing requests and processing times for incoming requests, which were presented in Paper Y and analyzed in Paper A, are described in Section 2.3. Next, Section 2.4 contains a general discussion about software architecture, which is important in Paper B, Paper C, and Paper D. Finally, Section 2.5 contains related work on the anomaly detection discussed in Paper A and the data replication discussed in Paper D.

2.1 SMS Protocols

There are a handful of protocols used for SMS messaging, most of them originally designed for direct communication between message senders and network operators. The protocols are similar to each other as they all support almost the same set of attributes, e.g., sender phone number, recipient phone number, message body, character set, and whether a delivery receipt should be returned. The data packet containing such a set of attributes for a single request or response is called a PDU, a Protocol Data Unit.

The main differences between the protocols concern the encoding of the values in a PDU. For example, UCP (Universal Computer Protocol¹) sends all data as text with each field separated by a “/” character, while SMPP (Short Message Peer-to-Peer²) sends all data as binary encoded tuples containing a field number, the data length, and the data. Over time, many SMS gateways have started to also support more general purpose protocols such as HTTP (Hypertext Transfer Protocol³) albeit with different PDU encodings.

¹ https://en.wikipedia.org/wiki/EMI_(protocol)
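The two encoding styles can be contrasted with a small sketch. It does not reproduce the real UCP or SMPP wire formats: the function names, the field numbers, the 2-byte field number and 2-byte length, and the phone numbers are all assumptions chosen only to show text fields joined by “/” next to binary tuples of field number, data length, and data.

```python
import struct


def encode_slash_separated(values):
    """Text style: all fields as text, separated by a '/' character."""
    return "/".join(values).encode("ascii")


def encode_tuples(fields):
    """Binary style: (field number, data length, data) for each field.

    Uses an assumed 2-byte number and 2-byte length in network byte order.
    """
    out = bytearray()
    for number, data in fields:
        out += struct.pack("!HH", number, len(data)) + data
    return bytes(out)


def decode_tuples(buf):
    """Inverse of encode_tuples: walk the buffer one tuple at a time."""
    fields = []
    offset = 0
    while offset < len(buf):
        number, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        fields.append((number, buf[offset:offset + length]))
        offset += length
    return fields


# Hypothetical sender, recipient, and message body.
text_pdu = encode_slash_separated(["46700000001", "46700000002", "hello"])
binary_pdu = encode_tuples(
    [(1, b"46700000001"), (2, b"46700000002"), (3, b"hello")]
)
decoded = decode_tuples(binary_pdu)
```

The text form is easy to read in a log file but needs an escaping rule if a value can contain the separator, while the length-prefixed form can carry arbitrary bytes at the cost of a fixed overhead per field.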

The similarities between the protocols served as the basis for the creation of EMG. Thanks to SMS gateways such as EMG, SMS software products from different vendors can easily communicate with each other, in spite of them often using very different protocols and/or encodings.

The SMS protocols are all stateful, requiring an initial “login” operation before any messages can be sent. This means there is no need for sending authentication information with each request, and enables traffic going upstream from the operators back to the companies.

Sliding windows are used to achieve a higher throughput than would be possible if the system waited for a response after each request. Each outgoing request contains a unique transaction number, and this number must then be included in the corresponding response.
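
The mechanism can be sketched as follows. The class and method names are hypothetical, the transaction number is assumed to be a simple increasing counter, and timestamps are passed in explicitly to keep the example deterministic. Up to `size` requests may be outstanding at once, and each response is matched to its request through the transaction number.

```python
class SlidingWindow:
    """Allows a limited number of requests to be in flight at once."""

    def __init__(self, size):
        self.size = size
        self.next_trn = 0
        self.pending = {}  # transaction number -> (request, send time)

    def can_send(self):
        return len(self.pending) < self.size

    def send(self, request, now):
        """Register an outgoing request and return its transaction number."""
        if not self.can_send():
            raise RuntimeError("window full, wait for a response")
        trn = self.next_trn
        self.next_trn += 1
        self.pending[trn] = (request, now)
        return trn

    def on_response(self, trn, now):
        """Match a response to its request; return the request and elapsed time."""
        request, sent_at = self.pending.pop(trn)
        return request, now - sent_at


window = SlidingWindow(size=2)
a = window.send("msg A", now=0.0)
b = window.send("msg B", now=0.1)   # sent without waiting for A's response
full = not window.can_send()        # a third request would have to wait
request, elapsed = window.on_response(b, now=0.4)
```

Because the response carries the transaction number, the sketch does not depend on responses arriving in the same order as the requests were sent.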

The sender may want confirmation that the message was successfully delivered to the phone, and would in that case request a delivery report by setting a flag in the message PDU. A delivery report is structured and handled in much the same way as a regular message.

2.2 SMS Gateway Log Files

Most SMS gateways can be configured to produce PDU log files, containing information about each data packet sent to or received from both clients and operators. These files can then be used to view the exact network traffic, separated into the different data fields used by each of the supported protocols. They are somewhat similar to having Wireshark⁴ running continuously.

A typical entry in a PDU log for a connector using SMPP, as it is generated by EMG, is shown below. The part “operator1,0” means the connection to “operator1”, instance number 0. There can be many parallel connections to the same operator, thus the need for an instance number to distinguish them. The “trn” field is the transaction number used by the sliding window mechanism. When a response comes back with the same transaction number on the same connection, it is possible to calculate the round-trip time for that request. The “SHORTMESSAGE” field contains the Latin-1 or UCS-2 encoding of the

² https://en.wikipedia.org/wiki/Short_Message_Peer-to-Peer
³ https://tools.ietf.org/html/rfc7231
⁴ https://www.wireshark.org


Figure

Figure 1.1: Companies sending text messages via an SMS broker to Mobile Network Operators.
Table 1.1: The most important quality requirements (QR) in each paper.
Figure 2.1: The distributions of the processing times for incoming requests to EMG, and the round-trip times for outgoing traffic to an operator.
Figure 3.1: The gateway architecture and the included papers. There may be multiple senders and recipients.