IT 20 032
Degree project, 30 credits
June 2020
Deployment failure analysis
using machine learning
Joosep Franz Moorits Alviste
Department of Information Technology
Faculty of Science and Technology, UTH unit
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, House 4, Floor 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Website: http://www.teknat.uu.se/student
Abstract
Manually diagnosing recurrent faults in software systems can be an inefficient use of engineers' time. Such diagnosis is commonly performed by inspecting the system logs from the time of the failure. The DevOps engineers at Pipedrive, a SaaS business offering a sales CRM platform, have developed a simple regular-expression-based service for automatically classifying failed deployments. However, such a solution is not scalable, and a more sophisticated solution is required.
In this thesis, log mining was used to automatically diagnose Pipedrive's failed deployments based on the deployment logs. Multiple log parsing and machine learning algorithms were compared based on the F1 score of the resulting log mining pipeline. A proof-of-concept log mining pipeline was created that consisted of parsing the logs with the Drain algorithm, transforming the log files into event count vectors, and finally training a random forest model to classify the deployment logs. The pipeline achieved an F1 score of 0.75 on the test data and a lower score of 0.65 on the evaluation dataset.
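As an illustration of two steps named above, the following minimal Python sketch shows how a parsed log could be turned into an event count vector and how an F1 score for a single class is computed. The event IDs, failure labels, and function names are hypothetical and are not taken from the thesis. The Drain parsing step and the random forest classifier are omitted, and the abstract does not state how the F1 score was averaged across classes, so only the per-class definition is shown.

```python
from collections import Counter


def event_count_vector(event_ids, vocabulary):
    """Count how often each known event template occurs in one log file."""
    counts = Counter(event_ids)
    # Counter returns 0 for events that never occur in this log.
    return [counts[event] for event in vocabulary]


def f1_for_class(true_labels, predicted_labels, positive):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical event IDs, as a log parser such as Drain might assign them.
vocabulary = ["E1", "E2", "E3"]
parsed_log = ["E1", "E3", "E3", "E1", "E3"]
vector = event_count_vector(parsed_log, vocabulary)  # [2, 0, 3]

# Hypothetical failure labels for four deployments.
true_labels = ["timeout", "timeout", "test_failure", "timeout"]
predicted = ["timeout", "test_failure", "test_failure", "timeout"]
score = f1_for_class(true_labels, predicted, positive="timeout")
```

In this sketch each deployment log becomes one fixed-length vector, which is the kind of input a classifier such as a random forest can be trained on.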
Printed by: Reprocentralen ITC
Examiner: Mats Daniels
Subject reviewer: Justin Pearson
Supervisor: Jevgeni Demidov