DEGREE PROJECT IN ENGINEERING PHYSICS, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020
Anomaly Detection and Revenue Loss Estimation in Accounting Data
GUSTAV EDHOLM
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
d R Rd
x
[Figure: classification of a sample as Anomaly or Normal, with the class labels encoded as [0, 1] and [1, 0].]
(x, y) y
x x
i x
y xi yi
j xj
R
$x \in \mathbb{R}^d$, $y \in \mathbb{R}^k$
$y$, $y$, $x$
$\hat{y}$, $y$
$y$, $x$, $y$
$t$, $|y - \hat{y}|$, $|y - \hat{y}| > t$, $y$
$\hat{y} > y$
$LR(x) = Ax + b$, where $A$ is a $k \times d$ matrix and $b \in \mathbb{R}^k$.
$$\mathrm{mse}(y, \hat{y}) = \frac{\sum_{i=1}^{k} (y_i - \hat{y}_i)^2}{k}.$$
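The linear model $LR(x) = Ax + b$, the mean squared error, and the threshold test $|y - \hat{y}| > t$ can be sketched in a few lines. This is a minimal illustration, not the implementation used in the thesis: the function names and all numeric values are assumptions chosen for the example.

```python
def linear_predict(A, b, x):
    """Return LR(x) = A x + b for a k x d matrix A (list of rows) and b in R^k."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


def mse(y, y_hat):
    """Mean squared error over the k components of y and y_hat."""
    k = len(y)
    return sum((y_i - yh_i) ** 2 for y_i, yh_i in zip(y, y_hat)) / k


def is_anomaly(y, y_hat, t):
    """Flag a scalar target whose absolute prediction error exceeds t."""
    return abs(y - y_hat) > t


# Illustrative values only (k = 2, d = 2).
A = [[1.0, 2.0], [0.0, -1.0]]
b = [0.5, 1.0]
x = [2.0, 3.0]
y_hat = linear_predict(A, b, x)
error = mse([8.0, -1.0], y_hat)
```

A larger threshold `t` flags fewer points, so `t` trades missed anomalies against false alarms.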
[Figure: plot of $y$ against $x$.]
$x \in \mathbb{R}$, $y \in \mathbb{R}$
$(x, y)$, $\hat{y} = bx$
$x$, $\hat{y}$, $f(x) = \hat{y}$
$x$, $\hat{y}$, $x$
$x$, $\hat{y}$
$N$, $\{(x_j, y_j)\}_{j=1}^{N}$
$x_i$
$y$, $\sigma_y$, $\{(x_j, y_j)\}_{j=1}^{k}$
[Figure: decision tree whose internal nodes compare a feature of $x$ with a threshold $c_i$ and branch on "less than" versus "greater than or equal".]
$x$, $c_i$
$\hat{y}_i$
$\{(x_j, y_j)\}_{j=k+1}^{N}$
$\hat{y}_l$
$l$, $\{y_j\}_{j=m}^{n}$
$$\hat{y}_l = \frac{\sum_{j=m}^{n} y_j}{n - m + 1} = \mu_y^l.$$
$\sigma_y^l / \mu_y^l$, $t$
K
$\{f_k\}_{k=1}^{K} = \phi$, $\hat{y}$
$\{\hat{y}_j\}_{j=1}^{K}$
$L$
$\hat{y}$, $y$
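$(x, f(x))$
$\frac{\partial f}{\partial x}$, $\eta$
$$x_{t+1} = x_t - \eta \frac{\partial f}{\partial x}(x_t)$$
$\sum_{k=1}^{K} \Omega(f_k)$
$$\Omega(f_k) = \gamma T_k + \frac{1}{2} \lambda \|w_k\|^2,$$
$$L(\phi) = l(y, \hat{y}) + \sum_{k=1}^{K} \Omega(f_k)$$
$l$, $y$, $\hat{y}$, $T_k$
$k$, $w_k$, $\gamma$
$\lambda$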
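$t$, $f_t$
$$L^{(t)} = \sum_{i=1}^{N} l\big(y_i, \hat{y}_{i,(t-1)} + f_t(x_i)\big) + \Omega(f_t)$$
$\hat{y}_{i,(t-1)}$, $y_i$
$t - 1$
$$L^{(t)} \simeq \sum_{i=1}^{N} \Big[ l(y_i, \hat{y}_{i,(t-1)}) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t)$$
$g_i$, $h_i$, $l$
$$g_i = \partial_{\hat{y}_{i,(t-1)}} l(y_i, \hat{y}_{i,(t-1)}),$$
$$h_i = \partial^2_{\hat{y}_{i,(t-1)}} l(y_i, \hat{y}_{i,(t-1)}).$$
$t$
$$L^{(t)} \simeq \sum_{i=1}^{N} \Big[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t).$$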
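To make the second-order objective concrete, the sketch below computes $g_i$ and $h_i$ for a squared-error loss $l(y, \hat{y}) = (y - \hat{y})^2$ and evaluates the approximate objective for a tree with a single leaf. The choice of squared error, the single-leaf simplification ($T = 1$, $f_t(x_i) = w$ for all $i$), and all function names are assumptions made for illustration. The closed-form leaf weight $w^* = -\sum_i g_i / (\sum_i h_i + \lambda)$ is the standard minimizer of this quadratic and is not stated in the text above.

```python
def grad_hess_squared_error(y, y_prev):
    """First and second derivatives of (y - y_hat)^2 w.r.t. y_hat at y_prev."""
    g = [2.0 * (yp - yi) for yi, yp in zip(y, y_prev)]
    h = [2.0 for _ in y]
    return g, h


def approx_objective(g, h, w, gamma, lam):
    """Second-order objective for a single-leaf tree with constant output w."""
    data_term = sum(gi * w + 0.5 * hi * w ** 2 for gi, hi in zip(g, h))
    penalty = gamma * 1 + 0.5 * lam * w ** 2  # Omega(f_t) with T = 1
    return data_term + penalty


def best_single_leaf_weight(g, h, lam):
    """Minimizer of the quadratic objective above with respect to w."""
    return -sum(g) / (sum(h) + lam)


# Illustrative targets; the previous prediction is 0 for every sample.
y = [1.0, 2.0, 3.0]
g, h = grad_hess_squared_error(y, [0.0, 0.0, 0.0])
w_star = best_single_leaf_weight(g, h, lam=0.0)
```

With $\lambda = 0$ the optimal leaf weight is the mean of the residuals. A positive $\lambda$ shrinks it toward zero.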
η < 1
x
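$F(x, \theta) = \hat{y}$, $h_i$
$\hat{y}$
$H_i$, $\theta_i$
$M_i$, $A_i$
$H_i$, $h_i$, $x$
$$h_1 = H_1(x),$$
$$h_2 = H_2(h_1), \ldots$$
$$\hat{y} = H_{l+1}(h_l).$$
$l$
$$h_i = H_i(h_{i-1}) = A_i(M_i h_{i-1})$$
$M_i$, $A_i$
$x$, $L(\hat{y})$
$\hat{y}$
$y$, $x$
$y$, $\hat{y}$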
[Figure: feed-forward neural network showing the input layer, hidden layers, output layer, and weights.]
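$\{x_i\}_{i=1}^{n}$, $n$
$\{F(x_i, \theta)\}_{i=1}^{n}$, $\{\hat{y}_i\}_{i=1}^{n}$
$\{L(\hat{y}_i)\}_{i=1}^{n}$
$\theta$
$\frac{\partial L}{\partial \theta_i}$
$\eta$
$$\theta_i \leftarrow \theta_i - \eta \frac{\partial L}{\partial \theta_i}$$
$$\operatorname{argmin}_{\theta} \, [L(F(x, \theta))]$$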
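The update rule $\theta_i \leftarrow \theta_i - \eta \, \partial L / \partial \theta_i$ can be illustrated on a model small enough to differentiate by hand. The sketch below assumes a two-parameter model $F(x, \theta) = \theta_0 + \theta_1 x$ with a mean-squared-error loss. The model, the data, the learning rate and the step count are all hypothetical choices for the example and do not describe the network trained in the thesis.

```python
def predict(x, theta):
    """Toy model F(x, theta) = theta_0 + theta_1 * x."""
    return theta[0] + theta[1] * x


def gradients(xs, ys, theta):
    """Analytic gradient of the mean squared error w.r.t. theta_0 and theta_1."""
    n = len(xs)
    errors = [predict(x, theta) - y for x, y in zip(xs, ys)]
    d0 = 2.0 * sum(errors) / n
    d1 = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    return [d0, d1]


def train(xs, ys, theta, eta=0.1, steps=2000):
    """Repeat theta_i <- theta_i - eta * dL/dtheta_i for a fixed number of steps."""
    for _ in range(steps):
        grads = gradients(xs, ys, theta)
        theta = [th - eta * g for th, g in zip(theta, grads)]
    return theta


# Data generated from y = 1 + 2x, so training should approach theta = [1, 2].
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta = train(xs, ys, [0.0, 0.0])
```

In a real network the gradients come from backpropagation rather than a hand-derived formula, but the parameter update has the same form.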
x y
h
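$$\mathrm{ReLU}(h_i) = \max(0, h_i)$$
$h_l$
$$\mathrm{sigmoid}(h_l) = \frac{e^{h_l}}{e^{h_l} + 1}.$$
$k$
$$\mathrm{softmax}(h_l) = \frac{e^{h_l}}{\sum_{j=1}^{k} e^{h_{lj}}},$$
$$CE(\hat{y}, y) = -\sum_{i=1}^{k} y_i \log(\hat{y}_i).$$
$\hat{y} = h_l$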
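The activation functions and the cross-entropy loss above translate directly into code. This is a minimal sketch with assumed function names. Subtracting the maximum inside `softmax` is a common numerical-stability trick that leaves the result unchanged and is not part of the formula above.

```python
import math


def relu(h):
    """Element-wise max(0, h_i)."""
    return [max(0.0, v) for v in h]


def sigmoid(v):
    """e^v / (e^v + 1) for a scalar v."""
    return math.exp(v) / (math.exp(v) + 1.0)


def softmax(h):
    """e^{h_j} / sum_j e^{h_j}, computed after shifting by max(h) for stability."""
    m = max(h)
    exps = [math.exp(v - m) for v in h]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_hat, y):
    """-sum_i y_i log(y_hat_i); terms with y_i = 0 contribute nothing."""
    return -sum(yi * math.log(yh) for yi, yh in zip(y, y_hat) if yi > 0)


probs = softmax([1.0, 2.0, 3.0])
loss = cross_entropy([0.5, 0.5], [1.0, 0.0])
```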
θ
x
k k
k
$\{m_i\}_{i}^{k}$, $x$
$m_i$, $m_i$
$m_i$
$$S_i = \{x_p : \|x_p - m_i\|^2 \le \|x_p - m_j\|^2, \ \forall j, \ 1 \le j \le k\},$$
$m_i$
$$m_i = \frac{1}{|S_i|} \sum_{x_j \in S_i} x_j.$$
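The assignment step (building each $S_i$) and the update step (recomputing each $m_i$) can be sketched as follows. The function names, the fixed iteration count, the rule of assigning ties to the lowest index, and the handling of empty clusters are assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, means):
    """Put each point in the cluster of its nearest mean (ties go to the lowest index)."""
    clusters = [[] for _ in means]
    for p in points:
        nearest = min(range(len(means)),
                      key=lambda i: squared_distance(p, means[i]))
        clusters[nearest].append(p)
    return clusters


def update(clusters, means):
    """Replace each mean by the average of its cluster; keep it if the cluster is empty."""
    new_means = []
    for cluster, old in zip(clusters, means):
        if not cluster:
            new_means.append(old)
            continue
        dim = len(cluster[0])
        new_means.append(tuple(sum(p[j] for p in cluster) / len(cluster)
                               for j in range(dim)))
    return new_means


def k_means(points, means, iterations=10):
    """Alternate the assignment and update steps a fixed number of times."""
    for _ in range(iterations):
        means = update(assign(points, means), means)
    return means


# Two well-separated groups and k = 2 initial means (illustrative values).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
final_means = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
```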
k = 2
k
$P(x) = z$, $d$
$k$
$k$
$$Z = XW$$
$Z$, $W$, $k$, $X^{\top} X$
$X$, $x$
P#(z) = ˆx
x ˆ
x = P#(P (x))
$$\frac{1}{d} \sum_{i=1}^{d} (x_i - \hat{x}_i)^2.$$
z k
[Figure: autoencoder consisting of an encoder and a decoder.]
$z = E(x)$, $\hat{x} = D(z)$
$x$
$x$, $z$, $x$, $\hat{x}$
T
$$TPR = \frac{TP}{TP + FN}.$$
$$FPR = \frac{FP}{FP + TN}.$$
$$Precision = \frac{TP}{TP + FP}.$$
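The three rates above can be computed from binary labels and binary predictions by counting the four confusion-matrix cells. The sketch below assumes that label 1 marks an anomaly and that every denominator is non-zero. The function name and the sample labels are illustrative.

```python
def classification_rates(y_true, y_pred):
    """Return TPR, FPR and Precision for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "Precision": tp / (tp + fp),
    }


# Illustrative labels: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
rates = classification_rates(y_true, y_pred)
```

Sweeping a decision threshold and recording the (FPR, TPR) pair at each value traces out the ROC curve.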
N
$FPR = 0$, $N$, $FPR = 1$, $N$
$TPR > FPR$
$TPR \approx FPR$
$0.5$
$> 0.5$
k
z
k
α
k
k
k
y ˆ
y
k
k
[Figure: workflow stages: Data Acquisition, Data Pre-processing, Model Selection, Model Training, Model Testing.]
t
t
pc t
pt
T pt− pc
pt > T
T
pt
$R_c$, $R_t$
$$R_c = \sum_{i=1}^{N} p_i^c t_i$$
$$R_t = \sum_{i=1}^{N} p_i^t t_i.$$
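Both revenue totals are weighted sums over the $N$ records, so the gap $R_t - R_c$ follows from two passes over the data. In the sketch below the charged and target values $p_i^c$ and $p_i^t$ are treated as rates in $[0, 1]$ and $t_i$ as a per-record amount. That reading, the function name and the numbers are assumptions for illustration only.

```python
def weighted_total(p, t):
    """Return sum_i p_i * t_i."""
    return sum(p_i * t_i for p_i, t_i in zip(p, t))


# Hypothetical records: charged rates, target rates, and amounts.
p_charged = [0.25, 0.5, 0.5]
p_target = [0.5, 0.5, 0.75]
t = [8.0, 4.0, 4.0]

r_c = weighted_total(p_charged, t)  # charged revenue R_c
r_t = weighted_total(p_target, t)   # target revenue R_t
revenue_gap = r_t - r_c
```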
$R_t - R_c = 8.5$
$p_c > p_t$
$d = 1473$, $N = 9813$
$F(x) = \hat{y}$
$[0, 1]$
$p_c$, $p_t$
$n$
$\hat{y}$, $p_c$
$\hat{y} > p_c$
$|p_c - \hat{y}| > T$, $T$
$\hat{y} < p_c$
$\max(FPR) \times \max(TPR)$
$R_p$, $R_p - R_c$
$\hat{y}$, $t$
$$R_p = \sum_{i=1}^{N} \hat{y}_i t_i.$$
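$\hat{y}$, $t$
$$\mu_{\hat{y}} = \frac{\sum_{i=1}^{N} \hat{y}_i}{N}$$
$$\sigma_{\hat{y}} = \sqrt{\frac{\sum_{i=1}^{N} (\hat{y}_i - \mu_{\hat{y}})^2}{N}}.$$
$t$, $\hat{y}$, $t$
$p = \hat{y} t$
$\mu_p = \mu_{\hat{y}} \mu_t$
$\hat{y}$
$t$, $\sigma_x^2 = Var(x)$, $\mu_x = E(x)$.
$$\begin{aligned}
\sigma_p^2 &= Var(\hat{y} t) \\
&= E(\hat{y}^2 t^2) - (E(\hat{y} t))^2 \\
&= Var(\hat{y}) Var(t) + Var(\hat{y}) (E(t))^2 + Var(t) (E(\hat{y}))^2 \\
&= \sigma_{\hat{y}}^2 \sigma_t^2 + \sigma_{\hat{y}}^2 \mu_t^2 + \sigma_t^2 \mu_{\hat{y}}^2 \\
&= (\sigma_{\hat{y}}^2 + \mu_{\hat{y}}^2)(\sigma_t^2 + \mu_t^2) - \mu_{\hat{y}}^2 \mu_t^2
\end{aligned}$$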
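The step from $E(\hat{y}^2 t^2) - (E(\hat{y} t))^2$ to the three-term expression holds when $\hat{y}$ and $t$ are independent. The sketch below checks the formula against a direct computation on a small discrete example in which every $(\hat{y}, t)$ pair is equally likely, which makes the two variables independent by construction. The function names and values are illustrative assumptions.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance (divides by N, matching the definitions above)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def product_variance(mu_y, var_y, mu_t, var_t):
    """Var(y * t) for independent y and t, using the three-term expression."""
    return var_y * var_t + var_y * mu_t ** 2 + var_t * mu_y ** 2


y_vals = [0.0, 1.0]
t_vals = [1.0, 2.0, 3.0]

# Every (y, t) combination occurs once, so y and t are independent.
p_vals = [y * t for y, t in product(y_vals, t_vals)]

direct = variance(p_vals)
formula = product_variance(mean(y_vals), variance(y_vals),
                           mean(t_vals), variance(t_vals))
factored = ((variance(y_vals) + mean(y_vals) ** 2)
            * (variance(t_vals) + mean(t_vals) ** 2)
            - mean(y_vals) ** 2 * mean(t_vals) ** 2)
```

If $\hat{y}$ and $t$ are correlated in the data, the formula no longer applies exactly and the direct computation is the safer choice.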
p
$$\mu_{R_p} = N \mu_p = \sum_{i=1}^{N} p_i,$$
$$\sigma_{R_p} = \sqrt{N \sigma_p^2} = \sigma_p \sqrt{N}.$$
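The mean and standard deviation of the summed revenue follow from the per-record values by scaling with $N$ and $\sqrt{N}$. The sketch below applies the two formulas directly. It assumes the $N$ terms are independent and share the same $\sigma_p$, which is what the $\sqrt{N}$ scaling requires. The function name and inputs are hypothetical.

```python
import math


def total_revenue_stats(mu_p, sigma_p, n):
    """Return (mu_Rp, sigma_Rp) = (n * mu_p, sigma_p * sqrt(n))."""
    mu_total = n * mu_p
    sigma_total = sigma_p * math.sqrt(n)
    return mu_total, sigma_total


# Illustrative per-record mean and standard deviation with n = 4 records.
mu_total, sigma_total = total_revenue_stats(mu_p=1.0, sigma_p=2.0, n=4)
```

Because the standard deviation grows only with $\sqrt{N}$ while the mean grows with $N$, the relative uncertainty of the total shrinks as more records are summed.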
n
pt
$\hat{y} \in [0, 1]$
pt
pc
pc
$\lambda = 0.1$, $\alpha = 5000$
x d
d d = 1
d = 1
$\hat{y}_{SGBDT}$, $\hat{y}_{NN}$, $\beta$, $\gamma$
$$\hat{y}_{ens} = \beta \hat{y}_{SGBDT} + \gamma \hat{y}_{NN}.$$
$\beta$, $\gamma$, $\beta = 0.3$
$\gamma = 0.7$
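The ensemble prediction is a fixed weighted sum of the two model outputs. A minimal sketch with the weights $\beta = 0.3$ and $\gamma = 0.7$ given above follows. The function name and the sample predictions are assumptions for the example.

```python
def ensemble_predict(y_sgbdt, y_nn, beta=0.3, gamma=0.7):
    """Combine per-sample predictions as beta * y_sgbdt + gamma * y_nn."""
    return [beta * a + gamma * b for a, b in zip(y_sgbdt, y_nn)]


# Hypothetical predictions from the two models for three samples.
y_sgbdt = [1.0, 0.0, 0.4]
y_nn = [0.0, 1.0, 0.6]
y_ens = ensemble_predict(y_sgbdt, y_nn)
```

Because the weights sum to one, the ensemble output stays within the range spanned by the two individual predictions.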
n n =
[Figure: AUC score (axis from 0.5 to 1) for RLT Ensemble, RLT SGBDT regression, RLT Random Forest regression, RLT NN regression, ELT NN regression, RLT PCA, and ELT Binary NN classifier.]
[Figure: RLT NN regression predicted revenue distribution and RLT RF and SGBDT predicted revenue, shown with Charged Revenue and Target Revenue.]
t
t
t t
www.kth.se
TRITA-EECS-EX-2020:929