Curl and Twist in Paperboard Manufa turing
GIANANTONIOBORTOLIN
ISRNKTH/OPTSYST/DA-05/02-SE
ISBN91-7178-202-8
SE-10044Sto kholm
SWEDEN
AkademiskavhandlingsommedtillståndavKunglTekniskahögskolanframlägges
tilloentliggranskningför avläggandeavteknologiedoktorsexamen fredagenden
2:a de ember 2005 klo kan 10.00 i Sal F3, Lindstedtsvägen 26, Kungl Tekniska
Högskolan,Sto kholm.
astheyare ertain, theydo notrefer toreality.
Abstra t
The ontents ofthis thesis anbe dividedinto two mainparts. Therst oneisthe
development of an identi ation methodology for the modelling of omplex industrial
pro esses. These ond oneis the appli ationof this methodology to the url and twist
problem.
Themainpurposebehindtheproposedmethodologyisto provideas hemati
plan-ning,togetherwithsomesuggestedtools,when onfrontedwiththe hallengeofbuilding
a omplexmodelofanindustrialpro ess. Parti ularattentionhasbeenpla edtooutlier
dete tionanddataanalysiswhenbuildingamodelfromold,orhistori al,pro essdata.
Anotheraspe t arefully handled inthe proposed methodology is the identiability
analysis. Infa t,itisrather ommoninpro essmodellingthatthemodelstru tureturns
outtobe weakly identiable. Consequently, theproblemof variable sele tion istreated
atlengthinthisthesis,andanewalgorithmforvariablesele tionbasedonregularization
hasbeenproposedand ompared withsomeofthe lassi almethods,yieldingpromising
results.
These ondpartofthethesisisaboutthedevelopmentofa urlpredi tor. Curlisthe
tenden yof paperofassuming a urved shapeand is observedmainly duringhumidity
hanges. Curl inpaper and in paperboard is a long-standing problem be ause it may
seriously ae t the pro essing of the paper. Unfortunately, url annot be measured
online,butonlyinthelaboratoryafterthatanentiretambourhasbeenprodu ed. The
maingoalofthisproje tisthentodevelopamodelfor urlandtwist,andeventuallyto
implementitasanon-linepredi tortobeusedbytheoperatorsandpro essengineersas
atoolforde ision/ ontrol.
Theapproa hwe usedto ta kle this problemis basedon grey-box modelling. The
reasonsfor su hanapproa histhatthephysi al pro essisvery omplexandnonlinear.
The inuen e of some inputs is not entirely understood, and besides it depends on a
numberofunknownparametersandunmodelled/unmesurabledisturban es.
Simulationsonrealdatashowagoodagreementwiththemeasurement,parti ularly
for MD and CD url, and hen e webelieve that the modelhas anusable a ura yfor
beingimplementedasanon-linepredi tor.
Keywords: Grey-box modelling,system identi ation, outlierdete tion, variable
Afterthis longand hallengingexperien e asaPhD student, thelist ofpeopleto
whomI'm indebted hasgrownimmensely. Inthefollowingfew lines I'llmakean
attempttosummarizesu h alist.
First,Iwishtoexpressmysin eregratitudetomysupervisorAndersLindquist
for giving me the opportunity of joining the Opt&Sys group. His dedi ation to
resear hand hiswidely spreadresear hnetwork greatly ontributedin produ ing
andmaintaininga reativeandstimulatingworkenvironment.
Se ondly, but not less importantly, I would like to thank my supervisor
Per-OlofGutmanforhisguidan e,helpfulnessanden ouragements. I'm ertainlymu h
indebted to himfor guiding me throughthe jungle ofthe modelling pro ess. His
omments and (in)famous jokes have taught me to take a look at things from
dierent, and less obvious perspe tives. His passion for ontrol and resear h has
strongly motivated me to arry on my work, espe ially during the most di ult
moments. Thanksalot.
Ialsowishtoa knowledgeAssiDomänFrövifornan ingtheproje t. In
parti -ular,IamgratefultoBengtNilssonandallthePro essControlgroupforproviding
me with a friendly and helpful environment to work. The paper manufa turing
pro essisanextremely omplexworld,andIwouldliketothankStefanEri sson,
MartinHanimannand JanSjögrenfortheirfriendship and help in timesofneed.
AbighuggoestoLarsJonhedandhisfamilyforhelpingmetoorientateinmany
di ultsituations.
The Optimization and System Theory groupat KTH has also beenan
inter-esting andpleasantenvironmentfor work and dis ussion. I wish to thankall the
olleagues and friends at the division. In parti ular, I annot thank enough my
former o emate Ryozo, and my next-door olleagues Christelle and Vanna for
theirsupport, andsin erefriendship.
Then, I would also like to mention all the Swedish-Italian friends that have
shared with me a lot of pleasant and enjoyable moments during these years in
Sweden. Mar o,theAr ovitos, thevolleyballmates, and alltheothershavebeen
extraordinary friends. My warmest thank you is for Barbara (and Nerone, of
ourse)whohasbe omemoreandmoreanessentialpartofmylife.
TherearealsomanypeopleinItaly,friendsandrelatives,whodon't aremu h
theyhavebeenonmysideduringallthistime. Thanksalottoallofyou. Among
them,aspe ialmentionisforPaoloLutman.
Clearly,therearealotofpersons(butmanyarenotmentionedhereforsakeof
brevity)that have ontributed in manydierentwaysto mywork. Thisthesis is
dedi atedtoallofthem.
Sto kholm,De ember2005
A knowledgment vii
Tableof Contents ix
Glossary xi
1 Introdu tion 1
1.1 Thepaperboardmodellingproje t . . . 1
1.2 Curlandtwistmodelling. . . 2
1.3 Overviewofthethesis . . . 3
2 DimensionalStability in Paperboard 7 2.1 Introdu tion. . . 7
2.2 Pro essdes ription . . . 8
2.3 Dimensional stability: someba kground . . . 10
2.4 Modelling . . . 12
2.5 Summaryand on lusions . . . 16
3 OutlierDete tion and ModelDiagnosti 19 3.1 Introdu tion. . . 19
3.2 Outliersandleveragepoints . . . 21
3.3 Robustmodeldiagnosti . . . 23
3.4 ForwardSear h . . . 25
3.5 Proposedmethod . . . 26
3.6 TheHampellter. . . 34
3.7 Summaryand on lusions . . . 39
4 The VariableSele tion Problem 41 4.1 Introdu tion. . . 41
4.2 Subset sele tion. . . 43
4.3 Evaluation riteria . . . 47
4.4 Regularizationmethods . . . 56
4.6 Estimatingtheoptimal
λ
. . . 774.7 Examples . . . 78
4.8 Simulationresults . . . 82
4.9 Extensiontogeneralmodels . . . 88
4.10 Summaryand on lusions . . . 91
5 Identi ationMethodology 93 5.1 Introdu tion. . . 93
5.2 Thestepsofthemethodology . . . 94
5.3 CurlModel . . . 98
5.4 Summaryand on lusions . . . 110
6 Grey-box Modellingof Disturban es 111 6.1 Introdu tion. . . 111
6.2 NonlinearKalmanltering. . . 112
6.3 Modellingofsto hasti disturban es . . . 113
6.4 Simulations . . . 115
6.5 Dis ussion . . . 117
6.6 Summaryand on lusions . . . 122
7 On-line Predi tion 123 7.1 Introdu tion. . . 123
7.2 Predi toralgorithm. . . 123
7.3 A virtualpredi tor . . . 125
7.4 Summaryanddis ussion . . . 126
8 Con lusions 129 8.1 Summary . . . 129
8.2 Con lusions . . . 131
8.3 Dire tions forfuture work . . . 131
Somenotationsmayhaveadierentmeaninglo ally. Notations Latin letters
e
Noise ve torH
Moisture ontent [-℄K
Curvature( url) [m−1
℄I
IdentitymatrixN (·, ·)
NormaldistributionN
Totalnumberofobservationsn
TotalnumberofinputvariablesR
2
Squared oe ientofmultiple orrelation
s
0
Internalstress [N/m
2
℄
U(·, ·)
UniformdistributionX
N
× n
regressormatrixX
k
k
th inputve tor(N
-dimensional olumnve tor)x
k
k
th observation(n
-dimensionalrowve tor)X
T
Transpose
Y
Outputve torˆ
Y
Predi ted outputve torZ
Dataset[Y, X]
Z
I
, Z
V
Identi ation andvalidationdatasetsGreek letters
β
Hygroexpansivity oe ient [-℄ǫ
Strain [-℄ε
Residualve torθ
0
Trueparameterve tor
ˆ
θ
Ordinaryleastsquareestimateofθ
0
ˆ
θ
R
Ridgeregressionestimateof
θ
0
ˆ
θ
I
ISRRestimateof
θ
Greek letters
λ
Regularizationparameterσ
Noise standarddeviationφ
Fiberorientationangle [rad℄Subs ripts
i
Layerindexj
Planedire tionOperators
arg min
θ
f (θ)
Minimizing argumentoff (θ)
diag(
L
) Diagonal matrix, with the ve torL
in the main diagonalE[x]
Expe tedvalueofarandomvariablex
std(x
) Standarddeviationoftherandomvariablex
Var(x
) Varian eoftherandomvariablex
Abbreviations
AIC Akaikeinformation riterion
ARMAX Autoregressive moving average model
stru turewithexogenousinputs
ARX Autoregressivemodelstru turewith
exoge-nousinputs
BE Ba kwardeliminationmethod
BIC Bayesianinformation riterion
CD Crossma hinedire tion
CDC Closestdistan e to enter
EKF ExtendedKalmanlter
FPE Finalpredi tionerror
GCV GeneralizedCrossValidation riterion
GRR Generalizedridgeregressionmethod
KM5 Paperma hine atFrövi
ISRR Iterativelys aledridgeregression
LTS Leasttrimmedsquares
MAD Medianabsolutedeviation
MCD Minimum ovarian edeterminant
MD Ma hine dire tion
ME Modelerror
OLS Ordinaryleastsquaresmethod
RR Ridgeregression
RSS Residualsumofsquares
Introdu tion
1.1 The paperboard modelling proje t
Thereisanin reasinguseof omputersimulationsinindustrytooptimizeprodu ts,
toredu eprodu tdevelopments ostsandtimebydesignoptimization,andtotrain
operators. Thepaperindustry,traditionallyquite onservative,hasalsostartedin
thelast 10-15years to use omputersimulation in asigni antlyin reasing way.
Therst omputermodelling and ontrol attempts were initiated already in the
early 1960s, see [108℄ for example. However, only in the last 10 years, with the
newavailablete hnologies,more omplexmodelsand ontrolstrategieshavebeen
onsidered.
Thework onmodelling theboard manufa turingpro ess at AssiDomän Frövi
was startedin 1992. Themain aimof theproje tisto ndmathemati almodels
of the pro ess that an explain the variations of important quality variables of
theboard whi h annotbemeasuredon-line. Hen e,insteadofdire t sensors, we
would like to employ mathemati al models, alled also soft sensors, to estimate
thesequalityvariablesfromotherquantitiesthataremeasuredon-line. Eventually,
we would also like to obtain simulation/predi tion tools that an ontribute to
higherprodu tqualityande ien y in severalways. For instan e, they ouldbe
usedtotestnew ontrolstrategies,togetabetterunderstandingofthepro ess,as
anoperator trainingtool,and also forpro ess optimization and model predi tive
ontrol.
Unfortunately,itiswellknownthatthesystemwearedealingwithisa
ompli- atednon-linearandtimevaryingmulti-inputmulti-outputpro ess. Italso ontains
alargeamountofun ertaintyand isae ted by unknownand immeasurable
dis-turban es. Therefore,itis learthatthetaskofdevelopingsuitablemodelsisvery
di ult.
Atthemomenttwodierentmodellingapproa hesarebeingtriedatAssiDomän
Frövi. The rst one onsists in a Dymola-basedmodel whi h is being developed
Whereasin thepastitwas onsideredsu ienttosimulatesubsystemsseparately,
the urrenttrendistosimulatein reasingly omplexphysi alsystems omposedof
subsystemsfrommultipledomains. Uptonow,allthewetendpartandthedrying
se tionofthepaperma hineare ompletedwithsatisfa toryresults,see[11℄,[55℄,
[32℄and[57℄.
These ondmodellingapproa histheoneusedin[76℄byPetterssonwhoderived
agrey-boxmodel ofthebending stinesspropertiesoftheboard. Themodelwas
implementedin theFrövimill informationsystemas abending stiness predi tor
andsimulatorforthepro essengineersandoperators. ItwasprogrammedinVisual
Basi and hasbeenrunning formore thanve years givingpredi tionsto within
thelaboratorymeasurementa ura y.
An approa h similarto Pettersson'sbendingstiness predi tor, isused inthis
thesistomodel urlandtwist(i.e. out-of-planehygroinstability)ofthepaperboard.
The proje t started about 5 years ago and the work of the rst three years is
summarizedin my Li entiateThesis, [12℄. Inthelast twoyearsof work,the url
modelhasbeenimprovedandparti ularattentionhasbeengiventotwoimportant
aspe tsofthemodellingpro ess: outlierdete tionandvariablesele tion.
1.2 Curl and twist modelling
Curlisthe tenden y ofpaperof assuminga urvedshapeand isobservedmainly
duringhumidity hanges. Curlinpaperandinpaperboardisalong-standing
prob-lembe ausethedeparturefrom theatmayseriouslyae tthepro essingof the
paper. Ex essive url redu es the quality of papersin e it is the most ommon
auseforsheetfeedingproblems. Forthisreason, ustomersimposestri tlimitson
the urland thereforeit is be oming in reasinglyimportant toprodu ea
arton-boardwithlimited url. Duetothee onomi signi an eofthe urlproblem,mu h
resear hhasbeenperformedtondpulp ompositionsandpro essingstrategiesto
eliminateor redu e url,see[87℄,[68℄. However, urlanddimensionalstability 1
in
generalare ae tedby alargenumberof omplex,inter-relatedfa tors, in luding
dierentdryingratesonthetwosidesof theboard, relativevariationin humidity
withinthesheetandme hani alstresseswithinthebres,andhen etheyarevery
di ultto analyzeandunderstand.
AtAssiDomänFrövi(Sweden),ex essive url ausesthelossofseveralhundreds
oftons ofboard everyyear, and itis one of themost di ultqualityvariable to
handle forthe operators and pro ess engineers. The ontinuously moving web in
the paper ma hine is rolled up on big rolls (tambours) and approximatelyevery
50 minutes the operators start an automati hange of tambours. Samples for
laboratorytest are available from thelast partof ea h roll, andabout20quality
variablesareanalyzedin thelaboratoryat dierentpositionsin the ross-ma hine
dire tion.Paper urlisoneofthequalityvariablesonlymeasuredinthelaboratory.
1
After onditioningfor at leastveminutes, url measurementsare takenwith an
opti aldevi e, andthen themedian values,for afew CD positions, are presented
to the operators. In hapter 2, however, we present an investigation on the url
measurementsshowingthat theydo notprovideagood hara terization url and
twist.
Be auseofits omplexity,andbe auseofthefa tthat url anonlybemeasured
afteranentire tambourhasbeenprodu ed, its ontrol isverydi ultand ostly.
Althoughout-of-spe i ationboardmaybere-pulpedandre- y led,bad url auses
awasteofplanttime,rawmaterialandenergy.
Inthiss enario,itis learthata urlpredi tor/simulatorwouldbeaparti ularly
usefultoolfortheoperatorsandpro essengineerstohelpthemtode idethebest
settingsand/or ontrola tionto take.
Despitethequitebroadliteratureon urlandtwist,onlyonemodelsuitablefor
industrialpro essandusefulforpro ess ontrolwasfound. ThemodelbyEdwards
etal.,isbasedonarti ialneuralnetworksandisdes ribedin[26℄. Theirmodelis
forone-plypaper,whilein our asewe onsiderfour-plyboardwhi hin generalis
moreae tedby urlandalsomore ompli atedtodealwith. Wedidnot ompare
theirapproa hwithoursbe ausetheirmodelwasnotavailable,anditwastoomu h
timedemandingtobuild aneuralnetworksmodelforthefour-plyboardbasedon
theirpaper.
Thus,itseemedtimelyto ta kletheproblemwith dierentadvan edmethods
of modelling. Theproblem onsists in the fa t that the pro ess is very omplex,
anda detailed modelling of thevarious phenomena involvedwould easily leadto
amodel too ompli atedto beatta kedwith ordinaryidenti ation methods. In
addition,thepoorqualityofthe urlmeasurementsmakesthesituationevenmore
di ult.
Thetaskisthustondareasonabletradeobetweenadetaileddes riptionof
thephenomenathatare onsideredmostimportantfordimensionalstabilityanda
modelthatlumpstogetherotherphenomenaasdisturban es.Ofthemanymethods
available,grey boxmodelling seemed one of themost suitableand promising. In
ontrasttoneural network modelling, thegreyboxmodellingapproa his
hara -terizedbyin ludingthemostimportantphysi sexpli itlyandlumpingtogetherthe
lessimportantandunknownphenomenaintobla kbox omponentsandsto hasti
disturban es. Itwaspreviouslyusedinsimilarproblemsyieldinggoodresults,[76℄,
and[31℄. Theinitialattempttomodel urlandtwistwithagreyboxmodelseemed
promising,see[12℄and[13℄,and hen ewede idedtogoonwithit.
1.3 Overview of the thesis
Theoverallthesis an bedividedinto twoparts. Therstone isthedevelopment
Themainpurposebehindtheproposed methodologyistoprovideas hemati
planning,together withsomesuggestedtools,when onfrontedwiththe hallenge
ofbuildinga omplexmodelofanindustrialpro ess. Parti ularattentionhasbeen
pla edto outlier dete tionand data analysis when building amodel from old, or
histori al,pro essdata. Theanalysisof thiskindofdata isdangerousbe ausein
generalwemiss the ne essarya priori knowledge. For instan e, there may have
beensomesensorfailures,orsomekindofmalfun tionswhi hweignore,sothatthe
faultyobservations annotbeeasilydistinguishedfromtheregularobservations.
Anotheraspe t arefully handled in theproposed methodologyis the
identi-abilityanalysis. Infa t,it israther ommonin pro essmodellingthat themodel
stru tureturnsouttobeweaklyidentiable. For instan e,whenexperiments
an-notbe arriedoutandonlyhistori aldataareavailableforidenti ation,itisvery
likely that the data set is not informative enough to safely estimate the model.
Thus,theproblemofvariablesele tionistreatedatlengthinthiswork,andanew
algorithmforvariablesele tionbasedonregularizationisproposedand ompared
withsomeofthe lassi almethods, yieldingpromisingresults.
These ond partisthe developmentofasuitablemodelstru ture for url and
twist. Themainpurposeofthispartofthethesisistondoutifgrey-boxmodelling
and estimation of url and twist in multi-ply paperboard is feasible, and in su h
aseto build anon-line predi tor that an be used by theoperators and pro ess
engineersas atool forde ision/support. Eventually, themodelwill beused fora
moregeneral model predi tive ontrol strategy. As ase ondaryobje tive, it will
serveas atool for a better understanding of the pro ess. Most of this part was
arried out during the rst three years of my PhD and are summarized in my
Li entiate Thesis, [12℄ and in the paper[13℄. However,the model presentedhere
diersfrom[12℄be auseithasbeenimprovedandsplitintothreedierentmodels,
oneforea hkindof url,seese tion2.4.
Contribution of the thesis
Themain ontributionsofthisthesisare:
+ Agrey-boxmodelof urlandtwistwasdeveloped. Themodelwasestimated
from pro ess data, and the validation showed ageneral agreement between
predi tionand measurement.
+ A areful study of dierent outlier dete tion methods was arried out. In
parti ular, a new approa h is proposed to deal with nonlinear models and
largedatasets.
+ A new variable sele tion method, the ISRR algorithm, was derived. Some
theoreti alaspe tsofthenewmethodwereinvestigatedandalargesimulation
+ A new identi ation methodology for the modelling of omplex industrial
pro essesis proposedandapplied tothe urlandtwistproblem.
Themain limitationsoftheresultsofthisthesisare:
- Although thegeneralbehaviorof the url andtwistmodels are satisfa tory
andtheaveragea ura yofthepredi tionsisgood,duringsometimeperiods
thepredi tionerrorsaretoolargetobeusablefromtheoperators. Webelieve
that someimportantinputsare stillmissing,andsomeofthesub-models of
the urlmodelaretoosimpleandshouldbeimproved.
- Themodi ationsproposed fortheforwardsear hmethod foroutlier
dete -tionshouldbeanalyzedwithamoretheoreti alapproa h.
- A moregeneral studyof the ISRRmethod should be arriedout: in whi h
situationsitispreferabletoothermethodsandinwhi honesitisnot
re om-mendable.
The results in this thesis have been reported in the following published and
submittedworks:
•
BortolinG.(2002). OnModellingandEstimationofCurlandTwistin Multi-Ply Paperboard, Li entiate Thesis, TRITA-MAT-02-OS-14, Department ofMathemati s,KTHSto kholm, Sweden.
•
BortolinG.,GutmanP.-O.,NilsonB.(2005)ModellingofCurlinMulti-Play Paperboard. Journalof Pro ess Control (inprint).•
Bortolin G., Gutman P.-O. (2005) A new regularization-based method for variablesele tion. Submitted toAutomati a.Outline of the thesis
Thedisposition ofthisthesisisas follows:
Chapter 2: In this hapter an overviewof thepapermanufa turing pro ess is
givenrst. Then,the urlandtwistmodel stru tureispresented.
Chapter 3: This haptergivesashort overviewof someoutlierdete tion
meth-ods. Then, the method developed in this work is presented together with some
examples.
Chapter 4: First,adetailed overviewof variablesele tionmethodsand ev
alua-tion riteria is given. Then, the new variable sele tionmethod developed in this
workisdis ussed. Theoreti alaspe tsofthenewpro edureareinvestigatedforthe
Chapter 5: Thenewidenti ation methodologyispresentedandapplied tothe
problemof urlandtwist. As aresult,asatisfa torydeterministi modelfor url
andtwistisa hieved.
Chapter 6: In this hapter, the deterministi model of url and twist is
om-plemented with a sub-model (an extended Kalman lter) that a ounts for the
disturban esand un ertainties. The model isthen simulated, and theresults are
omparedwiththemeasuredvalues.
Chapter7: Somepra ti alaspe tsoftheimplementationaredis ussed,andthen
2weeksofprodu tionaresimulatedandanalyzedindetail.
Chapter8: Ashortsummaryofthethesis, on lusionsanddire tionsforfuture
Dimensional Stability in Paperboard
2.1 Introdu tion
AssiDomänABCartonBoardinFrövi,Sweden,produ es350.000tonsofboardper
yearon a7meterswide and250meters longpaperma hine. Ashort des ription
of the pulp and paper pro ess an be found in the next se tion and for a more
ompletedes riptionthereadermayreferto[94℄.
Intheplanttherearemorethan700lo al ontrollerswithinanintegrateddigital
ontrol system (DCS). The primary physi al quality variables, su h as moisture,
basisweight,and thi kness, aremeasuredon-line bytraversingsensors lo atedin
thenalpartofthepaperma hine. Allavailablepro essinformationfrom on-line
measurementsand laboratorytests are stored in a database,within anadvan ed
millinformationsystem, alled Info. All thesedataarepresentedto theoperators
throughpro essowdiagrams,proles, andhistori al trends. Whenrequiredthe
operators ontrolthepro essbyadjustingthesetpointsofthelo al ontrollersin
theDCS.
Unfortunately,mostofthequalityvariablesrelevantforthe ustomersof arton
board,su has url,bendingstiness,printabilityfa tors,et .,arenotavailable
on-line, but only from laboratory tests. The ontinuously moving web in the paper
ma hineisrolleduponbigrolls(tambours). Approximatelyevery50minutesthe
operatorsstartanautomati hangeoftambours. Samplesforlaboratorytest are
availablefromthelastpartofea hroll,andsome20qualityvariablesareanalyzed
inthelaboratoryatdierentpositionsinthe ross-ma hinedire tion.
Then, the operators have to ompare the lab values with the re ommended
nominalvaluesandtaketheappropriatede isionsand ontrola tionsa ordingto
theparti ularsettingsat thatmoment.
Paper url is one of the quality variables that are onlymeasured in the
labo-ratory. Ex essive urlis along-standingproblem, itredu es thequalityofpaper,
anditae tsthe ustomersasitisthemost ommon auseforsheetfeeding
veryimportant to beable to ontrol it. However,be ause of its omplexity, and
be ause of the fa t that url an only be measured after an entire tambour has
beenprodu ed,its ontrolisverydi ultand ostly. Althoughout-of-spe i ation
board may bere-pulped and re- y led, bad url wastesplant time, raw material
andenergy. Thus, ouraim is to develop areliablemodel for url and twistthat
an beimplementedas an on-line predi tor and used forde ision-support by the
operators.
Inse tion 2.2,ashort des riptionof thepapermanufa turingpro essisgiven
and in se tion 2.3, url, twist and their measurement are dis ussed. Se tion 2.4
des ribes the semi-physi al model of url and twist. The model is based on my
Li entiateThesis[12℄. Thismodelhasbeenbuiltafterthat thepro essdatahave
been polished with a nonlinear leaning lter to get rid of as many outliers as
possible. This leaning lter is part of the proposed identi ation methodology
developed to deal with outliers and erroneousdata in amore e ient way. This
newmethodologyisdes ribedin hapter5.
2.2 Pro ess des ription
Thepaperboard manufa turing pro ess isan extremely ompli atedpro ess.
As-siDomänboard is omposed of 4layers,or plies. The twomiddle layersare
om-posedofbulktogetlightweightandbendingstiness,andthetoplayeris omposed
ofblea hed pulp to havegood printability. A s heme of themain se tions of the
millisshowningure2.1. Themainrawmaterialusedinpaperprodu tionispulp,
whi h onsists of extremely ne ellulose bres. There are various typesof pulp
qualitiesusedinthemillandmostofthemareprodu edinthesulphateplantand
storedinintermediatesilos. Fromthesilos,thepulpowsintothebeatingor
ren-ingpart,wherethebresaresubje ttome hani ala tiontodeveloptheiroptimal
papermakingpropertieswithrespe ttotheprodu tbeingmade. Then,thepulpis
sentto themixing tanks,one for ea hlayer. The operatorsde idetheproportion
of ea h pulp quality in ea h layer, and also the basis weight of ea h layer. The
operatorsalso ontrol theowof additives,likee.g. star h. Followingdilutionby
waterto a hievebelow1%ofbre on entration,thesto kissentthroughs reens
and leanerstoremoveforeignmaterials. Theoversizedmaterialsareremovedby
thes reens. Heavymaterials,orparti leswithaspe i gravitygreaterthanthat
ofthe bres, are removed withthe entrifugal leaners. Then the pulpis sentto
thepaperma hine, alled KM5. Therstpartof thepaperboardma hinearethe
headboxes,oneforea hlayer,whi hspreadthepulponthewires. Thepulpsto k
is dis harged to thewire at uniform dilution, thi kness and velo ity through the
headbox. Thewireisa ontinuousbeltof wovenmaterial. Asthe sto k andwire
pro eed, wateris removedrst bygravity,next bylow pressuregeneratedon the
ba ksideoftherollsandfoilsandnallybysu tiondevi eslo atedunderthewire.
0L[LQJ
7DQNV
3XOS
)UDFWLRQV
+HDGER[HV
DQG:LUHV
3UHVVDQG
'U\LQJ
&DOHQGHULQJ
&RDWLQJ
/LJKW
&DOHQGHULQJ
5HHOLQJ
Figure2.1: S hemati overviewofthe artonboardpro ess
table to the headbox to re eive more sto kand ontinue to form the ontinuous
web of paper. Showers below the forming table lean the wire on its return to
theheadbox. The thi kness of the sto kjet is determined by the opening of the
headbox sli e,while thevelo ityis provided bytheheadboxpressure. These two
parametersdeterminethebasisweightandstronglyinuen etheorientationofthe
bersinthelayer.
Afterthewirese tion,thefourpliesarepressedtogetherinthepressse tionin
orderto form the basi paperboard. Thepresses are hardrollsthat squeeze the
papergentlyto removethewaterandbring theberstogether topromote
bond-ing. Thewebleavesthepressse tionand ispassedaroundaseriesof steam-lled
drums, alleddrying ylinders. Thedryingtakespla esbypassingtheboardover
alargenumber of drying ylinders, whose temperatures arebetween 110-130
◦
C.
Thetemperatureofa ylindersurfa eisfun tion ofthesteampressurewithin the
ylinders. Theyaredividedinto 10groups,andthepressureofea hgroupis
on-trolled independently. After the drying se tion, the board is pressed together by
twohotmetalrolls alled alenders. Thispro essis alled alenderingandisused
inorder toa hievesmoothsurfa es,andto ontrolthi kness. Thetopsideof the
board is then overedwith a white oatingin order to provide it with asuitable
printingsurfa e. Finally,theboardisgivenitsnalnishbylight alenderingand
thenitisreeledontambours.
Manyoftheinputvariablesneededforthemodellingaremeasured ontinuously
on-line, and stored in the Info omputersystem as 1-minuteaverages, 12-minute
averages, hourly and daily averages. Curl and twist measurementsare taken,
in-stead,approximatelyevery50minutes. AllthemeasurementsarestoredintheInfo
systemtogetherwiththe orrespondingtimestamp.
Abiglimitationthatwehadtofa ewasthefa tthatitwasnotpossibletomake
experimentsonthepro essbe auseofthehigh osts. So,wehadtousedatafrom
normaloperation. Inthemillmanydierentboardqualitiesare produ ed,andall
themeasurementsarestored forseveral years. Fortunately,we foundoutthat by
identied. However, some of the model parameters have a very large standard
deviation,andoneofthepossible ausesmaybethela kofinputex itation.
2.3 Dimensional stability: some ba kground
The on eptofdimensionalstabilityreferstothe hara teristi of papertoretain
itsdimensionsinalldire tionsunderthestressofprodu tionandadverse hanges
inhumidity. Hen e,itindi atesawidevarietyofpropertiesrelatedtotheatness
of paper su h as hygroexpansion, shrinkage, url, wrinkle, et . Parti ularly, we
are interestedin url in paper and paperboard whi h is dened as thedeparture
from the at form. A urved surfa e may be hara terized by three urvature
omponents. Thegeneralexpressionsforthe urvaturesatapointare ompli ated
fun tionsoftheout-of-planedispla ementandtheposition oordinates. However,
fordeformedsurfa eswithsmallandmoderatedee tionswe anapproximatethe
out-of-planedispla ement,
w(x, y)
,ofthesheetas follows,[60℄:w(x, y) =
−
1
2
K
x
x
2
−
1
2
K
y
y
2
−
1
2
K
xy
xy
[m℄ (2.1)where
w(x, y)
istheout-of-planedispla ementofthesheet,x
istheMa hine Dire -tion(MD),andy
theCrossDire tion (CD).Thethree urvaturesarethendened bythefollowingexpressions:K
x
=
−
∂
2
w
∂x
2
,
K
y
=
−
∂
2
w
∂y
2
,
K
xy
=
−2
∂
2
w
∂x∂y
[
m−1
]
(2.2)Seegure2.2foranoverviewofthe onventions.
For a ylindri al surfa e, one an rotate the
xy
- oordinate system so that, in thenew oordinatesystem,thetwist omponent,K
TW
,iszeroandonlyoneofthe
url omponents,
K
MDor
K
CD, is nonzero. The url omponents
K
MDand
K
CDorrespondto theinverse ofthe radiusof urvaturein ea hdire tion, andin this
way they an be given physi al meaning. The larger the K-value is in absolute
magnitude,thestrongeristhe url.
Thebasi ausesfor urlandtwistaretheswellingand ontra tionofthebers
resultingfrom moisturevariations. Changes in moisture ontent ofboards whi h
areasymmetri inthethi knessdire tion auseout-of-planevariationsoftheboard
[29℄,[72℄.
The evaluation and analysis of url are further ompli ated bythe di ulties
in hara terizingit. Infa t,thereis notastandardmethodto measure url. The
valuesofthe url omponentsareusuallyspe i totheshapeandsizeofthesample
usedtomeasurethem. Forexample,narrowpaperstripsoftenhave urvaturethat
does not mat h the surfa e of the sheet from whi h the strips were ut. The
samplesownweightmayalso ausebending. Hen e, urlmustbemeasuredinsu h
measurements an only bemade if the spe imensare large enough. Inaddition,
in smallsamples and narrowstripsone mayend upin observing the o kling 1
of
paperinsteadof url.
At AssiDomän thelaboratory
K
MD
>0
K
CD
>0 K
TW
>0
y=CD
x=MD
w
Figure2.2: Curl onventions
surementsare ondu ted in the
follow-ingway. Anopti alinstrumentthat an
measure5testsamplesatthesametime
is used. 16 sheets of dimension 50x50
m are ut from the last part of ea h
tambour. In general, url is measured
in3ofthesesheets,i.e. intheedgesand
in themiddle. From a50x50sheet ve
square test pie es of dimension 10x10
marerandomly utandkeptina
on-trolled environment(50 % RH, 25
◦
C)
for about 5 minutes. Then, the url
omponents of the ve test pie es are
measuredbytheopti aldevi e,andthe
median valuesarepresentedtothe
op-erators.
Themiddleofthewebisknowntohavelessvariationsinthephysi alproperties,
i.e. thelayersare morehomogeneousinthe middlethanin the edges. Therefore,
rst,weareonly onsideringthe urlinthemiddleoftheweb. Inthenearfuture
wehopethatthisworkwillbeextendedtotheedgesoftheweb,similarlytowhat
it was done with Petterson's bending stiness predi tor. However, even though
we are onsidering a relatively small area, 50x50 m, in the enter of the web,
the url variations areverylarge. In aninvestigation we omputed thestandard
deviations of 1500 measurements from the middle of the web, a ording to the
pro eduredes ribedinthepreviousparagraph,whereonemeasurement omprises
5testpie es. Namely,supposethatea hmeasurement onsistsof5values
x
1
, ..., x
5
forea hkindof url. Then,we omputetheestimatedstandarddeviationforea hkindof urla ordingto
ˆ
σ =
v
u
u
t
1
n
− 1
n
X
i=1
(x
i
− ¯x)
2
where
n = 5
andx
¯
isthe arithmeti mean ofthe 5values. Theaveragestandard deviation,σ
¯
,ofseveralstandarddeviationestimates,ˆ
σ
1
, ..., ˆ
σ
m
isdened as¯
σ =
v
u
u
t
1
m
m
X
j=1
ˆ
σ
2
j
About4months ofprodu tionwerein ludedin orderto in ludeseveral board
qualities. Note,however,that ea hstandarddeviationestimateis omputedusing
1
dry-MD[m
−1
℄ CD [m−1
℄ Twist[m−1
℄Typi alrangevalues [-0.5,0.5℄ [-2,2℄ [-1,1℄
Standarddeviation 0.16 0.34 0.39
Table2.1: Typi alvaluesandaveragestandarddeviationsofthe urlmeasurements.
only5samples,andhen eitisratherun ertain.
Theinvestigationgavethe resultsshownin table 2.1where theaverageof the
standarddeviationsfortheCD url,MD urlandtwist,respe tively,arereported.
Clearly, table 2.1 showsthat the available url measurements are a poor way of
hara terizing url andtwist,andwemayexpe t that themodel will be strongly
ae tedbythelowqualityofthemeasurements. Webelievethatthereasonforso
largeun ertainties in thelab measurementis that the test samplesare relatively
smalland, hen e,ae tedbylo alvariationsofstru turalproperties,su hasber
orientation,density,hygroexpansivity,et .
Curlisaseriousproblemfor paperboardprodu ers,andalot ofinvestigations
were arriedoutto analyzethe auses of url, andto nd outmethodsto redu e
it. Some useful general referen es are [29℄, [34℄, [71℄ and [104℄. Finite elements
modellingof urlwas studiedin[17℄,[72℄,and[60℄.
Theonlyapproa hto urlpredi tionsimilartoourwork,wasfoundin[26℄where
a neural network model was developed for predi tion of paper url. The model
predi tionsare in orporatedinto a probabilisti framework,where the predi tion
isthemean,andthereis anasso iatedvarian e. Theoperatorsarethenprovided
witha predi tedvalue, and the orresponding onden e level. A ording to the
authors the model showed satisfa tory results, even though the predi tions were
notreliablein some ir umstan es,likeforinstan e the asesofhigh urlvalues.
Inour asewehaveamore ompli atedsituation,sin ewewanttopredi t url
formulti-plypaperboardwhi hin generalismu hmoresensitiveto urlproblems
thansingle-plypaper.
2.4 Modelling
As we stated in the introdu tion, the purpose of the model is to have a
de i-sion/supporttoolfortheoperatorsandpro essengineers,andeventuallyformodel
predi tive ontrol. In this work, we use grey-box modelling as des ribed in [10℄,
and[77℄. Thereasonsfor hoosing thisapproa his that betweenthemany
avail-ablemethodsformodelling,greyboxmodellingappearedone ofthemostsuitable
forthissituation. Infa t,thephysi al pro essisvery omplexand nonlinear,the
inuen e of someinputs is notentirelyunderstood, and besides, it depends on a
ad-estimationof omplex industrial pro esseswhere the priorknowledgeis available
butnot omplete, likein our ase,seee.g. [78℄, [31℄.
Thebasi ideaofgrey-boxmodellingisthefollowing.First,buildasemi-physi al
model by using the pro ess knowledge and by following as mu h as possible the
physi alprin iples. Thismodelshouldprovideagoodtradeobetweenadetailed
des riptionofthepro essandareasonablysimplemodel. Then,inase ondstage,
themodel is omplementedwith an extended Kalmanlter in order to take are
oftheunmeasurabledisturban es,theun-modellednonlinearities,andthepro ess
noisebyusingalltheavailable measurements.
The se ond part, i.e. the introdu tion of the Kalman lter, is dis ussed in
hapter6. Inthis hapter thesemi-physi almodelispresented.
Thesemi-physi almodelwasdevelopeda ordingtotheidenti ation
method-ology presented in hapter 5. The model presented in the next subse tion has
beenobtainedafterapreliminarypolishingofthedatafromoutliersbyanonlinear
leaning lter, alled Hampel lter. The semi-physi al model will be estimated
usingarobustestimationmethod,a ordingtotheidenti ationmethodology,see
se tion5.3. Unfortunately,itturned outthat themodelismarginallyidentiable.
Non-identiability,orweakidentiability,ofmodelparametersmaybe ausednot
onlyby a redundant, or almost redundant subset of parameters,but also by the
way howthe inputs of the system are generated. In our ase, in fa t, the input
variables are strongly orrelated, and theyare not su iently ex ited to identify
alltheparameters. The typi alapproa h forsu haproblem isvariable sele tion,
i.e. the sele tionofanoptimal subset ofinputvariables. This issueis dis ussed
in hapter 4,whereafewmethodsforvariablesele tionarepresented.
Semi-Physi al model
Oursemi-physi al model is theresult of alongtrialand errorpro ess. From the
pro ess knowledge, previoussimilar worksand basi physi al prin iples aninitial
model was built. Then, alarge orrelation analysis was arried out between the
urlandthepro essinputs,andbetweentheresidualsofthemodelandthepro ess
inputs. Several inputs were tested and their ee ts on the predi ted valueswere
arefullystudied. Onthebasisofthisanalysis,newinputvariableswerein ludedin
themodel,andotherswereinsteadex luded. Thispro edurewasthenrepeatedfor
dierenttimeperiods, anddierent board qualitiesuntilamodelwhi hprodu ed
satisfa torypredi tionerrorswasgenerated.
Eventually 3 stru turally identi al nonlinear models were omputed, one for
ea h kind of url, i.e. CD, MD and twist. This is also the main dieren e with
the Li entiate model, [12℄, where an unique Multi-Input-Multi-Output nonlinear
model was instead onsidered. Even though the 3 kinds of url are physi ally
related,webelievethattheirrelationshipistoo omplextobea uratelydes ribed
Symbol Meaning
B
h,i
Bendingofheadboxlips ofthei
th layer,[m℄.C
h
Chemi alsandadditives,[Kg/Ton℄.C
o
Total oating,[Kg/m2
℄.
C
p
Pressuresofthe alenderingse tions,[Pa℄.D
p
Steampressuresofthedrying ylinders,[Pa℄.F
Pulpfra tions,aftermixing tanks,[%℄.J
w,i
Headboxjettowirespeeddieren eofthei
thlayer,[m/min℄.P
l
Se ondpressload,[Pa℄.R
e
Reningenergies,[KWh/ton℄.S
p
Speedoftheboardin thema hine,[m/min℄.S
t
Steamaddedtothetoplayerbefore alendering,[Kg/m2
℄.
T
s,i
Tensilestiness indexofthei
thlayer,[Nm/Kg℄.T
w
Tensionoftheboardwebindierentpositions,[%℄.w
b,i
Basisweightofthei
thlayer, [Kg/m2
℄
W
l
WateraddedtothebottomlayerbyLAS,[Kg/m2
℄.
W
v
WateraddedbyVIB,[Kg/m2
℄.
z
i
Thi knessofthei
thlayer,[m℄.ρ
i
Densityofthei
thlayer,[Kg/m3
℄.
Table2.2: Final sele tionof inputvariables forthemodel. The subs ript
i
refers tothelayer,e.g. top,middleandbottomlayer.Theonlyrelevantdynami sinthepro essarethoseofthepulp on entrations
inthe mixingtanks. However,we onsider theestimated on entrationsafter the
tankswhi haretakenfromPetterson,[76℄and[78℄,andhen ethe urlpredi toris
astati model. Thenalsele tionofinputvariablesislistedintable2.2. Notethat
theestimatesofthi knesses,densities,basisweightsandtensilestinessindexesof
thelayersare omputedbythebendingstinesspredi tor,see[76℄,runningonline
inFrövi. TheLASisadevi esituatedafterthe oatingse tionusedto ontrol url.
Itapplies to thebottomply athin layerof water whi h is thendried byinfrared
dryers. TheVIBisadevi efortheCDprolemoisture ontrolsituatedintherst
partofthedryingse tion.
Thenalmodel anbedividedintodierentsub-models,asshowningure2.3.
Thestru ture of thesemi-physi al modelis the sameas in the Li entiate Thesis,
see[12℄foramoredetaileddes ription.
Thesemi-physi almodelis des ribedhereverybriey, see [12℄ for amore
de-taileddes ription.
Themain partis theme hani almodel, whi his des ribedin [12℄. It isbased
onCarlsson,[17℄,where lassi allaminatetheoryisusedtomodel urlandtwistof
multi-plyboard. Despiteitssimpli ity, Carlsson'smodelagreessatisfa torilywith
2.4. MODELLING 15
Mechanical
Model
K
Strain
Model
Fiber
Orientation
Hygroexp.
Coefficients
Moisture
Model
Thickness
Densities
Elastic Moduli
Figure2.3: Stru tureofthemodel.
homogeneouselasti medium. The me hani al model takes as inputs the strains
of the three layers, their thi kness, the densities and their elasti moduli. The
latter ones are al ulated as part of Petterson's bending stiness predi tor [76℄.
Theoutputsoftheme hani almodelarethe urvatures[m
−1
℄intheMD,CD,and
diagonaldire tions.
Intherest ofthis se tionthesubs ripts
i
andj
refertothelayer(top,middle, and bottom) and to the dire tion (MD,CD and Twist), respe tively. Thesuper-s ripts
f
,β
,H
,ands
aresymboli andrefertotheir orrespondingsubmodels,e.g.f
standsfortheberorientationsubmodel,β
forthehygroexpansivitysubmodel,H
forthemoisturesubmodel,ands
forthestrainsubmodel.Theberorientationangle,
φ
,isgeneratedbythehydrodynami alfor esinthe wiresduringthelayerformation,see[69℄,andin thisworkitismodelledas:φ
i
= arctan(U
i
f
θ
i
f
)
[rad℄,
andU
f
i
= [J
w,i
, B
h,i
]
(2.1)where
θ
f
i
areparametersto be identied, and theinput variablesJ
w
andB
h
are denedintable 2.2.In general, the hygroexpansivity oe ients,
β
, are determined by stru tural and me hani al properties of the bers, and of theber network,see [87℄, [104℄,[103℄,and[105℄. Adetailedmodellingofsu hpropertieswouldbevery omplex,and
beyondthepurposeofthis work. Wewantinsteadtoemployasimplemodel, and
verify itsreliabilitywith respe t to url predi tions. Hen e, asimplequasi-linear
modelwas used:
β
i
= J
φ
U
i
β
θ
β
i
[
−],
andU
β
i
= [w
b,i
, R
e
, C
h
]
(2.2)where
J
φ
isa oordinaterotationmatrixduetotheberorientationangle,φ
.θ
β
i
are parameterstobeestimatedandU
β
aretheinputsthatwerefoundmost orrelated
withthehygroexpansivitydened intable2.2.
environment after that the paperhas been onditioned in an oven for at least 5
minutes. Alsoin this asealinearmodelwasemployed:
H
i
= U
i
H
θ
i
H
[-℄,
andU
H
i
= [F, R
e
, ρ
i
]
(2.3)where
H
i
is the moisture after onditioning,θ
H
i
are parameters to be estimated andU
H
are the inputs orrelated with the nal moisture, see table 2.2 for the
denition. Themoisturemodel was identied independentlyfrom the url model
byusingthetotalmoisturemeasurementattheendofthepaperma hinea ording
tothefollowingmodel:
H
tot
=
X
i
w
b,i
w
b,tot
H
i
wherew
b,tot
=
X
i
w
b,i
(2.4)where
H
tot
isthemeasuredtotalmoisture,andw
b,tot
isthetotalbasisweight. Thestrainwasmodeledas alinearfun tionofthemoisture ontent:ǫ
ij
= β
ij
H
i
[-℄ (2.5)Internalstressesaregeneratedinsidethe artonboardduringthepaper-making
pro ess. Thedevelopment ofsu hstressesis avery omplexpro ess,notentirely
understood. A ordingtotheliterature,[67℄,[86℄,[101℄,[107℄,itdependsonmany
fa torssu hasdryingspeed,pulp omposition,dryingrestraints,et . Analogously
to the hygroexpansivity oe ients, a simple model was employed and its
a u-ra y was then orroborated by the reliability of url predi tions. So, as a rst
approximation,theinternalstresses
s
ij
,weremodelledas:s
ij
= J
φ
(s
0
ij
+ a
ij
exp(θ
i
s
U
i
s
))
[N/m2
℄ (2.6)
where
U
s
= [D
p
/S
p
, P
l
, S
t
, C
p
, C
o
, W
l
, W
v
]
where
J
φ
is a oordinaterotationmatrixdue totheberorientationangle,φ
.θ
s
i
,s
0
ij
anda
ij
areparametersto beidentied.U
s
are ontrolinputs that ontribute
tothedryingstressesinlayer
i
2,see table2.2.
2.5 Summary and on lusions
Curl is an important property for paperboard produ ers, but unfortunately it is
alsooneofthemostdi ultqualityvariablesto ontrolforpro ess engineersand
operators. In this hapter wehavegivenashort des riptionof thepapermaking
pro ess,inse tion2.2,andaformaldenitionof urlandtwist,in se tion2.3.
2
Inamore ompli atedmodel
s
0
Thepapermakingpro essisanextremely omplex,nonlinearandtimevarying
pro essae tedbymanyunmeasurabledisturban es. Curlandtwistare
phenom-enawhi harestronglyrelatedtothemanufa turing onditionssu hasthepressure
andpressingtime,dryingtemperatureanddryingtime,et . The ausesfor urlare
well do umented in the literature, but there are still many open questions under
investigation. Inaddition,itisverydi ultto hara terizein ane ientwaythe
magnitudeof url and twist. In fa t, there is not even agreement on astandard
method of measurementsin e, forinstan e, dierent sample sizes or shapesmay
giveriseto dierentresults. Themethod used at AssiDomänFrövi onsidersve
10x10 m samples. Their urland twist omponents are measuredby an opti al
devi e, and the median values are presented to the operators. Despite the fa t
that thesamples aretaken from asmall areain the enter ofthe paperweb, the
standarddeviationofthemeasurementsisveryhigh,seetable2.1. Webelievethat
this la k of a suitable hara terization of url is one of the main reasons behind
the di ulties in developing an ee tive ontrol strategy and also in building a
satisfa torymodel.
Then, in se tion 2.4 the semi-physi al model has been presented. After an
initialpolishingofthedata,threesatisfa torymodelswerebuilt,i.e. MD url,CD
urland twist. Themodels were satisfa tory in thesense that the residualswere
a eptablysmall, whiteand statisti ally independent from all theinput variables
tested.
Thethreemodelsarestru turallyidenti al, but learlytheyhavedierent
val-uesof parameterswhi h haveto beestimated from the pro ess data. Theinitial
identi ationwasnot ompletelysu essfulbe ausetheidentiabilityanalysis,see
se tion5.3,showedthatthemodelswereweaklyidentiable. Eitherthedatasetis
notinformativeenoughto provide areliableestimationof the parameters,or the
modelstru tureisredundant. Thetypi alapproa hin su h asituationis variable
sele tion, i.e. the sele tion of asubsets of inputswith the best predi tive
apa -ity. Thisisavery omplexproblem,anddespiteahugeamountofliteraturedeals
with it, there are still many unsolved problems. For instan e, dierent authors
usedierent pro edures whi h often yield dierent results, and thus it is un lear
whi h method is the mostsuitable in a givensituation. In hapter 4a review of
somemethodsand riteriaforvariablesele tionisgiven,andanewmethod alled
IterativelyS aledRidgeRegression,ISRR,isintrodu ed. Variablesele tionispart
ofthe identi ationmethodology presentedin se tion 5.2 andapplied to the url
Outlier Dete tion and Model
Diagnosti
3.1 Introdu tion
Regressionanalysisandmoregeneralstatisti almodellingareveryimportanttools
whi hare ommonlyappliedin manydierentelds. Theresultsoftheregression
analysis depend on the statisti al properties of all thedata. However,if some of
theobservationsinthedatasetdiersomehowfromthemajorityofthedata,the
tting pro ess may make unre ognizablethe dieren e by tting all the data in
thesamestraightja ket,[3℄. Inotherwords,leastsquaresmethodstrytomakeall
theresidualssmall,andhen enon-regularobservationsmaybeindistinguishable
fromregularobservations.
The main purpose of this hapter is to des ribe two methods to ope with
outlierswhi hwillbeusedintheproposedidenti ationmethodologypresentedin
hapter5. First,arobustestimation method, alled forwardsear h, isillustrated,
and afew modi ationsto its original algorithm are proposed. Then, an outlier
identiermethod, alledHampellter,isdis ussedandsomemodi ationsarealso
suggested.
Theforwardsear hmethodisusedtodete tandanalyzeobservationsthatdier
fromthebulkofthedata. Thesemaybeindividualobservationswhi haretheresult
ofsensor errors,re ording, or transmission errors,or ex eptionalphenomena. Or
theymay be asubsetof thedata, whi h issystemati allydierentfrom therest.
Itisthenimportanttoidentifysu hasubsetandanalyzeit,alsotoevaluateifthe
modelissuitableforitspurposeorifitshouldbeimproved.
TheHampellterbelongstoadierent lassofoutlierdete tionmethodssin e
itdoesnotusethemodeltoidentify erroneousdata. Thus, thislter an beused
foraninitialpolishingofthedataset.
Weareinterestedin dete tingoutliersbe auseofthedangerposedbythemin
tototallyosetaLSestimator,[80℄.
Outlierso ur very frequentlyin real data and they go often unnoti ed sin e
most of the data are pro essed by omputers without areful inspe tion. If the
dataset ontainsasingleoutlier,theproblemofidentifyingsu hanobservationis
relativelysimple. However,ifadata set ontainsmorethan one outlier,whi h is
likely to bethe ase in mostreal datasets, then the problem of identifying su h
observationsbe omes mu h moredi ult. Thisis due to masking and swamping
ee ts, see e.g. [80℄. Masking o urs whenan outlier goesundete ted be ause of
thepresen e ofothers, usually adja ent, outlyingobservations. Swamping is said
to o urwhen the non-regularobservations ausenon-outliers to be identied as
outliers.
Oneoftherst approa hesto outliersdete tionwas basedonregression
diag-nosti s, see[7℄ and[19℄. Unfortunatelyit wasproven withseveral examples,[80℄,
thatthesemethodsmayfailinthepresen eofmultipleoutliers. Thus,morerobust
methodshavebeensuggested,andsomeofthemareillustratedin se tion3.3. The
mostbasi onesaretheuseofthemedianandMedianAbsoluteDeviation 1
(MAD)
insteadofthemeanandstandarddeviation.
Adierentapproa hisbasedonrobustregressionmethodssu hastheleast
me-dianofsquares,LMS,ortheleasttrimmedsquares,LTS,introdu edbyRousseeuw,
[80℄. Thebasi ideainthesemethodsisthattode idewhetheragivenobservation
isabadpointweneedameasureofwhetherthedatasetregressionstru tureholds
forthedatapointinquestion. Ifanestimate
θ
ˆ
R
isrelativelyunae tedbyoutliers, thenthe residualsε = Y
− X ˆθ
R
should be ausefultoolto ag observationsthat donot follow theregressionstru ture. A similar ideais used also in the forwardsear halgorithmwhi histherobustestimationmethodthat weintendtoanalyze
moreindetail.
Oneofthemainproblemsween ounteredwithrobustregressionmethodsisthe
fa tthattheyare omputationallyexpensive. Theirusagewithnonlinearmodelsis
thus somehowlimitedby thelarge omputationalburden involved. Hen e,inthis
work we present someessentialimprovements to theforwardsear halgorithm to
shorten its omputational time. Themain modi ationregardstheidenti ation
ofanoutlier-freesubsetofobservationsne essaryto startthealgorithm. Inorder
to determineit, we apply therobust methods des ribedin se tion3.3 to identify
the
N/2
most onsistentsamples.Inaddition,wenoti edthatinsomesituations,likeforthe urlandtwistmodel
forinstan e,itisnotstraightforwardtodeterminethenumberofoutliersinadata
setwhenusingtheforwardsear hmethod. Hen e,asimplepro edureto estimate
thenumberofoutliersinthese situationsisalsoproposed.
Inse tion3.2,ageneraldes riptionofoutliersandleveragepointsisgiven.
Se -tion3.3presentssomemethodsforrobustmodeldiagnosti . Se tion3.4illustrates
theforwardsear h method, and in se tion 3.5 weintrodu e ourmodi ationsto
1
TheMADs aleestimateofasequen e
x
k
isdenedasS=1.4826median{|x
k
−
median{x
thealgorithm. Inse tion3.6theHampellterisdis ussed,andsomemodi ations
aretested inasimulationstudy.
3.2 Outliers and leverage points
Theliteratureonoutliersisvast,andageneraldenition is,e.g.,giveninBarnett
and Lewis, [5℄: an observation (or subset of observations) whi h appears to be
in onsistentwiththe remainder ofthat setof data.
We andistinguishbetweenseveraltypesofoutliers,andherewewillfollowthe
terminologyinRousseeuwandvanZomeren,[81℄. Anobservation
(x
i
, y
i
)
whosex
i
isoutlyingwith respe t themajorityof theX
-pointsis alled aleverage point or high leverage point. Theterm leverage omesfrom theme hani s,be ausesu hapointpullstheLSsolutiontowardit.
An observation
(x
i
, y
i
)
that deviatesfrom theregressionstru ture followedby themajorityofthedata,i.e. themodel,isinstead alledaregressionoutlier.Combiningthenotionofleveragepointsandregressionoutlier,weseethatfour
typesofobservationsmayo urinregressiondata:
1. regularobservations withinternal
x
i
andwell-ttingy
i
.2. verti aloutliers withinternal
x
i
andnon-ttingy
i
.3. good leverage points withoutlying
x
i
andwell-ttingy
i
.4. bad leveragepoints withoutlying
x
i
andnon-ttingy
i
.Abadleveragepointisthen usuallyanerroneousobservation, whileagood
lever-agepointis anobservationwhi h followsthemodelstru ture, butsomehowits
x
i
isoutlyingfromthemajorityoftheotherregressorvalues. Clearly,goodleveragepointsmaybeveryusefulintheidenti ationofasuitablemodel,whilebad
lever-agepointsare dangerousbe ausethey may giveriseto totally wrongresults. To
distinguishbetweengoodandbadleveragepointswehaveto onsider
y
i
aswellasx
i
andwealso needtoknowthepatternfollowedbythemajorityofthedata,i.e. themodel.Letus onsiderthefollowinglinearmodel:
Y = Xθ
0
+ e
(3.1)where
Y
isaN
-ve toroftheresponsevariable,X
isaN
× n
matrixofregressors,θ
0
is a
n
-ve torof unknown parameters to be estimated ande
is aN
-ve tor of independentrandomvariableswithzeromeanandunknownvarian eσ
2
thedatawiththeleastsquaremethodwehavethefollowingresults:
ˆ
θ
=
(X
T
X)
−1
X
T
Y
(3.2) Var(ˆ
θ) =
σ
ˆ
2
(X
T
X)
−1
(3.3)ˆ
Y
=
X ˆ
θ = P Y
(3.4)P
=
X(X
T
X)
−1
X
T
(3.5) Var( ˆ
Y ) =
σ
ˆ
2
P
(3.6)ε =
Y
− ˆ
Y
(3.7) Var(ε) =
σ
ˆ
2
(I
− P )
(3.8)ˆ
σ
2
=
ε
T
ε
N
− n
(3.9)where
ε
are theresidualsor predi tionerrorsandP
isthepredi tion matrix,[19℄. Itis possibleto show thatone outlier is enoughtoprodu e atotally wrongosetto a least squares estimator, i.e. the quantity
kˆθ − θ
0
k
may be ome arbitrarily largebe ause ofone outlier. On theother hand,it ispossibleto showthat thereexistrobustestimatorsthat an handleseveraloutliers. Inorderto formalizethis
aspe t we introdu e the on ept of breakdown point, [80℄. Consider the data set
Z =
{X, Y }
and letT
be aregression estimator, e.g.T (Z) = ˆ
θ
. Now, onsider allpossible orruptedsamplesZ
′
that areobtainedbyrepla ing
m
of theoriginal datapointsbyarbitraryvalues,in luding∞
. Letusdenotewithbias(m; T, Z
)the maximumbias ausedbysu h ontamination:bias
(m; T, Z) =
supZ
′
kT (Z
′
)
− T (Z)k
(3.10)Ifbias(
m; T, Z
)isinnitewesaythattheestimatorbreaksdown. Thebreakdown pointis thendened as thesmallestfra tionof ontamination that an ause theestimator
T
tobreakdown:ε
∗
N
(T, Z) = min
m
{
m
N
:
bias(m; T, Z)
isinnite}
(3.11) Looselyspeakingthebreakdownpointis thesmallestfra tionofgrosserrorsthatan arry theestimator overall bounds. The breakdown pointof the arithmeti
meanis
1/N
, viz. just one outlier an destroy themean. Themedian, instead, hasabreakdownpointof1/2
,i.e. it anhandleuptoN/2
− 1
outliersinthedata whi histhemaximumpossible.Robustness,however,doesnot omeforfree. Robustestimatorshave,ingeneral,
alowerstatisti ale ien y thanleastsquaresestimators 2
.
One of the early methods of dete ting model failures is examining the least
squaresresiduals,
ε
i
,or preferablyas aledversionofthem. However,ithasbeen 2Supposethat
θ
ˆ
andθ
˜
aretwodierentestimates.Thee ien yofθ
ˆ
relativetoθ
˜
isdenedas beingtheratiooftheirvarian es,i.e. e(θ, ˜
ˆ
θ
)=Var(θ
ˆ
)/Var(θ
˜
). Wesaythatanunbiasedestimatenoti ed,[19℄,that examination ofresidualsaloneis notsu ientfordete ting
in-uentialobservationsandoutliers,espe ially those orrespondingto high-leverage
points whi h in general tend to have small residuals. The predi tion matrix
P
, denedin(3.5),playsanimportantroleindeterminingthettedvalues,themag-nitudeoftheresidualsandtheirvarian estru ture,see(3.4),(3.6)and(3.8).
There-fore,ithasbeensuggestedtoexamineboththeresidualsandthediagonalelement
of
P
. Alternatively, insteadof onsidering thediagonalelementsofP
,the Maha-lanobisdistan e anbeexamined. Thisdistan emeasureshowfarisanobservationx
i
fromthe enteroftheX
-values. TheMahalanobisdistan e orrespondingtothei
thobservationx
i
isdenedasM D
2
i
= (x
i
− t)(X
T
X)
−1
(x
i
− t)
T
(3.12) wherex
i
isthei
throwoftheX
matrixandt
isthearithmeti meanoftheN
rows ofX
. ThepurposeoftheM D
i
istopointtoobservationswhoseexplanatorypart, i.e.x
i
,liesfarfromthatofthebulkofthedata.Themain riti ismagainstthe lassi aldiagnosti methods isthat theysuer
fromthemaskingee t,[80℄, by whi h multipleoutliers donotne essarilyhavea
largediagnosti measure. This isdue to thefa t that thesediagnosti s arebased
onstatisti s,namelymeanand ovarian e,whi harenotrobust: asmall lusterof
outlierswillattra tthemeanand inatethe ovarian ein itsdire tion.
3.3 Robust model diagnosti
Themain reasonfor the failure of the lassi aldiagnosti s, likethe Mahalanobis
distan e for instan e, is the fa t that they are ae ted by the observations that
theyaresupposed toidentify.
In order to avoid the maskingee t, Rousseeuw, [81℄, proposed thefollowing
robustdistan e:
RD
i
2
= (x
i
− T (X))C(X)
−1
(x
i
− T (X))
T
(3.13) whereX
is theset of data,x
i
is thei
th rowofX
,T (X)
is arobustestimator of lo ation(mean), and C(X) is a robustestimatorof s atter (varian e). Thus, theauthorssuggestedtoexaminethe residuals,
ε
i
,together withtherobustdistan e,RD
i
,insteadofthe lassi alMahalanobisdistan e,orthediagonalelementsofP
. In our proposed method for outlier dete tion, see se tion 3.5, we employ therobustdistan es
RD
i
to sort theobservations. Hen e,some robustestimators of lo ationands atter, ne essaryto omputeRD
i
a ordingto(3.13), arepresented next.Several robustestimators oflo ation,
T (X)
, and s atter,C(X)
, are available. Mostofthemarebasedontheideaofndingthemost onsistenth
observationand estimateT (X)
andC(X)
from them. Often the parameterh
isset toN/2
sin eofregularor non-regular observations. Note, however,that these robustmethods
impli itlyassumethatthe
X
variables havean(almost)gaussiandistribution. If, instead, they have a dierent distribution, for instan e they are lustered, thenthesemethodsmaynotbeverye ient.
ApopularmethodsuggestedbyRousseeuwandVanDriessen,[83℄,isthe
mini-mum ovarian edeterminant (MCD) estimator. TheMCDobje tiveistondthe
h
observationswhose lassi al ovarian ematrixhasthesmallestdeterminant,withN/2
≤ h ≤ N
. Basi ally, themethod tries tond theh
observations with losest explanatoryparts,x
i
. Moreformally,wedenetheMCDestimatorasthesolution ofthefollowingproblem. Sele tasubsetofobservationsJ
ofdimensionh
soasto minimizethedeterminantofW
whereW =
X
j∈J
(x
j
− t
J
)
T
(x
j
− t
J
)
where
x
j
isthej
throwofX,andt
J
=
1
h
X
j∈J
x
j
TheMCDestimateoflo ationisthentheaverageofthese
h
points,i.e.T (X) = t
J
, whereasthe MCD estimateof s ale is their ovarian ematrix,C = h
−1
W
. It is
possibletoprovethatwhen
h = (N + n+ 1)/2
theMCDestimatorhasabreakdown pointofnearly0.50,i.e. it anhandleuptoN/2
−1
outliers. Inaddition,Rousseeuw andVanDriessenproposedafastalgorithm to omputeanapproximate estimateoftheMCDwhi h an opewithverylargesamplesize.
In [28℄ two similar approa hes were proposed. The rst is alled Resampling
by Half Means (RHM) whi his aresamplingte hniqueaimedat determining the
distribution of observation ve tor lengths. Clearly, the outliers are expe ted to
dominatethe highestregion of the distribution. The algorithm an bedes ribed
as:
⊲
Takearandomhalf sample,i.e. asubsetofN/2measurements.⊲
Computetherobustdistan eRD
i
,(3.13),forallobservationsusingthe arith-meti meant
and ovarian eC
,estimatedfromthehalf sample.⊲
Markthetop(1
− c) × 100%
extremerobustdistan esasdubious.⊲
Repeatseveraltimes. Itisre ommendedthatthedataberesampledatleast2N
times.⊲
Finally omputet
andC
using onlythenon-dubiouspoints.Inthesamepaperanotheralgorithm, alledSmallestHalf-Volume(SHV), was
proposedtondthemost onsistent
N/2
samples. InSHVtheregressormatrixX
isrstautos aled,i.e. ea h olumnofX
is enteredarounditsmeanandnormalized withrespe titsstandarddeviation. Then,thefollowingstepsaretaken:⊲
Computeallinter-sampledistan esintheN
×N
matrixD
,i.e.D
i,j
=
kx
i
− x
j
k
wherex
i
andx
j
arethei
thandj
th rowsoftheX
matrix. Inother words,D
i,j
isthedistan ebetweenthei
thandj
thobservations.⊲
Sortea h olumnofD
inas endingorderandsumthesmallestN/2
distan es.⊲
The olumnwiththesmallestsumfortherstN/2
distan esisdetermined. These are theN/2
observations that are losestto ea h other in the multi-variateX
-spa e.⊲
Estimatet
andC
from thesemeasurements.InrobustSHV,robusts alingisappliedtothematrix
X
rst,i.e. themedian and MADareusedinsteadofmeanandstandarddeviation.Finally,analgorithm alledClosestDistan etoCenter (CDC)wasproposedin
[20℄. Thebasi ideais verysimilarto SHV.Namely, SHVidentiesthemost
on-sistentobservationsby al ulatingthedistan ebetweenea hpairofobservations.
CDCidentiesthemost onsistentobservationsby al ulatingthedistan eofea h
observationfromthe enter. InCDCthematrix
X
isrstautos aledandthenthe distan eis omputed forea h observation. TheEu lideandistan e, CDC2
,or the maximumnormdistan e, CDCm
, anbeused. Forarobustimplementationofthe algorithm,robusts aling issuggested.TheCDCalgorithmisthemethodthatwefoundmostrobustinoursimulations,
andthusitistheonethatwere ommendtouseinourproposedpro edureinse tion
3.5.
3.4 Forward Sear h
The forward sear h method, [3℄, is a pro edure to dete t outliers and for model
diagnosti . Themethodwasintrodu edbyHadi,[41℄, forthedete tionofoutliers
from at using approximatelyhalf of the observations. Dierent versions of the
methodaredes ribedin[42℄ and[2℄.
Themain ideabehindthe method isthat if thetrueparametersof themodel
wereknown,thenitwouldbeeasytodete toutlierswhi hwouldhavelarge
resid-uals. Hen e,themethodstartsbyusingrobustmethodstotasmall(supposedly)
outlier freesubset ofobservations. This subsetis then in rementedso that it
re-mains outlier free as long as possible. The onsequen e is that the outliers will