
Multivariate modeling improves quality grading of sawn timber

Master's Thesis in Engineering Physics

Department of Physics

Umeå University

Charlotta Wendel

Multivariate modeling improves quality grading of sawn timber
Charlotta Wendel
Copyright 2019

Supervisors:
Johan Oja, Norra Timber
Anton Eriksson, Department of Physics, Umeå University

Examiner:
Magnus Andersson, Department of Physics, Umeå University

Master's Thesis in Engineering Physics, 30 credits


Abstract


Glossary

CT

Short for the CT Log computed tomography scanner, which is installed before the saw in the saw line. It produces a 3D reconstruction of the log by x-ray scanning of its internal features, e.g. knots, and provides the optimal cutting solution based on the final quality and resale value of the sawn timber.

FinScan

Short for FinScan Boardmaster, an optical system that automatically determines the final quality grades of the boards. The system scans each side of each board using cameras that register appearance features, e.g. knots, stain, and splits.

Quality grade

The grade of a board is given with respect to appearance features. The grade is based on the rules presented in the Nordic Timber Grading Rules.

Virtual grading

The grading process in the CT that determines the quality grades of the virtually cut boards based on the features identified by the 3D-reconstruction.

Final grading

The final evaluation of the sawn timber occurring in FinScan. Determines the quality grades of the final boards.

FinScan model

A PLS-model using FinScan knot data as feature variables and the FinScan quality grades as a reference variable. Outputs are predicted FinScan grades, i.e. the grades that the model predicts FinScan would have given for each board.

CT model

A PLS-model using CT knot data as feature variables and the FinScan quality grades as a reference variable. Outputs are the predicted CT quality grades, i.e. the grades that the model predicts the CT would have given for each virtual board.

Final CT model

A PLS-model using the CT knot data as feature variables and the predicted FinScan quality grades as the reference variable. The outputs are the predicted CT quality grades.


Contents

1 Introduction
1.1 Problem formulation
1.2 Aim and Goals
1.3 Related Work
1.4 Limitations

2 Theory
2.1 The Sawmill
2.1.1 The Sawmill Process
2.1.2 Sorting Rules and Qualities
2.1.3 Knot Features
2.2 Statistical Methods for Classification
2.2.1 PLS-DA
2.2.2 Evaluation of Models

3 Methods and Materials
3.1 Approach
3.2 Software
3.3 The data
3.3.1 Dataset 1
3.3.2 Dataset 2
3.4 Variables
3.5 Prediction models
3.5.1 Two-step Approach
3.5.2 Models
3.5.3 Training and Testing
3.6 Experimental setup

4 Results
4.1 Dataset 1
4.2 Dataset 2

5 Discussion
5.1 The Results
5.2 Sources of Errors
5.3 Future Work

6 Conclusions

References


1

Introduction

Forests constitute a very valuable source of renewable, natural resources. Humans have always used trees for many different applications, e.g. house and construction building. Over the years, wood processing industries have developed to exploit the forests, and the initially small, simple sawmills have become large-scale production sawmills with quite complex processes.

In a world where advanced technology is developing rapidly, sawmills too can benefit by improving their production process. At a sawmill in northern Sweden, this is exactly what has been done. In 2018, a CT Log computed tomography scanner[9] (CT) was installed in the saw line. The CT provides a 3D reconstruction of the internal features, e.g. knots, of each log. From simulations, the cutting solution is optimized to increase the final resale value of the sawn timber. At the end of the saw line, after the logs have been cut, dried, and sorted, the boards are scanned and evaluated using a FinScan Boardmaster[4]. At this step, the final quality grade of each board is assessed automatically based on the appearance of the board.

For the process to benefit the most from the CT, it is essential that the virtual and final quality grades of a board agree. Today the agreement between the CT and FinScan is approximately 55%, which indicates that there is potential for improvement. In both systems, the CT and FinScan, the grading process is based on the Nordic Timber Grading Rules. The rules regard different appearance factors, e.g. knots, wane, and splits, and for each quality grade they stipulate what is allowed and what is not. An identified problem with the grading process is the existence of measurement errors. The systems use measured data obtained when scanning boards or logs, which means that a knot can be measured slightly too big or too small, and this can affect the resulting grade. Hence, the accuracy of the grading is potentially decreased by measurement errors.

1.1

Problem formulation


A problem arises when the virtual grade does not agree with the final grade. Inconsistent grading can have multiple causes: one is that the logs are not cut according to the solution chosen in the CT, another is differences in grading. In order to improve the process even further, the agreement between the virtual and final grading has to be improved.

One source of error in the systems is measurement errors occurring when scanning the logs and boards. Such measurement errors can result in a knot being measured too big or too small. Since the grading rules are so-called hard rules, a board whose knots are close to the threshold value between two grades can therefore be given an incorrect grade. Replacing the rule-based grading with a statistical grading model could make the process less sensitive to measurement errors.

1.2

Aim and Goals

This project aims at increasing the agreement between the virtual and the final grading of sawn boards in the sawmill. The hypothesis is that by predicting grades using multivariate models, the agreement between the systems should increase. The goal of the project is to develop a statistical model based on knot features for quality grading of boards to improve the agreement between the grades assessed in the CT and FinScan.

1.3

Related Work


whether the statistical modeling approach can improve the agreement between the virtual and final grades for boards or not.

1.4

Limitations


2

Theory

This section provides some essential theory for understanding the sawmill production process and the statistical methods that constitute the core of this project.

2.1

The Sawmill

Wood processing takes place in sawmills, where logs are cut into boards and planks. Today, sawmills have developed into large-scale industries that export their final products all over the world. In this report, the focus is on the production of boards.

2.1.1 The Sawmill Process

Each sawmill has its own particular process for cutting logs into boards. The main difference is the choice of machines, such as saws and evaluation systems, but the overall production flow is similar. The process described here is the one at the specific sawmill where the project has been carried out. A simple picture showing the flow of production is shown in Figure 1.

Figure 1 – A schematic picture showing the production flow at the sawmill. Starting at the top left and ending at the bottom right.


price of the log is set and the sawmill company officially buys the timber. After the initial inspection, the logs are sorted based on dimension and type (root, middle or top logs). This will simplify the cutting process by ensuring that each batch of logs put in the saw line contains similar logs.

The process of cutting the logs begins when a batch of logs is loaded onto the saw line. Initially, each log has the bark removed before proceeding to the CT. The CT scans every log using high-resolution, high-speed X-ray tomography, which provides a virtual 3D image of the internal features of the log. In order to determine the optimal cutting solution, the log is virtually cut in various ways. The cut boards are virtually evaluated and graded, after which the optimal cutting solution is chosen based on which solution gives the highest final resale value.[9] For the log to be cut according to the solution determined in the CT, it has to be positioned correctly on the conveyor. This is realized using a so-called turning machine, which turns the log into the desired position before it proceeds to the saws.

After the logs have been cut, the resulting boards are stacked and transported to the drying kilns. The sawn timber is left to dry until a certain level of moisture is reached. Then it is time for the final evaluation of the boards. At this particular sawmill, a FinScan Boardmaster is used to optically scan and evaluate each board based on appearance, and the system automatically decides the quality grade. After the final evaluation, the boards are sorted and packed, ready to be transported to the customer.

2.1.2 Sorting Rules and Qualities


Grading Rules. This particular sawmill uses its own grading system that is based on the Nordic Timber Grading Rules.

There are basically four different grades that a board can be given. Over the years the notation has changed, but they are often referred to as O/S, V, VI, and VII. However, in this report they will be named A, B, C, and D, in agreement with the terminology in the Nordic Timber Grading Rules. Grade A is the best quality, B the next best, and C the third best quality a board can be given. Grade D is given to boards that do not qualify for any of the other grades but are still considered boards.[10]

There are several different parameters to consider in the evaluation of a board, for example the number of knots, knot size, type of knot, splits, wane, fungal attack, and deformations.[12] To determine what grade a board belongs to, each side of the board is inspected and the one meter (in length) of the side with the lowest quality is considered. This means that the worst meter of the board sets the quality grade for the whole board. The grading rules presented in the Nordic Timber Grading Rules stipulate threshold values for each defect and what is allowed for a certain grade, e.g. the allowed number of knots and the allowed maximum size of the knots. The threshold values are specified separately for sound and dead knots.[10]
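To make the worst-meter principle concrete, the following is a minimal sketch of how such a rule-based evaluation could be implemented, considering only knot count and knot size. The per-grade limits, the sliding-window step, and the knot record layout are hypothetical illustrations, not values taken from the Nordic Timber Grading Rules.

```python
# Minimal sketch of a worst-meter evaluation. The per-grade limits and the knot
# record layout (position along the board and size, both in mm) are hypothetical
# illustrations, not values from the Nordic Timber Grading Rules.

# Hypothetical limits per grade: (max no. of knots in a 1 m window, max knot size in mm)
GRADE_LIMITS = {"A": (4, 30), "B": (6, 45), "C": (8, 60)}
GRADE_ORDER = "ABCD"

def grade_window(knots):
    """Best grade whose limits the knots in one 1 m window satisfy, else D."""
    max_size = max((size for _, size in knots), default=0)
    for grade, (max_count, max_size_allowed) in GRADE_LIMITS.items():
        if len(knots) <= max_count and max_size <= max_size_allowed:
            return grade
    return "D"

def grade_side(knots, board_length_mm, step_mm=100):
    """Slide a 1000 mm window along one side; the worst window sets the side grade."""
    worst = "A"
    for start in range(0, max(board_length_mm - 1000, 0) + 1, step_mm):
        window = [(pos, size) for pos, size in knots if start <= pos < start + 1000]
        g = grade_window(window)
        if GRADE_ORDER.index(g) > GRADE_ORDER.index(worst):
            worst = g
    return worst

def grade_board(sides, board_length_mm):
    """The board grade is the worst grade over its four sides."""
    return max((grade_side(s, board_length_mm) for s in sides), key=GRADE_ORDER.index)
```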

2.1.3 Knot Features


Figure 2 – Pictures of some typical knots and their appearance: (a) sound knot, (b) dead knot, (c) leaf knot, (d) edge knot.[11]

The prime distinction is made between sound and dead knots, which are shown in Figures 2(a) and 2(b). The characteristics of a knot depend on the branch it originates from: a branch that was alive when the tree was cut results in a sound knot, while a dead knot is the result of a dead branch. Which knot features appear also depends on the type of log. The tree has to be cut into pieces before transport; therefore, logs are generally separated into root, middle, and top logs.

Depending on what the wood is to be used for, the appearance features are more or less important. For some applications, such as packaging and subfloors, it does not matter what the board looks like, but for woodwork, furniture, and paneling it is more important. This is why the knot features play such an important role in determining the quality grade of a board.

2.2

Statistical Methods for Classification


Given input data, statistical models can tell us something about a chosen feature. Figure 3 shows a simple, schematic overview of the modeling process. The input feature variables, X, generate an output response, Y.

Figure 3 – A figure showing the basic concept of statistical modeling. Given the input feature variables, X, the model produces an output response, Y.

In statistical learning, you can either work with supervised or unsupervised learning. The difference between the two concepts is whether information about the response, Y, is available. In supervised learning, the response corresponding to the feature data is provided. Models can then be trained using the "correct answers" resulting in relationships that can be used to predict the response variable given new feature data. This is very useful in cases where it is complicated or very expensive to gather the Y data, but the X data is relatively easy to obtain. By using a trained model the Y data can then be predicted using the X data. Unsupervised learning methods are applied when there is no knowledge about the response but the feature data is available. An example of unsupervised learning is cluster analysis. With this method clusters are identified which can help understand the data. For example, it can be determined if there is some latent variable describing the data.[5] For this study supervised learning methods will be used since information about the response variable is available.

Given a data set, the nature of the problem depends on how the variables are categorized. Quantitative variables are variables with continuous numeric values, for example length, age, or income. Qualitative variables, on the contrary, are variables that handle classes or categories, such as gender (male or female), living situation (house or apartment), or brand of a product purchased (A, B or C). When the response variable is quantitative, the problem is often considered a regression problem. With a qualitative response, the problem is referred to as a classification problem.[5]


There are a number of different methods that can be applied when working with statistical classification, for example Support Vector Machines, Random Forest, and Logistic Regression. Which one to use depends on the problem and the data at hand. For this work, Partial Least Squares Discriminant Analysis (PLS-DA) has been used.

2.2.1 PLS-DA

PLS-DA is the qualitative version of Partial Least Squares Regression (PLS-R), which is a statistical method for regression closely related to Principal Component Analysis (PCA). Basically, PLS-R is a dimension reduction method that projects the predictor variables and response variables onto a new space. The method identifies linear combinations, or directions, of the original features to create a new set of features. Instead of fitting a linear model to the original features, the fit is done using the new features, hence the dimension reduction. The new features explain both the response and the predictors, which is a major difference compared to PCA.[5]
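The thesis does not state which software was used for the PLS-DA modeling, but the idea can be sketched with scikit-learn's PLSRegression: the class labels are encoded as a one-hot response matrix, a PLS regression is fitted, and each observation is assigned to the class with the largest predicted response. The function names and the number of components below are assumptions made for the illustration.

```python
# Minimal PLS-DA sketch (the thesis does not specify its software; scikit-learn
# is an assumption). The grades are one-hot encoded, a PLS regression is fitted,
# and each board is assigned to the class with the largest predicted response.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def fit_plsda(X, y, n_components=5):
    """Fit a one-hot PLS-DA model; n_components is an illustrative choice."""
    y = np.asarray(y)
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)   # one-hot reference matrix
    model = PLSRegression(n_components=n_components).fit(X, Y)
    return model, classes

def predict_plsda(model, classes, X_new):
    """Assign each observation to the class with the highest predicted response."""
    Y_pred = model.predict(X_new)                 # continuous class "scores"
    return classes[np.argmax(Y_pred, axis=1)]     # hard grade assignment

# Hypothetical usage: 200 boards, 828 knot variables, grades A/B/C.
# X = np.random.rand(200, 828); y = np.random.choice(list("ABC"), 200)
# model, classes = fit_plsda(X, y); grades = predict_plsda(model, classes, X)
```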

To understand the advantages of PLS we start from fundamental linear regression. Simple linear regression is a well-used supervised method that finds a relationship between a number of predictor variables and a response variable. This relationship is defined as in Equation 1,

Y = X_1 B_1 + X_2 B_2 + ... + X_J B_J + ε,    (1)

where Y is the response variable, X_1, ..., X_J are the predictor variables, B_1, ..., B_J are the regression coefficients and ε is the error of the model, with J being the number of predictor variables. By finding the regression coefficients, a model is obtained from which the response variable can be predicted given a new set of X_1, ..., X_J.[5]

Extending simple linear regression to multivariate linear regression (MLR), the corresponding equation to solve becomes

Y = XB + ε,    (2)

where B is the matrix of regression coefficients and ε is the error matrix. Solving Equation 2 by ordinary least squares requires inverting X^T X, which becomes singular if there are more predictor variables than observations or if there is collinearity in the data.[8] PLS-R is then a more suitable method to use. PLS-R (and PLS-DA) decomposes X into the orthogonal scores, T, and a loading matrix, P, while Y is decomposed into the same orthogonal scores, T, and a loading matrix, Q. The decomposition can be described with the two fundamental equations presented in Equations 3 and 4, where E and F are the error matrices:

X = T P^T + E,    (3)

Y = T Q^T + F.    (4)

This approach avoids the singularity problem and makes the method well suited to data with collinearity and to cases where there are more predictors than observations.[8]
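As an illustration of Equations 3 and 4, the score and loading matrices can be inspected after fitting a PLS model. The sketch below assumes scikit-learn's PLSRegression and random placeholder data; the attribute names are scikit-learn's, not something specified in the thesis.

```python
# Sketch: inspect the decomposition of Equations 3 and 4 after fitting a PLS
# model. Attribute names are scikit-learn's (an assumption about the software),
# and the data below is random placeholder data.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

X = np.random.rand(100, 20)           # hypothetical predictor matrix
Y = np.random.rand(100, 3)            # hypothetical response matrix
pls = PLSRegression(n_components=3).fit(X, Y)

T = pls.x_scores_                     # orthogonal scores T
P = pls.x_loadings_                   # loadings P (Equation 3)
Q = pls.y_loadings_                   # loadings Q (Equation 4)

# Residual E of Equation 3; scikit-learn centres and scales X internally,
# so T P^T approximates the standardized X rather than X itself.
E = (X - X.mean(0)) / X.std(0, ddof=1) - T @ P.T
print(T.shape, P.shape, Q.shape, float(np.abs(E).mean()))
```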

2.2.2 Evaluation of Models

When developing statistical models, it is necessary to evaluate their performance, which can be done in various ways. In the case of regression, the common measure of the quality of fit is the mean square error, MSE. The corresponding measure for classification models is the test error rate, or accuracy. The accuracy is defined as

accuracy = (no. of correctly classified observations) / (total no. of observations).    (5)

Cross-validation is an improved evaluation method based on the validation set approach. Instead of splitting the data into one training and one testing set, the data is split into k equally large subsets. The models are then trained on k − 1 subsets and tested on the remaining one. This is repeated k times, until each subset has been used for testing, and the final accuracy is calculated as the average accuracy. If k equals the number of observations, the method is called leave-one-out cross-validation; otherwise it is referred to as k-fold cross-validation. Commonly, 5-fold or 10-fold cross-validation is used.[5]
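A minimal sketch of the 10-fold cross-validated accuracy for a one-hot PLS-DA classifier is given below, assuming scikit-learn; the stratified fold split, the random seed, and the number of components are illustrative choices, not details taken from the thesis.

```python
# Sketch of 10-fold cross-validated accuracy for a one-hot PLS-DA classifier,
# assuming scikit-learn. The stratified split, random seed and component count
# are illustrative choices.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import StratifiedKFold

def cv_accuracy(X, y, k=10, n_components=5):
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    fold_acc = []
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for train, test in skf.split(X, y):
        Y_train = (y[train, None] == classes[None, :]).astype(float)  # one-hot reference
        pls = PLSRegression(n_components=n_components).fit(X[train], Y_train)
        y_hat = classes[np.argmax(pls.predict(X[test]), axis=1)]      # predicted grades
        fold_acc.append(np.mean(y_hat == y[test]))                    # Equation 5 per fold
    return float(np.mean(fold_acc))                                   # average over the k folds
```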

In classification problems, when testing the models, the classifier can either correctly or incorrectly predict the class. Along with the error rate, or accuracy, it is common to display the outcome of the predictions. This is typically done using a Confusion Matrix.[5] A Confusion Matrix is a table showing the predicted response versus the reference response. An example can be seen in Figure 4.
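A Confusion Matrix can be tabulated directly from the predicted and reference grades. The sketch below uses the same layout as the result tables later in the report (rows are predicted grades, columns are reference grades); the grade set A, B, C is assumed.

```python
# Sketch: tabulate predicted versus reference grades in the same layout as the
# result tables in Section 4 (rows = predicted grade, columns = reference grade).
import numpy as np

def confusion_matrix(y_ref, y_pred, classes=("A", "B", "C")):
    mat = np.zeros((len(classes), len(classes)), dtype=int)
    for ref, pred in zip(y_ref, y_pred):
        mat[classes.index(pred), classes.index(ref)] += 1
    return mat   # mat[i, j] = no. of boards predicted classes[i] with reference classes[j]

# The accuracy of Equation 5 is the trace of the matrix divided by its total sum.
```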


3

Methods and Materials

3.1

Approach

To accomplish the goal, the project consisted of two main parts: data processing and data analysis. The first part, processing the data, included preparing and modifying the data so that it could be used for the statistical analysis. This required the development of variables describing the knot features of boards, also referred to as the knot variables. Determining which variables to include, and developing them, was a critical part of the data processing.

The second part included the statistical analysis, the essence of the project. The knot variables are used as the predictor variables and the quality grades as the reference variable, with each board constituting one observation. Models were built using PLS-DA as the statistical method, and the evaluation of the models used prediction accuracy as the measure of performance.

Two datasets were used in the study. One had already been gathered at the start of the project while the other was collected at a test sawing performed during the project. These will be referred to as Dataset 1 and Dataset 2.

3.2

Software


3.3

The data

In the study, two datasets consisting of data retrieved from both the CT and FinScan were used. The data included information about knot features for boards of pine, Pinus sylvestris, with dimensions 75 x 125 mm. Information about board lengths, ID numbers, rotation errors, and reasons for downgrade in FinScan was also included. Information about defects other than knots was excluded from the data, since that was outside the scope of the study. Along with the knot data for each board came the corresponding quality grades: the automatic grade assessed in FinScan and the virtual grade from the CT.

As stated before, the goal of the project is to develop a statistical model that predicts the grades of boards based only on knot variables. The models will then be prone to error if, for example, boards downgraded due to factors other than knots are included. To ensure that the models are based only on knot features, two things need to be considered. First, the final grades assessed in FinScan have to be based only on knot features. This is not the case with the raw data, but fortunately it was possible to simulate grades based only on knot features; all grades used for training and testing the models are therefore based only on knot features. Second, the turning error, originating from the process of turning the log into the exact, optimal position, cannot be too large. If the error is large, the log will not be cut according to the solution chosen in the CT, and the virtual and final grades will most likely not match. Since such a mismatch does not depend on the knot features, it is desirable not to include such boards in the models.

If this type of statistical method for quality grading is to be practically implemented at the sawmill, it will require large amounts of data. As the reference variable, either the FinScan quality grades or grades determined by a manual grader (an actual person) can be used. The difference is that potential errors made by FinScan could be avoided using manually set quality grades. However, determining the manual grades for such large datasets is not feasible; it would require too much time and resources. With this in mind, it was decided that the study would focus on training the models using only the FinScan grades as the response variable.

3.3.1 Dataset 1


stored during the test sawing, while data from FinScan was obtained a few weeks later as the boards were evaluated.

To match the data obtained from the two systems correctly, each board was marked manually with a unique number, a board mark, directly after being cut. Each board mark could then be paired with the corresponding ID number from the CT data. When the boards were evaluated in FinScan, the board marks were registered and it was possible to match the FinScan data with the CT data. There were some difficulties in reading the board marks; to ensure correct matching, the images that FinScan retrieves as the boards are scanned were saved, and from these images the board marks could be identified with certainty. During the test sawing, two boards went missing, resulting in data for 158 boards. To ensure that all boards in the dataset were cut according to the CT solution, boards with a turning error larger than 10° were excluded from the study. When used for the analysis, Dataset 1 therefore consisted of 98 boards.

For the 98 boards included in the study the agreement between the quality grades from the two systems is 54%. A summary of the distribution of the grades is presented in Table 1, comparing the CT grades with the FinScan grades.

Table 1 – The distribution of the original, virtual, grading in the CT compared to the FinScan quality grades based only on knot features, for Dataset 1.

              FinScan grades
CT grades     A     B     C
A            17    15     2
B             4    30    11
C             1    12     6

3.3.2 Dataset 2


Boards for which there was any uncertainty regarding whether the matching was correct were removed from the dataset. In the end, the dataset consisted of 153 boards that were included in the study.

One main difference in Dataset 2 compared to Dataset 1 is that information about the turning errors was not available. Therefore, it cannot be ensured that there are no boards with a turning error larger than ten degrees. The agreement between the quality grades from the two systems for this dataset is 50%, and the distribution of the grades is summarized in Table 2.

Table 2 – The distribution of the original, virtual, grading in the CT compared to the FinScan quality grades based only on knot features, for Dataset 2.

              FinScan grades
CT grades     A     B     C
A            28    29     5
B             3    39    31
C             2     6    10

3.4

Variables

The data collected from the two systems in the sawmill contains information about each identified knot on each side of each board. That information has to be transformed into variables that can be used in statistical models, the knot variables.


Table 3 – The knot variables used for the statistical analysis. The left column describes the type of measure, the first row the type of knots. Each 'x' indicates that the combination of measure and knot type is included as a variable in the study.

                          All knots   Sound knots   Dead knots
Total no. of knots            x            x            x
No. of knots per meter        x            x            x
Max. size in x (mm)           x            x            x
Max. size in y (mm)           x            x            x
Mean size in x (mm)           x            x            x
Mean size in y (mm)           x            x            x
Covered area (%)              x            x            x
Ratio of knots (%)                         x            x

As can be seen from the table, there are eight different measures describing the knots. Seven of these are calculated for all knots, sound knots, and dead knots; the last measure is only calculated for sound and dead knots, resulting in 23 variables.

The rule-based grading system assesses the quality based on the worst meter, which should somehow be reflected in the knot variables. This is done by dividing the board into regions and then calculating each variable for each region. The number of regions was chosen to be one, three, and five. There are thus 23 variables describing the knot features of a board, and all variables are derived for each side of the board and for each region. The total number of variables then becomes 23 · 4 · (1 + 3 + 5) = 828 variables.
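A sketch of how these 828 columns could be assembled per board is given below. The knot record layout (a position along the board in metres) and the stubbed measure function are hypothetical illustrations of the bookkeeping, not the actual CT or FinScan data format.

```python
# Sketch of the variable layout: 23 knot measures per side and per region split,
# with the board divided into 1, 3 and 5 regions, i.e. 23 * 4 * (1 + 3 + 5) = 828
# columns per board. The knot record layout and the stubbed measure function
# are hypothetical illustrations.
import numpy as np

N_MEASURES = 23            # the measures listed in Table 3
REGION_COUNTS = (1, 3, 5)  # the three ways of dividing a board side

def measures(knots, region_length_m):
    """Stub for the 23 measures of Table 3 computed for one region."""
    return np.zeros(N_MEASURES)

def side_variables(knots, length_m, n_regions):
    """The 23 measures for each of n_regions equal parts of one board side."""
    edges = np.linspace(0.0, length_m, n_regions + 1)
    parts = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_region = [k for k in knots if lo <= k["pos_m"] < hi]
        parts.append(measures(in_region, hi - lo))
    return np.concatenate(parts)

def board_variables(sides_knots, length_m):
    """All 828 variables for one board: 4 sides, each split into 1, 3 and 5 regions."""
    parts = [side_variables(knots, length_m, n)
             for knots in sides_knots          # the four sides of the board
             for n in REGION_COUNTS]
    return np.concatenate(parts)               # length 23 * 4 * 9 = 828
```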

3.5

Prediction models


Those boards are often downgraded due to other features that are not included in the study.

3.5.1 Two-step Approach

Since the problem is a multi-class problem, it is harder to find distinct separations between the classes than in a binary classification problem with only two classes, especially when the number of available observations is limited. In order to increase the probability of developing robust and stable models, a two-step modelling approach has been applied. This means that instead of predicting directly which of the three classes a board belongs to, the sorting is done in two steps, each considering only two classes. The first step predicts whether the board is of quality A or not. If it is not, step two decides whether the quality is C or not. If the board has obtained neither quality A nor C, it is given quality B. A schematic figure visualizing this process is presented in Figure 5.

Figure 5 – Schematics of the two-step modelling approach.

In order to be able to adjust the distributions of the predicted quality grades, threshold values are introduced at each step. The model predicts the probability that the board belongs to a certain class. Using c1 and c2 as probability thresholds, the decision is made as follows: at step one, if p_gradeA > c1 the board is given grade A; similarly, at step two, if p_gradeC > c2 the board is given grade C. In this study, the default values c1 = c2 = 0.5 have been used.
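The decision logic of the two-step approach can be summarized in a few lines. In the sketch below, p_grade_A and p_grade_C stand for the class probabilities predicted by the two binary models; the threshold defaults c1 = c2 = 0.5 are those stated above.

```python
# Sketch of the two-step decision logic. p_grade_A and p_grade_C stand for the
# class probabilities predicted by the two binary PLS-DA models; the thresholds
# c1 = c2 = 0.5 are the defaults stated in the text.
def two_step_grade(p_grade_A, p_grade_C, c1=0.5, c2=0.5):
    if p_grade_A > c1:    # step one: is the board of quality A?
        return "A"
    if p_grade_C > c2:    # step two: if not A, is it of quality C?
        return "C"
    return "B"            # neither A nor C: the board is given grade B
```

Lowering c1 or c2 shifts the predicted grade distribution towards A or C respectively, which is how the distributions can be adjusted.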

3.5.2 Models

The models that have been developed regard the CT and the FinScan systems. Initially, the models were developed separately; then a combined, final model for improving the consistency in grading was developed.

To begin with, a model concerning the FinScan data was trained and tested. The setup for the FinScan model can be seen in Figure 6.

Figure 6 – Schematic of the FinScan model. The inputs are the predictor variables based on FinScan data, and the response variable is the grades retrieved in FinScan.


The model output then becomes predicted grades for the boards, based on the scanning of knot features in the FinScan system.

The corresponding setup for the CT model can be seen in Figure 7.

Figure 7 – Schematic of the CT model. The inputs are the predictor variables based on CT data, and the response variable is the grades retrieved in FinScan.


The Final Model

The goal of the project was to develop a statistical model that can be used in both the CT and FinScan instead of the standard rule-based grading, in order to obtain better agreement between the systems. One way of doing this is to train and test models for each system separately and hope that the grading will agree. Another approach is to train a FinScan model, which gives predicted grades as output, and then use these predicted grades as the reference when training the CT model. This is more likely to result in better consistency, since the CT model is then trained on the predicted grades from FinScan. This is the final model that was developed during the project. An illustration of this model setup can be seen in Figure 8.

Figure 8 – Figure showing the setup for the final model including the FinScan model and the CT model.
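A sketch of this chained setup is given below, reusing the hypothetical fit_plsda/predict_plsda helpers from the PLS-DA sketch in Section 2.2.1: the FinScan model is trained first, and its predicted grades then serve as the reference variable for the final CT model.

```python
# Sketch of the final model setup in Figure 8, reusing the hypothetical
# fit_plsda/predict_plsda helpers from the PLS-DA sketch in Section 2.2.1.
# X_finscan and X_ct hold the knot variables from the two systems for the same
# boards; y_finscan holds the FinScan quality grades.
def train_final_model(X_finscan, X_ct, y_finscan, n_components=5):
    # Step 1: the FinScan model, trained on FinScan data against the FinScan grades.
    fs_model, fs_classes = fit_plsda(X_finscan, y_finscan, n_components)
    y_pred_finscan = predict_plsda(fs_model, fs_classes, X_finscan)

    # Step 2: the final CT model, trained on CT data against the *predicted*
    # FinScan grades, so that the two systems are trained towards agreement.
    ct_model, ct_classes = fit_plsda(X_ct, y_pred_finscan, n_components)
    return (fs_model, fs_classes), (ct_model, ct_classes)
```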


3.5.3 Training and Testing

In the process of model development, training and testing have been done in a few different ways. First, training and testing were done using the whole dataset: all observations are used for training, and the model is then tested on all observations. This indicates how well the model performs while still allowing training with all available boards, but testing on observations that the model has already seen will likely give an overly optimistic, overfitted result. To counter overfitting, 10-fold cross-validation has been used. As a final test, the models have also been tested on the dataset that was not used for training.

3.6

Experimental setup

The details of the experimental setup used in the project are summarized in Tables 4-6. Table 4 shows a summary of the characteristics of the two datasets, the specifications of the three models are summarized in Table 5, and Table 6 presents the setup for training and testing the models.

Table 4 – A table summarizing the characteristics of the two datasets included in the study.

                                         Dataset 1      Dataset 2
Type of timber                           Pine           Pine
Dimension                                75 x 125 mm    75 x 125 mm
No. of boards                            98             153
Information about turning error          Yes            No
Original agreement in quality grading    55%            50%

Table 5 – Table showing the specifications of the three models used in the study. 'Pred.' is short for 'Predicted'.

Model            Feature data         Reference variable      Output
FinScan model    FinScan knot data    FinScan grades          Pred. FinScan grades
CT model         CT knot data         FinScan grades          Pred. CT grades
Final CT model   CT knot data         Pred. FinScan grades    Pred. CT grades

Table 6 – Table showing the setup for training and testing the three models: the CT model, the FinScan model, and the final CT model.

Dataset 1                        CT model         FinScan model    Final CT model
  All data         Training      All data         All data         All data
                   Testing       All data         All data         All data
  10-fold CV       Training      90%, 10 times    90%, 10 times    90%, 10 times
                   Testing       10%, 10 times    10%, 10 times    10%, 10 times
  Final testing    Training      All data         All data         All data
                   Testing       Dataset 2        Dataset 2        Dataset 2

Dataset 2                        CT model         FinScan model    Final CT model
  All data         Training      All data         All data         All data
                   Testing       All data         All data         All data
  10-fold CV       Training      90%, 10 times    90%, 10 times    90%, 10 times
                   Testing       10%, 10 times    10%, 10 times    10%, 10 times
  Final testing    Training      All data         All data         All data
                   Testing       Dataset 1        Dataset 1        Dataset 1


4

Results

The models developed using the approach described in Section 3.5 show good results. Results are presented separately for each dataset, but in a similar manner. For each model, the prediction accuracies from testing on the whole dataset and from 10-fold cross-validation are presented. From the final testing, the accuracy and the corresponding Confusion Matrix are presented for each model.

4.1

Dataset 1

All three models were trained using Dataset 1. The main results from testing each model on all data and using 10-fold cross-validation are shown in Table 7.

Table 7 – Table showing the test results for the models trained and tested on Dataset 1. Results are presented both from testing on all data and the cross-validation.

Type of model      Accuracy (all data)    Accuracy (cross-validation)
CT model           80%                    59%
FinScan model      85%                    63%
Final CT model     84%                    67%

Table 7 shows the accuracy, defined as in Equation 5, obtained from each test. When testing the models on all data, all models have an accuracy between 80% and 85%. The corresponding cross-validated values are lower, between 59% and 67%. For the CT model, the cross-validated accuracy is 59%, which means that 59% of the predicted quality grades agree with the grades assessed in FinScan. The corresponding value for the FinScan model is 63%. For the final CT model, the cross-validated accuracy is 67%, meaning that the model predicts 67% of the quality grades correctly compared to the predicted grades from the FinScan model, which were used for training the model.


Final testing

As a final test, the models trained on the whole Dataset 1 were tested on Dataset 2. The results are shown in Table 8. As can be seen, the final CT model gives a prediction accuracy of 66%, which compared to the original agreement of 50% (see Sec. 3.3.2) is an increase of 16 percentage points.

Table 8 – Table showing the obtained accuracy for each model when testing on dataset 2.

Type of model      Accuracy
CT model           57%
FinScan model      72%
Final CT model     66%

The Confusion Matrix for each test when testing on Dataset 2 can be found in Tables 9-11. The matrices show the distribution of the predicted quality grades. It can be seen that all grades are represented and that most of the boards are given quality grade B.

Table 9 – The Confusion Matrix for the CT model trained on the whole Dataset 1 and tested on the whole Dataset 2. Reference values are the FinScan grades.

Reference

Predicted A B C

A 18 17 4

B 13 43 16

C 2 14 26

Table 10 – The Confusion Matrix for the FinScan model trained on the whole Dataset 1 tested on the whole Dataset 2. Reference values are the FinScan grades.

Reference

Predicted A B C

A 18 10 21

B 15 64 17


Table 11 – The Confusion Matrix for the final CT model trained on the whole Dataset 1 and tested on the whole Dataset 2. Reference values are the predicted FinScan grades.

                 Reference
Predicted    A     B     C
A           21    17     0
B            8    74    22
C            0     5     6

4.2

Dataset 2

The three models were also trained using Dataset 2. Table 12 shows the results from testing using all data and 10-fold cross-validation.

Table 12 – Table showing the test results for the models trained and tested on Dataset 2. Results are presented both from testing on all data and the cross-validation.

Type of model      Accuracy (all data)    Accuracy (cross-validation)
CT model           73%                    49%
FinScan model      84%                    75%
Final CT model     78%                    60%

Table 12 shows that testing on all data gives accuracies between 73% and 84%. The cross-validated accuracy for the FinScan model is 75%, while the corresponding value for the CT model is 49%. The cross-validated prediction accuracy for the final CT model is 60%, which can be compared to the original agreement in grading of 50%; an increase of 10 percentage points.

Final testing


Table 13 – Table showing the obtained accuracy for each model when testing on dataset 1.

Type of model      Accuracy
CT model           66%
FinScan model      66%
Final CT model     73%

An accuracy of 73% is obtained using the final CT model, which compared to the original accuracy of 54% (see Section 3.3.1) is an increase of 19 percentage points.

The corresponding Confusion Matrix for each test result can be found in Tables 14 - 16.

Table 14 – The Confusion Matrix for the CT model trained on the whole Dataset 2 and tested on the whole Dataset 1. Reference values are the FinScan grades.

Reference

Predicted A B C

A 11 5 2

B 11 47 10

C 0 5 7

Table 15 – The Confusion Matrix for the FinScan model trained on the whole Dataset 2 tested on the whole Dataset 1. Reference values are the FinScan grades.

Reference

Predicted A B C

A 17 11 2

B 5 42 11

C 0 4 6

Table 16 – The Confusion Matrix for the final CT model trained on the whole Dataset 2 tested on the whole Dataset 1. Reference values are the FinScan grades.

Reference

Predicted A B C

A 16 5 0

B 14 48 2


5

Discussion

5.1

The Results

The results from the project show that the approach of using multivariate sorting instead of rule-based sorting gives a higher agreement between the virtual and final grading. For the two datasets, the original agreements between the grading systems are 54% and 50%, respectively. Comparing these numbers to the accuracies given by the statistical models shows a significant increase. The accuracies obtained from cross-validation for the final CT model are 67% and 60%, which gives an increase in prediction accuracy of 13 and 10 percentage points for Dataset 1 and Dataset 2, respectively. Looking at the results from the final testing, the increase in agreement between the virtual and final grading is even larger: agreements of 66% and 73% are reached. Increasing the agreement between the two grading systems from approximately 50-55% to these numbers would mean that the potential of using the CT in the saw line is further exploited.

Differences in accuracy can be seen when comparing the CT model, which was trained using the FinScan grades, with the final CT model, which was trained using the predicted FinScan grades from the FinScan model. The former results in lower accuracies, 57% and 49%, indicating that the approach used for the final model is a good choice.


5.2

Sources of Errors

The results do indeed show that this modeling approach improves the grading process and the consistency between the CT and FinScan. However, there are some things to keep in mind. The number of boards available highly affects the performance of the models. Even though it was possible to gather two datasets, it would have been preferable if the number of boards had been even higher. The models are still trained using no more than 153 observations, which, for a three-class classification problem, cannot be considered a large amount of data.

As described in the theory section, correct and exact positioning of the logs before cutting is very important for the outcome. In Dataset 1, boards with a large turning error were excluded to make sure that this would not affect the models. Unfortunately, this was not possible for Dataset 2, since there was no information available regarding the turning errors. That means that the results for Dataset 2 are not as accurate as they could have been. The models developed using Dataset 2 can be affected by this, as can the results from testing the models developed using Dataset 1 on Dataset 2; how much, though, is hard to say. When Dataset 1 was gathered, the turning error was known to be quite high. Before Dataset 2 was gathered, improvements were made to the turning machine, which probably left few boards with a large turning error. The results from using Dataset 2 are good, and testing the model trained on Dataset 2 on Dataset 1 even gives the highest accuracy for the final CT model, indicating that large turning errors might not be present in Dataset 2.

The knot variables included in the statistical analysis should also be discussed. How the knot variables are derived, and which variables are used, directly affects the performance of the models. Which variables to use is not obvious, and there is no guarantee that the variables chosen in this project are the optimal ones. There could also be a better alternative to the approach of calculating the variables for different regions; perhaps developing software that can find the worst meter would improve the process. Improving the variables, and how they are calculated, can be essential for further improvement of the models.

5.3

Future Work


systems. If the final grades given in FinScan agree better with the predicted virtual grades in the CT, the cutting process could be improved even further. The sensitivity to measurement errors would be reduced, and good-quality boards that today are downgraded due to errors would obtain the higher grade.

In order to use multivariate sorting, it has to be developed further. The models have to be trained and optimized using a larger dataset than the one used in this project; it can be estimated that at least 100 boards per quality grade are needed to develop robust models.[2] All the different dimensions and types of logs that are sawn also have to be considered. The knot features differ quite a lot depending on whether the logs are root, middle, or top logs, which will of course affect the models. It may be possible to develop a universal model that can be used at all times, but that would probably not give as good results as models developed for specific sets of dimensions and log types.


6

Conclusions

Now that this project has been finalized, it can be concluded that the PLS method for multivariate sorting of boards is a promising method. The results show that the agreement between the virtual grading in the CT and the actual grading in FinScan is improved using this approach. With a better agreement in grading between the two systems, the outcome of the sawing process should be in better accordance with what the CT decides, and the value of the sawn timber should increase accordingly.

This project has had its limitations, one of which was the limited number of boards in the available dataset. In the future, it is important to obtain a larger amount of data to train the models on, in order to make them stable and robust. With the fingerprint traceability application available, the gathering of data should be simplified. Since this is an initial study on the subject, the models have to be further developed and tested before the concept can be implemented and used in everyday production.


References

[1] Berglund, Anders, et al. "Customer adapted grading of Scots pine sawn timber using a multivariate method." Scandinavian Journal of Forest Research 30.1 (2015): 87-97.

[2] Lycken, Anders, and Johan Oja. "A multivariate approach to automatic grading of Pinus sylvestris sawn timber." Scandinavian Journal of Forest Research 21.2 (2006): 167-174.

[3] Möller, Carl-Johan. "Development of Fingerprint Traceability in a Modern Sawmill." (2019).

[4] FinScan Oy (2019). https://finscan.fi/products/boardmaster/?lang=en (Accessed: 2019.05.20).

[5] James, Gareth, et al. An Introduction to Statistical Learning with Applications in R. Vol. 112. New York: Springer, 2013.

[6] Olofsson, Linus, et al. "Multivariate product adapted grading of Scots pine sawn timber for an industrial customer, part 1: Method development." Wood Material Science & Engineering (2019): 1-9.

[7] Olofsson, Linus, et al. "Multivariate product adapted grading of Scots Pine sawn timber for an industrial customer, part 2: Robustness to disturbances." Wood Material Science & Engineering (2019): 1-8.

[8] Fordellone, Mario, Andrea Bellincontro, and Fabio Mencarelli. "Partial Least Squares Discriminant Analysis: A Dimensionality Reduction Method to Classify Hyperspectral Data." arXiv preprint arXiv:1806.09347 (2018).

[9] Microtec (2019). https://microtec.eu/en/catalogue/products/ctlog/?tag=grading (Accessed: 2019.05.20).

[10] Swedish Sawmill Managers Association (1994). Nordic Timber: grading rules for pine (Pinus sylvestris) and spruce (Picea abies) sawn timber: commercial grading based on evaluation of the four sides of sawn timber. Umeå, Sweden.

[11] Swedish Wood; Mikael Eliasson (2019). Quality of wood. https:
