Evaluation of texture feature based on basic local binary pattern for wood defect classification



Abstract
Wood defect detection has been studied extensively in recent years, with the aim of detecting defects on the wood surface and helping manufacturers obtain clear wood for high-quality products, since defects reduce the quality of wood. This research proposes an effective feature extraction technique, the local binary pattern (LBP), combined with a common classifier, the Support Vector Machine (SVM). Our goal is to classify the natural defects on the wood surface. First, preprocessing was applied to convert the RGB images into grayscale images. Then, the LBP feature extraction technique was applied with eight neighbors (P=8) and several radius (R) values. After that, the SVM classifier was applied and the performance of the proposed technique was measured. The experimental results show that the average accuracy achieved is 65% on the balanced dataset with P=8 and R=1, which indicates that the proposed technique works moderately well for classifying wood defects. This study will consequently contribute to the overall wood defect detection framework, which benefits the automated inspection of wood defects.

Introduction
Trees are one of the most important forms of plant wealth. They grow on the earth's surface and are distinguished from other plants by their height. Trees are the only source of the wood used in the construction of homes, furniture, decorations, and papermaking. Wood is a versatile material and the only renewable building material. Wood structures typically combine different elements that provide the best possible endurance, heat, sound, and moisture insulation, fire resistance, and a long life span. By increasing the proportion of timber in construction, the use of other building materials, such as concrete, steel, and bricks, can be reduced. These building materials, which are not derived from renewable raw materials, require much energy to produce and increase carbon dioxide emissions. Manufacturers want quality wood that has high endurance and lasts longer. There are many types of wood, including hardwood and softwood. Each type has certain characteristics that distinguish it and make it preferred over others. Wood is evaluated according to its characteristics, including hardness, elasticity, composition, durability, fiber, and color.
Wood is vulnerable to bacteria and microorganisms because it is a natural biological material [1]. These bacteria and microorganisms reduce wood quality and cause significant damage to wood hardness by destroying the internal wood structure. The presence of these problems reduces the value of wood and the demand for it. Manufacturers that use wood as a primary material must determine the wood's stability and quality because this helps to guarantee the quality of their products and to determine prices. Manufacturers should take preventive measures by checking the wood's quality and making sure that there are no defects on the wood surface. Detection of defects is traditionally done by eye, and the process is repetitive, slow, and time-consuming [2] [3]. Thus, it is not easy to verify the wood's quality thoroughly and accurately [3] [4].
There are several criteria for determining the quality of wood, such as the presence of knots and cracks, which reduce wood quality. At present, wood quality is assessed and defects are detected through traditional visual inspection. Visual inspection not only takes much time but also provides inaccurate and unreliable results. Automated vision-based inspection systems can detect defects and flaws in less time and deliver more accurate and reliable results, both for selecting wood to manufacture high-quality products and for the quality control process [2]. One of the steps prior to detecting defects on wood surfaces is feature extraction. Once the features are extracted, they are classified into several defect classes. In our research, we focused on the feature extraction process only. Different timber surfaces have defects of different shapes and sizes. Feature extraction is applied to the wood surface to characterize the defects on that surface. Many methods, techniques, and features are available for detecting defects, but it is important to choose the ideal feature for better detection performance and accuracy [5].
Feature extraction is the process of characterizing the wood surface; the extracted features are then the input to the classification process that detects the defects. Several feature extraction techniques have been used previously for wood defect detection, such as Local Binary Pattern (LBP), Gray Level Dependence Matrix (GLDM), and SURF. Zhang [6] introduced the LBP algorithm to detect wood defects because of the complexity of the Gray Level Co-occurrence Matrix (GLCM) feature extraction technique. They took wood defects as the research object and extracted the LBP texture of the defect images. For classification, a BP neural network was used to identify the defects, and a 93% identification rate was achieved. In another work, Qayyum [2] proposed GLCM with a PSO-trained neural network classifier to detect three different types of knot defects. Their experiment was carried out on 90 sample images, distributed equally among the three types of defects. The proposed technique feeds four texture features (energy, contrast, correlation, and homogeneity) to a feedforward neural network. The experiment yielded a mean square error of 0.3483 on the training dataset and an accuracy rate of 78.26%.
Additionally, Fahrurozi [7] claimed that edge detection could improve GLCM, the feature extraction technique used in their study, for extracting wood texture features. The experiment was carried out on four species of wood with five edge detection operators: Roberts, Sobel, Prewitt, Canny, and LoG. Based on the results, the authors concluded that the Sobel operator and angle parameters produced the best results, and that Sobel is the most suitable operator for identifying wood defect texture with GLCM [7]. Barmpoutis [8] proposed a new algorithm to detect defects on a wood surface by using image processing techniques and scanners. The proposed algorithm can detect five types of defects (cracks, annual growth rings, relief, notches, and holes) and can also identify clear wood. The new algorithm was compared with two feature extraction techniques, grayscale texture analysis and spatial texture analysis, using an SVM classifier. The new algorithm produced the highest accuracy rate, with an average of 94.44%. Hittawe [9] suggested two feature extraction techniques, LBP and SURF, to detect defects on wood. For classification, they used an SVM classifier to detect knots and cracks. The experiment used two different datasets containing two different types of wood with different properties and features. The results show that combining both features performs better than using a single feature alone. In another extensive study, Mahram [10] achieved a 100% accuracy rate for wood defect detection. The proposed feature extraction techniques were GLCM, LBP, and statistical moments. They used Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) for dimensionality reduction, and Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) for classification. Similarly, using a combination of features, Zhang [11] combined LBP and Dual-Tree Complex Wavelet Transform (DTCWT) features to reduce experimental errors and obtain more accurate results. This method was tested on color wood images and showed better results with a lower error percentage.
Several recent studies have focused on wood defect detection [4], [11]-[23]. Many feature extraction techniques have been used to detect and counter such problems. Local Binary Pattern (LBP) is one of these feature extraction techniques. It is a simple and very efficient texture operator that labels the pixels of an image by thresholding each pixel's neighborhood with the center pixel's value and treating the result as a binary number. It is therefore simple to compute and makes images easy to analyze, which facilitates the detection of defects on the wood surface.
This research examines one of these feature extraction techniques, the local binary pattern (LBP). LBP is a common approach in many successful pattern recognition applications [22]-[26]. It combines the statistical and structural models of texture analysis. LBP is a useful and simple method, and it can enhance the detection accuracy of wood defects [6].

Local Binary Pattern (LBP)
The LBP operator assigns a decimal value, called an LBP code, to each pixel of an image; the code encodes the local structure surrounding that pixel [27]. Fig. 1 shows how LBP works. Each of the eight neighbors is compared with the center pixel by subtracting the center pixel's value from it. Non-negative differences are encoded as 1 and negative differences as 0. A binary number is produced by concatenating these bits in a clockwise direction starting from the top-left pixel, and the number is then converted to decimal. The resulting identifier is the LBP code [28]. For a fixed center pixel at coordinates $(x_c, y_c)$, LBP is defined as a binary contrast between the center pixel and its $n$ surrounding pixels. The texture $W$ is defined as the joint distribution of the gray levels of these pixels, $W = (g_c, g_1, \ldots, g_n)$, where $g_c$ is the gray value of the center pixel and $g_i$ $(i = 1, 2, \ldots, n)$ are the gray values of $n$ equally spaced pixels on a circle of radius $R$ $(R > 0)$ that form a circularly symmetric set. The coordinates of the neighbors of the center pixel on the circle of radius $R$ can be calculated as in (1).
$$(x_i, y_i) = \left(x_c + R\cos\frac{2\pi i}{n},\; y_c - R\sin\frac{2\pi i}{n}\right) \tag{1}$$
To achieve invariance to any monotonic transformation of the gray scale, only the signs of the differences are considered, as in (2).
$$W \approx \left(s(g_1 - g_c), \ldots, s(g_n - g_c)\right), \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{2}$$
A binomial weight $2^{i-1}$ is assigned to each sign $s(g_i - g_c)$, which transforms the differences into a unique LBP code (3).
$$\mathrm{LBP}(x_c, y_c) = \sum_{i=1}^{n} s(g_i - g_c)\, 2^{i-1} \tag{3}$$
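The thresholding and weighting steps described above can be illustrated with a minimal Python sketch for the basic case P=8, R=1. The function names `lbp_code` and `lbp_image` are hypothetical, and the bit order (first clockwise neighbor as the most significant bit) is an assumption made for this illustration; other orderings give different but equivalent codes.

```python
def lbp_code(patch):
    """Return the basic LBP code (P=8, R=1) of the center of a 3x3 patch.

    Neighbors are read clockwise starting from the top-left pixel, and the
    first neighbor is treated as the most significant bit (an assumption
    of this sketch).
    """
    center = patch[1][1]
    # Clockwise neighbor positions (row, col), starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for r, c in offsets:
        bit = 1 if patch[r][c] >= center else 0  # sign of the difference
        code = (code << 1) | bit
    return code


def lbp_image(gray):
    """Compute LBP codes for every interior pixel of a 2-D list of gray values."""
    rows, cols = len(gray), len(gray[0])
    out = []
    for y in range(1, rows - 1):
        row = []
        for x in range(1, cols - 1):
            patch = [gray[y + dy][x - 1:x + 2] for dy in (-1, 0, 1)]
            row.append(lbp_code(patch))
        out.append(row)
    return out


# Made-up 3x3 neighborhood with center value 6.
example = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(example))  # bits 10001111 -> 143
```

In this example the neighbors 6, 5, 2, 1, 7, 8, 9, 7 are thresholded against the center value 6, giving the bit string 10001111, which is 143 in decimal.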
Zhang et al. [11] adopted a uniform LBP with (P,R) equal to (8,1), i.e., eight adjacent pixels sampled on a circle of radius R=1, with the uniform mapping type. In their experiments, feature set selection was very important. After many cross experiments, they selected all the feature sets with 3*3 pixels. The feature extraction split the wood images into the three layers R, G, and B and divided each layer into many small blocks. Next, they extracted the 59-dimension features given by the LBP histograms from each block, denoted LBR, LBG, and LBB for the three layers. Finally, the dimensions of the features were reduced to 1 * 177, represented by LBP. The LBP texture feature extraction process is shown in Fig. 2. The accuracy of the proposed method reached more than 90%. However, the method was applied to one defect type only. Motivated by this work, we employ a similar approach on our dataset, which comprises eight classes.
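The 59-dimension figure follows from the uniform mapping for P=8: a pattern is uniform when its circular bit string has at most two 0/1 transitions, each uniform pattern gets its own histogram bin, and all remaining patterns share one bin. The sketch below checks that count and the 3 x 59 = 177 length obtained when the three channel histograms are placed side by side. The helper names are hypothetical, and the bin ordering is an assumption.

```python
def transitions(code, bits=8):
    """Count 0/1 transitions in the circular bit string of an LBP code."""
    count = 0
    for i in range(bits):
        a = (code >> i) & 1
        b = (code >> ((i + 1) % bits)) & 1
        count += a != b
    return count


def uniform_mapping(bits=8):
    """Map each LBP code to a histogram bin.

    Uniform codes (at most two transitions) get individual bins in
    increasing code order; all non-uniform codes share the last bin.
    """
    mapping = {}
    next_bin = 0
    for code in range(2 ** bits):
        if transitions(code, bits) <= 2:
            mapping[code] = next_bin
            next_bin += 1
    for code in range(2 ** bits):
        mapping.setdefault(code, next_bin)
    return mapping, next_bin + 1


mapping, n_bins = uniform_mapping()
print(n_bins)      # bins per channel: 59
print(3 * n_bins)  # R, G and B histograms side by side: 177
```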

Data Collection and Preparation
In this research, we covered four types of wood: Getah, KSK, Meranti, and Merbau. Each type of wood contains eight types of defects: Blue Stain, Brown Stain, Hole, Knots, Pocket, Rot, Split and Wane. The dataset is obtained from the UTeM wood defect database [29].
The dataset has been organized into two groups: an unbalanced and a balanced dataset. The unbalanced dataset contains all the samples. It consists of 7487 samples across the four types of wood and the eight types of defects mentioned earlier, with different numbers of samples for each type of wood and defect. The balanced dataset contains an equal number of samples across the four types of wood and the eight types of defects. It consists of 1600 samples; each type of wood contains 400 samples, and each defect within a wood type contains 50 samples. The size of each image in the dataset is 60x60. We balanced the dataset to test the feature extraction technique and to improve the accuracy rate. Table 1 and Table 2 show the detailed number of samples for the unbalanced and balanced datasets across the four types of wood and the eight defect types.
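One way to obtain such a balanced subset is to draw the same number of samples at random from every (wood, defect) pair. The sketch below is an illustration under that assumption; the source does not state how the 50 samples per class were chosen, and the function `balance` and the toy dataset are hypothetical.

```python
import random
from collections import defaultdict


def balance(samples, per_class=50, seed=0):
    """Keep `per_class` randomly chosen samples for each (wood, defect) pair.

    `samples` is a list of (wood, defect, image_id) tuples. Pairs with fewer
    than `per_class` samples raise an error rather than being padded.
    """
    groups = defaultdict(list)
    for wood, defect, image_id in samples:
        groups[(wood, defect)].append((wood, defect, image_id))

    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    balanced = []
    for key in sorted(groups):
        items = groups[key]
        if len(items) < per_class:
            raise ValueError(f"{key} has only {len(items)} samples")
        balanced.extend(rng.sample(items, per_class))
    return balanced


woods = ["Getah", "KSK", "Meranti", "Merbau"]
defects = ["Blue Stain", "Brown Stain", "Hole", "Knots",
           "Pocket", "Rot", "Split", "Wane"]
# Hypothetical, unevenly sized toy dataset standing in for the real images.
toy = [(w, d, i)
       for wi, w in enumerate(woods)
       for di, d in enumerate(defects)
       for i in range(50 + 3 * wi + di)]
balanced = balance(toy)
print(len(balanced))  # 4 woods x 8 defects x 50 samples = 1600
```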

Overall Implementation
The overall method works as follows. First, when an image of the defective wood is loaded, the system applies image preprocessing techniques. Training then takes place: the feature extraction step extracts the important information from the data and saves it in the database. After that, based on the data stored in the database, the system classifies the data and generates the error rate and the confusion matrix. Fig. 3 shows the overall implementation procedure of our experiment.

Extracting Feature From LBP
In this section, we discuss the procedure for extracting texture features with LBP. Before applying the feature extraction, we preprocess the image by converting it into a grayscale image to reduce the size of the RGB image [30]. The feature extraction then takes place. First, the preprocessed grayscale image is divided into several parts. Then, the local binary pattern operator extracts the LBP for each part, and a histogram is computed for each part. Next, we concatenate all the histograms to obtain a single enhanced histogram. Finally, the extracted features are saved for the subsequent classification. The parameters of the LBP operator are (P, R), where P is the number of sampling points (neighbors) in the region and R is the radius. In our experiment, the number of neighbors was fixed (P=8) and the radius was set iteratively to R=1, 2, 3, and 4. We ran the experiment with these radius values to test which one is the most suitable.
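The steps above (grayscale conversion, division into parts, LBP per part, histogram concatenation) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the luminance weights, the 2x2 grid, the 256-bin histograms, and the placement of the eight neighbors at R steps along the axes and diagonals without interpolation are all assumptions, and every function name is hypothetical.

```python
def to_gray(rgb):
    """Convert a 2-D list of (r, g, b) tuples to integer gray levels."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def lbp_codes(gray, radius=1):
    """Basic 8-neighbor LBP for pixels whose neighborhood fits in the image.

    Neighbors sit `radius` steps away along the axes and diagonals, with no
    interpolation (an assumption of this sketch).
    """
    R = radius
    offsets = [(-R, -R), (-R, 0), (-R, R), (0, R),
               (R, R), (R, 0), (R, -R), (0, -R)]
    rows, cols = len(gray), len(gray[0])
    codes = []
    for y in range(R, rows - R):
        line = []
        for x in range(R, cols - R):
            code = 0
            for dy, dx in offsets:
                code = (code << 1) | int(gray[y + dy][x + dx] >= gray[y][x])
            line.append(code)
        codes.append(line)
    return codes


def block_histograms(codes, grid=2, bins=256):
    """Split the code map into grid x grid blocks and concatenate their histograms."""
    rows, cols = len(codes), len(codes[0])
    feature = []
    for by in range(grid):
        for bx in range(grid):
            hist = [0] * bins
            for y in range(by * rows // grid, (by + 1) * rows // grid):
                for x in range(bx * cols // grid, (bx + 1) * cols // grid):
                    hist[codes[y][x]] += 1
            feature.extend(hist)
    return feature


# Hypothetical 60x60 RGB image with a simple gradient pattern.
image = [[(x * 4 % 256, y * 4 % 256, (x + y) % 256) for x in range(60)]
         for y in range(60)]
feature = block_histograms(lbp_codes(to_gray(image), radius=1))
print(len(feature), sum(feature))
```

With a 2x2 grid and 256 bins the feature vector has 1024 entries, and its entries sum to the number of pixels that received a code (58 x 58 for R=1 on a 60x60 image).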

Classification
The classification process is considered the most important process in this experiment because it generates the required result. After the system reads the test grayscale image, a median filter is applied, the local binary pattern (LBP) is extracted, and the histogram is enhanced. The system then classifies the sample against the data saved in the database by retrieving the stored threshold rate. If the value is equal to or greater than the value of the data saved in the database, the system accepts the sample, reports the classification, displays the data, and asks for further identification; otherwise, the system only asks for further identification. Our experiment used an SVM classifier with a multiclass model, since we have eight classes (the eight types of defects). This multiclass model is fast to train.

Results and Discussion
Many experiments were carried out in this research to examine the factors that affect the performance of Local Binary Pattern and SVM. The overall implementation is organized into two main sections. The first section applies the feature extraction technique and SVM to the unbalanced dataset (the whole data), and the second section applies them to the balanced dataset. Both sections were tested with different radius values, and the accuracy rate of the technique was recorded.

Experiments on Unbalanced Dataset
In the first section of our implementation, we applied the proposed LBP technique with an SVM multiclass classifier to four wood species: Getah, KSK, Meranti, and Merbau. Each type of wood has eight types of defects: Blue Stain, Brown Stain, Hole, Knots, Pocket, Rot, Split, and Wane. We used all the samples available in our dataset. As illustrated in Table 3, the four species of wood contain different numbers of samples, and the number of samples also differs among the eight types of defects. We ran LBP with the same number of neighbors (P=8) and various radius values (R=1, 2, 3, 4). The measure used to examine the performance of the applied technique is the accuracy rate, which we used to decide whether LBP is suitable for detecting wood defects. As shown in Table 3, Getah has the highest accuracy, 61.1%, when P=8 and R=1, because the number of samples is almost balanced across most of its defect types. Table 3 also shows that increasing the radius R adversely affects the accuracy rate. This might be due to the small image size (60x60) and the resulting pixel loss. Table 3 shows the accuracy rate for each type of wood with the different radius values.
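One possible reading of the pixel loss mentioned above is that pixels closer than R to the image border have no complete neighborhood and therefore receive no LBP code. The source does not state how borders were handled, so the sketch below only illustrates that assumption for a 60x60 image; `valid_centers` is a hypothetical helper.

```python
def valid_centers(size=60, radius=1):
    """Number of pixels whose full radius-R neighborhood lies inside the image."""
    side = size - 2 * radius
    return side * side


for radius in (1, 2, 3, 4):
    kept = valid_centers(60, radius)
    print(radius, kept, f"{100 * kept / 3600:.1f}%")
```

Under this assumption the share of usable pixels falls from about 93% at R=1 to about 75% at R=4, so larger radii leave fewer codes per histogram on such small images.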

Experiments on Balanced Dataset
Because of the low accuracy rate obtained on the whole dataset in the first section, for every type of wood and every radius value, we assumed that the cause was the unbalanced dataset. We therefore sought to improve the accuracy rate by balancing the dataset and applying LBP with SVM again.
In the second section of our implementation, we rearranged the dataset to improve the performance of the feature extraction technique and raise the accuracy rate. We balanced the original dataset by giving each of the eight defect types the same number of samples. As illustrated in Table 4, each type of wood contains 400 samples, and each type of defect contains 50 samples of defect images. We repeated the same experiments with the same technique, the same multiclass classifier, and the same radius values. The accuracy rate improved compared with the first section. We obtained the highest accuracy rate for all the types of wood (Getah=67.5%, KSK=62.5%, Meranti=63.25%, and Merbau=67.3%) when P=8 and R=1. Table 4 also shows that increasing the radius R adversely affects the accuracy rate. This might be due to the small image size (60x60) and the resulting pixel loss. Table 4 shows the accuracy rate for each type of wood with the different radius values.

Confusion Matrix
The confusion matrix is a technique for highlighting the performance of a classification method. In this research, we used the confusion matrix on the balanced dataset to better understand the classification model. The confusion matrix summarizes the number of correct and incorrect predictions for each class. From the confusion matrix, we can identify which class (defect) is problematic for each species, that is, which classes lower the accuracy rate of our experiment. Fig. 4 shows the confusion matrix for Getah when P=8 and R=1; the accuracy is low for Pocket (33.3%) and Split (38.5%). Fig. 5 shows the confusion matrix for KSK when P=8 and R=1; the accuracy is low for Split (48.3%) and Wane (46.7%). Fig. 6 shows the confusion matrix for Meranti when P=8 and R=1; the accuracy is low for Pocket (47.6%). Fig. 7 shows the confusion matrix for Merbau when P=8 and R=1; the accuracy is low for Rot (44%) and Split (46.3%). Some classes have similar texture patterns and therefore might not be represented well by LBP alone, so other features might have to be added. Table 5 shows the class names displayed in the confusion matrix.
In the first section of our experiment, we applied LBP with the SVM multiclass model to the whole dataset, which contains 7487 sub-image samples. These samples are distributed unevenly among the four wood species and the eight types of defects. The first step of the implementation is to convert the RGB image to a grayscale image. Then, we select the parameter values (P=8 and R=1, 2, 3, 4) before applying LBP. Next, LBP is applied to extract the features of the training data. After that, the classification process classifies the data from these features and produces the accuracy rate used to measure the performance of LBP. The best result is obtained when P=8 and R=1. The accuracy rate of each type of wood was as follows: Getah=61.1%, KSK=57.95%, Meranti=51.81%, and Merbau=56.73%.
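The per-class figures read from a confusion matrix can be computed as in the short sketch below. It assumes that rows hold the true class and columns the predicted class, which the source does not specify; the labels are made up for illustration and are not results from this study.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count samples by (true class, predicted class); rows are true classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_accuracy(matrix):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


classes = ["Hole", "Pocket", "Split"]
# Made-up labels for illustration only.
y_true = ["Hole"] * 4 + ["Pocket"] * 3 + ["Split"] * 3
y_pred = ["Hole", "Hole", "Hole", "Split",
          "Pocket", "Split", "Split",
          "Split", "Split", "Pocket"]
cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(name, row)
print(per_class_accuracy(cm))
print(overall_accuracy(cm))
```

In this toy case Pocket is mostly predicted as Split, which is the kind of pairwise confusion that the per-class figures make visible while the overall accuracy hides it.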
In the second section of our experiment, we rearranged and balanced the dataset, which means that we gave all four species of wood the same number of samples and distributed the samples equally among the eight types of defects. The total number of sub-image samples is 1600. Every type of wood contains 400 samples, and each type of defect contains 50 samples. We applied the same procedure as in the first section. After we balanced the dataset and applied LBP with the classifier, the accuracy rate improved. The best result is obtained when P=8 and R=1. The accuracy rate of each type of wood was as follows: Getah=67.8%, KSK=62.5%, Meranti=63.25%, and Merbau=67.3%. The accuracy rate is a little higher when the samples are balanced. However, the accuracy is still considered moderate. This could be due to insufficient samples or to defects having closely similar texture patterns. Consequently, we recommend adding more samples in future work and trying other feature extraction techniques for texture representation. Finally, the objectives of this study have been accomplished: proposing a feature extraction technique using Local Binary Pattern (LBP) for wood defect classification, analyzing an appropriate number of neighbors and radius parameter for LBP, and evaluating the LBP feature extraction technique with a common classifier.
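As a quick arithmetic check, the per-species accuracies quoted above for P=8, R=1 can be averaged. The sketch uses an unweighted mean over the four species, which is an assumption about how the reported average was formed.

```python
# Per-species accuracy rates (%) quoted in the text for P=8, R=1.
unbalanced = {"Getah": 61.1, "KSK": 57.95, "Meranti": 51.81, "Merbau": 56.73}
balanced = {"Getah": 67.8, "KSK": 62.5, "Meranti": 63.25, "Merbau": 67.3}


def mean(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


print(f"unbalanced: {mean(unbalanced.values()):.2f}%")
print(f"balanced:   {mean(balanced.values()):.2f}%")
```

The balanced mean is about 65.2%, in line with the roughly 65% average reported for the balanced dataset, while the unbalanced mean is about 56.9%.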

Conclusion
In this research, our goal was to explore how to differentiate each type of wood defect using LBP. We applied the LBP feature extraction technique with an SVM multiclass model as the classifier to detect defects on wood. The measure used to examine the applied technique was the accuracy rate. Our experiment was divided into two sections. In the first section, LBP was applied with an SVM multiclass classifier to the unbalanced dataset. The second section applied the same technique and classifier to a balanced dataset to improve on the accuracy rate obtained with the unbalanced dataset. Both experiments were carried out on four wood species, i.e., Getah, KSK, Meranti, and Merbau, and eight types of defects, i.e., Blue Stain, Brown Stain, Hole, Knots, Pocket, Rot, Split, and Wane. The best classification result was achieved on the balanced dataset with P=8 and R=1. The average classification accuracy is 65%, which indicates moderate classification performance. According to the confusion matrices, Pocket, Split, Wane, and Rot were the classes most often confused with other classes, which contributes to their low accuracy. This may be due to the similar appearance of these defects. Future work could be directed towards using other LBP variants or combining LBP with other texture feature extraction techniques to increase the classification performance.