Prediction of silicon content in the hot metal using Bayesian networks and probabilistic reasoning

The blast furnace is a metallurgical reactor that operates in countercurrent, with the metallic charge descending and the gases ascending. Cast iron is the product formed by reducing metal oxides, which react chemically with reducing agents such as carbon monoxide (CO) and hydrogen gas (H2). In this process, the air is preheated and then blown through the tuyeres, producing carbon dioxide (CO2), which reacts with the coal to form carbon monoxide. The moisture in the air reacts with the coke and the injected pulverized coal (PCI) to form carbon monoxide and hydrogen [1]-[4].

input into the furnace and, in some cases, may indicate excess coke in the furnace. Since coke costs predominate in the production of cast iron, tighter control of silicon content has economic advantages. To improve production conditions, several models have been proposed to simulate blast furnaces and predict the effects of changes in production parameters [11]-[14].
Solutions based on neural networks have become very popular because of their versatility and because their answers become more reliable as the network receives new data during the training process. The application of neural networks in steel production is recent, and there are few works on this topic, mainly for the control of impurities such as silicon [17]-[19].
In this context, the main objective of this work was to build the source code of a Bayesian artificial neural network and to determine the number of hidden-layer neurons that gives the best results for predicting the silicon content in cast iron, testing hidden layers with 10, 20, 25, 30, 40, 50, 75, and 100 neurons.

Method
The database consists of 75 variables (divided into 7 groups) covering 3.5 years, or 1150 operating days, for a total of 86,250 cells. Following the literature, the input variables were selected based on their influence on silicon incorporation into the hot metal. The silicon prediction model uses variables such as theoretical flame temperature, blowing pressure, slag rate, coke rate, and PCI rate. These were the most important variables considered in the construction of the model, although the blast furnace has a large number of process variables that affect the silicon content.
Selecting the variables for a database is not a trivial task, since including too many secondary variables can make the training and learning of the neural network difficult. On the other hand, the accuracy of the artificial neural network may deteriorate if important variables are omitted when training the model, which may lead to overestimating the data. The variables selected for the database were defined based on the experience of operators, technicians, and engineers in the metallurgical sector. Thus, the selected database considered seven groups that affect the operation of the blast furnace and the incorporation of silicon into the hot metal. The considered division was (1) blowing air; (2) blast furnace gas; (3) thermal control; (4) fuels; (5) iron ore; (6) hot metal; (7) blast furnace slag.
Blowing air: control of the blowing air (pressure, velocity, volume) is essential for process control, as it provides information on deviations in reactor operation.
Blast furnace gas: control of the blast furnace gas is important to control carbon consumption (coal and PCI), specific airflow, and permeability.
Thermal control: this group of variables is essential to control the reactor performance and the quality of the hot metal, since the dissolved silicon in the cast iron is directly proportional to the reactor's operating temperature and depends on the quality of the metallic charge fed through the blast furnace top.
Fuels: metallurgical coke and PCI combine with oxygen to allow the reactor to reach temperatures of about 1,500 °C to smelt iron ore.
Iron ore: quality control of the iron ore and fluxes is important to reduce unwanted impurities such as silicon.
Hot metal and slag: blast furnace slag is obtained by melting and separating metal slags and consists mainly of stable oxides such as (MgO), (CaO), (Al2O3), and (SiO2), which account for up to 95% by weight. Hot metal/slag control is important, as impurities should preferably be part of the chemical composition of the blast furnace slag [15].

Outliers and probabilistic reasoning
In this work, two techniques were used to identify and remove outliers to create a database corresponding to the normal operation of the blast furnace. The first technique considered the maintenance events and operational instabilities as "technical outliers", since these events do not represent the "normal operation" of the reactor and would affect the learning of the neural network.
The second technique used the principle of exploratory data analysis and consisted of locating the values classified as severe outliers. A Gaussian (normal) distribution was fitted to each input variable, and all data located between -3σ and -4σ or between 3σ and 4σ were identified. In total, 345 data points were classified as severe outliers and removed from the database. Fig. 1 illustrates the region of severe outliers [16].
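The sigma-based screening described above can be sketched as a z-score filter. This is a minimal sketch under stated assumptions, not the authors' code: each variable is standardized with its own sample mean and standard deviation, any row with a value beyond 3σ is dropped, and the function name `remove_severe_outliers` and the synthetic data are hypothetical.

```python
import numpy as np

def remove_severe_outliers(data, threshold=3.0):
    """Drop rows in which any variable lies more than `threshold`
    standard deviations from that variable's mean (assumed rule)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)
    # Guard against constant columns, whose standard deviation is zero.
    std = np.where(std == 0.0, 1.0, std)
    z = np.abs((data - mean) / std)
    keep = (z <= threshold).all(axis=1)
    return data[keep], keep

# Synthetic example (hypothetical data): 200 operating days, 3 variables,
# with one value injected far outside the normal operating range.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
sample[10, 1] = 25.0
cleaned, mask = remove_severe_outliers(sample)
```

Returning the boolean mask alongside the cleaned array makes it possible to apply the same row selection to the output variable (silicon), so inputs and targets stay aligned.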
Measures of central tendency include mean, median, and mode, while measures of variability include standard deviation, maximum, and minimum values. Table 1 to Table 7 present the descriptive statistics of the input variables, while Table 8 presents the descriptive statistics of the output variables (silicon).
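As an illustration of these descriptive statistics, the sketch below computes them for a single variable. The function name `describe`, the rounding used to obtain a mode for continuous data, and the sample values are assumptions made for the example and are not taken from the paper's tables.

```python
import numpy as np

def describe(values, decimals=2):
    """Central tendency and variability measures for one variable."""
    values = np.asarray(values, dtype=float)
    # For continuous data the mode is taken over rounded values (assumption).
    rounded = np.round(values, decimals)
    uniq, counts = np.unique(rounded, return_counts=True)
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "mode": float(uniq[np.argmax(counts)]),
        "std": float(values.std(ddof=1)),
        "min": float(values.min()),
        "max": float(values.max()),
    }

# Hypothetical silicon contents in % (not data from the paper).
silicon = [0.40, 0.45, 0.45, 0.50, 0.60]
stats = describe(silicon)
```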

Neural network architecture
According to the literature, it is ideal to use 85% of the data to train and validate the neural network's learning process and to use the remaining 15% only once the network shows satisfactory performance in the initial phase. The test data are used to evaluate the generalization and learning ability of the neural network. It is important to test the neural network to ensure that the results obtained are consistent with the training steps [17]-[22]. The training phase consists of presenting the ANN with a set of data for learning, adjusting the synaptic weights, and subjecting them to the activation functions of the neurons. The database was divided into 3 groups: Training, Testing, and Cross Validation. Table 9 illustrates the division of the database.
In this study, feedforward-type artificial neural networks were used. The signal always propagates forward from input to output, and the neurons of one layer are not connected to the neurons of the previous layer, so the output information is not fed back to the network's inputs [23]. The artificial neural networks have 10, 20, 25, 30, 40, 50, 75, and 100 neurons in the hidden layer. Fig. 2 illustrates the architecture of the artificial neural network.
In this study, the Bayesian regularization algorithm was used because building a neural network to predict the silicon content in hot metal is a difficult task. It is necessary to structure hierarchical models and control the complexity of learning. In other words, it is necessary to know how many parameters and hyperparameters are needed to obtain an adequate representation of the algorithm that generates the data [24].
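The data division and the feedforward architecture described in this section can be sketched as follows. This is an illustrative sketch only: it uses a simple 85/15 random split, a single hidden layer with a tanh activation and a linear output (the activation functions are an assumption, as the text does not state them), untrained random weights, and synthetic inputs. The names `split_indices` and `forward` are hypothetical.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.85, seed=0):
    """Shuffle row indices and split them into training and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Feedforward pass: input -> hidden (tanh) -> output (linear).
    No connection feeds the output back to the inputs."""
    hidden = np.tanh(x @ w_hidden + b_hidden)
    return hidden @ w_out + b_out

n_samples, n_inputs = 1000, 75  # synthetic size; 75 inputs as in the database
train_idx, test_idx = split_indices(n_samples)

rng = np.random.default_rng(1)
x = rng.normal(size=(n_samples, n_inputs))

# One candidate network per hidden-layer size considered in the study.
predictions = {}
for n_hidden in (10, 20, 25, 30, 40, 50, 75, 100):
    w_hidden = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
    b_hidden = np.zeros(n_hidden)
    w_out = rng.normal(scale=0.1, size=(n_hidden, 1))
    b_out = np.zeros(1)
    predictions[n_hidden] = forward(x[train_idx], w_hidden, b_hidden, w_out, b_out)
```

Keeping the test indices fixed across all hidden-layer sizes means every candidate network is later compared on the same held-out rows.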
This problem becomes complicated when we have limited training data. A complex model usually fits the training data very well, but this does not necessarily mean that the error in testing the algorithm will be small. Very simple or very complex models give poor approximations. In this sense, it is necessary to establish a measure (based on a principle) of the neural models' complexity to have a criterion that allows the preference of certain models [25].
Since the training error alone does not indicate which neural model generalizes best, the complexity problem was first addressed by dividing the available data into 3 sets: the training set and 2 other sets used for testing and validation [26]-[29].
The Bayesian regularization approach is suitable for complex problems such as the blast furnace simulation because it deals with the complexity problem in a very different way and, among other things, it allows more efficient use of the available data since the validation data set is not needed and can be used as part of the training set [30].
In summary, the Bayesian method allows data modeling at two levels of inference. The first level involves the computation of the neural network parameters and hyperparameters, which is typically one of the tasks in fitting the model to the training data. The second level involves model comparison, which favors certain models based on their complexity, as described in Fig. 3. Fig. 3 illustrates the parts involved in processing the collected and modeled data. The two red boxes denote the two steps of the Bayesian inference process: the first box is the inference of the parameters and hyperparameters of the neural model from the data, and the second is the comparison of models. Occam's razor principle is applied at this second level so that unnecessarily complex models are penalized [31].
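For reference, the first level of inference is often written in the following form in the Bayesian regularization literature. This is a sketch of the commonly used formulation, assumed here rather than taken from the paper: the symbols F, E_D, E_W, α, β, γ, N, and H are introduced for illustration, while Yreal and Yneural follow the notation of the model validation section.

```latex
% Regularized objective minimized over the network weights w
F(w) = \beta E_D + \alpha E_W, \qquad
E_D = \sum_{i=1}^{n} \left( Y_{\mathrm{real},i} - Y_{\mathrm{neural},i} \right)^2, \qquad
E_W = \sum_{j=1}^{N} w_j^2

% Re-estimation of the hyperparameters at the minimum w_MP,
% where H is the Hessian of F at w_MP and N is the number of weights
\gamma = N - 2\alpha \, \mathrm{tr}\!\left( H^{-1} \right), \qquad
\alpha = \frac{\gamma}{2 E_W(w_{\mathrm{MP}})}, \qquad
\beta = \frac{n - \gamma}{2 E_D(w_{\mathrm{MP}})}
```

In this formulation γ is interpreted as the effective number of parameters. If γ stops growing as neurons are added, the extra weights are not being used, which is one way the approach controls complexity without a separate validation set.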

Model validation
The usual method for evaluating a neural network model is to use the MSE (mean square error) results, because the lower the values found, the better the predictive capacity. Equation (1) illustrates the MSE between the actual and predicted values [32].
Pearson's correlation coefficient (R) is also used to validate the model. However, this parameter evaluates the linear relationship between input and output variables, i.e., the correlation coefficient R does not evaluate the quality of the neural model, but the mathematical correlation between the neural network response and the target values of the database [33]. Pearson's correlation coefficient (R) is presented in Equation (2).
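Equations (1) and (2) are not reproduced in this text. Assuming the standard definitions, and using the symbols defined in the next paragraph (the bars denote sample means, a notation introduced here), they can be written as:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n}
  \left( Y_{\mathrm{real},i} - Y_{\mathrm{neural},i} \right)^2 \tag{1}

R = \frac{\sum_{i=1}^{n}
      \left( Y_{\mathrm{real},i} - \bar{Y}_{\mathrm{real}} \right)
      \left( Y_{\mathrm{neural},i} - \bar{Y}_{\mathrm{neural}} \right)}
    {\sqrt{\sum_{i=1}^{n} \left( Y_{\mathrm{real},i} - \bar{Y}_{\mathrm{real}} \right)^2
           \sum_{i=1}^{n} \left( Y_{\mathrm{neural},i} - \bar{Y}_{\mathrm{neural}} \right)^2}} \tag{2}
```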
Where (n) is the number of observations, (Yneural) is the value calculated by the artificial neural network, and (Yreal) is the value measured during the blast furnace operation. In general, Pearson's correlation coefficient (R) varies from -1 to 1; the closer it is to 1, the stronger the positive correlation between the neural network response and the target values. The correlation classification as a function of the obtained coefficient is shown in Table 10.

Table 10. Correlation classification as a function of R
R range             Classification
...                 Strong correlation
0.7 < R ≤ 0.9       Very strong correlation
0.9 < R ≤ 1.0       Extremely strong correlation
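Both validation metrics can be computed directly from the measured and predicted values. The following is a minimal sketch assuming the standard definitions of MSE and Pearson's R; the function names and the sample values are hypothetical and are not results from the paper.

```python
import numpy as np

def mse(y_real, y_neural):
    """Mean square error between measured and predicted values."""
    y_real = np.asarray(y_real, dtype=float)
    y_neural = np.asarray(y_neural, dtype=float)
    return float(np.mean((y_real - y_neural) ** 2))

def pearson_r(y_real, y_neural):
    """Pearson correlation between measured and predicted values."""
    y_real = np.asarray(y_real, dtype=float)
    y_neural = np.asarray(y_neural, dtype=float)
    d_real = y_real - y_real.mean()
    d_neural = y_neural - y_neural.mean()
    denom = np.sqrt(np.sum(d_real ** 2) * np.sum(d_neural ** 2))
    return float(np.sum(d_real * d_neural) / denom)

# Hypothetical silicon contents in % (illustrative only).
y_real = [0.40, 0.50, 0.60, 0.70]    # measured
y_neural = [0.42, 0.48, 0.63, 0.69]  # predicted by the network

error = mse(y_real, y_neural)
r = pearson_r(y_real, y_neural)
```

Reporting both values is useful because a high R only shows that predictions and targets move together linearly, whereas the MSE measures how far apart they actually are.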

Results and discussions
It was found that the larger the number of neurons in the hidden layer, the lower the Pearson correlation coefficient. However, the neural networks with 50, 75, and 100 neurons performed better in predicting silicon when the content of this element was higher than 0.5%, despite a lower mathematical correlation (1% on average). Table 11 shows that the neural networks gave results similar to the dataset. The treatment and removal of outliers before modeling was crucial for the performance of the neural networks. The models working in isolation show good and converging results for predicting silicon content in hot metal. In the analysis of Fig. 4, the MSE value obtained when training the neural network decreased as the number of neurons increased. In the present study, the neural networks were configured to be trained for up to 1000 epochs. The network with 100 neurons required more epochs (892 epochs) to achieve convergence. Pettersson et al. [8] argue that neural networks for predicting silicon in hot metal behave more erratically, which hinders the convergence of results. This was not observed in this study, probably because the large size of the database helped the neural networks to converge and present excellent results.
Complementarily, the Pearson correlation coefficient (R) was calculated for each neural network, and the results are better than those reported in the literature [2]-[8] and [22]-[34]. The analysis of Fig. 5 shows an excellent mathematical correlation between the database (target) and the values predicted by the neural network. Based on these results, a comparison was made with the other models mentioned in this paper to evaluate the performance of the neural network with 30 neurons, as shown in Fig. 6 and Table 12. It can be concluded that the results of this work were superior to the models reported in the literature, indicating that the use of a Bayesian regularization algorithm in complex modeling situations is a beneficial alternative to refine the results.
From a technical standpoint, the silicon content in hot metal is an important quality parameter that needs to be monitored, as it serves as a thermal indicator for the blast furnace. Low silicon levels indicate a possible cooling of the reactor and require countermeasures to avoid serious problems in the operation. Since the silicon in the process comes from the raw materials, especially from the coke ash and the gangue of the metallic charge, the use of raw materials with low variations in composition is one of the ways to control the content obtained in production and keep it as constant as possible at its optimal level. It is also worth noting that excess silicon in the hot metal requires a greater amount of calcium oxide (CaO) in the steel plant to perform refining, which leads to a greater amount of slag and higher costs. Therefore, silicon content prediction models are a useful tool to work with lower safety margins, optimize fuel consumption, and improve the efficiency of the steelmaking process [35]-[39].
Silicon is incorporated into the hot metal in 2 ways. The first begins with the formation of silicon monoxide (SiO) from the silicon in the coke ash in an area of the blast furnace known as the raceway. The gas [SiO(g)] formed in this area rises and is dissolved in the slag as silicon dioxide (SiO2) or in the hot metal as silicon. The second is the reduction of silicon monoxide [SiO(g)] by carbon dissolved in the hot metal [40]-[43]. Silicon in the hot metal may also be reoxidized when the cast iron interacts chemically with iron oxide (FeO) dissolved in the slag, according to Equations (4), (5), and (6).
International Journal of Advances in Intelligent Informatics ISSN 2442-6571 Vol. 7, No. 3, November 2021, pp. 268-281
The main parameters affecting the rate of SiO gasification and the reduction of SiO to dissolved silicon, and thus the silicon content in hot metal, are: (1) a high raceway adiabatic flame temperature (RAFT), which produces a larger amount of gaseous silicon monoxide, while low RAFT values decrease the temperature of the hot metal; (2) an increase in total gas pressure, which decreases the rate of formation of gaseous silicon monoxide; (3) the wettability between coke and slag, where a decrease in wettability decreases the silicon incorporation rate; (4) the chemical composition of the slag, where an increase in basicity increases the oxidation rate of SiO to SiO2; (5) an increase in the heat flow ratio (HFR), which lowers the position of the cohesive zone and thereby decreases the oxidation rate of gaseous silicon monoxide [44]-[47].
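The incorporation routes and the reoxidation step described in this section are commonly written as the reactions below. These are standard textbook forms given as an assumption; they are not necessarily the paper's numbered equations. Square brackets denote species dissolved in the hot metal and round brackets denote species dissolved in the slag.

```latex
% SiO gas generation from coke ash silica in the raceway
\mathrm{SiO_2\,(coke\ ash)} + \mathrm{C\,(s)} \rightarrow \mathrm{SiO\,(g)} + \mathrm{CO\,(g)}

% Reduction of SiO gas by carbon dissolved in the hot metal
\mathrm{SiO\,(g)} + [\mathrm{C}] \rightarrow [\mathrm{Si}] + \mathrm{CO\,(g)}

% Reoxidation of dissolved silicon by FeO in the slag
[\mathrm{Si}] + 2\,(\mathrm{FeO}) \rightarrow (\mathrm{SiO_2}) + 2\,[\mathrm{Fe}]
```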
The knowledge of neural networks is stored in the synaptic weight of neurons (w), so that the importance of a given variable is related to the weight of that neuron in the model. Thus, a high weight (w) has a greater impact on the output of a neuron than a neuron with a low weight. Therefore, the most important variables for each model are those with the highest weight (w). As for the silicon prediction model, the most important variables were sinter and blowing air. Sinter contains SiO2, which serves as a Si source for hot metal, explaining its influence on the model. As for blowing air, higher values favor more blowing and affect the thermal level of the blast furnace, which affects the silicon content in the hot metal [48].
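One simple way to turn synaptic weights into a variable ranking is sketched below. The paper does not state how the weights were aggregated, so this assumes a basic measure: the sum of the absolute input-to-hidden weights of each input, normalized to a share. The function name, the variable names, and the weight values are hypothetical.

```python
import numpy as np

def rank_inputs_by_weight(w_hidden, names):
    """Rank inputs by the summed absolute weight of their connections
    to the hidden layer (assumed importance measure)."""
    w_hidden = np.asarray(w_hidden, dtype=float)
    importance = np.abs(w_hidden).sum(axis=1)  # one value per input
    share = importance / importance.sum()      # normalize to sum to 1
    order = np.argsort(share)[::-1]            # largest share first
    return [(names[i], float(share[i])) for i in order]

# Hypothetical weights: 3 inputs (rows) connected to 2 hidden neurons (columns).
names = ["sinter", "blowing air", "slag rate"]
w_hidden = np.array([
    [0.9, -0.7],
    [0.6, 0.5],
    [0.1, -0.1],
])
ranking = rank_inputs_by_weight(w_hidden, names)
```

A ranking of this kind is only meaningful when the inputs were scaled to comparable ranges before training, since otherwise the weight magnitudes also reflect the units of each variable.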
Other variables that also affect the silicon content are the O2 enrichment, the pressure, and the blowing rate in the tuyeres. These variables can affect the shape, thickness, and position of the cohesive zone and the conditions for SiO formation and, consequently, the silicon content in the hot metal. For example, low permeability may indicate a thicker cohesive zone. An increase in O2 enrichment tends to increase the temperature of the blast furnace and decrease the amount of nitrogen injected, thereby increasing the thermal level and favoring the permeability of the blast furnace. Thus, a higher or thicker cohesive zone tends to increase the silicon content in the hot metal, as does a higher thermal level, which favors the conditions for the incorporation of silicon into the hot metal; both may be influenced by the variables above [49]-[51].
In conclusion, the silicon contained in the iron ore loaded through the upper part of the reactor is released into the hot metal and the slag, so that using the binary basicity (CaO/SiO2) and by calculating the mass balance, it is possible to determine the silicon content in the hot metal, which underlines the importance of controlling the conditions affecting the basicity of the slag and the quality of the cast iron produced.
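The binary basicity mentioned above is the ratio of the CaO and SiO2 contents of the slag. The sketch below shows only this ratio, not the full mass balance; the function name and the slag composition are hypothetical values chosen for illustration.

```python
def binary_basicity(cao_pct, sio2_pct):
    """Binary basicity of the slag: %CaO divided by %SiO2 (by weight)."""
    if sio2_pct <= 0:
        raise ValueError("SiO2 content must be positive")
    return cao_pct / sio2_pct

# Hypothetical slag composition in weight percent (not data from the paper).
b2 = binary_basicity(cao_pct=40.0, sio2_pct=35.0)
```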

Conclusion
Regarding simulation methods for predicting process variables, the continuing growth of computing capacity, with ever cheaper devices of larger capacity, is driving the development of more complex algorithms with better results, as is the case with neural networks, and is thereby enabling models that simulate complex processes. Data processing and outlier identification are essential to ensure that models converge. The choice of interdependent variables, such as the composition of the top gas and the basicity of the slag, made the model more effective. The analysis of the synaptic weights of the ANN confirms that the theoretical flame temperature, blowing pressure, and coke rate positively affect the silicon content in hot metal. This is consistent with the literature, as the theoretical flame temperature increases the channel's temperature and favors the rate of SiO gas formation, while the blowing pressure and the coke rate favor the incorporation of silicon into the cast iron and increase the heat input of the blast furnace. On a smaller scale, synaptic weight analysis showed that slag rate had a small effect on silicon content, suggesting that this variable does not directly contribute to the mechanism of silicon incorporation into cast iron. Also on a smaller scale, the analysis of synaptic weights showed that the production rate of coke and hot metal (increase in daily production) favors an increase in the sulfur content of hot metal, while (CaO) and (MgO) vary inversely. Although all neural networks with 20, 25, and 30 neurons produced converged and