Modelling a Cement Precalciner by Machine Learning Methods

This work is a feasibility study of modelling the calcination process in a cement precalciner using machine learning algorithms. Calcination plays a significant role in determining the clinker quality, energy demand and CO2 emissions of a cement production facility. Because the calcination process is complex, modelling the precalciner system reasonably well has always been a challenge. This study attempts to find a feasible alternative that answers this challenge. Six machine learning algorithms were tested to predict three output variables: 1) the apparent degree of calcination, 2) the CO2 molar fraction (dry basis) and 3) the water molar fraction in the precalciner outlet stream. Fifteen input variables were used to train the algorithms; their values were obtained from a large number of simulated datasets generated by applying mass and energy balance to the precalciner system. Several machine learning algorithms predicted better than classical linear regression, and the artificial neural network (ANN) showed the best performance for all three output variables.


Cement manufacturing and calcination
Cement is one of the most frequently used materials in building infrastructure. Cement manufacturing is a globally important industrial sector that is highly energy intensive and is responsible for a considerable share of global CO2 emissions. The dominant use of carbon-intensive fuels, such as coal, in clinker making and the calcination process accounts for a large part of the CO2 emissions of the cement industry. Calcination in cement manufacturing is a complex industrial process involving mass transfer, heat transfer, and physical and chemical reactions, in which materials are subjected to high temperatures to bring about chemical and physical change. Process emissions from the calcination of limestone make up 60% of the emissions, with 0.5 tonnes of CO2 emitted per tonne of clinker produced (IEA, 2008). The endothermic reaction at 950 °C in the calciner demands about 1700 MJ per tonne of clinker, which is around 50% of the total energy (WWFI, 2008). Figure 1 shows a schematic diagram of a typical dry-based cement manufacturing facility. Most modern cement facilities are equipped with a precalciner system located between the preheater and the rotary kiln. In the production process, raw materials, typically 80-90% limestone, are prepared by crushing, grinding and adding chemicals. This preprocessed raw material (referred to as 'raw meal') is then preheated to 750 °C and sent to the precalciner (also called the calciner).
The precalciner initiates the chemical decomposition of limestone (CaCO3) into lime (CaO) and carbon dioxide (CaCO3 ↔ CaO + CO2). About 90% of the raw meal is calcined in this unit (GmbH, 2016). The precalciner system provides direct combustion through solid-gas heat exchange, dispersing and suspending the raw meal powder in an airflow. The precalcined meal then enters the rotary kiln, where the remaining calcination is completed. Clinker formation takes place in the kiln, and finally the clinker is sent to the clinker cooler.
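As a rough worked illustration of the decomposition reaction above, the mass of CO2 released by fully calcining pure CaCO3 follows from the stoichiometry. The molar masses used here are standard values and are not taken from this study:

$$\frac{m_{\mathrm{CO_2}}}{m_{\mathrm{CaCO_3}}} = \frac{M_{\mathrm{CO_2}}}{M_{\mathrm{CaCO_3}}} = \frac{44.01\ \mathrm{g/mol}}{100.09\ \mathrm{g/mol}} \approx 0.44$$

Under this idealisation, one tonne of pure CaCO3 releases roughly 0.44 tonnes of CO2 and leaves about 0.56 tonnes of CaO. Real raw meal contains other components, so this is an idealised figure and not a plant value.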
The stability and effectiveness of the calcination process directly affect the final clinker quality, the smooth operation of the subsequent rotary kiln and the energy requirement of the pyroprocessing unit. The exothermic process of fuel combustion and the endothermic process of carbonate decomposition in the raw meal occur simultaneously in the precalciner. Optimum operation of the precalciner conserves energy and reduces the emissions associated with both the precalciner and the rotary kiln. The calcination degree, which is an indicator of precalciner performance, is affected by several parameters such as the temperature inside the calciner, the residence time of the raw meal in the system, solid-gas separation, the dust circulation effect and the kinetic behaviour of the raw materials (Mikulčić et al., 2012; Mohammadhadi, 2018).
Calcination degree is expressed in two ways: as the true calcination degree or as the apparent calcination degree (Tokheim, 1999). The apparent degree of calcination, ηapp, referred to as ADOC in this paper, is used as an indicator to monitor the calcination process in the cement production line because the true calcination degree cannot be measured easily. However, the apparent degree of calcination is not easy to measure online either. Instead, samples are extracted from the process line and analysed offline in the laboratory. The interval between two subsequent analyses can be one hour or even several hours, depending on the available laboratory capacity. The precalciner outlet temperature is therefore used as the primary controlled variable to control the degree of calcination. Oxygen and carbon monoxide levels are also controlled because they are indicators of fuel combustion and of process stabilisation, respectively (Osmic et al., 2020).
Figure 2 shows the input and output variables of the cement precalciner system. These variables belong either to the basic input streams (i.e. preheated raw meal, fuel and tertiary air) or to the primary output stream (i.e. calcined meal). A description of the symbols can be found in Table 1. Some of these variables can be measured online by appropriate sensors, while others are difficult to measure. The latter are computed from the available measurements on the basis of appropriate assumptions.

A modelling approach to assessing the performance of the precalciner
Several researchers have attempted to develop relationships between the variables of the precalciner process using soft and hard modelling approaches. The coupling, time-varying delay and nonlinearity of the precalciner system make it hard to establish an exact mathematical model for performance indicators such as ADOC. Mass and energy balance (MEB) provides a fundamental approach for deriving correlations that determine a required process output. The authors of this study have experience in employing MEB to model the precalciner.
When some input parameters are unknown or cannot be measured directly, an iterative procedure is used during the MEB calculation. Machine learning (ML) methods are an alternative to MEB in which this iterative process can be skipped. ML has shown promising results in modelling complex and nonlinear manufacturing processes that deal with noisy, limited and nonintegrated data. Algorithms such as Support Vector Machine (SVM) and Artificial Neural Network (ANN) have proven their capabilities in this regard. Gang and Hui (2010) developed a model using Least Squares Support Vector Machine (LS-SVM) with a radial basis function (RBF) kernel to determine the apparent degree of calcination. The furnace temperature and pressure, the outlet temperature and pressure of the calciner, the temperature of the tertiary air and the lay-off quantity of cement raw meal were used as inputs to the model. Griparis et al. (2000) proposed adaptive, robust and fuzzy control to achieve the desired degree of precalcination of the raw meal and a low carbon monoxide level while stabilising the precalcination process, considering the multivariable dependencies in the precalciner system. Yang et al. (2010) developed a back-propagation neural network (BPNN) and a radial basis function neural network (RBFNN) to assess the kiln temperature and oxygen content based on five variables: coal flow to the kiln, coal flow to the precalciner, raw meal flow, rotary speed of the kiln and negative pressure at the preheater exit.
The performance of machine learning algorithms depends highly on the quality of the input data. Therefore, collecting and preparing the training dataset is an important step in the modelling process. Training data can be provided in three ways: 1) simulated data, 2) actual process data and 3) designed experiment data. Simulated data are generated by theoretical models such as statistical models and computer simulations. Actual process data are randomly selected raw process data; many manufacturing companies hold such historical data in their databases. Designed experiment data can be obtained using a Taguchi or Design of Experiment (DOE) approach. Among these three options, training and optimizing a model on a large number of inexpensive simulated data points and then testing it with a smaller dataset of process data is a cost-effective approach.
This feasibility study aims to provide an alternative to the conventional mass and energy balance approach for modelling the precalciner in a cement manufacturing process. Simulated data from MEB calculations were used to train, validate and test different machine learning models that predict the apparent calcination degree and the molar fractions of water and CO2 (dry basis) in the precalciner output. These outputs were predicted from known values of fifteen input variables.

Input and output data
The first phase of the modelling work in this study was selecting the input and output variables for the models. These variables are listed in Table 2, together with their maximum, minimum, mean and standard deviation. The dataset included 20543 samples. The full-factorial design approach, a widely used experimental design, was used to generate the synthetic input data matrix. The output data matrix was then obtained by applying mass and energy balance to the precalciner for these inputs. The system boundary of the model is shown in Figure 1.
Table 1 shows the list of algorithms that were trained on the dataset. These algorithms fall under six regression model categories: linear regression, SVM, regression tree, ensemble, Gaussian process regression (GPR) and ANN. There are different algorithm types (19 in total) under the first five categories, and all of them were trained on the dataset. The ANN was trained with the Levenberg-Marquardt backpropagation algorithm, using one hidden layer with 20 neurons. The number of neurons and hidden layers for the ANN model was chosen arbitrarily. Figure 3 illustrates the ANN network architecture with its inputs and outputs.
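A full-factorial input matrix of the kind described above can be sketched as follows. This is a minimal illustration only: the variable names and levels are hypothetical placeholders, not the fifteen inputs or ranges of Table 2, and the mass and energy balance step that turns each row into outputs is not reproduced here.

```python
from itertools import product

# Hypothetical variables and levels, for illustration only.
# The study's actual fifteen inputs and their ranges are given in Table 2.
levels = {
    "raw_meal_temperature_C": [700.0, 750.0, 800.0],
    "fuel_feed_rate_t_per_h": [8.0, 10.0],
    "tertiary_air_temperature_C": [800.0, 900.0],
}


def full_factorial(levels):
    """Return one dict per combination of every level of every variable."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


design = full_factorial(levels)
print(len(design))  # number of rows = product of the level counts
```

Each row of `design` would then be passed to the balance calculation to obtain the corresponding output row. The number of rows grows multiplicatively with the number of levels per variable, which is why simulated data are a convenient source for such designs.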

Methodology
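Three statistical indicators were used to evaluate model performance: mean absolute error (MAE), root mean square error (RMSE) and coefficient of determination (R²). They were calculated as shown in Equations 1 to 3.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{1}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{3}$$
Here $\hat{y}_i$ is the value estimated by the model, $y_i$ is the actual value of the response (MEB-based simulation data), $\bar{y}$ is the mean of the actual values, and $n$ is the number of samples in the dataset.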
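The three indicators can be computed as in the following sketch, which assumes their standard textbook definitions. The function names and the sample values are illustrative and are not taken from the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration, not data from the study.
actual = [0.90, 0.92, 0.94, 0.96]
predicted = [0.91, 0.92, 0.93, 0.97]
print(mae(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

MAE and RMSE carry the units of the output variable, while R² is dimensionless, so R² is the easier indicator to compare across the three outputs.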

Results and Discussion
As mentioned earlier in Table 1, there were 5 regression model categories which were trained from the dataset. For each category, the model which gave the minimum RMSE was selected to predict the three output variables of apparent calcination degree, CO2 molar fraction and H2O molar fraction. The summary of their statistical performance is shown in Table 3. It also shows the ANN model results.
In addition, Table 3 shows the result of the classical linear regression model to indicate how this method deviates from the others. For predicting the apparent calcination degree, ANN gives the best results, while the GPR (rational quadratic) method also performs well. Both the ensemble bagged tree and the classical linear regression method give poor predictions. The ANN model also performed best in relating the inputs to the CO2 and H2O molar fractions. The stepwise linear regression algorithm was also successful for all three outputs, but it required considerably more computational time than SVM and regression trees. SVM (medium Gaussian) and regression tree (fine) gave the third-best results for the CO2 and H2O molar fraction predictions, respectively. Training of the Gaussian process regression algorithms was stopped for the CO2 and H2O models because of the high computational time.
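The per-category selection described above, keeping the variant with the lowest RMSE in each regression category, can be sketched as follows. The variant names and RMSE values are placeholders for illustration only; they are not results of this study, which are reported in Table 3.

```python
# Placeholder RMSE values keyed by (category, variant).
# These are NOT results from the study; see Table 3 for the real figures.
rmse_by_model = {
    ("SVM", "linear"): 0.030,
    ("SVM", "medium Gaussian"): 0.012,
    ("regression tree", "fine"): 0.015,
    ("regression tree", "coarse"): 0.040,
}


def best_per_category(rmse_by_model):
    """Return {category: (variant, rmse)} with the lowest RMSE per category."""
    best = {}
    for (category, variant), score in rmse_by_model.items():
        if category not in best or score < best[category][1]:
            best[category] = (variant, score)
    return best


selected = best_per_category(rmse_by_model)
for category, (variant, score) in selected.items():
    print(category, variant, score)
```

Selecting on RMSE alone ignores training time, which the discussion treats as a separate consideration.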
Figures 4 and 5 illustrate the prediction results for two different models. Figure 4 compares the ANN model predictions with the MEB-based simulated data. The ANN predictions show a fit with R² = 1, and because of the large number of samples tested, the scatter of the data along the 1:1 line is not visible. Therefore, a small section of the x- and y-axes was magnified to show the data swarm around the fitted line. The results show that the ANN model can effectively formulate the relationship between the 15 input parameters and the three output properties selected in this study. Figure 5 compares the classical linear regression model with the MEB-simulated data. It performs worst among the algorithms tested on this dataset.
Each machine learning algorithm has advantages and disadvantages, and its performance depends heavily on the type of data. The theory behind these algorithms is not covered in this paper and can be found in the literature, for example in (Shwartz and David, 2014). In general, SVM and regression trees are said to have fast prediction and training speeds, but they are suited to smaller problems and are prone to overfitting (Bonaccorso, 2017). Ensemble methods give high accuracy and performance for small and medium-sized datasets, but tuning is required. The Gaussian process is effective for both regression and classification: it is a probability distribution over possible functions and can deal effectively with data uncertainty (Irwin, 1997). The most critical drawback of GP regression is its high computation time; in this study, training of the GPR algorithms was terminated for the H2O and CO2 molar fraction predictions. The advantage of modelling with an ANN is that the model can be established directly from the input and output data of the application when little prior knowledge of the application is available. It is suitable for highly nonlinear and uncertain systems. An ANN model also has good online correction capability (Abiodun, Jantan et al., 2018), but it uses a large amount of memory, and training can be slow. Moreover, the quantity and quality of the training data, the learning algorithm, and the topology and type of the network are all critical to the performance of a soft sensor model.
The data used to train the models in this study are simulated data, and some data points used in model development may not be practical in plant operation. Process data generated from real-time plant operation contain noise from raw materials, energy inputs, equipment and the running state of the system, as well as time-varying chemical and physical parameters of raw materials and products. Training the models of this study on real process data would therefore be an excellent opportunity to assess the results of this feasibility study. Plant operators could use such models, tuned over a prolonged period, to reduce downtime and to take decisions before the results of offline lab samples arrive. Since plant data always include noise and undefined variations, the way they fit the models might differ from what is reported in this study. It is therefore recommended to tune the models with plant data before they are used directly in practical applications.

Conclusion and Recommendation
This work describes a regression attempt to determine the apparent degree of calcination, the CO2 molar fraction and the H2O molar fraction in a cement precalciner system by formulating relationships with several input variables. Different types of machine learning algorithms were tested to assess their suitability for relating the input data to the output data. Since this is a feasibility study, synthetic data were used to train the models. These data were obtained by applying mass and energy balance. The results show that a number of machine learning algorithms perform well compared with the classical linear regression method. Several factors affect the calcination process in the precalciner, and adding their contributions as model inputs to tune the developed models is recommended to increase model robustness. This will also indicate whether the sampling frequency should be increased for the experimentally measured parameters used to calculate the precalciner performance.
In particular, training a model using synthetic data can be viewed as a learning process. The advantage of using synthetic data from such theoretical models is that the number of data points can be increased inexpensively to decrease the error. The results can be viewed as a guide for a proposal distribution generator for approximate inference and can be used to draw a formal connection between inputs to optimize network parameters. As the second step of this study, testing the models with selected process data that represent extreme and typical plant operating conditions is recommended. This will lead to more realistic models based on actual plant data. The system boundary used for the model was the precalciner system. However, the model would be more meaningful if the system boundary were expanded to cover the entire pyroprocessing unit.