Identification of virtual plants using Bayesian networks based on parametric L-System



Introduction
The virtual plant model in this study was created with the L-System method. An L-System is a formal grammar that consists of an alphabet, an axiom, and production rules over that alphabet [1]. The production rules form a rewriting system [2]. There are two variations of the L-System: the stochastic L-System and the parametric L-System [3]. A stochastic L-System assigns probabilities to the production rules, while a parametric L-System attaches parameters to them and can replace long production rules with short ones. A parametric L-System can be used to model plants such as grass, rice paddy, and trees. Plants grow well if the environment supports them. One of the environmental factors that supports plant growth is the availability of fertilizer to supply nutrients [4].
Building a virtual plant model as affected by the environment involves three processes [5]. The first process develops the virtual plant model using a parametric L-System: a set of alphabet symbols is formed, the axiom is determined, and the production rules are set. The second process decides which alphabet symbols, axiom, and production rules are affected by the environment [6]. Both processes are carried out by trial. The third process combines the first two to form the virtual plant model as affected by the environment, and it requires a specific programming technique.
In this study, we propose combining the three processes into a single process by using Bayesian networks, which help develop a probability-based data model that represents the virtual plant model [7]. All components of the virtual plant model that are affected by environmental factors, such as the alphabet variables, the axiom, and the production rules, are formed into nodes. The correlations among the nodes are then shown so that all connections can be tested and connections with low correlation can be deleted.
The use of Bayesian networks can shorten the development of the virtual plant model and ease the extraction of the information structure needed to generate the alphabet, axiom, and production rules of the L-System. Bayesian networks draw on probability theory and graph theory [8]-[10]. Probability theory is used to represent the value of the information structure in the form of the alphabet of the L-System, as the qualitative model of the virtual plant. Graph theory is used to represent the relations among the elements of the information structure in the form of the axiom and production rules of the parametric L-System, as the quantitative model of the virtual plant. The final virtual plant model, which incorporates the environmental factors, is the combination of the qualitative and quantitative models.

L-System
The virtual plant model was developed with the L-System, a rewriting technique that is applied repeatedly [2]. The main components of an L-System are the alphabet, the axiom, and the production rules. The alphabet V is a set of formal symbols, such as a, b, c, or other characters. The set of all strings over V is denoted V*; with V = {a, b, c}, strings such as a, b, ac, ba, acb, and baac can be formed. The axiom (initiator) is a string w ∈ V*. The length |w| of a string is the total number of symbols in it. The production rules (rewriting rules) map a symbol a ∈ V to a string w ∈ V*, written p: a → w. If a symbol a ∈ V has no production rule, it is assumed to map to itself, so that a is a constant of the L-System [2].
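The rewriting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study; the productions a → ab and b → a are hypothetical and were chosen only to show parallel rewriting and the treatment of constants.

```python
def rewrite(axiom, rules, steps):
    """Apply the production rules in parallel to every symbol, `steps` times."""
    word = axiom
    for _ in range(steps):
        # A symbol without a production rule is a constant: it maps to itself.
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


# Hypothetical productions over the alphabet V = {a, b, c}.
rules = {"a": "ab", "b": "a"}

# Successive generations starting from the axiom "a".
generations = [rewrite("a", rules, n) for n in range(5)]
```

Here `c` has no production, so `rewrite("acb", rules, 1)` leaves it unchanged while `a` and `b` are rewritten in the same step.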

Parametric L-System
The parametric L-System extends the L-System with modules. In addition to the alphabet, axiom, and production rules, every symbol has a list of parameters associated with it [11]. An alphabet symbol combined with its list of parameters is called a module. A symbol A can be associated with real values a1, a2, ..., an, written in brackets as A(a1, a2, ..., an). A parameter can be combined with other parameters or numeric constants in an arithmetic expression that involves general operators such as +, −, *, and /, the exponential operator, and mathematical functions such as sine, cosine, tangent, arccosine, arcsine, arctangent, floor, ceiling, truncate, absolute value, exponential, and logarithm, as well as a random function. A production step in the parametric L-System requires a condition to be fulfilled. The condition is a logical expression built from operators such as >, <, ≥, ≤, ==, !=, &&, and ||.
The parametric L-System consists of two types: 1L-System and 2L-System [12]. A production rule in a 1L-System has only one context and takes the form predecessor → successor. In a 2L-System, the production rule has a left and/or right context given as strings of modules and takes the form predecessor : condition → successor. As an example, the axiom "baaaaaa" starts the derivation, and there are two productions. The first production has one context, written "b" < "a". In L-System notation the symbol < separates a left context from the predecessor rather than comparing values, so this production applies to an "a" only when it is immediately preceded by "b". The second production is "b" → "a", which means that "b" produces "a". The production rules are illustrated in Fig. 1.
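A parametric production with a condition can be sketched as follows. The sketch assumes a hypothetical rule A(l) : l < 4 → F(l) A(2l), which does not come from the paper; the names `Module`, `Production`, and `derive` are likewise illustrative and context matching is left out.

```python
from typing import Callable, List, NamedTuple, Tuple

Module = Tuple[str, Tuple[float, ...]]  # a symbol together with its parameter list


class Production(NamedTuple):
    symbol: str                             # predecessor symbol
    condition: Callable[..., bool]          # logical expression on the parameters
    successor: Callable[..., List[Module]]  # builds the replacement modules


def derive(word: List[Module], productions: List[Production]) -> List[Module]:
    """One parallel derivation step over a list of modules."""
    result: List[Module] = []
    for symbol, params in word:
        for p in productions:
            if p.symbol == symbol and p.condition(*params):
                result.extend(p.successor(*params))
                break
        else:
            # No production matched: the module is copied unchanged.
            result.append((symbol, params))
    return result


# Hypothetical rule: A(l) : l < 4 -> F(l) A(2 * l)
grow = Production("A", lambda l: l < 4, lambda l: [("F", (l,)), ("A", (l * 2,))])

word: List[Module] = [("A", (1.0,))]
history = [word]
for _ in range(3):
    word = derive(word, [grow])
    history.append(word)
```

Once the parameter of A reaches 4 the condition fails, so the third step leaves the word unchanged.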

Graphical interpretation for the parametric L-System
After the sequence of generations produced from the axiom and production rules has been derived, the next step is to interpret it graphically. The symbols commonly used in the parametric L-System are listed in Table 1.

Table 1. Symbols commonly used in parametric L-System
Id | Symbol | Definition
1 | | Draw forward l units, with l > 0
2 | | Turn left by a degrees, using the rotation matrix R(α)
3 | | Turn right by a degrees, using the rotation matrix R(α)
4 | | Turn left by a degrees, using the rotation matrix R(β)
5 | | Turn right by a degrees, using the rotation matrix R(β)
6 | | Turn left by a degrees, using the rotation matrix R(δ)
7 | | Turn right by a degrees, using the rotation matrix R(δ)
8 | | Set the line thickness to x

For example, the L-System string "++++++F" means turning left by 60°, as six turns with an angle increment of δ = 10°, followed by drawing F. In the parametric L-System, the same instruction can be written as "+(60)F".
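The equivalence of the two notations can be checked with a small sketch that accumulates the turtle heading. It assumes that only the symbols +, -, and F occur and that a turn without a parameter uses the default increment δ; the function name `final_heading` is illustrative.

```python
import re

# A symbol, optionally followed by a parenthesised numeric parameter.
TOKEN = re.compile(r"([+\-F])(?:\(([-\d.]+)\))?")


def final_heading(word, delta):
    """Return the turtle heading in degrees after reading `word`.

    Left turns are counted as positive. A turn without a parameter
    uses the default angle increment `delta`.
    """
    heading = 0.0
    for symbol, arg in TOKEN.findall(word):
        if symbol == "F":
            continue  # drawing forward does not change the heading
        angle = float(arg) if arg else delta
        heading += angle if symbol == "+" else -angle
    return heading


plain = final_heading("++++++F", delta=10)
parametric = final_heading("+(60)F", delta=10)
```

Both calls yield a heading of 60 degrees, which is the sense in which the parametric form shortens the string.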

Conditional probability
In developing Bayesian networks, the structure is built with a statistical approach called Bayes' theorem, which uses conditional probability. A conditional probability is the probability of an event A given that another event B has happened, denoted P(A|B); it is derived from the joint probability of A and B. The theorem is used to calculate the probability that a set of data belongs to a specific group based on the available inferential data [13]. The basic equation of Bayes' theorem is shown in (1).

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \qquad (1)$$

Bayesian networks can be used to make probabilistic decisions (inference). Probabilistic inference predicts unknown variables by using other, known variables.

Joint probability distribution
A joint probability distribution involves two random variables, X and Y. It defines their simultaneous behaviour and is represented as a function over all pairs (x, y) [13].
Table 2 shows the joint distribution P(X = x, Y = y) of two discrete random variables. The length and width of a leaf were measured to the nearest mm, where X denotes the length and Y denotes the width. The values of X were 132, 133, and 134 mm, while the values of Y were 17 and 18 mm, which gives six pairs (X, Y). The joint probability of each pair is shown in Table 2, and the probabilities sum to 1.0. The pair with the highest probability is (133, 17), and the pair with the lowest probability is (132, 17). The joint probability function is f_XY(x, y) = P(X = x, Y = y).
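How marginal and conditional probabilities follow from such a joint table can be sketched as below. The probabilities are placeholders invented for illustration, since the values of Table 2 are not reproduced in the text; they were chosen only so that they sum to 1.0 and keep (133, 17) as the most likely pair and (132, 17) as the least likely pair.

```python
import math

# Placeholder joint distribution P(X = x, Y = y); NOT the values of Table 2.
joint = {
    (132, 17): 0.05, (132, 18): 0.10,
    (133, 17): 0.30, (133, 18): 0.20,
    (134, 17): 0.15, (134, 18): 0.20,
}


def marginal_x(x):
    """P(X = x), obtained by summing the joint probabilities over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional_y_given_x(y, x):
    """P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)."""
    return joint[(x, y)] / marginal_x(x)


total = sum(joint.values())
most_likely = max(joint, key=joint.get)
least_likely = min(joint, key=joint.get)
p_width_17_given_length_133 = conditional_y_given_x(17, 133)
```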

Bayesian networks
Bayesian networks are a simple Probabilistic Graphical Model (PGM) built from probability theory and graph theory. Probability theory deals with the data directly, while graph theory relates directly to the desired representation. The Bayesian networks method is well suited to learning from training data, with conditional probability as its basis [14]. Bayesian networks consist of two main parts:
1) The graph structure, called a Directed Acyclic Graph (DAG). A DAG consists of nodes and edges. Nodes represent the random variables, and edges represent direct dependence relationships that can also be interpreted as causal effects between the related variables. The absence of an edge represents conditional independence between variables.
2) A group of parameters. The parameters define the conditional probability distribution of every variable. In Bayesian networks, a node corresponds to a random variable. Every node is associated with a set of conditional probabilities p(Xi | Ai), where Xi is the variable associated with the node and Ai is its parent set in the graph.
An example of solving a case using Bayesian networks is as follows. Three variables are used: sprout, organic fertilizer, and inorganic fertilizer. The dependent variable is the sprout, while the independent variables are the organic and inorganic fertilizers. The probability of a sprout (S) as affected by the supply of organic fertilizer (O) is denoted P(S|O). By using the joint probability, a Bayesian network graph structure is developed in the following steps:
1) There are three events, and every event has two truth values, true or false, so there are 2^3 = 8 possible combinations.
2) A prior can be assigned to every variable, as shown in Fig. 2. In the factorization of the joint probability P(O, I, S), the term P(O|I) is the probability of the event O given the event I. Since the variables O and I are not related, P(O|I) simplifies to P(O). The term P(S|O, I) is the probability of S given O and I. This is the conditional independence assumption that the graph structure of the Bayesian network represents.
For example, the three conditional probabilities P(S|O, I), P(O|I), and P(S|O, ~I) take the values 0.95, 0.2, and 0.90, respectively.
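The three-variable example can be sketched by enumerating the eight joint outcomes. The values 0.95, 0.90, and 0.2 come from the text; P(I) and the two entries of P(S|~O, ·) are not given there and are hypothetical placeholders, so the resulting number illustrates the mechanics only.

```python
import math
from itertools import product

p_o = 0.2  # P(O); from the text, P(O|I) = P(O) = 0.2
p_i = 0.5  # P(I); hypothetical, not given in the text

# P(S = true | O, I), indexed by the truth values of (O, I).
p_s_given = {
    (True, True): 0.95,    # from the text
    (True, False): 0.90,   # from the text
    (False, True): 0.30,   # hypothetical
    (False, False): 0.05,  # hypothetical
}


def joint(o, i, s):
    """P(O, I, S) = P(S | O, I) P(O) P(I), using the independence of O and I."""
    po = p_o if o else 1 - p_o
    pi = p_i if i else 1 - p_i
    ps = p_s_given[(o, i)] if s else 1 - p_s_given[(o, i)]
    return ps * po * pi


# All 2^3 = 8 combinations of truth values.
table = {(o, i, s): joint(o, i, s) for o, i, s in product([True, False], repeat=3)}


def prob_s_given_o():
    """P(S | O), obtained by marginalizing I out of the joint table."""
    numerator = sum(p for (o, _, s), p in table.items() if o and s)
    denominator = sum(p for (o, _, _), p in table.items() if o)
    return numerator / denominator


p_s_o = prob_s_given_o()
```

With the assumed P(I) = 0.5, P(S|O) is the average of 0.95 and 0.90; a different P(I) would weight the two entries differently.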

Datasets
The plant used as the research object was Zinnia elegans Jacq. Fifty plants were used as training data and another five as test data. In preparation, the plants were brought to similar conditions: a plant age of 5 days, a plant height of 5 cm, and medium-size poly-bags as the planting medium. Every zinnia plant received the same treatment in terms of watering frequency, light intensity, and other environmental factors. To determine the effect of environmental factors on plant growth, treatments with different combinations of organic and inorganic fertilizers were applied. The first measurement was stem height, taken with a ruler. The next measurements were branch length, leaf area, and flower area. Leaf area is the product of leaf width and length, while flower area is the area of a circle. Plant growth was measured and observed every day for 30 days at 8 am. To obtain a model of the virtual plant as affected by the combination of organic and inorganic fertilizers, the research flowchart in Fig. 3 was followed.
The first stage: to identify the environmental factors that affected the growth of zinnia plants, we applied varied combinations of organic and inorganic fertilizer to 55 zinnia plants.
The second stage: to follow the growth and development of the zinnia plants, we observed the 55 plants for 30 days. Growth was marked by an increase in the height and size of the stem, leaves, and flowers. Growth is quantitative, measurable, and irreversible (the plant cannot return to its initial size). Development of zinnia plants is a differentiation process (a change of cell shape) in which the plant cells form unique structures and functions. Development is a qualitative parameter that cannot be measured with an instrument, and it is reversible, meaning that it can return to its initial condition.
The third stage: the growth data collected from field observation were used as the basis for a directed acyclic graph (DAG) that estimates the probability value for every quantitative datum. The values were formed into nodes, where each node represents a random variable. The qualitative data were used to create the network topology, in which each directed arrow represents a direct causal relationship (direct influence). Bayesian networks support probabilistic inference to predict the value of a variable that cannot be collected directly, by using other, known variables. For example, the conditional probability of the length of a sprout can be determined if the effect of organic and inorganic fertilizer is known. Once the Bayesian network structure is built, it can be used for probabilistic inference.
The fourth stage: the probabilistic inference system was used to predict the construction of the plant structure as the qualitative model of the virtual plant.
The fifth stage: the probabilistic inference system was used to predict the plant variables as the quantitative model of the virtual plant.
The sixth stage: the quantitative and qualitative models were combined into a virtual plant model.

Data analysis using Bayesian networks
A Bayesian network can represent the correlation between the various combinations of organic and inorganic fertilizers and plant growth. It can also be used to calculate the probability of plant growth. A Bayesian network is a tool for modelling and reasoning with uncertain beliefs [7], [15]-[17].
Data processing with the Bayesian network was performed in three stages. The first stage built the graph structure of the Bayesian network by predicting the order of events, so that the effect of providing combinations of organic and inorganic fertilizer on plant growth could be seen. The graph structure of a Bayesian network is a directed acyclic graph (DAG), which consists of nodes and edges. The DAG was developed with the SamIam software [18]. After the DAG was formed, the parameters (prior probabilities) of the effect of the fertilizer on plant growth were determined. The prior probability expresses the possible effect of fertilizer; as soon as fertilizer is provided, the growth probability of the plant can be updated. The prior probabilities are presented as nodes. The factors that affected the growth of the virtual zinnia plant were the qualitative factors (sprout function, leaf function, stalk function, and bloom function), which are defined in Table 3. The quantitative data (sprout_1, stalk_1, leaf_left_1, sprout_1_1, sprout_1_2, leaf_right_1, leaf_left_2) are listed in Table 4, and the environmental factors (organic fertilizer, inorganic fertilizer) are shown in Table 5. Next, a graph was prepared to link the nodes with arrows, as shown in Fig. 4.
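A DAG of this kind can be sketched in Python as a map from each node to its parents and checked for acyclicity with a topological sort. The node names follow the variables named in the text, but the edges are an assumed reading of the growth order and do not reproduce Fig. 4 or the SamIam model.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to the set of its parents (assumed edges, for illustration only).
parents = {
    "Sprout_1": {"Organic", "Inorganic"},
    "Stalk_1": {"Sprout_1"},
    "Bloom_1": {"Stalk_1"},
    "Leaf_Left_1": {"Stalk_1"},
    "Leaf_Right_1": {"Stalk_1"},
    "Sprout_1_1": {"Leaf_Left_1"},
    "Sprout_1_2": {"Leaf_Right_1"},
}


def topological_order(parent_map):
    """Return the nodes with every parent before its children.

    Raises graphlib.CycleError if the graph contains a cycle,
    in which case it is not a valid Bayesian network structure.
    """
    return list(TopologicalSorter(parent_map).static_order())


order = topological_order(parents)
```

A topological order is also the order in which priors and conditional probability tables can be filled in, since every node's parents are handled before the node itself.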
To construct a complete virtual model of the zinnia plant, the probability scenario must be calculated by considering several factors:
1) Fertilizer can be organic and/or inorganic.
2) Fertilizer affects the growth of the zinnia sprout (Sprout_1).
3) Every zinnia plant has a life cycle that is expressed as the functions of sprout growth (Sprout_Func), stalk growth (Stalk_Func), leaf growth (Leaf_Func), and flower blooming (Bloom_Funct).
4) After the zinnia sprout emerges (Sprout_1), the main stalk grows (Stalk_1).
5) The tip of the main stalk (Stalk_1) blooms into flowers (Bloom_1).
6) Next, two sprouts emerge (Sprout_1_1, Sprout_1_2) under the left leaf (Leaf_Left_1) and the right leaf (Leaf_Right_2).
7) The process returns to step 4 as the second order (order 2).
The second stage was to develop the conditional probability table (CPT) and the marginal probability table (MPT). To obtain the CPT values, the probabilistic descriptions must be converted, as shown in Table 6. We also integrated fuzzy sets to obtain an efficient CPT [19]. For example, organic fertilizer is defined by three fuzzy numbers taken from the general set, where each part represents a composition range: A1 = 51-75 gram = High, A2 = 26-50 gram = Medium, A3 = 0-25 gram = Low. The triangular membership function for every fuzzy number is shown in (3).

Here x is the depth for which the membership value is calculated, $\mu^{\mathrm{Organic}}_{A_i}(x)$ is the membership value of x in $A_i$, $q_{i,1}$ is the upper quartile of $A_i$, $q_{i,2}$ is the middle quartile of $A_i$, and $q_{i,3}$ is the lower quartile of $A_i$.
For example, calculating the membership value of 75 g of organic fertilizer gives $\mu^{\mathrm{Organic}}_{A_2}(x) = 0.75$ and $\mu^{\mathrm{Organic}}_{A_3}(x) = 0.25$. The fuzzy groups represent the composition of organic fertilizer, which can be written as an MPT, as shown in Table 7 and Fig. 5.
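A triangular membership function in its common textbook form can be sketched as follows. Equation (3) is not reproduced here, so this is not necessarily the paper's exact formula; the parameters `a`, `b`, and `c` (left foot, peak, right foot) and the break points in the example are assumptions for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at `a` and `c` and peak at `b` (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


# Hypothetical fuzzy number with feet at 0 g and 50 g and its peak at 25 g.
at_peak = triangular(25.0, 0.0, 25.0, 50.0)
halfway_up = triangular(12.5, 0.0, 25.0, 50.0)
outside = triangular(60.0, 0.0, 25.0, 50.0)
```

When neighbouring fuzzy numbers overlap, one input can have non-zero membership in two sets at once, as in the example above where the membership values 0.75 and 0.25 belong to adjacent sets.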

By using the graph in Fig. 5, the membership value of a fertilizer composition can be read directly: 4.75 g of fertilizer results in a membership value of 0.75 for the medium depth and 0.25 for the low depth. Meanwhile, the membership value of the high depth is 0, since at 75 g of fertilizer the line that represents the high depth has no value. The third stage: based on the CPT displayed in Fig. 4, inference was performed to obtain the posterior data. During inference, we ran trials to determine how strongly the fertilizer composition affects the virtual zinnia plant model. The results are as follows:

1) A probability of approximately 87.5 % that organic fertilizer affects the virtual zinnia plant model, and 12.5 % that it does not.
2) A probability of 93.75 % that inorganic fertilizer affects the virtual zinnia plant model, and 6.25 % that it does not.

Qualitative model of virtual zinnia plant
After obtaining the MPT and CPT values, we determined the prior data that would be used to obtain the posterior data. Using the tools developed by the researcher, the prior data were used as the input, as shown in Fig. 4. Organic and inorganic fertilizers affected the qualitative model of the virtual zinnia plant. Other factors that also affected the qualitative model were: 1) The morphology units of the virtual zinnia plant, such as sprout_1, stalk_1, leaf_left_1, sprout_1_1, sprout_1_2, leaf_right_1, and leaf_left_2, as the set of alphabet symbols. Σ={f, pd, pu, rr, sprout, stalk, leaf,

Quantitative model of virtual zinnia plant
The process of building the quantitative model of the zinnia plant was related to the age of the plant. The factors that affected the quantitative model were:
1) The morphology functions of the virtual zinnia plant units (sprout_funct, stalk_funct, leaf_funct, and bloom_funct).
2) The number of morphology units of the virtual zinnia plant, also called the production results, which was four: sprout, stalk, leaf, and bloom.
Posterior data collected from the trial process were used to set the quantitative model of the virtual zinnia plant in the form of a parametric L-System, as shown in Fig. 7.

Model of zinnia plant, visualization, and fitness
The model of zinnia plant growth combined the qualitative model and the quantitative model. It covers the development and growth of the zinnia plant, as shown in Fig. 8.
The software package MathEvolvica was used to visualize the model of plant growth. The package consists of kLSystem.m and TurtleInterpretation.m. kLSystem.m was used to initialize the parametric L-System [20]. The development and growth of the zinnia plant were decoded with the Mathematica software. TurtleInterpretation.m was used to visualize the model of zinnia plant growth, and the Mathematica program code was converted into computer graphics.
The visualization of plant growth over six iterations is shown in Fig. 9. Number 1 shows the first iteration of the grammar rule, which was replaced by a sprout with two internodes, two lateral leaves, two lateral sprouts, and a bloom bud. Number 2 shows the second iteration of the grammar rule according to the growth function. Number 3 shows the next step, continued from the first iteration, with the rewriting rule applied until it ends at number 6.

Real data processing
The following evaluation is meant to give readers an idea of how a Bayesian network might be used in developing a virtual zinnia plant. An error rate is calculated to compare the fitness of the real plant with that of the virtual plant. To calculate the error rate, we used the Mean Absolute Percentage Error (MAPE), as shown in (4).

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{x_i - y_i}{x_i}\right| \qquad (4)$$

Here $x_i$ is the i-th actual value, $y_i$ is the i-th value from the virtual plant, and n is the number of values. The calculation gave an average error percentage of 9.45 %. An error rate (MAPE) of less than 40 % is categorized as good and dependable [21]. Fig. 11 compares the fitness values of real and virtual plants over six iterations, where five treatments combining organic and inorganic fertilizer were applied. The highest fitness value was 6.41, from the fifth treatment, which combined a high amount of organic fertilizer with a medium amount of inorganic fertilizer.
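The error measure can be sketched in Python using the standard definition of MAPE. The function name `mape` and the measurements in the example are hypothetical; they are not the study's data.

```python
import math


def mape(actual, virtual):
    """Mean Absolute Percentage Error, in percent.

    `actual` holds the real-plant measurements and `virtual` the matching
    virtual-plant values; every actual value must be non-zero.
    """
    if not actual or len(actual) != len(virtual):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((x - y) / x) for x, y in zip(actual, virtual))
    return 100.0 * total / len(actual)


# Hypothetical stem heights in cm for a real and a virtual plant.
error = mape([10.0, 20.0], [9.0, 22.0])
```

Each of the two pairs deviates by 10 % of the actual value, so the mean error is about 10 %.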

Conclusion
Based on the analysis of program testing on the prior (parameter) values of the observation data, the model of the virtual zinnia plant showed a probability of 87.5 % that organic fertilizer has an effect and 12.5 % that it has no effect. Furthermore, there was a probability of 93.75 % that inorganic fertilizer has an effect and 6.25 % that it has no effect. The error value, obtained from comparing five real plants with the virtual plants, was 9.45 %. The highest fitness value, 6.41, was obtained from the fifth of the five treatments combining organic and inorganic fertilizers (high organic fertilizer and medium inorganic fertilizer).
The results suggest further studies, such as improving the composition of inorganic fertilizer, adding more varied combinations of organic and inorganic fertilizer, and marking the morphology measurements of the zinnia plant.

Fig. 2. Directed acyclic graph representing two independent possible causes of sprout growth.

3) Development of Bayesian networks. From Fig. 2, the partition order can be written as {O, I} → {S}, and several problems can be solved using (2).

$$P(O, I, S) = P(S \mid O, I)\,P(O \mid I)\,P(I) \qquad (2)$$

Fig. 3. The research steps were divided into six stages.

Fig. 4. Directed graphical model representing two independent potential causes of sprout growth with the prior probability distribution.

Fig. 5. MPT of the composition of fertilizer using a fuzzy system.

Fig. 6. The complicated structure of the zinnia plant qualitative model in L-System.

Fig. 7. The structure of the zinnia plant quantitative model in parametric L-System.

Fig. 8. Structure of the model of zinnia plant growth in parametric L-System.

Fig. 10. The fitness of the virtual plant in the x-axis, y-axis, and z-axis.

Fig. 11. The comparison of the fitness level of real and virtual zinnia plants.

Table 2. Symbols used in plant growth

Table 3. The variables of growth function in quantitative data

Table 4. The growth function variables of zinnia plant growth in qualitative data

Table 5. The variables of fertilizer