Fusion of Neural Networks, Fuzzy Systems and Genetic Algorithms: Industrial Applications
by Lakhmi C. Jain; N.M. Martin
CRC Press, CRC Press LLC
ISBN: 0849398045   Pub Date: 11/01/98
  



Suppose in this example that we want to extract the consequent value ω(1) for rule (l = 1) described in (17).

Each variable in the two training samples in (16) has a membership degree in each antecedent fuzzy set A3(1) and A4(1). Expressions (18) to (20) show the degrees attributed to the input values and to y′ for the examples in (16). With a triangular partition, every numerical value always has two non-zero membership degrees and a null degree in all the other fuzzy sets, as illustrated in Figure 3 for each training example. With Gaussian functions, by contrast, each variable would have a non-zero degree in every fuzzy set attributed to it.
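The triangular-partition property just described can be illustrated with a short sketch (function and variable names are hypothetical, not from the book): for any crisp value strictly inside the partition, exactly two membership degrees are non-zero and they sum to one.

```python
import numpy as np

def tri_degrees(x, centers):
    """Membership degrees of a crisp value x in a triangular partition whose
    peaks sit at the sorted `centers` (hypothetical names). Any x strictly
    inside the partition gets exactly two non-zero degrees summing to one."""
    n = len(centers)
    degs = np.zeros(n)
    for i, c in enumerate(centers):
        if i > 0 and centers[i - 1] <= x < c:
            degs[i] = (x - centers[i - 1]) / (c - centers[i - 1])  # rising edge
        elif i < n - 1 and c <= x < centers[i + 1]:
            degs[i] = (centers[i + 1] - x) / (centers[i + 1] - c)  # falling edge
        elif x == c:
            degs[i] = 1.0  # value sits exactly on a peak
    return degs
```

For example, with peaks at 0, 0.5 and 1, the value 0.3 belongs to the first set with degree 0.4 and to the second with degree 0.6; the third degree is null.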

The algorithm extracts the value of ω(1) using steps 5 to 8. It considers the fuzzy sets A3(1) and A4(1) in the condition part of rule R(1). The conclusion value is then computed by Equation (21), which combines the output values y′(k), weighted and normalized by their degrees of contribution to the specified rule.

4.4 The Neuro-Fuzzy Algorithm

The neuro-fuzzy algorithm developed by Wang [24] uses the hybrid model developed by Takagi and Sugeno in [3]. In this type of model, the condition part uses linguistic variables and the conclusion part is represented by a numerical value, considered a function of the system's condition expressed in the variables x1, x2, ..., xm (22). These models are well suited to neural-based learning techniques, such as gradient methods, for extracting the rules [6], and they generate models with a reduced number of rules.
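Although Equation (22) is not reproduced here, a Takagi-Sugeno rule of the kind described is conventionally written as follows (a sketch of the standard form; the book's exact notation may differ):

```latex
R^{(l)}:\quad \text{IF } x_1 \text{ is } A_1^{(l)} \text{ and } \dots
\text{ and } x_m \text{ is } A_m^{(l)}
\quad \text{THEN } y = f^{(l)}(x_1, x_2, \dots, x_m)
```

In the zero-order case used in this section, the conclusion function reduces to the constant ω(l).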

The neuro-fuzzy algorithm uses membership functions of Gaussian type. With Gaussian fuzzy sets, the algorithm can exploit all the information contained in the training set when computing each rule conclusion, since every sample has a non-zero degree in every fuzzy set; this is not the case with triangular partitions.
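A Gaussian membership function is strictly positive everywhere, which is why every training sample can contribute to every rule conclusion. A minimal sketch (names are hypothetical):

```python
import numpy as np

def gaussian_mu(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set centered at `center`
    with width `sigma` (hypothetical names). Strictly positive for every x,
    unlike a triangular set, whose support is bounded."""
    return float(np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)))
```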

Figure 4 illustrates the neuro-fuzzy scheme for an example with two input variables (x1, x2) and one output variable (y). In the first stage of the neuro-fuzzy scheme, the two inputs are codified into linguistic values by the set of Gaussian membership functions attributed to each variable. The second stage calculates, for each rule R(l), its respective activation degree. Finally, the inference mechanism weights each rule conclusion ω(l), initialized by the cluster-based algorithm, by the activation degree computed in the second stage. The error signal between the model's inferred value Y and the respective measured value (or teaching value) y′ is used by the gradient-descent method to adjust each rule conclusion. The algorithm changes the values of ω(l) to minimize an objective function E, usually expressed as the mean quadratic error (23). In this equation, the value y′(k) is the desired output associated with the condition vector x′(k). The element Y(x′(k)) is the response inferred for the same condition vector x′(k), computed by Equation (24).
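The inference of Equation (24) is a weighted, normalized sum of rule conclusions. It can be sketched as follows, assuming zero-order Takagi-Sugeno rules with Gaussian antecedents (all names are hypothetical):

```python
import numpy as np

def infer(x, rules):
    """Inference in the assumed form of Equation (24):
    Y(x) = sum_l d(l) * w(l) / sum_l d(l), where the activation degree d(l)
    is the product of the Gaussian membership degrees of the rule's
    antecedents. Each rule is a (centers, sigmas, w) triple."""
    num = den = 0.0
    for centers, sigmas, w in rules:
        d = float(np.prod(np.exp(-((np.asarray(x) - centers) ** 2)
                                 / (2.0 * sigmas ** 2))))
        num += d * w   # conclusion weighted by its activation degree
        den += d       # normalization term
    return num / den
```

With two rules whose conclusions are 1.0 and 3.0 and whose antecedents are symmetric about the input, the inferred value is exactly the midpoint 2.0.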

Equation (25) establishes adjustment of each conclusion ω(l) by the gradient-descent method. The symbol α is the learning rate parameter, and t indicates the number of learning iterations executed by the algorithm.


Figure 4  The neuro-fuzzy scheme.

The inference function (24) depends on ω(l) only through its numerator. The expression composing the numerator is now denoted by a and is shown in (26).

The denominator of function (24) depends on a term d(l), defined in (27), and is denoted by b in (28).

To calculate the adjustment of each conclusion value ω(l), it is necessary to compute the variation of the objective function E, ∂E, with respect to the variation of ω(l) at the previous instant, ∂ω(l). Applying the chain rule to ∂E/∂ω(l) results in expression (29).

The chain rule isolates the term in E that depends directly on the value to be adjusted, i.e., the conclusion value ω(l). Equation (29) thus expresses that E depends on the value Y, Y depends on the term a, and, finally, a is a function of ω(l).

Using Equations (26) to (28), the Y function is written as (30).

The three partial derivatives of the chain rule are computed, resulting in Equations (31), (32), and (33).

Substituting the three derivatives into chain Equation (29), the final partial derivative of E with respect to ω(l) results in expression (34).
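Assuming the quadratic error E = ½(y′(k) − Y(x′(k)))² and the definitions a = Σl d(l)ω(l) and b = Σl d(l) suggested by (26) to (28), the three factors of (29) and their product take the following form (a reconstruction under these assumptions; the book's equations (31) to (34) may differ in notation):

```latex
\frac{\partial E}{\partial Y} = -\bigl(y'(k) - Y(x'(k))\bigr), \qquad
\frac{\partial Y}{\partial a} = \frac{1}{b}, \qquad
\frac{\partial a}{\partial \omega^{(l)}} = d^{(l)},
```

so that

```latex
\frac{\partial E}{\partial \omega^{(l)}}
  = -\bigl(y'(k) - Y(x'(k))\bigr)\,\frac{d^{(l)}}{b}.
```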

Replacing the derivative ∂E/∂ω(l) in Equation (25) gives the final result presented in (35). In this equation, d(l) represents the activation degree of rule (l) for the condition x′(k), and the term b defined in (28) is its normalization factor. With these two considerations, the adjustment made to ω(l) can be interpreted as proportional to the error between the neuro-fuzzy model response and the supervising value, weighted by the contribution of rule (l), denoted by d(l), to the final neuro-fuzzy inference.
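One adjustment step of this form can be sketched as follows, with each ω(l) updated proportionally to the error and to the rule's normalized activation degree (the assumed form of (35); names are hypothetical):

```python
import numpy as np

def update_conclusions(w, x, y_target, antecedents, alpha=0.1):
    """One gradient-descent step on the rule conclusions in the assumed
    form of (35): w(l) <- w(l) + alpha * (y' - Y) * d(l) / b,
    where d(l) is the rule activation degree and b is their sum.
    `antecedents` holds (centers, sigmas) per rule; `w` is updated in place."""
    degrees = [
        float(np.prod(np.exp(-((np.asarray(x) - c) ** 2) / (2.0 * s ** 2))))
        for c, s in antecedents
    ]
    b = sum(degrees)                                      # normalization factor
    y_hat = sum(d * wl for d, wl in zip(degrees, w)) / b  # inferred value Y
    error = y_target - y_hat                              # y' - Y
    for l, d in enumerate(degrees):
        w[l] += alpha * error * d / b                     # contribution-weighted step
    return w, y_hat
```

Iterating this step drives the inferred value toward the teaching value, with each rule absorbing a share of the error proportional to its activation.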



Copyright © CRC Press LLC