2009 International Conference on Electronic Computer Technology (ICECT 2009), Macau, China
An Introduced Neural Network-Differential Evolution Model for Small Signal Modeling of PHEMTs
Mazhar B. Tayel and Amr H. Yassin
Department of Electrical Engineering, Faculty of Engineering, Alexandria University, Alexandria, Egypt
[email protected]
Abstract – Since neural network algorithms are able to model nonlinear relations between different data sets, an introduced neural network (INN) model based on a generalized differential evolution training algorithm (INN-DE) is presented for the pseudomorphic high electron mobility transistor (PHEMT). This global optimization algorithm is applied to avoid the local-minima problem of the gradient descent training algorithm and to achieve an acceptable solution. The main advantage of this technique is its validity over a wide range of frequencies and its high accuracy for the small-signal characteristics. The proposed INN-DE model is used to predict the scattering-parameter values at bias values different from those in the data set used for training. The model has been verified by comparing predicted and measured values for a PHEMT over a data set of S-parameters at different frequencies and bias points.
Keywords: HEMT; S-parameters; small signal model; neural networks.
INTRODUCTION
It is important to develop an accurate small-signal model under variable conditions; the small-signal model results should be as close to the measured values as possible. One of the most important quantities needed to optimize the equivalent-circuit elements of a PHEMT is the set of small-signal S-parameters. The S-parameters of microwave transistors are frequency-, temperature- and bias-dependent [2], and are considered a figure of merit for high-frequency transistor performance. In general, the scattering-parameter data reflect the element values of an equivalent-circuit model for the PHEMT.
I. SMALL SIGNAL MODELING
For the introduced neural network model with differential evolution training algorithm (INN-DE), the Agilent ATF-35143, a pseudomorphic AlGaAs/InGaAs/GaAs PHEMT, is selected [3,4] because of its excellent room-temperature noise properties. For this device the gate length is 0.5 μm and the gate width is 400 μm at 2 GHz [4]. The small-signal model has 3 input variables (frequency, Vds and Ids) and 8 output variables (the magnitudes and phases of the scattering parameters). A large number of scattering-parameter data, measured at various frequencies and biases, was used to train the INN-DE, using the strategy of the differential evolution algorithm. Differential evolution (DE) can be classified as a global stochastic population-based evolutionary optimization algorithm that is able to find the global optimum of a given objective function [7]. The optimization goal here is to minimize an objective function E by adjusting the values of the neural network weights w = (w1, w2, …, wN). The main idea of DE is to generate trial parameter vectors. In each iteration the algorithm mutates vectors by adding a weighted random vector differential to them. If the cost of the trial vector is better than that of the target, the target vector is replaced by the trial vector in the next generation [8]. This process is repeated until a stopping criterion is reached, which stops the algorithm as shown in Figure 1.
[Flowchart: Initialize population → Mutation → Crossover → objective function minimized? — True: Stop Algorithm; False: repeat]
Figure 1. Basic structure of the differential evolution algorithm.

To apply DE in neural network training, the following steps are introduced:
1) Initialization: start with a population of NP (NP > 3) randomly chosen N-dimensional weight vectors.
2) Mutation operation: for each weight vector $w_i^g$, with current generation (iteration) number $g$ ($g > 0$) and population member $i$, $i = 1, 2, \ldots, NP$, a mutant vector $v_i^{g+1}$ is formed according to relation (1) [8,9]:
2009 International Conference on Electronic Computer Technology
978-0-7695-3559-3/09 $25.00 © 2009 IEEE
DOI 10.1109/ICECT.2009.149
$$v_i^{g+1} = w_i^g + \mu\,(w_{best}^g - w_i^g) + \mu\,(w_{r1}^g - w_{r2}^g) \quad (1)$$

where $w_{r1}^g$ and $w_{r2}^g$ are two randomly chosen population vectors; the mutant vector $v_i^{g+1}$ is formed from their vector difference and from $w_{best}^g$, the best population element of the previous iteration, each weighted by the mutation constant $\mu$, where $\mu \in [0, 2]$.
3) Crossover operation: a trial vector $u_i^{g+1}$ is generated. For each component $j$ ($j = 1, 2, \ldots, N$) of the mutant vector $v_i^{g+1}$, a random real number "rand" is drawn from the interval [0, 1] and compared with the crossover constant $\rho$ ($\rho \in [0, 1]$) as follows:

$$u_{i,j}^{g+1} = \begin{cases} v_{i,j}^{g+1}, & \text{if rand} \leq \rho \text{ or } j = I_{rand} \\ w_{i,j}^{g}, & \text{if rand} > \rho \text{ and } j \neq I_{rand} \end{cases} \quad (2)$$

If rand $\leq \rho$, the $j$th component of the trial vector $u_i^{g+1}$ takes the $j$th component of the mutant vector $v_i^{g+1}$; otherwise it takes the value of the target vector $w_i^g$. In equation (2), $I_{rand}$ is a randomly selected index that ensures the trial vector has at least one component from the mutant vector. The trial vector is accepted for the next iteration if the value of the error function $E$ is reduced; otherwise the previous value of the target vector $w_i^g$ is retained.
4) Selection process: in this step the next-generation weight vector is selected according to relation (3):
$$w_i^{g+1} = \begin{cases} u_i^{g+1}, & \text{if } E(u_i^{g+1}) < E(w_i^g) \\ w_i^g, & \text{if } E(u_i^{g+1}) \geq E(w_i^g) \end{cases} \quad (3)$$

i.e., if the objective value of $u_i^{g+1}$ is less than that of $w_i^g$, then $u_i^{g+1}$ replaces $w_i^g$ in the next generation; otherwise $w_i^g$ is retained. The process is iterated until the termination criterion is satisfied.
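The four steps above can be sketched as a compact NumPy implementation. This is an illustrative sketch, not the authors' MATLAB code: the function name `de_train` and its default parameters are assumptions, and the objective is supplied by the caller (for training it would be the network error E over the weights).

```python
import numpy as np

def de_train(objective, n_dim, np_pop=20, mu=0.8, rho=0.5, generations=200, seed=0):
    """Minimal differential evolution loop following steps 1-4:
    initialization, mutation (eq. 1), crossover (eq. 2), selection (eq. 3)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, (np_pop, n_dim))        # 1) initialization
    cost = np.array([objective(w) for w in pop])
    for g in range(generations):
        best = pop[np.argmin(cost)]                      # best of previous iteration
        for i in range(np_pop):
            r1, r2 = rng.choice([k for k in range(np_pop) if k != i], 2, replace=False)
            # 2) mutation (eq. 1): v = w_i + mu*(w_best - w_i) + mu*(w_r1 - w_r2)
            v = pop[i] + mu * (best - pop[i]) + mu * (pop[r1] - pop[r2])
            # 3) crossover (eq. 2): mutant component if rand <= rho or j == I_rand
            i_rand = rng.integers(n_dim)
            mask = rng.random(n_dim) <= rho
            mask[i_rand] = True                          # at least one mutant component
            u = np.where(mask, v, pop[i])
            # 4) selection (eq. 3): keep the trial only if it lowers the error
            cu = objective(u)
            if cu < cost[i]:
                pop[i], cost[i] = u, cu
    return pop[np.argmin(cost)], float(cost.min())
```

With a simple sphere objective the loop converges quickly, illustrating the global search behavior described in [7].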
II. EXPERIMENTAL PROCEDURES
The available ATF-35143 S-parameter data set is divided into three sets: a training set, a test set and a validation set. The training set is used to update the weights [5]; patterns in this set are repeatedly presented in random order, and the weight-update equations are applied after a certain number of patterns. The test set is used to test the performance of the model. The validation set is used to decide when to stop training, by monitoring the mean square error (MSE). The MSE for both training data and test data is calculated with 12 neurons in the first hidden layer, 8 neurons in the second hidden layer, and 8 neurons in the output layer, as shown in Figure 2.
[Network layout: inputs F, Vds, Ids → first hidden layer (neurons 1…12) → second hidden layer (neurons 1…8) → output layer (S11 mag, S11 ang, …, S22 ang)]
Figure 2. The INN model. m = magnitude, a = angle.
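The 3-12-8-8 structure of Figure 2 can be illustrated with a minimal feed-forward pass. This is a hypothetical NumPy sketch: the paper used the MATLAB Neural Network Toolbox, and the tanh hidden activations and linear output chosen here are assumptions for illustration.

```python
import numpy as np

def init_weights(layers=(3, 12, 8, 8), seed=0):
    """Random weights and biases for the 3-12-8-8 network of Figure 2."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((m, n)) * 0.5, np.zeros(n))
            for m, n in zip(layers[:-1], layers[1:])]

def forward(params, x):
    """Forward pass: tanh hidden layers, linear output layer.
    x = (frequency, Vds, Ids); output = 8 S-parameter magnitudes/angles."""
    h = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        h = np.tanh(h @ w + b)
    w, b = params[-1]
    return h @ w + b
```

For example, `forward(init_weights(), [2.0, 3.0, 0.015])` maps one (frequency, Vds, Ids) pattern to the 8 model outputs; DE training would adjust the flattened weights of `params` to minimize the error E.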
The learning rate is set to 0.06 and the momentum to 0.8; the squared error obtained is about 0.0016. The number of neurons in the hidden layers is selected close to the number of input and output variables so as to avoid overfitting (in the case of too many neurons) and underfitting (in the case of too few neurons). An initial pre-processing stage for the data sets is essential: data must be scaled to the range used by the input neurons of the neural network, typically −1 to 1 or 0 to 1. Normalization of the input and output vectors is performed by calculating their mean values and standard deviations. The total number of training epochs was 600. To apply DE optimization to the INN-DE, the weight population size NP was chosen to be in the range of 2N, where N is the dimension of the problem [9]. The mutation coefficient μ and crossover constant ρ are selected as 0.8 and 0.5, respectively. According to experimental experience, the value of NP is acceptable in the range 2N ≤ NP ≤ 4N, so as to avoid overfitting or underfitting of the model [9]. The modeling results were obtained using MATLAB Release 14 from The MathWorks Corporation [6]; executing the training cycle until convergence took about 4 minutes on a conventional 1.7 GHz PC for the gradient-based model, but a longer time for
the optimized INN-DE model. The outputs of the introduced INN-DE model appear to be very close to the original (measured) data.
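The mean/standard-deviation normalization step described above can be sketched as follows (the helper names are illustrative, not from the paper; statistics are computed on the training set and reused for test patterns):

```python
import numpy as np

def zscore_fit(data):
    """Per-feature mean and standard deviation from the training patterns;
    returns the statistics needed to scale new patterns consistently."""
    mu = data.mean(axis=0)
    sigma = data.std(axis=0)
    sigma[sigma == 0] = 1.0          # guard against constant features
    return mu, sigma

def zscore_apply(data, mu, sigma):
    """Scale patterns to zero mean / unit standard deviation
    before feeding them to the input neurons."""
    return (data - mu) / sigma
```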
III. EVALUATION METHOD
The predictive ability of the proposed INN-DE model is measured by calculating the average error (equation 4) over the n training and testing samples:

$$\text{Average error} = \frac{1}{n} \sum_{i=1}^{n} \left| T_i - P_i \right| \quad (4)$$

where $T_i$ is the target (measured) value of the $i$th sample and $P_i$ is the output (predicted) value of the $i$th sample. From Table 1, it is observed that the average error is less than 1% in some cases. Considering the rms errors:
$$\varepsilon = \sqrt{ \frac{ \sum_n \left| S_{ij}^p(\omega_n) - S_{ij}^m(\omega_n) \right|^2 }{ \sum_n \left| S_{ij}^m(\omega_n) \right|^2 } } \quad (5)$$

where $S_{ij}^p$ are the predicted S-parameters from the INN-DE model and $S_{ij}^m$ are the measured S-parameters [10]. The correlation coefficient can be used as a measure of the quality of the introduced INN-DE model and is given by:
$$r = \frac{ \sum_i (x_i - \bar{x})(y_i - \bar{y}) }{ \sqrt{ \sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2 } } \quad (6)$$
where $i = 1, 2, \ldots, n$, $n$ is the number of test patterns, and $x$ and $y$ are the measured and predicted values of the S-parameters, respectively. Only the overall correlation coefficient is calculated for each data set. The reported testing correlation coefficient is calculated by cross-validating the predictions for all the test data sets; therefore, only one testing correlation coefficient is reported per data set, as indicated in Table 2. It should be remarked that a correlation coefficient close to one indicates an excellent predictive ability for the proposed model [2,3].
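Equations (4)-(6) amount to three short NumPy routines (the helper names are illustrative; for simplicity this sketch applies them to real-valued magnitude data, whereas equation (5) would use the complex S-parameters in general):

```python
import numpy as np

def average_error(t, p):
    """Eq. (4): mean absolute difference between targets T_i and predictions P_i."""
    t, p = np.asarray(t, float), np.asarray(p, float)
    return np.mean(np.abs(t - p))

def rms_error(s_meas, s_pred):
    """Eq. (5): normalized rms error between measured and predicted S-parameters."""
    s_meas, s_pred = np.asarray(s_meas, float), np.asarray(s_pred, float)
    return np.sqrt(np.sum((s_pred - s_meas) ** 2) / np.sum(s_meas ** 2))

def correlation(x, y):
    """Eq. (6): Pearson correlation between measured x and predicted y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    dx, dy = x - x.mean(), y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
```

Applied to the measured/simulated S-parameter columns of Figures 3b-4d, these routines reproduce the kind of figures reported in Tables 1 and 2.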
Table 1. RMS error (%) for the test pattern, calculated from equation (5) (pattern not used in the training process)

S-Parameter | Gradient descent training (GD) | Differential evolution training (DE)
S11 | 1.1027 | 0.8943
S21 | 1.5569 | 1.6709
S12 | 1.9032 | 1.8944
S22 | 3.9206 | 2.7126
Table 2. Correlation coefficients for the test pattern (pattern not used in the training process)

S-Parameter | Gradient descent training (GD) | Differential evolution training (DE)
S11 | 0.9989 | 0.9996
S21 | 0.9998 | 0.9999
S12 | 0.9999 | 0.9999
S22 | 0.8026 | 0.8315
IV. RESULTS
The introduced INN-DE model, used to simulate the ATF-35143 PHEMT characteristics, is compared with the experimentally measured characteristics taken from the manufacturer's data sheet. The evaluation of the results can be seen in Tables 1 and 2. The predicted and measured S-parameters for the PHEMT are shown in Figures 3 and 4 for the gradient-based and optimized models, respectively, at a fixed bias. The predicted S-parameter data from both models are very close to the measured values obtained from the manufacturer's data sheet. Figures 3 and 4 compare the predicted S-parameters from both models with the experimental values on the Smith chart for Vds = 3 V and Ids = 15 mA, for the gradient-based model and the INN-DE optimized model respectively. The proposed INN-DE model matches the measured test values with good accuracy, as confirmed by the average error (equation 4), the rms error (equation 5) and the correlation coefficients (equation 6). This verifies the proposed small-signal model for the PHEMT. From equation (5), the rms errors are on the order of 1.5%, and the resulting correlation coefficients from equation (6) are in the range of 0.9, which indicates a good prediction capability of the proposed INN-DE model.
Sample results: gradient descent training

Figure 3a. Comparison between measured S11, S22 (solid line) and modeled S11, S22 (∆ ∆ line) on the Smith chart for Vds = 3 V, Ids = 15 mA.

Figure 3b. Measured and predicted S11, S22 magnitude and angle versus frequency for Vds = 3 V and Ids = 15 mA. ( __ ) measured data set, (--+--) predicted data set.

Freq (GHz) | S11m | S11sim | S22m | S22sim
1 | 0.94 | 0.93558 | 0.57 | 0.54883
2 | 0.84 | 0.82689 | 0.5 | 0.48016
3 | 0.73 | 0.71725 | 0.44 | 0.41147
4 | 0.63 | 0.62951 | 0.37 | 0.35189
5 | 0.58 | 0.57266 | 0.31 | 0.29384
6 | 0.56 | 0.55232 | 0.24 | 0.23126
7 | 0.55 | 0.54745 | 0.19 | 0.17556
8 | 0.56 | 0.56322 | 0.14 | 0.12954
9 | 0.6 | 0.59694 | 0.11 | 0.092272
10 | 0.64 | 0.63473 | 0.09 | 0.093712
Figure 3c. Comparison between measured S21, S12 (solid line) and modeled S21, S12 (∆ ∆ line) on the Smith chart for Vds = 3 V, Ids = 15 mA. (The values on the Smith chart are 0.1 × S21 and 4 × S12.)

Figure 3d. Measured and predicted S21, S12 magnitude and angle versus frequency for Vds = 3 V and Ids = 15 mA. ( __ ) measured data set, (--+--) predicted data set.

Freq (GHz) | S21m | S21sim | S12m | S12sim
1 | 6.97 | 7.0663 | 0.042 | 0.03986
2 | 6.13 | 6.2047 | 0.074 | 0.071895
3 | 5.31 | 5.3047 | 0.102 | 0.094129
4 | 4.59 | 4.5667 | 0.107 | 0.10664
5 | 4 | 3.9734 | 0.115 | 0.11396
6 | 3.53 | 3.5176 | 0.121 | 0.12116
7 | 3.14 | 3.1044 | 0.125 | 0.125
8 | 2.81 | 2.8022 | 0.127 | 0.12644
9 | 2.5 | 2.5536 | 0.128 | 0.12849
10 | 2.26 | 2.3149 | 0.13 | 0.13048
Sample results: optimized (INN-DE) model

Figure 4a. Comparison between measured S11, S22 (solid line) and modeled S11, S22 (∆ ∆ line) on the Smith chart for Vds = 3 V, Ids = 15 mA.

Figure 4b. Measured and predicted S11, S22 magnitude and angle versus frequency for Vds = 3 V and Ids = 15 mA. ( __ ) measured data set, (--+--) predicted data set.

Freq (GHz) | S11m | S11sim | S22m | S22sim
1 | 0.94 | 0.94566 | 0.54883 | 0.55962
2 | 0.84 | 0.8412 | 0.48016 | 0.4872
3 | 0.73 | 0.72677 | 0.41147 | 0.41493
4 | 0.63 | 0.63385 | 0.35189 | 0.35612
5 | 0.58 | 0.57932 | 0.29384 | 0.29775
6 | 0.56 | 0.55541 | 0.23126 | 0.23927
7 | 0.55 | 0.54778 | 0.17556 | 0.18807
8 | 0.56 | 0.55935 | 0.12954 | 0.14433
9 | 0.6 | 0.59825 | 0.092272 | 0.10658
10 | 0.64 | 0.63925 | 0.093712 | 0.09829
Figure 4c. Comparison between measured S21, S12 (solid line) and modeled S21, S12 (∆ ∆ line) on the Smith chart for Vds = 3 V, Ids = 15 mA. (The values on the Smith chart are 0.1 × S21 and 4 × S12.)

Figure 4d. Measured and predicted S21, S12 magnitude and angle versus frequency for Vds = 3 V and Ids = 15 mA. ( __ ) measured data set, (--+--) predicted data set.

Freq (GHz) | S21m | S21sim | S12m | S12sim
1 | 6.97 | 7.0721 | 0.042 | 0.041421
2 | 6.13 | 6.2491 | 0.074 | 0.073043
3 | 5.31 | 5.3815 | 0.102 | 0.093826
4 | 4.59 | 4.6595 | 0.107 | 0.10639
5 | 4 | 4.0315 | 0.115 | 0.11581
6 | 3.53 | 3.5602 | 0.121 | 0.1224
7 | 3.14 | 3.1084 | 0.125 | 0.12631
8 | 2.81 | 2.836 | 0.127 | 0.12621
9 | 2.5 | 2.5337 | 0.128 | 0.12773
10 | 2.26 | 2.2414 | 0.13 | 0.13033
V. CONCLUSION
In conclusion, the introduced INN-DE model has been used to model the small-signal characteristics of HEMTs effectively. Generalization of the model is proven by testing it with data that did not participate in the training process. In this study, differential evolution has been used as a global optimization method for feed-forward neural networks. The simulation results in Figures 3 and 4 show very slight differences between the differential-evolution-optimized INN-DE model and the gradient-based model; the optimized INN-DE model gives somewhat better results according to the rms error calculations. Differential evolution can also be used to verify that the optimum value of the objective function considered in this work has been reached. The Pearson correlation coefficient was used to evaluate the ability of the INN-DE model to predict the test data. The modeled S-parameters are close to the original data of the manufacturer, and the fit of the angle of S22 is very good up to about 8 GHz. It is not necessary to solve nonlinear regression equations, which have long running times; in this case, using neural network models is an advantage.
REFERENCES
[1] N. Oukhanski and H.-G. Meyer, "Low Noise Temperature PHEMT Readout for Quantum Devices," Proceedings of the Sixth European Workshop on Low Temperature Electronics (WOLTE-6), European Space & Technology Centre, Keplerlaan 1, 2201 AZ Noordwijk, The Netherlands, pp. 163-168, 23-25 June 2004.
[2] Zlatica Marinković, Aleksandar, Vera Marković, and Olivera Pronić, "ANNs in Bias-Dependent Modeling of S-parameters of Microwave FETs and HBTs," Microwave Review, June 2006.
[3] Marinković and Marković, "Prediction of HEMT's Scattering and Noise Parameters Using Neural Networks," Microwave Review, December 2002.
[4] Agilent Technologies, ATF-35143 technical data sheet.
[5] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999.
[6] H. Demuth and M. Beale, Neural Network Toolbox User's Guide (for use with MATLAB), The MathWorks Inc., 1998.
[7] J. Ilonen, J.-K. Kamarainen, and J. Lampinen, "Differential Evolution Training Algorithm for Feed-Forward Neural Networks," Neural Processing Letters, vol. 17, no. 1, pp. 93-105, 2003.
[8] K. Zielinski, D. Peters, and R. Laur, "Run Time Analysis Regarding Stopping Criteria for Differential Evolution and Particle Swarm Optimization," 1st International Conference on Experiments, System Modeling and Optimization, Athens, 6-9 July 2005.
[9] V. P. Plagianakos, D. G. Sotiropoulos, and M. N. Vrahatis, "Integer Weight Training by Differential Evolution Algorithms," in N. E. Mastorakis (ed.), Recent Advances in Circuits and Systems, pp. 327-331, World Scientific Publishing Co. Pte. Ltd., ISBN 9810236441, 1998.
[10] "GaAs MESFET Characteristics Using Least Squares Approximation by Rational Functions," IEEE Transactions on Microwave Theory and Techniques, vol. 41, no. 2, February 1993.
[11] M. Vai, S. Wu, B. Li, and S. Prasad, "Reverse Modeling of Microwave Circuits with Bidirectional Neural Network Models," IEEE Transactions on Microwave Theory and Techniques, vol. 46, no. 10, pp. 1492-1494, 1998.
[12] M. Vai and S. Prasad, "Microwave Circuit Analysis and Design by a Massively Distributed Computing Network," IEEE Transactions on Microwave Theory and Techniques, vol. 43, no. 5, pp. 1087-1094, 1995.
[13] M. Vai, Z. Xu, and S. Prasad, "Qualitatively Modeling Heterojunction Bipolar Transistors for Optimization: A Neural Network Approach," IEEE/Cornell Conference on Advanced Concepts in High Speed Semiconductor Devices and Circuits, pp. 219-227, 1993.
[14] L. Roschier and P. Hakonen, "Design of Cryogenic 700 MHz HEMT Amplifier," Helsinki University of Technology, Low Temperature Laboratory, Otakaari 3 A, Espoo, FIN-02015 HUT, Finland.
[15] Cengiz, Gunes, and Caglar, "Soft Computing Methods in Microwave Active Device Modeling," Turk J Elec Eng, vol. 13, no. 1, 2005, Turkey.
[16] M. Vai and S. Prasad, "Neural Networks in Microwave Circuit Design – Beyond Black Box Models," International Journal of RF and Microwave CAE, pp. 187-197, 1999.
[17] D. M. Pozar, Microwave Engineering, Third Edition, John Wiley and Sons, Inc., Hoboken, NJ, 2005.
[18] M. Vai, S. Wu, B. Li, and S. Prasad, "Creating Neural Network Based Microwave Circuit Models for Analysis and Synthesis," Asia Pacific Microwave Conference Proceedings, pp. 853-856, 1997.
[19] J. Liu and J. Lampinen, "A Differential Evolution Based Incremental Training Method for RBF Networks," GECCO '05, Washington, DC, USA, June 25-29, 2005.