
Int. J. Communications, Network and System Sciences, 2009, 6, 461-582 Published Online September 2009 in SciRes (http://www.SciRP.org/journal/ijcns/).

Copyright © 2009 SciRes. IJCNS

TABLE OF CONTENTS

Volume 2 Number 6 September 2009

Performance of Block Space-Time Code in Wireless Channel Dynamics
W. M. JANG, J. H. JUNG

Beam Pattern Scanning (BPS) versus Space-Time Block Coding (STBC) and Space-Time Trellis Coding (STTC)
P. K. TEH, S. ZEKAVAT

Rain Attenuation Impact on Performance of Satellite Ground Stations for Low Earth Orbiting (LEO) Satellites in Europe
S. CAKAJ

Detection, Identification and Tracking of Flying Objects in Three Dimensions Using Multistatic Radars
L. S. KALANTARI, S. MOHANNA, S. TAVAKOLI

Investigation into the Performance of a MIMO System Equipped with ULA or UCA Antennas: BER, Capacity and Channel Estimation
X. LIU, M. E. BIALKOWSKI, F. WANG, K. BIALKOWSKI

Efficient Bandwidth and Power Allocation Algorithms for Multiuser MIMO-OFDM Systems
J. SHU, W. GUO

Two-Dwell Synchronization Techniques and MIMO Systems for Performance Improvements of 3G Mobile Communications
F. BENEDETTO, G. GIUNTA

Bandwidth Optimization in 802.15.4 Networks through Evolutionary Slot Assignment
V. KRISHNAMURTHY, E. SAZONOV

A Scalable Architecture for Network Traffic Monitoring and Analysis Using Free Open Source Software
O. ABIONA, T. ALADESANMI, C. ONIME

Positioning a Node of Wireless Sensor Networks in 3 Dimensional Space
W. H. NIE, S. G. JU, A. R. XUE, F. LI

A Trust Model Based on the Multinomial Subjective Logic for P2P Network
J. F. TIAN, C. LI, X. M. HE, R. TIAN

A Rank-One Fitting Method with Descent Direction for Solving Symmetric Nonlinear Equations
G. L. YUAN, Z. X. WANG, Z. X. WEI

A Novel Packet Switch Node Architecture for Contention Resolution in Synchronous Optical Packet Switched Networks
V. S. SHEKHAWAT, D. K. TYAGI, V. K. CHAUBEY

A Network Intrusion Detection Model Based on Immune Multi-Agent
N. LIU, S. J. LIU, R. LI, Y. LIU

A Low Power and High Speed Viterbi Decoder Based on Deep Pipelined, Clock Blocking and Hazards Filtering
C. ARUN, V. RAJAMANI

International Journal of Communications, Network and System Sciences

(IJCNS)

Journal Information

SUBSCRIPTIONS

The International Journal of Communications, Network and System Sciences (Online at Scientific Research Publishing, www.SciRP.org) is published monthly by Scientific Research Publishing, Inc., USA.

E-mail: [email protected]

Subscription rates: Volume 2 2009 Print: $50 per copy.

Electronic: free, available on www.SciRP.org.

To subscribe, please contact Journals Subscriptions Department, E-mail: [email protected]

Sample copies: If you are interested in subscribing, you may obtain a free sample copy by contacting Scientific Research Publishing, Inc. at the above address.

SERVICES

Advertisements

Advertisement Sales Department, E-mail: [email protected]

Reprints (minimum quantity 100 copies)

Reprints Co-ordinator, Scientific Research Publishing, Inc., USA.


COPYRIGHT

Copyright© 2009 Scientific Research Publishing, Inc.

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as described below, without the permission in writing of the Publisher.

Copying of articles is not permitted except for personal and internal use, to the extent permitted by national copyright law, or under the terms of a license issued by the national Reproduction Rights Organization.

Requests for permission for other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works or for resale, and other enquiries should be addressed to the Publisher.

Statements and opinions expressed in the articles and communications are those of the individual contributors and not the statements and opinions of Scientific Research Publishing, Inc. We assume no responsibility or liability for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained herein. We expressly disclaim any implied warranties of merchantability or fitness for a particular purpose. If expert assistance is required, the services of a competent professional person should be sought.

PRODUCTION INFORMATION

For manuscripts that have been accepted for publication, please contact:


Int. J. Communications, Network and System Sciences, 2009, 6, 461-468 doi:10.4236/ijcns.2009.26050 Published Online September 2009 (http://www.SciRP.org/journal/ijcns/).



Performance of Block Space-Time Code in Wireless Channel Dynamics

Won Mee JANG, Jong Hak JUNG

1 The Peter Kiewit Institute of Information Science, Technology & Engineering, Omaha, USA
2 Department of Computer and Electronics Engineering, University of Nebraska–Lincoln, Omaha, USA
Email: [email protected], [email protected]

Received March 9, 2009; revised May 8, 2009; accepted July 15, 2009

ABSTRACT

In this work, we observe the behavior of block space-time code in wireless channel dynamics. The block space-time code is optimally constructed in slow fading. The block code in quasi-static fading channels provides affordable complexity in design and construction. Our results show that the performance of the block space-time code may not be as good as conventional convolutional coding with serial transmission for some channel features. As the channel approaches fast fading, a coded single antenna scheme can collect as much diversity as desired by correctly choosing the free distance of the code. The results also point to the need for robust space-time code in dynamic wireless fading channels. We expect that self-encoded spread spectrum with block space-time code will provide a robust performance in dynamic wireless fading channels.

Keywords: Space-Time Codes, Diversity, Multiple Transmit Antennas, Self-Encoded Spread Spectrum.

1. Introduction

Space-time codes introduce temporal and spatial correlation into signals transmitted from different antennas in order to provide diversity at the receiver as well as coding gain without sacrificing bandwidth [1,2]. Most optimal space-time codes have been developed as block codes for slow fading channels [1,3–7]. The block code in quasi-static fading channels provides affordable complexity in design and construction. It has been shown that in a system with t transmit and r receive antennas, and a slow fading channel, the average channel capacity with perfect channel state information (CSI) at the receiver is about min(t, r) times larger than that of a single antenna system [8]. A tight exponential upper bound has been obtained on the decoding error probability of block codes transmitted over fully interleaved fading channels with perfect CSI at the receiver [10]. These bounds do not require integration in their final version, and they are reasonably tight in a certain portion of the rate region that exceeds the cutoff rate of the channel. If channel state information is also available to the transmitter, very high capacity is achievable without the need for time diversity [9]. However, mobile communication channels are dynamic and change from slow fading to fast fading rather quickly as mobile speeds and surrounding structures change. Some space-time codes have been developed for fast fading channels under the assumption of low data rates and low signal-to-noise ratios [1]. Nevertheless, as wireless Internet services are incorporated into mobile communications, space-time codes that are optimal for fast fading channels, high data rate and low bit error rate (BER) are required.

In this work, we consider the performance of the current space-time code that is optimally constructed for slow fading channels (hereafter called the block space-time code). We ask whether the performance of the block space-time code would be at least as good as that of a conventional serial code when channel characteristics change dynamically. We analyze and compare two different transmitter structures: the parallel transmitter that employs the block space-time code and the serial signal transmitter. A conventional channel code, such as a convolutional code with a single transmit antenna, is used in the serial signal transmitter. We compare the BER of the two systems under the same bandwidth, average transmit power and data rate, and with similar encoder processing complexity. The results suggest that the performance of the block space-time code can fall below that of the conventional code for some channel features.


Figure 1. Trellis diagram of block space-time code and convolutional code.

Figure 2. Transmitter structures.

Self-encoded spread spectrum (SESS), introduced in [11], eliminates the need for traditional pseudo noise (PN) code generators. As the term implies, the spreading code is obtained from the random digital information source itself. A multiuser convolutional code directly applicable to self-encoded multiple access (SEMA) in cellular systems is developed in [12]. A chip interleaving and iterative detection scheme for SEMA improves system performance significantly in fading channels [13]. The cooperative SESS performance is shown to be superior to that of other conventional cooperative systems [14]. We are currently conducting research on SESS with multiple-input multiple-output (MIMO). Our future work is to develop SESS block space-time code. Due to the inherent time diversity in SESS, we expect SESS block space-time code to maintain a robust performance in dynamic wireless fading channels.


2. System Model

2.1. Block Space-Time Code

We consider base-station-to-mobile communication where the base station is equipped with n antennas and the mobile is equipped with r antennas. Data are encoded by the channel encoder, and the encoded data go through a serial-to-parallel converter and are split into n streams of data. Each stream of data is the input to a pulse shaper. Then, the output of each shaper is modulated. We consider the 8-state trellis code [1] shown in Figure 1. We use the 8-state 4-PSK trellis space-time code to obtain numerical results for comparison, although our analysis can be generalized to other space-time codes. The space-time code and 4-PSK modulation using two transmit antennas are shown in Figure 2. The two streams of incoming data, data-1, d1, and data-2, d2, are encoded, pulse-shaped, modulated and transmitted in parallel over the two transmit antennas. Alternatively, the two data streams can be considered as split from a common data source. The signal constellation employed here is 4-PSK and the signal points are labeled by the elements of Z4, the ring of integers modulo 4. The edge label c1c2 in Figure 1 shows that signal c1 is transmitted over the first antenna and that signal c2 is transmitted over the second antenna. This code can be described in terms of a sequence (d1, d2) of binary inputs. The output signal pair $(c_t^1, c_t^2)$ at time t is given by [1]

$$(c_t^1, c_t^2) = d_{t-2}^1 (2,2) + d_{t-1}^2 (2,0) + d_{t-1}^1 (1,0) + d_t^2 (0,2) + d_t^1 (0,1) \tag{1}$$
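As a minimal sketch of how Equation (1) can be evaluated, the Python function below encodes two bit streams into label pairs. It assumes that the addition is carried out modulo 4 (the labels live in Z4) and that the encoder starts from an all-zero state; the function name `sttc_encode` is hypothetical and is not taken from the paper.

```python
def sttc_encode(d1, d2):
    """Encode two equal-length bit streams into (c1, c2) label pairs per Equation (1)."""
    assert len(d1) == len(d2)
    out = []
    for t in range(len(d1)):
        d1_t, d2_t = d1[t], d2[t]
        # Assumption: bits before t = 0 are zero (all-zero initial state).
        d1_tm1 = d1[t - 1] if t >= 1 else 0
        d2_tm1 = d2[t - 1] if t >= 1 else 0
        d1_tm2 = d1[t - 2] if t >= 2 else 0
        # Each input bit multiplies its generator pair from Equation (1).
        terms = [
            (d1_tm2, (2, 2)),
            (d2_tm1, (2, 0)),
            (d1_tm1, (1, 0)),
            (d2_t, (0, 2)),
            (d1_t, (0, 1)),
        ]
        c1 = sum(bit * g[0] for bit, g in terms) % 4  # label sent on antenna 1
        c2 = sum(bit * g[1] for bit, g in terms) % 4  # label sent on antenna 2
        out.append((c1, c2))
    return out


# A single 1 on the second stream yields the labels 0 2 followed by 2 0.
codeword = sttc_encode([0, 0], [1, 0])
```

Under these assumptions, a single nonzero bit on d2 produces the label sequence (0, 2, 2, 0), the codeword that the performance analysis later uses as the free-distance event.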

At each time slot t, the output of modulator i is a signal $c_t^i$ that is transmitted using transmit antenna i for 1 ≤ i ≤ n. The n signals are transmitted simultaneously, each from a different transmit antenna, and all signals have the same transmission period.

The signal at each receive antenna is the sum of the n transmitted signals, contaminated by noise and corrupted by Rayleigh fading. We assume that the elements of the signal constellation are normalized by a factor of $\sqrt{E_b}$, where $E_b$ is the bit energy, so that the average energy of the constellation is unity. A decision is based on the received signals at each receive antenna 1 ≤ j ≤ r. The signal $y_t^j$ received by antenna j at time t is given by

$$y_t^j = \sum_{i=1}^{n} \alpha_{i,j}^t \, c_t^i \sqrt{E_b} + n_t^j \tag{2}$$

where the noise $n_t^j$ at time t is a complex Gaussian random variable with zero mean and variance $N_0/2$ per dimension, independent for all j and t. The coefficient $\alpha_{i,j}^t$ is the path gain from transmit antenna i to receive antenna j at time t. We are interested in the behavior of the block space-time codes that are optimally constructed for slow fading as channel dynamics change to independent path gains for every i, j and t.

A maximum-likelihood sequence detector is applied for decoding. We assume ideal channel state information; thus, the path gains $\alpha_{i,j}^t$, i = 1, 2, …, n, j = 1, 2, …, r, are precisely known to the receiver. Since $y_t^j$ is the received signal at receive antenna j at time t, the branch metric for a transition labeled $c_t^1 c_t^2 \cdots c_t^n$ is given by

$$\sum_{j=1}^{r} \left| y_t^j - \sum_{i=1}^{n} \alpha_{i,j}^t \, c_t^i \right|^2 \tag{3}$$

Viterbi decoding is then applied to obtain the path with the lowest accumulated metric.
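The branch metric can be illustrated with a short, hedged Python sketch. It assumes unit bit energy and maps a Z4 label k to the 4-PSK point exp(jπk/2); this mapping, the helper names `psk4` and `branch_metric`, and the path-gain values are illustrative assumptions, not details given in the paper.

```python
import cmath
import math


def psk4(label):
    """Assumed mapping from a Z4 label to a unit-energy 4-PSK point."""
    return cmath.exp(1j * math.pi * label / 2)


def branch_metric(y, alpha, labels):
    """Sum over receive antennas j of |y[j] - sum_i alpha[i][j] * symbol_i|^2."""
    symbols = [psk4(c) for c in labels]
    total = 0.0
    for j in range(len(y)):
        expected = sum(alpha[i][j] * symbols[i] for i in range(len(symbols)))
        total += abs(y[j] - expected) ** 2
    return total


# Noise-free example: n = 2 transmit antennas, r = 1 receive antenna.
alpha = [[0.8 + 0.3j], [-0.2 + 0.9j]]  # hypothetical path gains alpha[i][j]
sent = (0, 2)
y = [sum(alpha[i][0] * psk4(sent[i]) for i in range(2))]

# Evaluate the metric for all 16 candidate label pairs and keep the smallest.
metrics = {(a, b): branch_metric(y, alpha, (a, b)) for a in range(4) for b in range(4)}
best = min(metrics, key=metrics.get)
```

In a full decoder these per-branch values would be accumulated along trellis paths by the Viterbi algorithm; the sketch only shows a single time slot.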

2.2. Conventional Serial Code

Convolutional coding is applied to each data stream as shown in Figure 2. The encoded symbols are serial-to-parallel converted and fed to the modulator. The required bandwidth is equivalent to that of the block space-time code, although the bandwidth expansion produced by the encoder can be considerably less than the reciprocal of the code rate [15]. We use the convolutional code with the code rate (1/R), the constraint length (K), and the generators equal to 1/2, 3 and [5 7] in octal, respectively [16]. This code has a free distance (dfree) equal to 5. The two 4-state trellis diagrams of this code are shown in Figure 1. The code can be described in terms of a sequence of binary inputs. The output, $(c_t^{1,1}, c_t^{1,2})$, from the first encoder at time t is given by

$$(c_t^{1,1}, c_t^{1,2}) = d_{t-2}^1 (1,1) + d_{t-1}^1 (0,1) + d_t^1 (1,1) \tag{4}$$

Modulo 2 addition is performed to obtain the encoder output pairs. Likewise, the output of the second encoder, $(c_t^{2,1}, c_t^{2,2})$, can be generated. The encoders' outputs are serial-to-parallel converted and fed to the modulator. The signal constellation employed here is 16-QAM for a single transmit antenna and the signal points are labeled by the elements of Z16. Considering that 16-QAM and 4-PSK display approximately 5 dB difference for Eb/N0 ≥ 0 dB, these modulation schemes are more favorable for the block space-time code system. Nevertheless, our results show that the conventional serial code system can outperform the block space-time code in some channel characteristics.
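The rate-1/2 encoder with octal generators [5 7] can be sketched in a few lines of Python. The function name `conv_encode_5_7` is hypothetical, and the all-zero initial state is an assumption made for the example.

```python
def conv_encode_5_7(bits):
    """Rate-1/2, constraint-length-3 encoder with octal generators [5 7]."""
    out = []
    for t in range(len(bits)):
        d_t = bits[t]
        # Assumption: the shift register starts in the all-zero state.
        d_tm1 = bits[t - 1] if t >= 1 else 0
        d_tm2 = bits[t - 2] if t >= 2 else 0
        c1 = (d_tm2 + d_t) % 2          # generator 5 = 101 (taps on d_t and d_{t-2})
        c2 = (d_tm2 + d_tm1 + d_t) % 2  # generator 7 = 111 (taps on all three bits)
        out.append((c1, c2))
    return out


# A single 1 followed by two zeros traces the minimum-weight path.
pairs = conv_encode_5_7([1, 0, 0])
weight = sum(c1 + c2 for c1, c2 in pairs)
```

For the input 1, 0, 0 the output pairs are (1,1), (0,1), (1,1), whose total Hamming weight of 5 agrees with the free distance quoted above.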

The output of a 16-QAM modulator can be represented as a complex number,

$$c_t = (2c_t^{1,1} + c_t^{1,2}) + \zeta \, (2c_t^{2,1} + c_t^{2,2}) \tag{5}$$

where $\zeta = \sqrt{-1}$. The signal $y_t^j$ received by antenna j at time t is given by

$$y_t^j = \alpha_{1,j}^t \, c_t \sqrt{2E_b} + n_t^j \tag{6}$$

where $\alpha_{1,j}^t$ is the path gain from the single transmit antenna to receive antenna j at time t. Notice that we scaled the transmit bit energy to maintain the same average bit energy in both systems for fair comparison. For Viterbi decoding, we replace Equation (3) with

$$\sum_{j=1}^{r} \left| y_t^j - \alpha_{1,j}^t \, c_t \right|^2 \tag{7}$$

3. Performance Analysis

3.1. Block Space-Time Code

From Figure 1, the codeword (0, 2, 2, 0) has the free distance from the all-zero codeword, (0, 0, 0, 0). The free distance is defined as the minimum Hamming weight of all possible nonzero codewords. For moderate and high signal-to-noise ratios, it is well known that the free-distance term in the union bound on the BER performance dominates the bound [17,18]. Assuming ideal channel state information, the probability of transmitting c = (0, 0, 0, 0) and deciding in favor of e = (0, 2, 2, 0) at the decoder is

$$P_r\left(\mathbf{c} \to \mathbf{e} \mid \alpha_{i,j}^t,\ i = 1, \ldots, n,\ j = 1, \ldots, r,\ t = 1, \ldots, l\right) = Q\left(\sqrt{d^2(\mathbf{c}, \mathbf{e}) \, E_b / 2N_0}\right) \tag{8}$$

where l is the block length, $N_0/2$ is the noise variance per dimension,

$$Q(z) = \frac{1}{\sqrt{2\pi}} \int_z^{\infty} e^{-x^2/2} \, dx$$

and

$$d^2(\mathbf{c}, \mathbf{e}) = \sum_{j=1}^{r} \sum_{t=1}^{l} \left| \sum_{i=1}^{n} \alpha_{i,j}^t \, (c_t^i - e_t^i) \right|^2 \tag{9}$$

For r = 1,

$$d^2(\mathbf{c}, \mathbf{e}) = \left| \alpha_{1,1}^1 (c_1^1 - e_1^1) + \alpha_{2,1}^1 (c_1^2 - e_1^2) \right|^2 + \left| \alpha_{1,1}^2 (c_2^1 - e_2^1) + \alpha_{2,1}^2 (c_2^2 - e_2^2) \right|^2 \tag{10}$$

Applying c = (0, 0, 0, 0) and e = (0, 2, 2, 0),

$$d^2(\mathbf{c}, \mathbf{e}) = 4 \left( |\alpha_{2,1}^1|^2 + |\alpha_{1,1}^2|^2 \right) \tag{11}$$

Since $\alpha_{i,j}^t$ is Rayleigh fading, the probability density function (pdf) of $V = |\alpha_{i,j}^t|^2$ can be shown as [19]

$$f_V(v) = e^{-v} \, U(v) \tag{12}$$

to maintain the same average received power without fading. U(v) is the unit step function. With r receive antennas and

$$X = \sum_{j=1}^{r} \left( |\alpha_{2,j}^1|^2 + |\alpha_{1,j}^2|^2 \right)$$

the pdf of X can be represented as a Gamma distribution with the parameter 2r as [19]

$$f_X(x) = \frac{1}{(2r-1)!} \, x^{2r-1} e^{-x} \, U(x) \tag{13}$$

Therefore,

$$P_r(\mathbf{c} \to \mathbf{e}) = \int_0^{\infty} 2Q\left( \sqrt{2 \, (2x E_b / N_0)} \, \sin(\pi/4) \right) f_X(x) \, dx \tag{14}$$

In Equation (14), we applied the approximation of the symbol error probability of QPSK [20],

$$P_e \approx 2Q\left( \sqrt{\frac{2 E_b \log_2 M}{N_0}} \, \sin\frac{\pi}{M} \right) \tag{15}$$

with M = 4. Therefore, the probability of the bit error can be found as [17]

$$P_b \approx \frac{N_{dfree} \, B_d}{k} \, P_r(\mathbf{c} \to \mathbf{e}) \tag{16}$$

where Bd and k are the number of nonzero information bits and the total number of information bits, respectively, on the dfree path. The error coefficient, Ndfree, is the total number, or multiplicity, of the free distance codeword. For the chosen codewords, c = (0, 0, 0, 0) and e = (0, 2, 2, 0), Ndfree, Bd and k are 1, 1 and 4, respectively.

3.2. Conventional Serial Code

From Figure 1, the codeword (3, 1, 3) has the free distance from the all-zero codeword, (0, 0, 0). Therefore, the probability of transmitting c = (0, 0, 0) and deciding in favor of e = (3, 1, 3) can be obtained as in Equation (8) with $d^2(\mathbf{c}, \mathbf{e})$ as follows for r = 1:

$$d^2(\mathbf{c}, \mathbf{e}) = \left| \alpha_{1,1}^1 (c_1^1 - e_1^1) \right|^2 + \left| \alpha_{1,1}^2 (c_2^1 - e_2^1) \right|^2 + \left| \alpha_{1,1}^3 (c_3^1 - e_3^1) \right|^2 \tag{17}$$

Using c = (0, 0, 0) and e = (3, 1, 3),

$$d^2(\mathbf{c}, \mathbf{e}) = 9 |\alpha_{1,1}^1|^2 + |\alpha_{1,1}^2|^2 + 9 |\alpha_{1,1}^3|^2 \tag{18}$$
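The averaging over the fading in Equation (14) can be checked numerically. The sketch below is illustrative only: it reads the integrand of Equation (14) as 2Q(√(2·2x·Eb/N0)·sin(π/4)) weighted by the Gamma pdf of Equation (13), reads Equation (16) as the pairwise error probability scaled by Ndfree·Bd/k, and uses a plain trapezoidal rule with a truncation point `x_max` chosen by us. All function names are hypothetical.

```python
import math


def q_func(z):
    """Gaussian tail probability Q(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def gamma_pdf(x, k):
    """pdf x^(k-1) e^(-x) / (k-1)! for integer k; Equation (13) uses k = 2r."""
    return x ** (k - 1) * math.exp(-x) / math.factorial(k - 1)


def pairwise_error_block(ebn0_db, r, x_max=60.0, steps=60000):
    """Trapezoidal evaluation of the integral in Equation (14) on [0, x_max]."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    h = x_max / steps
    total = 0.0
    for m in range(steps + 1):
        x = m * h
        arg = math.sqrt(2.0 * 2.0 * x * ebn0) * math.sin(math.pi / 4)
        f = 2.0 * q_func(arg) * gamma_pdf(x, 2 * r)
        total += f if 0 < m < steps else 0.5 * f  # half weight at both endpoints
    return total * h


def bit_error_block(ebn0_db, r, n_dfree=1, b_d=1, k=4):
    """Assumed reading of Equation (16) with the values 1, 1 and 4 quoted in the text."""
    return n_dfree * b_d / k * pairwise_error_block(ebn0_db, r)
```

Because 2Q(√(2γx)) ≤ exp(−γx), the integral is bounded by (1 + Eb/N0)^(−2r), which gives a quick sanity check on the numerical result and shows the diversity order of 2r.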




Figure 3. Block space-time code, two transmit antennas, 8 states, 4-PSK, 1, 2 and 4 receive antennas, Rayleigh.

Figure 4. Convolutional code, single transmit antenna, two 4 states, 16-QAM, 1, 2 and 4 receive antennas, Rayleigh.


Figure 5. Analytical BER of block space-time code and convolutional code, 1, 2 and 4 receive antennas, Rayleigh.

Figure 6. Simulation BER of block space-time code and convolutional code, a half average transmit power for convolutional code, 1, 2 and 4 receive antennas, Rayleigh.




With r receive antennas, the distance in Equation (18) can be represented as a combination of two random variables,

$$X = \sum_{j=1}^{r} \left( |\alpha_{1,j}^1|^2 + |\alpha_{1,j}^3|^2 \right)$$

and

$$Y = \sum_{j=1}^{r} |\alpha_{1,j}^2|^2$$

Consequently [19],

$$f_X(x) = \frac{1}{(2r-1)!} \, x^{2r-1} e^{-x} \, U(x), \quad \text{and} \quad f_Y(y) = \frac{1}{(r-1)!} \, y^{r-1} e^{-y} \, U(y) \tag{19}$$

Therefore, the error probability is

$$P_r(\mathbf{c} \to \mathbf{e}) = \int_0^{\infty} \int_0^{\infty} Q\left( \sqrt{(9x + y)(4E_b / 10N_0)} \right) f_X(x) \, f_Y(y) \, dx \, dy \tag{20}$$

In Equation (20), we employ the approximation of the symbol error probability, $P_e \approx 3Q\left(\sqrt{4E_b / 5N_0}\right)$, for 16-QAM modulation [20]. The probability of the bit error is obtained from Equation (16), where Ndfree, Bd and k are 1, 1 and 3, respectively.
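One way to evaluate the double integral of Equation (20) without numerical quadrature is a Monte Carlo average over Gamma-distributed samples. The following Python sketch assumes X ~ Gamma(2r) and Y ~ Gamma(r) with unit scale, as in Equation (19); the function name, sample count and seed are arbitrary choices made for illustration.

```python
import math
import random


def q_func(z):
    """Gaussian tail probability Q(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def pairwise_error_serial(ebn0_db, r, samples=20000, seed=1):
    """Monte Carlo estimate of Equation (20): mean of Q(sqrt((9x + y) * 4Eb / (10 N0)))."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    total = 0.0
    for _ in range(samples):
        x = rng.gammavariate(2 * r, 1.0)  # X: sum of 2r unit-mean exponentials
        y = rng.gammavariate(r, 1.0)      # Y: sum of r unit-mean exponentials
        total += q_func(math.sqrt((9.0 * x + y) * 4.0 * ebn0 / 10.0))
    return total / samples


p_r1 = pairwise_error_serial(10.0, 1)
p_r2 = pairwise_error_serial(10.0, 2)
```

The estimate is only as accurate as the sample count allows, so at high SNR or with many receive antennas a deterministic integration or more samples would be needed.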

4. Simulation Results

We assume that perfect channel state information is available at the receive antennas in the following simulations. We consider the dynamic channel characteristics of independent fading for every bit interval. In Figure 3, the simulation BER and the analytical BER of the block space-time code with 8 states, 4-PSK and two transmit antennas are compared in Rayleigh fading channels for one, two and four receive antennas. The simulation BER approaches the analytical BER at high signal-to-noise ratios (SNR). With a larger number of receive antennas, the rate of the approach becomes faster. With two or more receive antennas, the difference between the simulation and the analysis is less than 1 dB for SNR ≥ 6 dB. The receive antenna diversity significantly improves the system performance as expected.

The BER performance of two four-state convolutional codes and a 16-QAM system with a single transmit antenna is displayed in Figure 4. The simulation BER rapidly approaches the analytical BER. The BER difference is less than 1 dB for all SNR. We observe similar effects of receive antenna diversity as in the block space-time codes.

The analytical results of the two systems are shown in Figure 5. We can see that the BER of the conventional serial code is better than that of the block space-time code for all SNR under the same bandwidth, average transmit power, data rate and a similar processing complexity as channel dynamics approach independent fading in each bit interval. The difference becomes larger at high SNR. We show the simulation BER of the block space-time code and the conventional serial code in Figure 6. Half of the average transmit power of the block space-time code is assigned to the conventional serial code transmission. Now we observe that the BER of both systems is equivalent.

Our results show that for the same Eb/N0, the performance of the block space-time code may not be as good as that of the conventional convolutional code for some channel features. As channel dynamics reach fast fading, a coded single antenna scheme can collect as much diversity as desired by suitably choosing the free distance of the code. Block space-time codes are most useful in slow fading when temporal diversity is not available. The advantage of the block space-time code can be diminished significantly as wireless channels approach fast fading.

5. Conclusions

In this paper, we show that the block space-time coding gain can be degraded below that of conventional channel coding with a single transmit antenna for some channel characteristics. Our results suggest that there is a need for a robust space-time code in rapidly changing wireless channels. Our future work is to develop SESS block space-time code. Due to the inherent time diversity in SESS, we expect SESS block space-time code to provide a robust performance in dynamic wireless channels.

6. Acknowledgements

This work was supported in part by the contract award FA9550-08-1-0393 from the U.S. Air Force Office of Scientific Research. Thanks are due to Dr. J. A. Sjogren whose support has allowed the authors to investigate the feasibility of self-encoded block space-time code in dynamic wireless fading channels.

7. References

[1] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: Performance criterion and code construction,” IEEE Transactions on Information Theory, Vol. 44, No. 2, pp. 744–765, March 1998.

[2] S. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE Journal on Selected Areas in Communications, Vol. 16, No. 8, pp. 1451–1458, 1998.

[3] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block coding for wireless communications: Performance results,” IEEE Journal on Selected Areas in Communications, Vol. 17, No. 3, pp. 451–460, March 1999.

[4] S. Baro, G. Bauch, and A. Hansmann, “Improved codes for space-time trellis-coded modulation,” IEEE Communications Letters, Vol. 4, No. 1, pp. 20–22, January 2000.

[5] H. E. Gamal and A. R. Hammons, “On the design of algebraic space-time codes for MIMO block-fading channels,” IEEE Transactions on Information Theory, Vol. 49, No. 1, pp. 151–163, January 2003.

[6] H. E. Gamal and A. R. Hammons, “On the design and performance of algebraic space-time codes for BPSK and QPSK modulation,” IEEE Transactions on Communications, Vol. 50, No. 6, pp. 907–913, June 2002.

[7] A. Song, G. Wang, W. Su, and X. G. Xia, “Unitary space-time codes from Alamouti’s scheme with APSK signals,” IEEE Transactions on Wireless Communications, Vol. 3, No. 6, pp. 2374–2384, November 2004.

[8] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Personal Communications, Vol. 6, No. 3, pp. 311–335, March 1998.

[9] E. Biglieri, G. Caire, and G. Taricco, “Limiting performance of block-fading channels with multiple antennas,” IEEE Transactions on Information Theory, Vol. 47, No. 4, pp. 1273–1289, May 2001.

[10] I. Sason, S. Shamai, and D. Divsalar, “Tight exponential upper bounds on the ML decoding error probability of block codes over fully interleaved fading channels,” IEEE Transactions on Communications, Vol. 51, No. 8, pp. 1296–1305, August 2003.

[11] L. Nguyen, “Self-encoded spread spectrum and multiple access communications,” Proceedings of the IEEE 6th International Conference on Spread Spectrum Technology & Applications, ISSSTA2000, Vol. 2, pp. 394–398, NJIT, NJ, September 6–8, 2000.

[12] J. H. Jung, W. M. Jang, and L. Nguyen, “Self-encoded multiple access multiuser convolutional codes in uplink and downlink cellular systems,” International Journal of Communications, Network and System Sciences, Vol. 2, No. 4, pp. 249–257, July 2009.

[13] Y. S. Kim, W. M. Jang, Y. Kong, and L. Nguyen, “Chip-interleaved self-encoded multiple access with iterative detection in fading channels,” Journal of Communications and Networks, Vol. 9, No. 1, pp. 50–55, March 2007.

[14] K. Hua, W. M. Jang, and L. Nguyen, “Cooperative self encoded spread spectrum in fading channels,” International Journal of Communications, Network and System Sciences, Vol. 2, No. 2, pp. 91–96, May 2009.

[15] D. Divsalar and M. K. Simon, “Spectral characteristics of convolutionally coded digital signals,” IEEE Transactions on Communications, Vol. COM-28, No. 2, pp. 173–186, February 1980.

[16] J. G. Proakis, Digital Communications, 4th Edition, pp. 492, McGraw Hill, 2001.

[17] S. Lin and D. J. Costello, Error Control Coding, 2nd Edition, Pearson Prentice Hall, Upper Saddle River, NJ, 2004.

[18] L. C. Perez, J. Seghers, and D. J. Costello, “A distance spectrum interpretation of Turbo codes,” IEEE Transactions on Information Theory, Vol. 42, No. 6, pp. 1698–1709, Part I, November 1996.

[19] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes, 4th Edition, pp. 87, 190, McGraw Hill, 2002.

[20] B. P. Lathi, Modern Digital and Analog Communication Systems, 3rd Edition, pp. 612–614, Oxford University Press, New York, 1998.




Beam Pattern Scanning (BPS) versus Space-Time Block Coding (STBC) and Space-Time Trellis Coding (STTC)

Peh Keong TEH, Seyed (Reza) ZEKAVAT

Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, Michigan, USA
Email: rezaz, [email protected]

Received April 2, 2009; revised June 10, 2009; accepted July 22, 2009

ABSTRACT

In this paper, Beam Pattern Scanning (BPS), a transmit diversity technique, is compared with two well known transmit diversity techniques, space-time block coding (STBC) and space-time trellis coding (STTC). In BPS (also called beam pattern oscillation), controlled time varying weight vectors are applied to the antenna array elements mounted at the base station (BS). This creates a small movement in the antenna array pattern directed toward the desired user. In rich scattering environments, this small beam pattern movement creates an artificial fast fading channel. The receiver is designed to exploit time diversity benefits of the fast fading channel. Via the application of simple combining techniques, BPS improves the probability-of-error performance and network capacity with minimal cost and complexity.

In this work, to highlight the potential of BPS, we compare BPS and Space-Time Coding (i.e., STBC and STTC) schemes. The comparisons are in terms of their complexity, system physical dimension, network capacity, probability-of-error performance, and spectrum efficiency. It is shown that BPS leads to higher network capacity and performance with a smaller antenna dimension and complexity with minimal loss in spectrum efficiency. This identifies BPS as a promising scheme for future wireless communications with smart antennas.

Keywords: Antenna Array, Beam Pattern Sweeping, Transmit Diversity, Space-Time Block Codes, and Space-Time Trellis Coding.

1. Introduction

Transmit diversity schemes use arrays of antennas at the transmitter to create diversity at the receiver. Different transmit diversity techniques have been introduced to mitigate fading effects in wireless communications [1–5]. Examples are space-time block coding [1–3], space-time trellis coding [3–5], antenna hopping [6] and delay diversity [6,7].

In Space-Time Block Coding (STBC), data is encoded by a channel coder and the encoded data is split into N unique streams, simultaneously transmitted over N antenna array elements. At the receiver, the symbols are decoded using a maximum likelihood decoder. This scheme combines the benefits of channel coding and diversity transmission, providing BER performance gains. However, receiver complexity increases as a function of bandwidth efficiency [3], and a high number of antennas is required to achieve high diversity orders. Moreover, antenna elements should be located far enough apart to achieve space diversity, and when antenna arrays at the base station (BS) are used in this fashion, directionality benefits are no longer available [1–3]. This reduces the network capacity of wireless systems in terms of number of users.

In Space-Time Trellis Coding (STTC), information symbols are encoded by a unique space-time channel coder and the encoded information symbols are split into N unique streams, simultaneously transmitted over N antenna array elements. At the receiver, after receiving a block of symbols called a frame (e.g., 130 symbols per frame), the Viterbi algorithm is used to recover and error-correct the information symbols in the frame [3–5]. This scheme combines the benefits of space diversity and coding gain, providing a significant probability-of-error performance gain. However, the receiver complexity increases exponentially as a function of the number of trellis states (transmit antennas); and, in general, a high order of trellis states (transmit antennas) is required to achieve high diversity and coding gain [8,9]. Moreover, similar to STBC, in STTC antenna array elements should be located far enough apart to achieve space diversity, which reduces STTC network capacity in terms of number of users.

BPS has been introduced as a powerful transmit diversity technique capable of enhancing both wireless network capacity and probability-of-error performance with minimal cost [10–13]. In this scheme, antenna elements located at a distance of half a wavelength form an antenna array. These antenna arrays are mounted at the BS. They are incorporated to create directional beams steered toward the desired users. Time varying phase shifts are applied to the antenna elements to move the antenna pattern within the symbol duration Ts. The antenna pattern starts from a point in space at time zero, sweeps an area of space from time 0 to Ts, returns to its initial position after time Ts, and then repeats the same sweep. The beam pattern movement is small, e.g., on the order of 5% of the half power beam width (HPBW). Simulations in [10] have shown that in rich scattering environments, BPS leads to a time varying channel with a small coherence time Tc with respect to Ts. This generates an artificially created fast fading channel, leading to a time diversity that can be exploited at the receiver [10,11]. Hence, BPS leads to: a) high performance via time diversity, and b) high network capacity (in terms of number of users) via the directionality inherent in BPS.
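The beam movement described above can be pictured with a small Python sketch. It is not the authors' implementation: it assumes a uniform linear array with half-wavelength spacing, phase-only steering weights, and a linear sweep of the steering angle that resets every Ts. The function names and the sweep law are illustrative assumptions.

```python
import cmath
import math


def steering_weights(num_elements, theta):
    """Phase-only weights steering a half-wavelength-spaced linear array to theta (radians from broadside)."""
    return [cmath.exp(-1j * math.pi * k * math.sin(theta)) for k in range(num_elements)]


def array_factor(weights, phi):
    """Magnitude of the array response toward direction phi."""
    return abs(sum(w * cmath.exp(1j * math.pi * k * math.sin(phi))
                   for k, w in enumerate(weights)))


def scan_angle(t, ts, theta0, sweep):
    """Assumed sweep law: move linearly from theta0 to theta0 + sweep over one symbol, then reset."""
    return theta0 + sweep * ((t % ts) / ts)


# Four-element array whose beam starts at 0.3 rad and moves by 0.01 rad per symbol.
ts = 1.0
weights_start = steering_weights(4, scan_angle(0.0, ts, 0.3, 0.01))
weights_mid = steering_weights(4, scan_angle(0.5 * ts, ts, 0.3, 0.01))
```

Recomputing the weights at several instants within one symbol period gives the time varying weight vectors; the receiver then sees a slightly different channel at each instant.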

Here, BPS is compared with STBC and STTC schemes with their antennas replaced by directional antenna arrays (without scanning) [9] in order to achieve the directionality (i.e., Spatial Division Multiple Access (SDMA) benefit) available in BPS. The elements of comparison are: 1) probability-of-error (bit-error-rate, BER, and frame-error-rate, FER) performance, 2) network capacity, 3) system complexity (in terms of physical dimension), and 4) bandwidth efficiency.

Figure 1. Space-time block codes system (N = 2)

Table 1. STBC structure for N = 2.

Time            Antenna 0       Antenna 1
t               $s_0$           $s_1$
t + Ts          $-s_1^*$        $s_0^*$

The results confirm that the BPS scheme leads to higher network capacity, better BER/FER performance and lower complexity. However, the relative spectral efficiency of BPS is lower than that of STBC and STTC, e.g., by about 5%. In other words, BPS offers higher quality-of-service and network capacity at a minimal cost in spectrum efficiency. This makes BPS a powerful scheme for future generations of wireless communications with smart antenna arrays.

Section 2 introduces the STBC, STTC and BPS schemes. Section 3 compares their characteristics, and Section 4 presents and compares their capacity and BER/FER performance simulations. Section 5 concludes the paper.

2. Introduction of STBC, STTC and BPS Techniques

Here, we briefly introduce the fundamentals of the three techniques, STBC, STTC and BPS.

2.1. STBC

STBC is a transmit diversity technique capable of creating diversity at the receiver to improve the performance of communications systems. STBC utilizes N transmit antennas separated far apart to ensure independent fades [1,2]. In a given symbol period, N signals are transmitted simultaneously from the N antennas. The signal transmitted from each antenna has a unique structure that allows the signals to be combined and recovered at the receiver. For simplicity of presentation, we only consider STBC with 2 transmit antennas (N = 2) (see Figure 1).

We consider $s_0$ and $s_1$, two consecutive signals generated at two consecutive times $t_0$ and $t_1 = t_0 + T_s$, respectively. In the first symbol period, the signal transmitted from antenna zero is $s_0$ and the one from antenna one is $s_1$. In the next symbol period, the signal transmitted from antenna zero is $-s_1^*$ and the signal transmitted from antenna one is $s_0^*$, where $*$ is the complex conjugate operation (see Table 1). The channel is denoted by $h_0$ for transmit antenna 0 and $h_1$ for transmit antenna 1. The main assumption here is that the fading is constant across two consecutive symbols (i.e., over $t$ and $t_1 = t + T_s$, $t \in [0, T_s]$); we can represent the channel fading for antenna 0 and 1 as:

$$
\begin{aligned}
h_0(t) &= h_0(t + T_s) = h_0 = \alpha_0 e^{j\xi_0} \\
h_1(t) &= h_1(t + T_s) = h_1 = \alpha_1 e^{j\xi_1}
\end{aligned}
\tag{1}
$$

respectively, where $T_s$ is the symbol duration and $\alpha_i$, $\xi_i$, $i \in \{0,1\}$, are the Rayleigh fading gain and phase, respectively. The received signals at time $t$ and $t + T_s$ correspond to

$$
\begin{aligned}
r(t) &= h_0 s_0 + h_1 s_1 + n_t \\
r(t + T_s) &= -h_0 s_1^* + h_1 s_0^* + n_{t+T_s}
\end{aligned}
\tag{2}
$$

respectively. Here, $n_t$ and $n_{t+T_s}$ are complex random variables representing receiver noise and interference at time $t$ and $t + T_s$, respectively.

In the STBC receiver, Maximal Ratio Combining (MRC) leads to an estimation of s0 and s1, corresponding to:

$$
\begin{aligned}
\hat{s}_0 &= h_0^* r_t + h_1 r_{t+T_s}^* \\
\hat{s}_1 &= h_1^* r_t - h_0 r_{t+T_s}^*
\end{aligned}
\tag{3}
$$

respectively (note: rt=r(t)). Substituting (1) and (2) into (3), we obtain

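$$
\begin{aligned}
\hat{s}_0 &= (\alpha_0^2 + \alpha_1^2)\, s_0 + h_0^* n_t + h_1 n_{t+T_s}^* \\
\hat{s}_1 &= (\alpha_0^2 + \alpha_1^2)\, s_1 - h_0 n_{t+T_s}^* + h_1^* n_t
\end{aligned}
\tag{4}
$$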
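The cancellation of the cross terms in (4) can be checked numerically. The following is a minimal sketch in plain Python, assuming the transmission format of Table 1, the received signals of (2) and the combiner of (3); the function name `alamouti_combine` and its argument names are illustrative and do not come from the paper.

```python
def alamouti_combine(s0, s1, h0, h1, n_t=0j, n_t_ts=0j):
    """Transmit (s0, s1) with the N = 2 STBC of Table 1 and combine as in (3).

    h0, h1 are the complex channel gains of antenna 0 and antenna 1, assumed
    constant over the two symbol periods; n_t and n_t_ts are the noise samples.
    """
    s0, s1 = complex(s0), complex(s1)
    # Received samples at t and t + Ts, following (2).
    r_t = h0 * s0 + h1 * s1 + n_t
    r_t_ts = -h0 * s1.conjugate() + h1 * s0.conjugate() + n_t_ts
    # Combining step, following (3).
    s0_hat = h0.conjugate() * r_t + h1 * r_t_ts.conjugate()
    s1_hat = h1.conjugate() * r_t - h0 * r_t_ts.conjugate()
    return s0_hat, s1_hat


if __name__ == "__main__":
    h0, h1 = complex(0.3, -0.8), complex(-1.1, 0.4)
    gain = abs(h0) ** 2 + abs(h1) ** 2
    s0_hat, s1_hat = alamouti_combine(1, -1, h0, h1)
    # Without noise, each estimate is the symbol scaled by |h0|^2 + |h1|^2.
    print(s0_hat / gain, s1_hat / gain)
```

In the noise-free case each estimate equals the transmitted symbol scaled by $\alpha_0^2 + \alpha_1^2$, which is the two-fold diversity gain discussed in the text.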

In other words, a maximum likelihood receiver removes the $s_1$-dependent and $s_0$-dependent terms from $\hat{s}_0$ and $\hat{s}_1$, respectively. This yields a good (low) probability-of-error performance at the receiver.

2.2. STTC Technique

STTC is a transmit diversity technique that combines space diversity and coding gain to improve the performance of communication systems [3,5,8]. STTC utilizes N transmit antennas separated far apart to ensure independent channels. In a given symbol period, N signals are transmitted simultaneously from the N antennas. The signal transmitted from each antenna has a unique structure with inherent error-correction capability that allows the signal to be recovered and corrected at the receiver [8]. In this paper, we only consider the simulation scenario presented in [3], that is, the π/4-QPSK, 4-state, 2 b/s/Hz STTC (hereafter denoted as STTC-QPSK) that utilizes two transmit antennas and one receive antenna.

The trellis structure of STTC-QPSK is shown in Figure 2(a) and the constellation mapping in Figure 2(b). In STTC-QPSK, information symbols are encoded using a channel coder by mapping input symbols to a vector of outputs (codewords) based on a trellis structure (Figure 2(a)). Here, information symbols are encoded based on the current state of the encoder and the current information symbols. Thus, the encoded codewords are correlated in time.

At the left of the trellis structure (Figure 2(a)) are the STTC codewords (s1,s2), s1,s2 ∈ {0,1,2,3}. In Figure 2(a), there are four branches emerging from each trellis state, because there are four possible QPSK symbols, namely 0, 1, 2 and 3. For example, consider a space-time trellis coder that starts at state (q1,q2) = (0,0) (represented by 00). When the information symbol is 10, the coder transitions from state 00 to state 10 and produces the output codeword (s1,s2) = (0,2). When the next information symbol is 11, the coder transitions from state 10 to state 11 and produces the output codeword (2,3). The channel coder continues to change from its current state to a new state based on the incoming information symbols. By design, the channel coder resets to state 0 after completing the coding of a frame (e.g., 130 symbols). The output codewords of the encoder are then mapped onto a π/4-QPSK constellation (Figure 2(b)). The mapping results in two symbols, one of which is transmitted on each antenna simultaneously. Through this encoding scheme, redundancy is introduced into the system but, at the same time, the symbols are transmitted over two antennas. Therefore, coding redundancy does not impact the throughput. In order to achieve SDMA and improve network capacity, each STTC-QPSK antenna element is replaced with one antenna array [9] to generate two static beams directed toward the desired users (Figure 3).
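The two transitions described above can be reproduced with a very small encoder sketch. It assumes that every branch of the trellis follows the same rule as the two transitions quoted in the text (the output codeword is the pair formed by the current state and the current symbol, and the next state equals the current symbol); this generalisation, the function name `sttc_qpsk_encode` and its parameters are assumptions made for illustration and are not taken from Figure 2(a). The constellation mapping and the end-of-frame reset are not modelled.

```python
def sttc_qpsk_encode(symbols, start_state=0):
    """Return the codewords (s1, s2) for a sequence of QPSK symbols in {0, 1, 2, 3}.

    Assumed branch rule: output = (current state, current symbol),
    next state = current symbol.
    """
    state = start_state
    codewords = []
    for sym in symbols:
        if sym not in (0, 1, 2, 3):
            raise ValueError("QPSK symbols must be in {0, 1, 2, 3}")
        codewords.append((state, sym))  # s1 goes to antenna 0, s2 to antenna 1
        state = sym                     # the coder moves to the new state
    return codewords


if __name__ == "__main__":
    # Information symbols 10 (= 2) and then 11 (= 3), starting from state 00.
    print(sttc_qpsk_encode([2, 3]))
```

Under this rule the symbol sent on the first antenna is a one-symbol-delayed copy of the symbol sent on the second antenna, which is why consecutive codewords are correlated in time.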

The channel is denoted by $h_0$ for transmit antenna 0 and $h_1$ for transmit antenna 1. We represent the channel fading for antenna $i$, $i \in \{0,1\}$, as:

$$
h_i(t) = h_i = \alpha_i e^{j\xi_i}
\tag{5}
$$

where $\alpha_i$ and $\xi_i$, $i \in \{0,1\}$, are the Rayleigh fading gain and phase, respectively. The received signal at time $t$ can be modeled as

$$
r(t) = h_0 s_0(t) + h_1 s_1(t) + n(t)
\tag{6}
$$

where $s_i(t)$ is the transmitted symbol and $n(t)$ is the complex random variable representing receiver noise at time $t$. The receiver is designed using the Viterbi algorithm. The branch metric for a transition labeled $q_1(t)\,q_2(t)$ corresponds to [3]

$$
\left| r(t) - \sum_{i=1}^{P} h_{i-1}\, q_i(t) \right|^2
\tag{7}
$$

where P is the number of transmit antennas. The Viterbi algorithm is used to compute the path with the lowest accumulated metric [3].

2.3. BPS

BPS is a new transmit diversity technique that uses an antenna array to support directionality and transmit diversity via carefully controlled time-varying phase shifts applied to each antenna element. This creates a slight motion of the beam pattern directed toward the desired users [10]. Beam pattern movement creates an artificial fast fading environment that leads to time diversity exploitable by the BPS receiver [11]. Beam pattern movement is created by applying a time-varying phase $\theta(t)$ to the elements of the antenna array (see Figure 4).

Figure 2. (a) STTC-QPSK trellis structure, and (b) constellation mapping using Gray code.

Figure 3. STTC far located antenna elements are replaced by antenna arrays to support SDMA.

Figure 4. Antenna array structure.

In BPS, the beam pattern sweeps an area of space within Ts (the symbol duration), returns to its initial position and starts moving again. Properly selecting the phase offset $\theta(t)$ leads to a movement of the antenna beam pattern that ensures: 1) constant large-scale fading over Ts, and 2) the generation of L independent fades within each Ts.

1) Achieving constant large-scale fading: In order to ensure constant large-scale fading over each symbol period Ts, the mobile must remain within the antenna array's HPBW at all times. This corresponds to

$$
T_s \frac{d\phi}{dt} = \kappa\,\Omega, \quad 0 < \kappa < 1
\tag{8}
$$

where $\Omega$ is the HPBW, $\phi$ is the azimuth angle, $d\phi/dt$ is the rate of antenna pattern movement, and $T_s \cdot (d\phi/dt)$ is the amount of antenna pattern movement within Ts. The control parameter $\kappa$, $0 < \kappa < 1$, ensures that the received antenna pattern amplitude remains within the HPBW for the entire symbol duration Ts.

2) Achieving L independent fades within each Ts: Using (8), the phase offset applied to the antenna array is found to be (see [3,6,7]):

$$
\theta(t) = \frac{2\pi d}{\lambda} \cdot \frac{\kappa\,\Omega}{T_s} \left( t - \frac{T_s}{2} \right)
\tag{9}
$$

where $\lambda$ is the wavelength of the carrier and $d$ is the distance between adjacent antenna elements.

The sweeping of the beam pattern creates an artificial fast fading channel with a coherence time that may lead to L independent fades over Ts. This is a direct result of the departure and the arrival of scatterers within the antenna array beam pattern window. Simulation results in [10] and [11], assuming a medium-size city center with $0.0005 < \kappa < 0.05$, reveal that time diversity gains as high as L = 7 are achievable using the BPS scheme.

Assuming BPSK modulation, the transmitted signal can be represented as

$$
s(t) = b_0 \cos(2\pi f_o t)\, g_{T_s}(t)
\tag{10}
$$

where $b_0 \in \{-1, +1\}$ is the transmitted bit, $f_o$ is the carrier frequency, and $g_{T_s}(t)$ is the pulse shape (e.g., a rectangular waveform with unity height over 0 to Ts). The normalized signal received at the mobile receiver input corresponds to:

$$
r(t) = b_0 \frac{1}{M} \sum_{m=0}^{M-1} \alpha_l \cos\big(2\pi f_o t + m\,\zeta(t,\phi) + \xi_l\big) + n_l(t), \quad
t \in \left[ \frac{(l-1)T_s}{L}, \frac{l\,T_s}{L} \right], \; l \in \{1,2,\ldots,L\}
\tag{11}
$$

where $m \in \{0,1,2,\ldots,M-1\}$ indexes the mth antenna array element (Figure 4), $n_l(t)$ is additive white Gaussian noise (AWGN), which is considered independent for different time slots $l$, $\alpha_l$ is the fade amplitude in the lth time slot, and $\xi_l$ is its phase offset (hereafter, this phase offset is assumed to be tracked and removed). Moreover, in (11),

$$
\zeta(t,\phi) = \frac{2\pi d}{\lambda} \cos\phi - \theta(t)
\tag{12}
$$

where $(2\pi d/\lambda)\cos\phi$ is the phase offset caused by the difference in distance between antenna array elements and the mobile (assuming the antenna array is mounted horizontally), and $\theta(t)$ is introduced in Equation (9). Applying the summation over m, Equation (11) corresponds to

$$
r_l(t) = \alpha_l\, b_0\, AF(t,\phi) \cos\left( 2\pi f_o t + \frac{M-1}{2}\, \zeta(t,\phi) \right) + n_l(t)
\tag{13}
$$

Here,

$$
AF(t,\phi) = \frac{1}{M} \cdot \frac{\sin\left( \frac{M}{2}\, \zeta(t,\phi) \right)}{\sin\left( \frac{1}{2}\, \zeta(t,\phi) \right)}
\tag{14}
$$

is the antenna array factor. Assuming the mobile is located at $\phi = \pi/2$, (12) can be approximated by $\zeta(t,\phi) = \zeta(t) = -\theta(t)$. Moreover, assuming that the antenna array's peak is directed towards the intended mobile at time 0, and that the movement of the antenna array pattern over Ts is small, i.e., $\kappa$ in Equation (9) is small, the array factor is well approximated by $AF(t,\phi) \approx 1$.
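The claim that the array factor stays close to one for a small sweep can be checked with a short calculation. This is a sketch under stated assumptions: the array factor has the form $\sin(M\zeta/2) / (M \sin(\zeta/2))$, and the largest inter-element phase offset during a sweep is bounded by $(2\pi d/\lambda)\,\kappa\,\Omega$, where $\kappa$ is the control parameter and $\Omega$ the HPBW in radians. The function names and the numerical values (6 elements, half-wavelength spacing, HPBW of 18 degrees, $\kappa$ = 0.05) are taken from the scenarios quoted elsewhere in the paper or chosen for illustration.

```python
import math


def array_factor(zeta, num_elements):
    """Normalised array factor for an inter-element phase offset zeta (radians)."""
    half = 0.5 * zeta
    if abs(half) < 1e-12:
        return 1.0  # limit at the beam peak, zeta -> 0
    return math.sin(num_elements * half) / (num_elements * math.sin(half))


def max_phase_offset(kappa, hpbw_rad, d_over_lambda):
    """Assumed worst-case phase offset for a sweep of kappa * HPBW near broadside."""
    return 2.0 * math.pi * d_over_lambda * kappa * hpbw_rad


if __name__ == "__main__":
    zeta_max = max_phase_offset(kappa=0.05, hpbw_rad=math.radians(18.0), d_over_lambda=0.5)
    print(zeta_max, array_factor(zeta_max, num_elements=6))
```

For these values the array factor deviates from one by well under 1%, which supports treating it as constant over the symbol duration.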

The time-varying phase of (9) in (12) and (13) leads to a spectrum expansion of the transmitted (and the received) signal. Because the parameter $\kappa$ in (9) is considered small (e.g., $\kappa = 0.05$), this expansion is minimal (see Subsection 3.2). After returning the signal to the baseband, the received signal corresponds to:

$$
r_l = \alpha_l\, b_0 + n_l, \quad l \in \{1,2,\ldots,L\}
\tag{15}
$$
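The benefit of the L time-diversity components can be illustrated with a small Monte Carlo sketch of the baseband model in which each slot delivers a Rayleigh-faded copy of the bit plus Gaussian noise, followed by equal gain combining (a plain sum) of the L slots. Several details are assumptions made only for this illustration and are not the simulation setup of the paper: the fades are drawn independently per slot with unit mean-square amplitude, the bit energy is split evenly across the L slots, and the noise is real-valued. The function name `bps_egc_ber` and its parameters are hypothetical.

```python
import math
import random


def bps_egc_ber(num_fades, snr_db, num_bits=20000, seed=1):
    """Estimate the BPSK bit error rate with EGC over num_fades independent fades."""
    rng = random.Random(seed)
    snr = 10.0 ** (snr_db / 10.0)          # Eb/N0 on a linear scale, with Eb = 1
    amp = math.sqrt(1.0 / num_fades)       # assumption: bit energy split evenly over the slots
    sigma = math.sqrt(1.0 / (2.0 * snr))   # noise standard deviation per slot
    errors = 0
    for _ in range(num_bits):
        b0 = rng.choice((-1, 1))
        decision = 0.0
        for _ in range(num_fades):
            # Rayleigh fade amplitude with E[alpha^2] = 1.
            alpha = math.hypot(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
            # One slot sample (fade times bit plus noise), added to the EGC sum.
            decision += alpha * amp * b0 + rng.gauss(0.0, sigma)
        if (decision >= 0.0) != (b0 > 0):
            errors += 1
    return errors / num_bits


if __name__ == "__main__":
    for num_fades in (1, 2, 7):
        print(num_fades, bps_egc_ber(num_fades, snr_db=10.0))
```

Running the sketch for L = 1, 2 and 7 at the same SNR shows the error rate dropping as the number of independent fades grows, which is the time-diversity effect that BPS exploits.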

3. BPS versus STBC, STTC

STBC, STTC and BPS are compared in terms of physical antenna dimension, complexity, spectrum efficiency, network capacity and BER performance.

3.1. Complexity and Physical Antenna Dimension

The main complexity of the BPS scheme is at the transmitter mounted at the BS, which generates a time-varying beam pattern directed toward the desired user, whereas the complexity of the STBC scheme is mainly due to the number of transmitting antennas, N, at the BS and the combining scheme at the receiver [3].

The complexity of the STTC scheme is mainly due to both the encoder (transmitter) and the decoder (receiver). The encoding process requires a space-time channel coder to encode the information symbols according to a specific trellis structure (e.g., Figure 2(a)). The complexity of decoding with the Viterbi algorithm increases exponentially with the number of states (transmit antennas) of the trellis structure [3].

Here, we consider:

1) Space-Time Coding (STC) techniques (i.e., both STBC and STTC) use two antenna arrays to generate directional beam patterns: a) each antenna array contains six antenna elements (adjacent elements are separated by $\lambda_o/2$), and b) the antenna arrays are separated far enough (e.g., by $5\lambda_o$) to ensure independent fades. Here, $\lambda_o$ is the wavelength of the carrier frequency (or the average wavelength of all carrier frequencies if multi-carrier transmission is used).

2) The BPS technique uses: a) a single 6-element antenna array (elements are separated by $\lambda_o/2$), and b) beam-pattern movement that is assumed to result in up to seven-fold diversity (in general, a function of the parameter $\kappa$) [10].

The antenna dimension of STC schemes is larger than that of BPS, since the STBC scheme uses 2 antenna arrays (in general, any number of antenna arrays). With antenna array elements separated by $\lambda_o/2$, the length of each six-element antenna array is $2.5\lambda_o$. To ensure independent fades, the arrays should be located far enough apart (e.g., $5\lambda_o$). This leads to a total length of $10\lambda_o$ for the STBC antenna arrays, while BPS needs just a single antenna array of length $2.5\lambda_o$. Thus, the physical antenna dimensions of STC techniques are much greater than the antenna array dimensions of the BPS scheme. Moreover, the STTC physical antenna array dimensions (specifically, with each antenna element replaced by an antenna array) increase as the number of antenna arrays increases.
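The length figures quoted above follow from simple arithmetic, sketched below with all lengths expressed in carrier wavelengths. The sketch assumes that the separation between two arrays is measured as the gap between the end of one array and the start of the next, which is the reading that reproduces the total of 10 wavelengths; the function names are illustrative.

```python
def array_length(num_elements, spacing):
    """Length of a uniform linear array, in wavelengths."""
    return (num_elements - 1) * spacing


def stc_total_length(num_arrays, num_elements, spacing, array_gap):
    """Total extent of several arrays placed side by side with a gap between them."""
    return num_arrays * array_length(num_elements, spacing) + (num_arrays - 1) * array_gap


if __name__ == "__main__":
    print(array_length(6, 0.5))              # single BPS array
    print(stc_total_length(2, 6, 0.5, 5.0))  # two STC arrays with a 5-wavelength gap
```

The same function shows how the STC footprint grows when more than two antenna arrays are used, whereas the BPS footprint stays at one array length.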

Antenna array pattern characteristics (e.g., the HPBW) change with frequency [12,13]. Hence, in wideband multi-carrier systems (e.g., multi-carrier code division multiple access, MC-CDMA, or orthogonal frequency division multiplexing, OFDM, systems), each group of sub-carriers might need to be transmitted over its own antenna array in order to create an ideal SDMA; hence, a number of antenna array clusters or antenna array vector clusters are required (see [12,13] for more information). In this case, the complexity and the dimensions of STBC and STTC are much higher than those of the BPS scheme. In general, the dimensions (and, as a result, the complexity) of STC schemes increase as the number of antenna arrays increases. In addition, the complexity of STTC increases as the number of trellis states increases and, as a result, the required number of antenna arrays increases (in order to create higher orders of space diversity and coding gain).

3.2. Spectrum Efficiency and Throughput

The BPS technique creates a bandwidth expansion, as discussed in the previous section, while the STBC scheme with static beam patterns does not generate this expansion. The BPS system bandwidth is expanded by a factor that corresponds to

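$$
f_{exp.} = \frac{(B.W.)_{BPS} - (B.W.)_{without\,BPS}}{(B.W.)_{without\,BPS}} \cdot 100\%
\tag{16}
$$

where $(B.W.)_{BPS}$ is the bandwidth needed with BPS and $(B.W.)_{without\,BPS}$ is the bandwidth needed without BPS. Considering (13) and using (12) and (9), the expansion factor $f_{exp.}$ corresponds to

$$
f_{exp.} = \frac{\kappa\,\Omega\, d\,(M-1)}{2\lambda} \cdot 100\%
\tag{17}
$$

Hence, with constant $T_s$, $d$, $M$ and the other parameters of (17) for both BPS and STBC systems, the relative reduction in bandwidth efficiency due to BPS corresponds to

$$
R = \frac{\eta_{after\,BPS}}{\eta_{before\,BPS}} = \left( 1 - \frac{\kappa\,\Omega\, d\,(M-1)}{2\lambda} \right) \cdot 100\%
\tag{18}
$$

Considering $d = \lambda/2$, typical values of $\Omega$ (e.g., $\Omega = 0.5$ rad.), and $M = 6$, (18) can be approximated by

$$
R \approx (1 - \kappa) \cdot 100\%
\tag{19}
$$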
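A short numerical check may help relate the expansion factor to the approximation in (19). The sketch below assumes that the expansion factor has the form $\kappa\,\Omega\,d\,(M-1)/(2\lambda) \cdot 100\%$, with $\kappa$ the control parameter, $\Omega$ the HPBW in radians, $d$ the element spacing, $\lambda$ the carrier wavelength and $M$ the number of elements; this form is a reading of (17) obtained from the phase term of (13) and should be treated as an assumption. The function names are illustrative.

```python
def expansion_factor_percent(kappa, hpbw_rad, d_over_lambda, num_elements):
    """Assumed bandwidth expansion factor of BPS, in percent."""
    return 100.0 * kappa * hpbw_rad * d_over_lambda * (num_elements - 1) / 2.0


def relative_efficiency_percent(kappa):
    """Approximate relative spectrum efficiency R of (19), in percent."""
    return (1.0 - kappa) * 100.0


if __name__ == "__main__":
    for kappa in (0.005, 0.05):
        f_exp = expansion_factor_percent(kappa, hpbw_rad=0.5, d_over_lambda=0.5, num_elements=6)
        print(kappa, f_exp, relative_efficiency_percent(kappa))
```

Under these assumptions the expansion factor stays below $\kappa \cdot 100\%$ for the quoted parameter values, so (19) acts as a slightly conservative estimate of the efficiency loss.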

With this definition, the relative reduction in BPS spectrum efficiency is determined by the control parameter $\kappa$. For example, considering $\kappa = 0.05$ (an antenna sweep equivalent to 5% of the HPBW), R = 95%. On the other hand, with a constant bandwidth available to BPS, STBC and STTC, the throughput of BPS is less than that of the STC techniques by the factor $f_{exp.}$ (e.g., by less than 5%). This disadvantage is minimal compared with the advantages of BPS discussed in this paper.

3.3. Capacity and Performance

In this paper, we have assumed the same antenna arrays (with the same HPBW and approximately the same dimension and complexity) for both BPS and STC systems. This assumption leads to a higher order of diversity via BPS compared to STC (e.g., up to 7-fold diversity in BPS versus 2-fold diversity in STC), which better mitigates fading effects in the BPS system. Hence, for a constant signal-to-noise power ratio, BPS achieves a better probability-of-error performance and, as a result, a higher network capacity as the number of users increases. The details of the capacity and performance enhancements are presented in the next section via simulations.

4. Simulations

4.1. BER Performance Simulations

Simulations are performed assuming:

a) Mid-size city center (e.g., 3 scatterers per 1000 m²) that leads to 7-fold diversity with the BPS technique;

b) BPSK transmission for the STBC and BPS comparison, and QPSK transmission for the STTC and BPS comparison;

c) One receive antenna;

d) Switched beam smart antenna arrays (with HPBW = 18°) are mounted at the BS;

e) Quasi-static channel, i.e., the channel characteristic is static over 2 consecutive symbol periods, Ts, for STBC and over the entire frame for STTC-QPSK, and then changes in an independent manner; and,

f) The STTC-QPSK frame is equal to 130 symbols.

For simplicity of comparison, and to illustrate the benefits of the time diversity induced by the BPS scheme, Equal Gain Combining (EGC) over time components is assumed. The EGC technique does not rely on channel estimation to perform the combining. The performance simulations for STBC compared to BPS are shown in Figure 5(a). It can be observed that the BPS scheme offers 5 dB and 15 dB improvements in performance at a probability-of-error of 10^-3 compared to the STBC scheme and the traditional BPSK system without diversity, respectively. The performance improvement of the BPS scheme is due to the high order of time diversity achieved through beam pattern movement. The diversity order achievable via STBC is lower than that of BPS and, therefore, its BER performance is worse than that of the BPS scheme.

Figure 5. BER/FER performance comparing (a) STBC versus BPS scheme, and (b) STTC-QPSK versus BPS.

Figure 6. BPS performance for different R values.

The performance simulations for BPS versus STTC-QPSK are shown in Figure 5(b). It is observed that the BPS scheme offers 12 dB and 22 dB improvements in performance at a probability-of-error of 10^-3 compared to the STTC-QPSK scheme with antenna arrays and without beam pattern movement, respectively. The performance improvement via BPS is the result of the high order of time diversity achieved through beam pattern movement. Although STTC-QPSK offers both diversity and coding gain, the diversity order offered by STTC-QPSK is much lower than that of BPS-QPSK; thus, even without the benefit of coding gain, BPS-QPSK surpasses the performance of STTC-QPSK with relatively lower complexity.

In Figure 6, the BER performance of the BPS system is generated for different values of the relative spectrum efficiency, R. Increasing the parameter $\kappa$ leads to a higher order of diversity, which enhances the BER performance of the system; on the other hand, it reduces the relative bandwidth efficiency of BPS. For example, as discussed in [10], in a rich scattering environment, $\kappa = 0.005$ leads to two-fold diversity, which is equivalent to R = 99.5%. Increasing $\kappa$ from 0.005 to 0.05 increases the achievable diversity to 7-fold and reduces the relative spectrum efficiency to R = 95%. This is equivalent to an increase in the throughput loss from 0.5% to 5%.

4.2. Network Capacity Simulations

Network capacity simulations are performed assuming:

a) MC-CDMA transmission with N = 32 carriers;

b) Four-fold frequency diversity over the entire bandwidth;

c) For the STBC-BPS comparison, we consider inter-cell interference effects from the first tier cells (see Figure 7). This interference is reduced via long codes assigned to the signals transmitted to the users of each cell;

d) For the STTC-BPS comparison, inter-cell interference effects are ignored (see Figure 7);

e) Mid-size city center (e.g., 3 scatterers per 1000 m²) that leads to 7-fold diversity with the BPS technique;

f) Users are distributed uniformly in the cell;

g) Inter-user interference within the cell is reduced via random assignment of Hadamard-Walsh codes (in MC-CDMA systems);

h) Equal Gain Combining (EGC) over frequency components;

i) Switched beam smart antenna arrays (with HPBW = 18°) are mounted at the BS; and,

j) The signal power to noise power ratio is SNR = 10 dB for STBC and SNR = 12 dB for STTC.

Figure 7. Interfering cells assuming one-tier cellular network. The direction of beam patterns that will interfere with intended mobile is represented.

With these assumptions, the received BPS/MC-CDMA signal corresponds to [12]:

$$
r_l(t) = \sum_{c=0}^{6} \sum_{k=0}^{K_c-1} \sum_{n=0}^{N-1} \frac{\alpha^n_{c,l}}{(R_c)^a}\, b_{c,k}\, \beta^n_{c,k}\, \gamma^n_c\, AF(t,\phi_c) \cos\left( 2\pi (f_o + n\Delta f)\, t + \frac{M-1}{2}\, \zeta(t,\phi_c) + \xi^n_{c,l} \right) + n_l(t)
\tag{20}
$$

Here, $AF(t,\phi_c)$ is the array factor introduced in (14), $n_l(t)$ is additive white Gaussian noise (AWGN), which is considered independent for different time slots $l$, $b_{c,k} \in \{+1,-1\}$ is the transmitted bit of the kth user in the cth cell, $\beta^n_{c,k}$ is the Hadamard-Walsh spreading code for the kth user and nth sub-carrier in the cth cell, $\gamma^n_c$ is the long code of the nth sub-carrier for the cth cell, $\alpha^n_{c,l}$ is the Rayleigh fade amplitude on the nth sub-carrier in the lth time slot in the cth cell, and $\xi^n_{c,l}$ is its phase (which is assumed to be tracked and removed). $\alpha^n_{c,l}$ is assumed independent over time components, $l$, and correlated over frequency components, $n$ [14]. $K_c$ represents the number of users that effectively interfere with the desired user.

In the neighboring cells, these users are located in the antenna pattern (sector) with the directions shown in Figure 7. Considering assumptions (f) and (i),

$$
E(K_c) = \frac{HPBW}{2\pi} \cdot K
\tag{21}
$$

where $E(\cdot)$ denotes the expectation and $K$ is the number of users available in each cell. In (20), $1/(R_c)^a$ represents the long-term path loss of the signal received by the mobile (MS) in cell 0. This signal is transmitted by the BS of a neighboring cell to the users located in that cell, in the directions which interfere with the intended mobile (see Figure 7). In Figure 7, $D$ is the cell radius. Assuming the intended mobile is located at $D/\sqrt{2}$, and approximating the coverage area by a triangle, $D/\sqrt{2}$ represents the approximate center of mass of the users in the beam pattern coverage area. $R_c$ represents the distance between the BS of cell $c$, $c \in \{0,1,2,\ldots,6\}$, and the intended mobile in cell 0. From the geometry in Figure 7, the vector R formed by the elements $R_c$, $c \in \{0,1,2,\ldots,6\}$, corresponds to [12]

R = [1.00 3.83 3.44 1.975 1.83 1.975 3.44] (22)

where $R_0$ is normalized to one and the others are normalized with respect to this value.

In (20), the power factor a is a function of user location, BS antenna height and environment. Considering urban areas, the parameter a changes with the carrier frequency and BS antenna height. In urban areas, a = 1 if Rc < Dmax, and a = 2 if Rc > Dmax, where Dmax = D(fo,ha) (Dmax is a function of the carrier frequency fo and antenna height ha). Considering fo = 900 MHz and BS height ha > 25 m, Dmax ≈ 1000 m (see [15]). Assuming a cell of radius D ≈ 500 m, and by referring to [15], we find that a = 2 for cells 1, 2 and 6, whereas a = 1 for cells 3, 4 and 5. Thus, in the simulations we ignore the interference from cells 1, 2 and 6 and only consider inter-cell interference from cells 3, 4 and 5, with little loss in accuracy.
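The choice of interfering cells can be retraced with a few lines of arithmetic. The sketch below assumes that the intended mobile sits at a distance of D divided by the square root of 2 from its own BS, so that the normalised distances of (22) convert to metres by multiplying with that value, and that the expected number of interfering users per neighbouring cell is the fraction HPBW/360° of the K users in the cell. The function names, the normalised weight 1/(Rc)^a and the example of 100 users per cell are illustrative assumptions.

```python
import math

# Normalised distances R_c of (22) for cells 0..6.
R_NORMALISED = [1.00, 3.83, 3.44, 1.975, 1.83, 1.975, 3.44]


def path_loss_exponents(cell_radius_m, d_max_m):
    """Exponent a for each cell: 1 if the BS is closer than d_max_m, otherwise 2."""
    r0_m = cell_radius_m / math.sqrt(2.0)  # assumed distance of the intended mobile from its BS
    return [1 if rc * r0_m < d_max_m else 2 for rc in R_NORMALISED]


def amplitude_weights(cell_radius_m, d_max_m):
    """Normalised long-term path-loss factor 1 / (R_c)^a for each cell."""
    exponents = path_loss_exponents(cell_radius_m, d_max_m)
    return [1.0 / rc ** a for rc, a in zip(R_NORMALISED, exponents)]


def expected_interferers(users_per_cell, hpbw_deg):
    """Expected number of users inside one beam for uniformly distributed users."""
    return users_per_cell * hpbw_deg / 360.0


if __name__ == "__main__":
    print(path_loss_exponents(500.0, 1000.0))
    print([round(w, 3) for w in amplitude_weights(500.0, 1000.0)])
    print(expected_interferers(100, 18.0))
```

With D = 500 m and Dmax = 1000 m, the exponents come out as a = 2 for cells 1, 2 and 6 and a = 1 for cells 3, 4 and 5, and the resulting weights of the first group are several times smaller than those of the second, which is consistent with neglecting them.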

With the model introduced in (20), the received STBC/MC-CDMA signal corresponds to

$$
\begin{aligned}
r_0(t) &= \sum_{c=0}^{6} \sum_{k=0}^{K_c-1} \sum_{n=0}^{N-1} \frac{\beta^n_{c,k}\, \gamma^n_c}{(R_c)^a} \Big[ \alpha^n_{c,0}\, b_{c,k}[i] \cos\big(2\pi (f_o + n\Delta f)\, t + \xi^n_{c,0}\big) + \alpha^n_{c,1}\, b_{c,k}[i+1] \cos\big(2\pi (f_o + n\Delta f)\, t + \xi^n_{c,1}\big) \Big] + n_0(t) \\
r_1(t) &= \sum_{c=0}^{6} \sum_{k=0}^{K_c-1} \sum_{n=0}^{N-1} \frac{\beta^n_{c,k}\, \gamma^n_c}{(R_c)^a} \Big[ -\alpha^n_{c,0}\, b_{c,k}[i+1] \cos\big(2\pi (f_o + n\Delta f)\, t + \xi^n_{c,0}\big) + \alpha^n_{c,1}\, b_{c,k}[i] \cos\big(2\pi (f_o + n\Delta f)\, t + \xi^n_{c,1}\big) \Big] + n_1(t)
\end{aligned}
\tag{23}
$$

where $b_{c,k}[i]$ and $b_{c,k}[i+1]$, $i \in \{0,2,4,\ldots\}$, are the ith and (i+1)th information bits of the kth user in the cth cell for STBC, $\alpha^n_{c,0}$ and $\alpha^n_{c,1}$ are the Rayleigh fade amplitudes due to antenna 0 and antenna 1 in the nth sub-carrier in the cth cell, and $\xi^n_{c,0}$ and $\xi^n_{c,1}$ are their phases, respectively. $\beta^n_{c,k}$ is the Hadamard-Walsh spreading code for the kth user and nth sub-carrier, $\gamma^n_c$ is the long code of the nth sub-carrier in the cth cell, $1/(R_c)^a$ characterizes the long-term path loss, and $n_0(t)$ and $n_1(t)$ are additive white Gaussian noise (AWGN).

Figure 8. Capacity performance (a) STBC and BPS, and (b) STTC and BPS.

Figure 8(a) represents network capacity simulation results generated considering MRC across time components in BPS and across space components in STBC (see [3] and [4]), and EGC across frequency components in both BPS and STBC. It is observed that a higher network capacity is achievable with BPS/MC-CDMA. For example, at a probability-of-error of 10^-2, BPS/MC-CDMA offers up to two-fold higher capacity. It is also observed that STBC/MC-CDMA offers better performance than traditional MC-CDMA without diversity when the number of users in the cell is less than 80. However, as the number of users in the cell increases beyond 80, the performance of STBC/MC-CDMA becomes even worse than that of traditional MC-CDMA (i.e., MC-CDMA with an antenna array but without diversity benefits). This is because the STBC scheme discussed in this paper (see [1]) is designed to utilize MRC. It has been shown that MRC is the optimal combining scheme when there is only one user available, while in a multiple access environment MRC enhances the Multiple Access Interference (MAI) and therefore degrades the performance of the system [16].

Considering STTC-QPSK, with assumption (d), the STTC-QPSK/MC-CDMA received signal corresponds to

$$
r(t) = \sum_{k=0}^{K-1} \sum_{n=0}^{N-1} \beta^n_k \Big[ \alpha^n_0\, s_{0,k} \cos\big(2\pi (f_o + n\Delta f)\, t + \xi^n_0\big) + \alpha^n_1\, s_{1,k} \cos\big(2\pi (f_o + n\Delta f)\, t + \xi^n_1\big) \Big] + n(t)
\tag{24}
$$

Here, $s_{0,k}$ and $s_{1,k}$ are the kth user's information symbols transmitted from antenna 0 and antenna 1, respectively, $\alpha^n_0$ and $\alpha^n_1$ are the Rayleigh fade amplitudes due to antenna 0 and antenna 1 in the nth sub-carrier, and $\xi^n_0$ and $\xi^n_1$ are their phases, respectively. $\beta^n_k$ is the Hadamard-Walsh spreading code for the kth user and nth sub-carrier, and $n(t)$ is additive white Gaussian noise (AWGN).

Network capacity simulations for STTC-QPSK are generated assuming EGC across time components (in BPS), space components (in STTC-QPSK) and frequency components (for both BPS and STTC-QPSK). Figure 8(b) presents the STTC versus BPS-QPSK simulation results. This figure shows that BPS-QPSK is superior to both STTC-QPSK and QPSK without diversity. In this simulation, BPS-QPSK leads to significantly better capacity due to the time diversity induced by beam-pattern movement and the frequency diversity inherent in MC-CDMA. The results also show that QPSK performance is superior to that of STTC-QPSK. This agrees with the FER simulation results in Figure 5(b), where QPSK is better than STTC-QPSK at low SNRs (e.g., at SNR = 10 dB). This is because STTC-QPSK is designed under the assumption of sufficiently high SNR values; thus, it is less efficient than QPSK at low SNRs [17]. (The capacity curve for higher SNR values may lead to better STTC-QPSK performance compared to QPSK; however, STTC-QPSK shows lower performance than BPS-QPSK for all SNRs.) Thus, it is observed that a higher network capacity is achievable via BPS/MC-CDMA. It is also worth mentioning that STTC-QPSK performance can be significantly improved via interference suppression/cancellation techniques at the cost of system complexity, as discussed in [19–21]. In this paper, we conducted the comparison without adding the complexity of interference suppression algorithms to the STTC scheme.

Simulations confirm that BPS offers superior network capacity compared to STC schemes; however, there are two issues associated with the BPS scheme: 1) the diversity achievable via BPS changes with distance: the greater the distance of the mobile from the BS, the higher the diversity and network capacity [10]. It is notable that, in general, the average number of users located in constant-width annuluses (with the BS at the center) increases as the distance from the BS increases; and 2) BPS works only in urban areas (or in rich scattering environments); but, because a high network capacity is only required in urban areas, this is not a critical issue. Moreover, BPS can also be merged with STC techniques, e.g., via the structure shown in Figure 4. In this case, the traditional antenna arrays are replaced with time-varying weight vector antenna arrays to direct and move the antenna pattern. Another approach for merging BPS with STBC is introduced in [18].

Nevertheless, it is worth mentioning that the BPS scheme achieves its probability-of-error performance and network capacity benefits with relatively low complexity. This makes BPS a prominent scheme for future wireless generations with smart antennas. However, the spectrum efficiency of BPS is about 5% less than that of STC, which is a minimal disadvantage compared to the benefits created by the BPS technique.

5. Conclusions

A comparison was performed between the STBC, STTC-QPSK and BPS transmit diversity techniques in terms of network capacity, BER/FER performance, spectrum efficiency, complexity and antenna dimensions. BER performance and network capacity simulations were generated for the BPS, STBC and STTC schemes. This comparison shows that the BPS transmit diversity scheme is much superior to both the STBC and STTC-QPSK schemes: a) the BS physical antenna dimensions of BPS are much smaller than those of STC techniques, and b) the BER/FER performance and network capacity of BPS are much higher than those of STC schemes. The complexity of the BPS system is minimal because the complexity is mainly located at the BS, and the receiver complexity is low because all the diversity components enter the receiver serially in time. In terms of spectrum efficiency, both STC schemes outperform the BPS scheme by a very small percentage (e.g., on the order of 5%). The BPS scheme introduces a small bandwidth expansion due to the movement of the beam pattern, which eventually results in a lower throughput per unit bandwidth.

6. References

[1] S. M. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE Journal on Selected Areas in Communications, Vol. 16, No. 8, pp. 1451–1458, 1998.

[2] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from orthogonal designs,” IEEE Transactions on Information Theory, Vol. 45, No. 5, pp. 1456–1467, July 1999.

[3] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: Performance criterion and code construction,” IEEE Transactions on Information Theory, Vol. 44, pp. 744–765, March 1998.

[4] V. Tarokh, A. F. Naguib, N. Seshadri, and A. Calderbank, “Space-time codes for high data rate wireless communications: Performance criteria in the presence of channel estimation errors, mobility, and multiple paths,” IEEE Transactions on Communications, Vol. 47, No. 2, February 1999.

[5] A. F. Naguib, V. Tarokh, N. Seshadri, and A. R. Calderbank, “A space-time coding modem for high-data-rate wireless communications,” IEEE Journal on Selected Areas in Communications, Vol. 16, No. 8, October 1998.

[6] N. Seshadri and J. H. Winters, “Two signaling schemes for improving the error performance of frequency division-duplex transmission system using transmitter antenna diversity,” International Journal Wireless Information Networks, Vol. 1, No. 1, pp. 49–60, January 1994.

[7] J. H. Winters, “The diversity gain of transmit diversity in wireless systems with Rayleigh fading,” in Proceedings of the 1994 ICC/SUPERCOMM, New Orleans, Vol. 2, pp. 1121–1125, May 1994.

[8] R. W. Heath, S. Sandhu, and A. J. Paulraj, “Space-time block coding versus space-time trellis codes,” Proceedings of IEEE International Conference on Communications, Helsinki, Finland, June 11–14, 2001.

[9] V. Tarokh, A. Naguib, N. Seshadri, and A. R. Calderbank, “Combined array processing and space-time coding,” IEEE Transactions on Information Theory, Vol. 45, No. 4, pp. 1121–1128, May 1999.

[10] S. A. Zekavat and C. R. Nassar, “Antenna arrays with oscillating beam patterns: Characterization of transmit diversity using semi-elliptic coverage geometric-based stochastic channel modeling,” IEEE Transactions on Communications, Vol. 50, No. 10, pp. 1549–1556, October 2002.

[11] S. A. Zekavat, C. R. Nassar, and S. Shattil, “Oscillating beam adaptive antennas and multi-carrier systems: Achieving transmit diversity, frequency diversity and directionality,” IEEE Transactions on Vehicular Technology, Vol. 51, No. 5, pp. 1030–1039, September 2002.

[12] S. A. Zekavat and C. R. Nassar, “Achieving high capacity wireless by merging multi-carrier CDMA systems and oscillating-beam smart antenna arrays,” IEEE Transactions on Vehicular Technology, Vol. 52, No. 4, pp. 772–778, July 2003.

[13] P. K. Teh and S. A. Zekavat, “A merger of OFDM and antenna array beam pattern scanning (BPS): Achieving directionality and transmit diversity,” accepted in IEEE 37th Asilomar Conference on Signals, Systems and Computers, November 9–12, 2003.

[14] J. W. C. Jakes, Microwave Mobile Communications, New York, Wiley, 1974.

[15] A. J. Rustako, N. Amitay, G. J. Owens, and R. S. Roman, “Radio propagation at microwave frequencies for line-of-sight microcellular mobile and personal communications,” IEEE Transactions on Vehicular Technology, Vol. 40, No. 1, pp. 203–210, February 1991.

[16] J. M. Auffray and J. F. Helard, “Performance of multicarrier CDMA technique combined with space-time block coding over Rayleigh channel,” IEEE 7th International Symposium on Spread-Spectrum Technology, Vol. 2, pp. 348–352, September 2–5, 2002.

[17] A. G. Amat, M. Navarro, and A. Tarable, “Space-time trellis codes for correlated channels,” IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December 14–17, 2003.

[18] S. A. Zekavat and P. K. Teh, “Beam-pattern-scanning dynamic-time block coding,” Proceedings of Wireless Networking Symposium, The University of Texas at

Austin, October 22–24, 2003.

[19] B. Lu and X. D. Wang, “Iterative receivers for multiuser space-time coding systems,” IEEE Journal on Selected Areas in Communications, Vol. 18. No. 11, pp. 2322– 2335, November 2000.

[20] E. Biglieri, A. Nordio, and G. Taricco, “Suboptimum receiver interfaces and space-time codes,” IEEE Transac-tions on Signal Processing, Vol. 51, No. 11, pp. 2720– 2728, November 2003.

[21] H. B. Li and J. Li, “Differential and coherent decorrelat-ing multiuser receivers for space-time-coded CDMA sys-tems,” IEEE Transactions on Signal Processing, Vol. 50, No. 10, pp. 2529–2537, October 2002.



Rain Attenuation Impact on Performance of Satellite Ground Stations for Low Earth Orbiting (LEO) Satellites in Europe

Shkelzen CAKAJ Post and Telecommunication of Kosovo (PTK), Dardania, Prishtina, Kosovo

Email: [email protected], [email protected] Received May 10, 2009; revised July 2, 2009; accepted August 28, 2009

ABSTRACT

Low Earth Orbit (LEO) satellites are used for public communication and for scientific purposes. These satellites provide opportunities for investigations for which alternative techniques are either difficult or impossible to apply. Ground stations have to be established in order to communicate with such satellites. Usually these satellites communicate with ground stations at S-band. The communication quality depends on the performance of the satellite ground station, in addition to that of the satellite. The performance of a satellite ground station is expressed through its Figure of Merit. The aim of this paper is to analyze the impact of rain attenuation on the performance of the ground station. Rain attenuation depends on the geographical location where the satellite ground station is installed. In order to compare this effect on ground station performance, several European cities are considered. Finally, the rain attenuation impact on the Figure of Merit of a hypothetical satellite ground station installed in Prishtina is analyzed.

Keywords: LEO, Satellite, Ground Station, Rain Attenuation, Performance.

1. Introduction

A typical satellite communication system comprises a ground segment, a space segment and a control segment. The function of the ground segment is to receive information from, or transmit it to, the satellite in the most reliable manner while retaining the desired signal quality. A satellite ground station generally consists of an antenna subsystem with an associated tracking system, transmitting and receiving equipment, a monitoring system and a power supply, as presented in Figure 1. The separation of transmission and reception is achieved by means of a duplexer [1].

The goal of this paper is to analyze the impact of rain attenuation on the performance of the ground station (downlink performance), and hence on the quality of the signal received by the end user, as presented in Figure 1.

2. Downlink Performance

For satellite communication systems, the downlink performance (receiving system) is commonly defined through the Receiving System Figure of Merit, $G/T_S$, where:

$$T_S = T_A + T_{comp} \qquad (1)$$

$G$ is the receiving antenna gain, $T_S$ is the receiving system noise temperature, $T_A$ is the antenna noise temperature and $T_{comp}$ is the composite noise temperature of the receiving system, including lines and equipment [1]. The satellite ground station receiving system and its environment are presented in Figure 2. In Figure 2, $T_C$ represents the sky noise temperature, $T_m$ is the medium temperature and $A$ is the medium attenuation [2]. Undesired noise is in part injected via the antenna ($kT_AB$) and in part generated internally ($kT_{comp}B$) by line loss and equipment.

Figure 1. The satellite ground station architecture.

2.1. Antenna Noise Temperature

The antenna temperature depends on where the antenna is looking. Let us consider the antenna itself to be lossless. Under the assumption that the antenna sees the sky without medium attenuation, the solid angle subtended by the noise source (sky) is much larger than the antenna beam angle (Figure 2), so the antenna noise temperature $T_A$ is equal to the sky noise temperature $T_C$. This is the best propagation case, where $T_A = T_C$ [2]. When an atmospheric absorptive process takes place, the absorption increases the noise temperature. If the total cosmic temperature is $T_C$, the absorptive medium temperature is $T_m$ and the attenuation due to the absorptive process is $A$ (in dB), then the antenna noise temperature $T_A$ is [2]:

$$T_A = T_m \left(1 - 10^{-A/10}\right) + T_C \, 10^{-A/10} \qquad (2)$$
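As a rough numerical illustration of Equation (2), the following Python sketch evaluates the antenna noise temperature for a given attenuation. The function name and the default values ($T_m$ = 290 K and $T_C$ = 10 K, the upper ends of the ranges the paper quotes for rain) are illustrative choices, not part of the original analysis.

```python
def antenna_noise_temperature(attenuation_db, t_medium_k=290.0, t_sky_k=10.0):
    """Antenna noise temperature T_A [K] according to Equation (2).

    attenuation_db : attenuation A of the absorptive medium [dB]
    t_medium_k     : temperature T_m of the absorptive medium [K] (assumed default)
    t_sky_k        : sky (cosmic) noise temperature T_C [K] (assumed default)
    """
    # Fraction of power that passes through the medium: 10^(-A/10)
    transmittance = 10.0 ** (-attenuation_db / 10.0)
    return t_medium_k * (1.0 - transmittance) + t_sky_k * transmittance


if __name__ == "__main__":
    for a_db in (0.0, 0.5, 1.0, 3.0):
        t_a = antenna_noise_temperature(a_db)
        print(f"A = {a_db:.1f} dB -> T_A = {t_a:.1f} K")
```

With zero attenuation the function returns the sky temperature, which corresponds to the best propagation case $T_A = T_C$; as the attenuation grows, $T_A$ approaches the medium temperature.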

$T_C$ ranges from 3 K to 10 K and $T_m$ from 275 K to 290 K for rain [2]. Rain produces by far the highest attenuation because of the maximal humidity, and represents the worst case in propagation.

2.2. Rain Attenuation

Rain attenuation depends on the number of raindrops along the path, the size of the drops and the rain path length [2,3]. Considering the density and size of the drops constant along the distance $r$, the power $P_r(0)$ entering the rainy area diminishes exponentially to the power $P_r(r)$ after passing through a rainy area of length $r$, expressed as [2]:

$$P_r(r) = P_r(0) \, e^{-\alpha r} \qquad (3)$$

where $\alpha$ is the reciprocal of the distance required for the power to drop by a factor $e^{-1}$ [2]. Expressing this on a logarithmic scale as a propagation loss $L$ in dB gives:

$$L = 10 \log_{10} \frac{P_r(0)}{P_r(r)} = 4.3 \, \alpha r \qquad (4)$$

Figure 2. The ground station and environment.

or, as usually expressed as the specific rain attenuation $\gamma$ in decibels per kilometer:

$$\gamma = L / r = 4.3 \, \alpha \qquad (5)$$

Based on empirical models, $\gamma$ (the specific rain attenuation) depends only on $R$, the rainfall rate measured on the ground in millimeters per hour [2]. From these empirical models, the usual form of expressing $\gamma$ is:

$$\gamma = a R^{b} \qquad (6)$$

where $a$ and $b$ are constants which depend on frequency, polarization and average rain temperature. Table 1 shows the values of $a$ and $b$ at various frequencies at 20 °C for both polarizations [ITU 838, ITU-R P.838-3].

The reference rainfall rate depends on geographical location. For most of Europe it is 30 mm/h up to 50 mm/h.

Rain attenuation $A_R$ (dB) depends on the specific rain attenuation $\gamma$ and the total rain slant path length $l_r$. The rain attenuation path length geometry is presented in Figure 3 [2]. All heights are referenced to mean sea level. The effective rain height $h_r$ is the same as the height of the melting layer, where the temperature is 0 °C [2]. Here $\varepsilon_0$ stands for the elevation angle.

The values for effective rain height vary according to the latitude φ of the ground station [ITU, 618].

Table 1. Parameters of rain attenuation model.

f [GHz]   a_h         b_h      a_v         b_v
2         0.0000847   1.0664   0.0000998   0.9490
2.5       0.0001321   1.1209   0.0001464   1.0085
3         0.0001390   1.2322   0.0001942   1.0688
3.5       0.0001155   1.4189   0.0002346   1.1387
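To make the use of Table 1 concrete, the short Python sketch below evaluates the power-law model of Equation (6) with the tabulated coefficients. The dictionary layout, the function name and the `"h"`/`"v"` polarization flags are illustrative assumptions; only the coefficient values come from Table 1.

```python
# Coefficients copied from Table 1: frequency [GHz] -> (a_h, b_h, a_v, b_v)
TABLE_1 = {
    2.0: (0.0000847, 1.0664, 0.0000998, 0.9490),
    2.5: (0.0001321, 1.1209, 0.0001464, 1.0085),
    3.0: (0.0001390, 1.2322, 0.0001942, 1.0688),
    3.5: (0.0001155, 1.4189, 0.0002346, 1.1387),
}


def specific_attenuation(freq_ghz, rain_rate_mm_h, polarization="h"):
    """Specific rain attenuation gamma = a * R**b in dB/km (Equation 6)."""
    a_h, b_h, a_v, b_v = TABLE_1[freq_ghz]
    if polarization == "h":
        a, b = a_h, b_h
    elif polarization == "v":
        a, b = a_v, b_v
    else:
        raise ValueError("polarization must be 'h' or 'v'")
    return a * rain_rate_mm_h ** b


if __name__ == "__main__":
    for f in sorted(TABLE_1):
        gamma = specific_attenuation(f, 50.0, "h")
        print(f"f = {f} GHz, R = 50 mm/h -> gamma = {gamma:.5f} dB/km")
```

The lookup only accepts the four tabulated frequencies; interpolating between them would be a further assumption and is deliberately left out.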


Figure 3. Rain attenuation path geometry.

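Since Europe belongs to the Northern Hemisphere, these values, expressed in km, are given by [2]:

$$h_r = 5 - 0.075\,(\varphi - 23), \quad \text{for } \varphi > 23^\circ \qquad (7)$$

$$h_r = 5, \quad \text{for } 0^\circ \le \varphi \le 23^\circ \qquad (8)$$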

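The rain path length from Figure 3 can be expressed as:

$$l_r = \frac{h_r - h_s}{\sin \varepsilon_0} \qquad (9)$$

where $h_s$ is the altitude of the ground station. Then, the rain attenuation $A_R$ (dB) for rain path length $l_r$ is:

$$A_R = \gamma \, l_r = a R^{b} \, l_r = a R^{b} \, \frac{\Delta h}{\sin \varepsilon_0} \qquad (10)$$

where $\Delta h = h_r - h_s$.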

For paths where the elevation angle is $\varepsilon_0 < 5^\circ$, it is necessary to account for the variation of the rain in the horizontal direction. This effect is treated by applying a reduction factor $s$, and the attenuation is then given by:

$$A_R = a R^{b} \, s \, l_r \qquad (11)$$

According to [ITU, 618], this reduction factor of the rainy path length is empirically defined as [2,4]:

$$s = \frac{1}{1 + \dfrac{l_r \sin \varepsilon_0}{35 \, e^{-0.015 R}}} \qquad (12)$$

where $e \approx 2.718$ is Napier's constant (the base of the natural logarithm). This equation is valid for the rainfall rate that is exceeded for no more than 0.01% of the time in an average year (around 53 min) [2].

It is evident from Figure 3 that the best propagation case occurs at an elevation angle of 90º, because the rain path length is then the shortest. The lock between a satellite and a ground station is established and lost at elevation angles of at least 2º because of natural barriers. This represents the longest path and consequently the worst propagation case.
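The chain of Equations (6) to (12) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the reduction factor is applied only below 5º elevation as described in the text, and the coefficients `a` and `b` must be supplied by the caller (for example from Table 1). The sketch is not claimed to reproduce the paper's tabulated results, since the polarization and exact inputs used there are not stated.

```python
import math


def rain_height_km(latitude_deg):
    """Effective rain height h_r [km] for the Northern Hemisphere (Eqs. 7 and 8)."""
    if latitude_deg > 23.0:
        return 5.0 - 0.075 * (latitude_deg - 23.0)
    return 5.0


def rain_attenuation_db(a, b, rain_rate_mm_h, latitude_deg,
                        station_alt_km, elevation_deg):
    """Rain attenuation A_R [dB] along the slant path (Eqs. 6 and 9 to 12)."""
    gamma = a * rain_rate_mm_h ** b                      # Eq. (6), dB/km
    delta_h = rain_height_km(latitude_deg) - station_alt_km
    sin_el = math.sin(math.radians(elevation_deg))
    path_km = delta_h / sin_el                           # Eq. (9)
    if elevation_deg < 5.0:
        # Eq. (12): reduction factor for low elevation angles
        s = 1.0 / (1.0 + path_km * sin_el
                   / (35.0 * math.exp(-0.015 * rain_rate_mm_h)))
    else:
        s = 1.0
    return gamma * s * path_km                           # Eqs. (10) and (11)


if __name__ == "__main__":
    # Illustrative inputs: 2 GHz horizontal coefficients, R = 50 mm/h,
    # a station at 48.2 deg latitude and 0.170 km altitude.
    for elevation in (2.0, 10.0, 45.0, 90.0):
        att = rain_attenuation_db(0.0000847, 1.0664, 50.0, 48.2, 0.170, elevation)
        print(f"elevation = {elevation:4.1f} deg -> A_R = {att:.4f} dB")
```

Running the loop shows the trend described in the text: the attenuation is largest at 2º elevation and smallest at 90º.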

3. Rain Attenuation in Europe

Rain attenuation depends on geographical location [4]. Several European cities are chosen as hypothetical sites for a satellite ground station.

The latitude and altitude of these cities, obtained from http://earth.google.com/, are presented in Table 2. Data from Table 2 are used for the rain path length calculations [4,5].

Table 2. Altitude and latitude of European cities.

Location    Altitude (hs) [m]   Latitude [°]   hr [km]   Δh = hr − hs [km]
Madrid      588                 40.4           3.695     3.107
Tirana      104                 41.3           3.625     3.521
Roma        14                  41.9           3.582     3.568
Prishtina   652                 42.6           3.525     2.873
Zagreb      130                 45.8           3.290     3.160
Vienna      170                 48.2           3.110     2.940
Paris       34                  48.8           3.060     3.026
Brussels    76                  50.8           2.915     2.839
London      14                  51.5           2.862     2.848
Berlin      34                  52.5           2.786     2.752

The rain path length at an elevation angle of 2º is the longest; combined with the highest rainfall rate for Europe of 50 mm/h, it represents the worst propagation case from the rain attenuation point of view [4,5].

Considering the data for Vienna, the rain attenuation A [dB] for different elevation angles is presented in Figure 4 [4–6]. For elevation angles below 5º the reduction factor s is applied.

Figure 4. Rain attenuation for different elevation.

From Figure 4 it is evident that the rain attenuation is the smallest when the antenna is pointed to the sky at an elevation angle of 90º. For an elevation angle of $\varepsilon_0$ = 2º the rain attenuation is the highest, and represents the worst case for the link budget calculation. The results for the worst propagation case (R = 50 mm/h and $\varepsilon_0$ = 2º), based on Equations 6 to 12 for different frequencies, are presented in Table 3.

Table 3. Rain attenuation at the worst propagation case.

            Attenuation [dB]
Location    2 GHz     2.5 GHz   3 GHz     3.5 GHz
Madrid      0.4217    0.8141    1.3239    2.2337
Tirana      0.4694    0.9061    1.4735    2.5417
Roma        0.4747    0.9163    1.4902    2.5704
Prishtina   0.3941    0.7606    1.2370    2.1337
Zagreb      0.4379    0.8260    1.3434    2.3172
Vienna      0.4180    0.7800    1.2535    2.1622
Paris       0.4122    0.7957    1.2941    2.2321
Brussels    0.3960    0.7582    1.2242    2.1117
London      0.3911    0.7550    1.2278    2.1179
Berlin      0.3795    0.7326    1.1914    2.0551

From Table 3, the rain attenuation for frequencies f = 2.5 GHz and f = 3.5 GHz at R = 50 mm/h is presented in Figure 5 and Figure 6.

Figure 5. Rain attenuation (f = 2.5 GHz, R = 50 mm/h).

Figure 6. Rain attenuation (f = 3.5 GHz, R = 50 mm/h).

Figure 5 and Figure 6 show that for Central Europe in the range of (2–4) GHz the rain attenuation varies from 0.5 dB to 3 dB. This depends on the operational frequency and location. From Table 3 and Figure 5, it is evident that a rain attenuation of 1 dB is sufficient to be considered within link budget calculations at S-band. Rain attenuation for frequencies less than 2 GHz need not be considered.

4. Rain Attenuation Impact on Performance

The downlink performance is expressed through the Figure of Merit $G/T_S$. For the $G/T_S$ calculation, the system noise temperature $T_S$ must be calculated, that is, the antenna noise temperature $T_A$ and the composite noise temperature $T_{comp}$ (Equation 1). Let us first consider the antenna temperature from the rain attenuation perspective. Using Equation 2, for the already calculated rain attenuation A and known $T_C$ and $T_m$, the antenna noise temperature $T_A$, which directly impacts the downlink performance, can easily be calculated [5]. The range of these parameters is presented in Table 4.

Table 4. Range of antenna temperature parameters.

Rain medium temperature (Tm)   (275–290) K
Sky temperature (TC)           (3–10) K
Rain attenuation (AR)          (1–3) dB

Thus, the antenna temperatures for the worst rain case ($\varepsilon_0$ = 2º, R = 50 mm/h) and the highest $T_m$ = 290 K and $T_C$ = 10 K are presented in Table 5 and represent the highest possible antenna temperature for the respective frequencies at the listed locations.

The next component in Equation 1 is $T_{comp}$, which represents the noise generated by equipment and lines. Since the goal of this paper is to analyze and compare the rain attenuation impact on the performance of the ground stations, a system with the same equipment and lines is hypothetically considered to be implemented in all the listed cities. This approach eliminates the equipment impact, leaving room for conclusions related only to the rain attenuation. Considering Figure 1, the parameters of the equipment and lines of the hypothetical satellite ground station are shown in Table 6.

The composite noise temperature $T_{comp}$ is calculated based on Equation 13 [7], as:

$$T_{comp} = T_f + \frac{T_{LNA}}{G_f} + \frac{T_c}{G_f G_{LNA}} + \frac{T_{LNC}}{G_f G_{LNA} G_c} \qquad (13)$$

Table 5. Antenna temperature.

Table 6. Parameters of hypothetical ground station.

Table 7. Figure of merit [G/TS (dB/K)].

Figure 7. Figure of merit (Prishtina case).

Considering Equation 13 and the highest possible antenna temperatures from Table 5, the Figure of Merit is calculated for the frequency band (2–4) GHz for the hypothetical satellite ground station implemented at all the listed locations, and is presented in Table 7 [8].

The results of the calculations show that for the same frequency and the same rainfall rate, the difference in Figure of Merit among the cities is less than 0.2 dB. This means that the ground station's performance within Central Europe does not strongly depend on location. This opens opportunities for the implementation of LEO ground stations in Europe for different purposes. For the city of Prishtina, the Figure of Merit as a function of frequency is presented in Figure 7.
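The cascade formula of Equation 13 and the resulting Figure of Merit can be illustrated with the Python sketch below. Several points are assumptions made here and are not stated in the paper: each loss and noise figure is converted to an equivalent noise temperature as $(10^{x/10} - 1) \cdot 290$ K, lossy lines are treated as stages with gain $1/L$, and all function names are hypothetical. The default parameter values are those of the hypothetical ground station (Table 6).

```python
import math

T0 = 290.0  # ASSUMED reference temperature for losses and noise figures [K]


def db_to_linear(value_db):
    return 10.0 ** (value_db / 10.0)


def noise_temperature_from_db(value_db):
    """Equivalent input noise temperature of a loss or noise figure given in dB."""
    return (db_to_linear(value_db) - 1.0) * T0


def composite_noise_temperature(lf_db=0.3, f_lna_db=0.75, g_lna_db=40.0,
                                lc_db=4.0, f_lnc_db=0.7):
    """T_comp [K] for the chain feed line -> LNA -> cabling -> LNC (Equation 13)."""
    g_f = 1.0 / db_to_linear(lf_db)      # feed line acts as a gain below 1
    g_lna = db_to_linear(g_lna_db)
    g_c = 1.0 / db_to_linear(lc_db)      # cabling and filtering loss
    t_f = noise_temperature_from_db(lf_db)
    t_lna = noise_temperature_from_db(f_lna_db)
    t_c = noise_temperature_from_db(lc_db)
    t_lnc = noise_temperature_from_db(f_lnc_db)
    return (t_f
            + t_lna / g_f
            + t_c / (g_f * g_lna)
            + t_lnc / (g_f * g_lna * g_c))


def figure_of_merit_db(antenna_gain_dbi, t_antenna_k, t_comp_k):
    """G/T_S in dB/K, with T_S = T_A + T_comp (Equation 1)."""
    return antenna_gain_dbi - 10.0 * math.log10(t_antenna_k + t_comp_k)


if __name__ == "__main__":
    t_comp = composite_noise_temperature()
    # 33.4 K is the antenna temperature listed for Prishtina at 2 GHz.
    g_over_t = figure_of_merit_db(40.0, 33.4, t_comp)
    print(f"T_comp = {t_comp:.1f} K, G/T_S = {g_over_t:.2f} dB/K")
```

Under these assumptions the result for Prishtina at 2 GHz lies within a few hundredths of a dB of the tabulated 19.44 dB/K, and the LNA term clearly dominates $T_{comp}$ after the feed line, which is why stages behind the high-gain LNA contribute very little.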

For the same parameters from Table 7, the Figure of Merit is recalculated and presented for different values of the receiving antenna gain. These diagrams confirm that the main improvement in the figure of merit can be achieved through the receiving antenna gain.

5. Conclusions

The rain attenuation analyses are related to link budget considerations in the ground station design. For Central Europe in the range of (2–4) GHz the rain attenuation varies from 0.5 dB to 3 dB. The performance of the ground station within Central Europe does not strongly depend on location. The difference in downlink figure of merit among all considered cities in Central Europe for S-band is on average less than 0.2 dB under the same rainfall rate, or on average less than 1 dB for different rainfall rates. A rain attenuation of 1 dB is sufficient to be considered within link budget calculations at S-band.

Antenna temperature [K] (Table 5)

Location    2 GHz   2.5 GHz   3 GHz   3.5 GHz
Madrid      34.9    56.1      80.9    120.4
Tirana      37.6    60.8      87.7    129.6
Roma        37.9    61.8      88.4    130.6
Prishtina   33.4    53.3      76.9    114.8
Zagreb      35.3    56.7      81.8    121.6
Vienna      33.7    53.9      77.7    115.9
Paris       34.4    55.2      79.6    118.5
Brussels    33.1    52.9      76.4    113.9
London      33.2    53.1      76.4    114.2
Berlin      32.5    51.9      74.7    111.8

Parameters of the hypothetical ground station (Table 6)

Receiving antenna gain (G)             40 dBi
Feed line loss (Lf)                    0.3 dB
LNA noise figure (FLNA)                0.75 dB
LNA gain (GLNA)                        40 dB
Loss, cabling and filtering (LC)       4 dB
LNC noise figure (FLNC)                0.7 dB
LNC gain (GLNC)                        35 dB

Figure of merit G/TS [dB/K] at a rain rate of 50 mm/h (Table 7)

Location    2 GHz   3 GHz   4 GHz
Prishtina   19.44   17.99   15.87
Roma        19.27   17.67   15.56
Vienna      19.43   17.97   15.85
Berlin      19.47   18.05   15.93
Brussels    19.45   18.00   15.89
London      19.45   18.00   15.88
Madrid      19.38   17.88   15.76
Paris       19.40   17.92   15.79
Tirana      19.28   17.70   15.56
Zagreb      19.37   17.86   15.73

6. References

[1] G. Maral and M. Bousquet, “Satellite communication systems,” John Wiley & Sons, Ltd, Chichester, England, 2002.

[2] S. R. Saunders, “Antennas and propagation for wireless communication systems,” John Wiley, Ltd, Toronto, 1993.

[3] P. G. Pino, J. M. Riera, and A. Benarroch, “Slant path attenuation measurements at 50GHz in Spain,” IEEE Antennas and Wireless Propagation Letters, Vol. 4, pp. 162–164, 2006.

[4] S. Cakaj and K. Malaric, “Rain attenuation at low earth orbiting satellite ground station,” in Proceedings of 48th International Symposium on Multimedia Systems and Applications, ELMAR, Zadar, Croatia, pp. 247–250, June 2006.

[5] S. Cakaj and K. Malaric, “Rain attenuation modeling for low earth orbiting ground station at S-band in Europe,” in Proceedings of IASTED, 18th International Conference on Modeling and Simulation, Montreal, Canada, pp. 17–20, May 30–June 1, 2007.

[6] W. Keim and A. L. Scholtz, “Performance and reliability evaluation of the S-band, at Vienna satellite ground station,” Talk, IASTED, International Conference on Communication System and Networks, Palma de Mallorca, Spain, 5 pages, 2006.

[7] S. Cakaj and K. Malaric, “Rigorous analysis on performance of LEO satellite ground station in urban environment,” International Journal of Satellite Communications and Networking, Vol. 25, No. 6, pp. 619–643, UK, November/December 2007.

[8] S. Cakaj and K. Malaric, “Downlink performance comparison for low earth orbiting satellite ground station at S-band in Europe,” IASTED, in Proceedings of 27th International Conference on Modeling, Identification and Control, Innsbruck, Austria, pp. 11–13, February 2008.



Detection, Identification and Tracking of Flying Objects in Three Dimensions Using Multistatic Radars

Laleh S. KALANTARI, Shahram MOHANNA, Saeed TAVAKOLI Faculty of Electrical and Computer Engineering, The University of Sistan and Baluchestan, Zahedan, Iran

Email: [email protected], [email protected], [email protected] Received December 24, 2008; revised February 27, 2009; accepted May 22, 2009

ABSTRACT

Multistatic radar systems can be used in many applications such as homeland security, anti-air defense, anti-missile defense, ship navigation and traffic control systems. Multistatic radars, which are capable of detecting and tracking flying objects in three-dimensional coordinate systems, are simulated in this paper. The location and velocity of flying objects as well as their radar cross sections are computed. The object's path is also estimated by tracking the object.

Keywords: Bistatic Radar, Multistatic Radar, RCS, Friis’ Formula, Doppler Effect

1. Introduction

Nowadays, most radars are monostatic, in which one antenna or a pair of antennas positioned at the same point is used for transmitting and receiving electromagnetic waves. The IEEE defines a bistatic radar as “a radar system that uses antennas at different locations for transmission and reception” [1]. The separation between the transmitter and receiver can be very long. This distance is referred to as the baseline range [1–5]. Obviously, if they are nearly co-located, i.e. if the baseline is very small, the radar system is approximated as a monostatic one.

As bistatic radars are only able to detect objects (targets) in a two-dimensional coordinate system, multistatic radars can be employed if objects in a three-dimensional (3D) coordinate system are to be detected. A multistatic radar system has two or more transmitter or receiver antennas, which are separated by large distances compared with the size of the antennas [1]. Multistatic radars have several advantages, including high isolation between the transmitter and receiver antennas and invulnerability to external interference such as jamming. Also, mutual interference between the transmitter and receiver is highly reduced. In such systems, multiple receivers can work with one transmitter and, if the transmitter stops working, the receiver can be easily adapted to receive waves from another transmitter. Transmitter-to-receiver switches or duplexers are expensive, lossy and heavy. Hence, they are not used in multistatic radars. The monostatic radar needs to transmit more power than a multistatic one; therefore, the latter is safer from attack. Increasing the number of receivers results in increasing the surveillance area [6]. However, there are some disadvantages. The main disadvantage is that synchronization between the transmitter and receiver and the geometry of the entire system are too complicated in multistatic radars [7–9].

Nowadays, the use of bistatic and multistatic radar systems in homeland security is of interest to researchers [10–13]. In [14], a software simulation package was developed for bistatic radars. Introducing both technical and theoretical information about bistatic radars, the technology of a television-based bistatic radar system was introduced in [7]. This technology makes use of a non-cooperative television transmitter as an illuminator for the bistatic radar system. In [15] and [16], the problem of tracking a target with bistatic and multistatic radars, their geometry dependencies with respect to measurements, as well as their input to a tracking and fusion system was studied. In [17], a unified view of the tracking algorithms available for multistatic radar systems was presented. Then, the tracking performance of the proposed algorithms was evaluated by means of Monte Carlo simulation techniques. In [18], the constant velocity model was applied for tracking the object; however, the object velocity was not constant in general. This limited the use of this algorithm in tracking the object to short ranges. Using range-only measurements collected by multistatic radars, the use of interval analysis to solve the problem of maneuvering target tracking was investigated


in [19]. In [20], the design and initial evaluation of a short-range prototype multistatic radar, operating in the ‘cooperative’ mode, was reported. It was designed to be capable of both spatial and temporal coherent processing of received waves. In [6], the system aspects of an anti-intruder multistatic radar based on impulse radio ultra-wideband technology were addressed. The system was composed of one transmitting node and at least three receiving nodes, positioned in the surveillance area, to detect and locate a human intruder (target) moving inside the area.

In this paper, a multistatic radar system, which is capable of computing the location, velocity and kind of an object as well as the object's path, is simulated. More importantly, we determine the kind of object by calculating the Radar Cross Section (RCS) in 3D. This is an advantage of this paper, as the RCS was treated as a given parameter in some research works such as [14] and [6]. The RCS is a very important parameter, which indicates the kind of the object, such as jumbo jets, fighter aircraft, missiles, etc. [21]. The multistatic radar system can comprise one transmitter and a number of receivers. When the transmitter sends a wave, it is scattered from objects, if there are any, and the receivers get the scattered waves. Two receivers are adequate to find the velocity, location and RCS in 3D. In this paper, we only consider two receivers; however, if the number of receivers is more than two, the two receivers receiving the maximum power can be chosen.

2. Problem Formulation

The analysis of multistatic radars can be simplified to that of their bistatic radar components. Figure 1 illustrates the geometry of a bistatic radar [9];

Figure 1. The geometry of a bistatic radar.

where $X_T$ and $X_R$ show the locations of the transmitter and receiver, and $L$ is the distance between these two sites. $R_T$ and $R_R$ refer to the transmitter-to-target and target-to-receiver ranges, respectively. $\beta = \theta_T - \theta_R$ is the bistatic angle, $V$ is the target velocity and $\delta$ is the velocity aspect angle. The transmitter, receiver and object are all located in a plane called the bistatic plane [6,8,15,22].

2.1. Equation of Power

If a narrowband wave with power $P_T$ is sent by the transmitter, then, after scattering from the object, the power received by each receiver, $P_{R_i}$, can be determined using Friis’ formula [8].

$$P_{R_i} = \frac{P_T \, G_T \, G_{R_i} \, \lambda^2 \, \sigma_B}{(4\pi)^3 \, R_T^2 \, R_{R_i}^2 \, L_T \, L_{R_i} \, L_P}, \quad i = 1, 2, \ldots, n \qquad (1)$$

In Equation (1), $n$, $\lambda$ and $L_P$ are the number of receivers, the wavelength and the average propagation loss, respectively. $G_T$ and $G_{R_i}$ refer to the gain of the transmitter antenna and the gains of the receiver antennas. $L_T$ and $L_{R_i}$ are the transmitting and receiving system losses. $\sigma_B$ refers to the RCS of the object. In a practical case, $P_T$, $P_R$, $R_T$, $R_R$, $\lambda$, $L_P$, $G_T$, $G_R$, $L_T$ and $L_R$ can be measured; therefore, the RCS of the object, $\sigma_B$, can be obtained from Equation (1).

2.2. Estimation of Target Location

To estimate the location of a target in bistatic radars, the receivers measure the time interval, $\Delta T_{TT}$, between transmitted pulses and received echo waves scattered from a target. Hence, the range sum, $R_R + R_T$, can be calculated using the following equation.

$$R_R + R_T = c \, \Delta T_{TT}, \quad c = 2.998 \times 10^{8} \ \text{m/s} \qquad (2)$$

In each bistatic plane, $R_T$, $R_R$ and $\beta$ can be obtained using the range sum and either $\theta_R$ or $\theta_T$ (shown in Figure 1). If $\theta_T$ is known, $R_T$, $R_R$ and $\beta$ can be calculated from Equations (3–5) [7,8,9].

$$R_T = \frac{(R_T + R_{R_i})^2 - L_i^2}{2\left(R_T + R_{R_i} - L_i \sin(\theta_{T_i})\right)}, \quad i = 1, 2, \ldots, n \qquad (3)$$

$$R_{R_i} = \sqrt{R_T^2 + L_i^2 - 2 R_T L_i \sin(\theta_{T_i})}, \quad i = 1, 2, \ldots, n \qquad (4)$$

$$\beta_i = \sin^{-1}\!\left(\frac{L_i \cos(\theta_{T_i})}{R_{R_i}}\right), \quad i = 1, 2, \ldots, n \qquad (5)$$
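The range equations can be checked numerically with a small Python sketch. It assumes the sign convention $R_R^2 = R_T^2 + L^2 - 2 R_T L \sin\theta_T$, which is spelled out in the comments; the function name is hypothetical. The inputs are the first-receiver values that the paper lists among its simulation parameters (transmitter at the origin, first receiver at (-20, 30, 0) km, time interval 0.5135 ms, transmitter angle -41.27 deg).

```python
import math

C = 2.998e8  # speed of light [m/s], the value used in Equation (2)


def bistatic_solution(delta_t_s, baseline_m, theta_t_deg):
    """Return (R_T [m], R_R [m], beta [deg]) in one bistatic plane.

    ASSUMED convention: R_R^2 = R_T^2 + L^2 - 2 * R_T * L * sin(theta_T).
    """
    range_sum = C * delta_t_s                                    # Eq. (2)
    sin_t = math.sin(math.radians(theta_t_deg))
    cos_t = math.cos(math.radians(theta_t_deg))
    # Eq. (3): transmitter-to-target range from the range sum and theta_T
    r_t = (range_sum ** 2 - baseline_m ** 2) / (
        2.0 * (range_sum - baseline_m * sin_t))
    # Eq. (4): target-to-receiver range from the law of cosines
    r_r = math.sqrt(r_t ** 2 + baseline_m ** 2 - 2.0 * r_t * baseline_m * sin_t)
    # Eq. (5): bistatic angle from the law of sines
    beta = math.degrees(math.asin(baseline_m * cos_t / r_r))
    return r_t, r_r, beta


if __name__ == "__main__":
    baseline = math.hypot(-20e3, 30e3)   # transmitter at origin, receiver 1
    r_t, r_r, beta = bistatic_solution(0.5135e-3, baseline, -41.27)
    print(f"R_T = {r_t / 1e3:.2f} km, R_R1 = {r_r / 1e3:.2f} km, "
          f"beta = {beta:.2f} deg")
```

Under this convention the two ranges add up to the measured range sum, and they agree with the ranges reported in the paper's results (63.06 km and 90.9 km) to within rounding of the inputs.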


Having the transmitter and receiver locations, $(x_T, y_T, z_T)$ and $(x_{R_i}, y_{R_i}, z_{R_i})$, as well as $R_T$ and $R_{R_i}$ obtained from Equations (3,4), the location of the object, $(x, y, z)$, can be calculated in the Cartesian coordinate system by solving Equations (6,7).

$$R_T = \sqrt{(x - x_T)^2 + (y - y_T)^2 + (z - z_T)^2} \qquad (6)$$

$$R_{R_i} = \sqrt{(x - x_{R_i})^2 + (y - y_{R_i})^2 + (z - z_{R_i})^2}, \quad i = 1, 2, \ldots, n \qquad (7)$$

2.3. Computation of Target Velocity

To estimate the velocity of the target, the equation of the “Doppler effect” is used [15].

$$f_{d_{b_i}} = \frac{1}{\lambda}\left[\frac{dR_T}{dt} + \frac{dR_{R_i}}{dt}\right], \quad i = 1, 2, \ldots, n \qquad (8)$$

Also, the rates of change of the transmitter and receiver ranges with respect to time are given by

$$\frac{dR_T}{dt} = V_i \cos(\delta_i - \beta_i / 2), \quad i = 1, 2, \ldots, n \qquad (9)$$

$$\frac{dR_{R_i}}{dt} = V_i \cos(\delta_i + \beta_i / 2), \quad i = 1, 2, \ldots, n \qquad (10)$$

Combining Equations (9,10) yields

$$f_{d_{b_i}} = \frac{2 V_i}{\lambda} \cos(\delta_i) \cos(\beta_i / 2), \quad i = 1, 2, \ldots, n \qquad (11)$$

where $n$ is the number of receivers. In this research work $n$ is set to 2. Having $R_T$ and $R_{R_i}$ as well as $dR_T/dt$ and $dR_{R_i}/dt$, the partial derivatives of $x$, $y$ and $z$ with respect to time, $\partial x/\partial t$, $\partial y/\partial t$ and $\partial z/\partial t$, can be obtained from Equations (12,13).

$$\frac{dR_T}{dt} = \frac{\partial R_T}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial R_T}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial R_T}{\partial z}\frac{\partial z}{\partial t} \qquad (12)$$

$$\frac{dR_{R_i}}{dt} = \frac{\partial R_{R_i}}{\partial x}\frac{dx}{dt} + \frac{\partial R_{R_i}}{\partial y}\frac{dy}{dt} + \frac{\partial R_{R_i}}{\partial z}\frac{dz}{dt}, \quad i = 1, 2 \qquad (13)$$

2.4. Calculation of RCS

There are several analytical methods to calculate the RCS, such as those given in [23] and [24], which calculate the RCS of a cone and a missile. The RCS of typical objects can be found in textbooks [3,21,25] and some of them are shown in Table 1 [21].

Table 1. Values of RCS for some typical objects [21].

Object                            RCS (dBsm)
Jumbo jet airliner                20
Large bomber or commercial jet    16
Large fighter aircraft            7.78
Four-passenger jet                3
Conventional winged missile       -3

3. Results

The multistatic radar system described in this paper consists of one transmitter and two receivers. To obtain the location and velocity of an object, the transmitter radar sends wave pulses with a specific period and then the receiver radars receive some pulses. In general, the pulses coming to the receivers can consist of two parts. The first part is the pulses received from the transmitter directly. The second part is the scattered waves coming from an object, if there is any. First, the range sum, $R_{R_i} + R_T$, is calculated from Equation (2). Then, $R_T$, $R_R$, $\beta$, $dR_T/dt$ and $dR_R/dt$ in each bistatic plane are calculated using the parameters given in Table 2.

Next, combining Equations (3,4) and (6,7), the object location is estimated in 3D. Finally, combining Equations (9,10) and (12,13), the velocity of the target is computed.

Based on the above-mentioned simulation procedure, a MATLAB code was developed. Running this program, the velocity of the object, $R_T$, $R_{R_1}$ and $R_{R_2}$ are given by 321.71 m/s, 63.06 km, 90.9 km and 68.57 km, respectively. The location of the object is estimated at (-30, 30, 46.6) km. Figures 2 and 3 show the system geometry and the polar plot of the location of the object.

The required parameters to determine the RCS of the object are shown in Table 3. Using Equation (1) along with these parameters results in an RCS of 7.78 dBsm. According to Table 1, this amount of RCS indicates that the object is a large fighter aircraft [21].

Also, the path of the aircraft obtained by tracking the object is shown in Figure 4.

4. Conclusions

Considering the advantages of multistatic radar systems in comparison with monostatic and bistatic ones, a multistatic radar system having one transmitter and two receivers was simulated. The simulation was able to calculate the location and velocity of an object in 3D. More importantly, by computing the RCS, it determined what

kind of object was flying in the surveillance area. In addition, the path of the object was estimated by tracking the object. As an application, this radar system can be applied to anti-air defense, anti-missile defense, ship navigation and traffic control systems.

In this paper only two receivers were considered. To widen the surveillance area, the number of receivers can be increased. However, in any detection only the two receivers that receive the maximum power are selected. This will be studied in our future work.

Table 2. Required parameters to estimate the location and velocity of an object.

Description                                                                 Parameter         Value
Transmitter location                                                        xT, yT, zT        (0, 0, 0) km
First receiver location                                                     xR1, yR1, zR1     (-20, 30, 0) km
Second receiver location                                                    xR2, yR2, zR2     (25, 20, 0) km
Time interval between transmitted pulses and first receiver received echo   ΔTTT1             0.5135 ms
Transmitter azimuth angle at first bistatic plane                           θT1               -41.27 deg
Transmitter azimuth angle at second bistatic plane                          θT2               4.26 deg
Aspect angle at first bistatic plane                                        δ1                85.77 deg
Aspect angle at second bistatic plane                                       δ2                95.27 deg
Transmitted frequency                                                       fT                2 GHz
First received frequency                                                    fR1               1.999999811 GHz
Second received frequency                                                   fR2               2.000000344 GHz

Table 3. Required parameters to determine the RCS of an object.

Description                                        Parameter   Value
Transmitted power                                  PT          35 dBW
First receiver power received                      PR1         -124.8 dBW
Second receiver power received                     PR2         -122.39 dBW
Power gain of transmitter and receivers antenna    GR          40 dB
Transmit system loss                               LT          1 dB
Receive system loss                                LR          1 dB
Average propagation loss                           LP          1 dB

Figure 2. The system geometry.

Figure 3. The polar plot of the location of the object.

Figure 4. The path of the aircraft.

5. References [1] “IEEE standard radar definitions,” IEEE, 2008.

[2] L. S. Kalantari, “Simulation of multistatic radar systems,” Master Thesis, The University of Sistan and Baluchestan, Zahedan, Iran, 2009.

[3] R. S. A. R. Abdullah and A. Ismail, “Forward scattering radar current and future application,” International Jour-nal of Engineering and Technology, Vol. 3, pp. 61–67, 2006.

[4] M. I. Skolnik, Introduction to Radar Systems, 3rd Edition, McGraw-Hill, New York, 2001.

[5] M. I. Skolnik, Radar Hand Book, McGraw-Hill, New York, 1970.

[6] E. Paolini, A. Giorgetti, M. Chiani, R. Minutolo, and M. Montanari, “Localization capability of cooperative anti-intruder radar systems,” EURASIP Journal on Ad-vances in Signal Processing, Vol. 2008, pp. 15, 2008.

[7] C. Wei and W. Chang, “System level investigations of television based bistatic radar,” Master Thesis, Cape Town, pp. 107, 2005.

[8] N. J. Willis, Bistatic Radar, 2nd Edition, Artech House, 1995.

[9] M. I. Skolnik, Radar Hand Book, 2nd Edition, McGraw-Hill, New York, 1990.

[10] J. Byrnes and G. Ostheimer, Advances in Sensing with Security Applications, Ciocco, Springer, Italy, 2005.

[11] W. Wang, “Application of near-space passive radar for homeland security,” Sensing and Imaging: An Interna-tional Journal, Vol. 8, pp. 39–52, 2007.

[12] H. D. Griffiths and C. J. Baker, “Fundamentals of tomo-

graphy and radar,” NATO Advanced Study Institute Ad-vances in Sensing with Security Applications, 2005.

[13] P. Withington, H. Fluhler, and S. Nag, “Enhancing home-land security with advanced UWB sensors,” IEEE Mi-crowave Magazine, Vol. 4, pp. 55–58, 2003.

[14] C. L. Teo, “Bistatic radar system analysis and software development,” Master Thesis, Naval Postgraduate School, Monterey, California, pp. 116, 2003.

[15] T. Johnsen and K. E. Olsen, “Bi- and multistatic radar,” Advanced Radar Signal and Data Processing, pp. 4.1– 4.34, 2006.

[16] T. Johnsen, B. Hafskjold, and K. E. Olsen, “Tracking and data fusion in Bi- and multistatic radar,” IEEE Radar Conference, Arlington, USA, 2005.

[17] A. Farina, “Tracking function in bistatic and multistatic radar systems,” IEE Proceedings F, 1986.

[18] W. Chongyu, X. Shanjia, and W. Dongjin, “Analysis of target tracking based on range difference measurement of multistatic radar system,” Journal of Electronics, Vol. 17, 2000.

[19] G. L. Soares, A. Arnold-Bos, L. Jaulin, C. A. Maia, and J. A. Vasconcelos, “An interval-based target tracking ap-proach for range-only multistatic radar,” IEEE Transac-tions on Magnetics, 2008.

[20] T. E. Derham, S. Doughty, K. Woodbridge, and C. J. Baker, “Design and evaluation of a low-cost multistatic netted radar system,” IET Radar Sonar Navigation, Vol. 1, 2007.

[21] C. A.Balanis, Antenna Theory: Analysis and Design, 3rd Edition, John Wiley and Sons, New York, 2005.

[22] B. R. Mahafza, Radar Systems Analysis and Design Us-ing Matlab, 2nd Edition, Chapman & Hall, 2005.

[23] T. Mosayyebi-Dorcheh, L. S. Kalantari, S. Mohanna, and S. Tavakoli, “Electromagnetic scattering from an infinite dielectric cone,” Progress in Electromagnetics Research Symposium, Russia, 2009.

[24] L. S. Kalantari, T. Mosayyebi-Dorcheh, S. Mohanna, and S. Tavakoli, “Missile radar cross section calculation and its use in 3-D anti-missile defense system,” Progress in Electromagnetics Research Symposium, Russia, 2009.

[25] W. Wiesbeck, Radar System Engineering, 13th Edition, 2007.



Investigation into the Performance of a MIMO System Equipped with ULA or UCA Antennas: BER, Capacity and Channel Estimation

Xia LIU1, Marek E. BIALKOWSKI2, Feng WANG1, Konstanty BIALKOWSKI3

1Student Member IEEE, School of ITEE, The University of Queensland, Brisbane, Australia
2Fellow IEEE, School of ITEE, The University of Queensland, Brisbane, Australia
3Member IEEE, National ICT Australia-Queensland Lab, Brisbane, Australia
Email: xialiu, meb, fwang, [email protected]

Received June 15, 2009; revised July 27, 2009; accepted August 30, 2009

ABSTRACT

This paper reports on investigations into the performance of a Multiple Input Multiple Output (MIMO) wireless communication system employing a uniform linear array (ULA) at the transmitter and either a uniform linear array (ULA) or a uniform circular array (UCA) antenna at the receiver. The transmitter is assumed to be surrounded by scattering objects while the receiver is postulated to be free from scattering objects. The Laplacian distribution of angle of arrival (AOA) of a signal reaching the receiver is postulated. The performance of bit error rate (BER), capacity and channel estimation for a MIMO system are evaluated for the two cases in which the receiver is equipped with ULA or with UCA antennas.

Keywords: MIMO, BER, BPSK, FSK, Channel Capacity, EDOF, Channel Estimation, ULA, UCA, Spatial Correlation

1. Introduction

In recent years, there has been a growing interest in the communication research community in the signal transmission technique employing multiple element antennas both at the transmitter and receiver sides of a wireless communication system. The reason is that it can significantly improve the transmission quality in terms of data throughput (capacity) and coverage area without the need for extra operational frequency bandwidth. Known as the multiple-input multiple-output (MIMO) technique, it is one of the promising techniques for the next generation of mobile communications. For its physical implementation, the MIMO technique frequently assumes uniform linear arrays (ULA) at both the transmitter and receiver ends of a wireless communication system. However, to obtain operation with larger angular views, uniform circular arrays (UCA) and similar configurations such as triangular, square, pentagonal or hexagonal arrays are also considered. It can be expected that different configurations of antenna arrays will result in different spatial correlations of transmitted/received signals and thus they will influence channel properties between transmitter and receiver in a different way. It is well known that MIMO channel capacity performance is based on the properties of the channel matrix. According to [1] and [2], MIMO system BER performance and training-based channel estimation performance are determined by the channel correlation matrix, which is affected by channel properties. Consequently, whether the MIMO system employs a ULA or a UCA at the receiver will affect the bit error rate (BER), channel estimation and the MIMO system capacity in distinct ways.

In this paper, calculations of the MIMO system BER are performed for two modulation schemes, BPSK and FSK, for both noncoherent and coherent cases. For channel estimation the SLS and MMSE estimation methods are considered. In the undertaken investigations it is assumed that the receiver employs either ULA or UCA antennas while the transmitter uses only ULA. Also assumed is that the transmitter is surrounded by scattering objects while the receiver is free from scatterers. To determine the antenna array spatial correlation pattern, a Laplacian distribution for the angle of arrival (AOA), which provides a good agreement with the measured data [3], is postulated.


2. System Model

2.1. System Configuration and Spatial Correlations

Figure 1 shows the configuration of the investigated MIMO system. The case of 4x4 MIMO is considered.

The transmitter is assumed to be equipped with a ULA antenna surrounded by scattering objects that are uniformly distributed in a circle. Antenna elements in the array have an omnidirectional radiation pattern in the azimuth plane. The considered case represents a mobile station operating close to the ground where many surrounding obstacles are expected. In turn, the receiver is assumed to be equipped with either ULA or UCA of omnidirectional antenna elements free from any surrounding obstacles. This configuration can represent a base station with antennas located high above the ground where there are no scattering objects.

In Figure 1, θ stands for the central AOA which is determined by the physical position of dominant scatterers with respect to the receiving antenna array. Assuming that the AOA follows the Laplacian distribution, the mathematical expressions for the real and imaginary parts of spatial correlation between the m-th and n-th antenna for the case of UCA receiving antenna are given as [3]:

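$$\mathrm{Re}[R_r(m,n)] = J_0(Z_c) + 4C_l \sum_{k=1}^{\infty} \frac{a\,(1 - e^{-a\pi})}{a^2 + 4k^2}\, J_{2k}(Z_c)\, \cos[2k(\alpha + \theta)] \tag{1}$$

$$\mathrm{Im}[R_r(m,n)] = 4C_l \sum_{k=0}^{\infty} \frac{a\,(1 + e^{-a\pi})}{a^2 + (2k+1)^2}\, J_{2k+1}(Z_c)\, \sin[(2k+1)(\alpha + \theta)] \tag{2}$$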

where $C_l$ is a normalizing constant given as [3]:

$$C_l = \frac{a}{2\,(1 - e^{-a\pi})} \tag{3}$$

with a representing a decay factor related to the angle spread (AS). When a increases the angle spread decreases. $J_n(\cdot)$ is the n-th order Bessel function of the first kind. $Z_c$ is related to the antenna spacing and α is the relative angle between the m-th and n-th antenna. If we let

$$K_1 = \frac{2\pi R}{\lambda}\,[\cos(\psi_m) - \cos(\psi_n)] \tag{4}$$

$$K_2 = \frac{2\pi R}{\lambda}\,[\sin(\psi_m) - \sin(\psi_n)] \tag{5}$$

where $\psi_m$ is the angle of the m-th antenna in the azimuthal plane, then:

$$\sin(\alpha) = \frac{K_1}{\sqrt{K_1^2 + K_2^2}}, \qquad \cos(\alpha) = \frac{K_2}{\sqrt{K_1^2 + K_2^2}}, \qquad Z_c = \sqrt{K_1^2 + K_2^2} \tag{6}$$
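The geometry relations (4) to (6) can be sketched in a few lines. This is only an illustration: the function name `uca_pair_geometry` is hypothetical, and the element placement (element i at azimuth 2π(i−1)/M on a circle of radius R) is an assumption, since the text does not state the element angles explicitly.

```python
import math


def uca_pair_geometry(m, n, num_elements, radius_over_lambda):
    """Return (Z_c, alpha) for elements m and n (1-based) of a circular array.

    Assumption: element i sits at azimuth psi_i = 2*pi*(i-1)/num_elements.
    """
    psi_m = 2.0 * math.pi * (m - 1) / num_elements
    psi_n = 2.0 * math.pi * (n - 1) / num_elements
    scale = 2.0 * math.pi * radius_over_lambda
    k1 = scale * (math.cos(psi_m) - math.cos(psi_n))  # Eq. (4)
    k2 = scale * (math.sin(psi_m) - math.sin(psi_n))  # Eq. (5)
    z_c = math.hypot(k1, k2)                          # Eq. (6), Z_c
    # Eq. (6): sin(alpha) = K1 / Z_c and cos(alpha) = K2 / Z_c
    alpha = math.atan2(k1, k2)
    return z_c, alpha


if __name__ == "__main__":
    z_c, alpha = uca_pair_geometry(1, 2, num_elements=4, radius_over_lambda=0.5)
    print(f"Z_c = {z_c:.4f}, alpha = {math.degrees(alpha):.1f} deg")
```

For opposite elements of a 4-element array with R = 0.5λ the element separation is one wavelength, so Z_c equals 2π under these assumptions.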

The mathematical expressions for real and imaginary components of spatial correlation between m-th and n-th antenna at the receiver for the case of ULA antenna are given as [4]:

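$$\mathrm{Re}[R_r(m,n)] = J_0(Z_l) + 4C_l \sum_{k=1}^{\infty} \frac{a\,(1 - e^{-a\pi})}{a^2 + 4k^2}\, J_{2k}(Z_l)\, \cos(2k\theta) \tag{7}$$

$$\mathrm{Im}[R_r(m,n)] = 4C_l \sum_{k=0}^{\infty} \frac{a\,(1 + e^{-a\pi})}{a^2 + (2k+1)^2}\, J_{2k+1}(Z_l)\, \sin[(2k+1)\theta] \tag{8}$$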

Figure 1. 4-element UCA and ULA: (a) UCA, (b) ULA.


Figure 2. Spatial correlation |R(1,2)| between antenna 1 and 2 versus element spacing d/λ for UCA and ULA at AOA of 30°, 60° and 90°, for decay factor a = 3, 10 and 30.

where Zl = 2π(m-n)d/λ and d is antenna spacing.

The above expressions (1), (2) and (7), (8) can be applied to determine spatial correlations between any two antenna elements in UCA or ULA receiving antennas. Note that these expressions do not include the effect of antenna mutual coupling. This condition is approximately fulfilled when the antenna element spacing is about half of the wavelength or more.
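A Bessel-series correlation of this kind can be cross-checked numerically against the integral it comes from. The sketch below does this for a linear array under assumptions that are not taken from the paper: the inter-element phase for a wave arriving at angle φ is taken as Z_l·sin(φ), the Laplacian density C_l·exp(−a|φ−θ|) is taken over [θ−π, θ+π], and the series coefficients are the ones obtained from the Jacobi-Anger expansion under that convention. All function names are hypothetical.

```python
import cmath
import math


def bessel_j(order, x, steps=400):
    """J_order(x) from (1/pi) * integral_0^pi cos(order*t - x*sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def ula_correlation_series(z_l, theta, a, terms=30):
    """Series form: J0 term plus even-order cosine and odd-order sine sums."""
    decay = math.exp(-a * math.pi)
    c_l = a / (2.0 * (1.0 - decay))          # Laplacian normalising constant
    even_w = 4.0 * c_l * a * (1.0 - decay)   # weight of even-order terms
    odd_w = 4.0 * c_l * a * (1.0 + decay)    # weight of odd-order terms
    re = bessel_j(0, z_l)
    im = 0.0
    for k in range(1, terms + 1):
        re += even_w / (a * a + 4 * k * k) * bessel_j(2 * k, z_l) * math.cos(2 * k * theta)
    for k in range(terms):
        q = 2 * k + 1
        im += odd_w / (a * a + q * q) * bessel_j(q, z_l) * math.sin(q * theta)
    return complex(re, im)


def ula_correlation_direct(z_l, theta, a, steps=20000):
    """Direct midpoint integration of C_l*exp(-a|u|)*exp(j*z_l*sin(theta+u)), u in [-pi, pi]."""
    c_l = a / (2.0 * (1.0 - math.exp(-a * math.pi)))
    h = 2.0 * math.pi / steps
    total = 0j
    for i in range(steps):
        u = -math.pi + (i + 0.5) * h
        total += math.exp(-a * abs(u)) * cmath.exp(1j * z_l * math.sin(theta + u))
    return c_l * total * h


if __name__ == "__main__":
    z = 2.0 * math.pi * 0.5          # adjacent elements, d = 0.5 wavelength
    for aoa_deg in (30, 60, 90):
        r = ula_correlation_series(z, math.radians(aoa_deg), a=3.0)
        print(aoa_deg, round(abs(r), 4))
```

Agreement between the two functions only confirms that the series matches the stated integral; it says nothing about whether the sine-phase convention is the one used to produce Figure 2.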

Figure 2 shows the spatial correlation between two antenna elements (1 and 2) of a UCA or ULA antenna when the central AOA is 30°, 60° and 90°. There are three curves in each plot. These curves correspond to a different decay factor a of 3, 10 and 30.

From the results presented in Figure 2, it is apparent that for the same antenna spacing d/λ the spatial correlation in ULA is higher than that in UCA when the central AOA increases from 30° to 90°. This can be due to the fact that ULA offers limited diversity when signals arrive from directions close to the ULA end-fire direction. UCA eliminates this deficiency as it offers almost a uniform view angle for all directions.

2.2. Channel model

A flat block-fading narrow-band MIMO system with Mt array antennas at transmitter and Mr array antennas at receiver is considered. The relationship between the received and transmitted signals is given by (9):

$$Y_s = H\,s(t) + v(t) \tag{9}$$

where Ys is the Mr x N complex matrix representing the received signals; s(t) is the Mt x N complex matrix representing transmitted signals at time domain t; H is the Mr x Mt complex channel matrix and v(t) is the Mr x N complex zero-mean white noise matrix at time domain t. N is the length of transmitted signal. The channel matrix H describes the channel properties which depend on antenna array configuration and signal propagation environment.

In order to simulate properties of the MIMO channel we apply the Kronecker model [5,6]. In this model, the transmitter and receiver correlations are assumed to be separable and the channel matrix H is represented as:

$$H = R_R^{1/2}\, H_g\, R_T^{1/2} \tag{10}$$

where $H_g$ is a matrix with independent identically distributed (i.i.d.) Gaussian entries with zero mean and unit variance, and $R_R$ and $R_T$ are spatial correlation matrices at the receiver and transmitter, respectively. The channel correlation is expressed as,

$$R_H = E\{H H^H\} \tag{11}$$

where E{·} stands for the statistical expectation.

For the array configurations shown in Figure 1, the correlation experienced by pairs of transmitting antennas can be written as [7]:

$$R_t(m,n) = J_0[2\pi (m - n)\, d/\lambda] \tag{12}$$

Therefore, the correlation matrix Rt for the MS transmitting antennas can be generated using (13)

$$R_t = \begin{bmatrix} R_t(1,1) & \cdots & R_t(1,M_t) \\ \vdots & \ddots & \vdots \\ R_t(M_t,1) & \cdots & R_t(M_t,M_t) \end{bmatrix} \tag{13}$$

In turn, the correlation matrix for the receiving antennas, Rr, can be obtained using Equations (1), (2) and (7), (8) and can be shown to be given as (14).

$$R_r = \begin{bmatrix} R_r(1,1) & \cdots & R_r(1,M_r) \\ \vdots & \ddots & \vdots \\ R_r(M_r,1) & \cdots & R_r(M_r,M_r) \end{bmatrix} \tag{14}$$

Having determined Rt and Rr, the channel matrix H can be calculated using Equation (10).
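A minimal sketch of drawing one channel realisation from the Kronecker model is given below, using only the Python standard library. Two points are assumptions rather than the paper's procedure: the matrix square roots are replaced by Cholesky factors (A with A·A^H = R), which gives the same second-order statistics because the entries of H_g are i.i.d. Gaussian, and the transmit correlation entries are taken as J0(2π(m−n)d/λ). All function names are hypothetical.

```python
import math
import random


def bessel_j0(x, steps=200):
    """J_0(x) from (1/pi) * integral_0^pi cos(x*sin t) dt (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps)) * h / math.pi


def tx_correlation(mt, d_over_lambda):
    """Transmit correlation matrix with entries J0(2*pi*(m-n)*d/lambda) (assumed form)."""
    return [[complex(bessel_j0(2.0 * math.pi * (m - n) * d_over_lambda))
             for n in range(mt)] for m in range(mt)]


def cholesky(r):
    """Lower-triangular A with A * A^H == r, for Hermitian positive-definite r."""
    size = len(r)
    low = [[0j] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            acc = r[i][j] - sum(low[i][k] * low[j][k].conjugate() for k in range(j))
            low[i][j] = complex(math.sqrt(acc.real)) if i == j else acc / low[j][j]
    return low


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def hermitian(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def kronecker_channel(r_rx, r_tx, rng):
    """One draw of H = A_R * H_g * A_T^H, H_g i.i.d. CN(0, 1)."""
    mr, mt = len(r_rx), len(r_tx)
    s = math.sqrt(0.5)  # per-component std so each entry has unit variance
    h_g = [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(mt)]
           for _ in range(mr)]
    return matmul(matmul(cholesky(r_rx), h_g), hermitian(cholesky(r_tx)))


if __name__ == "__main__":
    identity = [[complex(i == j) for j in range(4)] for i in range(4)]
    h = kronecker_channel(identity, tx_correlation(4, 0.5), random.Random(0))
    print(len(h), "x", len(h[0]))
```

An uncorrelated receive side is used in the usage lines only to keep the example short; any Hermitian positive-definite receive correlation matrix can be passed instead.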

3. BER Performance with BPSK Modulation Scheme

3.1. BER Performance Analysis

Under a fading channel scenario, the average error probability can be obtained by averaging the conditional error probability over the probability density function (pdf) of instantaneous SNR γ as:

$$P_e = \int_0^{\infty} P(e\,|\,\gamma)\, p(\gamma)\, d\gamma \tag{15}$$

To obtain p(γ), the method in [14] can be used to find the characteristic function Фγ of γ. By applying the operation of an inverse Fourier transform (IFT) to the characteristic function, p(γ) can be derived. According to [1], the general expression for the characteristic function of γ is given as:

( ) | |r

mMw I nw R

m (16)

Here w = t/ρ, where ρ is the transmit SNR, and m indicates the fading distribution properties: m = 1 and m = 1/2 correspond to the Rayleigh and the one-sided Gaussian distribution, respectively. R is the channel correlation matrix.

Here, we assume the modulation schemes for the MIMO system under investigation are differential binary phase-shift keying (DBPSK) and binary orthogonal frequency-shift keying (BFSK). In the noncoherent case, the conditional BER for DBPSK and BFSK is given as [8],

$$P(e\,|\,\gamma) = \frac{1}{2} \exp(-\alpha\gamma) \tag{17}$$

in which α is a modulation constant: BFSK corresponds to α = 1/2 and DBPSK corresponds to α = 1. The average BER can be written as:

0

1exp( ) ( )

21

( ) |21

| |2 r

noncoherente

nt

mM

P P

t

I Rm

d

(18)
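As a sanity check on averaging a conditional error rate of the form (1/2)exp(−αγ) over the SNR density, consider the simplest special case. This case is an assumption made here for illustration, not a configuration studied in the paper: a single Rayleigh-faded branch whose instantaneous SNR is exponentially distributed with mean $\bar{\gamma}$.

```latex
\begin{aligned}
p(\gamma) &= \frac{1}{\bar{\gamma}}\, e^{-\gamma/\bar{\gamma}}, \qquad \gamma \ge 0, \\
P_e &= \int_0^{\infty} \tfrac{1}{2}\, e^{-\alpha\gamma}\, p(\gamma)\, d\gamma
     = \frac{1}{2\bar{\gamma}} \int_0^{\infty} e^{-(\alpha + 1/\bar{\gamma})\gamma}\, d\gamma
     = \frac{1}{2\,(1 + \alpha\bar{\gamma})}.
\end{aligned}
```

With α = 1 this reduces to the familiar $1/(2(1+\bar{\gamma}))$ for DBPSK, and with α = 1/2 to $1/(2+\bar{\gamma})$ for noncoherent BFSK.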

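In turn, for the coherent case, the conditional BER for BPSK and BFSK is given as [8],

$$P(e\,|\,\gamma) = Q\left(\sqrt{2\alpha\gamma}\right) \tag{19}$$

in which Q(x) is the Gaussian Q function, expressed as [9],

$$Q(x) = \frac{1}{\pi} \int_0^{\pi/2} \exp\left(-\frac{x^2}{2\sin^2\theta}\right) d\theta, \qquad x \ge 0 \tag{20}$$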
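The finite-range integral form of the Q function can be checked numerically against the usual definition Q(x) = 0.5·erfc(x/√2). The short sketch below does this with a midpoint rule; the function names are hypothetical and the step count is an arbitrary choice.

```python
import math


def q_finite_range(x, steps=2000):
    """Q(x), x >= 0, from (1/pi) * integral_0^{pi/2} exp(-x^2 / (2 sin^2 t)) dt."""
    if x < 0:
        raise ValueError("this integral form is valid for x >= 0 only")
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        s = math.sin((i + 0.5) * h)   # midpoint, so s is never exactly zero
        total += math.exp(-x * x / (2.0 * s * s))
    return total * h / math.pi


def q_erfc(x):
    """Reference value of Q(x) from the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def coherent_conditional_ber(gamma, alpha):
    """Conditional BER Q(sqrt(2*alpha*gamma)) for instantaneous SNR gamma."""
    return q_erfc(math.sqrt(2.0 * alpha * gamma))


if __name__ == "__main__":
    for x in (0.0, 1.0, 3.0):
        print(x, q_finite_range(x), q_erfc(x))
```

The finite integration range is what makes this form convenient when the conditional BER has to be averaged over the SNR density: the exponential inside the integral can be averaged first, and the integral over θ evaluated afterwards.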

The average BER can be written as

2

/2

20 0

/2

/sin0

2

1exp( ) ( )

2 sin1

( ) |

1| |

sinr

coherente

nt

mM

P d

t

I R dm

P d

(21)

3.2. Numerical Results

For the convenience of simulation, average BER of noncoherent DBPSK and BFSK are applied to evaluate the BER performance for the MIMO system. We assume that a ULA is present at the transmitter and either a UCA or ULA is located at the receiver. Simulations are performed for different values of the central AOA, decay factor, SNR and varying numbers of transmit/receive antennas.

In the first scenario, 4-element array antennas are used at both the transmitter and receiver of a MIMO system. The spacing d between adjacent elements of ULA or the radius R of UCA at transmitter is set at 0.5 wavelength (λ). To reduce the antenna mutual coupling (which is neglected here) and correlation, d and R can be made larger than 0.5λ. Figures 3, 4, 5 and 6 show BER as a function of SNR for both UCA and ULA for three values of decay factor a, and for the central AOA equal to 0°, 30°, 60° and 90°. The modulation schemes are noncoherent BFSK and DBPSK.

The presented results indicate that BER decreases when SNR increases. At a higher decay factor, BER performances are worse. This can be explained by the fact that a larger decay factor corresponds to a smaller angle spread (AS) indicating a higher spatial correlation level. BER performances are degraded due to correlation.

In Figures 3 and 4, one can see that for the central AOA of 0° and 30° BER for both BFSK and DBPSK for ULA are better than for UCA for the three chosen values of decay factor; the BER performance of DBPSK is always better than BFSK.

Figure 3. Noncoherent FSK and BPSK BER of UCA and ULA vs SNR at central AOA = 0° for three values of decay factor a of 3, 10 and 30.

Figure 5. Noncoherent FSK and BPSK BER of UCA and ULA vs SNR at central AOA = 60° for three values of decay factor a of 3, 10 and 30.

However, when the central AOA is increased to 60° and 90° an opposite result is observed in Figures 5 and 6. In the latter case, performance of UCA is superior in comparison with ULA. These opposite trends indicate that at a certain value of central AOA, the performances of UCA and ULA should be equal.

In order to determine the cross point (for BER) further simulations are performed. The results are shown in Figure 7. BER is presented in units of dB. One can see in Figure 7 that BER for ULA increases when the central AOA increases. This is because the ULA’s spatial correlation level increases as the central AOA gets larger, while the BER curve for UCA is almost constant through the central AOA range. The cross point is between approximately AOA = 40° and AOA = 50°. To the left of the cross point, BER of ULA is lower than the one for UCA. In turn, on the right hand side, UCA’s performance is better.

Figure 4. Noncoherent FSK and BPSK BER of UCA and ULA vs SNR at central AOA = 30° for three values of decay factor a of 3, 10 and 30.

Figure 6. Noncoherent FSK and BPSK BER of UCA and ULA vs SNR at central AOA = 90° for three values of decay factor a of 3, 10 and 30.

Using the earlier described settings for the 4x4 MIMO, the number of receiving antenna elements is increased and then the spatial correlation patterns and channel capacity versus the number of transmit and receive antennas are simulated.

Because of a usually small size of the mobile station, the number of transmitting antennas is limited. This is not the case for the base station, which offers a larger available area where more antennas can be added. Here, the number of antenna elements in a ULA is assumed to increase along the line with the same spacing d, as shown in Figure 8(a). In turn, for a UCA the number of antenna elements with the same spacing d increases on the circle, as shown in Figure 8(b). In the UCA case, when the number of elements on the circle increases with spacing d unchanged, the radius R increases correspondingly.

Figure 7. Noncoherent BFSK & DBPSK BER of UCA and ULA vs central AOA (γ = 15 dB, a = 3, four receive antennas).

Figure 8. ULA and UCA antenna arrays: (a) ULA, (b) UCA.

Figure 9 presents the spatial correlation between antenna elements 1 and 2 for receiving UCA and ULA for the new settings. From Figure 9(a), one can see that when the number of antenna elements increases the spatial correlation for UCA varies from 0 to 1. However, the variation due to an increased number of antenna elements is very small. For the ULA case, the spatial correlation level of receiving antennas is unchanged when the number of antennas increases.

Similarly to Figures 3–6, Figures 10, 11 and 12 show the results of BER for a different number of receiving antenna elements for the cases of ULA and UCA receiving antennas. An increase in the number of antenna elements in ULA and UCA brings improvement to the BER performance. The cross points between red curves representing ULA and blue curves standing for UCA move to the right from approximately 40° to 50° as the number of antennas increases.

Figure 9. Spatial correlation |R(2,1)| between antenna 1 and antenna 2 versus d/λ for different numbers (4, 6, 8 and 10) of antenna elements in receiving UCA (A) and ULA (B) antenna arrays at central AOA of 30° and decay factor a of 3.

Figure 10. ULA vs UCA BER with different number of Rx antenna array elements at decay factor a equal to 3 and central AOA equal to 0°.

4. MIMO Channel Capacity with Perfect Knowledge of Channel Matrix

4.1. MIMO Channel Capacity & EDOF

If CSI is perfectly known at the receiver but unknown at the transmitter, the capacity of a MIMO system with Mr receive antennas and Mt transmit antennas can be determined using [10,11]:

$$C = E\left(\log_2 \det\left[ I_{M_R} + \frac{\rho}{M_t}\, H H^H \right]\right) \tag{22}$$

where $(\cdot)^H$ stands for the transpose-conjugate; ρ is the total transmitted SNR.

An alternative expression for the capacity in such a case can be obtained by decomposing the channel into n = min(Mr, Mt) virtual single input single output (SISO) sub-channels, and can be shown to be given as (23),

$$C = \sum_{i=1}^{n} \log_2\left(1 + \frac{\rho}{n}\,\lambda_i\right) \tag{23}$$

where the gains of sub-channels are represented by the eigenvalues $\lambda_i$ of the channel correlation matrix $HH^H$.

Here, it is assumed that the transmitted power is equally allocated to each sub-channel, which is easy to accomplish in practice. The channel capacity can be further maximized by applying power allocation schemes such as ‘water-filling’. However, this is not easy to implement, as CSI is required at the transmitter, which must be sent from the receiver to the transmitter.

It has to be noted that the MIMO channel capacity can be related to the channel effective degree of freedom (EDOF) [12]. In order to determine EDOF, the channel matrix properties and the signal to noise ratio (SNR) are required. According to [12], the EDOF is defined as:

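$$EDOF = \left.\frac{d}{d\delta}\, C(2^{\delta}\rho)\right|_{\delta = 0} \tag{24}$$

Given the eigenvalues of the channel correlation $HH^H$, it can be rewritten as

$$\left.\frac{d}{d\delta}\, C(2^{\delta}\rho)\right|_{\delta = 0} = \left.\frac{d}{d\delta} \sum_{i=1}^{n} \left[\log_2\left(1 + \frac{2^{\delta}\rho\,\lambda_i}{n}\right)\right]\right|_{\delta = 0} = \sum_{i=1}^{n} \frac{\rho\lambda_i/n}{1 + \rho\lambda_i/n} \tag{25}$$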
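The step from the derivative definition of EDOF to the closed-form sum can be verified numerically. The sketch below evaluates the equal-power capacity sum C(ρ) = Σ log2(1 + ρλ_i/n), differentiates C(2^δ ρ) at δ = 0 by central differences, and compares the result with Σ (ρλ_i/n)/(1 + ρλ_i/n). The eigenvalues in the usage lines are made-up illustrative numbers, not results from the paper, and the function names are hypothetical.

```python
import math


def capacity_equal_power(eigenvalues, rho):
    """Capacity sum over sub-channels with equal power allocation (bit/s/Hz)."""
    n = len(eigenvalues)
    return sum(math.log2(1.0 + rho * lam / n) for lam in eigenvalues)


def edof_closed_form(eigenvalues, rho):
    """Closed-form EDOF: each sub-channel contributes c/(1+c), c = rho*lambda_i/n."""
    n = len(eigenvalues)
    return sum((rho * lam / n) / (1.0 + rho * lam / n) for lam in eigenvalues)


def edof_numeric(eigenvalues, rho, step=1e-5):
    """Central-difference estimate of d/d(delta) C(2**delta * rho) at delta = 0."""
    up = capacity_equal_power(eigenvalues, rho * 2.0 ** step)
    down = capacity_equal_power(eigenvalues, rho * 2.0 ** (-step))
    return (up - down) / (2.0 * step)


if __name__ == "__main__":
    eigs = [3.2, 0.6, 0.15, 0.05]        # hypothetical eigenvalues of H H^H
    rho = 10.0 ** (15.0 / 10.0)          # 15 dB
    print(capacity_equal_power(eigs, rho))
    print(edof_closed_form(eigs, rho), edof_numeric(eigs, rho))
```

Because each term c/(1+c) lies between 0 and 1, the EDOF can never exceed n, and it approaches n only when every ρλ_i/n is much larger than one.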

It is apparent that when ρλi/n >> 1, (25) is approximately equal to n and EDOF becomes maximum. In this case, every sub-channel is useful to transmit signals. In turn, when ρλi/n < 1, EDOF is smaller than n and some sub-channels are not efficient to transmit signals. The reduced EDOF can be due to an increased level of channel correlation and decreased SNR.

4.2. Numerical Results

Based on the presented theory, the channel EDOF and capacity are simulated for a 4x4 MIMO system. The spacing d between adjacent elements of ULA or the radius R of UCA at transmitter is set at 0.5 wavelength (λ). Figures 13, 14, 15 and 16 show EDOF and capacity as a function of SNR for both UCA and ULA for three values of decay factor a, and for the central AOA equal to 0°, 30°, 60° and 90°.

The presented results reveal that both EDOF and capacity increase when SNR increases. At a higher decay factor, both EDOF and capacity are lower. This is because a larger decay factor corresponds to a smaller angle spread (AS), indicating a higher spatial correlation level. EDOF and capacity are degraded due to correlation.

In Figures 13 and 14, one can see that for the central AOA of 0° and 30° both EDOF and capacity for ULA are higher than for UCA for the three chosen values of decay factor.

Figure 11. ULA vs UCA BER with different number of Rx antenna array elements at decay factor a equal to 3 and central AOA equal to 90°.

Figure 12. ULA vs UCA BER vs central AOA with different number of Rx antenna array elements at decay factor a equal to 3 (γ = 15 dB).

Figure 13. EDOF (A) and capacity (B) of UCA and ULA vs SNR at central AOA = 0° for three values of decay factor a of 3, 10 and 30.

Figure 15. EDOF and capacity of UCA and ULA vs SNR at central AOA = 60° for three values of decay factor a of 3, 10 and 30.

When the central AOA is increased to 60° and 90° an opposite result is observed in Figures 15 and 16. In the latter case, performance of UCA is superior in comparison with ULA.

To determine the cross point (for EDOF or capacity) further simulations are performed. The results are shown in Figure 17. One can see in Figure 17 that both EDOF and capacity decrease for the case of ULA when the central AOA increases at two different SNR. This is because the ULA’s spatial correlation level increases as the central AOA gets larger. This degrades channel capacity. The cross point is between AOA = 40° and AOA = 50°. To the left of the cross point, EDOF and capacity of ULA are higher than for UCA. In turn, on the right hand side, UCA’s performance is better.

Figure 14. EDOF and capacity of UCA and ULA vs SNR at central AOA = 30° for three values of decay factor a of 3, 10 and 30.

Figure 16. EDOF and capacity of UCA and ULA vs SNR at central AOA = 90° for three values of decay factor a of 3, 10 and 30.

Figures 18, 19 and 20 show the results for channel capacity for a different number of receiving antenna elements for the cases of ULA and UCA receiving antennas. An increase in the number of antenna elements in ULA and UCA brings improvement to the channel capacity. The cross points between red curves representing ULA and blue curves standing for UCA move to the right as the number of antennas increases. When the number of receiving antenna elements is 10, the capacity of ULA is superior to UCA for the central AOA of 0° to 50°.

5. MIMO Channel Estimation

For the training based channel estimation method, the relationship between the received signals and the training sequences is given by Equation (26) as

$$Y = HP + V \tag{26}$$

Figure 17. EDOF and capacity of UCA and ULA vs central AOA (a = 3, SNR = 10 dB and 15 dB).

Figure 18. ULA vs UCA capacity with different number of Rx antenna array elements at decay factor a equal to 3 and central AOA equal to 0°.

Figure 19. ULA (blue lines) vs UCA (red lines) capacity for different number of Rx antenna array elements at decay factor a equal to 3 and central AOA equal to 90°.

Figure 20. ULA vs UCA capacity vs central AOA with different number of Rx antenna array elements at decay factor a equal to 3 (SNR = 15 dB).

Here the transmitted signal s(t) in (9) is replaced by P, which represents the Mt x L complex training matrix (sequence), where L is the length of the training sequence. The goal is to estimate the complex channel matrix H from the knowledge of Y and P. The transmitted power in the training mode is assumed to be given by a constant value. According to [13] and [14], the estimation using the SLS or MMSE method requires orthogonality of the training matrix P. In the undertaken analysis, the training matrix P is assumed to satisfy this condition.

The performance of the SLS method can be obtained by scaling up the results from the least square (LS) method. Using the LS method, the estimated channel can be written as [13,14],

$$\hat{H}_{LS} = Y P^{\dagger} \tag{27}$$


where $(\cdot)^{\dagger}$ stands for the pseudo-inverse. The mean square error (MSE) of the LS method is given as

$$MSE_{LS} = E\left\{ \left\| H - \hat{H}_{LS} \right\|_F^2 \right\} \tag{28}$$

in which E{·} denotes a statistical expectation. According to [13] and [14], the minimum value of MSE for the LS method is given as

$$MSE_{\min}^{LS} = \frac{M_t^2 M_r}{\rho_t} \tag{29}$$

in which ρt stands for transmitted SNR in training mode.
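A small simulation can illustrate LS estimation with an orthogonal training matrix and the resulting error level Mt²·Mr/ρt. The setup below is an assumption chosen for compactness, not the paper's simulation: a 4x4 system, a scaled 4x4 Hadamard matrix as training sequence (so that P·P^H = (power/Mt)·I), unit-variance i.i.d. channel entries, and ρt defined as total training power over noise variance. All names are hypothetical.

```python
import math
import random

HADAMARD_4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def training_matrix(total_power):
    """Mt x L = 4 x 4 training matrix with ||P||_F^2 = total_power and orthogonal rows."""
    scale = math.sqrt(total_power / 16.0)
    return [[scale * v for v in row] for row in HADAMARD_4]


def ls_estimate(y, p, total_power):
    """H_LS = Y * pinv(P); for real P with P P^T = (power/Mt) I, pinv(P) = (Mt/power) P^T."""
    mt, length = len(p), len(p[0])
    p_pinv = [[(mt / total_power) * p[j][i] for j in range(mt)] for i in range(length)]
    return matmul(y, p_pinv)


def ls_min_mse(mt, mr, rho_t):
    """Minimum LS error Mt^2 * Mr / rho_t."""
    return mt * mt * mr / rho_t


def simulate_ls_mse(rho_t, trials, rng, noise_var=1.0):
    """Monte Carlo estimate of E||H - H_LS||_F^2 for the 4x4 setup above."""
    power = rho_t * noise_var
    p = training_matrix(power)
    s_h, s_v = math.sqrt(0.5), math.sqrt(noise_var / 2.0)
    acc = 0.0
    for _ in range(trials):
        h = [[complex(rng.gauss(0, s_h), rng.gauss(0, s_h)) for _ in range(4)] for _ in range(4)]
        hp = matmul(h, p)
        y = [[hp[i][j] + complex(rng.gauss(0, s_v), rng.gauss(0, s_v)) for j in range(4)]
             for i in range(4)]
        h_hat = ls_estimate(y, p, power)
        acc += sum(abs(h[i][j] - h_hat[i][j]) ** 2 for i in range(4) for j in range(4))
    return acc / trials


if __name__ == "__main__":
    print(simulate_ls_mse(10.0, 2000, random.Random(0)), ls_min_mse(4, 4, 10.0))
```

In this sketch the LS error does not depend on the channel statistics at all, only on the noise and the training power, which is why the spatial correlation only enters once the SLS or MMSE refinements are applied.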

The SLS method reduces the estimation error of the LS method, and the improvement is given by the scaling factor γ as

$$\gamma = \frac{tr\{R_H\}}{MSE_{LS} + tr\{R_H\}} \tag{30}$$

The estimated channel matrix is represented by [13,14]

$$\hat{H}_{SLS} = \frac{tr\{R_H\}}{\sigma_n^2 M_r\, tr\{(PP^H)^{-1}\} + tr\{R_H\}}\; Y P^{\dagger} \tag{31}$$

Here, $\sigma_n^2$ is the noise power, $R_H$ is the channel correlation matrix defined as $R_H = E\{H^H H\}$, and $tr\{\cdot\}$ denotes the trace operation.

The MSE for SLS is given as [13,14]

$$MSE_{SLS} = E\left\{\|H - \hat{H}_{SLS}\|_F^2\right\} = (1-\gamma)^2\, tr\{R_H\} + \gamma^2\, MSE_{LS} \tag{32}$$

The minimized MSE of the SLS method can be written as [7,8]

$$MSE_{SLS}^{\min} = \frac{MSE_{LS}^{\min}\, tr\{R_H\}}{MSE_{LS}^{\min} + tr\{R_H\}} \tag{33}$$

By taking into account expression (23), the minimized MSE of the SLS method (27) can be rewritten as

$$MSE_{SLS} = \left[(tr\{R_H\})^{-1} + \frac{\rho_t}{M_t^2 M_r}\right]^{-1} = \left[\Big(\sum_{i=1}^{n}\lambda_i\Big)^{-1} + \frac{\rho_t}{M_t^2 M_r}\right]^{-1} \tag{34}$$

where n=min(Mr, Mt) and λi is the i-th eigenvalue of the channel correlation matrix RH.

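In the MMSE method, the estimated channel matrix is given as [13,14]

$$\hat{H}_{MMSE} = Y\left(P^H R_H P + \sigma_n^2 M_r I\right)^{-1} P^H R_H \tag{35}$$

The MSE of MMSE estimation is given as

$$MSE_{MMSE} = E\left\{\|H - \hat{H}_{MMSE}\|_F^2\right\} = tr\{R_E\} \tag{36}$$

in which $R_E$ is the estimation error correlation, written as

$$R_E = E\left\{(H - \hat{H}_{MMSE})^H (H - \hat{H}_{MMSE})\right\} = \left(R_H^{-1} + \sigma_n^{-2} M_r^{-1} P P^H\right)^{-1} \tag{37}$$

The minimized MSE for MMSE is obtained as [7,8]

$$MSE_{MMSE} = tr\left\{\left(\Lambda^{-1} + \sigma_n^{-2} M_r^{-1} Q^H P P^H Q\right)^{-1}\right\} \tag{38}$$

In (38), Q is the unitary eigenvector matrix of RH and Λ is the diagonal matrix with the eigenvalues of RH. The minimized MSE for the MMSE method, given by Equation (38), can be rewritten using the orthogonality properties of the training sequence P and the unitary matrix Q, as shown by

$$MSE_{MMSE} = tr\left\{\left(\Lambda^{-1} + \rho_t M_r^{-1} I\right)^{-1}\right\} = \sum_{i=1}^{n}\left(\lambda_i^{-1} + \rho_t M_r^{-1}\right)^{-1} \tag{39}$$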

From Equations (33), (38) and (39), one can see that the MSE of the SLS and MMSE methods depends on the channel correlation which, in turn, is affected by the transmitter and receiver spatial correlations.
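The dependence on the channel correlation can be checked numerically. The following is a minimal sketch, not the simulation code used for the figures: it evaluates Equations (29), (34) and (39) for two hypothetical eigenvalue spectra of RH with the same trace. The function names and the eigenvalue lists are illustrative assumptions.

```python
def mse_ls_min(m_t, m_r, rho_t):
    """Minimum LS error, Eq. (29): M_t^2 * M_r / rho_t (rho_t is linear, not dB)."""
    return m_t ** 2 * m_r / rho_t


def mse_sls_min(eigenvalues, m_t, m_r, rho_t):
    """Minimum SLS error, Eq. (34): depends on R_H only through its trace."""
    trace_rh = sum(eigenvalues)
    return 1.0 / (1.0 / trace_rh + rho_t / (m_t ** 2 * m_r))


def mse_mmse_min(eigenvalues, m_r, rho_t):
    """Minimum MMSE error, Eq. (39): every eigenvalue contributes its own term."""
    return sum(1.0 / (1.0 / lam + rho_t / m_r) for lam in eigenvalues)


RHO_T = 10 ** (15 / 10)                 # 15 dB training SNR, illustrative
UNCORRELATED = [4.0, 4.0, 4.0, 4.0]     # hypothetical 4x4 spectrum, trace 16
CORRELATED = [13.0, 2.0, 0.7, 0.3]      # hypothetical spectrum with the same trace

LS = mse_ls_min(4, 4, RHO_T)
SLS_UNCORR = mse_sls_min(UNCORRELATED, 4, 4, RHO_T)
SLS_CORR = mse_sls_min(CORRELATED, 4, 4, RHO_T)
MMSE_UNCORR = mse_mmse_min(UNCORRELATED, 4, RHO_T)
MMSE_CORR = mse_mmse_min(CORRELATED, 4, RHO_T)
```

Under these assumptions the SLS error is identical for both spectra, because only the trace enters (34), while the MMSE error is smaller for the more unequal (more correlated) spectrum.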

In the first instance, the SLS and MMSE channel estimation methods are assessed via computer simulations. In the undertaken simulations, the transmitter of the MIMO system is assumed to be equipped with a ULA while the receiver uses either a UCA or a ULA. The case of a 4x4 MIMO system is considered. The simulations are performed for different values of the central AOA, the decay factor a and the transmitted SNR (ρt = ρ). The other assumptions are similar to the ones already described in Subsection 3.2.

Simulations of MSE as a function of ρ (ρ=Ps/σn²) for the SLS and MMSE channel estimation are performed for two decay factors of 3 and 30, assuming central AOAs of 0°, 30°, 60° and 90°. The results are shown in Figures 21, 22, 23 and 24.

In all of the cases presented in Figures 21, 22, 23 and 24, it is apparent that MSE decreases as ρ increases for both SLS and MMSE, irrespective of the choice of decay factor. The MSE of SLS appears to be independent of the decay factor. Also, only negligible changes in the MSE of SLS are observed when UCA replaces ULA at the receiver. However, the MSE of MMSE is sensitive to the choice of decay factor and is smaller for larger decay factors.

With reference to the central AOAs of 0° and 30° in Figures 21 and 22, one can see that the MSE of MMSE for ULA is larger than for UCA.

This happens irrespective of the decay factor value. However, for central AOAs of 60° and 90°, shown in Figures 23 and 24, the opposite holds: the MSE of MMSE for the UCA is greater than when the ULA is used at the receiver.

[Plot: ULA vs UCA SLS and MMSE MSE vs SNR at central AOA = 0°. x-axis: ρ (dB); y-axis: MSE (dB); curves: ULA and UCA with SLS and MMSE at a = 3 and a = 30.]

Figure 21. MSE vs ρ for receiving ULA (blue lines) and UCA (red lines) at central AOA=0°.

[Plot: x-axis: ρ (dB); y-axis: MSE (dB); curves: ULA and UCA with SLS and MMSE at a = 3 and a = 30.]

Figure 22. MSE vs ρ for UCA and ULA at central AOA=30°.

[Plot: x-axis: ρ (dB); y-axis: MSE (dB).]

[Plot: x-axis: ρ (dB); y-axis: MSE (dB).]

[Plot: ULA vs UCA MSE vs central AOA. x-axis: central AOA (deg); y-axis: MSE (dB); curves: ULA and UCA with MMSE at ρ = 15 dB and ρ = 20 dB.]

Figure 25. MSE vs central AOA for ULA (blue line) and UCA (red line) at decay factor a equal to 3.

Figure 26. MSE vs central AOA for different numbers of antenna elements in receiving ULA and UCA at decay factor a equal to 3 and ρ equal to 20 dB.


Figure 25 shows the simulated results for MSE versus central AOA for ρ equal to 15 dB and 20 dB. When ρ is equal to 15 dB, the MSE of MMSE for ULA is larger than for UCA when the central AOA is smaller than 50°. In turn, when the central AOA is larger than 50°, the opposite situation takes place: the MSE of MMSE for UCA is larger than the one for ULA.

Similar observations are made when ρ is equal to 20 dB. However, in this case the central AOA cross point moves to about 60°.

Figure 26 shows results for MSE similar to those of Figure 25, obtained for 4, 6, 8 and 10 receiving antenna elements.

It can be seen in Figure 26 that when the receiving array includes 4 antenna elements, the channel estimation shows the best performance for both ULA and UCA. When the number of antenna elements is increased from 4 to 6, 8 and 10, the channel estimation accuracy gets worse for both the ULA and UCA cases. These results confirm our expectation that larger MIMO systems suffer from less accurate MIMO channel estimation.

6. Conclusions

In this paper, we have reported on investigations into the BER, channel capacity and channel estimation performance of a MIMO system employing a Uniform Linear Array at the transmitter and either a Uniform Circular Array or a Uniform Linear Array at the receiver. In the presented investigations, the transmitter is assumed to be surrounded by scattering objects while the receiver is postulated to be free of scatterers. The signal angle of arrival (AOA) has been assumed to follow the Laplacian distribution. The angle spread (AS) is characterized by the decay factor.

Attention has been paid to the effect of the different spatial correlation in receiving linear and circular arrays. The obtained results have shown that, for the central AOA varying from 0° to 90°, the UCA's spatial correlation pattern (as a function of antenna element spacing) is relatively constant while the ULA's spatial correlation level increases; neither the UCA's nor the ULA's spatial correlation pattern is sensitive to an increased number of array elements.

At a larger decay factor, corresponding to a smaller angular spread (and thus a higher level of spatial correlation), the BER of both FSK and BPSK increases for both the UCA and ULA receiving antenna cases. Simulation results have also presented the variation of BER as a function of the central AOA varying from 0° to 90° when the signal to noise ratio γ is equal to 15 dB. It has been shown that at γ=15 dB, the BER for ULA is lower in comparison with UCA when the central AOA is smaller than 45°.

When the central AOA becomes larger than 50°, the UCA performs better in terms of a lower BER. When the number of receiving antennas increases, the BER performance improves for both the ULA and UCA cases.

The obtained results have also shown that, for a larger decay factor, the channel capacity is reduced for both UCA and ULA receiving antennas. The 4x4 MIMO system employing the receiving ULA shows higher capacity when the central AOA is smaller than 40°. For a central AOA greater than 50° the opposite happens, and the system using UCA outperforms the one using ULA. When the number of receiving antennas increases, improvements in channel capacity are demonstrated for both ULA and UCA. The cross points of the ULA and UCA capacity curves move to the right when the number of antennas increases. When the number of receiving antennas is 10, the capacity performance for ULA is superior to that for UCA for central AOAs of 0° to 50°.

For channel estimation performance, at a larger decay factor, the MSE of training-based channel estimation methods such as SLS and MMSE is reduced for both the UCA and ULA receiving antenna cases. This agrees with the findings of [15] and [16]. Other presented results have concerned the variation of MSE as a function of the central AOA varying from 0° to 90° when the signal to noise ratio ρ is equal to 15 dB or 20 dB. It has been shown that at ρ=15 dB, the MSE of MMSE for ULA is higher in comparison with UCA when the central AOA is smaller than 50°. When the central AOA becomes larger than 50°, the ULA performs better in terms of a lower MSE. For ρ of 20 dB a similar trend has been observed, but the cross point occurs at a central AOA of 60°. When the number of receiving antennas increases, the MSE performance gets worse for both the ULA and UCA cases.

7. References

[1] J. Luo, J. R. Zeidler, and S. McLaughlin, “Performance analysis of compact antenna arrays with MRC in correlated Nakagami fading channels,” IEEE Transactions on Vehicular Technology, Vol. 50, No. 1, January 2001.

[2] X. Liu, M. E. Bialkowski, and F. Wang, “Investigations into the effect of spatial correlation on channel estimation and capacity of multiple input multiple output system,” International Journal of Communications, Network and System Sciences, Vol. 2, No. 3, June 2009.

[3] J. Tsai, R. M. Buehrer, and B. D. Woerner, “Spatial fading correlation function of circular antenna arrays with Laplacian energy distribution,” IEEE Communications Letters, Vol. 6, No. 5, pp. 178–180, May 2002.

[4] J. Tsai, R. M. Buehrer, and B. D. Woerner, “The impact of AOA energy distribution on the spatial fading correlation of linear antenna array,” Vol. 2, pp. 933–937, IEEE 5th VTC, May 2002.

[5] E. G. Larsson and P. Stoica, “Space-time block coding for wireless communication,” Cambridge University Press, 2003.

[6] C. N. Chuah, D. N. C. Tse, and J. M. Kahn, “Capacity scaling in MIMO wireless systems under correlated fading,” IEEE Transactions on Information Theory, Vol. 48, pp. 637–650, March 2002.

[7] W. C. Jakes, Microwave Mobile Communications, John Wiley & Sons, New York, 1974.

[8] J. G. Proakis, Digital Communications, 3rd Edition, McGraw-Hill, New York, 1995.

[9] M. Simon and M. S. Alouini, “A unified approach to the performance analysis of digital communication over generalized fading channels,” Proceedings of IEEE, pp. 1860–1877, September 1998.

[10] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, Vol. 10, No. 6, pp. 585–596, November 1999.

[11] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading,” IEEE Transactions on Information Theory, Vol. 45, No. 1, pp. 139–157, January 1999.

[12] D. Shiu, J. Foschini, M. J. Gans, and J. M. Kahn, “Fading correlation and its effect on the capacity of multielement antenna system,” IEEE Transactions on Communications, Vol. 48, No. 3, March 2000.

[13] M. Biguesh and A. B. Gershman, “MIMO channel estimation: Optimal training and tradeoffs between estimation techniques,” Proceedings of ICC’04, Paris, France, June 2004.

[14] M. Biguesh and A. B. Gershman, “Training-based MIMO channel estimation: A study of estimator tradeoffs and optimal training signals,” IEEE Transactions on Signal Processing, Vol. 54, No. 3, March 2006.

[15] X. Liu, M. E. Bialkowski, and S. Lu, “Investigations into training-based MIMO channel estimation for spatial correlated channels,” Proceedings of IEEE AP-S Symposium, Hawaii, USA, 2007.

[16] X. Liu, S. Lu, M. E. Bialkowski, and H. T. Hui, “MMSE channel estimation for MIMO system with receiver equipped with a circular array antenna,” Proceedings of 2007 Asia Pacific Microwave Conference, APMC2007, pp. 1–4, December 2007.



Efficient Bandwidth and Power Allocation Algorithms for Multiuser MIMO-OFDM Systems*

Jin SHU, Wei GUO
The Wireless Information Network Laboratory, University of Science and Technology of China, Hefei, China
Email: shujin,[email protected]
Received December 24, 2008; revised March 5, 2009; accepted May 26, 2009

ABSTRACT

This paper studies the problem of finding an effective subcarrier and power allocation strategy for downlink communication to multiple users in a MIMO-OFDM system with zero-forcing beamforming. The problem of minimizing the total power consumption under a constraint on the transmission rate of each user is formulated. The joint allocation problem is divided into two stages. In the first stage, the number of subcarriers that each user will get is determined based on the users’ average signal-to-noise ratio. In the second stage, the best assignment of subcarriers to users is found. Finding the optimal assignment is a complex combinatorial problem which can only be reliably solved through an Exhaustive Search (ES). Since the ES method has high computational complexity, the normalized user selection algorithm and the simplified normalized user selection algorithm are proposed to reduce the computational complexity. Simulation results show that the proposed low-complexity algorithms offer better performance than an existing algorithm.

Keywords: Multiuser, MIMO-OFDM, Adaptive Resource Allocation, QoS

1. Introduction

The multiuser MIMO-OFDM system has great potential for providing enormous capacity due to its integrated space-frequency diversity and multiuser diversity. Assuming that knowledge of the channel state information (CSI) is available at the transmitter, the performance can be further improved through adaptive resource allocation. For OFDMA systems with a single antenna, several resource allocation methods were proposed in [1–3] to minimize the total transmit power for a given QoS by utilizing the multiuser diversity in the frequency domain. [4,5] investigated the SDMA-OFDM system in an environment with multiple antennas at the base station. [4] proposed an optimal Lagrangian iteration method to maximize the system throughput under a total power constraint. Because the optimal scheme is complicated, a greedy algorithm was proposed in [5] to reduce the complexity.

Considering a multiuser MIMO-OFDM system with downlink beamforming, in which the base station is assumed to acquire perfect CSI, [7] employed the SUS (Semi-orthogonal User Selection) algorithm proposed in [6] to minimize the total transmit power while satisfying the QoS of the users. However, in [7] the size of the SDMA group was fixed; therefore, the orthogonality of the channels of the users in a group was not well guaranteed.

In order to guarantee the orthogonality of the channels of the users in a group, we propose the Normalized User Selection (NUS) algorithm. In the NUS algorithm, each user group is regarded as a virtual user, the number of users in a group is normalized to unity, and then the resource allocation schemes for OFDMA can be employed. The NUS scheme has to traverse all the user groups on each subcarrier, so the computational complexity is large when there are many users. In order to reduce the complexity further, the S-NUS (Simplified-NUS) algorithm is proposed. On each subcarrier, a user whose channel has the largest magnitude and the lowest correlation with the already selected users is selected. The spatial correlation among users is calculated as in [6]. When the number of users is large, the S-NUS algorithm can greatly reduce the complexity. In our proposed algorithms, the number of users on each subcarrier is not fixed but depends on the spatial correlation of the users. Since the number of users on each subcarrier is not constant, it is hard to count the number of subcarriers for each user. In order to count the number of subcarriers easily, we introduce the statistical weights of subcarriers. With the statistical weights, the Bandwidth Assignment Based on SNR (BABS) algorithm proposed in [2] can be applied to determine the number of subcarriers for each user. Simulation results show that both the NUS and S-NUS algorithms achieve better performance than the algorithm in [7]. Compared to the NUS algorithm, the S-NUS algorithm has lower complexity with little performance loss, and is a better choice.

*Manuscript received, 2008. This work was supported by the National Basic Research Program of China (973 program), 2007CB310602.

This paper is organized as follows. Section 2 presents the system model and the formulation of the problem. Section 3 introduces two sub-optimal resource allocation algorithms. Section 4 shows the simulation results. Finally, Section 5 concludes the paper.

Notation: $(\cdot)^T$ stands for the transpose of a matrix (vector), $(\cdot)^{\dagger}$ stands for the pseudo-inverse of a matrix, $(\cdot)^H$ stands for the conjugate transpose of a matrix, $|A|$ denotes the size of the set A, and $[A]_{k,k}$ represents the k-th diagonal element of A.

2. System Model

2.1. Channel Model and Transmit Structure

We consider a downlink MIMO-OFDM system with a base station supporting data traffic to K user terminals. The base station is equipped with M transmit antennas and each user terminal has a single receive antenna. We assume that K≥M. The frequency band is divided into N subcarriers. The channel matrix is assumed not to vary during the coherence interval T. The received signal of user k on subcarrier n can be represented as

$$y_{k,n} = h_{k,n} x_n + z_{k,n} \tag{1}$$

where $h_{k,n} \in C^{1\times M}$ is the channel gain vector of user k, whose entries are assumed to be independent and identically distributed with zero mean and unit variance, $x_n \in C^{M\times 1}$ is the transmit symbol vector from the base station antennas, and $z_{k,n}$ is the complex Gaussian noise of user k with zero mean and unit variance.

At the transmitter, we employ the zero-forcing beamforming (ZFBF) transmit strategy. In ZFBF, the transmitter selects an active user set $S_n \subset \{1,\ldots,K\}$ of size $|S_n| \le M$ to which data will be transmitted. The data symbol $s_{j,n}$ is multiplied by the beamforming vector $w_{j,n}$ as follows:

$$x_n = \sum_{j\in S_n} \sqrt{P_{j,n}}\; w_{j,n}\, s_{j,n} \tag{2}$$

Then the received signal (1) becomes

$$y_{k,n} = \sum_{j\in S_n} \sqrt{P_{j,n}}\; h_{k,n} w_{j,n}\, s_{j,n} + z_{k,n} \tag{3}$$

In [6], the beamforming vector is selected to satisfy the zero-interference condition $h_{k,n} w_{j,n} = 0$ for $j \ne k$.

Let $H_n(S_n)$ and $W_n(S_n)$ be the corresponding submatrices of $H_n = [h_{1,n}^T, \ldots, h_{K,n}^T]^T$ and $W_n = [w_{1,n}, \ldots, w_{K,n}]$, respectively. The beamforming matrix $W_n(S_n)$ can be obtained simply as the pseudo-inverse of $H_n(S_n)$:

$$W_n(S_n) = H_n(S_n)^{\dagger} = H_n(S_n)^H \left(H_n(S_n) H_n(S_n)^H\right)^{-1} \tag{4}$$
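The pseudo-inverse in (4) can be illustrated with a small numerical sketch. The code below is an assumption-laden example rather than the authors' implementation: it uses plain Python lists of complex numbers, a hypothetical 2-user, 3-antenna channel, and helper names (`zfbf_weights`, `effective_gains`) introduced only for illustration. It also evaluates the quantity $1/[(HH^H)^{-1}]_{k,k}$, which the problem formulation uses as the effective channel gain.

```python
def conj_transpose(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting for a small square complex matrix."""
    n = len(a)
    aug = [list(map(complex, row)) + [complex(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("selected user channels are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zfbf_weights(h_s):
    """Eq. (4): W(S) = H(S)^H (H(S) H(S)^H)^{-1}; rows of h_s are user channels."""
    h_h = conj_transpose(h_s)
    gram_inv = inverse(matmul(h_s, h_h))
    return matmul(h_h, gram_inv), gram_inv


def effective_gains(gram_inv):
    """1 / [(H H^H)^{-1}]_{k,k} for every selected user k."""
    return [1.0 / gram_inv[k][k].real for k in range(len(gram_inv))]


# Hypothetical channel of 2 selected users (rows) and M = 3 antennas (columns).
H_S = [[1 + 1j, 0.5j, 0.2],
       [0.3, 1 - 0.5j, 0.8j]]
W, G_INV = zfbf_weights(H_S)
HW = matmul(H_S, W)          # zero-interference check: close to the identity
GAINS = effective_gains(G_INV)
```

With these weights the product `HW` is numerically the identity, which is the zero-interference condition, and each effective gain cannot exceed the squared norm of the corresponding user's channel.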

2.2. Problem Formulation

Since ZFBF can transmit M spatial sub-streams simultaneously, at most M users can be allocated by ZFBF on each subcarrier. Let $\rho_{k,n}$ indicate whether user k is chosen on subcarrier n, and let $c_{k,n}$ denote the number of bits user k transmits on subcarrier n, so that $\rho_{k,n}=1$ if $c_{k,n}\ne 0$ and $\rho_{k,n}=0$ if $c_{k,n}=0$. The constraints are

$$\sum_{k=1}^{K} \rho_{k,n} \le M, \quad n = 1,\ldots,N \tag{5}$$

$$\sum_{n=1}^{N} c_{k,n} = R_k, \quad k = 1,\ldots,K \tag{6}$$

where $R_k$ stands for the number of bits user k wants to transmit in every symbol. Constraint (5) means that at most M users can be assigned to one subcarrier; constraint (6) means that $R_k$ bits should be transmitted per symbol for user k.

The optimization problem can be formulated as minimizing the total transmit power subject to (5) and (6):

$$\min_{c_{k,n},\,\rho_{k,n}} \sum_{n=1}^{N}\sum_{k=1}^{K} \frac{\rho_{k,n}}{\gamma_{k,n}}\, f_k(c_{k,n}) \quad \text{subject to (5), (6)} \tag{7}$$

where

$$\gamma_{k,n} = \frac{1}{\left[\left(H_n(S_n) H_n(S_n)^H\right)^{-1}\right]_{k,k}}, \quad k \in S_n$$

is the effective channel gain on subcarrier n for user k, and $f_k(c)$ stands for the transmit power required to transmit c bits when the channel gain is unity. When uncoded $2^c$-ary QAM is employed, the required transmit power can be tightly approximated as [8]:

$$f_k(c) = \frac{N_0}{1.6}\,(1 - 2^{c})\,\log(5\,BER_k) \tag{8}$$

where $BER_k$ is the bit error rate of user k, and $N_0$ is the variance of the Gaussian white noise, assumed to be unity in this paper.

3. Subcarrier and Bit Allocation

The solution of optimization problem (7) can be separated into three stages. In the first stage, the number of subcarriers required by each user is roughly determined based on the target rate and the average channel gain of each user. In the second stage, subcarriers are allocated to each user according to the number obtained in the first stage. In the third stage, bit allocation is performed over the subcarriers assigned to each user; for each user, a single-user greedy algorithm is employed to allocate bits, as in [1].

3.1. Resource Allocation

In a wireless environment, the channel state of some users will be inferior to that of others, and these users tend to need more transmit power. As shown in [1–3], more subcarriers should be assigned to the users with lower average channel gain in order to satisfy their rate constraints. Since the number of users on each subcarrier is not fixed in MIMO-OFDM systems with ZFBF, it is hard to count the number of subcarriers for each user. If every subcarrier a user occupies is counted as one, the total over all users is not a constant, so it is hard to determine whether the subcarrier requirement of each user is satisfied. Assume that when only one user transmits data on a subcarrier, the rate of the user is r; when two users transmit data on this subcarrier, the rate of each user is approximately r/2, so each user is regarded as being assigned half of a subcarrier. Likewise, when three or four users transmit data on a subcarrier, each user is regarded as being assigned one third or one fourth of a subcarrier. Therefore, we introduce the statistical weights of subcarriers. Let $S_n \subset \{1,\ldots,K\}$ be the subset of user indexes on subcarrier n and $M_n = |S_n|$. When user k is selected, its number of subcarriers increases by $1/M_n$. Accordingly, the sum of the numbers of subcarriers over all users is exactly N. In this way, the resource allocation algorithms for OFDMA can be employed for the MIMO-OFDM system considered in this paper. Assume that each user k experiences the identical channel gain

$$\frac{1}{N}\sum_{n=1}^{N}\|h_{k,n}\|^2$$

on every subcarrier, and let the total number of subcarriers for user k be $n_k$. When the channel gain of each user is identical on all subcarriers, the optimization of (7) reduces to finding $n_k$, $k = 1,\ldots,K$. The Bandwidth Assignment Based on SNR (BABS) algorithm proposed in [2] can be applied to find the solution of this problem.

3.2. Subcarrier Assignment Algorithm

Once the number of subcarriers for each user is determined, the next step is to assign the specific subcarriers to each user. The original problem (7) is modified into the problem of finding $\rho_{k,n}$:

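$$\min_{\rho_{k,n}} \sum_{n=1}^{N}\sum_{k=1}^{K} \frac{\rho_{k,n}}{\gamma_{k,n}}\, f_k\!\left(\frac{R_k}{n_k}\right) \tag{9}$$

subject to

$$\sum_{k=1}^{K} \rho_{k,n} \le M, \quad n = 1,\ldots,N \tag{10}$$

$$\sum_{n=1}^{N} \frac{\rho_{k,n}}{M_n} = n_k \tag{11}$$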
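The fractional counting behind constraint (11) can be made concrete with a short sketch. The assignment below is a hypothetical example (users are numbered from 0), and `weighted_counts` is an illustrative helper name, not part of the original algorithm description.

```python
from fractions import Fraction


def weighted_counts(assignments, num_users):
    """Left-hand side of (11): subcarrier n adds 1/M_n to every user in S_n."""
    counts = [Fraction(0)] * num_users
    for users in assignments:          # users plays the role of S_n
        for k in users:
            counts[k] += Fraction(1, len(users))
    return counts


# Hypothetical assignment of N = 4 subcarriers to 3 users with M = 2.
S = [{0}, {0, 1}, {1, 2}, {2}]
COUNTS = weighted_counts(S, 3)
TOTAL = sum(COUNTS)
```

Because every non-empty subcarrier contributes exactly one unit in total, the weighted counts of all users add up to the number of subcarriers, which is what allows an OFDMA-style subcarrier budget such as the one produced by BABS to be reused.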

In order to solve the above problem, two subcarrier assignment algorithms (NUS and S-NUS) are proposed in this paper.

Algorithm 1. Normalized User Selection Algorithm (NUS)

In the OFDMA systems considered in [1–3], the subcarriers are assigned to the users with the largest channel gain to maximize the total throughput or minimize the total transmit power. In a multiuser MIMO-OFDM system with ZFBF, the effective channel gain depends on the orthogonality of the channels of the user set assigned to a subcarrier, so assigning the subcarriers is considerably more complicated. To minimize the total transmit power, it is efficient to assign a user whose channel has the largest magnitude and the lowest correlation with the users already assigned to the subcarrier. In [7], the number of users assigned simultaneously to each subcarrier is fixed at M. However, when the total number of users is not large enough, it is difficult to select M users whose channels have low correlation with those of the users already selected. Therefore, always assigning M users simultaneously to a subcarrier is not good enough. To guarantee the orthogonality of the channels of the users in a user set, we propose the NUS algorithm, in which the user set of a subcarrier is regarded as a virtual user. In the NUS algorithm, the number of users in a user set is normalized to unity, and the best virtual user is selected on each subcarrier just as in OFDMA. Let $\varphi_{n,p}$, $1 \le p \le P$, be the p-th candidate user set on subcarrier n, with $\varphi_{n,p} \subset \{1,\ldots,K\}$, $|\varphi_{n,p}| \le M$, $\forall n$, and

$$P = \sum_{l=1}^{M}\binom{K}{l}, \qquad \binom{K}{l} = \frac{K!}{l!\,(K-l)!}.$$

The subcarrier assignment algorithm is as follows.

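Step 1. Initialization

$$U = \{1,\ldots,N\}; \quad T_n = \{1,\ldots,K\},\ n = 1,\ldots,N;$$
$$S_n^0 = \emptyset,\ \rho_{k,n} = 0,\ \forall k, n;$$
$$R_{k,ave} = \frac{R_k}{n_k},\ k = 1,\ldots,K$$

Step 2. Select the subcarrier

$$\hat{n} = \arg\min_{n\in U,\ k\in T_n} \frac{f_k(R_{k,ave})}{\|h_{k,n}\|^2}; \quad U = U - \{\hat{n}\};$$
$$\varphi_{\hat{n},p} \subset T_{\hat{n}},\ \gamma_{\hat{n},p,j},\ \forall j \in \varphi_{\hat{n},p},\ p = 1,\ldots,P$$

Step 3. Select the optimal user set

$$r_{\hat{n},p} = \sum_{j\in\varphi_{\hat{n},p}} \frac{f_j\left(R_{j,ave}/M_{\hat{n},p}\right)}{\gamma_{\hat{n},p,j}}, \quad M_{\hat{n},p} = |\varphi_{\hat{n},p}|;$$
$$\hat{p} = \arg\min_{p}\, r_{\hat{n},p},\ p = 1,2,\ldots,P;$$
$$\forall k \in \varphi_{\hat{n},\hat{p}}:\ S_{\hat{n}}^0 = S_{\hat{n}}^0 \cup \{k\},\ \rho_{k,\hat{n}} = 1$$

Step 4. Count the number of subcarriers

$$\forall k \in S_{\hat{n}}^0:\ n_k = n_k - \frac{1}{M_{\hat{n},\hat{p}}};\ \text{if } n_k \le 0,\ T_n = T_n - \{k\},\ \forall n \in U;$$
$$\text{if } U = \emptyset, \text{ finish; else go to Step 2}$$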

In Step 1, $T_n$ is the candidate user set of the n-th subcarrier, U is the candidate subcarrier set, $S_n^0$ is the selected user set of the n-th subcarrier, $R_k$ is the average number of bits user k transmits in each symbol, and $n_k$ is the number of subcarriers assigned to user k by BABS; therefore, $R_{k,ave}$ is the average number of bits per subcarrier for user k.

In Step 2, the subcarrier $\hat{n}$ with the minimum transmit power among users is selected; each subcarrier is selected only once. P is the total number of candidate user sets, $\varphi_{\hat{n},p}$ is the p-th user set of subcarrier $\hat{n}$, and $\gamma_{\hat{n},p,j}$ is the effective channel gain of user j in that user set.

In Step 3, the optimal user set is selected based on the following criterion: for the p-th user set, $M_{\hat{n},p}=|\varphi_{\hat{n},p}|$, and after normalizing the number of users, each user is equivalent to $1/M_{\hat{n},p}$ of a user. Therefore, the total transmit power of the user set is composed of the transmit power of each user transmitting $R_{k,ave}/M_{\hat{n},p}$ bits. The user set with the minimum transmit power computed in this way is selected on each subcarrier.

In Step 4, if an assigned user k has reached its required number of subcarriers, the remaining subcarriers will not be assigned to user k any more. As described in Subsection 3.1, once the p-th user set is selected, the number of subcarriers of each user in the p-th user set increases by $1/M_{\hat{n},p}$.
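The four steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' code: channels are real two-dimensional vectors, at most two users share a subcarrier (so the ZF effective gains have a closed form), the power function is taken as $2^c - 1$ (Equation (8) up to a constant factor), a single global set of active users replaces the per-subcarrier sets $T_n$, and all names and numbers are hypothetical.

```python
from itertools import combinations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def power(bits):
    """Assumed f(c): power for `bits` bits at unit gain, up to a constant factor."""
    return 2.0 ** bits - 1.0


def zf_gains(h, users):
    """Effective ZF gains for a set of 1 or 2 users with real channel vectors."""
    if len(users) == 1:
        (k,) = users
        return {k: dot(h[k], h[k])}
    j, k = users
    cross = dot(h[j], h[k]) ** 2
    return {j: dot(h[j], h[j]) - cross / dot(h[k], h[k]),
            k: dot(h[k], h[k]) - cross / dot(h[j], h[j])}


def nus(channels, rate, n_k, max_users=2):
    """channels[n][k]: channel of user k on subcarrier n; n_k: BABS-style budgets."""
    num_users = len(channels[0])
    remaining = list(n_k)
    r_ave = [rate[k] / n_k[k] for k in range(num_users)]
    unassigned = set(range(len(channels)))       # U
    active = set(range(num_users))               # users still needing subcarriers
    selected = {}
    while unassigned and active:
        # Step 2: subcarrier on which some active user needs the least power.
        n_hat = min(sorted(unassigned), key=lambda n: min(
            power(r_ave[k]) / dot(channels[n][k], channels[n][k]) for k in active))
        unassigned.remove(n_hat)
        # Step 3: candidate user set with the smallest normalized power.
        best, best_cost = None, float("inf")
        for size in range(1, max_users + 1):
            for users in combinations(sorted(active), size):
                gains = zf_gains(channels[n_hat], users)
                if min(gains.values()) <= 1e-12:
                    continue                      # degenerate (collinear) set
                cost = sum(power(r_ave[k] / size) / gains[k] for k in users)
                if cost < best_cost:
                    best, best_cost = users, cost
        selected[n_hat] = best
        # Step 4: every chosen user is credited 1/M of a subcarrier.
        for k in best:
            remaining[k] -= 1.0 / len(best)
            if remaining[k] <= 1e-9:
                active.discard(k)
    return selected


# Hypothetical example: 3 subcarriers, 2 users, budgets of 1.5 subcarriers each.
CHANNELS = [
    [(1.5, 0.0), (0.0, 1.5)],    # orthogonal channels: sharing is cheap
    [(1.2, 0.0), (0.8, 0.6)],    # correlated channels: sharing is expensive
    [(0.5, 0.0), (0.0, 1.0)],
]
RESULT = nus(CHANNELS, rate=[3.0, 3.0], n_k=[1.5, 1.5])
```

In this example the first subcarrier is shared by both users, while the second, on which the channels are strongly correlated, is given to a single user; the group size therefore adapts to the channel orthogonality instead of being fixed at M.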

Algorithm 2. Simplified Normalized User Selection Algorithm (S-NUS)

In the NUS subcarrier assignment algorithm, Step 2 and Step 3 need to traverse all the candidate user sets on each subcarrier, which is complicated when the number of users is large. In order to lower the complexity further, the S-NUS algorithm is proposed. It first selects the user with the largest channel gain and then selects further users with large channel gain and low correlation with the already selected users. In [6], it is shown that this algorithm asymptotically achieves the performance of DPC as the number of users increases. The subcarrier assignment algorithm is shown in Figure 1.

Step 1 is the same as in NUS.

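[Flow chart of the S-NUS subcarrier assignment. Initialization: as in Step 1 of NUS. Subcarrier selection: $\hat{n} = \arg\min_{n \in U,\, k \in T_n} f_k(R_{k,ave}) / \|h_{k,n}\|^2$, $U = U - \{\hat{n}\}$, $i = 1$. Orthogonal component: $g_{k,\hat{n},(i)} = h_{k,\hat{n}} \left( I - \sum_{j=1}^{i-1} g_{(j)}^H g_{(j)} / \|g_{(j)}\|^2 \right)$ for each $k \in T_{\hat{n}}$. User selection: $\hat{k} = \arg\min_{k \in T_{\hat{n}}} f_k(R_{k,ave}) / \|g_{k,\hat{n},(i)}\|^2$, then $g_{(i)} = g_{\hat{k},\hat{n},(i)}$, $h_{(i)} = h_{\hat{k},\hat{n}}$, $S_{\hat{n}}^0 = S_{\hat{n}}^0 \cup \{\hat{k}\}$, $T_{\hat{n}} = T_{\hat{n}} - \{\hat{k}\}$, $\rho_{\hat{k},\hat{n}} = 1$. Pruning: every $l \in T_{\hat{n}}$ whose normalized correlation $|h_{l,\hat{n}}\, g_{(i)}^H| / (\|h_{l,\hat{n}}\|\, \|g_{(i)}\|)$ exceeds the semi-orthogonality threshold is removed from $T_{\hat{n}}$. The loop repeats with $i = i + 1$ while $i \le M$ and $T_{\hat{n}}$ is non-empty; the subcarrier counts $n_k$ of the selected users are then updated, and the procedure returns to subcarrier selection until $U$ is empty.]

Figure 1. Flow chart of the subcarrier assignment algorithm.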
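The user selection loop of Figure 1 on a single subcarrier can be sketched as follows. This is a minimal illustration in the spirit of the semi-orthogonal selection of [6], not the authors' implementation; the function name `select_users`, the dictionary-based inputs, the `threshold` parameter name and the example channels are all assumptions.

```python
def inner(a, b):
    """Inner product a b^H of two row vectors."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))


def norm2(a):
    return inner(a, a).real


def select_users(h, power_needed, threshold, max_users):
    """Greedy semi-orthogonal user selection on one subcarrier.

    h[k]: channel vector of user k; power_needed[k]: f_k(R_k,ave) at unit gain;
    threshold: largest normalized correlation a remaining user may have.
    """
    candidates = set(h)        # T: users still eligible
    basis = []                 # g_(1), ..., g_(i-1), mutually orthogonal
    chosen = []
    while candidates and len(chosen) < max_users:
        # Component of every candidate channel orthogonal to the chosen directions.
        g = {}
        for k in candidates:
            v = [complex(x) for x in h[k]]
            for b in basis:
                coeff = inner(h[k], b) / norm2(b)
                v = [vi - coeff * bi for vi, bi in zip(v, b)]
            g[k] = v
        # Pick the user needing the least power on its orthogonal component.
        k_hat = min(sorted(candidates),
                    key=lambda k: power_needed[k] / max(norm2(g[k]), 1e-12))
        g_i = g[k_hat]
        basis.append(g_i)
        chosen.append(k_hat)
        candidates.discard(k_hat)
        # Drop users that are too correlated with the newly chosen direction.
        candidates = {
            l for l in candidates
            if abs(inner(h[l], g_i)) / (norm2(h[l]) ** 0.5 * norm2(g_i) ** 0.5)
            <= threshold
        }
    return chosen


# Hypothetical channels: user 1 is nearly collinear with user 0, user 2 is not.
H = {0: [2.0, 0.0], 1: [1.8, 0.3], 2: [0.1, 1.5]}
P = {0: 3.0, 1: 3.0, 2: 3.0}
GROUP = select_users(H, P, threshold=0.5, max_users=2)
PAIR = {0: H[0], 1: H[1]}
GROUP_STRICT = select_users(PAIR, P, threshold=0.5, max_users=2)
GROUP_LOOSE = select_users(PAIR, P, threshold=1.0, max_users=2)
```

With a threshold of 1 no user is ever pruned, so the group always grows to `max_users`; with a smaller threshold the group can stay smaller when only correlated users remain, which is the behaviour the proposed algorithms rely on.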


Figure 2. Required average SNR vs. different values of the semi-orthogonality threshold with M=2.

Figure 3. Required average SNR vs. different values of the semi-orthogonality threshold with M=4.

Step 2, selecting the subcarrier, is the same as in NUS.

In Step 3, $g_{k,\hat{n},(i)}$ is the component of $h_{k,\hat{n}}$ orthogonal to the subspace spanned by $\{g_{(1)},\ldots,g_{(i-1)}\}$; when i=1, this implies $g_{k,\hat{n},(i)} = h_{k,\hat{n}}$. [6] indicates that $g_{(i)} \approx h_{(i)}$ when the orthogonality of the channels of the users is good enough. In each iteration, the user $\hat{k} \in T_{\hat{n}}$ that needs the minimum transmit power to transmit $R_{\hat{k},ave}$ bits is chosen.

In Step 4, the remaining users whose channels are not semi-orthogonal to that of the $\hat{k}$-th user are dropped. The semi-orthogonality threshold is a positive constant [6]. In ZFBF, selecting a non-orthogonal user degrades the effective channel gain of the other users. Therefore, enforcing semi-orthogonality among users not only improves the performance of the system but also reduces the complexity of the algorithm.

In Step 5, it is checked whether the required number of subcarriers is satisfied, and the number of subcarriers is counted as in NUS.

3.3. Algorithmic Complexity

In this section, the worst-case complexity of each algorithm is studied as a function of the number of transmit antennas M, the number of users K and the number of subcarriers N. The optimal method for subcarrier allocation requires an exhaustive search, so the computational complexity is

$$O\left(\binom{K}{M}^{N}\right).$$

The computational complexity of the algorithm in [7] is $O(M^2KN^2)$. The NUS algorithm needs to traverse all candidate user sets on each subcarrier; traversing all the candidate user sets needs

$$O\left(M^2\binom{K}{M}\right)$$

and selecting the subcarrier needs O(N) on each subcarrier, so the computational complexity is

$$O\left(M^2N^2\binom{K}{M}\right).$$

The S-NUS algorithm is similar to the algorithm in [7] but simpler; its computational complexity is $O(M^2KN)$.

4. Numerical Results

The performance of the proposed algorithms is investigated in this section. An OFDM system with 128 subcarriers is considered. We assume that the channels of the antennas of each user are independent and identically distributed and experience frequency-selective fading. The sum target bit rate of the users is 512 bits/symbol and the target rate of each user is identical. For adaptive bit loading, QPSK, 16QAM, 64QAM and no data transmission are adopted here. When uncoded $2^c$-ary QAM is employed, the required average SNR can be tightly approximated as [8]

$$f_k(c) = \frac{(1 - 2^{c})\,\log(5\,BER_k)}{1.6}.$$
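As a rough numerical illustration of this approximation, the sketch below evaluates the required SNR for the three constellations used in the simulations. It assumes that log denotes the natural logarithm, which matches the common approximation BER ≈ 0.2·exp(−1.6·SNR/(2^c − 1)); the target BER of 10⁻³ and the function names are illustrative choices, not values taken from the paper.

```python
import math


def required_snr(bits, ber):
    """Required linear SNR for uncoded 2^bits-ary QAM at bit error rate `ber`.

    Uses (1 - 2^c) * ln(5 * BER) / 1.6, which is positive only for BER < 0.2.
    """
    if bits == 0:
        return 0.0                     # no data transmission needs no power
    return (1.0 - 2.0 ** bits) * math.log(5.0 * ber) / 1.6


def to_db(x):
    return 10.0 * math.log10(x)


TARGET_BER = 1e-3                       # hypothetical target, for illustration only
SNR_QPSK = required_snr(2, TARGET_BER)
SNR_16QAM = required_snr(4, TARGET_BER)
SNR_64QAM = required_snr(6, TARGET_BER)
SNR_DB = [to_db(s) for s in (SNR_QPSK, SNR_16QAM, SNR_64QAM)]
```

The required SNR scales with $2^c - 1$, so each step from QPSK to 16QAM to 64QAM costs several dB; this is why sharing a subcarrier among users with fewer bits each can lower the total power when their channels are nearly orthogonal.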

Figure 2 shows the required average SNR versus the semi-orthogonality threshold when the number of transmit antennas is 2. It is seen that the system achieves the best performance when the value of the threshold is 0.65. Figure 3 shows the required average SNR versus the threshold when the number of transmit antennas is 4. It is seen that the system achieves the best performance when the value of the threshold is in the range [0.35, 0.50]. The performance gap between different threshold values is very large, so it is important to choose a suitable value. Since the value of the threshold in [7] is 1, the performance of that algorithm is inferior to that of the proposed algorithms.

Figure 4 and Figure 5 show the performance of the proposed algorithms compared with the algorithm in [7] and a non-adaptive algorithm when the number of transmit antennas is 2 and the number of users is 4 and 8. Figure 6 and Figure 7 show the same comparison when the number of transmit antennas is 4 and the number of users is 4 and 8. The value of the semi-orthogonality threshold for the S-NUS algorithm is 0.65 and 0.4 in the two antenna configurations, respectively. In the non-adaptive method, each subcarrier is assigned to only one user, in a fixed order. From Figure 4 to Figure 7, it is seen that, compared with the algorithm in [7], both BABS+NUS and BABS+S-NUS achieve a significant performance improvement.

The S-NUS method first selects the user with the minimum transmit power when transmitting Rk,ave bits, and then selects users with large channel gain and low correlation with the already selected users. However, there may be a user set in which the channel gains of the users are not as large but the orthogonality among the users is better; such a user set may be a better choice. The NUS algorithm can select this better user set; hence, the performance of NUS is superior to that of S-NUS. Compared with the NUS method, however, the S-NUS method has only a small performance loss at a lower computational complexity, so the S-NUS method is a better choice when the number of users is very large. In addition, because multiuser diversity is exploited, the required average SNR decreases as the number of users increases.

Figure 4. BER vs. required average SNR with M=2 when the number of users is 4.





5. Conclusions

Two suboptimal algorithms for subcarrier and power allocation among users in a MIMO-OFDM system have been described in this paper. Dividing the problem into two stages enabled the design of algorithms with low computational complexity that operate well in our simulations. The NUS algorithm performs better but is more complex, whereas the S-NUS algorithm offers a good trade-off between performance and complexity. The numerical results show that both proposed algorithms achieve better performance than the algorithm in [7] while the computational complexity is almost the same. Moreover, if the resource allocation method for MIMO-OFDM systems is divided into two stages as in this paper, SDMA (Space-Division Multiple Access) grouping algorithms for MIMO systems can be employed. For example, the SUS (Semi-Orthogonal User Selection) algorithm [6] is employed in the S-NUS algorithm. As a next step, we will investigate more SDMA grouping algorithms and use them for resource allocation in MIMO-OFDM systems.

In addition, ZFBF is power inefficient because the beamforming weights are not matched to the user channels. Therefore, the problem of resource allocation employing more efficient techniques, such as MMSE-BF (Minimum Mean Square Error Beamforming) and RBF (Random Beamforming), needs to be further explored.

6. References

[1] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, “Multiuser OFDM with adaptive subcarrier, bit, and power allocation,” IEEE Journal on Selected Areas in Communications, Vol. 17, pp. 1747–1758, October 1999.

[2] D. Kivanc, G. Q. Li, and H. Liu, “Computationally efficient bandwidth allocation and power control for OFDMA,” IEEE Transactions on Wireless Communications, Vol. 2, pp. 1150–1158, November 2003.

[3] I. Kim, I. S. Park, and Y. H. Lee, “Use of linear programming for dynamic subcarrier and bit allocation in multiuser OFDM,” IEEE Transactions on Vehicular Technology, Vol. 55, pp. 1195–1207, July 2006.

[4] Y. M. Tsang and R. S. K. Cheng, “Optimal resource allocation in SDMA/multi-input-single-output/OFDM systems under QoS and power constraints,” in Proceedings of WCNC 2004, pp. 1595–1600, 2004.

[5] P. W. C. Chan and R. S. K. Cheng, “Reduced-complexity power allocation in zero-forcing MIMO-OFDM downlink system with multiuser diversity,” in Proceedings of ISIT 2005, pp. 2320–2324, 2005.

[6] T. Yoo and A. Goldsmith, “On the optimality of multiantenna broadcast scheduling using zero-forcing beamforming,” IEEE Journal on Selected Areas in Communications, Vol. 24, pp. 528–541, March 2006.

[7] Y. Shin, T. S. Kang, and H. M. Kin, “An efficient resource allocation for multiuser MIMO-OFDM systems with zero-forcing beamformer,” in Proceedings of PIMRC 2007, pp. 1–5, 2007.

[8] S. T. Chung and A. J. Goldsmith, “Degree of freedom in adaptive modulation: A unified view,” IEEE Transactions on Communications, Vol. 49, pp. 1561–1571, September 2001.




Two-Dwell Synchronization Techniques and MIMO Systems for Performance Improvements of 3G Mobile Communications

F. BENEDETTO, G. GIUNTA

Digital Signal Processing, Multimedia, and Optical Communications Laboratory, Department of Applied Electronics, University of ROMA TRE, Rome, Italy

Email: [email protected]

Received April 20, 2009; revised June 3, 2009; accepted July 29, 2009

ABSTRACT

This paper considers the case of smart antennas and multiple inputs multiple outputs (MIMO) systems, suited for the radio access of 3G mobile communications, involving two-dimensional spatio-temporal signal processing and two-dwell procedures. The main novelty of our work is twofold: first, a two-dwell acquisition technique is used to reduce the mean acquisition time with respect to one-dwell acquisition techniques; second, the searching procedure is driven by estimates of the local signal-to-interference-plus-noise ratio, further reducing the mean acquisition time. Some examples of application to the detection of 3G communication signals in typical mobile scenarios are provided, and we have verified the effectiveness of the analyzed spatio-temporal two-dwell procedures. The presented technique seems to constitute a promising tool for the analytic setting of near-optimum spatio-temporal acquisition testing procedures based on serial search/verification modes.

Keywords: Initial Code Acquisition and Synchronization, MIMO Systems, Probability Tests, Spatio-Temporal Signal Processing, Mobile Communications

1. Introduction

Direct-Sequence Spread-Spectrum Code-Division Multiple-Access (DS/SS CDMA) has gained significant importance in mobile communications over the last few years [1]. It has emerged as the upcoming standard for third-generation radio-mobile transmission technology [2], since it offers significant advantages in terms of channel capacity, mobile power consumption, link quality and resilience to multi-path propagation [3]. However, in order to exploit the advantages of a DS/SS signal in a CDMA system, receivers must first be able to synchronize the local pseudo-noise (PN) code with the incoming PN code [4,5]. Usually, the problem is solved via a two-step approach: first, an initial code acquisition is realized, which synchronizes the transmitter and receiver to within an uncertainty of a chip period; second, a code tracking phase is carried out, which performs and maintains fine synchronization between transmitter and receiver [4].

The synchronization process in Wideband Code-Division Multiple-Access (W-CDMA) [2] (i.e. the process in which the mobile station searches for cell and scrambling codes and their timing) consists of five sequential steps: 1) slot synchronization, 2) frame synchronization and scrambling code group identification, 3) scrambling code identification, 4) frequency acquisition, and 5) cell identification. This contribution addresses algorithms for the initial cell search consisting of slot synchronization (i.e. the first stage), under a possible systematic frequency error. The combined goal of such stages is to deliver a reliable code-time candidate to the frequency acquisition stage.

The process of searching for a cell and synchronizing to its downlink scrambling code is often referred to as cell search [2]. Cell search is necessary after the mobile station has switched on (initial search), and during idle and active modes, for identifying new camping cells or handover candidates, respectively. Idle and active mode search is also called target cell search. The performance of cell search impacts the perceived switch-on delay (initial search), stand-by time (idle mode search), and link quality (active mode search), and thus is important to mobile station design.

Conventional initial acquisition methods in the radio access of third generation (3G) mobile communications are based on serial multi-dwell hypothesis tests, based on non-coherent correlation from a number of data blocks. The test sequentially searches for the most likely codes and their optimum timing shift as reliable candidates for code (and code offset) acquisition [2,6]. The constant false alarm rate (CFAR) criterion, often employed to perform effective tests [7], is adopted to determine the threshold value.

This work considers the case of smart antennas and multiple inputs multiple outputs (MIMO) systems, suited for the radio access of 3G mobile communications [8,9]. The presented technique involves two-dimensional (2D) spatio-temporal signal processing and two-dwell procedures. The main novelty of the presented procedure is twofold: first, a two-dwell acquisition technique is used to reduce the mean acquisition time with respect to one-dwell acquisition techniques; second, the searching procedure is driven by estimates of the local signal-to-interference-plus-noise ratio (SINR), further reducing the mean acquisition time.

This paper is organized as follows. In Section 2, the basic outlines of two-dwell non-coherent code acquisition techniques are presented, while in Section 3 the analysis is extended to the case of smart antennas and MIMO systems, involving 2D spatio-temporal signal processing. In particular, we first discuss the spatial coverage of the customers, assuming an adaptive management of call admission from distinct angular sectors. Section 4 shows the analytic performance expressed by using the Generalized-Q (GQ) functions, and provides applications to typical mobile operating scenarios. The conclusions of the paper are finally drawn in Section 5.

2. Frameworks of Two-Dwell Non-Coherent Acquisition

The two opposite cases of acquired or mismatched code offset are often referred to as in-sync and out-of-sync conditions. These cases differ because the output of a matched filter is ideally constant in the former condition, while it randomly varies in the latter one. In fact, it is well known that the user codes employed are orthogonal only if the users are chip-synchronized with each other. In practice, any pair of codes may present a relevant cross-correlation for nonzero chip offset. Such a residual correlation acts as a random variable (the codes are usually modulated by independent data streams), characterized by a noise-plus-interference variance depending on the effective time synchronization. Testing for the presence of useful signal should discriminate between the following two hypotheses:

- H1: presence of signal, in-sync case;
- H0: absence of signal, out-of-sync case.

Hence, the multi-dwell sequential testing procedure [10,11] must decide on the presence (H1 hypothesis) or absence (H0 hypothesis) of useful signal from a set of N testing variables (N = number of dwells, with N = 2 in the application presented herein). The H1 hypothesis is accepted if all the testing variables exceed a corresponding set of N thresholds. Conversely, if at least one of the N variables assumes a lower value, the H1 hypothesis is rejected. In practice, the fairly frequent H0 hypothesis can be detected more quickly from the test of a few variables. This usually reduces the overall mean decision (signal acquisition) time.
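The decision rule just described can be illustrated with a minimal sketch. It assumes that the correlation magnitudes are already available as a list (a real receiver would compute each one only when the previous dwell has passed), and the function name and numerical values are hypothetical.

```python
from typing import Sequence, Tuple


def multi_dwell_test(magnitudes: Sequence[float],
                     thresholds: Sequence[float]) -> Tuple[bool, int]:
    """Return (accept_H1, dwells_used) for one tested code-offset cell.

    H1 is accepted only if every testing variable exceeds its threshold;
    the first variable that fails stops the test and rejects H1.
    """
    if len(magnitudes) != len(thresholds):
        raise ValueError("one threshold per dwell is required")
    for dwell, (r, v) in enumerate(zip(magnitudes, thresholds), start=1):
        if r <= v:
            # A single variable below its threshold is enough to reject H1.
            return False, dwell
    return True, len(thresholds)


# Hypothetical two-dwell example (N = 2): |r1|, |r2| against V1, V2.
print(multi_dwell_test([0.2, 0.9], [0.5, 0.7]))  # rejected at the first dwell
print(multi_dwell_test([0.8, 0.9], [0.5, 0.7]))  # both dwells passed
```

The early return is what makes the frequent H0 case cheap: most out-of-sync cells are dismissed after the first (short) dwell only.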

The N testing variables consist of the complex correlator outputs r1, r2, …, rN, taken at N different times, between a given number of samples of each reference signal and the received signal, which is assumed to be corrupted by additive independent Gaussian noise. In fact, a large number of samples is required in the presence of very weak signals, such as in the case of spread-spectrum systems that need large processing gains. In such cases, the Gaussian model is asymptotically valid, as a consequence of the central limit theorem. Therefore, the correlator output is usually assumed to be a non-zero mean (H1 hypothesis) or zero mean (H0 hypothesis) complex Gaussian variable.

Past attempts considered the case of independent data blocks for the sake of simplicity. Conversely, it has been shown that performance improves when serial tests based on the correlation magnitude of overlapping data blocks are taken into account [11]. The performance of the whole system, expressed in terms of the probabilities of detection and false alarm, has been extensively computed and analysed [11], with a further performance optimisation achieved in [12]. A closed-form solution to the mathematical problem of determining the error probabilities (in terms of Bessel functions) was also addressed therein through the definition of the generalized-Q (GQ) functions (generalizing Marcum’s Q-functions).

For N = 2, the probabilities of false alarm and detection (Pf and Pd), obtained by testing the correlation magnitudes |r1| and |r2| measured over the (overlapped) data blocks with respective lengths L1 and L2, can be expressed by the GQ-functions [11,12], which also depend on the respective testing thresholds V1 and V2:

$$P_d^{(2)} = P\left(|r_1| > V_1 \text{ and } |r_2| > V_2 \mid H_1\right) = GQ_2(L_1, L_2; V_1, V_2, m) \qquad (1)$$

$$P_f^{(2)} = P\left(|r_1| > V_1 \text{ and } |r_2| > V_2 \mid H_0\right) = GQ_2(L_1, L_2; V_1, V_2, 0) \qquad (2)$$

where the series expansions of $GQ_2$ derived in [11,12] are written in terms of the function $\Gamma(x; a)$,

having defined the complement to one of the incomplete gamma function as [11,12]:

$$\Gamma(x; a) = \frac{1}{\Gamma(a)} \int_x^{\infty} t^{a-1} e^{-t}\, dt \qquad (3)$$
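As an illustration of the function defined above, the following sketch evaluates the complement to one of the (regularized) incomplete gamma function, i.e. (1/Γ(a)) ∫ from x to ∞ of t^(a-1) e^(-t) dt. It assumes an integer order a ≥ 1, for which the integral reduces exactly to the finite sum e^(-x) Σ_{k=0}^{a-1} x^k / k!. The function name and the argument order (x, a) are choices made here for illustration.

```python
import math


def gamma_complement(x: float, a: int) -> float:
    """(1/Gamma(a)) * integral_x^inf t^(a-1) e^(-t) dt, for integer a >= 1."""
    if a < 1:
        raise ValueError("this sketch only handles integer a >= 1")
    if x < 0:
        raise ValueError("x must be non-negative")
    term = 1.0   # x^k / k! for k = 0
    total = 1.0
    for k in range(1, a):
        term *= x / k      # update to x^k / k!
        total += term
    return math.exp(-x) * total


# For a = 1 the function is simply exp(-x).
print(gamma_complement(2.0, 1), math.exp(-2.0))
```

For non-integer orders a numerical library routine would be needed instead of the finite sum.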

In such a case, optimum test design consists in minimizing the mean acquisition time for a given number of cells, constant probabilities of detection Pd and false alarm Pf, and a penalty time TP following a false-alarm decision. Adoption of the optimal set of parameters in the acquisition test can lead to a further significant reduction of the mean duration of sequential tests [12]. Standard numerical algorithms, such as steepest descent or Newton-Raphson, can determine the optimal set.

In the two-dwell case (sometimes referred to as search and verification modes), the mathematical problem consists of searching for the optimum values of four parameters (i.e. the two time durations L1, L2 of the tests and the corresponding two threshold values V1, V2) that minimize the mean acquisition time, under two constraints (i.e. the desired error probabilities Pd and Pf). It is interesting to observe that this will minimize the standard deviation of the acquisition time too [10].

3. Two-Dimensional Two-Dwell Spread-Spectrum Code Acquisition

In our analysis we have used the Constant False Alarm Rate (CFAR) procedure, often employed to perform effective tests [7]. In particular, the CFAR test is accomplished in two successive parts: first, a threshold is determined to limit the false-alarm probability Pf to a given small value (also named size of the test); second, the probability of detection Pd (also named power of the test) is evaluated for the threshold previously determined [13]. The probability of false alarm must be tuned to guarantee a very low number of possible false alarms, which may imply a relevant penalty time for the acquisition device. Large probabilities of detection (up to 100%) are typical of well-performing testing variables [10,11].

Starting from the one-dwell approach of Katz and Glisic [8,9], we exploit the capabilities of spatio-temporal search when two-dwell algorithms are employed. In the two-dimensional case, too, the GQ-functions represent a very useful tool to express the performance in an analytic form.

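The mean code acquisition time of the test for the single cell is well approximated by [10]:

$$\mu(N) \cong \frac{2 - P_d}{2 P_d}\, q \left[\tau(N) + P_f T_P\right] \qquad (4)$$

where TP represents the penalty time for a false alarm decision that implies a very time consuming acquisition system reset [11], while:

$$\tau(N) = \sum_{j=1}^{N} L_j^{(N)} P_f^{(j-1)}, \qquad P_f^{(0)} = 1 \qquad (5)$$

and, in particular, for N = 2:

$$\tau(2) = L_1^{(2)} + L_2^{(2)} P_f^{(1)} \qquad (6)$$

From the analysis of the GQ-function of the second order (GQ2), we obtain the following important theoretical relationships, assuming constant testing probabilities Pd(2) and Pf(2):

$$\mathrm{SNR}' = \alpha\, \mathrm{SNR} \qquad (7)$$

$$L_1'^{(2)} = \frac{L_1^{(2)}}{\alpha^2}; \qquad L_2'^{(2)} = \frac{L_2^{(2)}}{\alpha^2} \qquad (8)$$

$$V_1'^{(2)} = \frac{V_1^{(2)}}{\alpha}; \qquad V_2'^{(2)} = \frac{V_2^{(2)}}{\alpha} \qquad (9)$$

$$\tau'(2) = \frac{\tau(2)}{\alpha^2} \qquad (10)$$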
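To make the role of the quantities above concrete, here is a minimal numerical sketch. It assumes the usual serial-search approximation in which the mean acquisition time is (2 - Pd)/(2 Pd) · q · (mean dwell time per cell + Pf · TP), and a two-dwell mean dwell time equal to L1 plus L2 weighted by the first-dwell false-alarm probability. The values q = 63, Pd = 0.99 and Pf = 0.0001 are those quoted in the caption of Figure 3; the dwell lengths, the first-dwell false-alarm probability and the penalty time are hypothetical, and the function names are chosen here for illustration.

```python
def mean_dwell_time(l1: float, l2: float, pf_first: float) -> float:
    """Mean time spent on an out-of-sync cell by a two-dwell test.

    The second dwell (length l2) is run only when the first dwell
    (length l1) raises a false alarm, which happens with probability pf_first.
    """
    return l1 + l2 * pf_first


def mean_acquisition_time(q: int, pd: float, pf: float,
                          dwell: float, penalty: float) -> float:
    """Approximate mean acquisition time over q cells (assumed formula)."""
    return (2.0 - pd) / (2.0 * pd) * q * (dwell + pf * penalty)


# q, Pd, Pf from the Figure 3 caption; the remaining values are hypothetical.
tau = mean_dwell_time(l1=100.0, l2=400.0, pf_first=0.01)
mu = mean_acquisition_time(q=63, pd=0.99, pf=1e-4, dwell=tau, penalty=10000.0)
print(tau, mu)

# With the testing probabilities held constant, shortening both dwell
# lengths by a common factor shortens the mean dwell time by the same factor.
print(mean_dwell_time(25.0, 100.0, 0.01) / tau)
```

The second print illustrates why a higher SINR per sector, which allows shorter dwells at the same Pd and Pf, translates directly into a shorter per-cell search time.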

The same procedure can be applied to the problem of sequential cell acquisition, after defining the concept of angular cell following the approach of [8,9]. In such a case, the spatial cells are directionally scanned by means of the antenna array. Each cell covers a given angular sector, which can possibly vary according to the employed procedure. Moreover, the distribution of noise and noise-like interfering users may vary with the angle itself.

The choice of the width of the angular sectors (examined in the following) is critical, since narrowing the beam increases the detection probability in the delay-angle cell that actually contains the signal, because the effective signal-to-interference-plus-noise ratio (SINR) becomes higher. At the same time, however, it can lead to a higher overall mean acquisition time because, for a given uncertainty, it also implies an increase in the number of sectors to be tested.

Defining H1,i and H0,i as the hypotheses that the system decides that the signal is present or not (with the tested code and time offset) in the i-th cell, and S1,i and S0,i as the hypotheses that the tested signal candidate is actually present or not in the i-th cell, their respective probabilities are:

$$\begin{aligned} P(H_{1,i}) &= P(S_{1,i}, H_{1,i}) + P(S_{0,i}, H_{1,i}) \\ &= P(S_{1,i})\, P(H_{1,i} \mid S_{1,i}) + P(S_{0,i})\, P(H_{1,i} \mid S_{0,i}) \\ &= \frac{1}{m} P_{d,i} + \left(1 - \frac{1}{m}\right) P_{f,i} \end{aligned} \qquad (11)$$

$$\begin{aligned} P(H_{0,i}) &= P(S_{1,i}, H_{0,i}) + P(S_{0,i}, H_{0,i}) \\ &= P(S_{1,i})\, P(H_{0,i} \mid S_{1,i}) + P(S_{0,i})\, P(H_{0,i} \mid S_{0,i}) \\ &= \frac{1}{m} \left(1 - P_{d,i}\right) + \left(1 - \frac{1}{m}\right) \left(1 - P_{f,i}\right) \end{aligned} \qquad (12)$$

The total mean acquisition time μTOT for searching over an ensemble of m possible cells is expressed as:

$$\mu_{TOT} = \mu_1 P(H_{1,1}) + (\mu_1 + \mu_2)\, P(H_{0,1}) P(H_{1,2}) + (\mu_1 + \mu_2 + \mu_3)\, P(H_{0,1}) P(H_{0,2}) P(H_{1,3}) + \dots \qquad (13)$$

where the terms μ1, μ2, μ3, … include the penalty factors of their corresponding sectors. The sum extends to m if only one round of search is implemented, i.e.:

$$\mu_{TOT} = \sum_{k=1}^{m} \mu_k \sum_{i=k}^{m} P(H_{1,i}) \prod_{j=1}^{i-1} P(H_{0,j}) \qquad (14)$$

If a number g of search cycles is implemented, the upper limit (formerly the sector number m) of the two sums in the above equation becomes m·g. It may even go to infinity (with a probability that tends to zero) if a non-stop search (g→+∞) is considered, that is:

$$\mu_{TOT} = \sum_{k=1}^{\infty} \mu_{k]m} \sum_{i=k}^{\infty} P(H_{1,i]m}) \prod_{j=1}^{i-1} P(H_{0,j]m}) \qquad (15)$$

where the sector indices i, j, and k are now evaluated modulo m (respectively denoted as i]m, j]m, and k]m).

Let us observe that when the probabilities P(H1,i), P(H0,i) and the acquisition times μi are the same in the various sectors (i.e. P(H1,i) = β, P(H0,i) = 1 − β, and μi = μ), the overall mean acquisition time reduces to:

$$\mu_{TOT} = \mu \sum_{k=1}^{\infty} \sum_{i=k}^{\infty} \beta\, (1 - \beta)^{i-1} \qquad (16)$$
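The accumulation of sector times described above can be sketched numerically. The code below assumes the reading in which the search stops at the first sector declared H1, so that each term weights the time spent so far by the probability that all earlier sectors were declared H0; sector indices wrap modulo m over `cycles` scans. The function name and all numerical values are hypothetical. For identical sectors with P(H1) = p and per-sector time t, the geometric series gives the limit t / p for a non-stop search.

```python
from typing import Sequence


def total_mean_acquisition_time(mu: Sequence[float],
                                p_h1: Sequence[float],
                                cycles: int = 1) -> float:
    """Accumulate the total mean acquisition time over `cycles` scans of m sectors."""
    if len(mu) != len(p_h1) or not mu:
        raise ValueError("mu and p_h1 must be non-empty and of equal length")
    m = len(mu)
    total = 0.0
    elapsed = 0.0   # time spent up to and including the current sector
    all_h0 = 1.0    # probability that every earlier sector was declared H0
    for i in range(m * cycles):
        s = i % m   # sector index taken modulo m
        elapsed += mu[s]
        total += elapsed * p_h1[s] * all_h0
        all_h0 *= 1.0 - p_h1[s]
    return total


# Identical sectors: a long search approaches t / p = 2.0 / 0.25.
identical = total_mean_acquisition_time([2.0] * 4, [0.25] * 4, cycles=200)
print(identical, 2.0 / 0.25)

# Visiting the sectors with the shortest per-sector time first lowers the total
# (here assumed to correspond to scanning the highest-SINR sectors first).
slow_first = total_mean_acquisition_time([5.0, 1.0, 3.0], [0.3] * 3)
fast_first = total_mean_acquisition_time([1.0, 3.0, 5.0], [0.3] * 3)
print(slow_first, fast_first)
```

The running variables `elapsed` and `all_h0` avoid recomputing the inner sum and product for every term, which keeps the evaluation linear in the number of visited sectors.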

4. Results and Discussion

In this section, we aim to show, through a large number of simulation trials, the effectiveness of the previously analyzed spatio-temporal two-dwell technique, providing applications to typical mobile operating scenarios. We present the performance of the new scheme in terms of overall acquisition time, μTOT, and signal-to-interference-plus-noise ratio (SINR), varying the number m of angular sectors as well as the number g of full scans of all the sectors.

Some examples of application to the detection of direct-sequence spread-spectrum communication signals are provided. In particular, we assume an adaptive management of call admission control based on a uniform a priori probability of the (possibly non-uniform) angular sectors. In such a target case, the spatial coverage of the customers is optimized. The effect of a random (non-ideal) distribution of users is then considered in all the following analyses. Assuming that no information is available on the a priori distribution of user signals, two search strategies may be considered, namely random and SINR-ordered. In fact, beginning the search from the highest SINR cells and proceeding to the lowest SINR ones reduces the mean acquisition time. In actual environments, the measured SINR value accounts for the number of interfering users, regarded as noise-like independent disturbances.

In particular, Figure 1 shows two possible cases of uniformly distributed interference plus noise in m = 5 (Figure 1(a)) or 10 (Figure 1(b)) angular sectors. The SINR directly represents the ratio between the useful synchronization signal power and the interference-plus-noise power (measured at the input of a receiver) on a linear scale (not in dB). For a proper matching, we have assumed the same total signal and interference-plus-noise powers over the ensemble of all the sectors in all the examined cases. The equivalent constant distribution of interference-plus-noise is shown in dashed lines for a visual comparison. In such a case, the SINR (considering both thermal noise and noise-like interference) of each angular cell increases proportionally to the number of considered sectors.

In particular, we have examined three reference cases, respectively taken from Tables 2, 3, and 4 of [11] (also used for numerical comparison in [12]). We have varied the number of circular sectors (m = 1…10), all of them characterized by a uniform random distribution of interference plus noise. The overall acquisition time μTOT is depicted in Figures 2(a)–(c) versus the chosen number m of sectors.

Moreover, the obtained overall acquisition time μTOT is plotted in Figure 3 versus the number of sector scans (g), for m = 3 (Figure 3(a)) and 10 (Figure 3(b)) total sectors respectively, assuming the uniform random distribution of interference plus noise. In both cases, the limit value (g→+∞) is well approximated for g ≥ 6 search cycles. The overall acquisition time μTOT is finally depicted in Figure 4 versus the chosen number m of sectors, for a large number g of search cycles (g = 10 were used here). It can be seen that a number m ≥ 7 of spatial sectors may be a valid compromise, in the examined example, between the high performance allowed by focusing techniques and the system complexity in terms of both operating apparatus and computational cost of signal processing.

Figure 1. A uniform realization of random interference plus noise (SINR) in 5 angular sectors (left) and in 10 angular sectors (right). [Panels (a) and (b); horizontal axis: m (sectors); vertical axis: SINR.]

Figure 2. The obtained overall acquisition time versus the number of sector scans (g) for m = 3 (left) and for m = 10 (right) total sectors assuming the uniform random distribution of interference plus noise. [Panels (a) and (b); horizontal axis: number of scanned sectors (g); vertical axis: overall acquisition time [Kchips]; curves: random scan, ordered scan.]

Therefore, the main novelty of the presented approach is twofold: first, a two-dwell acquisition technique is used to reduce the mean acquisition time with respect to one-dwell acquisition techniques; second, the searching procedure is driven by estimates of the local signal-to-interference-plus-noise ratio (SINR), further reducing the mean acquisition time. The obtained results demonstrate the effectiveness of the analyzed spatio-temporal two-dwell procedures.


Figure 3. Overall acquisition time versus the number m of sectors (for large g) in the first reference case: (a) Table 2, (b) Table 3, (c) Table 4 of [6] (SINR = -18 dB per chip; q = 63; Pd = 0.99; Pf = 0.0001) assuming uniform random distribution of interference plus noise.

5. Conclusions

This paper extends the analysis of two-dwell procedures (search/verification modes) to the spatial domain for the initial code synchronization of 3G mobile communications. We showed that, in the two-dimensional case as well, the GQ functions may represent a useful tool to express the performance in an analytic form. From the technical viewpoint, a two-dwell acquisition technique is used to reduce the mean acquisition time with respect to one-dwell acquisition techniques. Two searching strategies (ordered and random with respect to the measured SINR) have been considered and implemented. Some examples of application to the detection of DS/SS communication signals in typical mobile scenarios are provided, and we have verified the effectiveness of the analyzed spatio-temporal two-dwell procedures.

The presented technique seems to constitute a promising tool for the analytic setting of near-optimum spatio-temporal acquisition testing procedures based on serial search/verification modes.

Figure 4. The obtained overall acquisition time versus the number m of sectors for a large number g of search cycles assuming the uniform random distribution of interference plus noise. [Horizontal axis: number of sectors (m).]

6. Acknowledgement

The authors wish to thank Prof. A. Neri of the University of Roma Tre for his helpful discussions on application to 3G Mobile Systems.

7. References

[1] R. L. Pickholtz, L. B. Milstein, and D. L. Schilling, “Spread spectrum for mobile communications,” IEEE Transactions on Vehicular Technology, Vol. 40, pp. 313–322, February 1991.

[2] Y. P. E. Wang and T. Ottosson, “Cell search in W-CDMA,” IEEE Journal on Selected Areas in Communications, Vol. 18, No. 8, August 2000.

[3] R. Esmailzadeh and M. Nakagawa, “Pre-rake diversity combination for direct sequence spread spectrum mobile communications systems,” IEICE Transactions on Communications, pp. 1008–1015, August 1993.

[4] Y. H. Lee and S. Tantaratana, “Sequential acquisition of PN sequences for DS/SS communications: Design and performance,” IEEE Journal on Selected Areas in Communications, Vol. 10, pp. 750–759, May 1992.

[5] S. Tantaratana, A. W. Lam, and P. J. Vincent, “Non-coherent sequential acquisition of PN sequences for DS/SS communication with/without channel fading,” IEEE Transactions on Communications, Vol. 43, pp. 1738–1746, 1995.

[6] J. C. Lin, “Noncoherent sequential PN code acquisition using sliding correlation for chip-asynchronous direct-sequence spread-spectrum communications,” IEEE Transactions on Communications, Vol. 50, No. 4, pp. 664–676, April 2002.

[7] E. Moulines and K. Choukri, “Time-domain procedures for testing that a stationary time series is Gaussian,” IEEE Transactions on Signal Processing, Vol. 44, pp. 2010–2025, August 1996.

[8] M. Katz and S. Glisic, “Two approaches for enhancing performance of two-dimensional code acquisition in spatially nonuniform interference environments,” in Proceedings of International Conference on Wireless Personal Multimedia Communications, WPMC’01, Aalborg, Denmark, pp. 1071–1076, September 2001.

[9] M. Katz, J. Iinatti, and S. Glisic, “Search strategies for two-dimensional code acquisition in environments with non-uniform spatial distribution of interference,” in Proceedings International IEEE Vehicular Technology Conference, VTC 2001 Spring, IEEE VTS 53rd, Vol. 2, pp. 1454–1458, 2001.

[10] D. M. Di Carlo and C. L. Weber, “Multiple dwell serial search: Performance and application to direct-sequence code acquisition,” IEEE Transactions on Communications, Vol. 31, pp. 650–659, May 1983.

[11] G. Giunta, “Generalized Q-functions for application to non-coherent serial detection of spread-spectrum communication signals,” IEEE Transactions on Signal Processing, Vol. SP-48, No. 5, pp. 1506–1513, May 2000.

[12] G. Giunta, A. Neri, and M. Carli, “Constrained optimization of non-coherent serial acquisition of spread-spectrum code by exploiting the generalized Q-functions,” IEEE Transactions on Vehicular Technology, pp. 1378–1385, September 2003.

[13] G. Giunta and L. Vandendorpe, “A ‘Rayleighness’ test for DS/SS code acquisition,” IEEE Transactions on Communications, Vol. 51, No. 9, pp. 1492–1501, September 2003.



Bandwidth Optimization in 802.15.4 Networks through Evolutionary Slot Assignment

Vidya KRISHNAMURTHY, Edward SAZONOV

Department of Electrical and Computer Engineering, Wallace H. Coulter School of Engineering, Clarkson University, NY, USA

Email: [email protected]

Received March 26, 2009; revised May 15, 2009; accepted July 29, 2009

ABSTRACT

Traditional Wireless Sensor Networks (WSNs) based on carrier sense methods for channel access suffer from reduced bandwidth utilization, increased energy consumption and latency problems in networks with high traffic. In this work, a novel Evolutionary Slot Assignment (ESA) algorithm has been developed to increase the throughput of large wireless mesh networks with no centralized controller. In the presented scheme, the sensor nodes self-adapt to the traffic patterns of the network by selecting transmission slots using evolutionary learning methods. Each sensor node evolves an independent transmission schedule. Unlike traditional evolutionary methods, the fitness evaluation of every node impacts the fitness of every other sensor node in the network. The ESA algorithm has been simulated using Network Simulator-2 and compared with the IEEE 802.15.4 CSMA-CA, a Static Slot Assignment (SSA) scheme and a Random Slot Assignment (RSA) scheme. Results show a remarkable improvement in the network throughput using the proposed ESA method as opposed to the other compared methods.

Keywords: CSMA-CA, 802.15.4, Sensor Networks, Evolutionary Algorithms, Bandwidth Optimization

1. Introduction

Wireless Sensor Networks (WSNs) consist of a group of sensor nodes that use wireless links to perform distributed sensing tasks. Sensor nodes combine simple wireless communication, minimal computational facilities, and sensing of the physical environment for an application-specific sensor network [1]. In monitoring applications of WSNs, multiple nodes may need to transmit at the same time, for example in response to an event detected by multiple sensor nodes. Simultaneous transmissions by multiple nodes lead to RF collisions that cause loss of packets. There are multiple methods for collision detection and avoidance. The simplest method, developed in the 1970s, is ALOHA, where a node with a packet to transmit waits for a random amount of time and then sends the packet. A variation of this method is slotted ALOHA, where time is divided into slots and nodes wait for the start of a slot before transmitting. In case of a collision, the node simply retransmits the packet. These methods fail with increasing network size and data rate [2]. FDMA (Frequency Division Multiple Access) and TDMA (Time Division Multiple Access) are methods developed and used in the 1980s and early 1990s that divide the available frequency spectrum or time into slots and assign them to different nodes. These methods divide the available resources amongst sensor nodes and are therefore not very cost effective and cannot deal with a rapidly changing network topology, such as with mobile nodes [3]. Traditional WSNs are based on collision free carrier sense methods for channel access. The most commonly used scheme by the late 1990s was Carrier Sense Multiple Access with Collision Avoidance (CSMA-CA), where the nodes sense the channel to be idle and wait for a random amount of time before transmission. This restricts collisions to the situation in which two nodes have a packet to transmit at the same time and their random wait times happen to be the same.
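The residual collision condition described above (the earliest random wait being chosen by more than one node) can be quantified with a small sketch. The model is an assumption made here for illustration and is not taken from the paper: n nodes have a packet ready at the same instant, each draws its wait uniformly from `window` equally likely backoff slots, and any node that drew a later slot senses the busy channel and defers. The function name and the window size used in the example are hypothetical.

```python
def collision_probability(nodes: int, window: int) -> float:
    """Probability that the earliest backoff slot is shared by two or more nodes."""
    if nodes < 1 or window < 1:
        raise ValueError("nodes and window must be positive")
    # No collision happens when exactly one node holds the minimum slot:
    # pick that node (nodes ways) and its slot s, then force every other
    # node into a strictly later slot.
    p_unique_min = 0.0
    for s in range(window):
        later = (window - 1 - s) / window
        p_unique_min += nodes * (1.0 / window) * later ** (nodes - 1)
    return 1.0 - p_unique_min


# Two contending nodes collide only if they draw the same slot: 1 / window.
print(collision_probability(2, 8))
# The probability grows quickly with the number of simultaneous contenders.
print([round(collision_probability(n, 8), 3) for n in (2, 5, 10, 30)])
```

Under this simplified model the collision probability rises steeply with the number of nodes that respond to the same event, which is the behaviour the following paragraphs describe for synchronous sensor networks.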

IEEE 802.11 emerged as a popular choice for WSNs that use CSMA-CA with its four-way handshaking protocol [4] to increase the reliability of data transfer. However, this protocol and the related hardware consume too much energy for low-data-rate, low-power networks and energy scavenging applications. The CSMA-CA mechanism also affects the network latency [5], making it unfit for monitoring applications with a large number of sensor nodes. The IEEE 802.15.4 protocol [6] has recently emerged as a new standard for low rate wireless personal area networks (LR-PANs) [7]. The popularity of the low-power, low-rate IEEE 802.15.4 protocol and the availability of low-cost, low-power RF chips have led us to use it for developing low cost Wireless Sensor Networks targeted towards monitoring applications, such as structural health monitoring. However, for synchronous sensor networks that sample data at the same time or respond to events, the packet collision probability increases sharply with network size [8]. Such collisions severely reduce bandwidth utilization and increase the energy consumption and latency of IEEE 802.15.4 devices. The poor performance of IEEE 802.15.4 based CSMA/CA has also been analyzed using discrete time Markov chain models [9].

We performed a preliminary analysis of a multi-hop network to illustrate the loss of performance in a high-traffic 802.15.4 CSMA-CA network. Network Simulator–2 (NS–2) [10] with the IEEE 802.15.4 medium access protocol [11] was used. The network topology consisted of a beaconless mesh network with one PAN Coordinator and 30 sensor nodes placed at random in a 50m-by-50m area (Figure 1). The receiving range of every sensor node was limited to 15m. The input data rate per node was varied and the total network throughput was measured as the total number of packets received at the receiver (sink) per second. Figure 2 shows the result of the simulation, which clearly indicates low bandwidth utilization. A possible solution to the problems faced by IEEE 802.15.4 CSMA-CA based sensor networks is a reservation-based mechanism in which the contention times of the wireless nodes are separated from each other, thus minimizing collisions and retransmissions. A number of mechanisms in this direction have been proposed and developed [12–17].

Our earlier work [18] shows that a reservation scheduler can provide up to a 5 times increase in data rate and 99% usage of the effectively available bandwidth. The TDMA scheduler works very well in a single-hop star topology where the central node is aware of all the sensor nodes in the network. But in a typical wireless mesh network with a large number of sources and fewer sinks, centralized scheduling is undesirable. Also, if the topology fluctuates, that is, if sensor nodes enter and leave the network continuously, static scheduling does not work well. In this paper, we propose a new algorithm, the Evolutionary Slot Assignment (ESA) algorithm, that continuously adapts the transmission schedule of every sensor node without relying on a central coordinator. The proposed Medium Access Control (MAC) layer method is inspired by evolutionary algorithms [19–27]. However, the proposed ESA algorithm is based on a novel model of specimen interactions in which the fitness evaluation of each individual affects the fitness of other nodes in the network. Unlike other desynchronized algorithms, ESA makes no assumptions about the topology and connectivity of the network. The ESA algorithm does not need a common time reference or beacons, is well suited for mesh networks and networks with high node mobility, and substantially increases the bandwidth utilization of IEEE 802.15.4 networks.

The next section discusses the proposed method in detail. The simulation scheme is outlined in Section 3, and Section 4 presents the results and discussion of the simulated model. Section 5 concludes the paper with remarks about the method.

2. Methods

To increase the bandwidth utilization and decrease collisions in ad-hoc beaconless wireless sensor networks, a probabilistic learning-based scheduling algorithm based on Evolutionary Slot Assignment (ESA) is proposed. We

Figure 1. Tested mesh topology.

Figure 2. Throughput of a multi-hop mesh network with 31 nodes with increasing load.


consider a beaconless IEEE 802.15.4 network with the high data rates characteristic of continuous monitoring applications as our main target. The nodes in the network are not synchronized in time. The ESA algorithm is implemented at the sensor node level and requires no central coordinator or cluster head; it does not rely on beacons or a common time reference and makes no assumptions about the topology and connectivity of the network.

Each node in the network schedules slots for periodic transmissions within a time period called the network period, or T. The network period T is divided into virtual slots in which the sensor node attempts to transmit data. These transmission slots and the network period T should not be confused with the Guaranteed Time Slots and the beacon interval of a beacon-enabled IEEE 802.15.4 network. In ESA, the duration of a transmission slot is determined by the maximum size of data packets, with each slot consisting of 8 to 20 backoff periods, corresponding to the minimal and maximal possible packet lengths. The network period T defines the maximum number of transmission slots that can be scheduled and the length of the scheduling table. For simplicity of analysis, the network period is kept constant for all the sensor nodes in the network and it is assumed that the network period starts on node power-up. Since sensor nodes start up at random times, the beginning of a network period is not synchronized between sensors. However, since every sensor evolves an independent schedule without knowing the schedules of other nodes, the algorithm does not need a common time marker or a synchronizing beacon. All sensor nodes are assumed to have the same sampling rate and thus the same average data rate. This represents the heavy load conditions of a real-time multipoint continuous monitoring application. The sensor nodes may send data using constant bit rate (CBR) traffic or an exponential (Poisson) traffic pattern. The number of slots (N) each node needs to transmit packets of size χ bytes at an average data rate of γ kbps is calculated for the time T as:

$$N = \frac{\gamma \cdot T}{\chi} \qquad (1)$$

1) initialize probability vectors:
   a. $V_T = \{V_T^1, V_T^2, \ldots, V_T^N\}$, $V_T^i \in \{0, 1\}$
   b. $P_T = \{P_T^1, P_T^2, \ldots, P_T^N\}$, $P_T^i \in [0, 1]$
2) if packet to transmit at time t:
   a. find i: $V^i = 1$ and i > t
   b. packet_wait_timer = i – t;
   c. if packet_wait_timer == 0
      i. transmit packet; α = rand(0, 0.2)
      ii. if status == success
          $P^i = P^i + |\alpha|$; if $P_T^i > 1$, $P_T^i = 1$
      iii. elseif status == failure
          $P^i = P^i + \alpha$; if $P^i < 0$, $P^i = 0$; if $P_T^i > 1$, $P_T^i = 1$
      iv. elseif status == medium_busy
          $P^i = P^i - |\alpha|$; if $P^i < 0$, $P^i = 0$
3) if $P^i < \beta$, then
   a. $V^i = 0$
   b. j = random number ∈ {1, 2, ……, N}; $V^j = 1$
4) adaptation:
   a. $\delta_T = \frac{1}{10} \sum_{i=T-9}^{T} \delta_i$

Figure 3. ESA Algorithm.
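As a worked instance of Equation (1), the following minimal Python sketch computes the number of slots a node needs per network period. The unit conversion (1 kbps taken as 1000 bits per second, 8 bits per byte), the rounding up to a whole slot, and the 100-byte packet size in the usage lines are assumptions made for illustration; the function name `slots_needed` is hypothetical and not part of the original implementation.

```python
import math

def slots_needed(rate_kbps: float, period_s: float, packet_bytes: int) -> int:
    """Slots N required per network period T for a given average data rate.

    Assumes 1 kbps = 1000 bit/s and one packet per transmission slot.
    """
    # Total payload generated during one network period, in whole bits.
    bits_per_period = round(rate_kbps * 1000 * period_s)
    bits_per_packet = packet_bytes * 8
    # Round up: a partially filled packet still occupies a slot (assumption).
    return math.ceil(bits_per_period / bits_per_packet)

# Hypothetical example: T = 5 s and 100-byte packets.
n_high = slots_needed(3.2, 5, 100)  # heaviest per-node load in the simulations
n_low = slots_needed(0.2, 5, 100)   # lightest per-node load in the simulations
```

With these assumed values a node at 3.2 kbps needs 20 slots per period, and a node at 0.2 kbps needs 2.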

The ESA algorithm for evolution of transmission schedules (Figure 3) is continuously executed on every node in the network. The algorithm is divided into the following phases:

1) Initialization Phase: In this phase, the internal variables of the algorithm are initialized by each node. A binary slot vector $V_T$ is created with size equal to the total number of transmission slots (N) possible in T:

$$V_T = \{V_T^1, V_T^2, \ldots, V_T^N\}, \quad V_T^i \in \{0, 1\} \qquad (2)$$

A value $V_T^i = 1$ indicates a slot in which the node will attempt to transmit. Another vector $P_T$ contains the probability associated with every transmission slot (or fitness of a slot):

$$P_T = \{P_T^1, P_T^2, \ldots, P_T^N\}, \quad P_T^i \in [0, 1] \qquad (3)$$

During initialization, all values of $V_T$ are initialized to 0 and all values of $P_T$ to 0.5. Due to the periodic nature of the network period T, both vectors $V_T$ and $P_T$ are ring structures (Figure 4), i.e. the algorithm traverses from the last element into the first element of the vectors. The ring representation eliminates the need for a common time reference, since slot m of wireless node 1 will always correspond to slot n of wireless node i as long as the relative time drift between the nodes is close to zero (a safe assumption for most applications). The initial set of (N*r) slots is selected from the list of slots using a tournament selection algorithm [28]. Specifically, a random number of slots is drawn from the list of available slots, and the single slot j with the highest probability $P^j$ is selected from the set and $V_T$ is updated with $V^j = 1$. The tournament selection is repeated until N*r slots are assigned. During the initial selection procedure, however, the probability of all the slots is equal, so the choice is not critical. The redundancy factor r provides slots for any retransmissions due to collisions. An example of the vectors $V_T$ and $P_T$ updated with the initially selected slots is:

$V_T$ = [0, 0, 1, 0, …… 0, 1, 1, 0 …… 0, 0],
$P_T$ = [0.5, 0.5, 0.5, 0.5, …… 0.5, 0.5, 0.5, 0.5 …… 0.5, 0.5]
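The initialization phase can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it distinguishes the length of the schedule (`total_slots`) from the number of slots the node needs (`n_needed`), which is one reading of the text, and the cap on the tournament size (`max_draw`) is a hypothetical parameter, since the text only says that "a random number of slots" is drawn.

```python
import random

def tournament_select(prob, candidates, rng, max_draw=4):
    """Draw a random-sized sample of candidate slots and return the fittest one."""
    k = rng.randint(1, min(max_draw, len(candidates)))
    sample = rng.sample(candidates, k)
    return max(sample, key=lambda j: prob[j])

def init_schedule(total_slots, n_needed, r, rng):
    """Build the binary slot vector V and the fitness vector P for one node."""
    V = [0] * total_slots      # no slot scheduled yet
    P = [0.5] * total_slots    # every slot starts with fitness 0.5
    target = min(total_slots, round(n_needed * r))
    while sum(V) < target:
        # Only unscheduled slots take part in the tournament.
        free = [j for j in range(total_slots) if V[j] == 0]
        V[tournament_select(P, free, rng)] = 1
    return V, P

rng = random.Random(1)                       # fixed seed for repeatability
V, P = init_schedule(781, 20, 1.5, rng)      # hypothetical sizes
```

Because all fitness values are equal at start-up, the tournament degenerates to a random choice here, as noted above; the same helper becomes selective once $P_T$ has been updated by transmissions.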

2) Transmission Phase: Whenever a node has a packet to transmit, it traverses the vector $V_T$ for the next available slot i ($V_T^i = 1$) and sets packet_wait_timer to wait until the beginning of slot i to transmit the packet. After packet_wait_timer expires, the node attempts an IEEE 802.15.4 compliant CSMA-CA transmission with acknowledgement and updates vector $P_T$ based on the outcome of the transmission as follows:

a) Success: If the packet transmission in slot i is successful (acknowledgement received), the probability value is increased by the absolute value of a random Gaussian variable α (mean 0, deviation 0.2):

$$P_T^i = P_T^i + |\alpha|$$

If the result is greater than 1, then $P_T^i = 1$. The increase in the slot's probability increases the fitness measure of the slot in reward for a successful transmission. Maintaining $P_T^i$ above the threshold β is required for continued use of slot i in the transmission schedule.

b) Failure: If the packet transmission in slot i fails (no acknowledgement),

$$P_T^i = P_T^i + \alpha \qquad (4)$$

For a Gaussian α with zero mean, this operation equiprobably increases or decreases the fitness of slot i. However, if $P_T^i < 0$, then $P_T^i = 0$, and if $P_T^i > 1$, then $P_T^i = 1$. The goal of this fitness (probability) adjustment is to move the transmission probabilities of nodes competing for access in time slot i in opposite directions. The optimal outcome for two competing nodes is when the probability of the slot in one node is increased and the probability of the matching slot in the other node is decreased. Using a random Gaussian variable as described will produce such an outcome in 50% of failures.

Figure 4. Ring structure of PT and VT vectors permits asynchronous operation of wireless.

c) Medium Busy: If the medium is busy when the packet transmission is attempted in slot i, $P_T^i$ is decremented by the absolute value of α:

$$P_T^i = P_T^i - |\alpha|$$

However, if $P_T^i < 0$, then $P_T^i = 0$. The probability is decreased because some other node is using the medium at this time. Repeated observation of a busy medium will cause $P_T^i$ to fall below the threshold β and slot i to be deselected.

3) Selection Phase: The probability (fitness) of each transmission slot marked for transmission in $V_T$ is adjusted in the transmission phase. Based on those adjustments, the corresponding values of $P_T^i$ may fall below the threshold β, indicating that transmissions repeatedly fail in slot i. The re-selection of slots is performed after completion of every network period (T) for those slots whose probability falls below the threshold β. This is done using the tournament selection method. For a slot i with $P_T^i < \beta$, $V_T^i$ is set to 0. A random number of slots is drawn from the list of unused slots and the slot j with the highest probability value $P_T^j$ among these slots is selected. This procedure is repeated for all slots that need rescheduling (such slots are excluded from the pool from which new slots are drawn).
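The three fitness update rules of the transmission phase and the threshold test of the selection phase can be summarized in a short Python sketch. The status strings, the function names and the threshold value β = 0.3 used in the usage lines are assumptions for illustration; the text does not state the value of β.

```python
import random

def update_fitness(p, status, alpha):
    """Return the new fitness of a slot after one transmission attempt."""
    if status == "success":
        p += abs(alpha)            # reward: always increase
    elif status == "failure":
        p += alpha                 # random step: up or down with equal chance
    elif status == "medium_busy":
        p -= abs(alpha)            # penalty: always decrease
    else:
        raise ValueError(f"unknown status: {status!r}")
    return min(1.0, max(0.0, p))   # clamp to [0, 1]

def slots_to_reschedule(V, P, beta):
    """Indices of scheduled slots whose fitness has fallen below beta."""
    return [i for i, (v, p) in enumerate(zip(V, P)) if v == 1 and p < beta]

rng = random.Random(0)
alpha = rng.gauss(0.0, 0.2)        # zero-mean Gaussian, deviation 0.2
p_new = update_fitness(0.5, "failure", alpha)
stale = slots_to_reschedule([1, 0, 1], [0.1, 0.0, 0.9], beta=0.3)
```

In this sketch only slot 0 is flagged for rescheduling: slot 1 has low fitness but is not in the schedule, and slot 2 is well above the assumed threshold.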

4) Adaptation Phase: Being part of a multi-hop mesh network, a node may receive packets that need to be forwarded. Therefore, additional slots for forwarding traffic need to be added to the transmission schedule every interval T. Since a node makes no assumptions about the topology of the network and the ESA algorithm does not use information from upper layers (routing, for example), the number of forwarding slots δ is estimated by monitoring the total number of forwarding packets received by a node over each time period T. The value of δ is computed and updated every network period based on a floating average of δ for the last 10 network periods. Thus the total number of slots available to a node in the interval T is (N*r + δ). Selection of the δ additional slots in the vector $V_T$ is performed by tournament selection. After the adaptation phase the node transitions into the transmission phase.
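A minimal Python sketch of the floating average used in the adaptation phase is given below. The class name `ForwardingEstimator` is hypothetical, and two details are assumptions: the average is taken over however many periods are available until 10 have elapsed, and the result is rounded up so that the node does not under-provision forwarding slots.

```python
import math
from collections import deque

class ForwardingEstimator:
    """Estimate delta from the packets forwarded in recent network periods."""

    def __init__(self, window=10):
        # deque with maxlen discards the oldest count automatically.
        self._history = deque(maxlen=window)

    def end_of_period(self, forwarded_packets):
        """Record one period's forwarded-packet count and return delta."""
        self._history.append(forwarded_packets)
        mean = sum(self._history) / len(self._history)
        return math.ceil(mean)     # round up (assumption)

est = ForwardingEstimator()
d1 = est.end_of_period(4)          # one period observed
d2 = est.end_of_period(6)          # mean of 4 and 6
```

The returned value would then be added to N*r to give the number of slots to keep scheduled in the next period.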

This evolutionary process is performed on a continuous basis and adapts the node to the network traffic conditions. It should also be noted that the ESA algorithm is not a classical evolutionary algorithm. Traditional evolutionary algorithms (genetic algorithms, evolution strategies, etc.) evolve a population of individuals, but the fitness of any given individual does not impact the fitness of other individuals in the population. In ESA, fitness is interpreted as the transmission probability associated with a slot. A node evaluates the fitness of a slot by attempting a transmission, thus actively impacting the fitness of matching slots in the schedules of all other nodes that may attempt a transmission at the same time. In this sense the ESA algorithm is better characterized as a competitive co-evolutionary algorithm.

To illustrate the gain in bandwidth utilization due to the evolutionary component of the algorithm, we propose two simplified variants of the above Evolutionary Slot Assignment (ESA) and compare them with the regular CSMA-CA mechanism. The first is a Static Slot Assignment (SSA) technique in which both the evolutionary mechanism and the randomness of slot reselection are eliminated. In SSA the slot vector $V_T$ is initialized to (N*r + δ) slots. The data packets are only transmitted in this pre-selected random slot schedule and the $P_T$ vector is never updated. The second is a Random Slot Assignment (RSA) algorithm in which the evolutionary component is eliminated but the randomness of slot reselection is maintained through re-initialization of the transmission schedule $V_T$ every period for (N*r + δ) slots. Comparison with SSA and RSA allows the performance gain attributable to evolutionary learning to be evaluated. We also perform a comparison with the standard CSMA-CA technique to illustrate the performance gain of ESA.

3. Simulation Scheme

Network Simulator–2 (NS–2) [10] with the IEEE 802.15.4 medium access protocol [6] was used to simulate the ESA, SSA and RSA methods. The simulation scenario consists of a beaconless network with one PAN Coordinator and 30 nodes placed at random in a 50m-by-50m area (Figure 1), with the range of each node limited to 15m. The nodes use the IEEE 802.15.4 medium access and physical layers for transmissions and receptions. The nodes are not synchronized in time and start up at random times. The ESA algorithm was implemented at the SSCS (Service Specific Convergence Sublayer), which acts as an interface between the MAC and the upper layers. The SSCS layer is an implementation-specific module that provides access to the MAC primitives and allows for their modification for any specific application. A wireless channel with a two-ray ground propagation model was used. Each node has an omni-directional antenna, and the maximum number of packets allowed in the interface queue was set to 10. The packets were transmitted using the AODV routing protocol [29].

One of the nodes (Node 0) is assigned as the PAN Coordinator. This node allows all other nodes to join the network through the regular association procedure of the IEEE 802.15.4 standard [6]. This node is also used as the sink node. However, it is no different from any other node in functionality, and other nodes may be sinks too. The nodes formed a star network (single-hop, single receiver) or a mesh network (multi-hop, multiple receivers) topology with up to five hops. The average data rate per node is varied from 0.2 kbps to 3.2 kbps, placing the upper bound on the total network throughput at 100 kbps (the maximum possible throughput of the IEEE 802.15.4 network [18]). No assumptions were made about packet inter-arrival time except that every node attempts to send all its data within the network period of T=5 seconds. The length of a single transmission slot in the SSA, RSA and ESA algorithms was set to 20 backoff slots (6.4 milliseconds), allowing the maximal possible packet size of 127 bytes.
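The slot timing quoted above can be checked with a few lines of Python. The constants are not given in the text; they are the values of the IEEE 802.15.4 2.4 GHz PHY (250 kbps, 16 µs per symbol, 2 symbols per byte, a unit backoff period of 20 symbols, 6 bytes of preamble, start-of-frame delimiter and length field), and their use here is an assumption about the simulated PHY.

```python
SYMBOL_US = 16              # symbol duration at 62.5 ksymbol/s (assumed PHY)
UNIT_BACKOFF_SYMBOLS = 20   # one backoff period, in symbols
SYMBOLS_PER_BYTE = 2        # 4 bits per symbol
PHY_OVERHEAD_BYTES = 6      # preamble (4) + SFD (1) + length field (1)

def slot_duration_us(backoff_periods):
    """Length of one transmission slot in microseconds."""
    return backoff_periods * UNIT_BACKOFF_SYMBOLS * SYMBOL_US

def slots_per_period(period_s, backoff_periods):
    """Whole transmission slots that fit into one network period T."""
    return int(period_s * 1_000_000) // slot_duration_us(backoff_periods)

def frame_airtime_us(frame_bytes):
    """Time on air of one frame, including PHY overhead."""
    return (frame_bytes + PHY_OVERHEAD_BYTES) * SYMBOLS_PER_BYTE * SYMBOL_US

slot_us = slot_duration_us(20)      # 20 backoff periods per slot
n_slots = slots_per_period(5, 20)   # T = 5 s
airtime = frame_airtime_us(127)     # largest frame
```

Under these assumptions a 20-backoff-period slot lasts 6400 µs, matching the 6.4 milliseconds stated above, a 5-second period holds 781 such slots, and a maximum-size 127-byte frame occupies 4256 µs, leaving the remainder of the slot for the CSMA-CA backoff and the acknowledgement.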

The following medium access methods were used for each topology:

a) CSMA-CA: A regular CSMA-CA channel access mechanism to analyze the network throughput under varying loads. Here, whenever a sensor node has a packet to transmit, it waits for a random amount of time and then senses if the medium is free and if so, transmits the packet. In case the medium is busy or the packet fails delivery (no acknowledgement received), the node waits for another random amount of time and then retries. The limit on the maximum transmission attempts for a single packet was set to 3.

b) Static Slot Assignment scheme with a fixed redundancy factor δ=0.05*r: The random schedule is selected at start-up with (N*r + δ) slots, as discussed in Section 2. When a sensor node has a packet to transmit, it traverses the schedule to look for the next slot marked as available for transmission. The time until that slot is calculated and a wait timer is set. The packet is transmitted in the slot using CSMA/CA. Any retransmissions follow the same procedure. The mesh forwarding parameter δ remains constant and the slot schedule is not updated during operation.

[Plot: Star Network. x-axis: Data Rate per Node (bps); y-axis: Network Throughput (Packets/Second); series: CSMA-CA, ESA, SSA, RSA, Theoretical Limit.]

Figure 5. Comparison of throughput of a Star Network with CBR traffic.




[Plots: (a) y-axis: Throughput (Number of packets / 50 seconds); curves for data rates per node of 200, 400, 800, 1600, 2400 and 3200 bps. (b) x-axis: Data rate per node (bps); y-axis: Convergence Time (sec). (c) Histogram.]

Figure 6. (a) Comparison of throughput varying with simulation time of an ESA Star network with CBR traffic with different data rates (b) Convergence times of an ESA Star network with CBR traffic at different data rates (c) Histogram of packet transmission for 400bps data rate for a Star Network with CBR traffic.

c) Random Slot Assignment: A random schedule is generated by each sensor node every network period by random re-initialization of $V_T$ to mark (N*r + δ) slots as available for transmission. Packet transmissions follow the same procedure as in part (b) above. The vector $P_T$ is not updated with the status of the transmissions and the parameter δ=0.05*r remains constant.

d) Evolutionary Slot Assignment: The network is simulated with every node using the ESA algorithm as outlined in Section 2. Vector $P_T$ is updated after every transmission attempt and vector $V_T$ is updated by tournament selection every network period. The redundancy parameter δ is updated every network period.

Star and multi-hop mesh network topologies with constant-bit-rate (CBR) traffic from the sensor nodes and a mesh topology with Poisson traffic were analyzed. All simulations were run for 5000 seconds. The throughput was measured as the net received data in the last 1000 seconds of the simulation. The effect of increasing traffic on the network throughput was studied. The convergence time was measured as the time taken to settle to 90% of the maximum stable value. In order to avoid any localized inferences, the results are averaged over three random seeds for each simulation setup.

4. Results

Three different simulation scenarios were created as discussed in the previous section: a star topology with CBR traffic, a mesh topology with CBR traffic and a mesh topology with Poisson traffic.
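The convergence-time criterion stated in Section 3 (time taken to settle to 90% of the maximum stable value) could be computed from a throughput trace roughly as follows. This is a sketch under stated assumptions: the stable value is approximated by the mean of the last few samples (`tail` is a hypothetical parameter), and "settled" is taken to mean that the throughput stays at or above the threshold for the rest of the trace.

```python
def convergence_time(times, throughput, fraction=0.9, tail=5):
    """First time after which throughput stays >= fraction of its stable value."""
    if not times or len(times) != len(throughput):
        raise ValueError("times and throughput must be non-empty and equal length")
    tail_values = throughput[-tail:]
    stable = sum(tail_values) / len(tail_values)   # assumed stable level
    threshold = fraction * stable
    settled_at = None
    for t, y in zip(times, throughput):
        if y >= threshold:
            if settled_at is None:
                settled_at = t      # candidate settling point
        else:
            settled_at = None       # dipped below: not settled yet
    return settled_at

# Hypothetical trace sampled every 50 seconds.
times = list(range(0, 500, 50))
trace = [10, 40, 95, 60, 92, 98, 100, 100, 100, 100]
t_conv = convergence_time(times, trace)
```

For this made-up trace the early spike at 100 seconds is discarded because the throughput drops again, and the function reports 200 seconds.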

Figure 5 shows the throughput of the network at varying input data rates for the different methods of medium access for a 31-node star network. It can be clearly seen that as the data rate per node increases, the throughput of the CSMA-CA method falls below 20 kbps, or ≈20% of the useful bandwidth. At their peak, the SSA and RSA methods bring the throughput up to 50 kbps, but for a large data rate of 3.2 kbps per node the network throughput is seen to fall. On the other hand, the maximum network throughput achieved by ESA is about 60 kbps, which is a 300% improvement over a CSMA-CA network. From Figure 6(a) and 6(b), it can be seen that the convergence time of ESA increases exponentially with the data rate per node.

Mesh networks do not follow the same behavior as star networks, since the nodes have to adjust their schedules to cater to intermediate traffic even though they are entirely unaware of the routes established. Figure 7(a) shows that for mesh networks CSMA-CA outperforms SSA and RSA at lower data rates; at higher data rates, however, these algorithms begin to show better performance than CSMA-CA. The ESA method outperforms all other methods, providing up to a 200% improvement in throughput compared to any other tested method. ESA provides fair channel access to all nodes of the network, as illustrated in Figure 6(c) and Figure 9(c).

5. Discussions

The very first observation is that in both star and mesh network configurations ESA provides a consistent advantage over the IEEE 802.15.4 CSMA-CA protocol and the simpler SSA and RSA methods. Results demonstrate that ESA creates a substantial (200%–300%) improvement in throughput over CSMA-CA networks while providing fair access to the medium. This increase in performance is achieved without any awareness of packet routing, without centralized scheduling or extra scheduling traffic, and without using a common time reference, beacons or knowledge of the schedules of other nodes. The performance gain of the ESA method becomes very clear for networks with high traffic, where the probability of collisions grows significantly. As the results in Figure 5, Figure 7 and Figure 8 show, the performance of the simpler SSA and RSA methods degrades significantly at higher bit rates, since neither the transmission slots nor the CSMA-CA procedure is capable of effective collision resolution under heavy loads. ESA gains this improvement by adapting the individual transmission schedule of each node so that the total number of collisions is minimized. This adaptation is based purely on the outcome of packet transmissions and is independent of neighboring node visibility, thus eliminating any need for knowledge of the network topology (e.g., the hidden terminal problem).

[Plot: Mesh Network. x-axis ticks: 200, 400, 800, 1600, 2400, 3200; y-axis: Network Throughput (Packets/Second); series: CSMA-CA, ESA, SSA, RSA, Theoretical Limit.]

Figure 7. Comparison of throughput of a Mesh Network with CBR traffic.

[Plot: Poisson Mesh Network. y-axis: Network Throughput (Packets/Second); series: CSMA-CA, ESA, SSA, RSA, Theoretical Limit.]

Figure 8. Comparison of throughput of a Mesh Network with Poisson traffic.




The second observation is that the consistent gain in bandwidth utilization of the ESA method is achieved through the adaptive, evolutionary nature of the algorithm. The random selection used by the SSA and RSA methods itself shows an improvement over CSMA-CA at higher data rates by increasing the overall randomness of the channel access. However, under heavy load conditions ESA provides almost twice the bandwidth of SSA and RSA (Figure 5 and Figure 7). Another important clue showing the advantage of evolutionary adaptation is presented in Figure 7, where the throughput of SSA and RSA for data rates below 2.4 kbps is worse than that of pure CSMA-CA. We attribute this effect to the fact that a node in a pure IEEE 802.15.4 CSMA-CA network is not bound by a schedule and has more attempts at transmission than a node using SSA or RSA. ESA makes virtually the same number of transmission attempts as SSA and RSA, but adapting the schedule allows ESA to avoid most collisions and thus provide reliable performance over a wide range of data rates.

Our third observation is that the speed of convergence for ESA (Figure 6(b) and Figure 9(b)) depends on the specific network configuration and traffic. The convergence is fastest for networks with low traffic, since there exists a multitude of possible non-conflicting transmission schedules. As the network size grows, the number of possible solutions decreases and the algorithm spends a longer time seeking a suitable set of schedules.

[Plots: (a) y-axis: Network Throughput (Number of packets / 50 seconds); curves for data rates per node of 200, 400, 800, 1600, 2400 and 3200 bps. (b) x-axis: Data rate per node (bps); y-axis: Convergence Time (sec). (c) Histogram.]

Figure 9. (a) Throughput varying with simulation time of an ESA mesh network with Poisson traffic at different data rates (b) Convergence times of an ESA Star network with CBR traffic at different data rates (c) Histogram of packet transmission for 400bps data rate for a Mesh Network with CBR traffic.

Finally, we would like to note that the ESA algorithm is computationally lightweight and can be easily implemented even on the most energy-constrained platforms. For example, the number of C code lines in the ESA scheduler is 652. Moreover, the proposed algorithm can actually reduce the power consumption of a node by reducing the number of retransmission attempts. Overall, one of the strong points of ESA is that the schedule will remain stable if all packets are delivered as needed and will self-adapt if the traffic pattern changes.

6. Conclusions

In this work, a novel Evolutionary Slot Assignment was developed and tested. In the ESA method, the sensor nodes, independently of each other, adapt their internal schedules to the traffic pattern, minimizing collisions and improving bandwidth utilization. The ESA method makes no assumptions about network topology, packet routing or traffic from neighboring nodes, does not require a centralized scheduler and does not create scheduling traffic. ESA also does not need to know the schedules of neighboring nodes and does not require a common time marker or synchronizing beacons. Simulations were performed for two topologies, a star and a mesh network, and two different traffic scenarios, constant-bit-rate traffic and Poisson traffic. Networks using CSMA-CA medium access were compared with the static slot assignment, random slot assignment and evolutionary slot assignment algorithms. Simulation results show a 200%–300% improvement in throughput for the proposed ESA algorithm in comparison to pure CSMA-CA.

7. References

[1] H. Karl and A. Willig, “A short survey of wireless sensor networks,” Telecommunication Networks Group, Technische Universitat Berlin, Hasso-Plattner Institute, Potsdam.

[2] A. El-Hoiydi, “Aloha with preamble sampling for sporadic traffic in ad hoc wireless sensor networks,” in Proceedings of IEEE International Conference Communications, pp. 3418–3423, 2002.

[3] I. Demirkol, C. Ersoy, and F. Alagöz, “MAC protocols for wireless sensor networks: A survey,” IEEE Communications Magazine, Vol. 44, No. 4, pp. 115–121, 2006.

[4] G. Bianchi, “Performance analysis of the IEEE 802.11 distributed coordination function,” IEEE Journal on Selected Areas in Communications, Vol. 18, No. 3, pp. 535–547, March 2000.

[5] J. Zheng and M. J. Lee, “A comprehensive performance study of IEEE 802.15.4,” IEEE Press Book, 2004.

[6] IEEE P802.15.4/D18, Draft Standard: Low Rate Wireless Personal Area Networks, February 2003.

[7] G. Lu, B. Krishnamachari, and C. S. Raghavendra, “Performance evaluation of the IEEE 802.15.4 MAC for low-rate low power wireless networks,” in Proceedings of IEEE International Performance Computing and Communication Conference (IPCCC’04), pp. 701–706, Phoenix, AZ, April 2004.

[8] J. Misic, S. Shafi, and V. B. Misic, “Maintaining reliability through activity management in 802.15.4 sensor clusters,” IEEE Transactions on Vehicular Technology, Vol. 55, No. 3, pp. 779–788, 2006.

[9] J. Misic, S. Shafi, and V. B. Misic, “Performance of beacon enabled IEEE 802.15.4 cluster with downlink and uplink traffic,” IEEE Transactions on Parallel and Distributed Systems, Vol. 17, No. 4, pp. 361–376, 2006.

[10] K. Varadhan, “The Ns manual,” The VINT Project, August 2000.

[11] E. Sazonov, R. Jha, K. Janoyan, V. Krishnamurthy, M. Fuchs, and K. Cross, “Wireless intelligent sensor and actuator network (WISAN): A scalable ultra-low-power platform for structural health monitoring,” SPIE Proceedings 6177, March 2006.

[12] W. Ye, J. Heidemann, and D. Estrin, “An energy-efficient MAC protocol for wireless sensor networks,” IEEE Infocom’02, Vol. 3, pp. 1567–1576, June 2002.

[13] T. van Dam and K. Langendoen, “An adaptive energy-efficient MAC protocol for wireless sensor networks,” ACM SenSys 2003, pp. 171–180, November 2003.

[14] C. S. Raghavendra and S. Singh, “PAMAS–power aware multi-access protocol with signaling for ad hoc networks,” ACM Computer Communications Review, Vol. 28, No. 3, pp. 5–26, 1998.

[15] M. J. Miller and N. H. Vaidya, “On-demand TDMA scheduling for energy conservation in sensor networks,” Technical Report, University of Illinois at Urbana Champaign, 2004.

[16] M. L. Sichitiu, “Cross-layer scheduling for power efficiency in wireless sensor networks,” IEEE Infocom’04, March 2004.

[17] V. Rajendran, K. Obraczka, and J. J. Garcia-Luna-Aceves, “Energy-efficient, collision-free medium access control for wireless sensor networks,” ACM SenSys’03, pp. 181–192, November 2003.

[18] V. Krishnamurthy and E. Sazonov, “A reservation-based protocol for monitoring applications using IEEE 802.15.4 sensor networks,” in International Journal of Sensor Networks, Vol. 4, No. 3, pp. 155–171, 2008.

[19] T. Bäck, “Evolutionary algorithms in theory and practice,” Oxford University Press, New York, 1996.

[20] T. Bäck, D. Fogel, and Z. Michalewicz, “Handbook of evolutionary computation,” Oxford University Press, 1997.

[21] S. Baluja, “Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning,” Technical Report, No. CMU-CS-94-163, Carnegie Mellon University, 1994.

[22] G. Harik, F. G. Lobo, and D. E. Goldberg, “The compact genetic algorithm,” in Proceedings of the International Conference on Evolutionary Computation (ICEC 98), pp. 523–528, 1998.

[23] H. Pang, K. Hu, and Z. Hong, “Adaptive PBIL algorithm and its application to solve scheduling problems,” IEEE International Symposium on Computer-Aided Control Systems Design, pp. 784–789, October 2006.

[24] Z. He, C. Wei, B. Jin, W. Pei, and L. Yang, “A new population-based incremental learning method for the traveling salesman problem,” in Proceedings of the 1999 Congress on Evolutionary Computation, Vol. 2, pp. –1156, 1999.

[25] S. Yang and X. Yao, “Dual population-based incremental learning for problem optimization in dynamic environments,” in M. Gen et al. (editors), Proceedings of the 7th Asia Pacific Symposium on Intelligent and Evolutionary Systems, pp. 49–56, 2003.

[26] S. Baluja and D. Simon, “Evolution-based methods for selecting point data for object localization: Applications to computer-assisted surgery,” International Journal of Applied Intelligence, Vol. 8, pp. 1–13, 1997.

[27] L. Chen and A. Petroianu, “Application of PBIL to the optimization of PSS tuning, power system technology,” in Proceedings of 1998 International Conference on Power System Technology, Vol. 2, pp. 834–838, 1998.

[28] B. L. Miller and D. E. Goldberg, “Genetic algorithms, tournament selection, and the effects of noise,” Complex Systems, pp. 193–212, June 1995.

[29] C. E. Perkins and E. M. Royer, “The ad hoc on-demand distance vector protocol,” Ad hoc Networking, Addison-Wesley, pp. 173–219, 2000.



A Scalable Architecture for Network Traffic Monitoring and Analysis Using Free Open Source Software

Olatunde ABIONA1, Temitope ALADESANMI2, Clement ONIME3, Adeniran OLUWARANTI4, Ayodeji OLUWATOPE5, Olakanmi ADEWARA6, Tricha ANJALI7, Lawrence KEHINDE8

1Department of Computer Information Systems, Indiana University Northwest, Gary, USA
2,6Information Technology and Communications Unit, Obafemi Awolowo University, Nigeria
3Information and Communication Technology Section, Abdus Salam International Centre for Theoretical Physics, Trieste, Italy
4,5Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria
7Electrical and Computer Engineering Department, Illinois Institute of Technology, Chicago, USA
8Department of Engineering Technologies, Texas Southern University, Houston, Texas, USA

Email: [email protected], 2taladesanmi,4aranti,5aoluwato,[email protected], [email protected], [email protected], [email protected]

Received February 5, 2009; revised April 8, 2009; accepted June 10, 2009

ABSTRACT

Network monitoring has been made challenging by the lack of current network dynamics studies that evaluate the effects of new application and protocol deployment, of long-term studies that observe the effect of incremental changes on the Internet, and of studies of the change in the overall stability of the Internet under various conditions and threats. A good understanding of the nature and type of network traffic is the key to solving congestion problems. In this paper we describe the architecture and implementation of a scalable network traffic monitoring and analysis system. The gigabit interface on the monitoring system was configured to capture network traffic, and the Multi Router Traffic Grapher (MRTG) and Webalizer produce graphical and detailed traffic analysis. This system is in use at the Obafemi Awolowo University, Ile-Ife, Nigeria; we describe how this system can be replicated in another environment.

Keywords: Scalable Network Monitoring, Traffic Analysis, Web Log Analysis, Open Source

1. Introduction

The rapid growth of the Internet in size, complexity and traffic types has made network management a challenging task. The ability of a monitoring system to provide accurate information about the nature and type of the network traffic cannot be overemphasized. Information about who is generating the most traffic, what protocols are in use, and where the traffic originates or is destined can be very important to solving congestion problems. Many network administrators spend a lot of time trying to find out what is degrading the performance of their network.

A typical solution to a congestion problem is to upgrade the network infrastructure, i.e. replace servers with high-end servers and increase the bandwidth. This solution is expensive, short term and does not scale. As soon as the upgrade is done, the congestion problem will improve for a while and later gradually deteriorate as the users change their behavior in response to the upgrade. The alternative solution to this problem is to deploy a scalable network traffic monitoring and analysis system, in order to understand the dynamics of the traffic, changes in the Internet and the overall stability of the network. In addition to revealing the health status of the network, monitoring of network activity also has the benefit of detecting denial of service (DoS) and bandwidth theft attacks. In order to analyze a wide range of network behaviors, it is necessary to collect network traffic on a continuous basis rather than as a one-time event, which only captures transient behaviors that provide insight into immediate network problems. Collecting long-term network traffic data will provide valuable information for improving and understanding the actual network dynamics.

The rest of the paper is organized as follows: in Section 2, we review related work, followed by the system design in Section 3. In Section 4 we describe the implementation and configuration of the system. In Section 5, we present the example application of the system and in Section 6 we conclude the paper.

2. Related Work

The libpcap [1] tool has greatly simplified the task of acquiring network packets for measurement. The limitation of the tool is its inability to analyze the captured data: it will only capture the data, and the programmer or network administrator is left to carry out the analysis manually. This task can be time consuming and cumbersome, and in most cases accurate information about the network is not obtained. Some researchers have developed modular software architectures for extensible systems [2,3]; however, only a few of these systems are optimized to handle large amounts of data and continuous monitoring. Other researchers have developed systems for streaming data through protocol layers [4] and routing functions [5], but not much attention has been given to the analysis of large or broad data sets collected over time.

Simple Network Management Protocol (SNMP) underlies a class of tools such as Multi Router Traffic Grapher (MRTG) and Cricket [6], which collect counter statistics from network infrastructure and visualize these statistics by means of graphs. The most common use for these tools is graphing the InOctet and OutOctet counters on router interfaces, which respectively provide counts of the number of bytes passing in and out of the interface. Many other tools also support SNMP monitoring [7]. This includes monitoring performed by Remote Monitoring (RMON) agents, which can be useful in determining the top hosts with regard to traffic, as well as the distribution of packet sizes on a network. SNMP can only be used to monitor devices that are SNMP managed, and the reading and writing of the InOctet and OutOctet counters in a router could generate substantial traffic. Tcpdump [8] prints out the headers of packets on a network interface that match a given boolean expression. It is a network sniffer with in-built filtering capabilities; it can only collect data from the network and does not analyze the collected data. The collected data can be analyzed offline with other utilities, namely tcpshow and tcptrace. As useful and powerful as tcpdump is, it is only suitable for troubleshooting, i.e., for tracking network and protocol related connectivity problems.
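To make the counter-based graphing described above concrete, the following Python sketch shows how two successive readings of an octet counter could be turned into an average byte rate. It is an illustration only and not code from MRTG or Cricket; the function name octet_rate, the 32-bit counter width and the sample readings are assumptions chosen for the example.

```python
COUNTER32_MODULUS = 2 ** 32  # assumed counter width: a 32-bit counter wraps to zero here


def octet_rate(previous: int, current: int, interval_s: float,
               modulus: int = COUNTER32_MODULUS) -> float:
    """Return the average byte rate between two counter readings.

    The modulo handles a single counter wrap between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (current - previous) % modulus
    return delta / interval_s


# Two made-up readings taken 300 s (five minutes) apart.
print(octet_rate(1_000_000, 19_090_000, 300))            # 60300.0 bytes per second
# A reading pair that straddles a counter wrap.
print(octet_rate(COUNTER32_MODULUS - 1000, 2000, 300))   # 10.0 bytes per second
```

Taking the difference modulo the counter size is one common way to survive a single wrap; it cannot detect two or more wraps within one polling interval, which is one reason short polling intervals are preferred on fast links.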

MRTG is a versatile tool for graphing network data [9]; this tool can run on a Web server. Every five minutes, it reads the inbound and outbound octet counters of the gateway router, and then logs the data to generate graphs for web pages. These graphs can be viewed using a web browser. Although MRTG gives a graphical overview, it does not give details about the host and protocol responsible for the traffic monitored. Windmill [10] is a modular system for monitoring network protocol events; it is useful for acquiring data from the network, but it is limited in that it provides no facility to aid in the analysis of the protocol or non-protocol events acquired. WebTrafMon [11] uses a probe to extract data from network packets and compose log files. Analysis results are based on the collected log files. Furthermore, the user is able to view the analysis results via a generic web browser. WebTrafMon can show traffic information according to the source and destination host through any web interface; it can also show the traffic status according to each protocol in use. Although WebTrafMon has good capabilities, it cannot monitor and analyze traffic in a switched network such as Fast Ethernet and Gigabit Ethernet.

Continuous query systems such as [12] and [13] share many of the concerns of other systems in acquiring and filtering continuous streams of data; these systems, however, lack the ability to easily add new functions over the data, hence they are not extensible. The agile and scalable analysis of network events [14] is a system based on modular analysis and continuous queries, allowing users to assemble modules into an efficient system for analyzing multiple types of streaming data. The system is optimized for analyzing and filtering large streams of data, and makes extensive use of polymorphic components that can perform common functions on new and unforeseen types of data without requiring any additional programming. This data analysis system provides a scalable, flexible system for composing ad-hoc analyses of high speed streaming data but does not provide infrastructure for data gathering and network monitoring.

Ntop [15] provides a display similar to the UNIX top command, but for network traffic; it can be used for traffic measurement and monitoring. Features such as the embedded HTTP server, support for various network media types, light CPU utilization, portability across various platforms, and storage of traffic information in an SQL database make ntop versatile. However, ntop is limited by its high memory requirements when operating in a continuous monitoring environment. Its extensive cache usage has the drawback that memory usage is increased [16]. This makes ntop a memory and computationally intensive application. Like several of the other tools, ntop uses the same packet capture library to obtain the network data. Since ntop operates in continuous mode, it is designed to operate on networks with speeds of less than 100 Mbps [17]. A continuous network tracing infrastructure was described in [18]; it is multi-user based and capable of collecting, archiving and analyzing captured network data. The system is limited by not providing a universal web interface to display the results of its monitoring. The system requires huge storage (11TB of shared disk space for the data repository) since it is not web based, and there is also high overhead since the storage system is accessed through an NFS server over an Ethernet link.

To understand network dynamics, and to analyze a wide range of network behaviors, network data needs to be collected over a long period of time, rather than as a one-time event to capture transient behaviors that provide insight into immediate network problems. In this paper we describe the development and implementation of a scalable, passive and continuous network monitoring system capable of collecting, archiving and analyzing network traffic.

3. System Design

3.1. The Obafemi Awolowo University Network

The Obafemi Awolowo University Network (OAUnet) was established in the Obafemi Awolowo University, Ile-Ife, Nigeria and began operation in June 1996. OAUnet was an initiative of the Academic Computer Networking Project of the International Centre for Theoretical Physics (ICTP) (now the Abdus Salam International Centre for Theoretical Physics in Trieste, Italy) for developing countries. It is a campus-wide intra-academic network with a gateway to the Internet. In the last decade of its existence, OAUnet has grown to be one of the largest intra-academic networks among the Nigerian universities. The development of the OAUnet from a network of three workstations with an e-mail gateway to the Internet to a network of about one thousand nodes serving a staff and student population of close to 30,000 can be divided into four distinct phases over a period of ten years.

3.1.1. Present Status (2005 Till Date)

OAUnet currently connects to the Internet with 6Mbps downlink and 1.5Mbps uplink bandwidth. The Intranet is made up of four radio connections running at 3Mbps to the University's collaborating research centres and 9 student information centres/cafes located in or around the student halls of residence. The main academic subnet comprises two colleges and thirteen faculties linked together by over 2Gbps fibre connections. The network serves a population of close to 26,500 students and 3,000 academic, administrative and technical staff.
The network uses the TCP/IP protocol, and all the servers, namely web, mail, proxy, firewall, authentication, DNS and core routers, run on the Linux operating system (a mixture of Redhat and Mandriva). The network topology is hybrid, a sort of hierarchical star; a logical layout of the network is shown in Figure 1. Recently, wifi/hotspot service has been deployed in the university to extend network access to places which hitherto had no network presence, such as the staff quarters.

Figure 1. A logical layout of the OAUnet.

Figure 2. OAUnet internet link.

3.2. The Network Monitoring System

The scalable network monitoring and analysis tool comprises hardware and software. The hardware component includes a Pentium 4 central processing unit, a gigabit switch, and a gigabit network interface card. The software components are free open source programs, namely the Linux operating system, the IP traffic monitoring program (IPTraf), MRTG, Webalizer and Perl scripts.

Figure 3. Scalable traffic monitoring setup.

Figure 4. Scalable traffic monitoring architecture.

3.2.1. Hardware Setup

The Obafemi Awolowo University network monitoring experiment was conducted with the aim of monitoring a live network without adversely impacting its performance, and also to identify and monitor traffic patterns (both speed and volume) on the basis of host (IP address), protocol and time of the day. For this experiment we used open source software for reasons of cost and ease of modification. This work builds on an earlier work in [19]. On a high-speed network, the number of packets transmitted per second can be astronomical. Analyzing such data requires a great deal of processing time. Thus the efficiency of the packet capturing code is essential [11]. As a matter of fact, ensuring that all packets are captured is extremely challenging [20]. The original design of the LAN to Internet connection is shown in Figure 2. There are two separate networks sharing the Internet link (cyber cafes and academic network). The academic network comprises proxy 1, the web server and the firewall, while the nine (9) cyber cafes use proxy 2 to access the Internet. For the academic network, proxy 1 provides web/ftp services, the firewall provides DNS/e-mail and other non-web services, and the web server provides intranet facilities and also hosts the university web site. All were connected via a fast Ethernet switch. A major constraint was that this design could not be changed or modified, and it was not possible to install any application on the servers under the agreement reached with the network/system administrators for conducting this experiment.

The proposal was to set up a single monitoring station that would monitor all the traffic. From Figure 2, the network has four (4) key servers transmitting and receiving signals at 100Mbps. In order to effectively capture all the packets from all four (4) servers, whose combined traffic can reach 400Mbps in each direction, the monitoring station must operate at 1Gbps. In Figure 3, the fast Ethernet switch of Figure 2 was replaced with a switch having two of its ports running at 1Gbps and also having a port mirroring facility. The port mirroring facility allows traffic from several ports to be copied to one of the gigabit ports, where the monitoring station is attached. This implies that all traffic arriving at the ports for the four (4) OAUnet servers was also copied/mirrored to the gigabit port.

3.2.2. Software Setup

Figure 4 shows the architecture of the scalable network traffic monitoring system. The monitoring system runs on the Linux operating system. Monitoring was done using the open source IPTraf software [21]. A wrapper script was used to start the IPTraf program with suitable command line arguments, to sort the output of IPTraf based on IP address and protocol, and to create suitable inputs for MRTG [22] and Webalizer [23]. After starting up IPTraf and initializing some variables, the Perl wrapper script is responsible for processing the summary information from IPTraf into suitable log formats for both MRTG and Webalizer. The software is kept deliberately simple so that this processing is very fast. In order to maintain system stability and minimize memory leaks, the software restarts every 30 minutes. Other scripts used are run_mrtg, mrtg_reader, run_webalizer and webalizer_caller.
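The summarizing step described above, in which traffic is grouped by IP address and by protocol, might look roughly like the following Python sketch. The original wrapper is a Perl script whose IPTraf log parsing is not reproduced in the paper, so the record layout used here (host, protocol, byte count) and the function name summarize are hypothetical, and the sample values are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical parsed form of one traffic summary line: (host IP, protocol, bytes).
Record = Tuple[str, str, int]


def summarize(records: Iterable[Record]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Total the byte counts per host and per protocol."""
    per_host: Dict[str, int] = defaultdict(int)
    per_protocol: Dict[str, int] = defaultdict(int)
    for host, protocol, nbytes in records:
        per_host[host] += nbytes
        per_protocol[protocol] += nbytes
    return dict(per_host), dict(per_protocol)


# Made-up sample records, not measurements from OAUnet.
sample = [
    ("29.20.23.1", "http", 1500),
    ("29.20.23.2", "smtp", 400),
    ("29.20.23.1", "domain", 100),
    ("29.20.23.2", "http", 600),
]
hosts, protocols = summarize(sample)
print(hosts)      # {'29.20.23.1': 1600, '29.20.23.2': 1000}
print(protocols)  # {'http': 2100, 'smtp': 400, 'domain': 100}
```

Keeping two flat dictionaries keeps the per-record work constant, which is in the spirit of the stated goal of fast, uncomplicated processing.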

A copy of the scripts used may be downloaded from the URL http://www.ictp.it/~abionao. The scalable traffic monitoring and analysis system also runs a cron job that copies the proxy logs from the proxy servers on the network every six hours. The log files were analyzed using a Webalizer configuration file; the result of the proxy log analysis is available through a web interface.

4. Implementation and Configuration

4.1. Implementation

A simple Perl script, run_mrtg, was used to run MRTG. It checks whether an MRTG configuration file and the right directories exist for the monitored host or protocol, and creates them if they do not exist. This script was run via a suitable cron job entry. A second script, mrtg_reader, was used to further adapt and present the output of the IPTraf wrapper in a format suitable for MRTG, since MRTG requires its input to be in log format. Figures 5 and 6 show the flow charts for the wrapper script and run_mrtg.
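The check-or-create behaviour attributed to run_mrtg above can be sketched in Python as follows. This is not the authors' Perl script: the function name ensure_target, the one-file-per-target naming scheme and the single placeholder line written to a new configuration file are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def ensure_target(name: str, config_dir: Path, output_dir: Path) -> Path:
    """Make sure a per-target output directory and configuration file exist.

    Existing files are left untouched; missing ones are created.
    """
    target_out = output_dir / name
    target_out.mkdir(parents=True, exist_ok=True)
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file = config_dir / f"{name}.cfg"  # hypothetical naming scheme
    if not config_file.exists():
        # Placeholder content only; a usable MRTG configuration needs more directives.
        config_file.write_text(f"WorkDir: {target_out}\n")
    return config_file


# Demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    cfg = ensure_target("http", Path(root) / "configs", Path(root) / "html")
    print(cfg.name)  # http.cfg
```

Because the function only creates what is missing, it can be called from a periodic job without overwriting configuration that an administrator has edited by hand.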

Another Perl script, run_webalizer, was used to check for (or create) a suitable configuration file for Webalizer and also to check that the right output directories exist and are accessible. This was also called from a cron job entry. The hosts and protocols to monitor were specified through a central text file that holds a list of IP addresses and protocol names. Figure 7 shows the flow chart for run_webalizer. The scalable network monitoring system is capable of monitoring the Intranet and the Internet link simultaneously; all that is required to achieve this is an additional network interface card installed in the monitoring system.
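As an illustration of the central text file mentioned above, the Python sketch below separates a list of entries into IP addresses and protocol names. The exact layout of the authors' file is not specified beyond its containing both kinds of entry, so whitespace-separated tokens are assumed here, and the function name parse_watch_list is hypothetical.

```python
import ipaddress
from typing import List, Tuple


def parse_watch_list(text: str) -> Tuple[List[str], List[str]]:
    """Split whitespace-separated entries into (hosts, protocols).

    A token that parses as an IP address is treated as a host;
    anything else is treated as a protocol name.
    """
    hosts: List[str] = []
    protocols: List[str] = []
    for token in text.split():
        try:
            ipaddress.ip_address(token)
        except ValueError:
            protocols.append(token)
        else:
            hosts.append(token)
    return hosts, protocols


sample = "29.20.23.1\n29.20.23.2\n29.20.23.8\nhttp\ndomain\nftp\nhttps\nsmtp\n"
hosts, protocols = parse_watch_list(sample)
print(hosts)      # ['29.20.23.1', '29.20.23.2', '29.20.23.8']
print(protocols)  # ['http', 'domain', 'ftp', 'https', 'smtp']
```

Classifying by content rather than by position means hosts and protocols can be listed in any order in the file.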

Figure 5. Wrapper script.

Figure 6. Run_mrtg script.

Figure 7. Run_Webalizer script.

4.2. Installation and Configuration

Users are advised to choose names that are convenient for their network setup, as is standard practice. In this write-up the settings used are specific to our case study but can also be replicated. A copy of the Perl scripts may be downloaded from the URL http://users.ictp.it/~abonao/ and copied to the directory oaunetmon in /usr/local. Run the dos2unix command on each of the Perl scripts to convert them to UNIX format. Since the monitoring tool will process a large number of packets, it is necessary to create a file containing the list of hosts and protocols of interest on the network. Using a text editor, type the IP addresses and protocols of interest in the file OAUNETMON in /etc (vi /etc/OAUNETMON); sample content of the file is shown below.

Sample entry for OAUNETMON

29.20.23.1
29.20.23.2
29.20.23.8
http
domain
ftp
https
smtp

Make a cron job entry in /etc/cron.d for each of the following files: Asoju_proxy, Café_proxy, oaunetmon_mrtg and oaunetmon_webalizer. This enables the programs to run automatically on schedule, including after the system boots up or is restarted. Below are sample entries in the files. Users are advised to use valid IP addresses and the correct paths to the log files.

Sample Cron job entry

Asoju_proxy

0 20,22,4,10,16 * * * root /etc/asoju_squid_logs

Café_proxy

0 20,22,4,10,16 * * * root /etc/cafe_squid_logs

oaunetmon_mrtg

*/5 * * * * root /usr/local/oaunetmon/mrtg_reader.pl /tmon/IPTRAF/mrtg/global/ /tmon/IPTRAF/mrtg/
*/15 * * * * root /usr/local/oaunetmon/run_mrtg.pl /tmon/IPTRAF/mrtg /logf/www/html/oaunetmon/mrtg /logf/CONFIGS/oaunetmon/mrtg/

oaunetmon_webalizer

5 */4 * * * root /usr/local/oaunetmon/run_webalizer.pl /tmon/IPTRAF/Webalizer/ /logf/www/html/oaunetmon/webalizer /logf/CONFIGS/oaunetmon/webalizer/

Before the Perl wrapper script (iptraf.pl) is started, the monitoring interface should be placed in promiscuous mode. In this mode, the monitoring interface sees all packets flowing in the network, which enables it to monitor all packets. The traffic monitoring program is then started automatically at boot. The file /etc/rc.d/rc.local will have the following content:

Sample entry for rc.local

#!/bin/sh
/sbin/ifconfig eth0 promisc
/usr/local/oaunetmon/iptraf.pl /tmon/IPTRAF//tmon/ iptraf -i qll &
touch /var/lock/subsys/local
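To see what the hour field in the proxy log cron entries above means in practice, the following Python sketch lists the run hours in order together with the gap to the next run. It is a reading aid only: the helper name run_gaps is hypothetical, and it handles only a plain comma-separated hour list, not the full cron syntax (ranges or step values such as */4).

```python
from typing import List, Tuple


def run_gaps(hour_field: str) -> List[Tuple[int, int]]:
    """Return (hour, hours until the next run) for a comma-separated cron hour field."""
    hours = sorted(int(h) for h in hour_field.split(","))
    following = hours[1:] + hours[:1]  # next run for each hour, wrapping past midnight
    return [(h, (nxt - h) % 24) for h, nxt in zip(hours, following)]


print(run_gaps("20,22,4,10,16"))
# [(4, 6), (10, 6), (16, 4), (20, 2), (22, 6)]
```

With this hour field the copy job runs five times a day, mostly six hours apart, with shorter gaps in the evening.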

The scalable monitoring system is also capable of analyzing proxy server log files. Log files were copied from the two proxy servers every six hours; the access logs were analyzed using a Webalizer configuration file, and the results of the analysis can be viewed in a web browser. The results of traffic analysis combined with the proxy log analysis are useful for optimizing bandwidth usage. Below are sample entries for the scripts used to copy the proxy logs (asoju_squid_logs and café_squid_logs); the files reside in the /etc directory and are called by a cron job: /etc/asoju_squid_logs and /etc/café_squid_logs.

Asoju_squid_logs Script

#!/bin/sh
PATH=$PATH:/usr/bin/
export PATH
# Copy the Squid access log from the proxy server over SSH.
/usr/bin/rsync -a -e ssh root@192.168.0.5:/var/log/squid/access.log /logf/squid_log/
# Stop if the copy failed.
[ $? -gt 0 ] && exit 0
# Stop if the copied log is missing or empty.
[ -s /logf/squid_log/access.log ] || exit 0
# Analyze the log using the given Webalizer configuration file.
/usr/bin/webalizer -c /etc/squid_webalizer.conf > /dev/null 2>&1

café_squid_logs Script

#!/bin/sh
PATH=$PATH:/usr/bin/
export PATH
# Same steps as above, for the cafe proxy server.
/usr/bin/rsync -a -e ssh root@192.168.0.6:/var/log/squid/access.log /logf/cafe_log/
[ $? -gt 0 ] && exit 0
[ -s /logf/cafe_log/access.log ] || exit 0
/usr/bin/webalizer -c /etc/cafe_webalizer.conf > /dev/null 2>&1

5. Example Application of System

5.1. External Link Monitoring

Below are some outputs generated by the scalable network traffic monitoring and analysis system. Two types of output were generated: a graphical overview and detailed statistics. MRTG generates the graphical overview while Webalizer produces the detailed statistics and proxy log analysis.

5.1.1. Sample MRTG Graphs

The MRTG graph is influenced by the socio-economic activities on the university campus. Academic activities start at 6.00am with some lectures and probably students trying to finish their assignments via online research. There is a pronounced increase at 8.00am when most lecturers arrive in the offices along with non-academic staff; by 10.00am the campus is fully active and usage remains constant until 12.00 noon, when there is a dip as people leave their offices on lunch break or to pick up children from school. There is a steady rise in traffic by 2.00pm, when the break is over and users are back in their offices, and then a steady drop when users finally leave at about 6.00pm. Some academic staff and students typically return at 8.00pm and work till midnight, while some students continue to work until 2.00am. Network usage is at its lowest at 4.00am and starts its cycle from 6.00am once again.

Note: the pattern on Wednesday 28 April 2004 indicates it is a holiday (“workers remembrance and commemoration day”), since activity started at 10.00am and continued until 4.00pm, when users leave offices for social activities. The weekly graph correlates strongly with the socio-economic influence discussed above, except for Saturday and Sunday: there is more pronounced late night activity on Saturday and a complete lack of afternoon and night time activity on Sundays.

Figure 8 shows a sample global MRTG graph. The average daily bandwidth usage was 60.3 kBps downlink (green line) and 28.8 kBps uplink (blue line).

Figure 8. Global usage statistics.

The graph above shows Internet traffic for an academic institution. Notice the relatively constant demand for network resources over time, as indicated by the horizontal pattern of the green, average-rate line. The green line also indicates that average utilization is about 60 kBps. This could be either a single user listening to a 60 kBps stream, or multiple users listening to slower streams that total 60 kBps.

The graph’s peaks indicate some burstiness in the traffic that could periodically impact other applications' performance. However, given that this is a one-week graph, bursting even once in an hour could cause the peak rate graph to go up. These peaks may just reflect heavy downloads or streaming by some users on the network.

5.1.2. Sample Webalizer Graphs

The Webalizer graph shows the hourly average for the entire month of May 2006. The graph correlates with the discussion of the MRTG graph with respect to the socio-economic activities, that is, a gradual increase in traffic starting at 8.00am with a dip at 12.00pm and a dip at 4.00pm, and a gradual decline until 8.00pm when there is a new rise, with a slight decline after midnight.

Unfortunately, since this is an average over a month, the peaks and troughs are less pronounced. Generally this graph is consistent with the expected usage pattern and indicates a constant usage on average over a 20 hour period. Figure 9 shows a section of the generated output for hourly usage for May 2006.

Figure 9. Global usage statistics.

Other techniques, including the hardware-based network monitoring station (NMS) by CISCO, exist for the monitoring and analysis of Internet traffic; however, the technique described here offers two interesting advantages:

1) Dual independent systems (MRTG and Webalizer) make correlation and deviation easier to detect, and the shortcomings of either method are overcome by the combination.

2) The ability to focus on or ignore noise traffic, i.e. HTTP traffic could be monitored alone or in relation to the global traffic, with or without HTTPS traffic.

5.1.3. Proxy Log Analysis

The scalable network traffic monitoring and analysis system is also capable of proxy log analysis. This was achieved by copying log files from the proxy servers at regular time intervals and then running Webalizer with a configuration file to analyze the proxy logs collected. The log files were collected every four hours from the proxy server and the analysis carried out. When a new analysis is completed, the graph is only updated; this reduces the amount of storage required to keep historical data. Figure 10 shows the protocol usage statistics.

Figure 10. Protocol usage statistics.

Suppose a client using a proxy makes requests r1, r2, ..., rn to pages, where a page has F objects, out of which C can be obtained from the cache and W from the origin server. The total number of requests R will be:

$$R = \sum_{i=1}^{n} r_i$$

But not all requests will bring back data. Hence, all requests that will result in data transfer will be:

$$F = \sum_{i=1}^{n} W_i + \sum_{i=1}^{n} C_i$$

So we can compute the Document Hit Ratio (DHR) and Byte Hit Ratio (BHR) as:

$$\mathrm{DHR} = \frac{\sum_{i=1}^{n} C_i}{\sum_{i=1}^{n} W_i + \sum_{i=1}^{n} C_i}, \qquad \mathrm{BHR} = \frac{\text{Cachebyte}}{\text{Totalbyte}}$$

Cachebyte = the number of bytes transferred from the cache. Totalbyte = the total number of bytes transferred.

From the Squid Proxy Usage Statistics:

Total hits = 59,035,756 (total requests)
Total Files = 15,483,674 (internet content)
Difference = 43,552,082 (cached content)
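As a quick check of the figures listed above, the Python sketch below applies the document hit ratio formula to the reported totals, treating the difference between total hits and total files as the cached objects, which is the interpretation the text gives. The function names are hypothetical, and no byte counts are reported in this section, so the byte hit ratio helper is shown without data.

```python
def document_hit_ratio(cached: int, from_origin: int) -> float:
    """DHR = objects served from the cache / all objects transferred."""
    total = cached + from_origin
    if total == 0:
        raise ValueError("no objects transferred")
    return cached / total


def byte_hit_ratio(cache_bytes: int, total_bytes: int) -> float:
    """BHR = bytes served from the cache / total bytes transferred."""
    if total_bytes == 0:
        raise ValueError("no bytes transferred")
    return cache_bytes / total_bytes


total_hits = 59_035_756   # total requests, from the usage statistics
total_files = 15_483_674  # objects fetched from the Internet
cached = total_hits - total_files
print(cached)                                                    # 43552082
print(round(document_hit_ratio(cached, total_files) * 100, 1))  # 73.8
```

The result is only as good as the assumption that every hit not counted as a file was served from the cache.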



Figure 11. Hosts usage statistics.

The difference is a measure of cached content. Hence the cache hit rate is about 73.8% (43,552,082 out of 59,035,756 requests). Other useful information about the network, such as the number of users, static content, internal server errors, total kilobytes transferred, number of web servers, top hosts generating traffic, top destinations accessed and top users generating traffic, can be obtained from the proxy log usage statistics. The cache hit rate calculated is an estimate; a more accurate value can be obtained by combining the data from the hits by response code and the monthly statistics. All the information provided by the monitoring system could be used by the system administrator for shaping the network traffic, improving network performance and predicting future traffic trends.

5.2. Internal Network Monitoring

5.2.1. Sample MRTG Graphs

We have described how the scalable network traffic monitoring and analysis system can be used to monitor the external link; it can also be used to monitor the internal network. The type of output required and the environment in which the system is deployed determine the setup and configuration used by the system. The system is flexible and extensible and can adapt to varying traffic load conditions. For example, for a host on the internal network we are interested in traffic patterns and in host and protocol related activity, but not in proxy log analysis; this implies that different configurations are used for different setups of the system. Figure 11 shows the MRTG host graph for the internal network. The flat line indicates power outages and periods when the system was down. The graph shows high network traffic between 6.00pm and 9.00pm and low network traffic between 8.00am and 5.00pm. This is the pattern of traffic observed on Sundays.

5.2.2. Sample Webalizer Graphs

Webalizer gives detailed traffic statistics for the network; from the Webalizer graphs we can see the global usage statistics, hourly usage statistics, hosts usage statistics and protocol usage statistics. Figure 12 shows the various historic Webalizer usage statistics.

The Protocol Usage Reports section includes reports that show bandwidth usage based on all the protocol groups generating traffic through the device. The Protocol Trend Reports section includes reports that show trends in the amount of traffic generated by the different protocol groups. Protocol trends help in identifying peak usage times for each protocol group, understanding user trends, and enforcing better policies to allow traffic from each protocol group. The system is capable of keeping historic data for several months; this makes it suitable for generating trend reports.
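The idea of identifying peak usage times per protocol group, mentioned above, can be sketched in Python as follows. This is not how Webalizer computes its reports; the function name peak_hours and the data layout (one list of per-hour totals per protocol) are assumptions, and the sample numbers are made up rather than taken from the OAUnet measurements.

```python
from typing import Dict, List


def peak_hours(hourly: Dict[str, List[int]]) -> Dict[str, int]:
    """Return, for each protocol group, the index of the hour with the largest total.

    Ties resolve to the earliest hour; protocols with no data are skipped.
    """
    return {
        protocol: max(range(len(counts)), key=counts.__getitem__)
        for protocol, counts in hourly.items()
        if counts
    }


# Made-up per-hour byte totals for a four-hour window.
sample = {
    "http": [5, 9, 7, 2],
    "smtp": [1, 1, 4, 3],
}
print(peak_hours(sample))  # {'http': 1, 'smtp': 2}
```

Run over several months of stored hourly totals, the same comparison would show whether a protocol's peak hour is drifting over time.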

5.3. Replicating the System in Another Environment

The system can be set up in two simple steps. First install any distribution of the Linux operating system, which is freely available on the Internet, and then follow Steps 1 and 2 to set up the hardware and software.

1) Hardware setup as in Subsection 3.2.1.

2) Software setup as in Subsection 3.2.2.

The software used to set up the scalable network traffic monitoring and analysis system and all configuration files are available for download on the author’s website; the free open source programs are available at the developers’ websites, while all the Perl scripts developed are available for download at http://users.ictp.it/~abionao/. We welcome feedback and contributions from the users of the software.

Figure 12. Global usage statistics.

6. Conclusions

In this paper we have presented the development of a scalable network traffic monitoring and analysis system. The system is capable of monitoring and analyzing network traffic for both the Intranet and the Internet. The design employs free open source programs such as IPTraf, MRTG and Webalizer, and scripts written in the Perl language. The system monitors network traffic passively, which implies that it is non-intrusive, and continuously, using a lossy data storage technique to store the data as web pages. Graphical overviews of the traffic monitored were presented using MRTG, while detailed analyses of the traffic and proxy log analysis were presented using Webalizer. A user can view the results of the monitoring system using a web browser. We have described the hardware and software required to set up the system and how it can be replicated on any network. Using the system we were able to monitor the internal and external network; the scalability of the system makes it attractive to both researchers and network managers. The output graphs generated by the system provide details of the network dynamics and insight into problems that could lead to congestion and poor network performance. Future work will focus on an intrusion detection monitoring module and preemptive intrusion control.

7. Acknowledgements

The authors would like to express their gratitude to the Office of External Activities (OEA) of the Abdus Salam International Centre for Theoretical Physics, Trieste, Italy, for funding the research through the donation of all the equipment used to conduct the study.

8. References

[1] V. Jacobson, C. Leres, and S. McCanne, “Libpcap,” 1994, http://www-nrg.ee.lbl.gov/

[2] D. Paraas, “On the criteria to be used in decomposing systems modules,” Communications of the ACM, Vol. 14, No. 1, pp. 221–227, 1972.

[3] A. Reid, M. Flatt, L. Stroller, J. Lepreau, and E. Eide, “Knit: Component composition for systems software,” in Proceedings of the 4th Symposium on Operating Systems Design and Implementation, pp. 347–360, October 2000.

[4] N.C. Hutchinson and L. L. Peterson, “The X-Kernel: An architecture for implementing network protocols,” IEEE Transactions on Software Engineering, Vol. 17, No. 1, pp. 64–76, 1991.

[5] E. Kohler, R. Morris, B. Chen, J. Jannotti, and M. Frans Kaashoek, “The click modular router,” ACM Transactions on Computer Systems, Vol. 18, No. 3, pp. 263–297, August 2000.

[6] J. Allen, Cricket homepage, 2000, http://cricket.sourceforge.net.

[7] J. D. Case, M. Fedor, M. L. Schoffstall, and C. Davin, Simple Network Management Protocol (SNMP), May 1990, http://www.faqs.org/rfcs/rfc1157.html.

[8] V. Jacobson, C. Leres, and S. McCanne, “Tcpdump-the protocol packet capture and dumper program,” http://www.tcpdump.org.

[9] T. Oetiker, “Monitoring your IT gear: The MRTG story,” IEEE IT Professional, Vol. 3, No. 6, pp. 44–48, December 2001.

[10] G. Robert Malan and Farnam Jahanian, “An extensible probe for network protocol performance measurement,” in Proceedings SIGCOMM’98, pp. 215–227, September 1998.

[11] J. Hong, S. Kwon, and J. Kim, “WebTrafMon: Web-based Internet/intranet network traffic monitoring and analysis system,” Elsevier Computer Communications, Vol. 22, No. 14, pp. 1333–1342, September 1999.

[12] J. J. Chen, D. J. DeWitt, F. Tian, and Y. Wang, “A scalable continuous query system for internet databases,” Proceedings of ACM SIGMOD’00, pp. 379–390, May 2000.

[13] S. Madden, M. Shah, J. M. Hellerstein, and V. Raman, “Continuously adaptive continuous queries over streams,” Proceedings of ACM SIGMOD 2002, pp. 49–60, June 2002.

[14] M. Fisk and G. Varghese, “Agile and scalable analysis of network events,” in Proceedings of 2nd ACM SIGCOMM Workshop on Internet Measurement IMW’02, pp. 285–290, November 2002.

[15] L. Deri and S. Suin, “Effective traffic measurement using ntop,” IEEE Communications Magazine, Vol. 38, No. 5, pp. 138–143, May 2000.

[16] L. Deri, R. Carbone, and S. Suin, “Monitoring networks using ntop,” Proceedings of IEEE/IFIP International Symposium on Integrated Network Management, pp. 199–212, May 2001.

[17] L. Deri and S. Suin, “Practical network security experiences with ntop,” Computer Networks, Vol. 34, pp. 873–880, 2000.

[18] A. Hussain, G. Bartlett, Y. Pryadkin, J. Heidemann, C. Papadopoulos, and J. Bannister, “Experiences with a continuous network tracing infrastructure,” in Proceedings of ACM SIGCOMM Workshop on Mining Network Data, pp. 185–190, August 2005.

[19] O. O. Abiona, C. E. Onime, A. I. Oluwaranti, E. R. Adagunodo, L. O. Kehinde, and S. M. Radicella, “Development of a non intrusive network traffic monitoring and analysis system,” African Journal of Science and Technology (AJST) Science and Engineering series, Vol. 7, No. 2, pp. 54–69, December 2006.

[20] G. R. Wright and W. R. Stevens, “TCP/IP illustrated,” Vol. 2, Addison-Wesley, Reading, MA, 1994.

[21] G. P. Java, IPTraf : http://iptraf.seul.org/ 2001.

[22] T. Oetiker and D. Rand, “MRTG: Multi router traffic grapher,” http://tobi.oetiker.ch/ 2008.

[23] B. L. Barrett, Webalizer home page, http://www.mrunix.net/webalizer/ 2008.

[24] J. Vass, J. Harwell, H. Bharadvaj, and A. Joshi, “The world wide web: Everything you (n)ever wanted to know about its servers,” IEEE Potentials, pp. 33–34, October/November 1998.



Positioning a Node of Wireless Sensor Networks in 3 Dimensional Space

Wenhui NIE, Shiguang JU, Anrong XUE, Feng LI School of Computer Science & Telecommunication Engineering, Jiangsu University, Zhenjiang, China

Email: {whnie, jushig, xuear, fengli}@ujs.edu.cn

Received May 8, 2009; revised July 1, 2009; accepted August 12, 2009

ABSTRACT

Knowing the location of nodes is important and valuable for wireless sensor networks (WSN). We present an improved positioning model (3D-PMWSN) to locate the nodes in a WSN. In this model, the space is divided into a grid of cells. When a tag is detected by a reader whose position is known, the tag’s position can be determined by the proposed algorithm. An error estimate is given. Emulation shows that positioning is relatively fast and relatively precise.

Keywords: Wireless Sensor Network, Cells, Reader, Tag, Positioning

1. Introduction

In most applications of Ad Hoc networks, a node or sensor is often of little use if its location is not known [1]. It is therefore very important to locate a node in an Ad Hoc network using a suitable algorithm [2]. The nodes of Ad Hoc networks are relatively simple and are powered by small batteries, so all distributed algorithms, including localization algorithms, must save energy in order to extend battery life. As a result, special attention must be paid to the validity and efficiency of node positioning algorithms [3].

The main localization algorithms for Ad Hoc networks are:

1) Slobodan N. Simic and Shankar Sastry first proposed the Bounding Box algorithm [4], which uses a grid in 2-dimensional space. Jongchul Song of the University of Texas has successfully used this algorithm to track the location of materials on construction projects [5].

2) WEI Xing proposed a new tracking filter algorithm, P-EKP [2], which uses a particle filter to capture the emitter and improve the tracking precision for highly non-linear single-observer passive location and tracking systems.

3) NI Wei introduced a relative indoor location and tracking algorithm based on received power [6], which can provide high positioning accuracy.

4) ZHANG Ling-wen proposed an optimizing algorithm, HOA [7], which combines the Taylor series expansion method with the steepest descent method [8].

5) Other algorithms include the DV-hop algorithm proposed by Niculescu, Euclidean, and Robust Position. All of these use the distance between two neighboring nodes and the number of hops to locate a node.

Nevertheless, these systems are mainly designed for 2D environments. In this paper we address node localization in 3D space, propose a 3D-based Positioning Model and Algorithm in Ad Hoc networks (3D-PMAHN), and emulate it.

2. 3D-PMAHN Model

2.1. The 3D-Space Dividing Method in 3D-PMAHN Model

First, we set up a cubic region Q in space, called the operation region (see Figure 1), and divide each side of Q into n equal parts. Accordingly, the cube Q is divided into $n^3$ small cubes called space cells. Each cell has a unique cell coordinate (x,y,z), with 1 ≤ x ≤ n, 1 ≤ y ≤ n, 1 ≤ z ≤ n. We define $n^3$ to be the positioning resolution of Q. Obviously, when Q is fixed, the bigger n is, the higher the positioning resolution will be [4].
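The division of Q into cells can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `cell_coordinate`, the treatment of Q as [0, s] on each axis, and the rule that a point on the far face belongs to the last cell are all assumptions made for the example.

```python
import math

def cell_coordinate(point, s, n):
    """Map a physical point in Q = [0, s]^3 to a 1-based cell coordinate.

    Hypothetical helper: the paper only states that each cell has a
    coordinate (x, y, z) with 1 <= x, y, z <= n.
    """
    coords = []
    for p in point:
        if not 0 <= p <= s:
            raise ValueError("point lies outside the operation region Q")
        # A point on the far face (p == s) is assigned to the last cell.
        coords.append(min(n, math.floor(p * n / s) + 1))
    return tuple(coords)

# Side length 10 split into 5 parts gives cells of side 2 and 5**3 cells.
example = cell_coordinate((0.0, 3.9, 10.0), s=10, n=5)
resolution = 5 ** 3
```

With s fixed, raising n shrinks the cell side s/n, which is the sense in which a larger n gives a finer positioning resolution.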

We randomly scatter m readers whose positions are known in Q and assume that each reader is equipped with the same RF transceiver. Every reader has a spherical communication region with radius r. The object to be positioned is a tag in Q. As long as the tag lies in a reader’s communication region, it can be captured by that reader. In other words, a reader can read any tag in its communication region. Our purpose is to locate a tag in Q using readers whose positions are known.

We now introduce the symbols defined in the model.

Q denotes the operation region, Q=[0,s]×[0,s]×[0,s], a cube with side length s.

s (lowercase) denotes the side length of Q.

n denotes the number of equal parts into which each side is divided.

N denotes the number of nodes whose positions are known in Q. Each node has a unique cell coordinate in Q. In practice, the position can be obtained in several ways; for example, we can equip the reader with GPS to get its position.

G denotes the area covered by a reader’s RF transceiver; we assume that it is a sphere with radius r and that any tag in G can be captured by this reader.

S (uppercase) denotes the biggest cube in G. The reason for choosing S rather than G as the reader’s communication region is to simplify the calculation.

2.2. 3D-PMAHN Localization Algorithm

For the model of Figure 1, in order to simplify calculation, we assume that 1) the operation region is fixed, 2) the number of cells is variable, 3) all readers in Q have the same communication region G, and 4) the parameter r is variable. The localization algorithm can be described in general terms as follows. First, the necessary readers with known positions are scattered in Q. The size of a cell can be obtained through calculation, and a reader’s communication region S can be expressed as the number of cells it contains. Once a tag lies in the communication regions of two readers, we can deduce that the tag’s position must be in the intersection of these two regions, and the size and position of this intersection can be obtained by simple calculation. Similarly, if a tag can be captured by more than two readers, the tag’s position can be obtained from the intersection of these readers’ communication regions.

2.2.1. The Definition of Reader’s Communication Region

First, let us analyze the features of the reader’s communication region in 3D space (see Figure 2).

Assume that the region is a sphere. With the parameters n, r and s fixed, let the side of the largest inscribed cube be 2x [4]. Obviously $3x^2 = r^2$, so, measured in cells,

$$\rho = \left[ \frac{n r}{\sqrt{3}\, s} \right] \qquad (1)$$

where [x] is the integer part of x, and the unit of ρ is the number of cells.
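As a quick numerical check of Equation (1), the sketch below computes ρ for the parameter set s=10000, n=5000, r=100 used later in the simulation. It also computes the number of cells in a cube of side 2ρ+1 centered on a reader. The function names are illustrative and not taken from the paper.

```python
import math

def rho_cells(n, r, s):
    """Half-side of the largest inscribed cube in whole cells, Equation (1)."""
    return math.floor(n * r / (math.sqrt(3) * s))

def cells_in_region(rho):
    """Number of cells in the cube of side 2*rho + 1 centered on a reader."""
    return (2 * rho + 1) ** 3

rho = rho_cells(n=5000, r=100, s=10000)   # 500000 / 17320.5... -> 28
region_cells = cells_in_region(rho)       # 57 ** 3 = 185193
```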

Accordingly, node S can communicate with every other node in the cube centered at S and containing $(2\rho+1)^3$ cells.

Assume that there are k readers with known positions, denoted by R1, R2, ..., Rk. These readers’ communication regions are denoted by S1, S2, ..., Sk, with cell coordinates (x1,y1,z1), (x2,y2,z2), ..., (xk,yk,zk) (1 ≤ xi ≤ n, 1 ≤ yi ≤ n, 1 ≤ zi ≤ n) (see Figure 3).

Figure 1. 3D-PMAHN model.

Figure 2. Reader’s communication region G in 3D-PMWSN.

Ri, Rj, Rk are three readers scattered in Q, and Ti, Tj, Tk are three tags randomly scattered in Q. We assume that Ti can be captured at the same time by readers Ri, Rj and Rk.

We now give the formal definition of the reader’s communication region S. Let s=[a,b]×[c,d]×[e,f] (1 ≤ a < b ≤ n, 1 ≤ c < d ≤ n, 1 ≤ e < f ≤ n) be a region of space containing a certain number of cells. Every cell has a unique coordinate (x,y,z) (a ≤ x ≤ b, c ≤ y ≤ d, e ≤ z ≤ f).

It is easy to see that the communication region Si of reader Ri can be computed as follows:

$$S_i = [x_i - \rho,\ x_i + \rho] \times [y_i - \rho,\ y_i + \rho] \times [z_i - \rho,\ z_i + \rho] \qquad (2)$$

W. H. NIE ET AL.



2.2.2. The Intersection Algorithm of Two Readers’ Communication Regions

We can now calculate the intersection of two readers’ communication regions (see Figure 4).

Select two readers Rm and Rn (m, n ∈ {1, 2, ..., k}) from (R1, R2, ..., Rk), with cell coordinates (xm,ym,zm) and (xn,yn,zn). If their intersection T is not empty, it can be deduced as follows:

$$T = Q \cap S_m \cap S_n = [\max(x_m,x_n)-\rho,\ \min(x_m,x_n)+\rho) \times [\max(y_m,y_n)-\rho,\ \min(y_m,y_n)+\rho) \times [\max(z_m,z_n)-\rho,\ \min(z_m,z_n)+\rho) \qquad (3)$$

where max(xm,xn) is the larger of xm and xn, min(xm,xn) is the smaller of xm and xn, and max(ym,yn), min(ym,yn), max(zm,zn) and min(zm,zn) have similar meanings.

From Equation (3), we know that if a tag can be detected simultaneously by two readers, the tag must be in the intersection T of the two readers’ communication regions.

2.2.3. The Intersection Computation Algorithm of More than Two Readers’ Communication Regions

We now consider the case in which a tag can be captured simultaneously by more than two readers.

Assuming that tag Ti can be captured by readers R1, R2, ..., Rk at the same time, it is obvious that this tag is in the intersection of the communication regions of these readers.

Figure 3. The relations between readers and tags.

Let

$$T = Q \cap \bigcap_{i=1}^{k} S_i = Q \cap \big( [x_+ - \rho,\ x_- + \rho) \times [y_+ - \rho,\ y_- + \rho) \times [z_+ - \rho,\ z_- + \rho) \big) \qquad (4)$$

where

$x_+ = \max(x_1, \ldots, x_k)$, $x_- = \min(x_1, \ldots, x_k)$,
$y_+ = \max(y_1, \ldots, y_k)$, $y_- = \min(y_1, \ldots, y_k)$,
$z_+ = \max(z_1, \ldots, z_k)$, $z_- = \min(z_1, \ldots, z_k)$.

From Equation (4), we can deduce the position range through simple calculation when the cell coordinates of the k readers are known. With appropriate parameter settings, the positioning precision can be made as high as required.
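The intersection of several readers' regions can be sketched as follows. This is a minimal sketch under two assumptions: each axis range is read as half-open, [max − ρ, min + ρ), as the brackets in Equations (3) and (4) suggest, and all readers are far enough from the boundary of Q that clipping to Q can be skipped. The names `intersection_box` and `cell_count` are illustrative.

```python
from math import prod

def intersection_box(readers, rho):
    """Per-axis half-open cell ranges [lo, hi) of the common region."""
    box = []
    for axis in range(3):
        values = [reader[axis] for reader in readers]
        lo = max(values) - rho   # x_+ - rho
        hi = min(values) + rho   # x_- + rho
        if lo >= hi:
            return None          # the regions do not overlap on this axis
        box.append((lo, hi))
    return box

def cell_count(box):
    """Number of cells in the box, or 0 for an empty intersection."""
    return 0 if box is None else prod(hi - lo for lo, hi in box)

# Two readers with rho = 3: ranges [9, 13), [8, 13), [7, 13) -> 4 * 5 * 6 cells.
box = intersection_box([(10, 10, 10), (12, 11, 10)], rho=3)
cells = cell_count(box)
```

Each additional reader can only raise the per-axis maximum or lower the per-axis minimum, so the box never grows as readers are added.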

3. Error Estimations and Comparison with Other Algorithms

3.1. Error Estimations in Positioning

Because the communication region has been simplified, positioning error is inevitable, and this greatly influences the computed result. In order to simplify the calculation, we discuss the error in 2-dimensional space.

Assume that a certain tag can be detected by two readers at the same time (see Figure 5). Define the equations of G1 and G2 as

G1: $x^2 + y^2 = 3^2$, G2: $(x-3)^2 + y^2 = 3^2$.

Obviously, the tag must be in the intersection Ω of G1 and G2, and

$$\Omega = \int_{-\frac{3\sqrt{3}}{2}}^{\frac{3\sqrt{3}}{2}} \int_{3-\sqrt{9-y^2}}^{\sqrt{9-y^2}} \mathrm{d}x\,\mathrm{d}y \qquad (5)$$
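As a worked check, the area in Equation (5) can be evaluated in closed form. The steps below use only the two circles defined above. The shorthand $a = 3\sqrt{3}/2$ is introduced here for readability.

```latex
\begin{aligned}
\Omega &= \int_{-a}^{a} \left( 2\sqrt{9-y^2} - 3 \right) \mathrm{d}y,
  \qquad a = \tfrac{3\sqrt{3}}{2} \\
&= 2\left[ \tfrac{y}{2}\sqrt{9-y^2} + \tfrac{9}{2}\arcsin\tfrac{y}{3} \right]_{-a}^{a} - 6a \\
&= \left( \tfrac{9\sqrt{3}}{2} + 6\pi \right) - 9\sqrt{3} \\
&= 6\pi - \tfrac{9\sqrt{3}}{2} \approx 11.06
\end{aligned}
```

This agrees with the standard lens-area formula for two circles of radius 3 whose centers are 3 apart.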

Figure 4. The intersection of two readers’ communication region (The cube T with dashed lines).




Figure 5. The real intersection of two readers’ communication region.

According to the simplified method, the tag must be located in the simplified intersection E. The error event M can be described as the set of points that lie in Ω but not in E. The error probability is p(M).

$$M = \{ (x,y) \mid (x,y) \in \Omega,\ (x,y) \notin E \} \qquad (6)$$

$$p(M) = \frac{\Omega - E}{\Omega} \qquad (7)$$

According to the above equation, the value of p(M) is about 17.3 percent. That is to say, when the two readers are positioned as in Figure 6, the algorithm works exactly only after such errors are excluded.

In general, the estimated error is related to the positions of the readers. For the same readers, if their positions are different, the error is different accordingly. If a tag can be detected by m readers, the calculation of the estimated error is much more complicated than for two readers, but the error is certainly a function of the positions of the m readers. We give the error definition as:

Definition 1: e(pr1, pr2, ..., prm), where pri denotes the position of reader ri. e(pr1, pr2, ..., prm) is

3.2. Comparison with Other Algorithms

We choose the 3D RFID Positioning Algorithm [9], which is representative of 3D positioning, for comparison with 3D-PMWSN. The 3D RFID Positioning Algorithm is based on the location fingerprinting method. This method was first implemented in the RADAR system, an RF-based indoor tracking system developed by Microsoft Research.

This positioning method achieves its positioning results in two stages: the off-line training phase (off-line phase) and the on-line position determination phase (on-line phase). In the off-line phase, the received signal strength (RSS) from all the detectable transmitters and other relevant information, including the physical coordinates of the receiver in the local coordinate system (measured by other methods, such as total stations or tapes), are collected at predefined reference points and stored in a database. The contents of the database are called the fingerprint. During the on-line phase, the mobile user samples the RSS patterns and searches for similar patterns in the off-line database so as to find its most probable position.

Figure 6. The simplified intersection of two readers’ communication region.

Let us compare the two positioning algorithms using four evaluation indicators.

1) Storage cost: As the nodes in Ad Hoc networks are relatively simple, any algorithm should be economical with storage. The 3D RFID Positioning Algorithm requires an off-line database, and it is hard to set up such a database in the nodes of a distributed system. 3D-PMWSN is different: there is no off-line database, so storage is saved.

2) Calculation complexity: For 3D-PMWSN, we can see from Equation (4) that the positioning calculation is very simple. For the 3D RFID Positioning Algorithm, the main calculation is the search of the off-line database, so the calculation is also relatively simple. However, this comes at the expense of a high storage requirement, which is unsuitable for distributed Ad Hoc networks.

3) Positioning accuracy: For 3D-PMWSN, the positioning precision can be set according to the size of the positioned object, which means that the cells can be made bigger or smaller. By contrast, the positioning resolution cannot be changed for the 3D RFID Positioning Algorithm because positioning is based on RSS.

4) Stability of the algorithm: In practice, the signal strength of an RF transceiver is variable; that is, the signal strength changes with the power supplied. 3D-PMWSN uses only two states, tag detected or not, so the influence of signal strength is small. The algorithm works as long as a tag is captured by readers. The 3D RFID Positioning Algorithm is different. Because its positioning depends mainly on signal strength, signal attenuation, which may greatly increase the amount of data in the off-line database, may have a great influence on positioning precision.

From the above discussion, we can conclude that 3D-PMWSN is better than the 3D RFID Positioning Algorithm for positioning, especially in distributed Ad Hoc systems (see Table 1).

4. Simulation and Analysis

We carry out the following simulation in order to evaluate the locating precision and the rate of convergence. Let the number of readers be k, with initial value 2; each reader’s cell coordinate is given below. The simulation proceeds in the following steps.

Step 1: use Definition 1 to exclude the positions that could cause errors.

Step 2: using Equation (3), obtain the size of the intersection as the number of cells it contains.

Step 3: increase k by 1.

Step 4: repeat from Step 1 to obtain a new intersection size, and stop the calculation once the size is considered suitable.

We consider two solutions. The emulation results are as follows.

Solution 1

Let s=10000, n=5000, r=100. Using Equation (1), we obtain ρ=28.

Let the number of readers k be 10, with the following cell coordinates:

R1(1305,1505,1805), R2(1310,1510,1810), R3(1315,1515,1815), R4(1320,1520,1820), R5(1325,1525,1825), R6(1330,1530,1830), R7(1335,1535,1835), R8(1340,1540,1840), R9(1345,1525,1825), R10(1350,1530,1830)

First, we calculate the intersection of R1 and R2, and the result is about 68 thousand cells. Then we increase the number of readers step by step. For R1, R2 and R3, the result is 47 thousand cells. When k increases to 9, the result is about 200 cells, which is very small compared with the result for k = 2. In practice, the number of readers to choose depends mainly on the size of the located object and the number of cells in G. Accordingly, we can choose appropriate parameters to get the result we need (see Table 2).

Solution 2

Let s=10000, n=2000, r=100. Then we can deduce ρ=12. Apart from the number of cells in G, the other parameters are the same as those in solution 1; the result is shown in Table 3. We can see that when k increases to 5, T has decreased to 27 cells.
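The simulation loop for solution 2 can be sketched as below. This is an illustrative reconstruction, not the authors' code. It takes ρ=12 as stated in the text, reads each axis range as half-open (as Equation (3) suggests), ignores clipping to Q, and skips the error-exclusion step (Step 1). The stopping threshold `target_cells` is a hypothetical parameter. Under these assumptions the sizes for k = 2, 3 and 4 are 6859, 2744 and 729 cells, which round to the first three entries of Table 3. For k = 5 the sketch gives 64 cells rather than the reported 27, so the exact convention used in the paper may differ.

```python
from math import prod

# Cell coordinates of readers R1..R5 as listed for the simulation.
READERS = [
    (1305, 1505, 1805), (1310, 1510, 1810), (1315, 1515, 1815),
    (1320, 1520, 1820), (1325, 1525, 1825),
]

def intersection_cells(readers, rho):
    """Cells in the common region, using half-open ranges per axis."""
    sides = []
    for axis in range(3):
        values = [reader[axis] for reader in readers]
        side = (min(values) + rho) - (max(values) - rho)
        sides.append(max(side, 0))   # 0 when the regions do not overlap
    return prod(sides)

def simulate(readers, rho, target_cells):
    """Add readers one at a time until the region is small enough."""
    history = []
    for k in range(2, len(readers) + 1):
        cells = intersection_cells(readers[:k], rho)
        history.append((k, cells))
        if cells <= target_cells:
            break
    return history

history = simulate(READERS, rho=12, target_cells=100)
```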

For solution 1, we can conclude from Table 2 that the result T decreases as k increases, which means that the locating precision increases as k increases. Scattering 9 readers in G can bring the precision to 100 grids.

Table 1. The comparison of 3D RFID positioning algorithm and 3D-PMWSN (√ denotes the better of the two algorithms for each evaluation indicator).

Evaluation indicator      3D RFID Positioning Algorithm    3D-PMWSN
Storage cost                                               √
Calculation complexity                                     √
Positioning accuracy                                       √
Stability of algorithm                                     √

Table 2. The simulation result of solution 1 (the unit of T is 10 thousand cells).

K 2 3 4 5 6 7 8 9

T 6.8 4.7 3.0 1.8 0.9 0.4 0.1 0.02

Table 3. The simulation result of solution 2 (the unit of T is 10 thousand cells).

K 2 3 4 5

T 0.69 0.27 0.073 0.0027


Figure 7. The result comparison of two solutions.

Obviously, a large number of readers is not needed. As long as the readers are scattered suitably, the needed locating precision can be obtained. The algorithm is workable (see Figure 7).

For solution 2, positioning is faster than for solution 1, using only 5 readers. The reason is that the cell size in solution 2 is bigger than that in solution 1. That is to say, positioning becomes faster as the resolution decreases; the speed is gained at the cost of resolution. In practice, a balance should be kept between these two factors.




5. Conclusion and Future Work

We introduced the concept of space cells in this paper and proposed the 3D-PMAHN model for Ad Hoc networks. An object’s position in 3-dimensional space can be computed by the given positioning algorithm. We also gave the error estimation of the model. The simulation results show that 3D-PMAHN is simple and energy saving. These characteristics meet the special requirements of Ad Hoc networks.

6. References

[1] F. B. Wang, L. Shi, and F. Y. Ren, “Self-localization systems and algorithms for ad hoc networks,” Journal of Software [J], Vol. 16, No. 5, pp. 857–866, 2005.

[2] X. Wei, J. W. Wan, and K. Huangfu, “New technique in single observer passive tracing based on particle filter,” Journal on Communications [J], Vol. 26, No. 12, pp. 81–85, 2005.

[3] C. G. Tan and W. Y. Luo, “A survey of cooperation of nodes in mobile ad hoc networks,” Computer Science [J], Vol. 34, No. 4, pp. 23–27, 2007.

[4] S. N. Simic and S. Sastry, “Distributed localization in wireless ad hoc networks,” Technical Report UCB/ERL M02/26, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, January 2002.

[5] J. Song, “Tracking the location of materials on construction projects,” Ph.D. dissertation, The University of Texas at Austin, 2005.

[6] W. Ni and Z. X. Wang, “An adaptive multi-user location and tracking algorithm for indoor wireless network,” Journal on Communications [J], Vol. 26, No. 1, pp. 66–73, 2005.

[7] L. W. Zhang and Z. H. Tan, “New TDOA algorithm based on Taylor series expansion in cellular networks,” Journal of Communications [J], Vol. 28, No. 6, pp. 7–11, 2007.

[8] L. W. Zhang and Z. H. Tan, “New 5TDOA algorithm based on Taylor series expansion in cellular networks,” Journal on Communications [J], Vol. 28, No. 6, pp. 8–12, 2007.

[9] K. F. Zhang, “Lecture notes in geoinformation and cartography,” pp. 374–38, 2007.



A Trust Model Based on the Multinomial Subjective Logic for P2P Network

Junfeng TIAN1, Chao Li1, Xuemin HE2, Rui TIAN1
1College of Mathematics & Computer Science, Hebei University, Baoding, China
2Department of Modern Science & Technology, Agricultural University of Hebei, Baoding, China
Email: [email protected]

Received March 9, 2009; revised May 17, 2009; accepted July 7, 2009

ABSTRACT

To deal with problems in P2P systems such as unreliable service, security risks and attacks by malicious peers, a novel trust model, MSL-TM, based on the Multinomial Subjective Logic is proposed. The model uses multinomial ratings and the Dirichlet distribution to compute the expectation of the subjective opinion, derives the peer’s reputation value and risk value from it, and finally obtains the trust value. Time decay, rating credibility and the risk value are introduced to reflect the recent behavior of peers and to make the system more sensitive to malicious acts. Finally, the effectiveness and feasibility of the model are illustrated by a simulation experiment designed with peer-sim.

Keywords: Trust, Multinomial Subjective Logic, Reputation, Risk, Dirichlet Distribution

1. Introduction

P2P technology is a new distributed network model that does not rely on a central server. This model has been applied widely in areas such as peer-to-peer computing, information sharing and distributed search. It realizes the sharing of network information and resources through direct exchange between peers in the system. In this network, all peers are equal and communicate with each other on an equal footing [1]. However, with the extensive and in-depth application of existing P2P systems, their defects have gradually been exposed. The performance of P2P systems cannot reach the theoretical optimum [2]. The main reasons are the unreliability of the service, security risks and attacks caused by malicious peers [3]. These problems impose serious constraints on the cooperative relations between users in a P2P system. In addition, cooperation between users in P2P systems is limited; the most fundamental reason is the lack of trust between users and of effective cooperation mechanisms, so users are not motivated to participate more actively in system cooperation. Anonymity, a high degree of openness, peer type, purpose and other factors lead peers to act differently [4]. The loss of trust between users severely damages the performance of P2P networks and hampers their further development.

Therefore, in order to strengthen cooperation among peers and improve the overall availability of P2P services, it is of great significance to construct a reliable trust management model for effective resource selection and for encouraging cooperation.

2. Related Works

Nowadays, research on P2P trust is mainly focused on building reliable trust management models. Trust Management (TM) was first proposed by M. Blaze in 1996 [5], and it then became a research focus in network security.

The PeerTrust [6,7] model proposed by L. Xiong combines both local and global reputation with a confidence coefficient and considers several factors that influence credibility quantification; the model can cope well with virtual ratings. However, the PeerTrust model does not offer measurements for the factors of trust or methods for defining the confidence coefficient. The P2P-oriented, reputation-based trust management model proposed in reference [8] introduces a risk factor and proposes to quantify risk with information entropy. This model is superior to some existing trust models in terms of security and other aspects.

Jøsang researches trust management based on Subjective Logic [9] and proposes the Evidence Space and the Opinion Space, which are used to describe and measure trust relationships. He also offers a series of Subjective Logic Operators [10], which are used for deriving and combining trust values. This binomial Subjective Logic [11] is based on the Beta distribution [12], which describes binomial posterior probability; the probability density is defined by positive and negative events, and the probabilistic trust value of every event created among peers is then computed. Later, Jøsang proposed multinomial Subjective Logic [13,14], which is based on the Dirichlet multinomial probability distribution [15] and allows for ratings of different levels. It can be used for computing reputation and provides a more flexible platform for designing reputation systems. However, the Subjective Logic model considers neither the influence of time decay on the trust value nor the integration of trust with different weights. It cannot protect an attacked target from the excessive derogation or exaggeration brought by malicious peers. Besides, it does not reflect the indeterminacy and risk brought by defective interactions in its trust computation, and it cannot monitor probable attacks and potential threats from defective peers.

To deal with the problems mentioned above, we propose a new P2P trust model based on multinomial Subjective Logic, MSL-TM (Multinomial Subjective Logic Based Trust Model). It adopts multinomial ratings and uses the Dirichlet distribution function to compute the expected value of the subjective opinion, from which we obtain the reputation value and risk value of peers and, finally, their trust value.

The main innovations of the paper are:

1) By making use of its own interaction experience and the interaction experience of other entities in the system, an entity evaluates the trust value of the entities that would interact with it. Time decay and rating credibility are introduced into the trust evaluation so that the trust value of peers reflects their recent actions and excessive derogation or exaggeration brought by normal peers is eliminated. Potential dangers, such as cooperative cheating and derogation, can then be prevented effectively.

2) Considering potential attacks from various types of defective peers, this paper not only computes the reputation value when computing the trust value of peers, but also analyzes their historical actions and introduces a potential indeterminacy risk value as a supplement to the reputation value.

3) We can adjust the values of reputation and risk appropriately to make the trust value of peers more sensitive to defective actions and thus achieve the goal of detecting defective actions.

A detailed description of our proposed trust model is presented in Section 3, together with the method for computing the trust value. In Section 4, we perform simulation experiments and report the results and analysis. The last section concludes the paper with some remarks.

3. MSL-TM Trust Model

In this paper, we propose a trust model, MSL-TM, aimed primarily at P2P file sharing. The model can also be applied to P2P data management, P2P collaborative computing and e-business application systems.

3.1. Related Definitions

Definition 1, trust.

The reliability, credibility, and capacity to provide services of an entity reflected in the interaction.

Definition 2, ratings.

One peer gives another peer a quantitative value in accordance with their action when they interact with each other.

Definition 3, local trust.

The local trust of peer x in peer y is the expectation of y’s future behavior (trust level) held by x, based on the interaction history of x and y and on x’s historical ratings of those interactions.

Definition 4, global trust.

The global trust of peer y is the credibility of y derived from the ratings given to y by y’s neighbors.

Definition 5, reputation.

Reputation is the expectation of an individual’s future behavior obtained from observations or rating information about the individual’s past actions. Reputation is composed of local trust and global trust. The calculation method is given in Subsection 3.5.

Definition 6, risk.

Risk is a concept from economics, where it refers to the uncertainty of loss: a negative deviation of uncertain consequences from the expected target.

In this paper, risk reflects the recent unreliability of a peer, that is, the uncertainty of the interaction results and the probability of adverse consequences. The value of the risk Ri is composed of the local expectation of negative ratings $E_L$, the global expectation of negative ratings $E_A$, and the risk component of the uncertainty in the opinion $u_X$. The calculation methods are introduced in Subsection 3.6.

Definition 7, trust value.

The trust value is the quantified trust of one entity in another; it is related to the reliability, integrity and performance of the peer. We use T to denote the trust value of x in y. Re and Ri denote the reputation value and risk value of peer y respectively, and α and β are their weights. Then the trust value of peer y is:

T = αRe − βRi (1)

where 0 ≤ α, β ≤ 1. The values of α and β are determined by the degree of optimism of x about y.

The more optimistic x is about y’s behavior and interaction results, the bigger the value of α / β, which weakens the influence of risk on the trust value. Conversely, the more pessimistic x is about y’s behavior and interaction results, the smaller the value of α / β, so that the trust value is more sensitive to the risk value. In order to calculate more precisely, in this paper we set β = NR(x3)/NTotal, where NR(x3) is the number of ratings R(x3) and NTotal is the total number of ratings.
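Equation (1) and the stated choice of β can be illustrated with a short sketch. The function names and the sample numbers are hypothetical. The text does not give a rule for α, so it is passed in as a plain parameter here.

```python
def risk_weight(n_good, n_total):
    """beta = N_R(x3) / N_Total, the share of ratings at level R(x3)."""
    if n_total <= 0:
        raise ValueError("at least one rating is required")
    return n_good / n_total

def trust_value(reputation, risk, alpha, beta):
    """T = alpha * Re - beta * Ri, with both weights in [0, 1]."""
    if not (0 <= alpha <= 1 and 0 <= beta <= 1):
        raise ValueError("weights must lie in [0, 1]")
    return alpha * reputation - beta * risk

# Illustrative numbers only: 30 of 100 ratings are R(x3).
beta = risk_weight(n_good=30, n_total=100)
t = trust_value(reputation=0.8, risk=0.2, alpha=1.0, beta=beta)
```

With Re and Ri fixed, lowering α / β makes T drop faster as the risk value rises, which is the sensitivity described above.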

3.2. Multinomial Ratings

In binomial subjective logic, ratings are considered to be either true or false, which makes them too one-sided and rigid. We therefore introduce multinomial ratings:

$$R = (R(x_i) \mid i = 1 \ldots k)$$

In this paper, we take trinomial ratings as an example, mainly for P2P file-sharing applications. According to how satisfied the consumer is with the service quality, the ratings for the provider’s service are divided into three levels; after completing a download, consumers can give the corresponding rating. The three levels are:

$$R = (R(x_i) \mid i = 1 \ldots 3)$$

R(x1)=B (bad): The document is false or malicious, or the provider is non-responsive.

R(x2)=C (common): The document is true, but the quality is not good or the download was delayed.

R(x3)=G (good): The document is true, the quality is good and the download speed is fast.

The ratings are assigned according to several parameters that depend on the real situation. In this paper we use a triple (authenticity, download speed, quality). In practical applications, more test parameters can be added.

Authenticity: If the downloaded document is the one the user requested, it is a true document; otherwise it is a false document.

Download speed: This reflects how long the user has waited. We define a parameter K = file size/transfer speed, whose value is given according to the actual situation. k1 and k2 are two intermediate values: when K < k1, the download is too slow or there is no response; when K ∈ [k1, k2], the speed is not very good; when K > k2, the speed is fast.

Quality: The quality levels are classified in Table 1.

Table 1. The classification of the quality.

| Quality | Data document | Video or audio document |
|---|---|---|
| Good | No data loss | Smooth screen, good sound quality |
| Common | A small amount of data loss and bit errors | Screen not smooth, sound not clear |
| Poor quality or malicious files | Serious data loss, or the downloaded file contains a virus | Screen cannot be displayed, poor sound quality, or the downloaded file contains a virus |

3.3. Visualizing Opinion in the Space

Let $X = \{x_i \mid i = 1, \dots, k\}$ be a frame. Then the composite function $\omega_X = (\vec{b}, u, \vec{a})$ [13] is an opinion over X, where $\vec{b}$ is a vector of belief masses over the propositions of X, u is the uncertainty mass, and $\vec{a}$ is a vector of base rate values over the propositions of X. These components satisfy:

$$\vec{b} \in [0,1]^k, \quad \vec{b}(\emptyset) = 0, \quad \sum_{x \in X} \vec{b}(x) \le 1;$$

$$u + \sum_{x \in X} \vec{b}(x) = 1;$$

$$\vec{a}(\emptyset) = 0, \quad \sum_{x \in X} \vec{a}(x) = 1.$$

The probability expectation value of the opinion is:

$$E_X(x_i) = \vec{b}(x_i) + \vec{a}(x_i)\, u,$$

where $E_X(\emptyset) = 0$ and $\sum_{x \in X} E_X(x) = 1$.

Figure 1. Opinion pyramid with example trinomial opinion.

Trinomial opinions can be visualized as points inside a triangular pyramid as shown in Figure 1. The top vertex $u_X$ represents uncertainty, and the three vertices of the base, $b_{x_1}$, $b_{x_2}$ and $b_{x_3}$, represent the three belief masses. The projector starting from the opinion point is parallel to the line that joins the uncertainty vertex and the base rate point on the base. The point at which the projector meets the base determines the expectation value of the opinion, i.e. it coincides with the point corresponding to the expectation value $E_X$.

We are interested in knowing the probability distribution over the disjoint elements of a frame. In the case of a binary frame, it is determined by the Beta distribution. In the general multinomial case it is determined by the Dirichlet distribution, which describes the probability distribution over a k-component random variable $p(x_i)$, $i = 1, \dots, k$, with $p(x_i) \ge 0$ and $\sum_{i=1}^{k} p(x_i) = 1$. With parameters $\alpha(x_i)$, the multinomial Dirichlet density function over X can be expressed as:

$$f(\vec{p} \mid \vec{\alpha}) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha(x_i)\right)}{\prod_{i=1}^{k} \Gamma(\alpha(x_i))} \prod_{i=1}^{k} p(x_i)^{(\alpha(x_i) - 1)} \tag{2}$$

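Because $\alpha(x_i) = r(x_i) + C a(x_i)$, then:

$$f(\vec{p} \mid \vec{r}, \vec{a}) = \frac{\Gamma\left(\sum_{i=1}^{k} (r(x_i) + C a(x_i))\right)}{\prod_{i=1}^{k} \Gamma(r(x_i) + C a(x_i))} \prod_{i=1}^{k} p(x_i)^{(r(x_i) + C a(x_i) - 1)} \tag{3}$$

where $r(x_i) \ge 0$, $a(x_i) \ge 0$, $\sum_{x \in X} a(x) = 1$, and $C = 2$.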

C is the a priori constant, $\vec{r}$ is the observation evidence (in this paper, we use it for the ratings), and $\vec{a}$ is the base rate.

Dirichlet distributions translate observation evidence directly into probability density functions. The representation of evidence, together with the base rate, can be used to denote opinions:

$$b_X(x_i) = \frac{r(x_i)}{C + \sum_{j=1}^{k} r(x_j)}, \quad i = 1, \dots, k; \qquad u_X = \frac{C}{C + \sum_{j=1}^{k} r(x_j)} \tag{4}$$

The probability expectation value is expressed as:

$$E(p(x_i) \mid \vec{r}, \vec{a}) = \frac{r(x_i) + C a(x_i)}{C + \sum_{j=1}^{k} r(x_j)}, \quad i = 1, \dots, k \tag{5}$$
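The mapping from rating evidence to an opinion in Formulas (4) and (5) can be sketched in a few lines of Python. The function name and the sample rating counts are hypothetical, and a uniform base rate is assumed purely for illustration.

```python
def opinion_from_evidence(r, a, C=2.0):
    """Map evidence r and base rate a to (belief, uncertainty, expectation).

    belief and uncertainty follow Formula (4); expectation follows Formula (5).
    """
    total = C + sum(r)
    belief = [ri / total for ri in r]
    uncertainty = C / total
    expectation = [(ri + C * ai) / total for ri, ai in zip(r, a)]
    return belief, uncertainty, expectation

# Hypothetical evidence: 1 bad, 2 common and 5 good ratings, uniform base rate.
ratings = [1.0, 2.0, 5.0]
base_rate = [1 / 3, 1 / 3, 1 / 3]
belief, uncertainty, expectation = opinion_from_evidence(ratings, base_rate)
```

The belief masses and the uncertainty mass sum to one, and each expectation value equals $b_X(x_i) + a(x_i) u_X$, which matches the expectation defined in Subsection 3.3.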

3.4. Dynamic Base Rate

Agents will come and go during the lifetime of a market, and it is important to be able to assign new members a reasonable base rate reputation. In the simplest case, this can be the same as the initial default reputation that was given to all agents during bootstrap. However, it is possible to track the average reputation score of the whole community, and this can be used to define the base rate for new agents, either directly or with a certain additional bias. Not only new agents, but also existing agents with a standing track record can get the dynamic base rate. After all, a dynamic community base rate reflects the whole community, and should therefore be applied to all the members of that community.

The global rating after the combination of the opinions is $\vec{R}_F$, and $\vec{E}_F$ is the global expectation vector (the computing method for both is given in Subsection 3.5.2). This vector then needs to be normalized to a base rate vector; the base rate at time t+1 is then simply expressed as the global expectation vector at time t:

$$\vec{a}_F = \vec{E}_F.$$

3.5. Calculation of the Reputation Value Re

The reputation value Re is composed of the local trust L and the global trust A, and can be calculated as follows:

$$Re = \gamma L + (1 - \gamma) A, \quad 0 \le \gamma \le 1 \tag{6}$$

where γ is the weight of the local trust and (1-γ) is the weight of the global trust.

3.5.1. Calculation of the Local Trust L

The local trust uses the history ratings to calculate the trust level of one peer in another. It is similar to human society, where an individual builds up his (her) trust in another through local contacts. The local trust is not only relevant to the history ratings; in order to reflect the objectivity and accuracy of the calculation, we introduce the following two factors:

1) Time decay: Agents change their behavior over time. Research based on economic theory shows that, when computing the current reputation, reducing the weight of the history ratings makes the reputation converge to a steady state. The older a rating is, the smaller its impact on the reputation; the more recent a rating is, the more informative it is, so it is necessary to give the recent ratings a higher weight [16,17].

We denote a rating of level $x_i$ given to y as $R_y(x_i)$; each such rating has the value 1. $T_i$ is the time decay factor, $T_i = e^{-(t - t_{R_y(x_i)})}$, where t is the current time and $t_{R_y(x_i)}$ is the time when $R_y(x_i)$ was given.

The cumulative rating of y with time decay is $R_{y,t}(x_i)$, which includes n ratings. Then $R_{y,t}$ can be expressed as:

$$R_{y,t}(x_i) = \sum_{k=1}^{n} e^{-(t - t_{R_y^k(x_i)})} R_y^k(x_i), \quad i = 1, 2, 3 \tag{7}$$

where $R_y^k(x_i)$ is the kth rating of y, t is the current time, and $t_{R_y^k(x_i)}$ is the time when the kth rating was given.

2) Rating credibility $D_R$

Definition 8, neighbor peer: Let i and j be two peers of the P2P network. If peer i has interacted with peer j, then j is a neighbor peer of i. In this paper we call the peers who give ratings to other peers raters, and the peers which have been given ratings ratees.

Definition 9, rating credibility: It reflects the degree of credibility of the ratings given, and its value can be used as a weight of the ratings given by a peer.

Using the rating credibility can prevent derogation by malicious peers. A rating given by one peer to another is very subjective, so the malicious ratings of a neighbor peer can have a bad effect on the reputation of the rated peer. Therefore the accuracy is affected by the credibility of the ratings of the neighbor peer.

The rating credibility should not be given subjectively. In this paper, the rating credibility is defined as $D_R$; the rater's trust value T and the global expectation $E_A(x_i)$ of rating level i are used as measure factors:

$$D_R = kT + (1 - k) E_A(x_i), \quad i = 1, 2, 3 \tag{8}$$

where k is the weight of the rater's trust value in the rating credibility and (1-k) is the weight of the expectation value.

To some extent, the rater's trust value T determines the credibility of the rating $R_y(x_i)$. In addition, if the expectation value of this kind of rating is small, the rating of this peer is unreliable. Introducing the rating credibility can eliminate excessive derogation or exaggeration by malicious peers.

$$R_{y,t,D}(x_i) = D_R \, R_{y,t}(x_i), \quad i = 1, 2, 3 \tag{9}$$

We can obtain the local expectations of the different rating levels by putting Formula (9) into Formula (5):

$$E_L(x_i) = \frac{R_{y,t,D}(x_i) + C a_F(x_i)}{C + \sum_{j=1}^{k} R_{y,t,D}(x_j)}, \quad i = 1, 2, 3 \tag{10}$$
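The chain from raw ratings to local expectations (time decay in Formula (7), credibility weighting in Formulas (8) and (9), and the expectation in Formula (10)) can be traced with the following Python sketch. It is a simplified illustration, not the authors' implementation: it assumes a single rater, and the function names, timestamps, rater trust value, global expectations and base rate are all hypothetical.

```python
import math

def decayed_ratings(events, now, levels=3):
    """Formula (7): sum exp(-(now - t_k)) over the ratings of each level.

    events is a list of (level_index, time_given); each rating has value 1.
    """
    totals = [0.0] * levels
    for level, given_at in events:
        totals[level] += math.exp(-(now - given_at))
    return totals

def rating_credibility(rater_trust, global_expectation, k=0.6):
    """Formula (8): D_R = k*T + (1-k)*E_A(x_i) for one rating level."""
    return k * rater_trust + (1.0 - k) * global_expectation

def local_expectation(weighted, base_rate, C=2.0):
    """Formula (10): Formula (5) applied to credibility-weighted ratings."""
    total = C + sum(weighted)
    return [(r + C * a) / total for r, a in zip(weighted, base_rate)]

# Hypothetical data: one rater with trust 0.8 gave y a bad rating at t=5
# and good ratings at t=9 and t=10; the current time is 10.
events = [(0, 5.0), (2, 9.0), (2, 10.0)]
global_exp = [0.2, 0.3, 0.5]       # assumed E_A(x_i)
base_rate = [1 / 3, 1 / 3, 1 / 3]  # assumed a_F(x_i)

decayed = decayed_ratings(events, now=10.0)
# Formula (9): weight each level's decayed rating by its credibility.
weighted = [rating_credibility(0.8, e) * r for e, r in zip(global_exp, decayed)]
e_local = local_expectation(weighted, base_rate)
```

In this example the old bad rating has decayed to almost nothing, so the local expectation of the good level clearly exceeds that of the bad level.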

where i = 1, …, k, and k = 3 in this paper. By giving the expectations of the different rating levels a weight value, the peer's local trust value can be calculated as follows:

$$L = \sum_{i=1}^{k} w(x_i) E_L(x_i), \quad i = 1, 2, 3 \tag{11}$$

where $w(x_i)$ is the weight given to rating level $x_i$.

3.5.2. Calculation of the Global Trust

Peer y's global trust is related to the following factors:

1) The number of y's neighbor peers. The larger the number of neighbor peers, the smaller the uncertainty, and the more accurate y's global trust. On the contrary, if the number of neighbor peers has nothing to do with the global trust, a small number of malicious peers can easily raise each other's reputation through collusion.

2) The rating credibility of y's neighbor peers. The higher a neighbor peer's rating credibility, the more credible the ratings it gives; on the contrary, if a neighbor peer's rating credibility is low, its ratings cannot be trusted.

3) The ratings of y's neighbor peers. If neighbor peers give good ratings, the global trust will be enhanced; otherwise, the risk level of the peer will be enhanced.

In many situations there will be multiple sources of evidence, and fusion can be used to combine evidence from different sources. A distinction can be made between two cases.

The two peers observe the process during disjoint time periods. In this case the observations are independent, and it is natural to simply add the observations from the two peers; the resulting fusion is called cumulative fusion.

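Let the two observers' respective opinions be expressed as $\omega_X^A = (\vec{b}_X^A, u_X^A, \vec{a}_X^A)$ and $\omega_X^B = (\vec{b}_X^B, u_X^B, \vec{a}_X^B)$ over the same frame $X = \{x_i \mid i = 1, \dots, k\}$. Let $\omega_X^{A \diamond B}$ be the opinion such that:

When $u_X^A \ne 0 \lor u_X^B \ne 0$:

$$b_{x_i}^{A \diamond B} = \frac{b_{x_i}^A u_X^B + b_{x_i}^B u_X^A}{u_X^A + u_X^B - u_X^A u_X^B}, \qquad u_X^{A \diamond B} = \frac{u_X^A u_X^B}{u_X^A + u_X^B - u_X^A u_X^B} \tag{12}$$

When $u_X^A = 0 \land u_X^B = 0$:

$$b_{x_i}^{A \diamond B} = \gamma\, b_{x_i}^A + (1 - \gamma)\, b_{x_i}^B, \qquad u_X^{A \diamond B} = 0, \qquad \text{where } \gamma = \lim_{u_X^A \to 0,\; u_X^B \to 0} \frac{u_X^B}{u_X^A + u_X^B} \tag{13}$$

Then $\omega_X^{A \diamond B}$ is called the cumulatively fused opinion of $\omega_X^A$ and $\omega_X^B$, representing the combination of the independent opinions of A and B. By using the symbol '⊕' to designate this belief operator, we define:

$$\omega_X^{A \diamond B} = \omega_X^A \oplus \omega_X^B \tag{14}$$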

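The two peers observe the process during the same time period. In this case the observations are dependent, and it is natural to take the average of the observations by the two peers; the resulting fusion is called averaging fusion.

Let the two observers' respective opinions be expressed as $\omega_X^A = (\vec{b}_X^A, u_X^A, \vec{a}_X^A)$ and $\omega_X^B = (\vec{b}_X^B, u_X^B, \vec{a}_X^B)$ over the same frame $X = \{x_i \mid i = 1, \dots, k\}$. Let $\omega_X^{A \underline{\diamond} B}$ be the opinion such that:

When $u_X^A \ne 0 \lor u_X^B \ne 0$:

$$b_{x_i}^{A \underline{\diamond} B} = \frac{b_{x_i}^A u_X^B + b_{x_i}^B u_X^A}{u_X^A + u_X^B}, \qquad u_X^{A \underline{\diamond} B} = \frac{2 u_X^A u_X^B}{u_X^A + u_X^B} \tag{15}$$

When $u_X^A = 0 \land u_X^B = 0$:

$$b_{x_i}^{A \underline{\diamond} B} = \gamma\, b_{x_i}^A + (1 - \gamma)\, b_{x_i}^B, \qquad u_X^{A \underline{\diamond} B} = 0, \qquad \text{where } \gamma = \lim_{u_X^A \to 0,\; u_X^B \to 0} \frac{u_X^B}{u_X^A + u_X^B} \tag{16}$$

Then $\omega_X^{A \underline{\diamond} B}$ is called the averaging fused opinion of $\omega_X^A$ and $\omega_X^B$, representing the combination of the dependent opinions of A and B. By using the symbol '$\underline{\oplus}$' to designate this belief operator, we define:

$$\omega_X^{A \underline{\diamond} B} = \omega_X^A \,\underline{\oplus}\, \omega_X^B \tag{17}$$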
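To make the difference between the two fusion operators concrete, the following Python sketch implements the belief and uncertainty parts of cumulative fusion and averaging fusion for two opinions over the same frame. The function names are hypothetical, the base rate vectors are left out, and in the dogmatic case (both uncertainties zero) the limit weight is simply passed in as an assumed parameter `gamma`.

```python
def cumulative_fusion(b_a, u_a, b_b, u_b, gamma=0.5):
    """Cumulative fusion of two opinions (independent observations)."""
    if u_a == 0.0 and u_b == 0.0:
        # Dogmatic case: weighted mix of the beliefs, no uncertainty left.
        return [gamma * x + (1.0 - gamma) * y for x, y in zip(b_a, b_b)], 0.0
    denom = u_a + u_b - u_a * u_b
    belief = [(x * u_b + y * u_a) / denom for x, y in zip(b_a, b_b)]
    return belief, (u_a * u_b) / denom

def averaging_fusion(b_a, u_a, b_b, u_b, gamma=0.5):
    """Averaging fusion of two opinions (dependent observations)."""
    if u_a == 0.0 and u_b == 0.0:
        return [gamma * x + (1.0 - gamma) * y for x, y in zip(b_a, b_b)], 0.0
    denom = u_a + u_b
    belief = [(x * u_b + y * u_a) / denom for x, y in zip(b_a, b_b)]
    return belief, (2.0 * u_a * u_b) / denom

# Hypothetical opinions: A is built from 8 ratings, B from 2 ratings (C = 2).
b_a, u_a = [0.1, 0.2, 0.5], 0.2
b_b, u_b = [0.0, 0.25, 0.25], 0.5

cum_b, cum_u = cumulative_fusion(b_a, u_a, b_b, u_b)
avg_b, avg_u = averaging_fusion(b_a, u_a, b_b, u_b)
```

Cumulative fusion behaves like adding the evidence of both observers, so the fused uncertainty falls below either input. Averaging fusion keeps the uncertainty between the two inputs, which fits observations of the same time period.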

The global ratings $\vec{R}_F$ can be computed by combining the two fusion operators above, and then the global expectation is:

$$E_A(x_i) = \frac{R_F(x_i) + C a_F(x_i)}{C + \sum_{j=1}^{k} R_F(x_j)}, \quad i = 1, 2, 3 \tag{18}$$

The global trust can be calculated as follows:

$$A = \sum_{i=1}^{k} w(x_i) E_A(x_i), \quad i = 1, 2, 3 \tag{19}$$

where $w(x_i)$ is the weight given to rating level $x_i$.

3.6. Calculation of the Risk Value Ri

There are problems with simply considering the reputation value: it lacks the sensitivity to perceive the disorder behavior of the peers, and it is unable to identify malicious peers. In this paper risk reflects the recent level of the peers' reliability. The value of the risk Ri is composed of the local expectation of negative ratings $E_L$, the global expectation of negative ratings $E_A$, and the risk component of the uncertainty in the opinion $u_X$. Then the risk can be expressed as:

$$Ri = \lambda E_L(x_1) + (1 - \lambda) E_A(x_1) + (1 - a_F(x_1))\, u_X \tag{20}$$

where λ is the weight of the local expectation, (1-λ) is the weight of the global expectation, and $(1 - a_F(x_1))$ is the level of contribution of the uncertainty in the opinion to the risk. $\vec{a}_F$ is the base rate.

$u_X$ can be obtained from Formula (4):

$$u_X = \frac{C}{C + \sum_{j=1}^{k} R_{y,t,D}(x_j)}$$
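The risk value of Formula (20) for a single negative rating level $x_1$ can be sketched as follows. The function name and the numbers are assumptions chosen only to show how the three components combine; λ = 0.7 is the value listed among the simulation parameters.

```python
def risk_value(e_local_bad, e_global_bad, base_rate_bad, uncertainty, lam=0.7):
    """Formula (20): risk from local/global expectations of the bad level
    plus the share of uncertainty not covered by the base rate."""
    return (lam * e_local_bad
            + (1.0 - lam) * e_global_bad
            + (1.0 - base_rate_bad) * uncertainty)

# Hypothetical peers: one with few bad ratings, one with many.
low_risk = risk_value(0.05, 0.10, base_rate_bad=0.2, uncertainty=0.1)
high_risk = risk_value(0.60, 0.40, base_rate_bad=0.2, uncertainty=0.1)
```

Both peers carry the same uncertainty term, so the difference between the two risk values comes entirely from the expectations of negative ratings.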

When the negative ratings are multinomial, for example when $R(x_1), R(x_2), \dots, R(x_n)$ are all negative ratings, we should introduce a parameter ρ for the weight of the different levels, with $\rho_{R(x_1)} > \rho_{R(x_2)} > \dots > \rho_{R(x_n)}$. Then:

$$Ri = \sum_{i=1}^{n} \rho_{R(x_i)} \left( \lambda E_L(x_i) + (1 - \lambda) E_A(x_i) + (1 - a_F(x_i))\, u_X \right) \tag{21}$$

The introduction of risk has two functions. On one hand, combining risk with reputation reflects the trust value more accurately. When there are more good interactions, the risk value is small, and its effect on the trust value is smaller. On the contrary, when there are more bad interactions, the risk value is larger, and the trust value decreases. So considering the risk acts as a punishment for malicious peers. On the other hand, because the risk comes from the history of failed interactions, the risk value is determined by the degree of these failures: the larger the degree of loss, the greater the risk. The risk value can be used as a prediction of a peer's future behavior, so it is an effective means of identifying malicious peers.

4. Simulation and Analysis

4.1. Experimental Environments

Simulating Peer-to-Peer (P2P) overlay networks is a common problem for researchers and developers. Several solutions exist to solve this problem. The PeerSim P2P simulator proposed by BISON [18] is one of the best known among researchers. All the simulations in this paper are based on PeerSim [19]. The philosophy of PeerSim is to use a modular approach, as the preferred way of coding with it is to re-use existing modules. These modules can be of different kinds: for example, there are modules which construct and initialize the underlying network, modules which handle the different protocols, modules which control and modify the network, and so on. PeerSim offers many of these modules in its sources, which greatly eases the coding of new applications. PeerSim 1.0 supports two simulation models: the cycle-based model and a more traditional event-based
model. Simulations in this paper use the former model. The main interfaces on which PeerSim is based are listed as follows:

1) Peer: The P2P network is composed of peers. A peer is a container of protocols. The peer interface provides access to the protocols it holds and a fixed ID of the peer.

2) Protocol: It defines the behaviors of peers in the network.

3) Control: Classes implementing this interface can be scheduled for execution at certain points during the simulation. These classes typically observe or modify the simulation.

4) Linkable: Typically implemented by protocols, this interface provides a service to other protocols to access a set of neighbor peers. The instances of the same linkable protocol class over the peers define an overlay network.

A more detailed introduction to PeerSim is given in reference [19]. As a reference, we simulate the EigenRep [20] model simultaneously.

Assume there is a file-sharing system from which users need to download some files, and that the authenticity of the file is the unique criterion for judging whether the interaction is successful or not. Here, we assume the file-sharing network is ideal, that is, any user can find any file (which may be inauthentic) they want and all peers that claim to own it. Users take a simple action: they choose the most trustworthy one among all the peers that claim to own the needed file, and then interact with it (download).

Given a simulation network with 1000 peers, assume there are 10000 files. We allocate the files to the 1000 peers randomly. Among these peers, the proportion of malicious ones ranges from 0.1 to 0.5. Assume we can locate all the files of the system in our simulation, and each file is owned by at least one good peer. Every peer must accomplish 100 interactions in the whole simulation. In every interaction, a peer chooses one file randomly from the files that it has never owned or downloaded. A successful interaction makes the user own the file, and a failed interaction does not increase the user's files. In the whole simulation, every peer chooses one file that it does not own to download. If the user finally owns the file, the download succeeds; otherwise, it fails. The ratio of successful times to failed times is called the successful probability of interaction.

We design several types of peers as follows:

1) Good peers. Both the service they offer and their evaluation of other peers are authentic.

2) Malicious peers.

a) General malicious peers. This type of peer offers virtual service only, and offers authentic files at a probability of 40% for every service request.

b) Collusive malicious peers. This type of peer decries good peers while exaggerating its cahoots, and offers virtual upload service.

c) Strategy malicious peers. This type of peer adopts a certain strategy when uploading: it may offer authentic files at different probabilities according to different cases. In detail, it offers authentic files at a low probability when its trust value is high, and at a high probability when its trust value is low. In this way, it maintains its trust value at the credible threshold defined by the system, in order to avoid being detected.

The initial trust values are defined as 0.5; the parameters of the model are defined in Table 2.

Table 2. Simulation parameters and their values.

| Parameter | α | β | γ | k | λ | C |
|---|---|---|---|---|---|---|
| Value | 0.7 or 1 | 0.3 or 0 | 0.7 | 0.6 | 0.7 | 2 |

4.2. Simulation Results and Analysis

4.2.1. The Trust Value Variation of the Four Types of Peers as the Interaction Times Increases

Figure 2 illustrates the trust value trend of the four types of peers as the number of interactions increases. As shown in the figure, the trust value of good peers increases gradually, while that of general malicious peers decreases rapidly. The trust value of strategy malicious peers fluctuates to some extent, because these peers are cunning and hard to identify; however, their trust value decreases on the whole. We can see that the model captures how the trust value of peers changes with the number of interactions, just as expected.

Figure 2. Variation of the four types of peers as the time of interaction increases. (Plot against the number of interactions, 20 to 100; curves: good peers, strategy malicious peers, general malicious peers, collusive malicious peers.)

4.2.2. Influence of Percentage of General Malicious Peers on Successful Interaction Ratio

Figure 3 shows how the successful interaction probability varies with the ratio of malicious peers for MSL-TM under two parameter settings, for the no-reputation system, and for EigenRep. Assume good peers offer authentic files at a probability of 97% in the
simulation, while general malicious peers do so at a probability of 40% in order to hide their malicious action. As the figure illustrates, the successful interaction probability is 97% when there are no malicious peers. However, for the no-reputation system, which has no precaution or defense, the successful interaction probability decreases rapidly as the number of malicious peers increases. EigenRep lacks a punishment strategy for malicious peers that offer authentic service at a certain probability, so its successful interaction probability decreases obviously. MSL-TM (0.7, 0.3) shows strong superiority because it introduces the risk factor (β=0.3). The reason is that this model quantifies risk with expectation and indeterminacy, so we are surer about the actions of peers.

Figure 3. Influence of percentage of general malicious peers on successful interaction ratio. (Plot of average successful interactions (%) against malicious nodes ratio (%), 10 to 50; curves: MSL-TM(0.7,0.3), MSL-TM(1,0), EigenRep, no-reputation system.)

4.2.3. Influence of Percentage of Collusive Malicious Peers on Successful Interaction Ratio

Collusive malicious peers decry all good peers that have interacted with them and exaggerate their cahoots; they try to destroy the validity of the network by decreasing the trust value of authentic peers and increasing that of their cahoots. This is in fact a serious cooperative cheat. From the results and comparison illustrated in Figure 4, we can see that the influence of decrying and magnifying on the successful interaction probability is not obvious; the reason is that the no-reputation system and EigenRep lack a punishment strategy. However, as the virtual services offered by malicious peers increase, the interaction probability of the system decreases obviously. Our model introduces the rating credibility and the risk factor; though the successful identification ratio of collusive peers may decrease in the beginning, it becomes stable as malicious peers are restrained. This model can reach a success rate of 80% when fifty percent of the peers are collusive malicious peers, so it can effectively suppress the influence of decrying and magnifying.

Figure 4. Influence of percentage of collusive malicious peers on successful interaction ratio. (Plot of average successful interactions (%); curves: MSL-TM(0.7,0.3), MSL-TM(1,0), EigenRep.)

4.2.4. Influence of Percentage of Strategy Malicious Peers on Successful Interaction Ratio

Strategy malicious peers are cunning; they have a latent period. Assume this kind of peer provides true files at a probability of 30% when its trust value is above 0.6 and otherwise at a probability of 0.6; we call peers with a trust value below 0.5 incredible peers. As shown in Figure 5, strategy malicious peers hide themselves by providing true files in the beginning, so there is very little difference in the successful interaction ratio of these mechanisms. As the number of interactions increases, the malicious actions of some peers begin to be exposed. But as EigenRep does not take any action, it cannot identify their dynamic behavior. Besides, it has no punishment mechanism for cheating, so its successful interaction ratio decreases greatly as the percentage of malicious peers increases. The successful probability of MSL-TM (0.7, 0.3), which considers risk, decreases less than that of MSL-TM (1, 0), which takes only reputation into account. This shows the importance of computing the risk value, and that it is accurate to quantify risk with expectation and indeterminacy. The experimental results show that MSL-TM is robust to risk when the percentage of malicious peers varies.

Figure 5. Influence of percentage of strategy malicious peers on successful interaction ratio. (Plot of average successful interactions (%); curves: MSL-TM(0.7,0.3), MSL-TM(1,0), EigenRep.)

5. Conclusions

In this
paper, we take multinomial subjective logic as the basis, adopt multinomial ratings, and compute the expected value of opinions with the Dirichlet distribution function, from which we obtain the reputation value and risk value of peers, and finally their trust level. We quantify risk with expectation and indeterminacy to capture the actions of peers more accurately, which improves the successful interaction probability. The experimental results show that MSL-TM is robust against risks when the percentage of malicious peers changes, and that it is superior to existing models in many respects. We will further improve subjective logic in our future research, for example by studying a dynamic a priori constant C to make it more realistic, and by decreasing the complexity of our model, so that it can serve P2P networks better.

6. Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 60873203, the Natural Science Foundation of Hebei Province under Grant No. F2008000646, and the Guidance Program of the Department of Science and Technology in Hebei Province under Grant No. 072135192.

7. References

[1] L. Ramaswamy and L. Liu, “Free riding: A new challenge to peer-to-peer file sharing systems,” in 36th Annual Hawaii International Conference on System Sciences (HICSS-36), 2003.

[2] C. Lin and X. H. Peng, “Research on trustworthy networks,” Chinese Journal of Computers, Vol. 28, No. 5, pp. 751–758, May 2005.

[3] D. S. Peng, C. Lin, and W. D. Liu, “A distributed trust mechanism directly evaluating reputation of nodes,” Journal of Software, Vol. 19, No. 4, pp. 946–955, April 2008.

[4] X. Y. Li and X. L. Gui, “Research on dynamic trust model for large scale distributed environment,” Journal of Software, Vol. 18, No. 6, pp. 1510–1521, June 2007.

[5] M. Blaze, J. Feigenbaum, and J. Lacy, “Decentralized trust management,” in J. Dale, G. Dinolt, editors, Proceedings of the 17th Symposium on Security and Privacy, Oakland, CA, IEEE Computer Society Press, pp. 164–173, 1996.

[6] L. Xiong and L. Liu, “PeerTrust: Supporting reputation-based trust for peer-to-peer electronic communities,” IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 7, pp. 843–857, 2004.

[7] L. Xiong and L. Liu, “A reputation-based trust model for peer-to-peer ecommerce communities,” Proceedings of IEEE International Conference on Electronic Commerce, New York, pp. 228–229, 2003.

[8] C. Q. Tian, S. H. Zou, W. D. Wang, and S. D. Cheng, “Trust model based on reputation for peer-to-peer networks,” Journal on Communications, Vol. 29, No. 4, pp. 63–70, 2008.

[9] A. Jøsang, “A logic for uncertain probabilities,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Vol. 9, No. 3, pp. 279–311, June 2001.

[10] A. Jøsang, “The consensus operator for combining beliefs,” Artificial Intelligence Journal, Vol. 142, No. 12, pp. 157–170, October 2002.

[11] A. Jøsang, “Probabilistic logic under uncertainty,” in the Proceedings of Computing: The Australian Theory Symposium (CATS2007), CRPIT, Vol. 65, Ballarat, Australia, January 2007.

[12] A. Jøsang and R. Ismail, “The beta reputation system,” in Proceedings of the 15th Bled Electronic Commerce Conference, Bled, Slovenia, June 2002.

[13] A. Jøsang, “Conditional reasoning with subjective logic,” Journal of Multiple-Valued Logic and Soft Computing, Vol. 14, No. 2–3, pp. 155–185, 2008.

[14] A. Jøsang, “Cumulative and averaging unfusion of beliefs,” in the Proceedings of the International Conference on Information Processing and Management of Uncertainty (IPMU2008), Malaga, June 2008.

[15] A. Jøsang and J. Haller, “Dirichlet reputation systems,” in the Proceedings of the International Conference on Availability, Reliability and Security (ARES), Vienna, Austria, April 2007.

[16] B. A. Huberman and F. Wu, “The dynamics of reputations,” Journal of Statistical Mechanics: Theory and Experiment, Vol. 4, pp. 1–17, 2004.

[17] W. Yuan, J. S. Li, and P. L. Hong, “Distributed peer-to-peer trust model and computer simulation,” Journal of System Simulation, Vol. 18, No. 4, pp. 938–942, 2006.

[18] BISON, http://www.cs.unibo.it/bison/.

[19] PeerSim, http://peersim.sourceforge.net/.

[20] S. D. Kamvar and M. T. Schlosser, “EigenRep: Reputation management in P2P networks,” in S. Lawrence, editor, Proceedings of the 12th International World Wide Web Conference, Budapest, ACM Press, pp. 123–134, May 2003.



A Rank-One Fitting Method with Descent Direction for Solving Symmetric Nonlinear Equations*

Gonglin YUAN, Zhongxing WANG, Zengxin WEI College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, China

Email: [email protected]
Received January 14, 2009; revised March 4, 2009; accepted May 31, 2009

ABSTRACT

In this paper, a rank-one updated method for solving symmetric nonlinear equations is proposed. This method possesses some features: 1) The updated matrix is positive definite whatever line search technique is used; 2) The search direction is descent for the norm function; 3) The global convergence of the given method is established under reasonable conditions. Numerical results show that the presented method is interesting.

Keywords: Rank-One Update, Global Convergence, Nonlinear Equations, Descent Direction

1. Introduction

Consider the following system of nonlinear equations:

$$F(x) = 0, \quad x \in R^n, \tag{1}$$

where $F: R^n \to R^n$ is continuously differentiable and the Jacobian $\nabla F(x)$ of $F(x)$ is symmetric for all $x \in R^n$. Let θ(x) be the norm function defined by

$$\theta(x) = \frac{1}{2} \|F(x)\|^2;$$

then the nonlinear Equation (1) is equivalent to the following global optimization problem

$$\min \theta(x), \quad x \in R^n. \tag{2}$$

The following iterative method is used for solving (1)

$$x_{k+1} = x_k + \alpha_k d_k, \tag{3}$$

where $x_k$ is the current iterative point, $d_k$ is a search direction, and $\alpha_k$ is a positive step-size.

It is well known that there are many methods [1–9] for the unconstrained optimization problem

$$\min_{x \in R^n} f(x), \tag{UP}$$

where the BFGS method is one of the most effective quasi-Newton methods [10–17]. In recent years, many modified BFGS methods (see [18–23]) have been proposed for UP. Different from these techniques, Xu [24] presented a rank-one fitting algorithm for UP, and the numerical examples are very interesting. Motivated by this idea, we give a new rank-one fitting algorithm for (1) which possesses global convergence: the method ensures that the updated matrices are positive definite without carrying out any line search, the search direction is a descent direction for the norm function, and the numerical results are more competitive than those of the BFGS method for the test problems.

For nonlinear equations, the global convergence is due to Griewank [25] for Broyden’s rank one method. Fan [1], Yuan [26], Yuan, Lu andWei [27], and Zhang [28] presented the trust region algorithms for nonlinear equa-tions. Zhu [29] gave a family of nonmonotone back-tracking inexact quasi-Newton algorithms for solving smooth nonlinear equations. In particular, a Gauss- Newton-based BFGS method is proposed by Li and Fu-kushima [30] for solving symmetric nonlinear equations, and the modified methods [31,32] are studied.

The line search rules play an important role for solving the optimization problems. In the following, we briefly review some line search technique to obtain the stepsize ak.

Brown and Saad [33] proposed the following line search method:

( ) ( ) ( )Tk k k k k k kx d x x d

k

(4)

where

( ) ( ) ( )T Tk k kx d F x F x d

(0,1), , (0,1)kik r r

,

, *National Natural Science Foundation of China (10761001) and the Scientific Research Foundation of Guangxi University (Grant No. X081082). ik is the smallest nonnegative integer i such that (4). Zhu

G. L. YUAN ET AL.


556

k

[29] gave the nonmonotone line search technique:

( )( ) ( ) ( )Tk k k l k k kx d x x d

,m

( ) 0 ( )( ) max ( ), (0) 0

( ) min ( 1), , 1

l k j m k k jx x

m k m k M k

and M is a nonnegative integer. Yuan and Lu [32] pre-sented a new backtracking inexact technique to obtain the stepsize ak:

2 2( ) ( ) ( )T

k k k k k k kF x d F x F x d (5)

where δ∈(0,1) is a constant, and dk is a solution of the system of linear Equation (9). Li and Fukashima [11] give a line search technique to determine a positive step-size ak satisfying

2 2

2 2

1 2

( ) ( )

( ) ( )

k k k k

k k k k k k

F x d F x

F x d F x

2 (6)

where δ1 andδ2 are positive constants, and εk is a positive sequence such that

0k

k

(7)

Formula (7) means that $\{\|F(x_k)\|\}$ is approximately norm descent when $k$ is sufficiently large. Gu, Li, Qi, and Zhou [14] presented a descent line search technique as follows:

$$\|F(x_k+\alpha_k d_k)\|^2-\|F(x_k)\|^2\le-\delta_1\|\alpha_kF(x_k)\|^2-\delta_2\|\alpha_k d_k\|^2, \tag{8}$$

where $\delta_1$ and $\delta_2$ are positive constants. In this paper, we also use Formula (8) as the line search to find the stepsize $\alpha_k$.
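The backtracking rule behind (8) can be illustrated with a short Python sketch. It assumes the trial steps are taken from the set 1, r, r^2, … as in the algorithm of Section 2; the function name `descent_line_search`, the default constants and the cap `max_backtracks` are hypothetical choices made for this illustration and are not taken from the paper.

```python
import numpy as np

def descent_line_search(F, x, d, r=0.1, delta1=1e-4, delta2=1e-4, max_backtracks=50):
    """Return the largest alpha in {1, r, r^2, ...} that satisfies inequality (8).

    F maps a 1-D numpy array to a 1-D numpy array; d is the search direction.
    """
    Fx = F(x)
    norm_Fx_sq = float(Fx @ Fx)
    norm_d_sq = float(d @ d)
    alpha = 1.0
    for _ in range(max_backtracks):
        F_trial = F(x + alpha * d)
        # Left-hand side of (8): change in the squared residual norm.
        lhs = float(F_trial @ F_trial) - norm_Fx_sq
        # Right-hand side of (8): -delta1*||alpha F||^2 - delta2*||alpha d||^2.
        rhs = -(alpha ** 2) * (delta1 * norm_Fx_sq + delta2 * norm_d_sq)
        if lhs <= rhs:
            return alpha
        alpha *= r  # shrink the trial step
    raise RuntimeError("no acceptable step found within max_backtracks reductions")
```

For the linear map F(x) = 2x with d = -F(x), the full step only flips the sign of the residual, so the sketch rejects alpha = 1 and accepts the next trial value r.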

The search direction $d_k$ also plays a main role in line search methods for solving optimization problems, and $d_k$ is a solution of the system of linear equations

$$B_k d_k+F(x_k)=0, \tag{9}$$

where $B_k$ is often generated by the BFGS update formula

$$B_{k+1}=B_k-\frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}+\frac{y_k y_k^T}{y_k^T s_k}, \tag{10}$$

where $y_k=g_{k+1}-g_k$ and $s_k=x_{k+1}-x_k$. Is there another way to determine the update formula, and accordingly the search direction $d_k$? In this paper, the updated matrix $B_k$ is generated by the following rank-one update formulas

$$B_{k+1}=B_k+v_k v_k^T, \tag{11}$$

$$H_{k+1}=H_k-\frac{H_k v_k v_k^T H_k}{1+v_k^T H_k v_k}, \tag{12}$$

where, for $k=0$, $B_0$ is a given symmetric positive definite matrix, $H_k=B_k^{-1}$, $v_k=\delta_0\alpha_kF(x_k)$, and $\delta_0$ is a positive constant. Then we use the following formula to get the search direction:

$$B_k d_k+q_k(\alpha_{k-1})=0, \tag{13}$$

$$q_k(\alpha_{k-1})=\frac{F(x_k+\alpha_{k-1}F(x_k))-F(x_k)}{\alpha_{k-1}}, \tag{14}$$

where $B_k$ follows (11), $\alpha_{k-1}$ is the steplength used at the previous iteration, and Equation (14) is inspired by [34].
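A minimal Python sketch of the update pair (11)/(12) and of the direction computation (13)/(14) follows. It assumes that H is symmetric and equals the inverse of B on entry, in which case the H-update is the Sherman-Morrison formula for a rank-one correction. The function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def rank_one_update(B, H, v):
    """Return (B + v v^T, updated inverse) for symmetric H = B^{-1}."""
    Hv = H @ v
    B_new = B + np.outer(v, v)
    # Sherman-Morrison: (B + v v^T)^{-1} = H - (H v)(H v)^T / (1 + v^T H v).
    H_new = H - np.outer(Hv, Hv) / (1.0 + v @ Hv)
    return B_new, H_new

def difference_quotient(F, x, alpha_prev):
    """q_k(alpha_{k-1}) of (14): a Jacobian-free approximation of grad F(x) F(x)."""
    Fx = F(x)
    return (F(x + alpha_prev * Fx) - Fx) / alpha_prev

def search_direction(H, q):
    """Solve B d + q = 0 of (13) by applying the stored inverse H."""
    return -H @ q
```

Because the inverse is carried along explicitly, each direction costs one matrix-vector product and no linear solve; the update never needs a line search to stay positive definite, since adding v v^T cannot decrease any quadratic form d^T B d.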

Throughout the paper, we use the following notation: $\|\cdot\|$ is the Euclidean norm, and $F(x_k)$ and $F(x_{k+1})$ are replaced by $F_k$ and $F_{k+1}$, respectively.

This paper is organized as follows. In the next section, the algorithm is stated. The global convergence is established in Section 3. The numerical results are reported in Section 4.

2. Algorithm

In this section, we state our new algorithm based on Formulas (3), (8), (11), (12) and (13) for solving (1).

Rank-One Updated Algorithm (ROUA).

Step 0: Choose an initial point $x_0\in R^n$, constants $r\in(0,1)$, $\delta_0>0$, $\delta_1>0$, $\delta_2>0$, and symmetric positive definite matrices $B_0$ and $H_0=B_0^{-1}$. Let $k=0$;
Step 1: If $\|F(x_k)\|=0$, stop. Otherwise, solve the linear Equation (13) to get $d_k$;
Step 2: Find $\alpha_k$, the largest number in $\{1,r,r^2,\ldots\}$ such that (8) holds;
Step 3: Let the next iterate be $x_{k+1}=x_k+\alpha_k d_k$;
Step 4: Update $B_{k+1}$ and $H_{k+1}$ by Formulas (11) and (12), respectively;
Step 5: Set $k:=k+1$. Go to Step 1.

In this paper, we also give the normal BFGS method for solving (1). The algorithm, which uses the same conditions as ROUA, is stated as follows.

BFGS Algorithm (BFGSA).

In ROUA, Step 4 is replaced by: Update $B_{k+1}$ by Formula (10).

Remark 1. a) By Step 0 of ROUA, there should exist constants $\lambda_1\ge\lambda_0>0$ such that

$$\lambda_0\|d\|^2\le d^TB_0d\le\lambda_1\|d\|^2,\qquad\frac{1}{\lambda_1}\|d\|^2\le d^TH_0d\le\frac{1}{\lambda_0}\|d\|^2,\qquad\forall d\in R^n. \tag{15}$$


b) By Step 4 of ROUA, it is easy to deduce that the updated matrices are symmetric.

3. Convergence Analysis

This section will establish the global convergence of ROUA. Let $\Omega$ be the level set defined by

$$\Omega=\{x\mid\|F(x)\|\le\|F(x_0)\|\}. \tag{16}$$

In order to establish the global convergence of ROUA, the following assumptions are needed [30,34,35].

Assumption A. 1) $F$ is continuously differentiable on an open convex set $\Omega_1$ containing $\Omega$. 2) The Jacobian of $F$ is symmetric, bounded and uniformly nonsingular on $\Omega_1$, i.e., there exist constants $M\ge m>0$ such that

$$\|\nabla F(x)\|\le M,\quad\forall x\in\Omega_1, \tag{17}$$

and

$$\|\nabla F(x)d\|\ge m\|d\|,\quad\forall x\in\Omega_1,\ d\in R^n. \tag{18}$$

Remark. Assumption A 2) implies that

$$M\|d\|\ge\|\nabla F(x)d\|\ge m\|d\|,\quad\forall x\in\Omega_1,\ d\in R^n, \tag{19}$$

$$M\|x-y\|\ge\|F(x)-F(y)\|\ge m\|x-y\|,\quad\forall x,y\in\Omega_1. \tag{20}$$

In particular, for all $x\in\Omega_1$, we have

$$M\|x-x^*\|\ge\|F(x)\|=\|F(x)-F(x^*)\|\ge m\|x-x^*\|, \tag{21}$$

where $x^*$ stands for the unique solution of (1) in $\Omega_1$.

Lemma 3.1 Let Assumption A hold. Consider ROUA. Then there exists a constant $m_0>0$ such that

$$d^TB_kd\ge m_0\|d\|^2,\quad\forall d\in R^n, \tag{22}$$

i.e., the matrix $B_k$ is positive definite for all $k$.

Proof. By ROUA, the initial matrix $B_0$ is symmetric positive definite, and then we have (15). Using (11), for $k\ge1$, we have

$$\begin{aligned}d^TB_kd&=d^TB_{k-1}d+d^Tv_{k-1}v_{k-1}^Td\\&=d^TB_{k-1}d+(d^Tv_{k-1})^2\\&\ge d^TB_{k-1}d\ge\cdots\ge d^TB_0d\ge\lambda_0\|d\|^2.\end{aligned} \tag{23}$$

Let $m_0=\lambda_0$. Then we get (22). The proof is complete.

Since $B_k$ is positive definite, the direction $d_k$ determined by (13) is the unique solution. The following lemma can be found in [34]; here we also give the proof.

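Lemma 3.2 Let Assumption A hold. If $x_k$ is not a stationary point of (2), then there exists a constant $\alpha'>0$ depending on $k$ such that when $\alpha_{k-1}\in(0,\alpha')$, the unique solution $d(\alpha_{k-1})$ of (13) satisfies

$$\nabla\theta(x_k)^Td(\alpha_{k-1})<0. \tag{24}$$

Moreover, the inequality

$$\|F(x_k+\alpha_{k-1}d(\alpha_{k-1}))\|^2-\|F(x_k)\|^2\le-\delta_1\|\alpha_{k-1}F(x_k)\|^2-\delta_2\|\alpha_{k-1}d(\alpha_{k-1})\|^2 \tag{25}$$

holds for all $\alpha_{k-1}>0$ sufficiently small.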

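Proof. By (14), we can deduce that

$$\lim_{\alpha_{k-1}\to0^+}q_k(\alpha_{k-1})=\nabla F(x_k)F(x_k). \tag{26}$$

From (13), we get

$$\begin{aligned}\lim_{\alpha_{k-1}\to0^+}\nabla\theta(x_k)^Td(\alpha_{k-1})&=-\lim_{\alpha_{k-1}\to0^+}F(x_k)^T\nabla F(x_k)B_k^{-1}q_k(\alpha_{k-1})\\&=-F(x_k)^T\nabla F(x_k)B_k^{-1}\nabla F(x_k)F(x_k).\end{aligned} \tag{27}$$

Since $x_k$ is not a stationary point of (2), we have $\nabla F(x_k)F(x_k)\ne0$. Since $\nabla F(x_k)$ is symmetric and $B_k$ is positive definite, we obtain (24). Furthermore,

$$\begin{aligned}\lim_{\alpha_{k-1}\to0^+}\frac{\|F(x_k+\alpha_{k-1}d(\alpha_{k-1}))\|^2-\|F(x_k)\|^2}{\alpha_{k-1}}&=2\lim_{\alpha_{k-1}\to0^+}\nabla\theta(x_k)^Td(\alpha_{k-1})\\&=-2F(x_k)^T\nabla F(x_k)B_k^{-1}\nabla F(x_k)F(x_k)<0.\end{aligned}$$

However, the right-hand side of (25) is $O(\alpha_{k-1}^2)$, that is, $o(\alpha_{k-1})$. Thus, inequality (25) holds for all $\alpha_{k-1}>0$ sufficiently small. The proof is complete.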

The above lemma shows that the line search technique (8) is reasonable, and the given algorithm is well defined. Lemma 3.2 also shows that the sequence $\{\theta(x_k)\}$ is strictly decreasing. By Lemma 3.2, it is not difficult to get the following lemma.

Lemma 3.3 Let $\{x_k\}$ be generated by ROUA. Consider the line search (8). Then $\{x_k\}\subset\Omega$; moreover, $\{\|F(x_k)\|\}$ converges.

Lemma 3.4 Let Assumption A hold and $\{\alpha_k,d_k,x_{k+1},F_k\}$ be generated by ROUA. Then we have

$$\sum_{k=0}^{\infty}\|\alpha_kF_k\|^2<\infty \tag{28}$$

and

$$\sum_{k=0}^{\infty}\|\alpha_kd_k\|^2<\infty. \tag{29}$$

Proof. By the line search (8), we get

$$\delta_1\|\alpha_kF_k\|^2+\delta_2\|\alpha_kd_k\|^2\le\|F_k\|^2-\|F_{k+1}\|^2. \tag{30}$$

Summing the inequalities (30) for $k$ from 0 to $\infty$, we obtain (28) and (29). This completes the proof of the lemma.
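The summation step can be written out explicitly. The short derivation below is a sketch that uses only the line search inequality (8) applied at every iterate and the nonnegativity of the squared norm; the upper summation index $N$ is introduced here for illustration.

```latex
\sum_{k=0}^{N}\left(\delta_1\|\alpha_k F_k\|^2+\delta_2\|\alpha_k d_k\|^2\right)
\le \sum_{k=0}^{N}\left(\|F_k\|^2-\|F_{k+1}\|^2\right)
= \|F_0\|^2-\|F_{N+1}\|^2
\le \|F_0\|^2 .
```

The bound on the right does not depend on $N$, so both series of nonnegative terms have bounded partial sums and therefore converge.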

Lemma 3.5 Let Assumption A hold. Consider ROUA. Then $\{\|B_k\|\}$ converges, and there exist constants $m_0$ and $M_0$ such that, for all $k$ and any $d\in R^n$,

$$d^TB_kd\le M_0\|d\|^2,\quad\forall d\in R^n, \tag{31}$$

and

$$\frac{1}{M_0}\|d\|^2\le d^TH_kd\le\frac{1}{m_0}\|d\|^2,\quad\forall d\in R^n, \tag{32}$$

which means that the matrices updated by ROUA are all positive definite.

Proof. By the update Formula (11), we have

$$\begin{aligned}\|B_{k+1}\|&=\|B_k+v_kv_k^T\|\le\|B_k\|+\|v_k\|^2\\&=\|B_k\|+\delta_0^2\|\alpha_kF_k\|^2\\&\le\cdots\le\|B_0\|+\delta_0^2\sum_{i=0}^{k}\|\alpha_iF_i\|^2.\end{aligned} \tag{33}$$

By (28), we know that $\sum_{i=0}^{k}\|\alpha_iF_i\|^2$ is convergent. Then we can deduce that $\{\|B_k\|\}$ is convergent. So there exists a constant $M_0$ such that

$$\|B_k\|\le M_0\quad\text{for all }k. \tag{34}$$

Accordingly, we get (31). By (22), (31), and Remark 1(b), we can deduce that the updated matrices are all symmetric and positive definite. Considering $H_k=B_k^{-1}$, we obtain (32) immediately. This completes the proof of the lemma.

By (32), (31), and (34), we have

$$\|q_k(\alpha_{k-1})\|=\|B_kd_k\|\le M_0\|d_k\|,\qquad\|d_k\|\le\frac{1}{m_0}\|q_k(\alpha_{k-1})\|. \tag{35}$$

Now we establish the global convergence theorem of ROUA.

Theorem 3.1 Let Assumption A hold and $\{\alpha_k,d_k,x_{k+1},F_k\}$ be generated by ROUA. Then the sequence $\{x_k\}$ converges to the unique solution $x^*$ of (1) in the sense of

$$\lim_{k\to\infty}\|F_k\|=0. \tag{36}$$

Proof. By Lemma 3.3, we know that $\{\|F_k\|\}$ converges. By Lemma 3.4, we get

$$\lim_{k\to\infty}\alpha_k\|F_k\|=0, \tag{37}$$

then we have

$$\lim_{k\to\infty}\|F_k\|=0 \tag{38}$$

or

$$\lim_{k\to\infty}\alpha_k=0. \tag{39}$$

If (38) holds, the conclusion follows immediately. Therefore, we only need to discuss the case of (39). In this case, for all $k$ sufficiently large, the trial step $\alpha_k'=\alpha_k/r$ does not satisfy (8), so we obtain

$$\|F(x_k+\alpha_k'd_k)\|^2-\|F(x_k)\|^2>-\delta_1\|\alpha_k'F(x_k)\|^2-\delta_2\|\alpha_k'd_k\|^2. \tag{40}$$

By Lemma 3.3, we know that $\{x_k\}\subset\Omega$ is bounded. Considering (35), it is easy to deduce that $\{q_k(\alpha_{k-1})\}$ and $\{d_k\}$ are bounded. Let $\{x_k\}$ and $\{d_k\}$ converge to $x^*$ and $d^*$, respectively. Then we have

$$\lim_{k\to\infty}q_k(\alpha_{k-1})=\nabla\theta(x^*). \tag{41}$$

Dividing both sides of (40) by $\alpha_k'$ and taking limits as $k\to\infty$, we obtain

$$\nabla\theta(x^*)^Td^*\ge0. \tag{42}$$

By (22) and (13), we have

$$0=d_k^TB_kd_k+q_k(\alpha_{k-1})^Td_k\ge m_0\|d_k\|^2+q_k(\alpha_{k-1})^Td_k. \tag{43}$$

Taking limits on both sides of (43) as $k\to\infty$ yields

$$-\nabla\theta(x^*)^Td^*\ge m_0\|d^*\|^2.$$

This together with (42) implies $d^*=0$. From (35), we have

$$\lim_{k\to\infty}q_k(\alpha_{k-1})=0,$$

which together with (41) gives

$$\nabla\theta(x^*)=0. \tag{44}$$

By $\nabla\theta(x^*)=\nabla F(x^*)F(x^*)$ and the nonsingularity of $\nabla F(x^*)$, we have $F(x^*)=0$. This implies (36). The proof is complete.




Table 1. Test results for ROUA.

x0 (5,5,…,5) (20,20,…,20) (–20,…, –20) (–60, –60,…, –60) (–100,…, –100)

Dim NI/NG/GF NI/NG/GF NI/NG/GF NI/NG/GF NI/NG/GF

n=10 40/121/8.132565e-007 43/130/8.142272e-007 43/130/8.143256e-007 45/136/9.711997e-007 47/142/6.423242e-007

n=40 43/130/7.823362e-007 46/139/8.163385e-007 46/139/8.163465e-007 49/148/6.389598e-007 50/151/6.806303e-007

n=100 44/133/8.517388e-007 47/142/8.916340e-007 47/142/8.916354e-007 50/151/7.002112e-007 51/154/7.468255e-007

n=500 46/139/8.076481e-007 49/148/8.467259e-007 49/148/8.467260e-007 52/157/6.664491e-007 53/160/7.124612e-007

n=1000 47/142/7.340784e-007 50/151/7.698173e-007 50/151/7.698173e-007 52/157/9.480059e-007 54/163/6.502176e-007

x0 (5,0, 5,0,…) (20, 0, 20, 0…) (–20, 0, –20, 0…) (–60, 0, –60, 0…) (–100,0,–100,0…)


n=10 39/118/6.440181e-007 42/127/6.456963e-007 42/127/6.458627e-007 44/133/7.690322e-007 45/136/8.088567e-007

n=40 41/124/9.581725e-007 44/133/9.998342e-007 44/133/9.998539e-007 47/142/7.823915e-007 48/145/8.333429e-007

n=100 43/130/6.657874e-007 46/139/6.969606e-007 46/139/6.969629e-007 48/145/8.555192e-007 49/148/9.121694e-007

n=500 44/133/9.861003e-007 48/145/6.615057e-007 48/145/6.615058e-007 50/151/8.129150e-007 51/154/8.675076e-007

n=1000 45/136/8.961735e-007 48/145/9.396191e-007 48/145/9.396192e-007 51/154/7.392479e-007 52/157/7.893927e-007

x0 (5, –5, 5, –5…) (20, –20, 20, –20…) (–20, 20, –20, 20…) (–20, 20, –20, 20…) (–100, 100…)


n=10 30/91/8.710675e-007 33/100/7.545800e-007 33/100/7.545800e-007 35/106/8.150523e-007 36/109/8.146625e-007

n=40 31/94/8.893379e-007 34/103/8.687998e-007 34/103/8.687998e-007 37/112/6.403068e-007 38/115/6.691432e-007

n=100 31/94/8.918405e-007 34/103/8.713106e-007 34/103/8.713106e-007 37/112/6.423164e-007 38/115/6.713147e-007

n=500 31/94/8.923155e-007 34/103/8.717867e-007 34/103/8.717867e-007 37/112/6.426974e-007 38/115/6.717265e-007

n=1000 31/94/8.923306e-007 34/103/8.718018e-007 34/103/8.718018e-007 37/112/6.427095e-007 38/115/6.717395e-007

Table 2. Test results for BFGSA.

x0 (5,5,…,5) (20,20,…,20) (–20,…, –20) (–60, –60,…, –60) (–100,…, –100)


n=10 26/62/4.019133e-007 28/67/7.629739e-007 26/62/7.836022e-007 28/67/7.942352e-007 29/69/8.843658e-007

n=40 53/141/8.955174e-007 56/151/9.298740e-007 54/145/7.883733e-007 57/152/9.506096e-007 61/162/6.146640e-007

n=100 89/247/6.293858e-007 93/258/6.009680e-007 95/263/4.620386e-007 95/263/4.877714e-007 103/283/6.719347e-007

n=500 121/347/9.502010e-007 129/371/9.550139e-007 129/371/9.550162e-007 136/391/9.229412e-007 140/402/8.368401e-007

n=1000 122/350/9.130277e-007 131/376/8.492495e-007 131/376/8.492495e-007 137/393/9.697413e-007 141/404/9.845929e-007

x0 (5,0, 5,0,…) (20, 0, 20, 0…) (–20, 0, –20, 0…) (–60, 0, –60, 0…) (–100,0,–100,0…)


n=10 29/70/5.384995e-007 30/72/8.024920e-007 30/72/8.022076e-007 31/74/7.737379e-007 32/76/6.247863e-007

n=40 72/198/5.245237e-007 74/203/5.317215e-007 74/203/5.325755e-007 75/205/6.538916e-007 75/204/9.700355e-007

n=100 110/313/8.802791e-007 118/336/9.964184e-007 118/336/9.966396e-007 125/357/9.676773e-007 128/366/8.655033e-007

n=500 116/332/9.424860e-007 126/360/9.585718e-007 126/360/9.586065e-007 133/380/9.648650e-007 136/389/9.324697e-007

n=1000 113/325/8.970304e-007 122/351/8.659330e-007 122/351/8.659334e-007 129/371/8.270087e-007 132/380/8.530508e-007

x0 (5, –5, 5, –5…) (20, –20, 20, –20…) (–20, 20, –20, 20…) (–20, 20, –20, 20…) (–100, 100…)


n=10 29/71/5.110057e-007 28/69/4.091687e-007 28/69/4.091687e-007 29/70/7.916413e-007 28/68/7.221453e-007

n=40 68/183/9.825927e-007 69/188/5.723010e-007 69/188/5.722966e-007 69/185/9.294491e-007 69/189/8.093485e-007

n=100 87/239/7.675976e-007 92/254/9.416435e-007 92/254/9.413503e-007 92/255/9.920299e-007 98/269/9.349510e-007

n=500 98/281/9.381734e-007 106/304/9.843192e-007 106/304/9.843192e-007 113/324/9.911432e-007 116/333/9.971433e-007

n=1000 98/281/9.925145e-007 107/307/9.345099e-007 107/307/9.345099e-007 113/325/9.830913e-007 117/336/8.588496e-007


4. Numerical Results

In this section, we report results of some preliminary numerical experiments with ROUA.

Problem. The discretized two-point boundary value problem, similar to the problem in [36]:

$$F(x)=Ax+\frac{1}{(n+1)^2}T(x),$$

where $A$ is the $n\times n$ tridiagonal matrix given by

$$A=\begin{pmatrix}8&-1&&&\\-1&8&-1&&\\&-1&8&\ddots&\\&&\ddots&\ddots&-1\\&&&-1&8\end{pmatrix}$$

and $T(x)=(T_1(x),T_2(x),\ldots,T_n(x))^T$ with $T_i(x)=\sin x_i-1$, $i=1,2,\ldots,n$.

In the experiments, the parameters in ROUA were chosen as $r=0.1$ and $\delta_0=\delta_1=\delta_2=10^{-4}$. The program was coded in MATLAB 6.5.1. We stopped the iteration when the condition $\|F(x)\|\le10^{-6}$ was satisfied.
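The experiment can be approximated with the following Python sketch of ROUA on the boundary value test problem. Several details are assumptions made for this illustration rather than facts from the paper: the off-diagonal entries of A are taken as -1, the initial matrices are B_0 = H_0 = I, the first difference quotient uses a previous steplength of 1, the rank-one vector is taken as v_k = delta0 * alpha_k * F(x_k), and the defaults r = 0.1 and 1e-4 for the delta constants are a reading of the stated parameters. The names `make_problem` and `roua` are hypothetical. Only the inverse H is stored, so B never has to be formed.

```python
import numpy as np

def make_problem(n):
    """F(x) = A x + T(x)/(n+1)^2, A = tridiag(-1, 8, -1), T_i(x) = sin(x_i) - 1."""
    A = 8.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return lambda x: A @ x + (np.sin(x) - 1.0) / (n + 1) ** 2

def roua(F, x0, r=0.1, delta0=1e-4, delta1=1e-4, delta2=1e-4, tol=1e-6, max_iter=500):
    """Return (approximate solution, number of iterations used)."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)      # assumption: B_0 = H_0 = identity
    alpha_prev = 1.0        # assumption: steplength used in the first quotient (14)
    Fx = F(x)
    for k in range(max_iter):
        if np.linalg.norm(Fx) <= tol:
            return x, k
        q = (F(x + alpha_prev * Fx) - Fx) / alpha_prev   # difference quotient (14)
        d = -H @ q                                        # direction from (13)
        alpha = 1.0
        while True:                                       # backtracking on (8)
            F_new = F(x + alpha * d)
            decrease = F_new @ F_new - Fx @ Fx
            if decrease <= -(alpha ** 2) * (delta1 * (Fx @ Fx) + delta2 * (d @ d)):
                break
            alpha *= r
            if alpha < 1e-16:
                raise RuntimeError("line search failed")
        v = delta0 * alpha * Fx                           # assumed form of v_k
        Hv = H @ v
        H = H - np.outer(Hv, Hv) / (1.0 + v @ Hv)         # inverse update (12)
        x, Fx, alpha_prev = x + alpha * d, F_new, alpha
    return x, max_iter
```

A typical call is `x, iters = roua(make_problem(10), 5.0 * np.ones(10))`. The iteration counts produced by such a sketch need not match the tables, since the initial matrix and the first steplength are guesses.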

The columns of the tables have the following meaning:

Dim: the dimension of the problem.
NI: the total number of iterations.
NG: the number of function evaluations.
GF: the final value of the function norm $\|F(x)\|$.

Table 1 reports the numerical results for ROUA, and Table 2 reports those for BFGSA. From these two tables, we can see that the numerical results of the two methods are both interesting. For $n\ge40$, the proposed method needs fewer iterations and function evaluations than BFGSA, and its behavior is more stable. Moreover, for ROUA, the initial points and the dimension do not influence the number of iterations very much. However, for BFGSA, the number of iterations increases quickly as the dimension becomes larger. One thing we would like to point out is that $\delta_0$ should be chosen so that it is not too large. Overall, from the numerical results, we can see that ROUA is a robust method for symmetric nonlinear equations.

5. Acknowledgements

We are very grateful to the anonymous referees and the editors for their valuable suggestions and comments, which improved our paper greatly.

6. References

[1] R. Fletcher, Practical Methods of Optimization, 2nd Edition, John Wiley & Sons, Chichester, 1987.

[2] A. Griewank and L. Toint, “Local convergence analysis for partitioned quasi-Newton updates,” Numerical Mathematics, No. 39, pp. 429–448, 1982.

[3] G. L. Yuan and X. W. Lu, “A new line search method with trust region for unconstrained optimization,” Communications on Applied Nonlinear Analysis, Vol. 15, No. 1, pp. 35–49, 2008.

[4] G. L. Yuan and X. W. Lu, “A modified PRP conjugate gradient method,” Annals of Operations Research, No. 166, pp. 73–90, 2009.

[5] G. L. Yuan, X. W. Lu, and Z. X. Wei, “New two-point step size gradient methods for solving unconstrained optimization problems,” Natural Science Journal of Xiangtan University, Vol. 1, No. 29, pp. 13–15, 2007.

[6] G. L. Yuan, X. W. Lu, and Z. X. Wei, “A conjugate gradient method with descent direction for unconstrained optimization,” Journal of Computational and Applied Mathematics, No. 233, pp. 519–530, 2009.

[7] G. L. Yuan and Z. X. Wei, “New line search methods for unconstrained optimization,” Journal of the Korean Statistical Society, No. 38, pp. 29–39, 2009.

[8] G. L. Yuan and Z. X. Wei, “A rank-one fitting method for unconstrained optimization problems,” Mathematica Applicata, Vol. 1, No. 22, pp. 118–122, 2009.

[9] G. L. Yuan and Z. X. Wei, “A nonmonotone line search method for regression analysis,” Journal of Service Science and Management, Vol. 1, No. 2, pp. 36–42, 2009.

[10] R. Byrd and J. Nocedal, “A tool for the analysis of quasi-Newton methods with application to unconstrained minimization,” SIAM Journal on Numerical Analysis, No. 26, pp. 727–739, 1989.

[11] R. Byrd, J. Nocedal, and Y. Yuan, “Global convergence of a class of quasi-Newton methods on convex problems,” SIAM Journal on Numerical Analysis, No. 24, pp. 1171–1189, 1987.

[12] Y. Dai, “Convergence properties of the BFGS algorithm,” SIAM Journal on Optimization, No. 13, pp. 693–701, 2003.

[13] J. E. Dennis and J. J. Moré, “A characterization of superlinear convergence and its application to quasi-Newton methods,” Mathematics of Computation, No. 28, pp. 549–560, 1974.

[14] J. E. Dennis and R. B. Schnabel, “Numerical methods for unconstrained optimization and nonlinear equations,” Prentice-Hall, Inc., Englewood Cliffs, NJ, 1983.

[15] M. J. D. Powell, “A new algorithm for unconstrained optimization,” in Nonlinear Programming, J. B. Rosen, O. L. Mangasarian and K. Ritter, eds. Academic Press, New York, 1970.

[16] Y. Yuan and W. Sun, Theory and Methods of Optimization, Science Press of China, 1999.

[17] G. L. Yuan and Z. X. Wei, “The superlinear convergence analysis of a nonmonotone BFGS algorithm on convex objective functions,” Acta Mathematica Sinica, English Series, Vol. 24, No. 1, pp. 35–42, 2008.

[18] D. Li and M. Fukushima, “A modified BFGS method and its global convergence in nonconvex minimization,” Journal of Computational and Applied Mathematics, No. 129, pp. 15–35, 2001.

[19] D. Li and M. Fukushima, “On the global convergence of the BFGS method for nonconvex unconstrained optimization problems,” SIAM Journal on Optimization, No. 11, pp. 1054–1064, 2001.

[20] Z. Wei, G. Li, and L. Qi, “New quasi-Newton methods for unconstrained optimization problems,” Applied Mathematics and Computation, No. 175, pp. 1156–1188, 2006.

[21] Z. Wei, G. Yu, G. Yuan, and Z. Lian, “The superlinear convergence of a modified BFGS-type method for unconstrained optimization,” Computational Optimization and Applications, No. 29, pp. 315–332, 2004.

[22] G. L. Yuan and Z. X. Wei, “Convergence analysis of a modified BFGS method on convex minimizations,” Computational Optimization and Applications, doi: 10.1007/s10589-008-9219-0.

[23] J. Z. Zhang, N. Y. Deng, and L. H. Chen, “New quasi-Newton equation and related methods for unconstrained optimization,” Journal of Optimization Theory and Applications, No. 102, pp. 147–167, 1999.

[24] Y. Xu and C. Liu, “A rank-one fitting algorithm for unconstrained optimization problems,” Applied Mathematics Letters, No. 17, pp. 1061–1067, 2004.

[25] A. Griewank, “The ‘global’ convergence of Broyden-like methods with a suitable line search,” Journal of the Australian Mathematical Society, Series B, No. 28, pp. 75–92, 1986.

[26] Y. Yuan, “Trust region algorithm for nonlinear equations,” Information, No. 1, pp. 7–21, 1998.

[27] G. L. Yuan, X. W. Lu, and Z. X. Wei, “BFGS trust-region method for symmetric nonlinear equations,” Journal of Computational and Applied Mathematics, No. 230, pp. 44–58, 2009.

[28] J. Zhang and Y. Wang, “A new trust region method for nonlinear equations,” Mathematical Methods of Operations Research, No. 58, pp. 283–298, 2003.

[29] D. Zhu, “Nonmonotone backtracking inexact quasi-Newton algorithms for solving smooth nonlinear equations,” Applied Mathematics and Computation, No. 161, pp. 875–895, 2005.

[30] D. Li and M. Fukushima, “A global and superlinear convergent Gauss-Newton-based BFGS method for symmetric nonlinear equations,” SIAM Journal on Numerical Analysis, No. 37, pp. 152–172, 1999.

[31] G. Yuan and X. Li, “An approximate Gauss-Newton-based BFGS method with descent directions for solving symmetric nonlinear equations,” OR Transactions, Vol. 8, No. 4, pp. 10–26, 2004.

[32] G. L. Yuan and X. W. Lu, “A new backtracking inexact BFGS method for symmetric nonlinear equations,” Computers and Mathematics with Applications, No. 55, pp. 116–129, 2008.

[33] P. N. Brown and Y. Saad, “Convergence theory of nonlinear Newton-Krylov algorithms,” SIAM Journal on Optimization, No. 4, pp. 297–330, 1994.

[34] G. Gu, D. Li, L. Qi, and S. Zhou, “Descent directions of quasi-Newton methods for symmetric nonlinear equations,” SIAM Journal on Numerical Analysis, Vol. 5, No. 40, pp. 1763–1774, 2002.

[35] G. Yuan, “Modified nonlinear conjugate gradient methods with sufficient descent property for large-scale optimization problems,” Optimization Letters, No. 3, pp. 11–21, 2009.

[36] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, No. 7, pp. 17–41, 1981.




A Novel Packet Switch Node Architecture for Contention Resolution in Synchronous Optical Packet

Switched Networks

Virendra Singh SHEKHAWAT*, Dinesh Kumar TYAGI*, V.K. CHAUBEY** Birla Institute of Technology & Science, Pilani, Rajasthan, India

Email: vsshekhawat, dk_tyagi, [email protected] Received March 15, 2009; revised April 30, 2009; accepted June 2, 2009

ABSTRACT

Packet contention is a key issue in optical packet switch (OPS) networks, and a viable solution is to include optical buffering techniques incorporating fiber delay lines (FDLs) in the switch architecture. The present paper proposes a novel switch architecture for packet contention resolution in synchronous OPS networks, employing packet circulation in FDLs in a synchronized manner. A mathematical model for the proposed switch architecture is developed, employing packet queuing control to estimate the blocking probability for the incoming traffic. The switch performance is analyzed with a suitable contention resolution algorithm through computer simulation. The simulation results substantiate the proposed model for the switch architecture.

Keywords: Fiber Delay Lines, Packet Circulation, Optical Packet Switch Networks, Packet Delay Probability, Contention Resolution.

1. Introduction

The growth of optical transmission technology in recent years has been significant, achieving Tbps-class transmission speeds. However, the rapid increase in the bandwidth requirement for optical networks to support high data rates pushes the switching speed limit of the supporting electronic technology. Thus, we need a photonic network that can perform functions such as multiplexing, de-multiplexing, switching, and routing in the optical domain, replacing the electronic control circuitry. A number of research groups have reported various optical sub-wavelength switching approaches [1–3], and the OPS approach [4,5] attracts attention as it is capable of dynamically allocating network resources with fine granularity and excellent scalability. In general, packet-switched optical networks can be divided into two categories: synchronous networks with fixed-length packets and asynchronous networks with fixed-size or variable-size packets. In an OPS network, contention occurs at a switching node whenever two or more packets try to leave the switch fabric on the same output port, on the same wavelength, at the same time. This is a problem that commonly arises in packet switches, and is known as external blocking. Techniques used to address this problem include optical buffering, wavelength conversion and deflection routing.

Among contention resolution methods, wavelength conversion is usually a superior option, as it does not introduce delay in the data path and also avoids packet re-sequencing. On the other hand, determining the number of converters and their placement in the network is an NP-complete problem [6,7]. Deflection routing exploits the space dimension to resolve contention. It introduces delays in the data path and requires packet re-sequencing, as packets may arrive out of order. This makes deflection routing a poor choice for contention resolution [8].

Optical buffering is fundamental to many optical packet switch implementations, which have been proposed widely to overcome the contention problem [9,10]. One of the contending packets is routed through the switch fabric, while the rest are sent to FDLs. Optical buffers may be placed at the input, the output, or both, in a switch. A number of optical buffer arrangements have been proposed in the literature, such as single or multi-stage FDLs and feed-forward or feed-backward connections [11–14].

*CSIS Group, **EEE Group


These three contention resolution schemes have been used in pure form, or combined to implement more sophisticated strategies. Each of these arrangements is used to implement a variety of packet switch architectures [15–18]. These schemes, along with their various combinations, make possible a wide spectrum of contention resolution methods that cover various tradeoffs of performance versus cost and complexity.

It is observed that the existing buffering implementations require a large number of FDLs as well as a complex switch architecture, which increases the overall cost of the switch. In particular, the switch architecture designed in [13] uses a single level of FDLs, i.e. all buffers are in shared mode. The performance of this model degrades as the packet arrival rate rises, or it requires more delay lines to reduce the blocking probability. The switch hardware cost can be better managed with a proper node architecture design and a better contention resolution scheme involving flexible delay lines. This design can be further modified to allow packet circulation in buffers or FDLs. This paper proposes a switch architecture that uses the delay lines efficiently, involving packet circulation to resolve the contention. The proposed node architecture is presented along with a contention resolution algorithm in Section 2.

2. Node Architecture and Contention Resolution Algorithm

The proposed node architecture consists of k fixed-length FDLs labeled f1, f2, …, fk, each having a delay length T similar to the packet transmission time to keep synchronization. Conflicting packets are circulated in the FDLs, which causes delays in multiples of T, i.e. a delay of nT time units after n loop circulations. In case of contention for a particular output port, say m, one packet is transmitted to the desired output port and the remaining packets are diverted to the free fixed-size FDLs as per the proposed algorithm.

Additional output ports are connected to the switch through FDLs as shown in Figure 1. The appropriate FDLs are chosen through the feedback control mechanism to resolve the packet contention. If two packets (packet-1 and packet-2) are competing for the same output port m at time t0, packet-1 can be sent to the desired output port and packet-2 is sent to one of the free FDLs at time t1, as decided by the FDL control system. Packet-2 emerges from the delay line at time t2; simultaneously, the transmission of packet-1 from output port m completes at time t2. Now this port is free to send packet-2 at time t3. Suppose that at time t3 a new packet (packet-3) competes for port m; then packet-2 and packet-3 are in contention with delay times T and 0, respectively. Now packet-3 is sent to one of the free FDLs and packet-2 is directed towards the desired output port. Here a scheduling algorithm based on delay time is developed to divert the packets in the appropriate direction. The packets can be delayed through FDL control for the desired amount of time by circulation. The maximum possible delay time is decided by the signal-to-noise ratio at the output port, which depends upon the fiber characteristics. Eventually this constraint limits the maximum number of circulations for a given packet within the FDL structure. The overall performance of the system improves as the circulation frequency is increased, which compensates for the additional cost of the switch. The proposed algorithm is illustrated through a flow chart in Figure 2.
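The delay-based scheduling rule described above can be sketched for a single time slot in Python. This is an illustrative model under stated assumptions, not the authors' algorithm from Figure 2: every buffered packet is assumed to re-emerge from its FDL after exactly one slot, the packet with the largest accumulated delay wins each output port, and a losing packet is dropped when no FDL is free or its circulation limit is reached. The names `Packet` and `schedule_slot` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    ident: int
    port: int        # requested output port
    delay: int = 0   # circulations completed so far, in units of T

def schedule_slot(arrivals, buffered, num_fdls, max_circulations):
    """Resolve one time slot; return (sent, still_buffered, dropped)."""
    contenders = {}
    # Buffered packets are listed first so that, on equal delay, they win.
    for p in list(buffered) + list(arrivals):
        contenders.setdefault(p.port, []).append(p)
    sent, losers = [], []
    for group in contenders.values():
        group.sort(key=lambda p: -p.delay)   # stable sort: longest delay first
        sent.append(group[0])                # one packet per output port
        losers.extend(group[1:])
    losers.sort(key=lambda p: -p.delay)
    still_buffered, dropped = [], []
    for p in losers:
        if len(still_buffered) < num_fdls and p.delay < max_circulations:
            p.delay += 1                     # one more loop of length T
            still_buffered.append(p)
        else:
            dropped.append(p)
    return sent, still_buffered, dropped
```

Replaying the three-packet example from the text with one FDL, the first slot sends packet-1 and buffers packet-2; in the next slot packet-2 (delay T) beats the newly arrived packet-3, which is buffered in turn.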

A control structure, maintained inside the control block of the switch, keeps track of the status (0 for free and 1 for used) of each delay line, the contended output port (the output port number to which the packet belongs) and the delay time (how long the packet has been delayed). A mathematical model has been developed to visualize the internal behavior of the proposed switch architecture. The model involves auxiliary FDLs to resolve the packet contention, and it is based on the Erlang B data traffic model with data arrival rate λ and packet transmission time 1/μ. Here each fiber consists of W channels.

Consider an output fiber F of the switch, and assume S sources sending optical packet traffic destined for fiber F. Let the on and off periods of each source be exponentially distributed with common means 1/σ and 1/τ, respectively. The mean offered load (ρ) to the system is given in [10]:

Figure 1. Proposed Packet Switch Node Architecture.


Figure 2. Flow Chart for the Contention Resolution Algorithm.


S

W

(1)

A switch with M inputs and N outputs can be modeled as an M/M/N/N queue. The probability that a packet has to wait for service (i.e. the packet blocking probability) is given by the following Erlang C formula:

$$P(d>0)=\frac{\dfrac{\rho^W}{W!}\,\dfrac{W}{W-\rho}}{\displaystyle\sum_{i=0}^{W-1}\frac{\rho^i}{i!}+\frac{\rho^W}{W!}\,\frac{W}{W-\rho}} \tag{2}$$

In the proposed switch architecture, if the destined output port for the packet is not free, the packet can be sent to one of the free FDLs for a fixed amount of delay (i.e. equal to the packet transmission time). Assuming that there are D delay lines in total, the number of outputs becomes L = W + D, so the switch can now be modeled as an M/M/L/L queue. The new packet delay probability is given by the following equation:

$$P(d>0)=\frac{\dfrac{\rho^L}{L!}\,\dfrac{L}{L-\rho}}{\displaystyle\sum_{i=0}^{L-1}\frac{\rho^i}{i!}+\frac{\rho^L}{L!}\,\frac{L}{L-\rho}} \tag{3}$$

The probability that a packet is delayed by more than t seconds can now be calculated by multiplying the probability of a delay greater than zero by the negative exponential of tμ(L-ρ):

$$P(d>t)=P(d>0)\,e^{-t\mu(L-\rho)} \tag{4}$$

Using Equation (4), we can calculate the packet blocking probability for a given delay time of t seconds. For example, t = 1 ms gives the probability that a packet is delayed by more than 1 ms.
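The delay probabilities can be evaluated numerically with a few lines of Python. The sketch below implements the standard Erlang C waiting probability and the exponential tail described in the text, under the assumption that ρ is the total offered load in Erlang and that ρ is smaller than the number of outputs. The function names `erlang_c` and `delay_tail` are hypothetical.

```python
from math import exp, factorial

def erlang_c(rho, servers):
    """Erlang C probability of waiting, P(d > 0), for offered load rho < servers."""
    if not 0 <= rho < servers:
        raise ValueError("offered load must satisfy 0 <= rho < servers")
    # Weight of the 'all servers busy' states.
    tail = rho ** servers / factorial(servers) * servers / (servers - rho)
    # Weight of the states with fewer than 'servers' packets in service.
    head = sum(rho ** i / factorial(i) for i in range(servers))
    return tail / (head + tail)

def delay_tail(rho, channels, fdls, mu, t):
    """P(d > t) = P(d > 0) * exp(-t * mu * (L - rho)) with L = channels + fdls."""
    total_outputs = channels + fdls
    return erlang_c(rho, total_outputs) * exp(-t * mu * (total_outputs - rho))
```

For instance, `delay_tail(5, 8, d, 1.0, 1.0)` for d = 1, …, 4 evaluates the tail for eight channels at a load of 5 Erlang with a delay equal to one packet hold time; the value shrinks as FDLs are added because both the waiting probability and the exponential factor decrease with L.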

Here:
MaxD = maximum delay for the packet
St(FDLk) = status of FDLk (0 for free and 1 for busy)
D(Pi) = delay for the ith packet
Drop_pkt = drop packet count

3. Performance Evaluation and Discussion

The proposed switch architecture has been characterized for its packet delay performance and blocking probability under the influence of varying numbers of fiber delay lines. The performance of the proposed switch architecture handling eight channels has been evaluated and compared with the conventional switch architecture.

The packet delay for a switch with different numbers of delay lines is computed using Equation (4), and the corresponding results for 1 to 4 delay lines are presented in Figure 3. It is observed that the packet delay probability decreases with the inclusion of more FDLs. It is also inferred that, for a given FDL structure, the probability decreases as the packet delay time, i.e. the number of circulations allowed in the FDLs, increases. Figure 3 shows that increasing the number of FDLs has a greater effect on the packet delay probability than increasing the delay tenfold.

Figure 3. Packet delay probability for FDLs as a function of delay time with ρ=5 Erlang.

The influence of traffic load on the delay probability of the proposed switch architecture has been computed using the developed mathematical model, and the results are depicted in Figure 4 for a ratio of delay time to packet hold time of 1. It is observed that the delay probability is not a linear function of the traffic intensity: for a normal switch without FDLs it increases rapidly with the rise in traffic in the lower traffic range, but this gradient reduces in the higher traffic range. For the proposed switch architecture with a single FDL, the qualitative behavior of the delay probability is similar, but the numerical values are lower. It is further observed that the inclusion of more FDLs in the switch lowers the delay probability still further, but not by a significant margin. It is interesting to note that although the delay probability for a given load is lower with FDLs in the switch, its gradient is slightly higher. This behavior reveals that with FDLs, traffic fluctuation may cause more variation in the delay probability than in a switch without FDLs. The packet delay analysis for different ratios of delay time to packet hold time helps to appreciate the physical operation of the switch. Figure 5 shows the variation of packet delay probability for a ratio of delay time to packet hold time of 10. As we increase the delay time, the delay probability decreases significantly, because more circulation in the delay lines gives the packet a better chance of being processed. The curves in Figure 5 are qualitatively similar to the curves in Figure 4 but




Figure 4. Packet delay probability as a function of the traffic load (ρ) for fixed size FDLs.

Figure 5. Packet delay probability as a function of the traffic load (ρ) for fixed size FDLs.

Figure 6. Packet delay probability as a function of the delay time for FDLs.

with a more significant numerical difference. The effect of delay time on packet delay probability for different numbers of FDLs has also been investigated, and the results are shown in Figure 6. The packet delay probability decreases as the time delay in the switch rises, and this tendency is maintained when further FDLs are included in the switch. It may, however, be noted that the slope of the curve is steeper for a switch having more FDLs, showing a better reduction in the packet delay probability for a slight increment in delay time.

4. Simulation Study

The proposed switch architecture has been simulated with 5 output ports and with the number of FDLs varied from 1 to 4. A request set of varying size, from one to ten units, is randomly generated and repeated 10 times for each random value. The number of circulations for each packet is varied from 0 to 3, and the results are presented in Figure 7. The throughput is almost equal at lower traffic (i.e. 10 to 30 requests) for different numbers of delay lines in the switch. This may be attributed to the easy availability of output ports at the lower incoming packet rate. At a moderate traffic rate (i.e. 40 to 70 requests), however, the availability of free output ports decreases due to packet contention, which reduces the throughput when the switch has fewer FDLs. The simulation curves support the expected results, showing a marked drop in throughput for one FDL compared with four FDLs. The curves also show that the throughput difference decreases at higher traffic rates (i.e. 80 to 100 requests), where the throughput is nearly the same irrespective of the number of FDLs, indicating the limiting packet handling capacity of the FDLs. The FDLs are therefore beneficial up to a certain traffic rate, after which switch performance is controlled by the input-output ports involved.

The switch behavior for incoming traffic with three delay lines and different numbers of packet circulations in the FDLs has also been investigated. The simulation results are presented in Figure 8 for 0, 1, 2 and 3 circulation loops. As expected, the case with no packet circulation loop shows the minimum throughput. The curves also show that, at a moderate traffic rate, the throughput is improved significantly by circulating packets, even with a single circulation loop. However, these improvements are insignificant at lower and higher data rates due to the constraint of the number of input-output ports.

The presented analysis confirms that the switch size, i.e. the number of input-output ports, limits the throughput and makes it nearly constant after a given traffic rate. However


Figure 7. Throughput as a function of no. of requests for FDLs.

Figure 8. Throughput as a function of no. of requests for packet circulation(s).

this traffic limit is increased by using additional delay lines in the switch. The handling capacity of the switch may be further improved by circulating the packets in the delay lines, which gives a higher probability of finding the desired output port. The number of circulations is a measure of packet delay time and can be limited by the maximum time delay allowed in the switch.

5. Conclusions

A fiber delay line as a solution to packet contention in OPS networks has been discussed, establishing a novel switch architecture with an appropriate mathematical model to simulate the traffic behavior. The model gives a more generic solution to the packet conflict problem at the switching node using FDLs. Packet circulation in delay lines is the key feature of the proposed method; it improves connection probability up to a significant traffic rate and gives better utilization of FDLs. The theoretical and simulated results confirm each other. An important observation is that the maximum number of FDLs and the number of circulations have a significant impact on throughput for a given switch. Performance analysis of the proposed switch architecture for asynchronous packet switching can be done as future work. Further, this model can be analyzed with a priority-based packet scheduling algorithm to control the traffic.

6. References

[1] S. Yao, B. Mukherjee, S. J. B. Yoo, and S. Dixit, “A unified study of contention resolution schemes in optical packet switched networks,” Journal of Lightwave Technology, Vol. 21, No. 3, pp. 672–683, 2003.

[2] I. Chlamtac, V. Elek, A. Fumagalli, and C. Szabo, “Scalable WDM access network architecture based on photonic slot routing,” IEEE/ACM Transactions on Networking, Vol. 7, No. 1, pp. 1–9, February 1999.

[3] S. P. Singh, A. Mukherjee, and V. K. Chaubey, “Wavelength conversion algorithm in an intelligent optical network from a multilayered approach,” Journal of Optical Networking (OSA), No. 3, pp. 354–362, 2004.

[4] S. Rangarajan, Z. Hu, and L. Rau, “All optical contention resolution with wavelength conversion for asynchronous variable-length 40Gb/s optical packets,” IEEE Photonics Technology Letters, Vol. 16, No. 2, February 2004.

[5] S. Sen and V. K. Chaubey, “A novel electronic device for high speed WDM optical network operations capable of intelligent routing based on simulated electrical network approach,” Optics Communications, No. 248, pp. 131–146, 2005.

[6] S. L. Danielsen, P. B. Hansen, and K. E. Stubkjaer, “Wavelength conversion in optical packet switching,” IEEE/OSA Journal of Lightwave Technology, Vol. 16, No. 12, pp. 2095–2108, December 1998.

[7] B. Ramamurthy and B. Mukherjee, “Wavelength conversion in WDM networking,” IEEE Journal on Selected Areas in Communications, Vol. 16, pp. 1061–1073, September 1998.

[8] R. Tucker and W. Zhong, “Photonic packet switching,” IEICE Transactions on Communications, E–82–B(2), pp. 254–264, February 1999.

[9] D. K. Hunter, M. C. Chia, and I. Andonovic, “Buffering in optical packet switches,” Journal of Lightwave Technology, Vol. 16, No. 12, pp. 2081–2094, December 1998.

[10] F. Callegati, “Optical buffers for variable length packets,” IEEE Communications Letters, Vol. 4, No. 9, pp. 292–294, September 2004.

[11] L. Tancevski, S. Yegnanarayanan, G. Castanon, L. Tamil, F. Masetti, and T. McDermott, “Optical routing of asynchronous, variable length packets,” IEEE Journal on Selected Areas in Communications, Vol. 18, pp. 2084–2093, October 2000.

[12] D. Fiems, K. Laevens, and H. Bruneel, “Performance analysis of an all-optical packet buffer,” IEEE Conference on Optical Design and Modeling, pp. 221–226, February 2005.

[13] H. Mellah and F. M. Abbou, “Contention resolution in slotted WDM optical packet switching networks using cascaded auxiliary switch,” IEE Proceedings of Communications, Vol. 153, No. 2, pp. 205–206, April 2006.

[14] K. Laevens and H. Bruneel, “Analysis of single wavelength optical buffer,” in Proceedings IEEE INFOCOM 200, Vol. 3, San Francisco, CA, USA, pp. 2262–2267, March 2003.

[15] D. Hunter and I. Andonovic, “Approaches to optical Internet packet switching,” IEEE Communications Magazine, pp. 116–122, September 2000.

[16] T. Zhang, K. Lu, and J. P. Jue, “An analytical model for shared fiber-delay line buffers in asynchronous optical packet and burst switches,” Proceedings of ICC’05, pp. 1636–1640, May 2005.

[17] A. Rostami and S. S. Chakraborty, “On performance of optical buffers with fixed length packets,” Proceedings of 2nd IFIP WOCN Conference, Dubai, pp. 1570–1572, March 2005.

[18] A. Mukherjee, S. P. Singh, and V. K. Chaubey, “Wavelength conversion algorithm in an intelligent WDM network,” Optics Communications, No. 230, pp. 59–65, 2004.



A Network Intrusion Detection Model Based on Immune Multi-Agent

Nian LIU1,2, Sunjun LIU3, Rui LI3, Yong LIU4
1College of Computer Science, Sichuan University, Chengdu, Sichuan, China
2School of Electrical Engineering and Information, Sichuan University, Chengdu, Sichuan, China
3The Software Engineering College of Chengdu University of Information Technology, Chengdu, Sichuan, China
4Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu, Sichuan, China
Email: [email protected]

Received March 18, 2009; revised May 5, 2009; accepted June 18, 2009

ABSTRACT

A new network intrusion detection model based on immune multi-agent theory is established, and the concept of multi-agents is advanced to realize the logical structure and running mechanism of the immune multi-agent as well as a multi-level, distributed detection mechanism against network intrusion. The model uses the adaptability, diversity and memory properties of the artificial immune algorithm and combines them with the robustness and distributed character of the multi-agent system structure. The experimental results show that the system performs well in network security detection.

Keywords: Artificial Immune, Intrusion Detection, Multi-Agent System

1. Introduction

With the rapid development of network technology and the fast upgrading of network attack techniques, network security has become a focus of this age. However, current intrusion detection technologies [1,2], such as statistical analysis, characteristics analysis and expert systems, cannot meet all the needs well. First, their lack of adaptability makes it difficult to detect unknown attacks. Second, their lack of robustness leaves each part isolated, without communication. Therefore, a detection system with adaptability and robustness is urgently needed.

The Biological Immune System (BIS) [3] identifies and eliminates foreign bodies intruding into the organism by means of immune cells. From the aspect of information processing, BIS is a distributed autonomic system, and its self-learning, adaptability and robustness serve as important inspirations for solving network security problems. In 1974, Jerne [3] brought forth the first mathematical model of the immune system; later, in 1994, Forrest [4] put forward the concept of a computer immune system for the detection of network intrusion. Since then, artificial immune systems based on biological immune theory have been applied extensively in network security.

An Agent [5] is an entity that possesses perceiving, analyzing and reasoning mechanisms. A multi-agent [5,6] system, with fairly strong distributed character, robustness and coordination, solves problems in complicated environments by harmonizing the interaction and cooperation among separate Agents.

In this article, a new network intrusion detection model based on immune multi-agent (NIDIMA) is established, and the concept of immune multi-agent is advanced in building the logical structure and running mechanism of the immune multi-agent, to realize a multi-level, distributed detection mechanism against network intrusion. This model provides a new way of safeguarding networks, and the experiment shows it to be an effective solution to network security detection.

2. System Principles of the Detection Model Based on Immune Multi-Agent

Apart from inheriting the original characteristics of an Agent, an immune agent (IA) also has the properties of evolution, diversity, tolerance and detection [7,8].

1) Evolution: Following the law of evolution, IA promotes antibodies that effectively recognize antigens into a higher form, while eliminating the inefficient ones. In this way, the self-learning ability of detection is realized.

2) Diversity: The matching of antibodies and antigens adopts fuzzy matching [9], which only needs to meet a preset value. This incomplete matching realizes the diversity of recognition: it enables immune antibodies to recognize various kinds of antigens, so that in theory antibodies covering the whole form space can be produced.

3) Tolerability: Immune tolerability refers to the non-response status of immune cells towards the peculiarities revealed by certain kinds of antigens [10]. The tolerability of IA is of great significance in maintaining and balancing the model.

4) Detection property: The immune system transmits the produced immune cells to each organ in vivo to increase immunologic competence. A network security model based on this mechanism has strong detection ability.

Combining multi-agent and AIS technology, the detection model constitutes a multi-direction, multi-level intelligent network security model. Its mapping to the BIS model is shown in Table 1, and its system structure is shown in Figure 1.

The model adopts a large-scale distributed system structure. First, a series of network situation awareness agents and a network security situation evaluation agent are deployed in the target networks and their host computers.

The security situation evaluation agent gathers information about the security situation of subordinate subnets and host computers from each security situation awareness agent in order to evaluate the overall risk to the whole network. The information includes the type, quantity, strength and harmfulness of the attacks.

The network security situation awareness agent shown in Figure 1 is itself a sub-network security situation awareness system, defined recursively. It mainly monitors the security situation of the sub-network within its control, specifically the type, strength and harmfulness of the attacks suffered by the sub-network in real time. Because there may be subnets under subnets, a sub-network security situation awareness agent may be composed of lower-level sub-network security situation awareness agents. Eventually, the security situation awareness agents that monitor a specific host computer are made up of intrusion detection and security situation evaluation of that host computer.

In this architecture, the IA distributed at each host computer node first recognizes intrusion events. When unknown attacks are discovered through learning and memory, the information is sent to the corresponding SMC, while vaccines that identify the new attacks are distributed to each node within the same network segment to improve the intrusion defense capability of each node. The SMC analyzes the intrusion information sent by each IA in the network segment, and the SMC of the attacked network segment will send vaccines to

Table 1. The relationship between the BIS and the NIDIMA.

Biological immune system | Network intrusion detection system
Organism | Network
Organ | Network segment
Cell | Host computer
Vaccine distribution | The transmission of intrusion information
Antigen | The binary character string feature-extracted from IP packets
B Cell, Antibody | Antibodies represented in binary character string
Cell clone | Duplication of antibody

Figure 1. Architecture of NIDIMA.

Figure 2. Architecture of immune agent.


other non-intruded network segments as an early warning, so that detection is realized across the whole network.

3. System Mechanism of the Network Intrusion Model Based on Immune Multi-Agent

Figure 2 shows the logical structure of IA, which includes self antigens, immature immune antibodies, mature immune antibodies, memory immune antibodies, etc. The working procedure involves two interacting, concurrent cycles: the cycle in which immune antibodies detect external antigens, and the cycle in which immune antibodies evolve.

3.1. The Definition of Immune Elements

Definition 1: Antigens are binary character strings of length l that are feature-extracted from network IP data packets. Let $U = \{0, 1\}^l$ ($l > 0$); the antigen assembly is $Ag \subseteq U$, and an antigen mainly includes the IP address, port, protocol type, etc.

$$Ag = \{\langle a, b\rangle \mid a \in D \wedge b \in \Psi \wedge |a| = l \wedge a = APCs(b)\} \tag{1}$$

Definition 2: The antigen assembly Ag is classified into two subsets, the self assembly Self (normal network behavior) and the non-self assembly Nonself (abnormal network behavior):

$$Self \cup Nonself = Ag, \qquad Self \cap Nonself = \emptyset$$

Antibodies and antigens are binary character strings with the same features and length. The antibody assembly $B \subseteq U$ is defined as:

$$B = \{Ab \mid Ab = \langle s, age, count, ag\rangle\} \tag{2}$$

An antibody Ab is a quadruple, in which s is a binary string of length l, age is the age of the antibody, count is the number of antigens matched by the antibody, and ag is the antigens detected by the antibody. Antibodies are classified into three categories: mature antibodies, memory antibodies and immature antibodies. Mature antibodies are tolerant to self bodies and have not yet been activated by antigens; the mature antibodies assembly is $T_{Ab} \subseteq U$. Memory antibodies evolve from mature antibodies that have been activated; the memory antibodies assembly is $M_{Ab} \subseteq U$. Immature antibodies are antibodies that have not undergone self-tolerance; the immature antibodies assembly is $I_{Ab} \subseteq D$.

Definition 3: Affinity is the main criterion for judging whether an antibody and an antigen match. It is calculated by Formula (3), where a value of 1 means matching and any other value means non-matching. In the formula, $x \in Ag$, $y \in B$, $x_i$ and $y_i$ are the i-th characters of x and y respectively, l is the length of the character string, and $\theta$ is the threshold value of affinity matching.

$$f_{match}(x, y) = \begin{cases} 1, & f_{h\_dis}(x, y)/l \ge \theta \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

$$f_{h\_dis}(x, y) = \sum_{i=1}^{l} (x_i - y_i)^2$$
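As an illustration of Formula (3), the following minimal Python sketch computes the distance and the match decision for two bit strings. The function names, the representation of antigens and antibodies as Python strings of '0' and '1' characters, and the direction of the threshold comparison (at least θ) are assumptions made for this example; it is not the authors' implementation.

```python
def f_h_dis(x: str, y: str) -> int:
    """Sum of squared bit differences, i.e. the number of differing positions."""
    if len(x) != len(y):
        raise ValueError("antigen and antibody strings must have equal length")
    return sum((int(a) - int(b)) ** 2 for a, b in zip(x, y))


def f_match(x: str, y: str, theta: float = 0.7) -> int:
    """Return 1 when the normalized distance reaches the threshold, else 0.

    Assumption: the comparison is '>= theta'.
    """
    return 1 if f_h_dis(x, y) / len(x) >= theta else 0


if __name__ == "__main__":
    antigen = "11001010"
    antibody = "00110100"  # differs from the antigen in 7 of 8 positions
    print(f_h_dis(antigen, antibody), f_match(antigen, antibody))
```

Because the squared difference of two bits is 1 exactly when the bits differ, the sum is the ordinary Hamming distance between the two strings.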

3.2. The Changing Process of Immature Antibodies

Let I be the number of immature antibodies included in $I_{Ab}$ at a certain time. The dynamic changing formula of the immature antibodies assembly is:

$$I(t + \Delta t) = I(t) + I_{new} \cdot \Delta t - \left(\frac{\partial I_{mature}}{\partial x_{mature}} + \frac{\partial I_{dead}}{\partial x_{dead}}\right) \Delta t \tag{4}$$

Formula (4) indicates that the change of the $I_{Ab}$ assembly is divided into two processes, inflow and outflow. Inflow is the process of newly produced immature antibodies joining the $I_{Ab}$ assembly: $\Delta I = I_{new} \times \Delta t$, where $I_{new}$ is the production rate of immature antibodies per unit of time. Outflow is the process of removing immature antibodies from the $I_{Ab}$ assembly, and it has two directions: $\frac{\partial I_{mature}}{\partial x_{mature}} \Delta t$ represents the number of immature antibodies removed because they pass tolerance and evolve into mature antibodies, and $\frac{\partial I_{dead}}{\partial x_{dead}} \Delta t$ represents the number of immature antibodies removed because tolerance fails.
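The inflow and outflow balance described above can be sketched as a single discrete update step. This is only an illustrative reading of the update: the function name and the treatment of the two outflow terms as constant per-unit-time rates are assumptions of the example, not part of the original model.

```python
def immature_step(i_t: float, i_new: float, mature_rate: float,
                  dead_rate: float, dt: float = 1.0) -> float:
    """Advance the immature antibody count by one time step dt.

    i_new       : production rate of immature antibodies per unit time (inflow)
    mature_rate : rate at which antibodies pass tolerance and mature (outflow)
    dead_rate   : rate at which antibodies fail tolerance and die (outflow)
    The two outflow rates are assumed constant over the step.
    """
    inflow = i_new * dt
    outflow = (mature_rate + dead_rate) * dt
    return i_t + inflow - outflow


if __name__ == "__main__":
    population = 100.0
    for _ in range(3):
        population = immature_step(population, i_new=10, mature_rate=3, dead_rate=2)
        print(population)
```

With these illustrative rates the population grows by 5 per unit time, since production exceeds the combined outflow.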

In order to avoid matching between antibodies and self bodies, newly produced immature antibodies may match antigens only after passing self tolerance. The tolerance process is given in Formula (5), where 1 means passing self tolerance, 0 means failing self tolerance, and $x \in I_{Ab}$.

$$f_{tolerate}(x) = \begin{cases} 0, & \exists y \in Self : f_{match}(x, y) = 1 \\ 1, & \text{otherwise} \end{cases} \tag{5}$$
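The self tolerance test can be sketched as a filter over candidate immature antibodies, in the spirit of negative selection. The sketch below is hypothetical: the helper names, the string representation, and the matching rule (normalized Hamming distance of at least θ) are assumptions carried over for illustration only.

```python
from typing import Iterable, List


def f_match(x: str, y: str, theta: float = 0.7) -> int:
    """Assumed matching rule: normalized Hamming distance of at least theta."""
    distance = sum(a != b for a, b in zip(x, y))
    return 1 if distance / len(x) >= theta else 0


def f_tolerate(x: str, self_set: Iterable[str], theta: float = 0.7) -> int:
    """Return 0 if x matches any self element (tolerance fails), else 1."""
    return 0 if any(f_match(x, y, theta) == 1 for y in self_set) else 1


def negative_selection(candidates: Iterable[str], self_set: List[str],
                       theta: float = 0.7) -> List[str]:
    """Keep only the candidates that pass self tolerance."""
    return [x for x in candidates if f_tolerate(x, self_set, theta) == 1]


if __name__ == "__main__":
    self_set = ["0000"]
    candidates = ["1111", "0001", "0111"]
    print(negative_selection(candidates, self_set))
```

Candidates that match any self string are discarded, so the surviving antibodies cannot raise an alarm on the normal behavior captured in the self set.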

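3.3. The Changing Process of Mature Antibodies

Let T be the number of mature antibodies included in $T_{Ab}$ at a certain time. The dynamic changing formula of the mature antibodies assembly is:

$$T(t + \Delta t) = T(t) + \left(\frac{\partial T_{tolerate}}{\partial x_{tolerate}} + \frac{\partial T_{clone}}{\partial x_{clone}}\right) \Delta t - \left(\frac{\partial T_{active}}{\partial x_{active}} + \frac{\partial T_{dead}}{\partial x_{dead}}\right) \Delta t \tag{6}$$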

Formula (6) indicates that the change of the $T_{Ab}$ assembly is divided into two processes, inflow and outflow. Inflow is the process of antibodies joining $T_{Ab}$, and there are two ways: $\frac{\partial T_{tolerate}}{\partial x_{tolerate}} \Delta t$ represents the number added by immature antibodies that pass tolerance and evolve, and $\frac{\partial T_{clone}}{\partial x_{clone}} \Delta t$ represents the number added by clonal selection of memory antibodies. Outflow is the process of removing mature antibodies, and it also has two directions: $\frac{\partial T_{active}}{\partial x_{active}} \Delta t$ represents the number of mature antibodies that are activated and evolve into memory antibodies, while $\frac{\partial T_{dead}}{\partial x_{dead}} \Delta t$ represents the number that die from failed activation.

The assembly $T_{active}$ of mature antibodies that have been activated and evolved into memory antibodies is given in Formula (7), and the assembly $T_{dead}$ of mature antibodies that have failed activation is given in Formula (8), where $\beta$ is the activation threshold and $\lambda$ is the life cycle of mature antibodies.

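$$T_{active} := \{x \mid x \in T_{Ab},\ x.count \ge \beta \wedge x.age \le \lambda\} \tag{7}$$

$$T_{dead} := \{x \mid x \in T_{Ab},\ x.count < \beta \wedge x.age > \lambda\} \tag{8}$$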
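A small sketch may help to show how the two assemblies partition the mature antibodies. The `Antibody` class mirrors the quadruple of Definition 2; the class name, the function name, and the comparison operators (count ≥ β with age ≤ λ for activation, count < β with age > λ for death) are assumptions of this example rather than details confirmed by the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Antibody:
    s: str                      # binary string of length l
    age: int = 0                # age of the antibody
    count: int = 0              # number of antigens matched so far
    ag: List[str] = field(default_factory=list)  # antigens detected


def split_mature(t_ab: List[Antibody], beta: int,
                 lam: int) -> Tuple[List[Antibody], List[Antibody]]:
    """Return (T_active, T_dead) under the assumed comparison operators."""
    t_active = [x for x in t_ab if x.count >= beta and x.age <= lam]
    t_dead = [x for x in t_ab if x.count < beta and x.age > lam]
    return t_active, t_dead


if __name__ == "__main__":
    pool = [
        Antibody("1010", age=30, count=25),  # enough matches within its life cycle
        Antibody("0101", age=60, count=5),   # too old, too few matches
        Antibody("1111", age=10, count=3),   # still young, not yet activated
    ]
    active, dead = split_mature(pool, beta=20, lam=50)
    print([a.s for a in active], [d.s for d in dead])
```

Antibodies that fall into neither assembly simply remain in the mature pool and continue to age and accumulate matches.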

3.4. The Changing Process of Memory Antibodies

Let M be the number of memory antibodies contained in $M_{Ab}$ at a certain time. The dynamic changing formula of the memory antibodies assembly is:

$$M(t + \Delta t) = M(t) + \left(\frac{\partial M_{active}}{\partial x_{active}} + \frac{\partial M_{bacterin}}{\partial x_{bacterin}}\right) \Delta t \tag{9}$$

Because memory antibodies have an infinite life cycle, the change of the $M_{Ab}$ assembly has only an inflow process, without an outflow process of dead memory cells. The inflow of memory antibodies is converted from the activated mature antibodies $T_{active}$:

$$\frac{\partial M_{active}}{\partial x_{active}} \Delta t = \frac{\partial T_{active}}{\partial x_{active}} \Delta t$$

3.5. Immune Surveillance

During the detection of network behaviors, IA mainly uses mature cells and memory cells to detect antigens, and it can detect non-self antigens efficiently and rapidly. The detailed steps are as follows:

1) Antigen presentation: The feature information of an IP packet is extracted from the actual network data flow to constitute a binary string of length l, which is then regularly put into the antigen assembly Ag as an antigen.

2) Using memory antibodies MAb to detect antigens: Non-self bodies that match memory antibodies are removed, and memory antibodies that have detected self bodies are also removed.

3) Using mature antibodies TAb to detect antigens: The non-self bodies that match TAb are removed, and a TAb that has detected enough antigens within its life cycle is activated and evolves into a memory antibody MAb; a TAb that has not been activated, or that has detected self elements within its life cycle, dies.

4) Self body assembly upgrade: After detection, the remaining antigens join the self body assembly, which keeps the self bodies dynamically updated; they undergo self tolerance with immature antibodies and maintain the dynamic evolution of antibodies.

4. Simulation Experiment and Result Analysis

4.1. Experiment Environment and Parameter Settings

The experiment environment consists of two network segments, A and B, each composed of 20 host computers of the same configuration. Ai and Bi (1 ≤ i ≤ 20) denote the i-th host computer in segments A and B respectively. The experiment uses part of the KDDCUP99 [11] data from the MIT Lincoln Laboratory as training and test data. The training set is normal network data without any attack, and the test sets include normal data and attack data. The attack data is classified into 4 categories: DoS, Probe, U2R and R2L. Test set I includes guess_passwd, buffer_overflow and other attacks, 7 in total, and test set Ⅱ includes the attacks in test set I and 10 or more other attacks, e.g. land, spy and perl.

The antigen data structure is composed of source and destination IP address, port number, protocol type, IP tag field, IP packet length, TCP/UDP/ICMP domain, etc., with l = 172 and affinity matching threshold θ = 0.7.

4.2. Experiment Results and Performance Analysis

The activation threshold β and the life cycle λ have a large influence on TP and FP, the test performance measures of ADNIIMA; therefore, an optimization test should be carried out. The experiment results are shown in Figure 3. When the activation threshold β is relatively small, mature antibodies are activated without thorough learning, and FP is relatively high; when the life cycle λ is relatively small, mature antibodies die without activation and memory antibodies become scarce, which lowers TP. As β and λ increase, TP increases and FP decreases. However, when β and λ are too large, TP decreases and FP increases instead. After thorough analysis, it is found


Table 2. Experiment data list.

Attack Type | Training set: Attack Times | Training set: Attack Type | Test set I: Attack Times | Test set I: Attack Type | Test set Ⅱ: Attack Times | Test set Ⅱ: Attack Type
Normal | 18300 | 0 | 17500 | 0 | 14400 | 0
DoS | 4580 | 1 | 5140 | 2 | 5580 | 5
Probe | 1250 | 1 | 2700 | 2 | 4240 | 4
U2R | 560 | 1 | 650 | 2 | 420 | 4
R2L | 290 | 1 | 300 | 1 | 360 | 3

Figure 3. Effect of the activation threshold β and the lifecycle λ. [Four panels: TP and FP versus time for β = 10, 20, 30, and TP and FP versus time for λ = 20, 50, 80.]

out that when β = 20 and λ = 50, FP < 6% and TP > 93%, and the effect is better.

Table 3 and Table 4 list the comparative test results of this model and BRO [12], a detection tool based on pattern matching. From the tables we can infer that the detection model has higher TP and lower FP, and that it recognizes more types of unknown attacks than BRO, which demonstrates that this model has strong self-learning ability and adaptability.

Figure 4 shows the TP and FP curves of host computer A2, which relies on self evolution for detection, and of host computer A3, which uses vaccines emitted from A1 for detection. From the figure we can conclude that in the later phase of the experiment the TP and FP curves of A2 and A3 almost coincide. In the early phase of the experiment, however, A2, which relies on itself for detection, has a low detection rate due to a lack of antibodies, and the attacks cause severe damage during this phase, which is unacceptable for a key network node.

5. Conclusions

The active defense model for network intrusion based on artificial immune multi-agent put forth in this article has the following advantages compared with other network intrusion defense techniques: 1) Self-learning: the memory mechanism and the antibody generation mechanism not only detect well-known attacks, but can also recognize unknown attacks. 2) Multi-level: the model introduces the concept of the vaccine, which strengthens network nodes and the connection between network segments. 3) Robustness: the model applies a distributed system structure, so that a single node under attack does not impact the detection abilities of other nodes. In a word, the experiment results reveal that the model differs greatly from the isolated and passive defense of traditional network security models, and that it is a better solution to network security detection.


Table 3. Detection results of ADNIIMA model experiment.

Attack Type | Test Set I: Recognition Type | Test Set I: TP (%) | Test Set I: FP (%) | Test Set Ⅱ: Recognition Type | Test Set Ⅱ: TP (%) | Test Set Ⅱ: FP (%)
Normal | 0 | 98.6 | 0 | 0 | 97.2 | 0
DoS | 2 | 97.2 | 2.8 | 5 | 97.1 | 2.9
Probe | 2 | 96.5 | 3.5 | 4 | 94.3 | 5.7
U2R | 2 | 94.5 | 5.5 | 4 | 95.4 | 4.6
R2L | 1 | 95.2 | 4.8 | 3 | 94.3 | 5.7

Table 4. Detection results of BRO experiment.

Attack Type | Test Set I: Recognition Type | Test Set I: TP (%) | Test Set I: FP (%) | Test Set Ⅱ: Recognition Type | Test Set Ⅱ: TP (%) | Test Set Ⅱ: FP (%)
Normal | 0 | 97.5 | 2.5 | 0 | 97.2 | 2.8
DoS | 1 | 73.3 | 26.7 | 2 | 53.6 | 46.4
Probe | 1 | 72.2 | 27.8 | 2 | 52.1 | 47.9
U2R | 1 | 68.5 | 31.5 | 1 | 45.6 | 54.4
R2L | 1 | 94.5 | 5.5 | 2 | 63.5 | 36.5

Figure 4. Comparison of detecting effect between solitude evolution and vaccine reception. [Two panels: TP versus time and FP versus time, each with curves for solitude evolution and vaccine reception.]

6. References

[1] Y. Bai and H. Kobayashi, “Intrusion detection systems: Technology and development,” IEEE Advanced Information Networking and Applications, pp. 710–715, 2003.

[2] A. Pilz and J. Swoboda, “Network management information models,” Aeu-International Journal of Electronics and Communications, Vol. 58, pp. 165–171, 2004.

[3] Y. L. Dong, J. Qian, and M. L. Shi, “A cooperative intrusion detection system based on autonomous agents,” IEEE CCECE 2003, Vol. 2, pp. 861–863, 2003.

[4] P. D’haeseleer and S. Forrest, “An immunological approach to change detection: Algorithm, analysis and implication,” in Proceedings of IEEE Symposium on Research in Security and Privacy, Oakland, pp. 110–119, 1996.

[5] J. Kim and P. Bentley, “The artificial immune model for network intrusion detection,” 7th European Congress on Intelligent Techniques and Soft Computing, 1999.

[6] P. K. Harmer and G. B. Lamont, “An agent based architecture for a computer virus immune system,” Proceedings of the Genetic and Evolutionary Computation Conference, Orlando, Florida, USA, 1999.

[7] F. Esponda, S. Forrest, and P. Helman, “A formal framework for positive and negative detection schemes,” IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics, Vol. 34, No. 1, pp. 357–373, 2004.

[8] I. M. Hegazy, H. M. Faheem, T. Al-Arif, and T. Ahmed, “Evaluating how well agent-based IDS perform,” Potentials, Digital Object Identifier, IEEE, Vol. 24, 27–30, 2005.

[9] P. Ballet and V. Rodin, “Immune mechanisms to regulate multi-agents systems,” GECCO 2000, Las Vegas, Nevada, USA, July 2000.

[10] Z. Z. Shi, “Intelligent agent and its Application [M],” Science Press, Beijing, 2000.

[11] A Hofmeyr and S. Forrest, “Architecture for an artificial immune system,” Evolutionary Computation, Vol. 7, No. 1, 2000.

[12] N. K. Jerne, “Towards a network theory of the immune system,” Annual Immunology, Vol. 125, 1974.




A Low Power and High Speed Viterbi Decoder Based on Deep Pipelined, Clock Blocking and Hazards Filtering

C. ARUN1, V. RAJAMANI2
1Department of Information Technology, Sri Venkateswara College of Engineering, Chennai, Tamilnadu, INDIA
2Department of Electronics and Communication Engineering, PSNA College of Engineering and Technology, Dindigul, Tamilnadu, INDIA
Email: [email protected], [email protected]

Received May 18, 2009; revised July 14, 2009; accepted August 23, 2009

ABSTRACT

A high speed and low power Viterbi decoder architecture based on deep pipelining, clock gating and toggle filtering is presented in this paper. The Add-Compare-Select (ACS) and Trace Back (TB) units of the decoder and their sub-circuits are operated in a deeply pipelined manner to achieve a high transmission rate. The power dissipation is also analyzed and compared with existing results. The techniques employed in our low-power design are clock gating and toggle filtering. The synthesized circuits are placed and routed in the standard cell design environment and implemented on a Xilinx XC2VP2fg256-6 FPGA device. Power estimation obtained through gate level simulations indicates that the proposed design reduces the power dissipation of an original Viterbi decoder design by 68.82%, and a speed of 145 MHz is achieved.

Keywords: Viterbi Decoder, Convolutional Codes, High-Speed, Low Power Consumption, Parallel Processing, Deep Pipelining

1. Introduction

Overcoming the variable deterioration in the reliability of a communication channel in real time is a critical issue for many communication systems. From the channel coding point of view, this demands both high speed and low power decoding. Convolutional codes and the Viterbi algorithm are known to provide a strong forward error correction scheme, which has been widely used in noisy digital communication applications such as satellite and wireless communication. In the decoder, Viterbi decoding uses the maximum likelihood method [1]. The convolutional codes specified in the 3GPP WCDMA mobile communication system are (561, 753) and (557, 663, 711). They require high decoding speed and low power consumption because of the large constraint length [2].

A conventional Viterbi decoder contains three major units: 1) a Branch Metric Unit (BMU), which calculates the branch metrics; 2) an Add-Compare-Select Unit (ACSU), which recursively accumulates the branch metrics as the path metrics (PM), compares the incoming path metrics, selects the most likely state transition for each state of the trellis, and generates the corresponding decision bits; and 3) a Survivor Memory Unit (SMU), which stores the decision bits and helps to generate the decoded output. Among these three units, the ACSU and SMU consume most of the power of the decoder [3].

To meet the high throughput requirement of modern communication systems, a fully parallel and pipelined architecture is commonly used for implementing the Viterbi decoder [4]. Many ACS units run at a high clock frequency and hence consume a lot of power. At the same time, the SMU also consumes considerable power because of the large number of memory accesses. In some cases, it accounts for more than half of the total power consumption of the decoder [5]. There are two known methods for implementing the SMU, namely the Register Exchange (RE) method and the Trace Back (TB) method [6]. In general, RE has the advantages of high speed, low latency and simple control, but it consumes more power than the TB mechanism since it needs to move the data among the memories in every cycle. Therefore, the TB mechanism is commonly used for the implementation of the SMU.

C. ARUN ET AL.



Numerous methods have been studied to reduce the power consumption of the Viterbi decoder by exploiting different aspects of the system characteristics. The sliding block Viterbi decoder (VD) architecture was designed to achieve a speed of 1 Gbps [7]. Decoding in the sliding block decoder is performed simultaneously in the forward and backward directions. However, only a 4-state Viterbi decoder was implemented; for a practical 32- or 64-state Viterbi decoder, the complexity is extremely high because of the fully parallel ACS units and the large number of skew buffer registers required. Inside the Viterbi decoder, the feedback loop in the Add-Compare-Select (ACS) unit is the bottleneck on the decoding speed, and shortening the critical path in the ACS unit has been widely studied [7]. A retiming scheme for the most significant bit (MSB) first ACS unit was analyzed in detail to achieve the minimal critical path length [8]. A bidirectional Viterbi decoder that can meet the requirements of high speed and low power consumption has been discussed by Li and Yi [2]. It decodes in the forward and reverse directions simultaneously, so that the decoding delay is reduced to half that of the unidirectional decoder. However, since the power consumption increases greatly, it is not worth implementing bidirectional decoding by doubling the area and storage space.

The adaptive decoder dynamically discards some states (in the trellis) with high path metrics during the decoding process [9]. Seki et al. suggested the use of a scarce state transition (SST) scheme for multimedia mobile communication [10]. The scheme employs a simple pre-decoder followed by a pre-encoder to minimize signal transitions at the input of a conventional Viterbi decoder, which leads to lower dynamic power dissipation. Kang and Willson studied various issues in designing a low-power Viterbi decoder for the IS-95 CDMA system. Their decoder employs various low-power design schemes such as state partition, gated control and Gray coding [11]. A k-pointer algorithm was studied for the efficient implementation of the TB-based SMU design [6]. In this implementation, several memory READ operations are required to decode one bit, so the power consumption due to memory access is significant.

Limited search algorithms were used to reduce the average number of ACS computations and the path storage required by the Viterbi Algorithm (VA) [12]. The T-algorithm requires comparison operations to find the best path metric in each decoding stage, which limits its use in high-throughput applications. For low-throughput applications, the comparisons can be done over multiple cycles, and many T-algorithm designs target such applications [13,14]. The SPEC-T algorithm was implemented to solve the problem by relaxing the requirement of finishing the comparison from one cycle to v cycles, where v is the latency of the comparison operation. The current best path metric is estimated with errors and then corrected every v cycles. However, it still has a large power and area overhead for searching for the best path metric [15].

During the trace-back operation, the decision bits of all the states are read out from the memory at the same time, although only one bit is actually required. Thus, the power overhead is large. The power overhead of the trace-back operation was reduced by dividing the original memory, which had a word length equal to $2^{K-1}$, into two equal and smaller memories [16]. The power efficiency of the T-algorithm has been demonstrated by assuming that $2^{K-1}$ banks of 1-bit memories are used [5]. However, the hardware implementation was not described. In addition, the area of this memory configuration is large and the power overhead of the peripheral circuits can be high, which makes this memory configuration very costly to adopt.

To improve the speed and reduce the power consumption, we propose a new architecture in which the ACS and TB units and their associated circuits operate in a parallel and deep pipelined manner for a higher data rate. For a low-power design, we propose clock gating and toggle filtering for the survivor path and trace back units of the Viterbi decoder. The decoder is described at the behavioral level in a high-level hardware description language, and the behavioral design is synthesized to generate a gate-level design. ModelSim was used to test the behavioral design. We compare the gate utilization, speed and power dissipation of the different implementations and suggest a low-power, high-speed Viterbi decoder design.

The paper is organized as follows. Section 2 presents the algorithm and operation of the Viterbi decoder. The architecture of the proposed Viterbi decoder is described in Section 3. Section 4 describes the operation of the deep pipelined mechanisms. Section 5 presents the proposed low-power techniques. The FPGA implementation and performance are given in Section 6.

2. Viterbi Decoder Algorithm

In this section, both the algorithm and the operation behind the proposed eight-state Viterbi decoder depicted in Figure 1 are presented. Consider a communication system in which the convolutional encoder adds redundancy to the input signal S, and the encoded output symbols x are transmitted over a noisy channel.

The input to the Viterbi decoder, r, consists of the encoded symbols contaminated by noise. The decoder then tries to extract the original information from the received sequence and generates an estimate y. The algorithm that maximizes the conditional probability P(r|y) is called the maximum likelihood algorithm.

Figure 1. An 8-state trellis diagram for K=4.

The maximum-likelihood algorithm finds the most likely code sequence for the received channel output sequence. If the encoder output sequence is denoted by $x_m$ and the channel output sequence is denoted by $r$, the probability of receiving $r$ when the channel input is $x_m$ is

$$\Pr(r \mid x_m) = \prod_{n=0}^{\infty} \Pr(r_n \mid x_{mn}) \tag{1}$$

The most likely path through the trellis for the channel output $r$ is the one that maximizes this function. Thus, the function shown in Equation (1) is usually called the metric, and it is used to compare the code sequence with the received sequence [6]. Notice that the decoding metric in Equation (1) requires a product implementation, and therefore the metric $\ln[\Pr(r \mid x_m)]$ is more frequently applied in the decoder than the metric $\Pr(r \mid x_m)$. Moreover, finding the trellis path with the largest log-likelihood function corresponds to maximum likelihood decoding.

$$\ln[\Pr(r \mid x_m)] = \sum_{n=0}^{\infty} \ln[\Pr(r_n \mid x_{mn})] \tag{2}$$

where the components of the summation are accumulated on the individual branches and are therefore called branch metrics. In this work, hard-decision decoding is used, and the branch metric is determined using the Hamming distance. Thus, the optimal path through the trellis is the path with the minimum Hamming distance. The Hamming distance between the trellis code word $c$ and the received sequence $y$, each of length $n$, is

$$d(c, y) = \sum_{i=1}^{n} d(c_i, y_i) \tag{3}$$
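As a small illustration of Equation (3), the following Python sketch computes the Hamming distance between a code word and a received hard-decision sequence. The function name and the sample bit sequences are hypothetical and are not taken from the paper.

```python
def hamming_distance(c, y):
    """Return the number of positions in which bit sequences c and y differ."""
    if len(c) != len(y):
        raise ValueError("both sequences must have the same length n")
    # Each term d(c_i, y_i) is 1 when the two bits differ and 0 otherwise.
    return sum(ci != yi for ci, yi in zip(c, y))


# Hypothetical example: a two-bit branch label against a received symbol pair.
branch_label = [1, 0]
received = [1, 1]
print(hamming_distance(branch_label, received))
```

With hard decisions, this per-branch distance plays the role of the branch metric that is accumulated along each trellis path.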

3. Proposed VLSI Architecture

There are two types of architectures for implementing a Viterbi decoder: the register exchange method and the memory traceback method. The register-exchange algorithm offers minimal latency. Unfortunately, the capacitive load of the clock and the trellis-structured interconnect network dissipate considerable power, and their routing resources occupy significant area [12]. The register traceback method was proposed to reduce the interconnect wiring and power of the register exchange method, but this architecture still consumes a lot of logic resources, and its critical paths are too long to improve the speed [1]. In this paper we present two schemes to reduce power dissipation in the traceback approach. The proposed power-reducing architecture of the Viterbi decoder is presented in Figure 2.

3.1. Branch Metric (BM) Module

The BM module generates the branch metrics for the ACS module from the received channel symbols; its architecture is straightforward. The only point to mention is that, for the constraint length to be reconfigurable, a set of RAMs is employed to provide branch information for the branch metric computation according to the selected constraint length. Because the BM module contains no feedback loop, a pipeline scheme can easily be applied to achieve sufficiently high speed for the branch metric calculations.


3.2. ACS Module

For each state in the trellis diagram, the current path metric is obtained from the current branch metrics and the path metrics of the previous states that lead to the current state, by executing addition, comparison and selection operations. We modified the regular ACS structure into a deep pipelined structure to speed up this module.

3.3. Traceback Module

The TB module is a bank of registers that record the survivor path of each state selected by the ACS module. A register is assigned to each state, and the length of a register is equal to the frame length (which is 24 for our decoder). The corrected output sequence is produced by tracing the decision vectors. The traceback module is used to decide the final output.
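The add, compare and select operations of Section 3.2 and the traceback of Section 3.3 can be sketched in software as follows. This is only a behavioral illustration, not the hardware architecture of the paper: the rate-1/2 generator polynomials (15, 17 in octal) are an assumption, since the paper does not state the generators of its K = 4 code, and all function names are hypothetical.

```python
K = 4                      # constraint length of the 8-state decoder
N_STATES = 1 << (K - 1)    # 2^(K-1) = 8 trellis states
G = (0o15, 0o17)           # ASSUMED rate-1/2 generators; not given in the paper


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def step(state, bit):
    """Return (next_state, output_symbol) for one encoder transition."""
    reg = (bit << (K - 1)) | state          # newest bit enters at the MSB
    return reg >> 1, [parity(reg & g) for g in G]


def encode(bits):
    """Convolutionally encode a bit list, starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        state, symbol = step(state, b)
        out.extend(symbol)
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding with a full-frame traceback."""
    n = len(G)
    inf = float("inf")
    pm = [0] + [inf] * (N_STATES - 1)       # path metrics; start in state 0
    decisions = []                          # survivor (prev_state, bit) per stage
    for t in range(0, len(received), n):
        r = received[t:t + n]
        new_pm = [inf] * N_STATES
        survivor = [None] * N_STATES
        for s in range(N_STATES):
            if pm[s] == inf:
                continue                    # state not reachable yet
            for b in (0, 1):
                nxt, symbol = step(s, b)
                bm = sum(x != y for x, y in zip(symbol, r))  # branch metric
                candidate = pm[s] + bm      # add
                if candidate < new_pm[nxt]:  # compare
                    new_pm[nxt] = candidate  # select
                    survivor[nxt] = (s, b)
        pm = new_pm
        decisions.append(survivor)
    # Traceback: start from the best final state and follow the survivors.
    state = min(range(N_STATES), key=lambda s: pm[s])
    decoded = []
    for survivor in reversed(decisions):
        state, b = survivor[state]
        decoded.append(b)
    return decoded[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0,
           0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0]   # one 24-bit frame
rx = encode(message)
rx[5] ^= 1                                        # inject a single channel error
print(viterbi_decode(rx) == message)
```

The inner loop mirrors the ACS recursion, while the list of per-stage survivors stands in for the bank of registers that the TB module traces at the end of the frame.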

4. Proposed Pipelined and Deep Pipelined Structure

The speed of the decoder can be improved by applying a pipelining approach to the ACS and TBU, as shown in Figure 3. Pipelining is performed between the BMU, ACS and TBU, so that multiple instructions overlap in execution. Pipelining does not decrease the execution time of an individual instruction. Instead, it increases the instruction throughput, which is determined by how often an instruction exits the pipeline.
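The distinction between latency and throughput can be made concrete with a short calculation. The sketch below assumes a three-stage BMU, ACS and TBU pipeline with purely hypothetical stage delays; none of the numbers are measurements from the paper.

```python
def pipeline_timing(stage_delays_ns, pipelined):
    """Return (clock period ns, latency ns, max clock MHz) for a chain of stages."""
    delays = list(stage_delays_ns)
    # Without pipeline registers the whole chain is one long combinational path;
    # with them, the slowest single stage sets the clock period.
    period = max(delays) if pipelined else sum(delays)
    latency = period * len(delays) if pipelined else period
    return period, latency, 1e3 / period


# Hypothetical combinational delays (ns) for the BMU, ACS and TBU stages.
delays = [2.0, 6.0, 4.0]
for pipelined in (False, True):
    period, latency, f_max = pipeline_timing(delays, pipelined)
    print(pipelined, period, latency, round(f_max, 1))
```

Under these assumed delays the pipelined version has a longer latency per symbol but a shorter clock period, so results leave the pipeline more often.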

To obtain a high-speed implementation, the maximum inherent parallelism needs to be extracted. Generally, a high throughput rate is achieved if the circuit has a very short critical path. The critical path of a synchronous circuit is the path between two buffers that has the largest delay, and it therefore determines the maximum achievable clock frequency of the circuit. The pipeline register block consists of a set of positive-edge-triggered D-type flip-flops. The total number of flip-flops required is equal to the number of states multiplied by the number of state metric bits, i.e. 8 × 6 = 48.

Figure 2. Architecture of Viterbi decoder.

Figure 3. Parallel and pipelined execution of Viterbi decoder.

Figure 4(a). Deep pipelined execution of ACS (Add-Compare-Select) unit.

Figure 4(b). Deep pipelined execution of Trace Back (TB) unit.

Figure 5. Shift update versus selective update: (a) shift update; (b) selective update.

The throughput is increased further when the deep pipelining method is used. In this method, pipelining is applied within the ACS and TB units up to the trace back depth. Compared with other methods, deep pipelining shows a large reduction in memory usage, and the throughput of the chip also increases considerably. In a non-pipelined system the silicon area required is high, which results in high power consumption. This is considerably reduced in the deep pipelining method, whose structure is shown in Figure 4.

5. Proposed Low Power Design

The two basic approaches used to record survivor paths are register exchange and traceback. Both methods cause substantial switching activity and hence are inefficient in power dissipation. In this section, we present two schemes to reduce power dissipation in the traceback approach.

5.1. Survivor Path Storage Block and Clock Gating

In the traceback approach, one flip-flop is necessary to record the survivor branch for each state per stage. The shift update scheme forms a shift register for each state by collecting the flip-flops in the horizontal direction, as shown in Figure 5(a). The survivor branch information is filled into the least significant bits of the registers. We propose forming the registers vertically, as shown in Figure 5(b). The survivor branch information is filled into the registers from left to right as time progresses. This scheme is called selective update.

The key difference between the two schemes is that the content of a register in the selective update method does not change once it is updated. Hence, the register incurs less switching activity, which reduces power dissipation. Moreover, this property makes it possible to apply a clock-gating scheme to reduce power dissipation further. The clock of each register is enabled only when the register updates its survivor path information. Figure 6 shows the proposed survivor path storage block for the selective update method with clock gating. In the figure, register Ri holds the survivor path information of all the states at stage i, where i runs from 1 to 24 for our decoders. The five-bit counter keeps track of the current stage, which is equal to the number of code symbols received so far for the frame. When the ith code symbol is received, the clock of register Ri is enabled, and the survivor path information of all the states at stage i is recorded in the register. Note that all the other registers hold their state, since their clocks are disabled. Therefore, the proposed survivor path storage reduces switching activity to a minimal level to save power. Finally, it is possible to replace the five-bit counter and the de-multiplexer with a ring counter.

Figure 6. Proposed survivor path storage block for the selective update with clock gating.
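The effect of the selective update with clock gating can be illustrated by counting how many flip-flops receive an active clock edge over one frame. The sketch below is a simplified activity model under stated assumptions (8 states, 24 stages, ungated shift registers in the shift update scheme); it is not a power estimate, and the function name is hypothetical.

```python
N_STATES = 8     # states of the 8-state (K = 4) decoder
FRAME_LEN = 24   # stages per frame, as stated for the decoder


def clocked_bits_per_frame(scheme):
    """Count flip-flops that receive an active clock edge during one frame."""
    total = 0
    for stage in range(FRAME_LEN):
        if scheme == "shift":
            # One FRAME_LEN-bit shift register per state, all shifting each stage.
            total += N_STATES * FRAME_LEN
        elif scheme == "selective":
            # One-hot enable: only register R_i (one bit per state) is clocked.
            enables = [i == stage for i in range(FRAME_LEN)]
            total += N_STATES * sum(enables)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return total


shift = clocked_bits_per_frame("shift")
selective = clocked_bits_per_frame("selective")
print(shift, selective, shift // selective)
```

In this model the one-hot `enables` list plays the role of the counter and de-multiplexer (or ring counter) that selects which register is clocked at each stage.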

5.2. Toggle Filtering of the Output Generator Block

The output generator block in the traceback approach traces back the survivor path after all the symbols have been received and generates the decoded output sequence. The block is a combinational circuit, which needs to be active during only one clock cycle.

A block diagram of the output generator block is shown in Figure 7. Ignoring for the time being the AND gates with an enable input, the block receives inputs from the survivor storage block containing the survivor path information. The block traces the survivor path at the end of the frame and generates the decoded output sequence. The decoded output sequence is loaded into a register at the first clock.

Since the registers update the survivor path information progressively throughout the entire frame, the output generator receives spurious inputs, which cause unnecessary switching activity and dissipate power. The proposed design blocks the spurious inputs applied to the block. The array of AND gates and the enable signal shown in Figure 7 are introduced for this purpose. The enable signal is activated during the one clock period at the end of the frame, as shown in Figure 7.
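The idea behind the AND-gate array can be sketched by counting input transitions seen by the output generator with and without the enable signal. The bit vectors below are hypothetical stand-ins for survivor path data, and the helper names are illustrative only.

```python
def count_toggles(vectors):
    """Count bit transitions between consecutive input vectors."""
    toggles = 0
    for prev, cur in zip(vectors, vectors[1:]):
        toggles += sum(p != c for p, c in zip(prev, cur))
    return toggles


def and_gate(vectors, enables):
    """Mask every vector with its per-cycle enable bit (bitwise AND)."""
    return [[bit & en for bit in vec] for vec, en in zip(vectors, enables)]


# Hypothetical survivor-storage outputs over six clock cycles of a frame.
raw = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0],
       [0, 1, 1, 0], [0, 1, 1, 1], [1, 1, 0, 1]]
enable = [0, 0, 0, 0, 0, 1]          # active only in the last cycle of the frame

print(count_toggles(raw), count_toggles(and_gate(raw, enable)))
```

With the enable held low, the intermediate updates never reach the combinational block, so only the final vector causes switching at its inputs.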

6. Implementation and Performance Results

We measured the speed, power and area of three different implementations of the Viterbi decoder: the proposed deep pipelined design and the traceback approach with the clock gating and toggle filtering schemes.

We implemented the three Viterbi decoders in the standard cell environment. First, we described the three Viterbi decoders at the register transfer level in VHDL. The designs were then synthesized, placed and routed using the Xilinx Project Navigator tool. The processing technology used in our experiments was six-metal-layer 0.25 μm CMOS with a supply voltage of 1.8 V.

Power dissipation was estimated for the synthesized gate-level circuits using the Xilinx Project Navigator tools. For the power estimation, the maximum number of random patterns was simulated at a clock frequency of 10 MHz for each Viterbi decoder, and the switching activity of each node was recorded. The power dissipation of the circuit was then estimated using the formula $P = \alpha C V^2 f$, where α is the switching activity, C is the parasitic capacitance (0.032 pF), V is the supply voltage (1.8 V), and f is the clock frequency, which was set to 10 MHz [6]. The static power dissipation of the cells was not considered because of the limitations of the library cells used in our experiments.
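As a worked instance of the dynamic power formula, the sketch below evaluates it for a single node using the capacitance, supply voltage and clock frequency quoted above. The switching activity of 0.5 is a hypothetical value chosen for illustration; the paper does not report per-node activities.

```python
def dynamic_power_watts(alpha, c_farads, v_volts, f_hertz):
    """Dynamic power P = alpha * C * V^2 * f of one switching node."""
    return alpha * c_farads * v_volts ** 2 * f_hertz


C_NODE = 0.032e-12   # parasitic capacitance, 0.032 pF
V_DD = 1.8           # supply voltage in volts
F_CLK = 10e6         # clock frequency, 10 MHz
ALPHA = 0.5          # ASSUMED switching activity, for illustration only

p_node = dynamic_power_watts(ALPHA, C_NODE, V_DD, F_CLK)
print(f"{p_node * 1e6:.4f} uW per node")
```

Summing this quantity over all nodes, each with its recorded switching activity, gives the circuit-level estimate described in the text.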

Experimental results for the three Viterbi decoders are shown in Table 1. Among the designs, the conventional traceback approach has the largest area and the proposed low-power design the smallest. The area of the proposed deep pipelined low-power design is 69% less than that of the conventional traceback approach and 36.5% less than that of the shift and selective update low-power approach. We observed that the large area of the conventional traceback design is due to the complexity of the survivor path storage block. Table 1 lists the gate count, power dissipation and speed of the proposed Viterbi decoders (Xilinx Virtex-II Pro).
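The relative savings can be recomputed directly from the gate counts and power figures reported in Table 1. The sketch below does this; the dictionary keys are shorthand labels introduced here, and the helper name is hypothetical.

```python
def reduction_percent(baseline, proposed):
    """Percentage by which `proposed` is smaller than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# Values as reported in Table 1 (gate count, power in mW).
gates = {"conventional": 20168, "shift": 9950, "selective": 9297, "deep": 6289}
power_mw = {"conventional": 20.91, "shift": 10.31, "selective": 9.63, "deep": 6.52}

for name in ("conventional", "shift", "selective"):
    area_saving = reduction_percent(gates[name], gates["deep"])
    power_saving = reduction_percent(power_mw[name], power_mw["deep"])
    print(f"deep pipelined vs {name}: area {area_saving:.2f}%, power {power_saving:.2f}%")
```

The power saving relative to the conventional decoder computed this way agrees with the 68.82% figure quoted in the paper.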

The power dissipation of the three decoders is small due to the small size of the decoders and is in the range of 0.5 mW to 2 mW. The conventional traceback approach dissipates the largest amount of power, while the proposed low-power design dissipates the least. The proposed method reduces the power dissipation by 68.82% compared with the conventional traceback approach and by 36.7% compared with the shift and selective update approach. We observed that the deep pipelined selective update scheme with clock gating and toggle filtering is the most effective method for saving power. Details of the experiments, including the area of each block and the power dissipation of the different decoders, are shown in Tables 1 and 2. Figure 8 shows the power dissipation of the four different implementations of the Viterbi decoder. The first bar in Figure 8 is the power dissipation of the original Viterbi decoder without low-power techniques. When both clock gating and toggle filtering are applied to a selective update deep pipelined Viterbi decoder in the traceback approach, it saves about 68.82% of the power compared with the conventional Viterbi decoder.

Figure 7. Proposed toggle filtering of the trace back block diagram.

Table 1. Area, power dissipation and speed comparison of proposed methods.

| Type | No. of gates utilized | Power dissipation (mW) | Speed of operation (MHz) |
|---|---|---|---|
| Conventional Viterbi decoder | 20168 | 20.91 | 93.102 |
| Proposed shift update with clock gating and toggle filtering | 9950 | 10.31 | 27.174 |
| Proposed selective update with clock gating and toggle filtering | 9297 | 9.63 | 33.20 |
| Proposed deep pipelining with clock gating, toggle filtering and selective update techniques | 6289 | 6.52 | 145.12 |

Table 2. Device utilization summary of three different decoders.

| Type | Conventional Viterbi decoder | Proposed shift update with clock gating and toggle filtering | Proposed selective update with clock gating and toggle filtering | Proposed deep pipelining with clock gating, toggle filtering and selective update |
|---|---|---|---|---|
| No. of slices | 1174 | 519 | 494 | 486 |
| No. of slice FFs | 961 | 400 | 360 | 243 |
| No. of 4-input LUTs | 1592 | 933 | 882 | 436 |
| No. of bonded IOBs | 9 | 9 | 9 | 9 |
| No. of GCLKs | 1 | 1 | 1 | 1 |

Figure 8. Power dissipation chart of four different implementations of Viterbi decoder.

7. Conclusions

Viterbi decoders employed in digital wireless communications are complex and dissipate high power. In this paper, we have investigated the power dissipation of three different implementations of Viterbi decoders: the shift update scheme, the selective update scheme with clock gating and toggle filtering, and the deep pipelined scheme with all the low-power techniques. We have proposed a low-power Viterbi decoder design based on the traceback approach. The schemes employed for our low-power design are clock gating of the survivor path storage block and toggle filtering of the output generation block. We have implemented the three designs in the standard cell design environment and measured the performance in terms of area and power dissipation. Among the three implementations, it is observed that the proposed low-power design takes the


smallest area and dissipates the least power. The proposed design reduces the power dissipation of the register-exchange approach by 68.82%. Finally, it is difficult to make a head-to-head comparison of power efficiency between the proposed method and other existing methods because of the different environments (such as hard decision versus soft decision) and constraints imposed. Some of our techniques can also be applied to other low-power designs to save power.

8. References

[1] H. Yang and X. Lang, “Design and implementation of high speed and area efficient Viterbi decoder,” IEEE, Proceedings of the 8th International Conference on Solid-State and Integrated Circuit Technology, ICSICT’06, pp. 2108–2110, 2006.

[2] S. Li and Q. M. Yi, “The design of high speed and low power consumption bidirectional Viterbi decoder,” IEEE, Proceedings of the 5th International Conference on Machine Learning and Cybernetics, Dalian, pp. 13–16, August 2006.

[3] J. Jin and C. Y. Tsui, “Low-power limited-search parallel state Viterbi decoder implementation based on scarce state transition,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 15, No. 10, October 2007.

[4] C. C. Lin, Y. H. Shih, H. C. Chang, and C. Y. Lee, “Design of a power-reduction Viterbi decoder for WLAN applications,” IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 52, No. 6, pp. 1148–1156, June 2005.

[5] R. Henning and C. Chakrabarti, “An approach for adaptively approximating the Viterbi algorithm to reduce power consumption while decoding convolutional codes,” IEEE Transactions on Signal Processing, Vol. 52, No. 5, pp. 1443–1451, May 2004.

[6] G. Feygin and P. Gulak, “Architectural tradeoffs for survivor sequence memory management in Viterbi decoders,” IEEE Transactions on Communications, Vol. 41, No. 3, pp. 425–429, March 1993.

[7] P. J. Black and T. H. Y. Meng, “A 1-Gb/s, four-state, sliding block Viterbi decoder,” IEEE Journal of Solid-State Circuits, Vol. 32, No. 6, pp. 797–805, June 1997.

[8] K. K. Parhi, “An improved pipelined MSB-first add compare select unit structure for Viterbi decoders,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, Vol. 51, No. 3, pp. 504–511, March 2004.

[9] M. H. Chan, W. T. Lee, M. C. Lin, and L. G. Chen, “IC design of an adaptive Viterbi decoder,” IEEE Transactions on Consumer Electronics, Vol. 42, pp. 52–61, February 1996.

[10] K. Seki, S. Kubota, M. Mizoguchi, and S. Kato, “Very low power consumption Viterbi decoder LSIC employing the SST (Scarce State Transition) scheme for multimedia mobile communications,” Electronics Letters, IEE, Vol. 30, No. 8, pp. 637–639, April 1994.

[11] Kang and A. N. Willson, “Low-power Viterbi decoder for CDMA mobile terminals,” IEEE Journal of Solid-State Circuits, Vol. 33, No. 3, pp. 473–482, March 1998.

[12] R. Henning and C. Chakrabarti, “An approach for adaptively approximating the Viterbi algorithm to reduce power consumption while decoding convolutional codes,” IEEE Transactions on Signal Processing, Vol. 52, No. 5, pp. 1443–1451, May 2004.

[13] R. Tessier, S. Swaminathan, R. Ramaswamy, D. Goeckel, and W. Burleson, “A reconfigurable, power-efficient adaptive Viterbi decoder,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 13, No. 4, pp. 484–488, April 2005.

[14] M. Guo, M. O. Ahmad, M. N. S. Swamy, and C. Wang, “FPGA design and implementation of a low-power systolic array-based adaptive Viterbi decoder,” IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 52, No. 2, pp. 350–365, February 2005.

[15] F. Sun and T. Zhang, “Parallel high-throughput limited search trellis decoder VLSI design,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 13, No. 9, pp. 1013–1022, September 2005.

[16] Y. N. Chang, H. Suzuki, and K. K. Parhi, “A 2-Mb/s 256-state 10-mW rate-1/3 Viterbi decoder,” IEEE Journal of Solid-State Circuits, Vol. 35, No. 6, pp. 826–834, June 2000.

