
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-23, NO. 2, MARCH 1977

Stationary Symbol Sequences from Variable-Length Word Sequences


GIANFRANCO L. CARIOLARO, MEMBER, IEEE, AND GIANFRANCO L. PIEROBON

Abstract-Variable-length word sequences (VLWS) and symbol sequences obtained from VLWS are discussed. A VLWS is regarded as a vector-valued stochastic process in which the sequence of vector dimensions is itself a stochastic process. For a given VLWS, the symbol sequence is not uniquely defined because it depends on how the choice of the time origin is made. In particular, a deterministic choice leads to a nonstationary symbol sequence even if the originating VLWS is stationary. A suitable random choice must be made in order to get a stationary symbol sequence from a stationary VLWS. Attention is given to the spectral analysis of both the VLWS and the corresponding stationary symbol sequence. An application to a VLWS consisting of mutually independent words is given.

I. INTRODUCTION

IN MOST DIGITAL transmission and recording systems, data are encoded before modulation into a sequence with special properties. The choice of encoding is dominated by requirements of spectral shaping, self-timing, intersymbol interference, and error monitoring. In recent years, several binary and multilevel codes have been proposed. Most of these are fixed-length codes, which include symbol-by-symbol codes as a particular case. However, variable-length codes, which combine the advantages of short and long word lengths, have also received considerable attention [1], [2]. As examples, we mention the HDBn and CHDBn codes [3], the BnZS codes [4], and the VL43 code [5]. The spectral analysis and, more generally, the statistical characterization of the symbol sequences resulting from the use of these codes play a fundamental role in the study of digital transmission systems.

To illustrate our discussion, we consider the block diagram of Fig. 1, where for convenience the encoding operation is split into three parts: framing, word encoding, and deframing. The digital source emits symbols at a fixed rate producing a stationary sequence of symbols; symbols are then grouped (framed) into blocks of fixed or variable length to form a sequence of words (source words), which is then encoded into another sequence of words (codewords). Thereafter, the digits of each codeword are sent sequentially, and a new sequence of symbols is formed; this operation may be considered the inverse of framing (deframing). Finally, each symbol of the deframed sequence is transmitted by a pulse-amplitude modulation (PAM) signal.

Manuscript received June 9, 1975; revised June 10, 1976. This work was supported by the Italian Research Council (CNR) under Contract CT 75.01017.07.

The authors are with the Institute of Electrical Engineering of Padova University, 35100 Padova, Italy. G. L. Cariolaro is also a Consultant of the Digital Transmission and Wire Communication Division of Telettra, Vimercate, Italy.

From a statistical viewpoint, the following stationary stochastic processes are implied in our discussion: the source-symbol and the coded-symbol sequences, which are to be regarded as discrete-parameter discrete-valued stochastic processes; the source word and the codeword sequences, which are discrete-parameter vector processes; and the PAM signal, which is a continuous-parameter stochastic process.

In some cases, as, for example, in error probability evaluation in the presence of intersymbol interference [6], the complete statistical description of these stochastic processes must be investigated. However, for many purposes, an analysis of correlations and spectral densities is adequate. This is particularly true for the design of baseband data-transmission systems. Cariolaro and Tronca [7] have performed a systematic spectral analysis of the aforementioned processes in the case of fixed-length encoding. Our aim in this paper is the extension to variable-length encodings. The extension is by no means straightforward because it involves vector processes whose dimensions are themselves stochastic processes. Another nontrivial problem is the deframing, i.e., the passage from a stationary variable-length word sequence to a stationary symbol sequence.

This paper is organized as follows. Section II deals with the characterization of stationary variable-length word sequences (VLWS), particular attention being devoted to their spectral analysis. Section III deals with the symbol sequences obtained by deframing VLWS, the main problem being the choice of the time origin for the symbol sequence. It is noted that a deterministic choice leads to a nonstationary sequence so that an appropriate random choice has to be made to get a stationary sequence. In Section IV, the spectral distribution of random origin symbol sequences (ROSS) is related to the spectral distribution of the originating VLWS. In the well-behaved case, the ROSS spectrum consists of an absolutely continuous component and of lines regularly spaced by (ΔT)⁻¹, where 1/T is the ROSS symbol rate and Δ the greatest common divisor of the word lengths in the VLWS. In Section V, the theory is illustrated by an application to a VLWS with mutually independent words.

II. VARIABLE-LENGTH WORD SEQUENCES

We introduce the following notation. Let 𝓑 be a finite set of words having lengths in a set of positive integers L = {λ_1, λ_2, ⋯ , λ_N}, where 1 ≤ λ_1 < λ_2 < ⋯ < λ_N. Let 𝓑_λ =


Fig. 1. Model for variable-length encoding and modulation. The "framing" produces a variable-length word sequence (VLWS) from the source symbol sequence. The "deframing" converts the encoded VLWS into a new symbol sequence.

Fig. 2. Illustration of distance in words (a) and distance in digits (b) in a given VLWS. In the realization shown, the distance in words between B_{−1} and B_2 is 2 − (−1) = 3, and the distance in digits is τ_{−1,2} = 8.

{β_1λ, β_2λ, ⋯ , β_{N_λ}λ} be the set of those words that have length λ; these words are regarded as 1 × λ vectors, namely β_rλ = [β¹_rλ, ⋯ , β^λ_rλ], 1 ≤ r ≤ N_λ, and are arranged to form the N_λ × λ matrix¹

B_λ = ||β^p_rλ||,  r = 1, ⋯ , N_λ;  p = 1, ⋯ , λ;  λ ∈ L.  (1)

Throughout this paper, the integers r, s, λ, and μ will have the ranges r = 1, 2, ⋯ , N_λ and s = 1, 2, ⋯ , N_μ, where λ ∈ L and μ ∈ L.

For illustrating our notation, we consider the following example in which the alphabet is ternary and where + (−) stands for +1 (−1), respectively:

N = 3,  L = {2, 4, 6};
N_2 = 3, with β_22 = [− +] among the words of length 2;
N_4 = 2,  β_14 = [0 + − +],  β_24 = [0 − + −];
N_6 = 4,  β_16 = [0 0 + + − +],  β_26 = [0 0 + + + −],  β_36 = [0 0 − − + −],  β_46 = [0 0 − − − +].

In this case, the matrices (1) include

B_4 = | 0 + − + |        B_6 = | 0 0 + + − + |
      | 0 − + − |              | 0 0 + + + − |
                               | 0 0 − − + − |
                               | 0 0 − − − + |

An infinite sequence of variable-length words {B_m, m ∈ J}, where J is the set of integers and B_m ∈ 𝓑, may be regarded as a discrete-time vector-valued stochastic process. The novelty of this process lies in the particular structure of the range 𝓑, as a consequence of which some operations, such as unconditional expectation, become meaningless (see below).

The {B_m} process is characterized by the following (mass) probability distributions

Pr {B_m1 = β_{r1 λ1} ∩ ⋯ ∩ B_mK = β_{rK λK}},   m_i ∈ J,  1 ≤ r_i ≤ N_λi,  λ_i ∈ L,  K = 1, 2, ⋯ .  (2)

We assume that {B_m} is strict-sense stationary, i.e., that distributions (2) are invariant upon an arbitrary m_i translation.

¹ Matrices will be set in boldface to distinguish them from scalars. Let A be an arbitrary matrix. Then A′ and A* denote the transpose and the complex conjugate transpose of A. A diagonal matrix with diagonal elements d_1, ⋯ , d_n is written as A = diag [d_1, ⋯ , d_n].
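The word-set notation above can be made concrete in a few lines. The following is a minimal sketch (not part of the paper) that stores the length-4 and length-6 words of the example as rows of the matrices B_4 and B_6 from (1), encoding the ternary symbols +, −, 0 as +1, −1, 0, and checks that each B_λ has dimension N_λ × λ:

```python
# Sketch: the example word sets, with beta_{r,lambda} as rows of B_lambda.
# +1/-1 encode the "+"/"-" symbols of the ternary alphabet.

B4 = [  # N_4 = 2 words of length 4
    [0, +1, -1, +1],   # beta_14
    [0, -1, +1, -1],   # beta_24
]
B6 = [  # N_6 = 4 words of length 6
    [0, 0, +1, +1, -1, +1],  # beta_16
    [0, 0, +1, +1, +1, -1],  # beta_26
    [0, 0, -1, -1, +1, -1],  # beta_36
    [0, 0, -1, -1, -1, +1],  # beta_46
]

# Each matrix B_lambda is N_lambda x lambda, as in (1).
for lam, B in [(4, B4), (6, B6)]:
    assert all(len(row) == lam for row in B)
```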


Let L_m denote the length of B_m. Then the {B_m} process implies the definition of another process {L_m, m ∈ J}, the sequence of the word lengths. The statistical description of {L_m} follows from that of the {B_m} process through the identity between events

{L_m = λ} = {B_m ∈ 𝓑_λ} = ∪_{r=1}^{N_λ} {B_m = β_rλ},  λ ∈ L.  (3)

A. VLWS Distribution for Deframing

It is necessary to make a distinction between the time units marked by the words and the time units marked by the digits of the words in the given VLWS (Fig. 2). We say that the ordered word-pair (B_m, B_n) is at h words' distance whenever n − m = h, and that it is at k digits' distance whenever τ_mn = k, where

τ_mn = Σ_{t=m}^{n−1} L_t,  n > m;
     = 0,                  n = m;                 (4)
     = −τ_nm,              n < m.

We consider the following probability distributions. Word absolute probabilities,

P_λ(r) ≜ Pr {B_m = β_rλ}.  (5)

Joint probabilities between words at h words' distance,

P^rs_λμ(h) ≜ Pr {B_m = β_rλ ∩ B_{m+h} = β_sμ}.  (6)

Joint probabilities between words at k digits' distance,

P̃^rs_λμ(k) ≜ Pr {B_m = β_rλ ∩ [∪_{n=−∞}^{∞} (B_n = β_sμ ∩ τ_mn = k)]}.  (7)

The last definition requires some comments. The event within square brackets means that a word B_n of the VLWS should be equal to β_sμ (which implies that the B_n length is μ) and, jointly, that B_n should be at k digits' distance from B_m. Because the assigned distance in digits k does not determine uniquely a corresponding distance in words n − m, the index n of B_n can have many values in general. For instance, reconsidering the previous example in which L = {2,4,6}, we find the word at k = 6 digits' distance from B_m may be B_{m+1} (if L_m = 6), B_{m+2} (if L_m = 2 and L_{m+1} = 4, or L_m = 4 and L_{m+1} = 2), and B_{m+3} (if L_m = L_{m+1} = L_{m+2} = 2). On the other hand, there is no B_n word at k = 7 digits' distance from B_m, whereas B_{m+2} or B_{m+3} or B_{m+4} may be at k = 8 digits' distance, etc. Even for a general k, the permitted choices are finitely many, but it becomes difficult to identify them explicitly. Hence, we find it convenient to use an infinite union in (7), thus including both permitted and unpermitted choices. Of course, the latter have zero probability.

Probabilities (5) to (7) are conveniently arranged in matrix form. Let P_λ ≜ [P_λ(1), ⋯ , P_λ(N_λ)], and let P_λμ(h) and P̃_λμ(k) be the N_λ × N_μ matrices whose r,s entries are given by (6) and (7), respectively.

The following probability distributions concern the length process. Length absolute probabilities,

p_λ ≜ Pr {L_m = λ} = Σ_r P_λ(r),  λ ∈ L.  (8)

Cumulative length probabilities,

γ(k) ≜ Pr {∪_{h=0}^{∞} (τ_0h = k)},  k ≥ 0;
     ≜ 0,                            k < 0.  (9)

Remark 1: Whenever cumulative lengths τ_mn are concerned, a fundamental role is played by the greatest common divisor (g.c.d.), say Δ, in the set of significant lengths, i.e., the subset L′ ⊆ L composed of the lengths having positive probabilities. The quantities P̃^rs_λμ(k) and γ(k) vanish for any k that is not an integer multiple of Δ.

B. Correlation Matrices

Because of the variability of the word lengths in a VLWS, some attention must be paid in applying such a usual operation as expectation. For instance, the unconditional expectation E{B_m} is meaningless because the dimension of B_m changes in the domain of integration. On the other hand, the conditional expectation E{B_m | L_m = λ} is a well-defined 1 × λ vector whenever Pr {L_m = λ} > 0.

We consider the problem in general. Let (Ω, 𝓕, P) be the basic probability space, and let A(ω), ω ∈ Ω, be a matrix function whose dimension (m,n) may change in Ω; i.e., (m,n) is itself an ω-function. Let 𝓔 be an Ω subset wherein (m,n) takes on the fixed value (λ,μ). Then, we define

E{A; 𝓔} ≜ ∫_𝓔 A dP,  (10)

where the result of the integration is a λ × μ matrix.

For instance, by setting A ≜ B_m and 𝓔 ≜ {L_m = λ}, we get

m_λ ≜ E{B_m; L_m = λ} = Σ_r β_rλ P_λ(r) = P_λ B_λ,  (11)

where P_λ is the word absolute probability vector. The quantity m_λ is a vector of dimension λ and represents the mean value of the words of length λ in {B_m}, weighted by the corresponding probability p_λ ≜ Pr {L_m = λ}.

² In spectral analysis, in place of correlations, covariances are usually considered, which is equivalent to considering, in place of the given process, the deviation of the process from its expected value. As noted by Doob [8, p. 95], this is unnatural mathematically and has nothing to do with the essential properties of interest of stationary processes. We add that this is particularly true in dealing with VLWS because of the ambiguity in the definition of the expected value.

By means of (10) we now define the following correlation² matrices of the {B_m} process, which are obtainable from the previous probability distributions.
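The ambiguity discussed above, namely which word can sit at k digits' distance, can be enumerated directly. The following is a small sketch (not from the paper) assuming the length set L = {2, 4, 6} of the running example: it lists, over all length assignments of a few consecutive words, the word distances n − m that can realize a given digit distance k, confirming that k = 6 can be realized by n − m = 1, 2, 3, that k = 7 is never realized, and that every realizable k is a multiple of Δ = g.c.d. of L = 2 (Remark 1):

```python
from itertools import product
from math import gcd

L = (2, 4, 6)                      # length set of the running example
DELTA = gcd(*L)                    # g.c.d. of the word lengths: 2

def reachable(k, depth=4):
    """Word distances n - m that can realize digit distance tau_mn = k."""
    hits = set()
    for lengths in product(L, repeat=depth):   # (L_m, ..., L_{m+depth-1})
        tau = 0
        for i, l in enumerate(lengths):        # tau_{m,m+i+1} = partial sum
            tau += l
            if tau == k:
                hits.add(i + 1)
    return hits

assert reachable(6) == {1, 2, 3}   # B_{m+1}, B_{m+2}, or B_{m+3}
assert reachable(7) == set()       # no word at 7 digits' distance
assert reachable(8) == {2, 3, 4}   # B_{m+2}, B_{m+3}, or B_{m+4}
# every realizable distance is a multiple of DELTA (Remark 1)
assert all(k % DELTA == 0 for k in range(1, 13) if reachable(k))
```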


Correlation matrices between words at h words' distance,

R_λμ(h) ≜ E{B′_m B_{m+h}; L_m = λ ∩ L_{m+h} = μ}.  (12)

These matrices are obtainable from probabilities (6).

Correlation matrices between words at k digits' distance. The word, supposed of length μ, that is at k digits' distance from a given word B_m, supposed of length λ, may be defined as follows. Let 𝓖_λμ(k) be the set

𝓖_λμ(k) ≜ ∪_{n=−∞}^{∞} {L_m = λ ∩ L_n = μ ∩ τ_mn = k}.  (13)

If Pr {𝓖_λμ(k)} > 0, for a given ω ∈ 𝓖_λμ(k), one, and only one, integer n exists such that τ_mn(ω) = k. Then, we define the ω-function

B_m(k)(ω) ≜ B_n(ω),  for τ_mn(ω) = k,  ω ∈ 𝓖_λμ(k),  (14)

which represents the word of length μ that is at k digits' distance from the word B_m of length λ. Finally, the correlation matrix between words at k digits' distance is defined as

R̃_λμ(k) ≜ E{B′_m B_m(k); 𝓖_λμ(k)}.  (15)

Theorem 1: The correlation matrices R̃_λμ(k) are given by

R̃_λμ(k) = B′_λ P̃_λμ(k) B_μ,  (16)

where P̃_λμ(k) is the matrix of the joint probabilities (7).

Proof: From definition (14), the following identity holds

{B_m(k) = β_sμ} = ∪_{n=−∞}^{∞} {B_n = β_sμ ∩ τ_mn = k}.

Hence, considering (3), we get the following partition of event (13)

𝓖_λμ(k) = ∪_r ∪_s 𝓖^rs_λμ(k).

Thereafter, use of (10) with the definitions 𝓔 ≜ 𝓖_λμ(k) and A ≜ B′_m B_m(k) yields

R̃_λμ(k) = Σ_r Σ_s β′_rλ β_sμ Pr {𝓖^rs_λμ(k)}.

Finally, (16) follows from (7).

Remark 2: The correlation matrices R̃_λμ(k) vanish for any integer k ≠ Δh, h ∈ J, where Δ is the g.c.d. in the L′ set (see Remark 1).

Since the correlations between words at h words' distance are not involved in deframing considerations, they will not be considered further. We proceed, therefore, with correlations only between words at k digits' distance.

The k-functions R̃_λμ(k), k ∈ J, will be regarded as the cross-correlation matrix functions of the given stationary VLWS {B_m}. Here the term "correlation matrix function" (CMF) must be justified because ordinarily it is used for stationary fixed-length vector-valued processes,³ for which it has a precise meaning [9, ch. 1]. Now, in spite of the fact that the R̃_λμ(k) have been defined in a very atypical manner (see (15)), they turn out to be ordinary CMF's in the sense of FLWS. However, the proof of this result is lengthy [17] and, therefore, omitted.

³ Hereafter, "fixed-length vector-valued process" will be abbreviated by FLWS, and also by λ-FLWS to indicate the length of the words.

C. Spectral Distributions and Densities

It is well-known that a cross-correlation matrix function R̃_λμ(k) uniquely determines, if suitably normalized, a cross spectral distribution matrix function F_λμ(u), −1/2 ≤ u ≤ 1/2, where u has the meaning of a normalized frequency. In particular, if R̃_λμ(·) is absolutely summable, then F_λμ(·) is absolutely continuous, and its derivative Φ_λμ(·) is given by

Φ_λμ(u) = Σ_{k=−∞}^{∞} R̃_λμ(k) e^{−2πiku}  (17)

and is called the cross spectral density matrix function.

In general, a distribution function consists of an absolutely continuous part, a jump part, and a singular part, by Lebesgue's decomposition [10]. The last part occurs only in pathological situations, whereas the jump part is related to the asymptotic behavior of the correlation function. As a consequence of Remark 2, we find that R̃_λμ(k) does not converge in general to a limit for Δ > 1, whereas it may be reasonable to expect the convergence of the limit

lim_{h→∞} R̃_λμ(hΔ) ≜ R̃_λμ(∞Δ).  (18)

If this happens, the limit value (18) is responsible for the jump part, whereas the deviation of R̃_λμ(hΔ) from R̃_λμ(∞Δ) yields the continuous part. These considerations are stated in the following theorem.

Theorem 2: If the matrix function R̃_λμ(hΔ) converges as h → ∞ to a matrix R̃_λμ(∞Δ), and if the matrix function R̃_λμ(hΔ) − R̃_λμ(∞Δ) is absolutely summable, then F_λμ consists of a jump part F_J^λμ with jumps of Δ⁻¹ R̃_λμ(∞Δ) at frequencies

u_m ≜ m Δ⁻¹,  −½Δ < m ≤ ½Δ,  (19)

and of an absolutely continuous part F_c^λμ with a density given by

Φ_c^λμ(u) = Σ_{h=−∞}^{∞} [R̃_λμ(hΔ) − R̃_λμ(∞Δ)] e^{−2πihΔu}.  (20)

Proof: Let us consider the following decomposition of the given CMF: R̃_λμ = R_J^λμ + R_c^λμ, where (see Remark 2)

R_J^λμ(k) ≜ ξ(k) R̃_λμ(∞Δ),  (21a)

R_c^λμ(k) ≜ R̃_λμ(k) − R_J^λμ(k) = ξ(k) [R̃_λμ(k) − R̃_λμ(∞Δ)],  (21b)

with ξ(k) ≜ 1, for k = hΔ, h ∈ J, and ξ(k) ≜ 0, otherwise.

Now, by virtue of the identity

ξ(k) = Δ⁻¹ Σ_{−½Δ<m≤½Δ} e^{2πikm/Δ},

we get

R_J^λμ(k) = Δ⁻¹ R̃_λμ(∞Δ) Σ_{−½Δ<m≤½Δ} e^{2πikm/Δ}.

Then, R_J^λμ(k) determines, through (17), a matrix function F_J^λμ of jump type with jumps at u_m ≜ m/Δ of amount Δ⁻¹ R̃_λμ(∞Δ) [8, p. 748]. On the other hand, R_c^λμ(k) is absolutely summable and, hence, series (20) converges to the absolutely continuous matrix function F_c^λμ. Finally, F_J^λμ and F_c^λμ are distribution matrix functions by virtue of the uniqueness of the F_λμ decomposition.
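Theorem 2 can be spot-checked numerically on a toy scalar CMF. The following is a sketch under assumed data (not from the paper): take Δ = 2 and R̃(2h) = c + bρ^|h| with R̃(k) = 0 for odd k, so that R̃(∞Δ) = c. The continuous density (20) then has the closed form b(1 − ρ²)/(1 − 2ρ cos 4πu + ρ²), while the jump part consists of jumps of size c/Δ at u = 0 and u = 1/2:

```python
import cmath
import math

# Toy scalar CMF with Delta = 2: R(2h) = c + b*rho**|h|, R(odd k) = 0.
c, b, rho, DELTA = 0.5, 1.0, 0.6, 2

def phi_c_series(u, H=200):
    """Continuous density (20): sum_h [R(h*Delta) - R(inf*Delta)] e^{-2 pi i h Delta u}."""
    s = sum(b * rho**abs(h) * cmath.exp(-2j * math.pi * h * DELTA * u)
            for h in range(-H, H + 1))
    return s.real

def phi_c_closed(u):
    """Closed form of the geometric series above."""
    return b * (1 - rho**2) / (1 - 2 * rho * math.cos(2 * math.pi * DELTA * u) + rho**2)

for u in (0.0, 0.1, 0.23, 0.4):
    assert abs(phi_c_series(u) - phi_c_closed(u)) < 1e-9

jump = c / DELTA   # jump amount Delta^{-1} R(inf*Delta) at u = 0 and u = 1/2
```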

D. Specialization to FLWS

Now we specialize the discussion to a λ-FLWS in which the length set L is composed of only one element λ. In this case, the distances in digits (4) take on values that are multiples of λ; i.e.,

τ_0m = mλ,  m ∈ J.  (22)

All the matrices introduced for the general case with a λ-FLWS turn out to be of dimension λ × λ. Considering (22), we get the following relation between correlations (12) and (15),

R_λλ(h) ≜ E{B′_m B_{m+h}} = R̃_λλ(hλ).  (23)

Hence, the spectral analysis is easily performed on the basis of the CMF R_λλ in place of R̃_λλ. In particular, if we apply Theorem 2 to a FLWS, we get the result obtained through heuristic considerations by Konovalov and Tarasenko [11] and by Cariolaro and Tronca [12].

III. SYMBOL SEQUENCES FROM A VLWS

A VLWS {B_m} may be converted into a symbol sequence (deframing) by taking sequentially the digits of each word B_m in accordance with the order in the VLWS. This situation occurs when the VLWS digits are stored in order and then read at a fixed symbol rate. The choice of time origin for the symbol sequence is not unique, and different choices lead to different characteristics of the resulting symbol sequence. We consider two cases: in the first, the origin is chosen deterministically; in the second, it is chosen randomly. We call the resulting sequences deterministic-origin symbol sequence (DOSS) and random-origin symbol sequence (ROSS), respectively.

A. The DOSS Model

In the DOSS, the choice of the origin is made so that its zeroth symbol coincides with the first digit of the zeroth word, i.e., b_0 ≜ B_0¹ (Fig. 3(a)). Then we get b_1 ≜ B_0², if L_0 ≥ 2; b_1 ≜ B_1¹, if L_0 = 1; b_2 ≜ B_0³, if L_0 ≥ 3; b_2 ≜ B_1², if L_0 = 1 and L_1 ≥ 2; etc. In general, using distance in digits (4), the DOSS {b_t} is defined, for any discrete time t ∈ J, as follows:

b_t ≜ B_m^p,  t = τ_0m + p − 1;  p = 1, ⋯ , L_m,  m ∈ J.  (24)

Indeed, by virtue of (4), the sequence of discrete times defined in (24) covers the integer set J.

The statistical description of the DOSS can be derived from that for the originating VLWS. A considerable simplification is achieved if we regard the digits of VLWS words, namely β^p_rλ, 1 ≤ p ≤ λ, 1 ≤ r ≤ N_λ, λ ∈ L, as distinct objects (even if some values may be coincident). Then we get by inspection,

Pr {b_0 = β^p_rλ | 𝒟} = 0,                 p ≥ 2 (λ ≥ 2)
                      = Pr {B_0 = β_rλ},   p = 1;  (25a)

Pr {b_1 = β^p_rλ | 𝒟} = 0,                 p ≥ 3 (λ ≥ 3)
                      = Pr {B_0 = β_rλ},   p = 2 (λ ≥ 2)
                      = Pr {L_0 = 1 ∩ B_1 = β_rλ},  p = 1;  (25b)

etc., where 𝒟 indicates the deterministic choice of the time origin.

In general, from (25), the DOSS turns out to be nonstationary even if the originating VLWS is stationary. For instance, if λ ≥ 2 and p = 2, we find Pr {b_0 = β²_rλ} = 0, whereas Pr {b_1 = β²_rλ} = Pr {B_0 = β_rλ} may not vanish.
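Definition (24) simply says that the DOSS is the digit stream obtained by concatenating the words in order, with b_0 aligned to the first digit of B_0. A minimal sketch (with hypothetical word values, not the paper's data):

```python
def deframe(words):
    """DOSS (24): b_t = digit p of word m, at t = tau_{0m} + p - 1."""
    b = []
    for w in words:          # words in VLWS order, starting from B_0
        b.extend(w)          # digits of each word sent sequentially
    return b

words = [[0, +1, -1, +1], [+1, -1], [0, 0, +1, +1, -1, +1]]
b = deframe(words)

tau_01 = len(words[0])               # cumulative length (4): tau_{0,1} = L_0
assert b[0] == words[0][0]           # b_0 = B_0^1, by construction
assert b[tau_01] == words[1][0]      # b at t = tau_{0,1} is B_1^1
assert len(b) == sum(map(len, words))
```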

B. The ROSS Model

The loss of stationarity in the passage from the VLWS to the DOSS is due to the deterministic choice made for the time origin. Indeed, the DOSS model can be applied when the stored VLWS digits are read starting from a predetermined position, which is a typically nonstationary situation. In order to regain stationarity, we need to randomize the choice of the time origin. This can be accomplished by adding a random phase in the symbol-sequence definition (ROSS model). In other words, the ROSS is a new sequence {c_t, t ∈ J}, where

c_t ≜ b_{t+ν},  (26)

with b_t defined by (24) and ν an appropriate random variable (Fig. 3(b)).

We first consider a ROSS obtained from a λ-FLWS. In this case, we have to assume ν as independent of the FLWS process and uniformly distributed over the set {0, 1, ⋯ , λ − 1}. This corresponds, indeed, to the fact that each digit of a λ-word is equally likely to fall in the time origin.

Now, the problem is to link the a posteriori probabilities Pr {· | 𝓡} after the random choice of the time origin to the a priori probabilities Pr {·}. This can be done by the following rule

Pr {ν = u ∩ A | 𝓡} = (1/λ) Pr {A},  0 ≤ u ≤ λ − 1,  (27)

where A is a general event and 𝓡 indicates the random choice of the time origin. On the basis of this assumption, one can verify that the ROSS {c_t} defined through (24) and (26) turns out to be stationary⁴ provided that the originating FLWS is stationary.

Fig. 3. Illustration of DOSS (a) and ROSS (b) obtained from a VLWS realization in which the choice of the time origin falls in an L_0 = 4 length word.

We now consider the general case in which the ROSS is obtained from a VLWS. In this case the probability that a word is selected by the random origin procedure⁵ is not independent of the word, but it is proportional to the length of the word, longer words being more likely to be selected than shorter words. Moreover, the phase distribution is to be specified in connection with the length of the selected word. Both these problems are of no concern in the fixed-length case in which the chance that a λ-length word is encountered has probability one.

In order to focus on the problem, we reconsider the particular case L = {2,4,6} on an intuitive basis. The choice of the ROSS origin falls at random in one digit of the zeroth word B_0. In the present example, B_0 has three possible lengths: L_0 = 2, L_0 = 4, and L_0 = 6, and, therefore, there are 12 choices for the pairs (L_0, ν), L_0 ∈ L, ν = 0, 1, ⋯ , λ − 1 (Fig. 4(a)). Assume for the moment that the lengths are equally likely in the originating VLWS, i.e., p_λ ≜ Pr {L_0 = λ} = 1/3. Then, each of the 12 choices (L_0, ν) is equally likely to be selected, i.e., Pr {L_0 = 2 ∩ ν = 0 | 𝓡} = ⋯ = Pr {L_0 = 6 ∩ ν = 5 | 𝓡} = 1/12. This means that the probability that the ROSS origin falls in a λ-length word is exactly Pr {L_0 = λ | 𝓡} = (1/12)λ. We find, consequently, that these a posteriori length probabilities are unequal, even if the a priori length probabilities Pr {L_0 = λ} are equal. Moreover, the probability distribution of the random phase ν, Pr {ν | 𝓡}, is not uniform, as shown in Fig. 4(b). However, each of the conditional distributions Pr {ν | L_0 = λ; 𝓡}, λ ∈ L, turns out to be uniform, specifically Pr {ν | L_0 = λ; 𝓡} = 1/λ, ν = 0, 1, ⋯ , λ − 1.

Fig. 4. Illustration of the random choice of the time origin, when the VLWS has lengths λ = 2, 4, and 6. (a) shows that the random phase ν is allowed to have L_0 = λ values. (b) shows a typical ν distribution.
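The 12-choice argument above can be recomputed exactly with rational arithmetic. A sketch (not from the paper) for the example L = {2, 4, 6} with p_λ = 1/3: it recovers the length-biased probabilities Pr {L_0 = λ | 𝓡} = λ/12 and the nonuniform phase distribution of Fig. 4(b), whose values are 3/12, 3/12, 2/12, 2/12, 1/12, 1/12:

```python
from fractions import Fraction as F

L = (2, 4, 6)
p = {lam: F(1, 3) for lam in L}             # a priori length probabilities

Gamma = 1 / sum(lam * p[lam] for lam in L)  # normalizing constant: 1/(mean length)
assert Gamma == F(1, 4)

# A posteriori length probabilities: proportional to lambda * p_lambda.
post = {lam: Gamma * lam * p[lam] for lam in L}
assert post == {2: F(2, 12), 4: F(4, 12), 6: F(6, 12)}   # (1/12)*lambda

# Phase distribution: Pr{nu = u | R} = Gamma * sum of p_lambda over lambda > u.
phase = [Gamma * sum(p[lam] for lam in L if lam > u) for u in range(max(L))]
assert phase == [F(3, 12), F(3, 12), F(2, 12), F(2, 12), F(1, 12), F(1, 12)]
assert sum(phase) == 1
```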

4 This result can be obtained as an extension of Hurd’s result [13] to the discrete case or as a particularization of the general result stated below for VLWS.

If we now consider general a priori length probabilities p_λ, λ ∈ L, we find that the a posteriori length probability should be proportional to both p_λ and λ itself, i.e., Pr {L_0 = λ | 𝓡} = Γλp_λ, where, for reasons of normalization, it turns out that

Γ = (Σ_{λ∈L} λ p_λ)⁻¹.  (28)

⁵ This procedure has some similarity with the problem of "random starting" in the semi-Markov chain theory [14].

These intuitive considerations are based on the length process {L_m} implied by the given VLWS. However, in order to relate the a posteriori to the a priori probabilities of general events concerning the VLWS, we find it more convenient to proceed axiomatically. To this end, we make the following assumption.

Assumption 1: Let (Ω, 𝓕, P) be the basic probability space of the VLWS {B_m}. Then the ROSS {c_t} is defined in a new probability space whose relevant probabilities Pr {· | 𝓡} are related to the original probabilities Pr {·} by the following rule,

Pr {ν = u ∩ L_0 = λ ∩ A | 𝓡} = Γ Pr {L_0 = λ ∩ A},  0 ≤ u ≤ λ − 1,
                             = 0,                    otherwise,  (29)

where λ ∈ L, A ∈ 𝓕, and where Γ is given by (28).

Remark 3: From Assumption 1, we get the following relations,

Pr {L_0 = λ | 𝓡} = Γλ Pr {L_0 = λ},  λ ∈ L,  (30)

Pr {ν = u | 𝓡} = Γ Σ_{λ>u} p_λ,  0 ≤ u ≤ λ_N − 1,  (31)

Pr {ν = u | L_0 = λ; 𝓡} = 1/λ,  0 ≤ u ≤ λ − 1,  λ ∈ L,  (32)

which are in agreement with the above intuitive considerations. Also, Assumption 1 agrees with (27). However, the ultimate interpretation of Assumption 1 lies in the fact that, when the ν-distribution is chosen in accordance with (29), the ROSS turns out to be stationary (see Theorem 3). In order to derive the ROSS distributions, the following lemma plays a fundamental role.

Lemma 1: Let Ω_k, k ∈ J, be the events defined by

Ω_k ≜ ∪_{m=−∞}^{∞} ∪_{λ∈L} {(k ≤ τ_{−m,0} < k + λ) ∩ L_{−m} = λ},  (33)

where τ_{−m,0} is the distance in digits defined in (4) and the finite L set consists of positive integers. Then, for any k ∈ J, Ω_k coincides with the basic space Ω of the VLWS.

Proof: See Appendix I.

The following theorem links the ROSS to the VLWS distributions. It also assures that the ROSS is stationary.

Theorem 3: Let {B_m, m ∈ J} be a strict-sense stationary VLWS. Then, the ROSS {c_t, t ∈ J} obtained by deframing {B_m} is strict-sense stationary, and its K-order distribution is given by

Pr {c_t1 = β^p1_{r1 λ1} ∩ ⋯ ∩ c_tK = β^pK_{rK λK} | 𝓡}
   = Γ Pr {B_m1 = β_{r1 λ1} ∩ ⋂_{i=2}^{K} ∪_{mi} (B_mi = β_{ri λi} ∩ τ_{m1 mi} = t_i − t_1 + p_1 − p_i)},  (34a)

where t_i ∈ J, λ_i ∈ L, 1 ≤ r_i ≤ N_λi, 1 ≤ p_i ≤ λ_i, K = 1, 2, ⋯ .

For K = 1 and K = 2, (34a) becomes

Pr {c_t = β^p_rλ | 𝓡} = Γ P_λ(r),  (34b)

Pr {c_t = β^p_rλ ∩ c_{t+k} = β^q_sμ | 𝓡} = Γ P̃^rs_λμ(k + p − q),
   1 ≤ p ≤ λ,  1 ≤ q ≤ μ,  (34c)

where P_λ(r) and P̃^rs_λμ(k) are the VLWS distributions defined by (5) and (7), respectively.

Proof: We limit ourselves to proving (34b) and (34c); thereafter, the extension to the general order K is immediate. From definitions (24) and (26), it follows that c_t = b_{t+V} = β^λ_{pr} if and only if B_m = β_{rλ} and t + V = τ_{0m} + p − 1 for some m; i.e.,

{c_t = β^λ_{pr}} = ⋃_{m=−∞}^{∞} {B_m = β_{rλ} ∩ τ_{0m} = t − p + 1 + V},

with probability one. Now, in using (29), we have to specify the word length occurring in the random choice. This is made by partitioning the above event as follows:

{c_t = β^λ_{pr}} = ⋃_{m=−∞}^{∞} ⋃_{l∈ℒ} ⋃_{u=0}^{l−1} {B_m = β_{rλ} ∩ τ_{0m} = t − p + 1 + u ∩ L_0 = l ∩ V = u}.

Then, considering that the events in the last unions are mutually exclusive and using (29) with A = {B_m = β_{rλ} ∩ τ_{0m} = t − p + 1 + u}, we get

Pr{c_t = β^λ_{pr} | ℛ} = Γ Σ_{m=−∞}^{∞} Σ_{l∈ℒ} Σ_{u=0}^{l−1} Pr{B_m = β_{rλ} ∩ τ_{0m} = t − p + 1 + u ∩ L_0 = l}

  = Γ Σ_{m=−∞}^{∞} Σ_{l∈ℒ} Pr{B_0 = β_{rλ} ∩ (t − p + 1 ≤ τ_{−m,0} < t − p + 1 + l) ∩ L_{−m} = l},

where, in the last equality, use has been made of the stationarity of {B_m}. Thereafter, recalling that the events are mutually exclusive, we get

Pr{c_t = β^λ_{pr} | ℛ} = Γ Pr{B_0 = β_{rλ} ∩ Ω_{t−p+1}},

which yields (34b) after use of Lemma 1.

For the proof of (34c), we begin by noting that c_t = β^λ_{pr} and c_{t+k} = β^μ_{qs} if and only if t + V = τ_{0m} + p − 1 and τ_{mn} =


k + p − q. Hence,

{c_t = β^λ_{pr} ∩ c_{t+k} = β^μ_{qs}}
  = ⋃_{m=−∞}^{∞} ⋃_{n=−∞}^{∞} {B_m = β_{rλ} ∩ τ_{0m} = t − p + 1 + V ∩ B_n = β_{sμ} ∩ τ_{mn} = k + p − q},

with probability one. Thereafter, we partition the event as follows to make the random-choice parameters evident:

{c_t = β^λ_{pr} ∩ c_{t+k} = β^μ_{qs}}
  = ⋃_{m} ⋃_{n} ⋃_{l∈ℒ} ⋃_{u=0}^{l−1} {B_m = β_{rλ} ∩ τ_{0m} = t − p + 1 + u ∩ B_n = β_{sμ} ∩ τ_{mn} = k + p − q ∩ L_0 = l ∩ V = u},

where the events in the unions are mutually exclusive. Using Assumption 1 and reasoning as before, we get

Pr{c_t = β^λ_{pr} ∩ c_{t+k} = β^μ_{qs} | ℛ}
  = Γ Σ_{m} Σ_{n} Σ_{l∈ℒ} Pr{B_m = β_{rλ} ∩ (t − p + 1 ≤ τ_{0m} < t − p + 1 + l) ∩ B_n = β_{sμ} ∩ τ_{mn} = k + p − q ∩ L_0 = l}

  = Γ Σ_{m} Σ_{n} Σ_{l∈ℒ} Pr{B_0 = β_{rλ} ∩ B_{n−m} = β_{sμ} ∩ τ_{0,n−m} = k + p − q ∩ (t − p + 1 ≤ τ_{−m,0} < t − p + 1 + l) ∩ L_{−m} = l}

  = Γ Pr{B_0 = β_{rλ} ∩ ⋃_{j=−∞}^{∞} (B_j = β_{sμ} ∩ τ_{0j} = k + p − q) ∩ Ω_{t−p+1}},

where use has been made of the stationarity of {B_m} and of the mutual exclusiveness of the events. Hence, (34c) follows after use of Lemma 1.
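The stationary marginal (34b) lends itself to a quick Monte Carlo check. The sketch below (plain Python) uses an illustrative two-word alphabet — word "a" of length 1 and word "bc" of length 2, each with probability 1/2, not an example taken from the paper — concatenates an i.i.d. word sequence into a symbol stream, and verifies that the long-run symbol frequencies approach Γ p_λ(r), as (34b) predicts for the randomized-origin (stationary) symbol sequence:

```python
import random

random.seed(1)

# Hypothetical two-word alphabet (illustrative, not from the paper):
# word "a" (length 1) and word "bc" (length 2), each with probability 1/2.
words = ["a", "bc"]
probs = [0.5, 0.5]

# Concatenate a long i.i.d. word sequence (the VLWS) into a symbol stream.
stream = "".join(random.choices(words, probs, k=200_000))
n = len(stream)

# Long-run frequency of each symbol in the stream.
freq = {s: stream.count(s) / n for s in "abc"}

# Gamma = 1 / E[L]; (34b) predicts Pr{c_t = symbol} = Gamma * p(word) = 1/3 each.
mean_len = sum(p * len(w) for w, p in zip(words, probs))   # 1.5
gamma = 1 / mean_len                                       # 2/3

print(freq)          # each frequency close to 1/3
print(gamma * 0.5)   # predicted marginal for every symbol
```

Time averages over a long stream reproduce the randomized-origin marginal, so each of the three symbols occurs with frequency Γ·(1/2) = 1/3.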

IV. SPECTRAL ANALYSIS OF ROSS

In this section, we relate the correlation and spectral distribution of the ROSS to those of the originating VLWS.

A. ROSS Correlation Function

Theorem 4: The correlation function of the ROSS is given by

R_c(k) ≜ E{c_t c_{t+k} | ℛ} = Γ Σ_{λ,μ} Σ_{p=1}^{λ} Σ_{q=1}^{μ} R^{λμ}_{pq}(k + p − q),   (35)

where R^{λμ}_{pq}(k) are the entries of the VLWS correlation matrices defined by (15).

Proof: The expectation in (35) yields

E{c_t c_{t+k} | ℛ} = Σ_{λ,μ} Σ_{p,q} Σ_{r,s} β^λ_{pr} β^μ_{qs} Pr{c_t = β^λ_{pr} ∩ c_{t+k} = β^μ_{qs} | ℛ}
  = Γ Σ_{λ,μ} Σ_{p=1}^{λ} Σ_{q=1}^{μ} R^{λμ}_{pq}(k + p − q),

where use has been made of (34) and (16).

B. ROSS Spectral Distribution Function

Theorem 5: The ROSS spectral distribution function F,( i) is given by

dF,(u) = I’ C VA(u) dFX@(u)V,(u), A,P

1~1 d ;, (36)

where PM(U) are the spectral distribution matrix functions of the VLWS defined in Section II, and where VA(U) are the row vectors

v,(u) A [z, * ” ,z’], 2 + exp (2&),

and where v,(u) is the conjugate transpose of V,(u).

Proof: The following spectral representation of R_c(·) holds (Herglotz' theorem [8, p. 474]):

R_c(k) = ∫_{−1/2}^{1/2} e^{2πiku} dF_c(u),   k ∈ 𝒥.

Use of (17) in (35) yields

∫_{−1/2}^{1/2} e^{2πiku} dF_c(u) = Γ Σ_{λ,μ} Σ_{p=1}^{λ} Σ_{q=1}^{μ} ∫_{−1/2}^{1/2} e^{2πi(p−q)u} e^{2πiku} dF^{λμ}_{pq}(u)

  = Γ ∫_{−1/2}^{1/2} e^{2πiku} Σ_{λ,μ} V_λ(u) dF^{λμ}(u) V*_μ(u).

Then (36) follows from the uniqueness of F_c(u).

Spectral Decomposition: The decomposition of the spectral distribution matrix functions F^{λμ}(·) of the VLWS (see Section II) leads to a similar decomposition of the spectral distribution function F_c(u) of the ROSS. Specifically, we have the following theorem.

Theorem 6: Under the assumptions of Theorem 2, the spectral distribution function F_c of the ROSS consists of a jump part F_c(u)_1 with jumps at u_m ≜ mΛ^{−1} of amounts

ΔF_m = Γ Σ_{λ,μ} V_λ(u_m) ΔF^{λμ}_m V*_μ(u_m),   (37)

where ΔF^{λμ}_m are the jumps of F^{λμ}(u) at u_m,


and an absolutely continuous part F_c(u)_2 with a spectral density given by

Ḟ_c(u)_2 = Γ Σ_{λ,μ} V_λ(u) Ḟ^{λμ}(u) V*_μ(u),   |u| ≤ 1/2,   (38)

where Ḟ^{λμ}(u) is defined by (20).

C. Transmission of a ROSS by a PAM Signal

In engineering applications, the ROSS obtained by deframing a VLWS is generally transmitted by a PAM signal. As stated in the Introduction, spectral analysis plays a fundamental role in this case. We now present, in an informal way, some results that follow from the previous theory.

In general, when each symbol of the stationary sequence is transmitted by a standard pulse shape g(t) at T-spaced intervals, the spectral density function of the resulting PAM digital signal s(t) is given by [15]

W_s(f) = T^{−1} |G(f)|² P_c(fT),   −∞ < f < +∞,   (39)

where G(f) is the Fourier transform of g(t), and P_c(u) is the spectral density function of the symbol sequence.⁶

This result holds in particular when the symbol sequence is obtained by the ROSS approach. Then, under the assumptions of Theorems 2 and 6, W_s(f) may be decomposed into two parts, W_s(f) = W_s(f)_1 + W_s(f)_2, namely,

W_s(f)_1 = T^{−2} |G(f)|² Σ_m ΔF_m δ(f − mΛ^{−1}T^{−1})

and

W_s(f)_2 = T^{−1} |G(f)|² P_c(fT)_2,

where ΔF_m is given by (37) and P_c(u)_2 by (38). Then, if the g.c.d. of the lengths in the VLWS is, e.g., Λ = 3, the spectrum of the PAM signal presents lines regularly spaced by (3T)^{−1}, where T is the symbol period.
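As an informal numerical illustration of (39), the sketch below (Python/NumPy) evaluates W_s(f) for a full-width rectangular pulse and a flat, unit-period symbol spectral density. Both choices are hypothetical stand-ins: the actual pulse shape and P_c(u) depend on the particular code and transmission system.

```python
import numpy as np

# Sketch of (39): W_s(f) = (1/T) * |G(f)|^2 * P_c(fT).
T = 1.0   # symbol period

def G(f):
    # Fourier transform of a unit-amplitude rectangular pulse of duration T.
    return T * np.sinc(f * T)   # np.sinc(x) = sin(pi x) / (pi x)

def P_c(u):
    # Hypothetical symbol spectral density; (39) requires it to be periodic
    # with unit period, so a constant (white symbols) is the simplest choice.
    return np.ones_like(u)

f = np.linspace(-3 / T, 3 / T, 601)
W_s = (1 / T) * np.abs(G(f)) ** 2 * P_c(f * T)

print(W_s.max())   # maximum at f = 0: (1/T) * T^2 * 1 = 1.0
```

With a spectrally shaped P_c(u), the same two lines reproduce the continuous part W_s(f)_2; the line part would add weighted impulses at the multiples of (ΛT)^{−1}, as noted above.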

V. APPLICATION TO MUTUALLY INDEPENDENT WORD SEQUENCES

In this section we consider, as an application of the previous theory, a stationary VLWS consisting of mutually independent words. Our interest is confined mainly to the spectral analysis of the ROSS.

Evaluation of Probabilities: By virtue of independence, the VLWS {B_m} is characterized by the word absolute probabilities p_λ(r) (see (5)), which are assumed given. The corresponding length process {L_m} turns out to be composed of mutually independent lengths, and it is specified by the length absolute probabilities p_λ, which are marginal with respect to the p_λ(r) (see (8)).

The joint probabilities between words at k digits' distance (7), which are needed in the ROSS spectral analysis, are straightforward to evaluate. Their matrices are given by

P^{λμ}(0) = δ_{λμ} diag [p_λ(1), ⋯, p_λ(N_λ)],

P^{λμ}(k) = γ(k − λ) P′_λ P_μ,   k > 0,   (40)

P^{λμ}(k) = γ(−k − μ) P′_λ P_μ,   k < 0,

where P_λ, λ ∈ ℒ, are the vectors of the p_λ(r) (see (5)), and γ(·) gives the cumulative length probabilities (9). The leading results concerning the latter probabilities are given by the following theorem.

Theorem 7: Let {L_m} be stationary with mutually independent lengths. Then the sequence γ(hΛ), h = 0, 1, ⋯, converges to γ(∞) = ΛΓ, where Γ is given by (28). Moreover, the sequence x(h) ≜ γ(hΛ) − γ(∞), h = 0, 1, ⋯, is absolutely summable, with z-transform

X(z) ≜ Σ_{h=0}^{∞} x(h) z^{−h} = γ(∞) h(z)[g(z)]^{−1},   (41)

h(z) and g(z) being, respectively,

h(z) ≜ Σ_λ p_λ [(λ′ − 1) + (λ′ − 2)z^{−1} + ⋯ + 2z^{−(λ′−3)} + z^{−(λ′−2)}],   λ′ ≜ λΛ^{−1},
                                                                               (42)
g(z) ≜ Σ_λ p_λ [1 + z^{−1} + ⋯ + z^{−(λ′−1)}],

for N ≥ 2 (for N = 1, i.e., in the fixed-length case, X(z) ≡ 0).

Proof: See Appendix II.
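The convergence γ(hΛ) → ΛΓ asserted in Theorem 7 is easy to observe numerically by iterating the recurrence (47) of Appendix II. The sketch below (Python) takes Λ = 1 and the illustrative length distribution p_1 = p_2 = 1/2 — not the paper's example — for which Γ = 1/E[L] = 2/3:

```python
# Numerical check of Theorem 7 (with Lambda = 1): the cumulative length
# probabilities gamma(k) of (47) converge to Gamma = 1 / E[L].
# Illustrative length distribution: lengths 1 and 2, each with probability 1/2.
p = {1: 0.5, 2: 0.5}

gamma_inf = 1 / sum(l * pl for l, pl in p.items())   # Gamma = 2/3

g = {0: 1.0}                   # gamma(0) = 1; gamma(k) = 0 for k < 0
for k in range(1, 40):
    g[k] = sum(pl * g.get(k - l, 0.0) for l, pl in p.items())

print(g[39], gamma_inf)        # both approximately 0.6666...
```

For this distribution the iterates admit the closed form γ(k) = 2/3 + (1/3)(−1/2)^k, so they oscillate toward 2/3 at a geometric rate governed by the subdominant root of the characteristic polynomial f(s).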

Correlation Matrices: Use of (40) and of Theorem 1 yields the correlation matrices between words at k digits’ distance, defined by (15). We get

R^{λμ}(0) = δ_{λμ} β′_λ diag [p_λ(1), ⋯, p_λ(N_λ)] β_λ,

R^{λμ}(k) = γ(k − λ) m′_λ m_μ,   k > 0,   (43)

R^{λμ}(k) = γ(−k − μ) m′_λ m_μ,   k < 0,

where m_λ = P_λ β_λ (see (11)).

Spectral Distribution and Density of the VLWS: Now,

from the second equality of (43) and from Theorem 7, it follows that a) the matrix functions R^{λμ}(hΛ) converge to the values R^{λμ}(∞) = ΛΓ m′_λ m_μ, λ, μ ∈ ℒ, and b) the sequences R^{λμ}(hΛ) − R^{λμ}(∞) are absolutely summable. Hence, the assumptions of Theorem 2 are satisfied. Then, we find that F^{λμ}(·) has jumps at the points u_m ≜ mΛ^{−1} (−(1/2)Λ < m ≤ (1/2)Λ) of equal amount Γ m′_λ m_μ, and F^{λμ}(u) has a derivative

Ḟ^{λμ}(u) = Σ_{h=−∞}^{∞} [R^{λμ}(hΛ) − R^{λμ}(∞)] z^{−hΛ}

  = R^{λμ}(0) + m′_λ m_μ [z^{−λ} X(z^{Λ}) + z^{μ} X(z^{−Λ}) + ΛΓ(1 − Σ_{h=0}^{λ′−1} z^{−hΛ} − Σ_{h=0}^{μ′−1} z^{hΛ})],   z ≜ exp(2πiu),   (44)

where R^{λμ}(0) is given by (43), X(·) is given by (41), and λ′ ≜ λΛ^{−1}, μ′ ≜ μΛ^{−1}.

⁶ In (39), P_c(u) is to be regarded as a periodic function with a unitary period.


Fig. 5. Curves of spectral density versus normalized frequency for a ROSS obtained by an "independent" VLWS for various values of word probabilities (cases I-IV in the table below). For cases I and II, the amounts of the jumps are also shown.

    I    [0.3 0.2 0.1]   [0.1 0.1]   [0.1  0.04 0.04 0.02]
    II   [0.2 0.1 0.1]   [0.1 0.1]   [0.1  0.1  0.1  0.1 ]
    III  [0.2 0.2 0.2]   [0.1 0.1]   [0.05 0.05 0.05 0.05]
    IV   [0.2 0.2 0.2]   [0.0 0.0]   [0.1  0.1  0.1  0.1 ]

Spectral Distribution and Density of the ROSS: By using Theorem 6, we find that the spectral distribution F_c(·) of the ROSS has jumps at the previous points u_m of amount

ΔF_m = |Γ A(u_m)|²   (45)

and a spectral density function given by

3’&.& = I’ C VA(u)Rxx(0)~A(u) + hI’)A(u)12 i A

+ Re [A(u)B(u) I , (46)

where

A(u) s C Vx<u,m’, A

(464

A’-1

B(u) 9 C V,(u)m’,[z”X(z-“) - hr C +*I, (46b) A h=O

with z P exp(2aiu)J P AA-l. Numerical example: The ROSS spectrum has been

evaluated by means of (45) and (46) for the example given at the beginning of Section II. In Fig. 5, the curves of the resulting spectral densities are plotted for different word probability distributions. The jump part vanishes with distributions III and IV, whereas with I it presents jumps at u = l/2 and u = 0 of amounts 1.4 X lOA and 2.5 X 10v3, respectively, and with II it presents a jump at u = l/2 of amount 2.5 X 10p3.

VI. CONCLUSIONS

A systematic theory has been developed for both stationary variable-length word sequences (VLWS) and the symbol sequences that are obtained from a VLWS by a deframing operation. If a suitable randomization is adopted in the choice of the symbol-sequence origin, this process also turns out to be stationary, and its relevant probabilities are easily linked to those of the originating VLWS. Particular attention has been devoted to spectral analysis, and the spectral representation of the stationary symbol sequence has been related to that of the originating VLWS. Finally, the theory has been applied to the case of a VLWS consisting of mutually independent words, and a closed-form result for the spectrum of the symbol sequence was obtained. The theory can also be applied to VLWS obtained by encoding zero-memory sources with finite-state sequential machines, thus including the VLWS's encountered in digital baseband transmission systems (HDBn, VL43 codes, etc.). This application, however, requires some additional theory, mainly on variable-length sequential machines [18].

ACKNOWLEDGMENT

The authors are grateful to Prof. Yu. V. Rozanov of the Steklov Institute of Mathematics, Moscow, for a stimulating discussion during the paper's revision. Also, thanks are due to Dr. M. Strintzis of the University of Pittsburgh for his helpful criticism.

APPENDIX I

Proof of Lemma 1

For k = 0, (33) reduces to Ω_0 = ⋃_{λ∈ℒ} {(0 ≤ τ_{00} < λ) ∩ L_0 = λ}, where by definition τ_{00} = 0, so that the first condition is unnecessary. Then Ω_0 = ⋃_λ {L_0 = λ} = Ω.

Now we limit ourselves to proving the lemma for k > 0, since for k < 0 the same considerations hold. The lengths being positive, the condition k ≤ τ_{−m,0} < k + λ implies that 1 ≤ m ≤ k; then


(33) becomes

Ω_k = ⋃_{m=1}^{k} ⋃_{λ∈ℒ} {(k ≤ τ_{−m,0} < k + λ) ∩ L_{−m} = λ},   k > 0.

Moreover, since τ_{−m,0} = τ_{−(m−1),0} + L_{−m} and ⋃_{λ∈ℒ} {L_{−m} = λ} = Ω, we get

Ω_k = ⋃_{m=1}^{k} {τ_{−m,0} ≥ k ∩ τ_{−(m−1),0} < k ∩ ⋃_{λ∈ℒ} L_{−m} = λ} = ⋃_{m=1}^{k} {τ_{−m,0} ≥ k ∩ τ_{−(m−1),0} < k} = Ω,

where the last step follows from the fact that τ_{00} < τ_{−1,0} < ⋯ < τ_{−k,0}, with τ_{−k,0} ≥ k and τ_{00} = 0 < k.

APPENDIX II

Proof of Theorem 7

From definition (9) and from the independence of {L_m}, it follows that the cumulative length probabilities satisfy the difference equation

γ(k) = Σ_λ p_λ γ(k − λ),   k > 0,   (47)

where γ(0) = 1 and γ(k) = 0 for k < 0. In what follows, we assume Λ = 1 (the extension to Λ > 1 is only a matter of scaling). Then, from a theorem proved in [16], it follows straightforwardly that the general solution of (47) converges as k → +∞. On the other hand, such a solution has the form [10, p. 10]

γ(k) = Σ_{m=1}^{M} Σ_{n=1}^{ν_m} C_{mn} k^{n−1} s_m^k,   k > 0,

where s_1, ⋯, s_M are the roots (of multiplicity ν_1, ⋯, ν_M) of the characteristic polynomial of (47), i.e., f(s) ≜ s^α − Σ_λ p_λ s^{α−λ}, α ≜ λ_N. The convergence of γ(k) assures that f(s) has no roots outside the unit circle and that the only root on the circle boundary is s = 1, which is simple, so that the absolute summability of the sequence x(k) ≜ γ(k) − γ(∞) also follows.
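The root condition invoked here can be inspected directly. For the illustrative length distribution p_1 = p_2 = 1/2 (so α = 2, Λ = 1), f(s) = s² − s/2 − 1/2 factors as (s − 1)(s + 1/2): the root s = 1 is simple and the remaining root lies strictly inside the unit circle. A quick check with NumPy:

```python
import numpy as np

# Characteristic polynomial f(s) = s^2 - 0.5 s - 0.5 of (47) for the
# illustrative length distribution p_1 = p_2 = 1/2 (alpha = lambda_N = 2).
coeffs = [1.0, -0.5, -0.5]
roots = np.roots(coeffs)

# Expect a simple root at s = 1 and a root at s = -1/2 inside |s| < 1.
print(sorted(roots.real))
```

The same check applies to any length distribution: build the coefficient list [1, −p_1, −p_2, ⋯] of f(s) and confirm that s = 1 is the only root on the unit circle.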

By using the initial conditions γ(0) = 1 and γ(k) = 0 for k < 0, it then follows from the aforementioned theorem that γ(∞) = Γ.

Next, to get the z-transform of the sequence x(k), k = 0, 1, ⋯, we note that x(k) also satisfies (47), with x(0) = 1 − Γ and x(k) = −Γ for k < 0. Hence, we obtain (after some algebra)

X(z) = 1 − Γ + Σ_λ p_λ z^{−λ} Σ_{k=1}^{∞} x(k − λ) z^{−(k−λ)}

  = 1 + X(z) Σ_λ p_λ z^{−λ} − Γ Σ_λ p_λ Σ_{j=0}^{λ−1} z^{−j}.

Now, by solving with respect to X(z), one recognizes that (41) also holds.

REFERENCES

[1] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968, pp. 43-63.
[2] H. Kobayashi, "A survey of coding schemes for transmission or recording of digital data," IEEE Trans. Commun. Technol., vol. COM-19, pp. 1087-1100, Dec. 1971.
[3] A. Croisier, "Introduction to pseudoternary transmission codes," IBM J. Res. Develop., vol. 14, pp. 354-367, July 1970.
[4] V. I. Johannes, A. G. Kaim, and T. Walzman, "Bipolar pulse transmission with zero extraction," IEEE Trans. Commun. Technol., vol. COM-17, pp. 303-310, Apr. 1969.
[5] P. A. Franaszek, "Sequence-state coding for digital transmission," Bell Syst. Tech. J., vol. 47, pp. 143-157, Jan. 1968.
[6] G. L. Cariolaro and S. G. Pupolin, "Moments of correlated digital signals for error probability evaluation," IEEE Trans. Inform. Theory, vol. IT-21, pp. 558-568, Sept. 1975.
[7] G. L. Cariolaro and G. P. Tronca, "Spectra of block coded signals," IEEE Trans. Commun., vol. COM-22, pp. 1555-1564, Oct. 1974.
[8] J. L. Doob, Stochastic Processes. New York: Wiley, 1953.
[9] I. I. Gikhman and A. V. Skorokhod, Introduction to the Theory of Random Processes. Philadelphia: W. B. Saunders, 1969, pp. 32-33.
[10] E. J. Hannan, Multiple Time Series. New York: Wiley, 1970, pp. 34-44.
[11] G. V. Konovalov and Ye. M. Tarasenko, "Matrix methods for assessing the statistical characteristics of pulse-code signals," Radio Engineering, vol. 25, pp. 80-86, Feb. 1970.
[12] G. L. Cariolaro and G. P. Tronca, "Correlation and spectral properties of multilevel (M,N) coded digital signals with applications to pseudoternary (4,3) codes," Alta Frequenza, vol. XLIII, pp. 1-14, Jan. 1974.
[13] H. L. Hurd, "Stationarizing properties of random shifts," SIAM J. Appl. Math., vol. 26, pp. 203-212, Jan. 1974.
[14] R. A. Howard, Dynamic Probabilistic Systems, vol. II. New York: Wiley, 1971, pp. 663-678.
[15] L. E. Franks, Signal Theory. Englewood Cliffs, NJ: Prentice-Hall, 1969, ch. 8.
[16] S. Karlin, A First Course in Stochastic Processes. New York: Academic Press, 1966, pp. 61-62.
[17] G. L. Cariolaro and G. L. Pierobon, "Spectral analysis of vector-valued stochastic processes with random dimension," Communication Engineering Staff of Padova University, Padova, Italy, Internal Note no. 6, Apr. 1976.
[18] ——, "Spectra calculation of some variable-length coded digital messages," to be published.