Shannon's Theory

Claude Shannon, one of the greatest scientists of the 20th century, was a key

    figure in the development of information science. He is the creator of modern information

    theory, and an early and important contributor to the theory of computing.

    1. SECRECY SYSTEMS

As a first step in the mathematical analysis of cryptography, it is necessary to idealize the situation suitably, and to define in a mathematically acceptable way what we shall mean by a secrecy system. A schematic diagram of a general secrecy system is shown in Fig. 1. At the transmitting end there are two information sources: a message source and a key source. The key source produces a particular key from among those which are possible in the system. This key is transmitted by some means, supposedly not interceptible, for example by messenger, to the receiving end. The message source produces a message (the "clear") which is enciphered and the resulting cryptogram sent to the receiving end by a possibly interceptible means, for example radio. At the receiving end the cryptogram and key are combined in the decipherer to recover the message.

    Fig. 1. Schematic of a general secrecy system

Evidently the encipherer performs a functional operation. If M is the message, K the key, and E the enciphered message, or cryptogram, we have

    E = f(M,K)


that is, E is a function of M and K. It is preferable to think of this, however, not as a function of two variables but as a (one parameter) family of operations or transformations, and to write it

E = TiM.

The transformation Ti applied to message M produces cryptogram E. The index i corresponds to the particular key being used. We will assume, in general, that there are only a finite number of possible keys, and that each has an associated probability pi. Thus the key source is represented by a statistical process or device which chooses one from the set of transformations T1, T2, ..., Tm with the respective probabilities p1, p2, ..., pm. Similarly we will generally assume a finite number of possible messages M1, M2, ..., Mn with associated a priori probabilities q1, q2, ..., qn. The possible messages, for example, might be the possible sequences of English letters all of length N, and the associated probabilities are then the relative frequencies of occurrence of these sequences in normal English text.

At the receiving end it must be possible to recover M, knowing E and K. Thus the transformations Ti in the family must have unique inverses Ti^-1 such that Ti Ti^-1 = I, the identity transformation. Thus:

M = Ti^-1 E.

At any rate this inverse must exist uniquely for every E which can be obtained from an M with key i. Hence we arrive at the definition: A secrecy system is a family of uniquely reversible transformations Ti of a set of possible messages into a set of cryptograms, the transformation Ti having an associated probability pi. Conversely any set of entities of this type will be called a secrecy system. The set of possible messages

    will be called, for convenience, the message space and the set of possible cryptograms

    the cryptogram space.

Two secrecy systems will be the same if they consist of the same set of transformations Ti, with the same messages and cryptogram space (range and domain) and the same probabilities for the keys.

A secrecy system can be visualized mechanically as a machine with one or more controls on it. A sequence of letters, the message, is fed into the input of the machine and a second series emerges at the output. The particular setting of the controls corresponds to the particular key being used. Some statistical method must be prescribed for choosing the key from all the possible ones.

    REPRESENTATION OF SYSTEMS

A secrecy system as defined above can be represented in various ways. One which is convenient for illustrative purposes is a line diagram. The possible messages are represented by points at the left and the possible cryptograms by points at the right. If a certain key, say key 1, transforms message M2 into cryptogram E4 then M2 and E4 are connected by a line labeled 1, etc. From each possible message there must be exactly one line emerging for each different key. If the same is true for each cryptogram, we will say

    that the system is closed.


    A more common way of describing a system is by stating the operation one

    performs on the message for an arbitrary key to obtain the cryptogram. Similarly, one

defines implicitly the probabilities for various keys by describing how a key is chosen or what we know of the enemy's habits of key choice. The probabilities for messages are implicitly determined by stating our a priori knowledge of the enemy's language habits, the tactical situation (which will influence the probable content of the message) and any special information we may have regarding the cryptogram.

Fig. 2. Line drawings for simple systems (closed system; not closed)

    EXAMPLES OF SECRECY SYSTEMS

    Simple Substitution Cipher

    In this cipher each letter of the message is replaced by a fixed substitute, usually

also a letter. Thus the message

M = m1 m2 m3 m4 ...

where m1, m2, ... are the successive letters, becomes:

E = e1 e2 e3 e4 ... = f(m1) f(m2) f(m3) f(m4) ...

where the function f(m) is a function with an inverse. The key is a permutation of the alphabet (when the substitutes are letters), e.g. X G U A C D T B F H R S L M Q V Y Z W I E J O K N P. The first letter X is the substitute for A, G is the substitute for B,

    etc.
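As a concrete illustration, here is a minimal Python sketch of simple substitution, using the example permutation quoted above as the key; it assumes the message consists only of upper-case letters A-Z.

import string

ALPHABET = string.ascii_uppercase
# The example key from the text: a permutation of the 26 letters.
KEY = "XGUACDTBFHRSLMQVYZWIEJOKNP"

ENC = {p: c for p, c in zip(ALPHABET, KEY)}   # A -> X, B -> G, ...
DEC = {c: p for p, c in ENC.items()}          # the inverse permutation

def encipher(message: str) -> str:
    """Replace each letter by its fixed substitute f(m)."""
    return "".join(ENC[m] for m in message)

def decipher(cryptogram: str) -> str:
    return "".join(DEC[e] for e in cryptogram)

assert decipher(encipher("ATTACK")) == "ATTACK"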

Transposition (Fixed Period d)

The message is divided into groups of length d and a permutation applied to the first group, the same permutation to the second group, etc. The permutation is the key and can be represented by a permutation of the first d integers. Thus for d = 5, we might have 2 3 1 5 4 as the permutation. This means that:


m1 m2 m3 m4 m5 m6 m7 m8 m9 m10 becomes m2 m3 m1 m5 m4 m7 m8 m6 m10 m9.

Sequential application of two or more transpositions will be called compound transposition. If the periods are d1, d2, ..., dn it is clear that the result is a transposition of period d, where d is the least common multiple of d1, d2, ..., dn.
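A short Python sketch of fixed-period transposition follows, using the permutation 2 3 1 5 4 from the example; for simplicity it assumes the message length is a multiple of d (padding is not handled).

def transpose(message: str, perm: list[int]) -> str:
    """Fixed-period transposition: apply the same permutation to each group.

    perm uses 1-based positions as in the text, e.g. [2, 3, 1, 5, 4] for d = 5.
    Assumes len(message) is a multiple of d.
    """
    d = len(perm)
    out = []
    for start in range(0, len(message), d):
        group = message[start:start + d]
        out.append("".join(group[p - 1] for p in perm))
    return "".join(out)

# m1..m10 becomes m2 m3 m1 m5 m4 m7 m8 m6 m10 m9, as in the example.
print(transpose("abcdefghij", [2, 3, 1, 5, 4]))  # -> "bcaedghfji"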

    Vigenere, and Variations

In the Vigenere cipher the key consists of a series of d letters. These are written repeatedly below the message and the two added modulo 26 (considering the alphabet numbered from A = 0 to Z = 25). Thus

ei = mi + ki (mod 26)

where ki is of period d in the index i. For example, with the key G A H, we obtain

message       N O W I S T H E
repeated key  G A H G A H G A
cryptogram    T O D O S A N E

The Vigenere of period 1 is called the Caesar cipher. It is a simple substitution in which each letter of M is advanced a fixed amount in the alphabet. This amount is the key, which may be any number from 0 to 25. The so-called Beaufort and Variant Beaufort are similar to the Vigenere, and encipher by the equations

ei = ki - mi (mod 26)
ei = mi - ki (mod 26)

respectively. The Beaufort of period one is called the reversed Caesar cipher. The application of two or more Vigeneres in sequence will be called the compound Vigenere. It has the equation

ei = mi + ki + li + ... + si (mod 26)

where ki, li, ..., si in general have different periods. The period of their sum, ki + li + ... + si, as in compound transposition, is the least common multiple of the individual periods.
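The Vigenere equation and its Beaufort variants can be sketched in a few lines of Python; the functions below assume upper-case messages and keys, and reproduce the N O W I S T H E example above.

def to_num(c): return ord(c) - ord('A')
def to_chr(n): return chr(n % 26 + ord('A'))

def vigenere(message: str, key: str) -> str:
    """e_i = m_i + k_i (mod 26), with the key repeated below the message."""
    return "".join(to_chr(to_num(m) + to_num(key[i % len(key)]))
                   for i, m in enumerate(message))

def beaufort(message: str, key: str) -> str:
    """e_i = k_i - m_i (mod 26)."""
    return "".join(to_chr(to_num(key[i % len(key)]) - to_num(m))
                   for i, m in enumerate(message))

def variant_beaufort(message: str, key: str) -> str:
    """e_i = m_i - k_i (mod 26)."""
    return "".join(to_chr(to_num(m) - to_num(key[i % len(key)]))
                   for i, m in enumerate(message))

# The example from the text: message NOWISTHE, key GAH -> cryptogram TODOSANE.
assert vigenere("NOWISTHE", "GAH") == "TODOSANE"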

Digram, Trigram, and N-gram Substitution

Rather than substitute for letters one can substitute for digrams, trigrams, etc. General digram substitution requires a key consisting of a permutation of the 26^2 digrams. It can be represented by a table in which the row corresponds to the first letter of the digram and the column to the second letter, entries in the table being the substitutions (usually also digrams).
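A small Python sketch of general digram substitution along these lines: the key here is built as a random permutation of the 676 digrams (any fixed permutation would serve equally well), and the message length is assumed to be even.

import itertools, random

DIGRAMS = ["".join(p) for p in itertools.product("ABCDEFGHIJKLMNOPQRSTUVWXYZ", repeat=2)]

def make_key(seed: int = 0) -> dict[str, str]:
    """The key: a permutation of the 26^2 = 676 digrams (random here, for illustration)."""
    rng = random.Random(seed)
    shuffled = DIGRAMS[:]
    rng.shuffle(shuffled)
    return dict(zip(DIGRAMS, shuffled))

def encipher(message: str, key: dict[str, str]) -> str:
    """Substitute digram by digram; assumes len(message) is even."""
    return "".join(key[message[i:i + 2]] for i in range(0, len(message), 2))

key = make_key()
inverse = {v: k for k, v in key.items()}
ct = encipher("SECRET", key)
assert "".join(inverse[ct[i:i + 2]] for i in range(0, len(ct), 2)) == "SECRET"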

VALUATIONS OF SECRECY SYSTEMS

There are a number of different criteria that should be applied in estimating the value of a proposed secrecy system. The most important of these are:


    Amount of Secrecy

There are some systems that are perfect: the enemy is no better off after intercepting any amount of material than before. Other systems, although giving him some information, do not yield a unique solution to intercepted cryptograms. Among

    the uniquely solvable systems, there are wide variations in the amount of labor required

    to effect this solution and in the amount of material that must be intercepted to make thesolution unique.

    Size of Key

    The key must be transmitted by non-interceptible means from transmitting to

    receiving points. Sometimes it must be memorized. It is therefore desirable to have the

    key as small as possible.

    Complexity of Enciphering and Deciphering Operations

Enciphering and deciphering should, of course, be as simple as possible. If they are done manually, complexity leads to loss of time, errors, etc. If done mechanically,

    complexity leads to large expensive machines.

    Propagation of Errors

In certain types of ciphers an error of one letter in enciphering or transmission leads to a large number of errors in the deciphered text. The errors are spread out by the

    deciphering operation, causing the loss of much information and frequent need for

    repetition of the cryptogram. It is naturally desirable to minimize this error expansion.

    Expansion of Message

    In some types of secrecy systems the size of the message is increased by the

    enciphering process. This undesirable effect may be seen in systems where one attempts

to swamp out message statistics by the addition of many nulls, or where multiple substitutes are used. It also occurs in many concealment types of systems (which are

    not usually secrecy systems in the sense of our definition).

    PERFECT SECRECY

Let us suppose the possible messages are finite in number M1, ..., Mn and have a priori probabilities P(M1), ..., P(Mn), and that these are enciphered into the possible cryptograms E1, ..., Em by

E = TiM.

The cryptanalyst intercepts a particular E and can then calculate, in principle at least, the a posteriori probabilities for the various messages, PE(M). It is natural to define perfect secrecy by the condition that, for all E, the a posteriori probabilities are equal to the a priori probabilities independently of the values of these. In this case, intercepting the message has given the cryptanalyst no information. Any action of his which depends


    on the information contained in the cryptogram cannot be altered, for all of his

    probabilities as to what the cryptogram contains remain unchanged. On the other hand, if

the condition is not satisfied there will exist situations in which the enemy has certain a priori probabilities, and certain key and message choices may occur for which the enemy's probabilities do change. This in turn may affect his actions and thus perfect secrecy has not been obtained. Hence the definition given is necessarily required by our intuitive ideas of what perfect secrecy should mean.

A necessary and sufficient condition for perfect secrecy can be found as follows: We have, by Bayes' theorem,

PE(M) = P(M) PM(E) / P(E)

    in which:

P(M) = a priori probability of message M.
PM(E) = conditional probability of cryptogram E if message M is chosen, i.e. the sum of the probabilities of all keys which produce cryptogram E from message M.
P(E) = probability of obtaining cryptogram E from any cause.
PE(M) = a posteriori probability of message M if cryptogram E is intercepted.

For perfect secrecy PE(M) must equal P(M) for all E and all M. Hence either P(M) = 0, a solution that must be excluded since we demand the equality independent of the values of P(M), or

PM(E) = P(E)

for every M and E. Conversely if PM(E) = P(E) then PE(M) = P(M) and we have perfect secrecy. Thus we have the result:

Theorem. A necessary and sufficient condition for perfect secrecy is that

PM(E) = P(E)

for all M and E. That is, PM(E) must be independent of M.

Stated another way, the total probability of all keys that transform Mi into a given cryptogram E is equal to that of all keys transforming Mj into the same E, for all Mi, Mj and E.

Now there must be as many E's as there are M's since, for a fixed i, Ti gives a one-to-one correspondence between all the M's and some of the E's. For perfect secrecy PM(E) = P(E) ≠ 0 for any of these E's and any M. Hence there is at least one key transforming any M into any of these E's. But all the keys from a fixed M to different E's must be different, and therefore the number of different keys is at least as great as the number of M's. It is possible to obtain perfect secrecy with only this number of keys, as


    Fig. 3. Perfect system

one shows by the following example: Let the Mi be numbered 1 to n and the Ei the same, and using n keys let

TiMj = Es

where s = i + j (mod n). In this case we see that

PE(M) = 1/n = P(E)

and we have perfect secrecy. An example is shown in Fig. 3 with s = i + j - 1 (mod 5). Perfect systems in which the number of cryptograms, the number of messages, and the number of keys are all equal are characterized by the properties that (1) each M is connected to each E by exactly one line, and (2) all keys are equally likely. Thus the matrix representation of the system is a Latin square.
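The construction TiMj = Es with s = i + j (mod n) and equiprobable keys can be checked numerically. The small Python sketch below uses n = 5 and an arbitrary illustrative a priori message distribution, and verifies via Bayes' rule that the a posteriori probabilities equal the a priori ones.

from fractions import Fraction

n = 5
keys = range(n)                      # key i: T_i(M_j) = E_s with s = i + j (mod n)
p_key = Fraction(1, n)               # keys equiprobable
q = [Fraction(x, 10) for x in (1, 2, 3, 1, 3)]  # illustrative a priori message probabilities

def encrypt(i, j):                   # T_i applied to message j
    return (i + j) % n

for e in range(n):
    # P(E) = sum over (key, message) pairs producing E
    p_e = sum(p_key * q[j] for i in keys for j in range(n) if encrypt(i, j) == e)
    for j in range(n):
        # P_E(M_j) = P(M_j) * P_{M_j}(E) / P(E)   (Bayes)
        p_m_e = sum(p_key for i in keys if encrypt(i, j) == e)   # P_{M_j}(E)
        assert q[j] * p_m_e / p_e == q[j]          # a posteriori equals a priori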

In MTC it was shown that information may be conveniently measured by means of entropy. If we have a set of possibilities with probabilities p1, p2, ..., pn, the entropy H is given by:

H = -Σ pi log pi.

    In a secrecy system there are two statistical choices involved, that of the message and of

    the key. We may measure the amount of information produced when a message is chosen

    by H(M):

H(M) = -Σ P(M) log P(M),

the summation being over all possible messages. Similarly, there is an uncertainty associated with the choice of key given by:

H(K) = -Σ P(K) log P(K).

    In perfect systems of the type described above, the amount of information in the

message is at most log n (occurring when all messages are equiprobable). This information can be concealed completely only if the key uncertainty is at least log n. This is the first example of a general principle which will appear frequently: that there is a


limit to what we can obtain with a given uncertainty in key: the amount of uncertainty

    we can introduce into the solution cannot be greater than the key uncertainty.

The situation is somewhat more complicated if the number of messages is infinite. Suppose, for example, that they are generated as infinite sequences of letters by a suitable Markoff process. It is clear that no finite key will give perfect secrecy. We suppose, then, that the key source generates key in the same manner, that is, as an infinite sequence of symbols. Suppose further that only a certain length of key LK is needed to encipher and decipher a length LM of message. Let the logarithm of the number of letters in the message alphabet be RM and that for the key alphabet be RK. Then, from the finite case, it is evident that perfect secrecy requires

RM LM ≤ RK LK.

    This type of perfect secrecy is realized by the Vernam system.
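A minimal Python sketch of a Vernam-style system over the 26-letter alphabet: the key is a uniformly random letter sequence as long as the message, added to it modulo 26.

import secrets

def vernam_encrypt(message: str) -> tuple[str, str]:
    """One-time pad over A..Z: e_i = m_i + k_i (mod 26), key as long as the message."""
    key = "".join(secrets.choice("ABCDEFGHIJKLMNOPQRSTUVWXYZ") for _ in message)
    cipher = "".join(chr((ord(m) + ord(k) - 2 * ord('A')) % 26 + ord('A'))
                     for m, k in zip(message, key))
    return cipher, key

def vernam_decrypt(cipher: str, key: str) -> str:
    return "".join(chr((ord(c) - ord(k)) % 26 + ord('A')) for c, k in zip(cipher, key))

c, k = vernam_encrypt("ATTACKATDAWN")
assert vernam_decrypt(c, k) == "ATTACKATDAWN"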

    These results have been deduced on the basis of unknown or arbitrary a priori

    probabilities of the messages. The key required for perfect secrecy depends then on the

total number of possible messages.

One would expect that, if the message space has fixed known statistics, so that it has a

definite mean rate R of generating information, in the sense of MTC, then the amount of key needed could be reduced on the average in just this ratio R/RM, and this is indeed true. In fact the message can be passed through a transducer which eliminates the redundancy and reduces the expected length in just this ratio, and then a Vernam system may be applied to the result. Evidently the amount of key used per letter of message is statistically reduced by a factor R/RM and in this case the key source and information source are just matched: a bit of key completely conceals a bit of message information.

It is easily shown also, by the methods used in MTC, that this is the best that can be done. Perfect secrecy systems have a place in the practical picture: they may be used either where the greatest importance is attached to complete secrecy, e.g., correspondence between the highest levels of command, or in cases where the number of possible messages is small. Thus, to take an extreme example, if only two messages, "yes" or "no", were anticipated, a perfect system would be in order, with perhaps the transformation table:

           M1 (yes)   M2 (no)
  key 1       E1         E2
  key 2       E2         E1

The disadvantage of perfect systems for large correspondence systems is, of course, the equivalent amount of key that must be sent. In succeeding sections we consider what can be achieved with smaller key size, in particular with finite keys.

    2. ENTROPY


The Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. It quantifies the information contained in a message, usually in bits or bits/symbol, and gives the minimum average message length needed to communicate that information.

This also represents an absolute limit on the best possible lossless compression of any communication: treating a message as a series of symbols, the shortest possible representation to transmit the message is the Shannon entropy in bits/symbol multiplied by the number of symbols in the original message.

Definition: The information entropy of a discrete random variable X, that can take on possible values {x1, ..., xn}, is

H(X) = E[I(X)] = -Σ(i=1..n) p(xi) log p(xi)

where

I(X) is the information content or self-information of X, which is itself a random variable;

p(xi) = Pr(X = xi) is the probability mass function of X; and

0 log 0 is taken to be 0.
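The definition translates directly into Python (base-2 logarithms give bits, and terms with p = 0 are skipped so that 0 log 0 counts as 0):

import math

def entropy(probs, base=2):
    """H(X) = -sum p(x) log p(x), with 0 log 0 taken to be 0."""
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))        # 1.0 bit: a fair coin
print(entropy([1/6] * 6))         # ~2.585 bits: a fair die
print(entropy([1.0, 0.0]))        # 0.0 bits: a certain outcome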

    Characterization

Information entropy is characterised by these desiderata (writing pi = Pr(X = xi) for the probabilities and Hn(p1, ..., pn) for the entropy of the distribution (p1, ..., pn)):

    Continuity

The measure should be continuous, i.e., changing the value of one of the

    probabilities by a very small amount should only change the entropy by a small amount.

    Symmetry

The measure should be unchanged if the outcomes xi are re-ordered, i.e. Hn(p1, p2, ...) = Hn(p2, p1, ...), etc.

Maximum

The measure should be maximal if all the outcomes are equally likely (uncertainty is highest when all possible events are equiprobable).


    For equiprobable events the entropy should increase with the number of

    outcomes.

    Additivity

The amount of entropy should be independent of how the process is regarded as being divided into parts.

This last functional relationship characterizes the entropy of a system with sub-systems. It demands that the entropy of a system can be calculated from the entropy of its sub-systems if we know how the sub-systems interact with each other.

Given an ensemble of n uniformly distributed elements that are divided into k boxes (sub-systems) with b1, b2, ..., bk elements, the entropy of the whole ensemble should be equal to the sum of the entropy of the system of boxes and the individual entropies of the boxes, each weighted with the probability of being in that particular box. For positive integers bi where b1 + ... + bk = n,

Hn(1/n, ..., 1/n) = Hk(b1/n, ..., bk/n) + Σ(i=1..k) (bi/n) Hbi(1/bi, ..., 1/bi).

Choosing k = n, b1 = ... = bn = 1, this implies that the entropy of a certain outcome is zero:

H1(1) = 0.

It can be shown that any definition of entropy satisfying these assumptions has the form

H = -K Σ(i=1..n) pi log pi

where K is a constant corresponding to a choice of measurement units.

    Information entropy explained

For a random variable X with n outcomes {x1, ..., xn}, the Shannon information entropy, a measure of uncertainty (see further below) and denoted by H(X), is defined as

H(X) = -Σ(i=1..n) p(xi) log_b p(xi)     (1)

where p(xi) is the probability mass function of outcome xi, and b is the base of the logarithm used. Common values of b are 2, e, and 10. The unit of the information entropy is bit for b = 2, nat for b = e, and dit (or digit) for b = 10.


To understand the meaning of Eq. (1), let's first consider a set of n possible outcomes (events) x1, ..., xn, each with equal probability 1/n. An example would be a fair die with n = 6 values, from 1 to 6. The uncertainty for such a set of n outcomes is defined by

u = log_b(n).     (2)

The logarithm is used so as to provide the additivity characteristic for independent uncertainty. For example, consider appending to each value of the first die the value of a second die, which has m possible outcomes y1, ..., ym. There are thus m·n possible outcomes (xi, yj). The uncertainty for such a set of outcomes is then

u = log_b(m·n) = log_b(m) + log_b(n).     (3)

Thus the uncertainty of playing with two dice is obtained by adding the uncertainty of the second die, log_b(m), to the uncertainty of the first die, log_b(n).

Now return to the case of playing with one die only (the first one); since the probability of each event is 1/n, we can write

u = log_b(n) = -log_b(1/n) = -log_b(p(xi)).

In the case of a non-uniform probability mass function (or distribution in the case of a continuous random variable), we let

u_i = -log_b(p(xi)),     (4)

which is also called a surprisal; the lower the probability p(xi), i.e. p(xi) → 0, the higher the uncertainty or the surprise, i.e. u_i → ∞, for the outcome xi.

The average uncertainty ⟨u⟩, with ⟨·⟩ being the average operator, is obtained by

H = ⟨u⟩ = Σ(i=1..n) p(xi) u_i = -Σ(i=1..n) p(xi) log_b(p(xi))     (5)

and is used as the definition of the information entropy in Eq. (1). The above also explains why information entropy and information uncertainty can be used interchangeably.


    Example

As an example, consider a fair coin. The probability of a head or a tail is 0.5, so I(head) = I(tail) = -log2(0.5) = 1, and H = 1 * 0.5 + 1 * 0.5 = 1. So the messages each contain one bit and the average information per message is one bit. This is what we would expect, since each coin toss generates a single bit of information.

Now consider a biased coin, p(head) = 2/3, p(tail) = 1/3. We have I(head) = -log2(2/3) = 0.58 and I(tail) = -log2(1/3) = 1.58. (Note: to find the log base 2 of a number on a standard calculator, find log base 10 and then divide this by log 2 in base 10.) The entropy for this system is then H = 0.58 * 2/3 + 1.58 * 1/3 = 0.92. This is telling us that each message (head or tail) is carrying only 0.92 bits of information. The reason is that the bias means we could have expected to see more heads than tails, so when this happens we are not seeing anything unexpected. Perfect information only happens when we are told something we couldn't have made any useful attempt to predict.
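The arithmetic of the coin example can be reproduced with a few lines of Python:

import math

def surprisal(p):            # I(x) = -log2 p(x), in bits
    return -math.log2(p)

# Fair coin: each outcome carries 1 bit, so H = 1 bit/message.
print(surprisal(0.5))                                        # 1.0
# Biased coin with p(head) = 2/3, p(tail) = 1/3.
print(round(surprisal(2/3), 2), round(surprisal(1/3), 2))    # 0.58 1.58
H = (2/3) * surprisal(2/3) + (1/3) * surprisal(1/3)
print(round(H, 2))                                           # 0.92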

The entropy of a system is important because it tells us how much we can hope to compress streams of messages in the system. In principle, a perfect compression technique would let us encode the stream so that each transmitted symbol carries a full bit of information (an entropy of 1 bit/symbol). In practice, we will usually not achieve better than about 99% efficiency.

Shannon calculated that English text has an entropy of about 2.3 bits per character. Modern analysis has suggested that actually it is closer to 1.1-1.6 bits per character, depending on the kind of text.

    Further properties

    The Shannon entropy satisfies the following properties:

Adding or removing an event with probability zero does not contribute to the entropy:

Hn+1(p1, ..., pn, 0) = Hn(p1, ..., pn).

It can be confirmed using the Jensen inequality that

H(X) = E[log2(1/p(X))] ≤ log2(E[1/p(X)]) = log2(n).

This maximal entropy of log2(n) is effectively attained by a source alphabet having a uniform probability distribution: uncertainty is maximal when all possible events are equiprobable.

Theorem: Suppose X is a random variable having probability distribution p1, p2, ..., pn, where pi > 0, 1 ≤ i ≤ n. Then H(X) ≤ log2 n, with equality if and only if pi = 1/n, 1 ≤ i ≤ n.


PROOF: Applying Jensen's Inequality, we have the following:

H(X) = -Σ(i=1..n) pi log2 pi
     = Σ(i=1..n) pi log2(1/pi)
     ≤ log2( Σ(i=1..n) pi (1/pi) )
     = log2 n.

Further, equality occurs if and only if pi = 1/n, 1 ≤ i ≤ n.
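The theorem can also be spot-checked numerically; the Python sketch below compares H(X) against log2 n for the uniform distribution and for randomly generated distributions.

import math, random

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

n = 8
uniform = [1 / n] * n
assert abs(entropy(uniform) - math.log2(n)) < 1e-12     # equality at p_i = 1/n

random.seed(1)
for _ in range(1000):
    # a random distribution on n outcomes with all p_i > 0
    w = [random.random() + 1e-9 for _ in range(n)]
    p = [x / sum(w) for x in w]
    assert entropy(p) <= math.log2(n) + 1e-12            # H(X) <= log2 n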

    3. Product Cryptosystems

    Another innovation introduced by Shannon in his 1949 paper was the idea of combining

cryptosystems by forming their product. This idea has been of fundamental importance in the design of present-day cryptosystems such as the Data Encryption Standard, which

    we study in the next chapter.

For simplicity, we will confine our attention in this section to cryptosystems in which C = P: cryptosystems of this type are called endomorphic. Suppose S1 = (P, P, K1, E1, D1) and S2 = (P, P, K2, E2, D2) are two endomorphic cryptosystems which have the same plaintext (and ciphertext) spaces. Then the product of S1 and S2, denoted by S1 × S2, is defined to be the cryptosystem (P, P, K1 × K2, E, D).

A key of the product cryptosystem has the form K = (K1, K2), where K1 belongs to the keyspace K1 of S1 and K2 to the keyspace K2 of S2. The encryption and decryption rules of the product cryptosystem are defined as follows: for each K = (K1, K2), we have an encryption rule eK defined by the formula

e(K1,K2)(x) = eK2(eK1(x))

and a decryption rule defined by the formula

d(K1,K2)(y) = dK1(dK2(y)).

    13

  • 8/7/2019 Shannon's Theory

    14/15

That is, we first encrypt x with eK1, and then re-encrypt the resulting ciphertext with eK2. Decrypting is similar, but it must be done in the reverse order:

d(K1,K2)(e(K1,K2)(x)) = d(K1,K2)(eK2(eK1(x)))
                      = dK1(dK2(eK2(eK1(x))))
                      = dK1(eK1(x))
                      = x.

Recall also that cryptosystems have probability distributions associated with their keyspaces. Thus we need to define the probability distribution for the keyspace K of the product cryptosystem. We do this in a very natural way:

pK(K1, K2) = pK1(K1) × pK2(K2).

In other words, choose K1 using the distribution pK1, and then independently choose K2 using the distribution pK2.
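A Python sketch of the product construction, under the simplifying assumption that keys are equiprobable: a cryptosystem is represented here simply by its keyspace and its keyed encryption and decryption rules (the class name and representation are illustrative, not from the text), and the product composes them exactly as in the formulas above. The Shift and Multiplicative Ciphers, which the text defines next, are used only to exercise the construction.

import math
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Cryptosystem:
    keys: list                      # keyspace (keys chosen equiprobably in this sketch)
    enc: Callable                   # enc(key, x)
    dec: Callable                   # dec(key, y)

def product(s1: Cryptosystem, s2: Cryptosystem) -> Cryptosystem:
    """S1 x S2: e_(K1,K2)(x) = e_K2(e_K1(x)) and d_(K1,K2)(y) = d_K1(d_K2(y))."""
    return Cryptosystem(
        keys=[(k1, k2) for k1 in s1.keys for k2 in s2.keys],
        enc=lambda k, x: s2.enc(k[1], s1.enc(k[0], x)),
        dec=lambda k, y: s1.dec(k[0], s2.dec(k[1], y)),
    )

# The Shift and Multiplicative Ciphers over Z_26, as defined in the text.
shift = Cryptosystem(list(range(26)),
                     lambda k, x: (x + k) % 26,
                     lambda k, y: (y - k) % 26)
mult = Cryptosystem([a for a in range(26) if math.gcd(a, 26) == 1],
                    lambda a, x: (a * x) % 26,
                    lambda a, y: (pow(a, -1, 26) * y) % 26)

ms = product(mult, shift)                   # M x S
key = random.choice(ms.keys)                # (K1, K2) drawn independently, equiprobably
assert ms.dec(key, ms.enc(key, 17)) == 17   # d_(K1,K2)(e_(K1,K2)(x)) = x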

    Figure 4. Multiplicative Cipher

Suppose we define the Multiplicative Cipher as in Figure 4. Suppose M is the Multiplicative Cipher (with keys chosen equiprobably) and S is the Shift Cipher (with keys chosen equiprobably). Then it is very easy to see that M × S is nothing more than the Affine Cipher (again, with keys chosen equiprobably). It is slightly more difficult to show that S × M is also the Affine Cipher with equiprobable keys.

Let's prove these assertions. A key in the Shift Cipher is an element K ∈ Z26, and the corresponding encryption rule is eK(x) = x + K mod 26. A key in the Multiplicative Cipher is an element a ∈ Z26 such that gcd(a, 26) = 1; the corresponding encryption rule is ea(x) = ax mod 26. Hence, a key in the product cipher M × S has the form (a, K), where

e(a,K)(x) = ax + K mod 26.


But this is precisely the definition of a key in the Affine Cipher. Further, the probability of a key in the Affine Cipher is 1/312 = 1/12 × 1/26, which is the product of the probabilities of the keys a and K, respectively. Thus M × S is the Affine Cipher.

Now let's consider S × M. A key in this cipher has the form (K, a), where

e(K,a)(x) = a(x + K) = ax + aK mod 26.

Thus the key (K, a) of the product cipher S × M is identical to the key (a, aK) of the Affine Cipher. It remains to show that each key of the Affine Cipher arises with the same probability 1/312 in the product cipher S × M. Observe that aK = K1 if and only if K = a^-1 K1 (recall that gcd(a, 26) = 1, so a has a multiplicative inverse). In other words, the key (a, K1) of the Affine Cipher is equivalent to the key (a^-1 K1, a) of the product cipher S × M. We thus have a bijection between the two key spaces. Since each key is equiprobable, we conclude that S × M is indeed the Affine Cipher.
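The argument just given can also be verified by brute force in Python: enumerating all 312 product keys of M × S and of S × M and checking that the induced encryption maps x → ax + b (mod 26) are exactly those of the Affine Cipher, each arising exactly once.

import math

Z26 = range(26)
UNITS = [a for a in Z26 if math.gcd(a, 26) == 1]        # the 12 valid multiplicative keys

# Keys of M x S: multiply by a, then shift by K -> x |-> a*x + K (mod 26).
ms_maps = {(a, K): tuple((a * x + K) % 26 for x in Z26) for a in UNITS for K in Z26}
# Keys of S x M: shift by K, then multiply by a -> x |-> a*x + a*K (mod 26).
sm_maps = {(K, a): tuple((a * (x + K)) % 26 for x in Z26) for a in UNITS for K in Z26}
# Keys of the Affine Cipher: x |-> a*x + b (mod 26).
affine_maps = {(a, b): tuple((a * x + b) % 26 for x in Z26) for a in UNITS for b in Z26}

# Both products induce exactly the 312 affine encryption maps, each exactly once.
assert set(ms_maps.values()) == set(affine_maps.values())
assert set(sm_maps.values()) == set(affine_maps.values())
assert len(set(ms_maps.values())) == len(set(sm_maps.values())) == 12 * 26   # = 312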

We have shown that M × S = S × M. Thus we would say that the two cryptosystems commute. But not all pairs of cryptosystems commute; it is easy to find counterexamples. On the other hand, the product operation is always associative: (S1 × S2) × S3 = S1 × (S2 × S3).

If we take the product of an (endomorphic) cryptosystem S with itself, we obtain the cryptosystem S × S, which we denote by S^2. If we take the n-fold product, the resulting cryptosystem is denoted by S^n. We call S^n an iterated cryptosystem.

A cryptosystem S is defined to be idempotent if S^2 = S. Many of the

    cryptosystems we studied in Chapter 1 are idempotent. For example, the Shift,

    Substitution, Affine, Hill, Vigenere and Permutation Ciphers are all idempotent. Of

    course, if a cryptosystem S is idempotent, then there is no point in using the product

system S^2, as it requires an extra key but provides no more security.

If a cryptosystem is not idempotent, then there is a potential increase in security

    by iterating several times. This idea is used in the Data Encryption Standard, which

    consists of 16 iterations. But, of course, this approach requires a non-idempotent

cryptosystem to start with. One way in which simple non-idempotent cryptosystems can sometimes be constructed is to take the product of two different (simple) cryptosystems.

    BIBLIOGRAPHY:

C. E. Shannon: Communication Theory of Secrecy Systems, Bell System Technical Journal, 1949.

Douglas Stinson: Cryptography: Theory and Practice.

http://encyclopedia.thefreedictionary.com
