thermodynamics and complexity of cellular automata

4
VOLUME 78, NUMBER 3 PHYSICAL REVIEW LETTERS 20 JANUARY 1997 Thermodynamics and Complexity of Cellular Automata Remo Badii 1 and Antonio Politi 2 1 Paul Scherrer Institut, Villigen, Switzerland 2 Istituto Nazionale di Ottica, I-50125 Firenze, Italy and INFN, Sezione di Firenze, Firenze, Italy (Received 9 September 1996) The complexity exhibited by cellular automata is studied using both topological (graph-theoretical) and metric (thermodynamic) techniques. A novel topological classification, based on a hierarchy of languages, is introduced. In particular, it is shown that the elementary rule 22 is able to produce, upon iteration, a deep nesting of grammatical rules and that this asymptotically yields a phase transition when the thermodynamic formalism is applied to the limit spatial configuration. [S0031-9007(96)02188-6] PACS numbers: 05.45.+b, 02.50.Le, 05.70.–a, 87.10.+e Deterministic cellular automata (CAs) exhibit a remark- able ability to produce intricate, although not necessarily random patterns, even when acting upon a nearly uniform initial condition [1]. The apparent contrast between this behavior and the elementary form of the dynamical rules constitutes a challenge to the definition and characteriza- tion of complexity [2]. The structure of CAs makes them a fertile ground for testing conjectures and new methods of analysis both in a physical and in a computational con- text. On the one hand, in fact, they mimic more realistic models, such as partial differential equations and coupled- map lattices, while yielding quicker simulations; on the other hand, the discreteness of space, time, and field vari- ables assimilates them to Turing machines [3], so that ap- plication of computational tools is straightforward. Some of the schemes proposed to classify CAs can be viewed as complexity measurements. In particular, dy- namical behavior lying “between” ordered (periodic) and chaotic evolution is sometimes regarded as “complex.” In spite of all efforts made to formalize this conjecture [4], however, the very existence of complex cellular automata is still in doubt. Rigorous methods, in fact, are seldom us- able in practice because of uncomputability problems [5]. Others, based on mean-field approximations [6], demon- strate the utility of a physical approach. Notwithstand- ing their different origin and motivation, these procedures present several interconnections to such an extent that, even without seeking exact results, much progress can be made by carefully combining them in the analysis. In the present paper, we present a novel graph- theoretical technique for the characterization of generic one-dimensional symbol sequences which stresses the hierarchical nature of the underlying language. We then apply it to the limit set of the “elementary” CA 22, a one-dimensional, nearest-neighbor rule over two symbols which has gained the fame of being indeed complex, especially because of the difficulty of estimating its metric entropy [7]. For this reason, we also evaluate its “ther- modynamic properties” through the entropy spectrum of the limit set using the nearest-neighbor method [8]. Our main findings are an unexpected grammatical structure of the spatial configuration even at finite times, given a random initial condition, and strong numerical evidence for the existence of a phase transition in the asymptotic spatial configuration, which appears to exhibit an infinite nesting of grammatical rules. We study symbolic sequences of the type S s 1 s 2 ,..., s n , with s i [ h0, 1, . . . , b 2 1j and n ¿ 1, such as those produced by a dynamical system endowed with a b-element phase-space partition [9] or by a one-dimensional CA. The latter consists of a dynamical rule which updates synchronously the variables s i st d, for all i [ Z, where t is the discrete time [1]. Spatial configurations S at fixed time t, possibly with t ¿ 1, will be studied through the properties of the associated language L , the set of all finite subwords of S. In order for a statistical analysis and, in particular, for the thermodynamic formalism [10] to be applicable to the spatial patterns, these must be stationary (as it is always true with rule 22 whenever the initial condition is also stationary). In Ref. [11], two indicators have been proposed to characterize the topological complexity of L . The for- mer, C s1d , was identified with the topological entropy K 0 , which represents the exponential growth rate of the num- ber N snd of words of length n in L [12]; the latter, C s2d , was defined as the topological entropy of the set F of all irreducible forbidden words (IFW) of S: These are words which do not belong to L , although all their proper sub- words do. Hence, denoting with N f snd the number of IFWs of length n, C s2d lim n!` fln N f sndgyn . (1) Since N f snd is, at most, equal to bN sn 2 1d, C s2d # C s1d and the passage from L to F in the description of S represents, de facto, a compression of information. At this stage, the maximum degree of complexity is attained when no compression takes place, i.e., when the equality holds. On the contrary, C s2d 0 implies simplicity, since a compact description of the topology 444 0031-9007y 97y 78(3) y444(4)$10.00 © 1997 The American Physical Society

Upload: antonio

Post on 30-Mar-2017

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Thermodynamics and Complexity of Cellular Automata

VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997

tical)hy of, uponwhen

8-6]

444

Thermodynamics and Complexity of Cellular Automata

Remo Badii1 and Antonio Politi21Paul Scherrer Institut, Villigen, Switzerland

2Istituto Nazionale di Ottica, I-50125 Firenze, Italyand INFN, Sezione di Firenze, Firenze, Italy

(Received 9 September 1996)

The complexity exhibited by cellular automata is studied using both topological (graph-theoreand metric (thermodynamic) techniques. A novel topological classification, based on a hierarclanguages, is introduced. In particular, it is shown that the elementary rule 22 is able to produceiteration, a deep nesting of grammatical rules and that this asymptotically yields a phase transitionthe thermodynamic formalism is applied to the limit spatial configuration. [S0031-9007(96)0218

PACS numbers: 05.45.+b, 02.50.Le, 05.70.–a, 87.10.+e

kriorhileizmootilee

vaa

dn

]au[5ode

thn

pr

eao

erie

u

reaeticte

edaal

ed

eeys

o

to

m-

isen

y

Deterministic cellular automata (CAs) exhibit a remarable ability to produce intricate, although not necessarandom patterns, even when acting upon a nearly unifinitial condition [1]. The apparent contrast between tbehavior and the elementary form of the dynamical ruconstitutes a challenge to the definition and charactertion of complexity [2]. The structure of CAs makes thea fertile ground for testing conjectures and new methof analysis both in a physical and in a computational ctext. On the one hand, in fact, they mimic more realismodels, such as partial differential equations and coupmap lattices, while yielding quicker simulations; on thother hand, the discreteness of space, time, and fieldables assimilates them to Turing machines [3], so thatplication of computational tools is straightforward.

Some of the schemes proposed to classify CAs canviewed as complexity measurements. In particular,namical behavior lying “between” ordered (periodic) achaotic evolution is sometimes regarded as “complex.”spite of all efforts made to formalize this conjecture [4however, the very existence of complex cellular automis still in doubt. Rigorous methods, in fact, are seldomable in practice because of uncomputability problemsOthers, based on mean-field approximations [6], demstrate the utility of a physical approach. Notwithstaning their different origin and motivation, these procedurpresent several interconnections to such an extenteven without seeking exact results, much progress camade by carefully combining them in the analysis.

In the present paper, we present a novel gratheoretical technique for the characterization of geneone-dimensional symbol sequences which stresseshierarchical nature of the underlying language. We thapply it to the limit set of the “elementary” CA 22,one-dimensional, nearest-neighbor rule over two symbwhich has gained the fame of being indeed complespecially because of the difficulty of estimating its metentropy [7]. For this reason, we also evaluate its “thmodynamic properties” through the entropy spectrumthe limit set using the nearest-neighbor method [8]. O

0031-9007y97y78(3)y444(4)$10.00

-lymssa-

dsn-cd-

ri-p-

bey-dIn,tas-].n--sat,be

h-icthen

lsx,cr-of

r

main findings are an unexpected grammatical structuof the spatial configuration even at finite times, givenrandom initial condition, and strong numerical evidencfor the existence of a phase transition in the asymptospatial configuration, which appears to exhibit an infininesting of grammatical rules.

We study symbolic sequences of the typeS ­s1s2, . . . , sn, with si [ h0, 1, . . . , b 2 1j and n ¿ 1,such as those produced by a dynamical system endowwith a b-element phase-space partition [9] or byone-dimensional CA. The latter consists of a dynamicrule which updates synchronously the variablessistd,for all i [ Z, where t is the discrete time [1]. SpatialconfigurationsS at fixed time t, possibly with t ¿ 1,will be studied through the properties of the associatlanguageL , the set of all finite subwords ofS. Inorder for a statistical analysis and, in particular, for ththermodynamic formalism [10] to be applicable to thspatial patterns, these must be stationary (as it is alwatrue with rule 22 whenever the initial condition is alsstationary).

In Ref. [11], two indicators have been proposedcharacterize the topological complexity ofL . The for-mer,Cs1d, was identified with the topological entropyK0,which represents the exponential growth rate of the nuber Nsnd of words of lengthn in L [12]; the latter,Cs2d,was defined as the topological entropy of the setF of allirreducible forbidden words (IFW) ofS: These are wordswhich do not belong toL , although all their proper sub-words do. Hence, denoting withNfsnd the number ofIFWs of lengthn,

Cs2d ­ limn!`

fln Nfsndgyn . (1)

SinceNfsnd is, at most, equal tobNsn 2 1d, Cs2d # Cs1d

and the passage fromL to F in the description ofS represents,de facto, a compression of information.At this stage, the maximum degree of complexityattained when no compression takes place, i.e., whthe equality holds. On the contrary,Cs2d ­ 0 impliessimplicity, since a compact description of the topolog

© 1997 The American Physical Society

Page 2: Thermodynamics and Complexity of Cellular Automata

VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997

e

o

s

hr

a

tiv

ic

r;

e0fins

aai

th

,

yllhle

d,by

s a

en

s,

ygeel

et

ia-

ce

mithe-al-l-in.A

g

ce

fisdsu-”

e-tce

of L is achieved throughF . In particular, all systemswith a finite F (called subshifts of finite type) belongto this class, random signals corresponding to the extresituationF ­ [. In general,Cs2d can be seen to measurthe difficulty of approximating the languageL throughsubshifts of finite type with increasing memory (lengththe IFWs).

Strictly positiveCs2d can be found already in the clasof regular languages (akin to Markov processes inphysical language). An example is given by the sL 0 of words that do not contain any expression of ttype101s1 1 00dp101, where the asterisk means arbitraconcatenations of the words 1 and 00. Indeed, nonethese forbidden sequences contains any other one, becof the delimiter word 101, so that they are true IFWbelonging to a setF 0. After recoveringL 0 from F 0,one findsCs1d ø 0.6374 andCs2d ø 0.4812, in agreementwith the supposed inequality between the two exponen

We now extend this scheme by supplying a recursprocedure which makes the approach truly hierarchicThe clue is to realize that the IFW languageF can beanalyzed in the same way as the original languageL .More precisely, we propose to determine the set [whwe indicate withF sF d] of all irreducible prohibitionsfound in F and to iterate the procedure onF sF d andon its possible descendants. Special care is requihowever. In fact,F is neither factorial nor transitivei.e., concatenations of words inF do not belong toFby definition, and, ifu, w [ F , then uyw ” F , ;y.LanguagesL which originate from dynamical systemsinstead, enjoy these two properties. The languageF 0

above, for example, has a factorial, transitive compon(given by all possible concatenations of 1 and 0and “one-way” parts, corresponding to the prefix/suf101. In general, therefore, knowledge of all prohibitioin F [i.e., of F sF d] does not permit unambiguoureconstruction ofF itself. Notwithstanding this andthe presumable lack of a general theory that coverslanguages, the hierarchyL ! F ! F sF d ! . . . can bedescended whenever the main source of diversity in eof its members consists of a finite number of factoritransitive components. When this holds, the complexof F originates from its own IFWs and a new indexCs3d

can be properly defined as the topological entropy oflanguageF sF d.

This is, in particular, possible for regular languagethe words of which can be generated by following apaths on a finite graphG [13]. There is, in fact, a well-defined procedure to construct a graphGf , the “dual” ofG, which reproduces the languageF : As a consequenceF is also regular [2]. In general,Gf contains transientparts (arising from the lack of factoriality and transitivitof F ) as well as disjoint ergodic components (usuaassociated with different nodes of the original grapThe exponentCs2d is just the largest of the topologicaentropies of such components. Since each of th

me

f

aeteyofuse

s

s.e

al.

h

ed,

,

nt)

xs

all

chl,ty

e

s,ll

y).

m

is a finite graph itself, the procedure can be iteratethus obtaining a cascade of languages characterizeddecreasing topological entropiesCsdd sd ­ 1, 2, . . .d. Weconjecture that, for each regular language, there existfinite dt such thatCsdt11d ­ 0; i.e., dt is the number ofnested hierarchical levels in the languageL and may beseen as the “topological depth” of the language.

The elementary CA rule 22, defined bysisk 1 1d ­1 if si21skd 1 siskd 1 si11skd ­ 1 and sisk 1 1d ­ 0otherwise (k being the discrete time), illustrates the abovideas. Let us apply it on a random initial conditio(which corresponds to a regular language withF ­ [).It is known that any elementary cellular automaton yieldafter k steps, a spatial configurationSk in the class ofthe regular languages if the initial conditionS0 alsobelonged to it [1]. The number of IFWs, however, maincrease in time. Indeed, the analysis of the languaL1 obtained after one iteration of rule 22 reveals thexistence of four hierarchical levels with topologicaexponentsCs1d ø 0.6508, Cs2d ø 0.5297, Cs3d ø 0.4991,andCs4d ø 0.1604.

No rigorous results are instead known for the limit sVs`d (the intersection of the setsVskd of all spatial con-figurations surviving afterk steps, fork ­ 0, . . . , `) ofrule 22. Since, however, the size of the graph assocted with Sk is a rapidly increasing function ofk, one ex-pects that both the topological depthdt ­ dtskd and theexponentsCsddskd also increase withk. Rather than at-tempting a troublesome investigation of the convergenproperties of the sequencehCsddskdj for increasingk, wehave preferred to attack the more meaningful probleof constructing a hierarchical description of the limset which, in the present context, is tantamount to tidentification of all IFWs for increasing length. Unfortunately, one is immediately faced with a fundamentlimitation: At variance with the case of dynamical systems with known generating partition [14], there is no agorithm which may determine all forbidden sequencesthe limit setVs`d of a generic automaton in a finite timeOne could imagine to proceed as follows: Given a Crule, all the preimages of a test sequenceS of length nare computed fork backward iterates (their length beinn 1 2rk in elementary automata with ranger). Then,S is forbidden if an empty set is found for a finitek.This procedure, however, is not guaranteed to halt sinthere is no bound, in principle, to the smallestk whichis necessary to reach in order to assess the legality oS.In practice, however, the detection of forbidden wordsnot too hard since knowledge of the previously identifieones may be used to speed up the computation. Ually, a few iterates are sufficient to identify the “shortprohibitions.

A more fundamental difficulty is represented by squences that are only “asymptotically” forbidden. LePsS, n; kd denote the occurrence probability of a sequenS of length n in the kth image of a uniformly random

445

Page 3: Thermodynamics and Complexity of Cellular Automata

VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997

ed

fie.

1hlm

c-

in

d

e

d

alln

oreing

n

e

ntthe

f

analer-

d-ti-

d

theity

initial configuration. Then,

PsS, n; kd ­ NpsS, n; kdb2sn12rkd, (2)

where NpsS, n; kd is the number ofkth order preimagesof S and the factorb2sn12rkd represents the Lebesgumeasure of each preimage. Then,S must be considereasymptotically forbidden ifPsS, n; kd ! 0 for k ! `.

The existence of such sequences is easily verifor rule 22: S0 ­ 10101, e.g., has only one preimagof order k, consisting of5 1 2k alternating 0s and 1sAccordingly, the probability of observing 10101 afterkiterations of the rule isPs10101, 5; kd ­ 22522k , whichvanishes exponentially for largek. Further asymptoticallyforbidden words are 010110001 and 100110011.

The preimages method enabled us to determineIFWs up to length 25 in a spatial configurationSk

generated by rule 22 ink ­ 10 000 iterations from arandom initial condition. The results, illustrated in Fig.yield Cs2d ø 0.545. Before comparing this value witthe topological entropyK0, let us shift to the generaframework offered by the thermodynamic formaliswhich provides a more complete characterizationsymbolic signals. From the probabilityPss, nd of eachsubsequence of lengthn, one defines the local metrientropyksSd ­ 2 ln PsS, ndyn [8] and considers the number dNnsk0d of such sequences withk [ sk0, k0 1 dk0d.SettingNnskd , engskd for n ! `, one may interpretkas the thermodynamic energyE of a chain of n spins,and gskd, the “entropy spectrum,” as the correspondthermodynamic entropySsEd [2,10]. The functiongskdhas been estimated for the asymptotic configurationS

from the dimension spectrumfsad [8] in the spaceof symbolic sequencesS [the interval x [ f0, 1g, withxsSd ­

Pi si22i] using the relation k ­ a ln 2 be-

tweenk and the local dimensiona. We have analyzedM ­ 106 symbols of S using the fixed-mass metho[15]. The local entropies were computed in2 3 104

randomly chosen neighborhoods, each containing a m

FIG. 1. Natural logarithm of the numberNfsnd of IFWsof length n for the asymptotic spatial configuration of thelementary CA rule 22 vsn. The fitted slope isCs2d ­ 0.545.

446

ed

all

,

of

g

ass

p ­ mynp. Histograms of the entropies were constructefor np [ f105, 9.8 3 105g andm [ h20, 40, 60, 80j. Thecurve in Fig. 2 refers tonp ø 8 3 105, m ­ 40, andrepresents the typical shape of the curves obtained forpairs sm, npd. The curves are most reliable in the regioaround the valuek ­ K1 ø 0.51 6 0.01 of the metricentropy, which has been estimated independently. Maccurate results can be obtained with a finite-size scalanalysis. The topological entropyK0, in particular, hasbeen determined either as

Ksad0 snd ­ ln

NsndNsn 2 1d

, or asKsbd0 snd ­ ln mmaxsnd ,

(3)wheremmaxsnd is the largest eigenvalue of the transitiomatrix for the graphG that reproduces all prohibitionsup to lengthn. Assuming an exponential convergencof the form K

sid0 snd , K0 1 aie2hin, we have estimated

the topological entropy to beK0 ­ 0.55 6 0.01 andh ø 0.08 in both cases (see Fig. 3). The agreemebetween the two approaches strengthens the validity ofnumerical findings [16]. It is important to notice thatCs1d

is very close toCs2d, which hints at the maximal degree oorder-two complexity for the asymptotic sequenceS .

Moreover, the approximately linear behavior ofgskdvs k for 0.25 , k , 0.5 is suggestive of a (first-order)phase transition. This graph is, indeed, analogous toentropy-energy diagram in which subsystems with locenergies in a finite range coexist at the same tempature T ­ sdSydEd21. Analogous curves are obtainefor well-known dynamical systems [8]. The slow convergence of the thermodynamic functions (finite-size esmates show thath is indeed small for other generalizeentropiesKq [8] as well) confirms this conjecture. Withinthe error bounds, we may presumably conclude thatpresence of the phase transition is related to the validof the exact equalityCs1d ­ Cs2d, which could not be di-rectly assessed.

FIG. 2. Entropy spectrumgskd vs k for the elementary CArule 22.

Page 4: Thermodynamics and Complexity of Cellular Automata

VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997

arthgt

r

es

t,ithr

oen

le

n

lallye

al-m-ap-

ofhefig-allyhe

di-to

-

,

,

.

-er

there

t

liti,

FIG. 3. Finite-size estimatesK0snd of the topological entropyK0 for the asympototic spatial configuration of the elementCA rule 22. Stars: logarithm of the largest eigenvalues ofdirected graph constructed from all forbidden words of len, # n. Circles: logarithm of the ratioNsndyNsn 2 1d betweenthe numbers of legal words of lengthsn andn 2 1.

The reliability of thegskd spectrum of Fig. 2 is furtheconfirmed by the agreement of the boundarieskmin andkmax of its support with various approximate estimatAn upper bound to the minimum local entropykmin isgiven by the decay rateks0d of the probabilityPs000 . . .dof a sequence composed ofn zeros. This is the mosfrequently observed one (the spatio-temporal patternfact, presents a large number of triangles filled w0s) and, hence, it asymptotically dominates all otheThe abundance of such triangles and the existencesimple recognition algorithm for them permit achievemof reliable results even for large values ofn. Carefulsimulations indicate thatks0d ­ 0.267 is indeed very closeto kmin. An analytic lower bound to the topologicaentropy can be obtained by realizing that rule 22, whapplied four times on an arbitrary concatenationS̃ ofthe two wordsw0 ­ 0000 andw1 ­ 0001, simulates theeffect of rule 90 on the configurationS 0 obtained fromS̃

by substituting eachw0 with 0 and eachw1 with 1 [3].Since rule 90 is “linear,” it prohibits no sequence, athe topological entropy of its limit set is ln2. Hence, allsubsequences of̃S in rule 22 contribute to the topologicaentropy with sln 2dy4 ø 0.173. The value representslower bound toK0 only if all such subsequences are reacontained in the invariant set of rule 22, as it is confirmby numerical simulations.

yeh

.

in

s.f at

n

d

d

In this work, we have introduced a method whichlows one to identify a hierarchy of nested levels of gramatical structure in a generic symbolic sequence. Itsplication to the language generated by a single iterationrule 22 reveals four different levels. Investigation of tlimit set of the same rule indicates that the spatial conuration is a good candidate for a second-order maximcomplex language. The implications of this finding in tthermodynamic setting have been demonstrated with arect estimate of the entropy spectrum which appearsexhibit a phase transition.

[1] S. Wolfram, Theory and Applications of Cellular Automata(World Scientific, Singapore, 1986).

[2] R. Badii and A. Politi,Complexity: Hierarchical Struc-tures and Scaling in Physics(Cambridge University PressCambridge, England, 1997).

[3] S. Wolfram, Commun. Math. Phys.96, 15 (1984).[4] C. G. Langton, Physica (Amsterdam)22D, 120 (1986).[5] M. Hurley, Ergod. Th. Dynam. Syst.10, 131 (1990).[6] H. A. Gutowitz, Physica (Amsterdam)45D, 136 (1990).[7] G. Fahner and P. Grassberger, Complex Syst.1, 1093

(1987).[8] P. Grassberger, R. Badii, and A. Politi, J. Stat. Phys.51,

135 (1988).[9] V. M. Alekseev and M. V. Yakobson, Phys. Rep.75, 287

(1981).[10] Ya. G. Sinai, Russ. Math. Surv.27, 21 (1972); D. Ruelle

Thermodynamic Formalism(Addison-Wesley, ReadingMA, 1978).

[11] G. D’Alessandro and A. Politi, Phys. Rev. Lett.64, 1609(1990).

[12] R. L. Adler, A. G. Konheim, and M. H. McAndrew, TransAm. Math. Soc.114, 309 (1965).

[13] J. E. Hopcroft and J. D. Ullman,Introduction to AutomataTheory, Languages, and Computation(Addison-Wesley,Reading, MA, 1979).

[14] A. Politi, in From Statistical Physics to Statistical Inference and Back,edited by J.-P. Nadal and P. Grassberg(Kluwer, Dordrecht, 1994), p. 293.

[15] R. Badii and A. Politi, J. Stat. Phys.40, 725 (1985);R. Badii and G. Broggi, Phys. Lett. A131, 339 (1988).

[16] At variance with generic dynamical systems, wheregraph provides better approximations [17], the mostringent upper bound toK0 is here provided by the direcmethod.

[17] G. D’Alessandro, P. Grassberger, S. Isola, and A. PoJ. Phys. A23, 5285 (1990).

447