thermodynamics and complexity of cellular automata
TRANSCRIPT
VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997
tical)hy of, uponwhen
8-6]
444
Thermodynamics and Complexity of Cellular Automata
Remo Badii1 and Antonio Politi21Paul Scherrer Institut, Villigen, Switzerland
2Istituto Nazionale di Ottica, I-50125 Firenze, Italyand INFN, Sezione di Firenze, Firenze, Italy
(Received 9 September 1996)
The complexity exhibited by cellular automata is studied using both topological (graph-theoreand metric (thermodynamic) techniques. A novel topological classification, based on a hierarclanguages, is introduced. In particular, it is shown that the elementary rule 22 is able to produceiteration, a deep nesting of grammatical rules and that this asymptotically yields a phase transitionthe thermodynamic formalism is applied to the limit spatial configuration. [S0031-9007(96)0218
PACS numbers: 05.45.+b, 02.50.Le, 05.70.–a, 87.10.+e
kriorhileizmootilee
vaa
dn
]au[5ode
thn
pr
eao
erie
u
reaeticte
edaal
ed
eeys
o
to
m-
isen
y
Deterministic cellular automata (CAs) exhibit a remarable ability to produce intricate, although not necessarandom patterns, even when acting upon a nearly unifinitial condition [1]. The apparent contrast between tbehavior and the elementary form of the dynamical ruconstitutes a challenge to the definition and charactertion of complexity [2]. The structure of CAs makes thea fertile ground for testing conjectures and new methof analysis both in a physical and in a computational ctext. On the one hand, in fact, they mimic more realismodels, such as partial differential equations and coupmap lattices, while yielding quicker simulations; on thother hand, the discreteness of space, time, and fieldables assimilates them to Turing machines [3], so thatplication of computational tools is straightforward.
Some of the schemes proposed to classify CAs canviewed as complexity measurements. In particular,namical behavior lying “between” ordered (periodic) achaotic evolution is sometimes regarded as “complex.”spite of all efforts made to formalize this conjecture [4however, the very existence of complex cellular automis still in doubt. Rigorous methods, in fact, are seldomable in practice because of uncomputability problemsOthers, based on mean-field approximations [6], demstrate the utility of a physical approach. Notwithstaning their different origin and motivation, these procedurpresent several interconnections to such an extenteven without seeking exact results, much progress camade by carefully combining them in the analysis.
In the present paper, we present a novel gratheoretical technique for the characterization of geneone-dimensional symbol sequences which stresseshierarchical nature of the underlying language. We thapply it to the limit set of the “elementary” CA 22,one-dimensional, nearest-neighbor rule over two symbwhich has gained the fame of being indeed complespecially because of the difficulty of estimating its metentropy [7]. For this reason, we also evaluate its “thmodynamic properties” through the entropy spectrumthe limit set using the nearest-neighbor method [8]. O
0031-9007y97y78(3)y444(4)$10.00
-lymssa-
dsn-cd-
ri-p-
bey-dIn,tas-].n--sat,be
h-icthen
lsx,cr-of
r
main findings are an unexpected grammatical structuof the spatial configuration even at finite times, givenrandom initial condition, and strong numerical evidencfor the existence of a phase transition in the asymptospatial configuration, which appears to exhibit an infininesting of grammatical rules.
We study symbolic sequences of the typeS s1s2, . . . , sn, with si [ h0, 1, . . . , b 2 1j and n ¿ 1,such as those produced by a dynamical system endowwith a b-element phase-space partition [9] or byone-dimensional CA. The latter consists of a dynamicrule which updates synchronously the variablessistd,for all i [ Z, where t is the discrete time [1]. SpatialconfigurationsS at fixed time t, possibly with t ¿ 1,will be studied through the properties of the associatlanguageL , the set of all finite subwords ofS. Inorder for a statistical analysis and, in particular, for ththermodynamic formalism [10] to be applicable to thspatial patterns, these must be stationary (as it is alwatrue with rule 22 whenever the initial condition is alsstationary).
In Ref. [11], two indicators have been proposedcharacterize the topological complexity ofL . The for-mer,Cs1d, was identified with the topological entropyK0,which represents the exponential growth rate of the nuber Nsnd of words of lengthn in L [12]; the latter,Cs2d,was defined as the topological entropy of the setF of allirreducible forbidden words (IFW) ofS: These are wordswhich do not belong toL , although all their proper sub-words do. Hence, denoting withNfsnd the number ofIFWs of lengthn,
Cs2d limn!`
fln Nfsndgyn . (1)
SinceNfsnd is, at most, equal tobNsn 2 1d, Cs2d # Cs1d
and the passage fromL to F in the description ofS represents,de facto, a compression of information.At this stage, the maximum degree of complexityattained when no compression takes place, i.e., whthe equality holds. On the contrary,Cs2d 0 impliessimplicity, since a compact description of the topolog
© 1997 The American Physical Society
VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997
e
o
s
hr
a
tiv
ic
r;
e0fins
aai
th
,
yllhle
d,by
s a
en
s,
ygeel
et
ia-
ce
mithe-al-l-in.A
g
ce
fisdsu-”
e-tce
of L is achieved throughF . In particular, all systemswith a finite F (called subshifts of finite type) belongto this class, random signals corresponding to the extresituationF [. In general,Cs2d can be seen to measurthe difficulty of approximating the languageL throughsubshifts of finite type with increasing memory (lengththe IFWs).
Strictly positiveCs2d can be found already in the clasof regular languages (akin to Markov processes inphysical language). An example is given by the sL 0 of words that do not contain any expression of ttype101s1 1 00dp101, where the asterisk means arbitraconcatenations of the words 1 and 00. Indeed, nonethese forbidden sequences contains any other one, becof the delimiter word 101, so that they are true IFWbelonging to a setF 0. After recoveringL 0 from F 0,one findsCs1d ø 0.6374 andCs2d ø 0.4812, in agreementwith the supposed inequality between the two exponen
We now extend this scheme by supplying a recursprocedure which makes the approach truly hierarchicThe clue is to realize that the IFW languageF can beanalyzed in the same way as the original languageL .More precisely, we propose to determine the set [whwe indicate withF sF d] of all irreducible prohibitionsfound in F and to iterate the procedure onF sF d andon its possible descendants. Special care is requihowever. In fact,F is neither factorial nor transitivei.e., concatenations of words inF do not belong toFby definition, and, ifu, w [ F , then uyw ” F , ;y.LanguagesL which originate from dynamical systemsinstead, enjoy these two properties. The languageF 0
above, for example, has a factorial, transitive compon(given by all possible concatenations of 1 and 0and “one-way” parts, corresponding to the prefix/suf101. In general, therefore, knowledge of all prohibitioin F [i.e., of F sF d] does not permit unambiguoureconstruction ofF itself. Notwithstanding this andthe presumable lack of a general theory that coverslanguages, the hierarchyL ! F ! F sF d ! . . . can bedescended whenever the main source of diversity in eof its members consists of a finite number of factoritransitive components. When this holds, the complexof F originates from its own IFWs and a new indexCs3d
can be properly defined as the topological entropy oflanguageF sF d.
This is, in particular, possible for regular languagethe words of which can be generated by following apaths on a finite graphG [13]. There is, in fact, a well-defined procedure to construct a graphGf , the “dual” ofG, which reproduces the languageF : As a consequenceF is also regular [2]. In general,Gf contains transientparts (arising from the lack of factoriality and transitivitof F ) as well as disjoint ergodic components (usuaassociated with different nodes of the original grapThe exponentCs2d is just the largest of the topologicaentropies of such components. Since each of th
me
f
aeteyofuse
s
s.e
al.
h
ed,
,
nt)
xs
all
chl,ty
e
s,ll
y).
m
is a finite graph itself, the procedure can be iteratethus obtaining a cascade of languages characterizeddecreasing topological entropiesCsdd sd 1, 2, . . .d. Weconjecture that, for each regular language, there existfinite dt such thatCsdt11d 0; i.e., dt is the number ofnested hierarchical levels in the languageL and may beseen as the “topological depth” of the language.
The elementary CA rule 22, defined bysisk 1 1d 1 if si21skd 1 siskd 1 si11skd 1 and sisk 1 1d 0otherwise (k being the discrete time), illustrates the abovideas. Let us apply it on a random initial conditio(which corresponds to a regular language withF [).It is known that any elementary cellular automaton yieldafter k steps, a spatial configurationSk in the class ofthe regular languages if the initial conditionS0 alsobelonged to it [1]. The number of IFWs, however, maincrease in time. Indeed, the analysis of the languaL1 obtained after one iteration of rule 22 reveals thexistence of four hierarchical levels with topologicaexponentsCs1d ø 0.6508, Cs2d ø 0.5297, Cs3d ø 0.4991,andCs4d ø 0.1604.
No rigorous results are instead known for the limit sVs`d (the intersection of the setsVskd of all spatial con-figurations surviving afterk steps, fork 0, . . . , `) ofrule 22. Since, however, the size of the graph assocted with Sk is a rapidly increasing function ofk, one ex-pects that both the topological depthdt dtskd and theexponentsCsddskd also increase withk. Rather than at-tempting a troublesome investigation of the convergenproperties of the sequencehCsddskdj for increasingk, wehave preferred to attack the more meaningful probleof constructing a hierarchical description of the limset which, in the present context, is tantamount to tidentification of all IFWs for increasing length. Unfortunately, one is immediately faced with a fundamentlimitation: At variance with the case of dynamical systems with known generating partition [14], there is no agorithm which may determine all forbidden sequencesthe limit setVs`d of a generic automaton in a finite timeOne could imagine to proceed as follows: Given a Crule, all the preimages of a test sequenceS of length nare computed fork backward iterates (their length beinn 1 2rk in elementary automata with ranger). Then,S is forbidden if an empty set is found for a finitek.This procedure, however, is not guaranteed to halt sinthere is no bound, in principle, to the smallestk whichis necessary to reach in order to assess the legality oS.In practice, however, the detection of forbidden wordsnot too hard since knowledge of the previously identifieones may be used to speed up the computation. Ually, a few iterates are sufficient to identify the “shortprohibitions.
A more fundamental difficulty is represented by squences that are only “asymptotically” forbidden. LePsS, n; kd denote the occurrence probability of a sequenS of length n in the kth image of a uniformly random
445
VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997
ed
fie.
1hlm
c-
in
d
e
d
alln
oreing
n
e
ntthe
f
analer-
d-ti-
d
theity
initial configuration. Then,
PsS, n; kd NpsS, n; kdb2sn12rkd, (2)
where NpsS, n; kd is the number ofkth order preimagesof S and the factorb2sn12rkd represents the Lebesgumeasure of each preimage. Then,S must be considereasymptotically forbidden ifPsS, n; kd ! 0 for k ! `.
The existence of such sequences is easily verifor rule 22: S0 10101, e.g., has only one preimagof order k, consisting of5 1 2k alternating 0s and 1sAccordingly, the probability of observing 10101 afterkiterations of the rule isPs10101, 5; kd 22522k , whichvanishes exponentially for largek. Further asymptoticallyforbidden words are 010110001 and 100110011.
The preimages method enabled us to determineIFWs up to length 25 in a spatial configurationSk
generated by rule 22 ink 10 000 iterations from arandom initial condition. The results, illustrated in Fig.yield Cs2d ø 0.545. Before comparing this value witthe topological entropyK0, let us shift to the generaframework offered by the thermodynamic formaliswhich provides a more complete characterizationsymbolic signals. From the probabilityPss, nd of eachsubsequence of lengthn, one defines the local metrientropyksSd 2 ln PsS, ndyn [8] and considers the number dNnsk0d of such sequences withk [ sk0, k0 1 dk0d.SettingNnskd , engskd for n ! `, one may interpretkas the thermodynamic energyE of a chain of n spins,and gskd, the “entropy spectrum,” as the correspondthermodynamic entropySsEd [2,10]. The functiongskdhas been estimated for the asymptotic configurationS
from the dimension spectrumfsad [8] in the spaceof symbolic sequencesS [the interval x [ f0, 1g, withxsSd
Pi si22i] using the relation k a ln 2 be-
tweenk and the local dimensiona. We have analyzedM 106 symbols of S using the fixed-mass metho[15]. The local entropies were computed in2 3 104
randomly chosen neighborhoods, each containing a m
FIG. 1. Natural logarithm of the numberNfsnd of IFWsof length n for the asymptotic spatial configuration of thelementary CA rule 22 vsn. The fitted slope isCs2d 0.545.
446
ed
all
,
of
g
ass
p mynp. Histograms of the entropies were constructefor np [ f105, 9.8 3 105g andm [ h20, 40, 60, 80j. Thecurve in Fig. 2 refers tonp ø 8 3 105, m 40, andrepresents the typical shape of the curves obtained forpairs sm, npd. The curves are most reliable in the regioaround the valuek K1 ø 0.51 6 0.01 of the metricentropy, which has been estimated independently. Maccurate results can be obtained with a finite-size scalanalysis. The topological entropyK0, in particular, hasbeen determined either as
Ksad0 snd ln
NsndNsn 2 1d
, or asKsbd0 snd ln mmaxsnd ,
(3)wheremmaxsnd is the largest eigenvalue of the transitiomatrix for the graphG that reproduces all prohibitionsup to lengthn. Assuming an exponential convergencof the form K
sid0 snd , K0 1 aie2hin, we have estimated
the topological entropy to beK0 0.55 6 0.01 andh ø 0.08 in both cases (see Fig. 3). The agreemebetween the two approaches strengthens the validity ofnumerical findings [16]. It is important to notice thatCs1d
is very close toCs2d, which hints at the maximal degree oorder-two complexity for the asymptotic sequenceS .
Moreover, the approximately linear behavior ofgskdvs k for 0.25 , k , 0.5 is suggestive of a (first-order)phase transition. This graph is, indeed, analogous toentropy-energy diagram in which subsystems with locenergies in a finite range coexist at the same tempature T sdSydEd21. Analogous curves are obtainefor well-known dynamical systems [8]. The slow convergence of the thermodynamic functions (finite-size esmates show thath is indeed small for other generalizeentropiesKq [8] as well) confirms this conjecture. Withinthe error bounds, we may presumably conclude thatpresence of the phase transition is related to the validof the exact equalityCs1d Cs2d, which could not be di-rectly assessed.
FIG. 2. Entropy spectrumgskd vs k for the elementary CArule 22.
VOLUME 78, NUMBER 3 P H Y S I C A L R E V I E W L E T T E R S 20 JANUARY 1997
arthgt
r
es
t,ithr
oen
le
n
lallye
al-m-ap-
ofhefig-allyhe
di-to
-
,
,
.
-er
there
t
liti,
FIG. 3. Finite-size estimatesK0snd of the topological entropyK0 for the asympototic spatial configuration of the elementCA rule 22. Stars: logarithm of the largest eigenvalues ofdirected graph constructed from all forbidden words of len, # n. Circles: logarithm of the ratioNsndyNsn 2 1d betweenthe numbers of legal words of lengthsn andn 2 1.
The reliability of thegskd spectrum of Fig. 2 is furtheconfirmed by the agreement of the boundarieskmin andkmax of its support with various approximate estimatAn upper bound to the minimum local entropykmin isgiven by the decay rateks0d of the probabilityPs000 . . .dof a sequence composed ofn zeros. This is the mosfrequently observed one (the spatio-temporal patternfact, presents a large number of triangles filled w0s) and, hence, it asymptotically dominates all otheThe abundance of such triangles and the existencesimple recognition algorithm for them permit achievemof reliable results even for large values ofn. Carefulsimulations indicate thatks0d 0.267 is indeed very closeto kmin. An analytic lower bound to the topologicaentropy can be obtained by realizing that rule 22, whapplied four times on an arbitrary concatenationS̃ ofthe two wordsw0 0000 andw1 0001, simulates theeffect of rule 90 on the configurationS 0 obtained fromS̃
by substituting eachw0 with 0 and eachw1 with 1 [3].Since rule 90 is “linear,” it prohibits no sequence, athe topological entropy of its limit set is ln2. Hence, allsubsequences of̃S in rule 22 contribute to the topologicaentropy with sln 2dy4 ø 0.173. The value representslower bound toK0 only if all such subsequences are reacontained in the invariant set of rule 22, as it is confirmby numerical simulations.
yeh
.
in
s.f at
n
d
d
In this work, we have introduced a method whichlows one to identify a hierarchy of nested levels of gramatical structure in a generic symbolic sequence. Itsplication to the language generated by a single iterationrule 22 reveals four different levels. Investigation of tlimit set of the same rule indicates that the spatial conuration is a good candidate for a second-order maximcomplex language. The implications of this finding in tthermodynamic setting have been demonstrated with arect estimate of the entropy spectrum which appearsexhibit a phase transition.
[1] S. Wolfram, Theory and Applications of Cellular Automata(World Scientific, Singapore, 1986).
[2] R. Badii and A. Politi,Complexity: Hierarchical Struc-tures and Scaling in Physics(Cambridge University PressCambridge, England, 1997).
[3] S. Wolfram, Commun. Math. Phys.96, 15 (1984).[4] C. G. Langton, Physica (Amsterdam)22D, 120 (1986).[5] M. Hurley, Ergod. Th. Dynam. Syst.10, 131 (1990).[6] H. A. Gutowitz, Physica (Amsterdam)45D, 136 (1990).[7] G. Fahner and P. Grassberger, Complex Syst.1, 1093
(1987).[8] P. Grassberger, R. Badii, and A. Politi, J. Stat. Phys.51,
135 (1988).[9] V. M. Alekseev and M. V. Yakobson, Phys. Rep.75, 287
(1981).[10] Ya. G. Sinai, Russ. Math. Surv.27, 21 (1972); D. Ruelle
Thermodynamic Formalism(Addison-Wesley, ReadingMA, 1978).
[11] G. D’Alessandro and A. Politi, Phys. Rev. Lett.64, 1609(1990).
[12] R. L. Adler, A. G. Konheim, and M. H. McAndrew, TransAm. Math. Soc.114, 309 (1965).
[13] J. E. Hopcroft and J. D. Ullman,Introduction to AutomataTheory, Languages, and Computation(Addison-Wesley,Reading, MA, 1979).
[14] A. Politi, in From Statistical Physics to Statistical Inference and Back,edited by J.-P. Nadal and P. Grassberg(Kluwer, Dordrecht, 1994), p. 293.
[15] R. Badii and A. Politi, J. Stat. Phys.40, 725 (1985);R. Badii and G. Broggi, Phys. Lett. A131, 339 (1988).
[16] At variance with generic dynamical systems, wheregraph provides better approximations [17], the mostringent upper bound toK0 is here provided by the direcmethod.
[17] G. D’Alessandro, P. Grassberger, S. Isola, and A. PoJ. Phys. A23, 5285 (1990).
447