optimization of flowsheet drawing layout using a genetic algorithm



Pergamon Computers chem. Engng Vol. 22, No. 1-2, pp. 47-67, 1998

© 1997 Elsevier Science Ltd Printed in Great Britain. All rights reserved

PII: S0098-1354(96)00351-1 0098-1354/97 $17.00+0.00

Optimization of flowsheet drawing layout using a genetic algorithm

A. A. Brice, W. R. Johns*

Quanti Sci Ltd., Chiltern House, 45 Station Road, Henley-on-Thames, Oxfordshire RG9 1AT, UK

Abstract

An optimal flowsheet drawing layout is one that presents a simple intuitive correspondence between the drawing and the underlying process it describes. Criteria that can be applied to make this correspondence have, however, never been clearly defined. Drawing algorithms from other disciplines are reviewed and found to have criteria quite distinct from those required for chemical process flowsheets. In most cases the criteria are not explicitly stated and heuristic solution methods are employed. The paper introduces a range of possible flowsheet drawing criteria which can be weighted to emphasize desirable drawing features. A genetic algorithm for laying out drawings is described which can easily accommodate such a wide range of optimization criteria. Results presented show that the algorithm gives "optimal" or near-optimal layouts for small to medium sized drawings (less than 20 units) which compares favourably with other genetic algorithm applications. For the first time we are able to quantify criteria for a "good" flowsheet drawing and hence clearly distinguish between the criterion and the optimization routine. One benefit of such mathematically computable criteria is that it would now be possible to develop special-purpose optimization methods applicable to larger or more complex drawings.

1. Introduction

The stimulus for the work reported here was to provide layout drawings for processes synthesized using the PIP II [Kirkwood et al. (1988), Korovessi (1995)] process synthesis system. PIP II generates a process in a hierarchical way and automates the synthesis methodology presented by Douglas (1987). It starts with a simple input/output description which is broken down into subsystems (e.g. reaction and separation), which are themselves further broken down until a complete feasible process is achieved. At each stage, the current state of elaboration is generated as a text file using a formal process description language. It is the objective of the program described here to generate flowsheet drawings for any level of elaboration during the synthesis procedure.

It should be emphasized that the object is not to produce P & IDs. PIP II presents the flowsheets as simple blocks ("reaction system", "separation system", "solids separation system" etc.). These blocks (or subsystems) can be further expanded as the synthesis proceeds. Ultimately the subsystems will be expanded until they are made up of physically realisable unit operations (reactors, heat exchangers, liquid extractors etc.). At that stage, it may be desirable to produce a full P & ID. The drawing criteria and algorithms presented would be adaptable to such drawings. It is likely, however, that human intervention would be required to produce a fully professional P & ID (at least to lay out the labels). It is not the intention here to produce a drawing package to compete with the many excellent interactive drawing packages available (although the drawings produced could act as a starting point for the latter). The intention is that each drawing should be produced fully automatically in, preferably, a few seconds and be good enough to enable the operator to concentrate on the synthesis process and not be distracted by the production of individual drawings. Thus, at will, it must be possible to produce drawings of any selected subsystem. For any system drawn it must be possible to select any combination of subsystems to be expanded or not, as desired, and the expanded subsystems themselves must be expandable to any selected degree of elaboration. In synthesizing a complete process, it must be easily possible to produce dozens of alternative drawings.

* Corresponding author.

There must be an intuitive linking between these various levels of elaboration. For this purpose the edge connexions of most of the streams are predefined (to give consistency between levels of expansion) and the drawing package described has the option of holding together sub-units within any subsystem so that, for example, all units going to make up the "separation system" are close to one another on the final drawing.


The drawing package specification was for units to be represented by rectangles of equal sizes and streams to be represented by straight line sections drawn parallel to the x or y axes. For all streams, the source and destination units are specified and, for the majority of streams both the source and destination edges are specified. For example, a stream may originate from the right-hand-side of a unit and terminate at the left-hand side of a unit. We will discuss extensions to cover representations of units by different shapes and sizes but extensions were excluded from the current study.

The algorithm presented here clearly differentiates between the criterion for optimization and the optimization algorithm itself. Thus, if unattractive drawings are produced, the reasons can be attributed directly to the criterion. Furthermore, alternative drawing criteria can be explored (to investigate human perceptions of drawing "quality") without simultaneously paying attention to the definition and fine-tuning of the optimization algorithm. This differentiation also enables us to speak of "optimization" of the drawing layout in that we have a criterion susceptible to optimization.

In Section 2, we present a review of some of the drawing algorithms developed in other disciplines. In Section 3 we describe how a drawing can be coded compactly. In Section 4, the genetic optimization algorithm used is described. The results obtained are presented in Section 5 and Section 6 briefly outlines implementation features of the program. In Section 7, we conclude that the genetic algorithm is flexible and sufficiently fast to optimize moderate sized drawings. It has also been adequate to investigate the relevance of eight different criteria of drawing optimality and show that some of those more complex to compute are not relevant. Without these complex criteria, it becomes practicable to employ mathematical optimization methods which have the potential to optimize much larger drawings.

It is concluded that:

• "Quality" of chemical process drawings can be defined by a small number of relatively simple criteria that can be accumulated into a single figure of merit that is susceptible to optimization.

• Complete drawings can be compactly coded in a way that retains feasibility as any one ordinate (x or y) is changed over its complete possible range.

• The layout optimization should place units and determine stream paths at the same time. It is, however, beneficial to defer the optimization of the separation of streams following close parallel paths to a second step.

• A genetic algorithm has proved to be a simple and effective method for optimizing small drawings and for exploring alternative optimization criteria. It is recommended for scoping studies such as that described in this paper. It is, however, unsuitable for large drawings and it is recommended that future work concentrates on mathematical optimization methods (e.g. implicit enumeration).

2. Literature review

There have been a great many published methods for automating the layout of various types of drawings and there are a number of interactive drawing packages.

VLSI layout is a very active area of research of obvious importance to the computer industry. A typical algorithm is described by Tani et al. (1991). A similar problem is tackled by Nummenmaa (1992). This whole area is not relevant to chemical process flowsheets. It is concerned with fitting required connexions into the minimum area without consideration of whether the pattern developed is readily intelligible. Indeed the compression associated with minimization of area makes the logic flow less intelligible to the human observer.

Superficially the work addressed by Tamassia et al. (1991) appears more relevant. It represents an area of application restricted to non-directed planar graphs (i.e. with no line crossings) with maximum vertex degree ≤ 4. The vertex degree limitation allows a maximum of one line in each of the four directions (horizontally left, vertically down, horizontally right and vertically upwards). The drawings produced have nothing in common with process flow diagrams and the algorithms are not readily adaptable to situations in which lines may cross and the direction of flow is important.

A more relevant class of algorithms is described by Gassner et al. (1993). These authors explicitly state the aesthetic principles on which their drawing algorithm is based. They do not, however, quantify the principles. Instead, the algorithm attempts to meet the principles by a series of steps with heuristically determined criteria. As a consequence, if for example a user would like to reduce the number of line crossings at the expense of increasing the line length, it is impossible to communicate this information to the algorithm. There are two further difficulties with the algorithm. The first difficulty is that the main emphasis is on finding the rankings of the nodes. (Nodes would correspond to process units and similarly displayed items on a process flow diagram.) Thus counting the rankings from 1 to n, where 1 is the start rank and n the end rank, the nodes are sequenced, one group at rank 1, the next at rank 2 and so on. The nodes with equal rank are then aligned at right-angles to the general flow of information (material etc.). No attempt is made to align nodes in the direction of information flow, a criterion to which most process engineers assign higher priority. The second difficulty is that, whilst an attempt is made to minimize sharp corners, no attempt is made to draw straight lines. Indeed, line drawing (edge placing) is tackled as a separate task after the node coordinates have been fixed. Visual inspection of the examples illustrated in the paper shows that it would not be possible to place the edges (representing streams etc.) in a way that would be acceptable for a process flowsheet without moving the nodes. A number of the ideas in the overall algorithm could, however, be adapted to a mathematical optimization of the flowsheet drawing problem.

An area of less concern to the current study is the one of placing dimensions on engineering drawings. This problem starts with an engineering drawing (e.g. of a mechanical part) and optimizes the placing of lines on which the dimensions (e.g. lengths in mm) are to be placed. The algorithms seem not to be relevant in that the initial problem is much more closely constrained than the flowsheet drawing problem. It may be noted, however, that the general label-placing problem is very difficult, see, for example, Freeman and Alan (1987). We have not attempted to optimize label layout.

The most directly relevant research appears to be that on the automated drawing of data flow diagrams (DFDs). Protsko et al. (1989), and Protsko et al. (1991) describe MONDRIAN, a system for drawing DFDs. As with Process Flowsheet Diagrams, these authors note that producing acceptable drawings manually can take hours. There is a strong incentive to produce reasonable drawings automatically. MONDRIAN first places the nodes of a DFD and then joins the nodes with directed lines made up from straight sections. The system is highly heuristic with no clear definition of what constitutes a "good" diagram. Visual inspection of the published diagrams indicates that the criteria may be different from those for process flow diagrams. As there is no clear definition of what the algorithm sets out to achieve, it is not obvious how it should be adapted to meet other criteria. Indeed, it is not clear that the algorithm meets any consistent criteria for "goodness" of diagram. MONDRIAN does, however, tackle an essentially similar problem to the problem which we set out to tackle, namely the generation of a drawing made up of rectangular nodes joined by straight line sections. Furthermore, their approach of placing the nodes and edges on a grid does appear promising for process flowsheet applications. The use of a relatively coarse grid avoids small offsets in lines which are essentially a continuation of one line through a series of nodes. It also reduces a mixed-integer optimization problem to a relatively more tractable all-integer problem.

It is seen that none of the published algorithms addresses the problem that is defined in this paper. The prevalence of heuristic procedures also wraps the objective into the algorithm itself. Thus the resulting drawings meet their own authors' ideas of a good drawing but may not meet the ideas of the users of the packages. Another widespread characteristic is a separation of the node-placement and edge-tracing algorithms. The preliminary node placing is typically done to minimize line crossings when the edges are subsequently placed. It is, however, clear by referring to Fig. 1, that line crossings can only be determined by actually tracing the lines. We have, therefore, decided to combine the line-placing and node-placing algorithms so that both are optimized simultaneously. The placement algorithm then considers the possibility of circuitous routes for the edges when placing the nodes whilst including any penalty for unnecessarily long lines. Whilst our proposed algorithm considers the general path taken by the lines (edges), it is believed unnecessary to determine the fine placement in order to place the nodes correctly. Thus, if there were two lines from node A to node D, we would not place them separately. Instead, we separate the line placement algorithm into two stages. The first allows lines to fall on top of each other over a relatively coarse grid whereas the second generates a finer grid to separate the otherwise coincident lines. This two-stage approach is illustrated in Fig. 2.

It should be emphasized that our intention is to develop an algorithm that can be extended to cover more complex icons than simply boxes. It is not, however, the intention to develop an automated drawing package fully competitive with commercial computer-aided drawing packages such as PROCEDE 2.1 (1995). Such packages allow the user to develop a flowsheet manually and move the units and streams to meet their requirements to produce good quality P & IDs. Our program tackles a different problem, namely to draw a flowsheet for a previously unseen process that has been automatically generated as a text file. The user has no previous idea of how many units there are, how important each unit is or how many connecting streams there are. The presence or absence of recycle streams may also only be vaguely known a priori. The objective of our program is to produce an acceptable drawing that will give the user some understanding of the process. It is likely that, with knowledge that cannot be adequately coded in a formal text file, the user will be able subsequently to use a package such as PROCEDE to produce an even better drawing. In a similar vein, it may be noted that commercial process simulation packages (e.g. those from Aspen Technology Inc., Simulation Sciences Inc. and Hyprotech Ltd) offer drawing tools that rapidly enable users to develop and explore their ideas. They also offer limited optimization facilities to "tidy up" drawings that may have become less clear as extra units are added. These do not, however, address the problem of generating drawings from scratch without manual intervention. This latter requirement is likely to become more widespread as Process Synthesis technology becomes more widely accepted.

Fig. 1. Line crossings can only be determined by placing streams.

Fig. 2. Line separation in two stages.

3. Drawing coding and quality criteria

Our remit was to produce drawings of a quality acceptable to users. It is, then, not desirable to impose our criteria for drawing quality and wrap them up in a heuristic algorithm that is not easily changed. We, therefore, decided to separate clearly the criteria for a good drawing from the algorithm designed to meet these criteria. In this way users can explore the effect of different combinations and weightings of criteria on the drawing produced. They can then determine the criteria that they consider important in generating "good" drawings without having the criteria imposed by ourselves. There is a further benefit in that the same package may be used for different types of drawing which may require different layout criteria; it would simply require a change of criterion.

Before describing the algorithm in detail, the possible layout quality criteria will be discussed together with means of quantifying them.

3.1. Drawing quality criteria

The criteria considered were:

• Minimize number of kinks (i.e. right angle turns) in the lines representing the process streams.

• Minimize excess line length.

• Minimize number of recycles (i.e. streams flowing right to left).

• Favour drawing feed-forward streams above recycle streams (i.e. show recycle streams towards the bottom of the canvas, below feed forward streams).

• Show main raw material to product flow from left to right.

• Minimize occurrences of multiple closely-spaced parallel streams.

• Minimize abutting streams.

• Favour placing units with common parents (e.g. the units making up a "parent" reactor system) close together.

These criteria do not, in general, subsume one another. Thus for example, there are situations in which minimizing the number of kinks also minimizes excess line length, but in other situations it does not. Similarly minimization of total line length or of excess line length gives rise to distinct drawing characteristics; the first favouring the units pulled close together, the latter favouring simple alignments. Examples can be devised that show that no one criterion can be met by combinations of the others.

The rationale behind these criteria and the methods of giving them quantitative expression are discussed in turn. At this stage, we are not prejudging that the arguments presented are sound, merely stating a case for each one of the criteria considered. It is only by inspecting the consequences of applying these criteria that their effectiveness can be determined.

3.1.1. Kinks

The rationale is that the most intelligible flowsheet is one in which each stream is shown as a single straight line linking its source and destination units. Every right-angled turn makes it less easy to follow the stream from source to destination; we therefore penalize the drawing for every right-angle turn in a stream. A value for this criterion is easily obtained by adding up the total number of kinks in all streams on a process drawing.
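The kink count lends itself to a short illustration. The sketch below is not taken from the paper: it assumes each stream is stored as a list of (x, y) grid points in which consecutive points differ in exactly one ordinate, and the function names `count_kinks` and `total_kinks` are hypothetical.

```python
def count_kinks(path):
    """Count right-angle turns in an axis-parallel polyline of (x, y) points.

    Assumes no zero-length sections and no section that doubles back on itself.
    """
    kinks = 0
    for (x0, y0), (x1, y1), (x2, y2) in zip(path, path[1:], path[2:]):
        first_is_horizontal = y0 == y1
        second_is_horizontal = y1 == y2
        if first_is_horizontal != second_is_horizontal:
            kinks += 1  # direction changed between horizontal and vertical
    return kinks


def total_kinks(paths):
    """Kink criterion for a drawing: sum of the kinks over all streams."""
    return sum(count_kinks(path) for path in paths)


straight = [(1, 1), (5, 1)]                # single straight line, no kinks
dogleg = [(1, 1), (3, 1), (3, 4), (7, 4)]  # horizontal, vertical, horizontal
print(total_kinks([straight, dogleg]))
```

A straight stream contributes nothing, so a drawing made up entirely of aligned units scores zero on this criterion.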

3.1.2. Excess line length

The rationale is similar to that for kinks. The "simplest" drawing is considered to be that with streams directly linking sources and destinations. Where kinks are essential, they should generate the smallest possible deviation from the direct line. The shortest length of line connecting a unit at $(x_1, y_1)$ to a unit at $(x_2, y_2)$ is $|x_2 - x_1| + |y_2 - y_1|$. The "excess line length" is then $L - |x_2 - x_1| - |y_2 - y_1|$, where $L$ is the actual line length. The Excess Line Length criterion is then obtained by adding the excess length of each of the streams drawn on the flowsheet.

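As a worked illustration of the excess line length, the following sketch computes the actual length of a stream and subtracts the shortest axis-parallel distance. It is only a sketch under stated assumptions: streams are lists of (x, y) grid points, the first and last points of the path stand in for the source and destination unit positions, and the function names are hypothetical.

```python
def path_length(path):
    """Actual length L of an axis-parallel polyline of (x, y) points."""
    return sum(abs(x1 - x0) + abs(y1 - y0)
               for (x0, y0), (x1, y1) in zip(path, path[1:]))


def excess_length(path):
    """L minus the shortest axis-parallel distance between the end points.

    Assumption: the path end points approximate the unit positions.
    """
    (xs, ys), (xe, ye) = path[0], path[-1]
    return path_length(path) - abs(xe - xs) - abs(ye - ys)


dogleg = [(1, 1), (3, 1), (3, 4), (7, 4)]  # two kinks but no detour
detour = [(1, 1), (1, 3), (5, 3), (5, 1)]  # loops over an obstacle
print(excess_length(dogleg), excess_length(detour))
```

The dogleg has two kinks yet zero excess length, whereas the detour has both, which illustrates why the kink and excess length criteria do not subsume one another.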
3.1.3. Line crossing

Line crossings are considered to confuse the eye in tracing a stream from source to destination. The line crossing criterion is calculated by finding the total number of streams shown as crossing on the whole diagram. Account has to be taken of streams that do not cross directly, i.e. streams that join and run parallel for a while and then diverge again. It depends on the direction of divergence whether the lines cross or not.
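A simplified version of the crossing count can be sketched as follows. This is an assumption-laden illustration rather than the authors' routine: it only detects direct crossings, where a horizontal section of one stream passes through the interior of a vertical section of another, and it deliberately ignores the join, run parallel and diverge case noted above. Streams are assumed to be lists of (x, y) grid points, and all names are hypothetical.

```python
from itertools import combinations


def segments(path):
    """Consecutive point pairs of a polyline."""
    return list(zip(path, path[1:]))


def crosses(seg_a, seg_b):
    """True if one section is horizontal, the other vertical, and they
    intersect strictly inside both sections (a direct crossing)."""
    for h, v in ((seg_a, seg_b), (seg_b, seg_a)):
        (hx0, hy0), (hx1, hy1) = h
        (vx0, vy0), (vx1, vy1) = v
        if hy0 == hy1 and vx0 == vx1 and hx0 != hx1 and vy0 != vy1:
            if (min(hx0, hx1) < vx0 < max(hx0, hx1)
                    and min(vy0, vy1) < hy0 < max(vy0, vy1)):
                return True
    return False


def count_crossings(paths):
    """Number of direct crossings between sections of different streams."""
    return sum(crosses(a, b)
               for p, q in combinations(paths, 2)
               for a in segments(p)
               for b in segments(q))


across = [(0, 2), (6, 2)]  # horizontal stream
up = [(3, 0), (3, 5)]      # vertical stream cutting through it
short = [(0, 4), (2, 4)]   # stops before reaching the vertical stream
print(count_crossings([across, up, short]))
```

Handling streams that share a path and then diverge would need the direction of divergence at each end of the shared run, which this sketch does not attempt.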

3.1.4. Recycles

The rationale here is that the easiest drawing to follow is one in which the raw materials enter at the left of the drawing and all streams flow from left to right with the product stream leaving on the right. There is then a simple left to right flow for the whole drawing. The "recycle" criterion is calculated by counting the number of line sections directed from right to left, i.e. moving counter to the preferred simple flow pattern. One "recycle count" is added per stream directed from right to left independent of how many sections are directed right to left or how long the right to left sections are. Note also that, even if the source unit is to the left of the destination unit, if any section of the joining stream is directed right to left, a "recycle" is scored because the preferred left to right flow is interrupted.
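The counting rule, one recycle per stream regardless of how many sections run right to left, can be sketched as below. The representation (a stream as a list of (x, y) grid points ordered from source to destination, with x increasing to the right) and the function names are assumptions made for illustration.

```python
def is_recycle(path):
    """True if any section of the stream is directed right to left."""
    return any(x1 < x0 for (x0, _), (x1, _) in zip(path, path[1:]))


def count_recycles(paths):
    """Recycle criterion: one count per stream with a right-to-left section."""
    return sum(is_recycle(path) for path in paths)


forward = [(1, 1), (5, 1)]
recycle = [(5, 1), (5, 3), (1, 3), (1, 1)]
# Source is left of the destination, but the last section runs right to left.
overshoot = [(1, 1), (1, 3), (7, 3), (7, 5), (5, 5)]
print(count_recycles([forward, recycle, overshoot]))
```

The `overshoot` stream shows the case highlighted in the text: it scores a recycle even though its destination lies to the right of its source.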

3.1.5. Recycles below feed-forward

It is often considered that a flowsheet is easier to follow if the main material flows are directed from left to right across the upper or mid parts of the canvas with any recycles clearly separated and flowing right to left across the lower parts of the canvas. There are a variety of ways in which this type of drawing can be favoured. We have chosen to penalize drawings including recycles which have a source or destination leg below the recycle section. The extent of the penalty is calculated by adding the number of kinks of the type illustrated in Fig. 3; namely a right to left line section immediately preceded by a section directed vertically upwards or followed by a section directed vertically downwards.

Fig. 3. Penalty for high recycles.
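The penalty described above can be sketched by classifying the direction of each section and inspecting the neighbours of every right-to-left section. The sketch assumes streams are lists of (x, y) grid points ordered from source to destination, with y increasing up the canvas; both that convention and the function names are hypothetical.

```python
def direction(p, q):
    """Direction of the section from p to q (assumes y increases upwards)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    if dx > 0:
        return "right"
    if dx < 0:
        return "left"
    return "up" if dy > 0 else "down"


def high_recycle_penalty(path):
    """Count kinks where a right-to-left section is entered from below
    (preceded by an upward section) or left downwards (followed by a
    downward section)."""
    dirs = [direction(p, q) for p, q in zip(path, path[1:])]
    penalty = 0
    for i, d in enumerate(dirs):
        if d != "left":
            continue
        if i > 0 and dirs[i - 1] == "up":
            penalty += 1
        if i + 1 < len(dirs) and dirs[i + 1] == "down":
            penalty += 1
    return penalty


high = [(5, 1), (5, 4), (1, 4), (1, 1)]  # recycle routed above the units
low = [(5, 3), (5, 1), (1, 1), (1, 3)]   # recycle routed below the units
print(high_recycle_penalty(high), high_recycle_penalty(low))
```

A recycle routed over the top of the units collects two penalty points, one for each leg, while the same recycle routed underneath collects none.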

3.1.6. Show main raw material to product flow from left to right

This criterion was not evaluated because the process flowsheet file did not identify feed or product streams, or streams containing significant quantities of feed or product materials. Had the information been given, it would have been easy to increase the recycle penalty for such streams and/or to extend them to the left-hand-side and right-hand-side of the canvas as appropriate. The information is available within synthesis packages (such as PIP II) and could be added at a later release.

3.1.7. Minimize number of closely-spaced lines

Figure 4 shows three streams which are closely spaced on part of the drawing canvas. For clarity, the y-dimension is exaggerated. For simplicity, the canvas is divided into two grids, a coarse grid and a fine grid. Lines which are separated by one or more coarse grid increments are not considered to be closely spaced (compare Fig. 2). Lines which are separated by less than one coarse grid increment are considered to be closely spaced. Thus, in Fig. 4, lines A and C are considered to be closely spaced (as well as lines A and B and lines B and C). Closely spaced lines are then penalized. The motivation for such a penalty is that such closely spaced lines confuse the eye and that it may not be immediately obvious that the source of stream A does not continue to destination B or C. Several close-spacing penalties have been explored, two of which are described here.

The first and simplest is that a crossed-line penalty is incurred whenever one line meets another at right-angles, whether or not the lines cross. According to this criterion, Fig. 4 shows three penalty points, one for line B meeting line A, and two points for line C, one because it meets line A and another because it meets line B. The second is that a penalty is incurred for every unit of distance over which the two lines are closely spaced. On this criterion, the close spacing of lines A and B incurs a penalty of 9 points (x = 10, where A diverges, minus x = 1, where B joins). To discourage multiple close-spaced lines, a higher penalty is incurred where three lines are closely spaced. Thus there is an additional triple-line penalty of 4 points corresponding to the length over which line C is close to lines A and B. It will be noted that the double and triple line scores can be separately weighted and that, whatever the weighting, the total drawing score is independent of the sequence in which the lines are traced across the canvas. Similarly higher weightings can be applied to quadruple, or more, close-spaced lines.
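One possible reading of the second, distance-based penalty is sketched below. It assumes the coarse-grid stage, in which closely spaced streams lie on top of one another, so that "closely spaced" can be detected as two or more streams sharing the same unit-length grid edge. The coordinates are hypothetical, chosen only so that the scores reproduce the 9 and 4 points quoted for Fig. 4; the function names are likewise assumptions.

```python
from collections import Counter


def unit_edges(path):
    """Yield each unit-length grid edge traversed by an axis-parallel
    integer path, in a form that ignores the direction of travel."""
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        if y0 == y1:
            for x in range(min(x0, x1), max(x0, x1)):
                yield ((x, y0), (x + 1, y0))
        else:
            for y in range(min(y0, y1), max(y0, y1)):
                yield ((x0, y), (x0, y + 1))


def close_spacing_scores(paths):
    """Return (double, triple): the lengths over which at least two and
    at least three streams share the same coarse-grid edge."""
    usage = Counter()
    for path in paths:
        usage.update(set(unit_edges(path)))  # each stream counts once per edge
    double = sum(1 for n in usage.values() if n >= 2)
    triple = sum(1 for n in usage.values() if n >= 3)
    return double, triple


line_a = [(0, 0), (10, 0), (10, 3)]          # diverges at x = 10
line_b = [(1, -2), (1, 0), (12, 0)]          # joins at x = 1
line_c = [(5, -3), (5, 0), (9, 0), (9, -3)]  # close to A and B for 4 units
print(close_spacing_scores([line_a, line_b, line_c]))
```

Because the scores are derived from per-edge usage counts, they do not depend on the order in which the streams are traced, in keeping with the property noted in the text. The two scores can then be weighted separately.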

3.1.8. Minimize abutting streams

The situation of concern is illustrated in Fig. 5. Lines (streams) A and B approach a common point at which they turn through right angles. There is perceived to be the possibility that the eye may be deceived and, for example, mistake the destination of stream B as the continuation of stream A. The penalty for abutments is simply obtained by adding the total number of similar abutments on the whole drawing.

3.1.9. Favour units with common parents close together

The situation is illustrated in Fig. 6 which shows an early stage of synthesis (reaction step followed by separation step) elaborated into a reaction step with 3 reactors having a distributed feed and a separation stage with two separators giving a product, by-product and recycle. Using the criteria introduced previously, the separator S2 would be vertically below the reactor R1 (to minimize kinks). There would also be no benefit in making a greater separation between R3 and S1 than between R2 and R3. It may then be difficult to distinguish those units that make up the reaction system from those that make up the separation system. To achieve the desired grouping of units, the total length of lines joining units with common parents is minimized. Thus, in Fig. 6(b), the units FD, R1, R2 and R3 have the common "parent" reaction system, and the units S1 and S2 have the common parent separation system. The "common parent" score is then obtained by taking the total of the lengths of the lines FD to R1, R2 and R3, R1 to R2, and R2 to R3 (reaction system) and the length of S1 to S2 (separation system). In minimizing this common parent score, the units making up the relevant groups are drawn together thus giving a flowsheet with reaction, separation etc. clearly grouped in different sections of the drawing canvas.

Fig. 4. Close line spacing.

Fig. 5. Abutting kinks.

Fig. 6. Units with common parents.
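The common parent score for the Fig. 6(b) example can be sketched as a filtered sum of stream lengths. The unit names follow the text, but the stream lengths are invented for illustration, and the data layout (a list of source, destination, length triples plus a unit-to-parent mapping) is an assumption.

```python
def common_parent_score(streams, parent_of):
    """Total length of streams whose source and destination share a parent.

    streams: iterable of (source, destination, length) triples.
    parent_of: mapping from unit name to the name of its parent subsystem.
    """
    return sum(length for src, dst, length in streams
               if parent_of[src] == parent_of[dst])


parent_of = {
    "FD": "reaction", "R1": "reaction", "R2": "reaction", "R3": "reaction",
    "S1": "separation", "S2": "separation",
}
# Hypothetical line lengths in grid units.
streams = [
    ("FD", "R1", 2), ("FD", "R2", 4), ("FD", "R3", 6),
    ("R1", "R2", 2), ("R2", "R3", 2),
    ("R3", "S1", 3),  # links different parents, so it is not counted
    ("S1", "S2", 2),
]
print(common_parent_score(streams, parent_of))
```

Only the R3 to S1 stream is excluded, so shortening it brings no benefit under this criterion, whereas shortening any stream inside a subsystem does.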

The drawing is then considered to be optimized when

$$\sum_i w_i n_i$$

is minimized, where $n_i$ is the "score" for each of the criteria and $w_i$ is the relative weight of each of the criteria. The weights $w_i$ are adjusted until the drawings meet the user's requirements.
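The weighted sum is simple to compute once the individual scores are available. In the sketch below the criterion names, scores and weights are all hypothetical values chosen for illustration; only the form of the sum comes from the text.

```python
def figure_of_merit(scores, weights):
    """Weighted sum of criterion scores; lower is better."""
    return sum(weights[name] * score for name, score in scores.items())


# Hypothetical weights and scores for two candidate layouts.
weights = {"kinks": 1.0, "excess_length": 0.5, "crossings": 3.0, "recycles": 2.0}
layout_a = {"kinks": 4, "excess_length": 6, "crossings": 1, "recycles": 2}
layout_b = {"kinks": 6, "excess_length": 2, "crossings": 0, "recycles": 2}

best = min((layout_a, layout_b), key=lambda s: figure_of_merit(s, weights))
print(figure_of_merit(layout_a, weights), figure_of_merit(layout_b, weights))
```

With these weights layout B is preferred despite having more kinks, because the crossing it avoids is weighted more heavily. Changing the weights changes the ranking without touching the optimization routine, which is the separation of criterion and algorithm that the paper argues for.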

3.2. Drawing canvas

Drawings are placed on a grid which accommodates both the units and the streams. The initial requirement was for units to be represented as squares. (We have generally referred to the squares as rectangles because the x and y scales could be individually adjusted). The grid is organized on three levels:

1. Unit level. The units can be placed on a grid of m by n. The units are sized such that the length of each side is less than one unit.

2. Stream level. The spacing of the grids is halved, giving a total grid of (2m) by (2n). The original unit level grid forms the odd numbered x and y ordinates. Thus, the units can be placed on the odd grid locations (1 to 2m - 1), (1 to 2n - 1) whilst the streams can be placed on the odd and even grids. Apart from direct connexion to the units, the even grid numbers are preferred because these are guaranteed not to intercept the units without any check for clashes being necessary. (If, at some later date, more complex unit shapes that cross several grid positions are introduced, clash checks for even grid numbers can be added easily).

3. Fine separation. Where several streams follow similar paths, they are not constrained to follow the stream grids as described at (2). Such situations may arise as illustrated in Fig. 4. Any number of finer grids are allowed at a spacing that is determined to ensure that no lines are co-incident. After the initial crude placing of the streams on the stream grid, the "fine grid" is introduced to give the lines maximum separation whilst still meeting the units as required. This fine grid is arranged such that no new kinks are thereby introduced. Thus, a sufficient number will be introduced to maintain straight lines in cases such as that illustrated in Fig. 7(a), rather than minimizing the number of grids that would still avoid line crossing as in Fig. 7(b).

The user has two parameters with which to define the canvas, the Fill Ratio (f) and the Aspect Ratio (A). f is defined by

$$f = \frac{u}{mn}, \tag{1}$$

where u is the number of process units to be displayed. f is thus the ratio between the number of units displayed and the maximum number that could be displayed. Clearly $f \le 1$. Values of f near to 1.0 will give very crowded drawings with streams following tortuous paths to link the closely spaced units. Values of $f \ll 1$ will give spidery drawings with small widely-spaced units linked by long streams.

The Aspect Ratio is defined by the ratio width/height. Note that stream lines can be put on the zero ordinates (0) and the upper limit ordinates (2m or 2n). To avoid having lines at the edge of the display window or sheet of paper on which the drawing is to be printed, a "border" is placed around the drawing. Thus white space is allowed between -1 and 0, for both x and y directions, and between 2m and (2m+1) and 2n and (2n+1). The total drawing width is thus (2m+2) and the height (2n+2), thus:

$$A = \frac{m+1}{n+1}. \tag{2}$$

Fig. 7. Fine grid organization to avoid adding kinks: (a) no kinks; (b) additional kinks.

The aspect ratio is automatically set to correspond to the shape of the window displayed (default 4:3, which the user can adjust with standard windowing tools). The user specifies f directly. Having specified A and f, (1) and (2) are solved for m and n, using the value of u derived from the text file description of the process flowsheet. Thus:

$$m = \frac{\sqrt{(A-1)^2 + 4Au/f} + (A-1)}{2},$$

$$n = \frac{\sqrt{(A-1)^2 + 4Au/f} - (A-1)}{2A}.$$

The resulting values of m and n are rounded to the next integer above.
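The grid dimensions follow directly from these two expressions. The sketch below assumes that "rounded to the next integer above" means the ceiling function, so an exact integer result is left unchanged; the function name and argument checks are illustrative only.

```python
import math


def canvas_size(u, f, aspect=4.0 / 3.0):
    """Return the unit-level grid dimensions (m, n).

    u      : number of process units to display
    f      : fill ratio, 0 < f <= 1
    aspect : aspect ratio A = width / height
    """
    if u < 1 or not (0.0 < f <= 1.0) or aspect <= 0.0:
        raise ValueError("need u >= 1, 0 < f <= 1 and aspect > 0")
    root = math.sqrt((aspect - 1.0) ** 2 + 4.0 * aspect * u / f)
    m = math.ceil((root + (aspect - 1.0)) / 2.0)
    n = math.ceil((root - (aspect - 1.0)) / (2.0 * aspect))
    return m, n


# 20 units at a fill ratio of 0.5 on the default 4:3 canvas.
m, n = canvas_size(20, 0.5)
```

Because both values are rounded up, the grid always offers at least u/f positions, so the realized fill ratio is never above the requested one.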

3.3. Coding the drawings

In contrast to previous drawing algorithms, the stream paths and unit locations are optimized simultaneously. The drawings must therefore be encoded so that both the unit locations and the stream paths are compactly recorded to enable complete drawing specifications to be manipulated and stored.

The coding is in two parts:

1. Source and destination of each stream. The stream source unit and destination unit are recorded together with the sides to which the unit is connected. For example, a stream could be defined as starting from unit A where it is directed vertically upwards (top side connexion) and finishing at unit B where it is directed horizontally to the right (left-hand-side connexion). In fact the algorithm used works equally well if only the orientation of the stream connexions is specified (namely horizontal or vertical). Such restrictions are realistic, e.g. feed connexion to a distillation column should be horizontal (either to lhs or rhs). The current program, however, both allows and enforces specific directions (e.g. to left-hand side). This stream connexion information is stored only once and defines the sequence of processing operations.

2. The location of each unit and the kink points of the joining streams. The unit locations are defined by their (x, y) coordinates. The streams are defined by alternate x and y coordinates of the kink points. Thus, a stream starting horizontally from a unit at $(x_1, y_1)$ to be joined horizontally to a unit at $(x_2, y_2)$ may be coded $(x_3, y_3, x_4)$. The stream kink coordinates are then:

$(x_3, y_1)$, $(x_3, y_3)$, $(x_4, y_3)$ and $(x_4, y_2)$,

as illustrated in Fig. 8.
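The decoding of a stream from its alternating ordinates can be sketched as below. This is a simplified illustration: it ignores the list terminator, the implied paths and the feasibility repairs discussed later, and the function and parameter names are assumptions.

```python
def kink_points(source, destination, code, starts_horizontal=True):
    """Expand a list of alternating ordinates into explicit kink points.

    source, destination : (x, y) positions of the two units
    code                : alternating ordinates, e.g. (x3, y3, x4)
    starts_horizontal   : True if the stream leaves the source horizontally
    """
    x, y = source
    points = []
    # A horizontal section ends at an x ordinate, a vertical one at a y ordinate.
    next_is_x = starts_horizontal
    for ordinate in code:
        if next_is_x:
            x = ordinate
        else:
            y = ordinate
        points.append((x, y))
        next_is_x = not next_is_x
    # Last kink: line up with the destination on the remaining axis.
    x2, y2 = destination
    if next_is_x:
        x = x2
    else:
        y = y2
    points.append((x, y))
    return points


# Source at (1, 1), destination at (9, 5), coded as (x3, y3, x4) = (3, 7, 6).
path = kink_points((1, 1), (9, 5), (3, 7, 6))
```

Altering any single ordinate in `code`, or moving either unit, only stretches or shrinks the sections attached to it, which is why the drawing stays fully connected.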

The benefits of this coding are both that a complete drawing can be coded compactly and that, in the course of optimization, any one parameter or combination of parameters can be altered whilst leaving the drawing fully connected. For example, unit 1, at $(x_1, y_1)$, can be placed anywhere on the drawing canvas when the horizontal line from the unit will form the first section of the stream path until it intercepts ordinate $x_3$. The vertical line from $(x_3, y_1)$ is then joined to point $(x_3, y_3)$ after which the line continues as illustrated. Similarly any of the parameters $x_3$, $y_3$, $x_4$ can be altered with only the lengths of the lines connected to the corresponding ordinate changing.

Fig. 8. Stream path coding.

A further benefit of the coding is that scrolling the drawing becomes a simple operation. After some steps of optimization, the most favourable relative positions of a large proportion of the units and streams will have been obtained. If it is then desired to raise or lower the drawing slightly, or to move it to the right or left, a large number of individual adjustments to the x or y ordinates will be required. Each of the individual movements may destroy a favourable alignment, thus requiring a large number of inferior drawings to be traversed before an equal or more favourable drawing is obtained. Scrolling permits, for example, all the x values to be incremented with the effect of simply moving the drawing to the right. If a unit (or kink coordinate) then falls off the right-hand-side of the canvas, it is simply reintroduced on the left-hand-side. The drawing remains fully connected with the change that some left-to-right streams become right-to-left streams and vice versa. The net effect can be the beneficial elimination of a recycle stream.

The list of x and y ordinates that define the kink-points has to be terminated so that the final connexion to the destination unit can be made. This termination is signalled by a recognizable ordinate value. Inserting or removing this terminator can make a large change in the path taken by one line and requires unambiguous rules for completing the line so that every drawing coding uniquely defines a drawing. We consider below these "implied stream paths". Finally we consider other steps that are taken to ensure that all coded drawings are feasible (e.g. stream lines do not pass through units) and not obviously suboptimal (e.g. streams do not cross themselves).

3.3.1. Implied paths

We describe here only directed connexions between units. Non-directed connexions and connexions from an ordinate to a unit are determined by an obvious sub-set of the rules given here. All that we require is an unambiguous set of rules for defining these implied paths; they do not need to be optimal, since the explicit path alternative can be used to generate the optimal paths. Nevertheless, a series of rules are followed that will often give an optimal path. These are:


1. Lines should have as few kinks as possible.

2. Lines should be as short as possible.

3. Left-to-right kink sections should be as near the top of the canvas as possible and right-to-left sections as near the bottom as possible.

The rules are applied in the order given. Sometimes rule (i) is adequate completely to define the line, sometimes all three rules are required. We consider first the case when the source and destination lines are parallel or directly aligned (i.e. both horizontal or both vertical).

Figure 9(a) illustrates directly aligned connexions. A single line with no kinks (rule i) directly connects source and sink. This direct connexion is made for all directions where alignment is found, namely left-to-right (illustrated), right-to-left, vertically upwards and vertically downwards.

Figure 9(b) illustrates direct but unaligned connexions. We apply rule (i) which determines that a minimum of 2 kinks is required which, in this case, also automatically gives the shortest possible lines, thus satisfying rule (ii). To determine the position of the kink, we apply rule (iii). Thus, in all cases illustrated, the first kink is made as near to the source unit as possible to ensure that the (longer) section directed left to right is as near the top as possible and conversely the (longer) right to left section is as near to the bottom as possible. For the other 4 cases (not illustrated) the first kink is made as far from the source unit as possible, again to take feed-forward sections to the top and recycle sections to the bottom.

Figure 9(c) illustrates an opposed unaligned connexion. A minimum of two kinks is required (rule i) and, to minimize line length, a kink must be applied as close to the destination unit as possible (rule ii). It is not necessary to apply rule (iii) to define the line unambiguously. All eight opposed unaligned cases can be determined in an exactly similar manner without applying rule (iii).

Figure 9(d) illustrates an opposed aligned connexion. Rule (i) gives a minimum of 4 kinks. Rule (ii) requires that the last kink is as close to the destination unit as possible and that the long horizontal section is as close to the two units as possible. Rule (iii) determines that the long horizontal left-to-right section is above the units (zero high-recycle counts) as opposed to below (one high-recycle count). Rule (iii) also favours the first kink being as close to the source unit as possible to maximize the length of feed-forward (left-to-right) sections near the top of the canvas. (This latter application of rule (iii) is not actually reflected in an improvement in the drawing quality score as described previously). In all there are 8 possible opposed aligned connexion cases. Application of the line section rules gives, in all cases, 4 kinks with the first and last being as close to the source and destination units as possible. There is then always one "long section" which (as illustrated in the figure) is above the units when left-to-right. In the other cases, it is below when right-to-left, to the left when upwards and to the right when downwards.

There are two remaining classes to consider, outwardly directed unaligned and outwardly directed aligned connexions. Figure 9(e) illustrates an outwardly directed unaligned connexion. In all similar cases, there are 4 kinks with the middle section between the two units. The only choice is whether the middle section is as close to the first or second unit as possible. This choice is made by application of rule (iii). In the case shown, the left-to-right section is as near the upper unit as possible. Figure 9(f) illustrates an outwardly directed aligned connexion. Apart from the direction of the first connexion, Fig. 9(f) corresponds to Fig. 9(d) and the same rules apply. Indeed, it is apparent that each of the possible outwardly directed aligned connexion cases has a directly corresponding opposed aligned connexion case.

Fig. 9. Implied stream paths, direct or parallel connexions.

We have given a detailed description of application of the stream tracing rules to the case of parallel or coincident connexions. A much briefer description is given for the case of connexions at right angles. There are essentially 3 cases to consider as illustrated in Fig. 10(a) to Fig. 10(c). Figure 10(a) illustrates the simplest case which only requires application of rule (i). There are 8 similar simple cases. Figure 10(b) illustrates the most complex case when all three rules must be applied; rule (iii) determines that the first horizontal section is as long as possible and the last horizontal section is as short as possible. There are, again, 8 similar cases where one of the line sections falls between the units. Figure 10(c) illustrates the case when none of the line sections falls between the units. The line sections are positioned simply by applying rules (i) and (ii). No special consideration needs to be applied to directly aligned units; they are covered either by case (b) or by case (c).

3.3.2. Creating feasible drawings

It will be noted that the drawing coding as described is capable of defining infeasible or obviously sub-optimal drawings. In optimizing functions in which small adjustments to the parameters can generate such obviously sub-optimal outputs, two approaches are possible. Either the infeasible and sub-optimal cases can be given large scores so that they are eliminated by the optimization algorithm or the nearest feasible not trivially sub-optimal case can be generated.

Fig. 10. Implied stream paths, right-angle connexions.

For the current problem some infeasible drawings (with for example a stream intercepting a unit other than the source or destination) can, in other respects, be close to optimal. It is then impossible to devise a scoring scheme that fairly measures the quality of a near-optimal infeasible drawing as compared with a clearly very non-optimal but feasible drawing. It was, therefore, decided to take the approach of generating feasible drawings for all coded descriptions. In the majority of cases, feasibility is only achieved at the expense of additional line length and kinks so that the optimization tends to drive the drawing description to one that does not need such feasibility steps to be applied.

Having taken the decision to render all drawing codings as feasible drawings, the further decision was taken to remove obviously non-optimal features. The rationale for this decision was that, in the same way that infeasible solutions can be generated by small changes to good solutions during iterative optimization, so can obviously highly non-optimal solutions. Eliminating such solutions can give a smoother objective function enabling an easier progress to optimality.

It is easy to constrain optimization routines not to generate solutions with units falling on top of each other. The routine to generate feasible drawings from all drawing codes does not, therefore, need to deal with repositioning units, only with rerouting streams.

Interceptions that arise remote from the direct connexions from source or to destination units are trivial to eliminate. It is only necessary to constrain the streams to follow even numbered ordinates. Our grid has been set up so that all units only fall on odd numbered ordinates. (If, at a later stage, we introduce larger icons that may extend over several x or y ordinates, this strategy will need to be revised).

We are then only concerned with cases when a line directly connecting from the source or to the destination unit crosses another unit. Two classes of cases are considered, either the intercepted unit is a third unit (i.e. not source or destination) or it is the source or destination unit. We defer consideration of the latter cases until we consider obviously sub-optimal drawings.

The principles applied in adjusting drawings showing third-unit infeasibility are:

1. Apply the minimum stream section deviation possible, except when more than 6 sections (5 kinks) would then result.

2. Apply the deviation in the direction of the next parallel section so that no extra stream length will result. Where the next section is coincident with the deviated stream (e.g. source and destination directly aligned) apply the deviation in the direction that will give feed-forward sections nearer the top of the canvas and recycle sections nearer the bottom.

3. Apply the deviation as near to the respective source/destination unit as possible. (The heuristic here is that it is easier to follow a line that goes straight past the interfering unit rather than one that deviates immediately in its vicinity).

Fig. 11. Avoiding third-unit interception.

Fig. 13. Eliminating crossing lines.

The "no more than 5 kink" constraint applied above responds to the observation that no "optimal" drawing shows more than 5 kinks in one line so there is no benefit in setting aside the extra space to record more than 5 kink locations.

Application of these principles is illustrated in Fig. 11; Fig. 11(a) shows the case when no more than 5 kinks would result, Fig. 11(b) when more than 5 kinks would result. In Fig. 11(a) two additional kinks are introduced, all the other kinks are unchanged. In Fig. 11(b) one kink is adjusted.

A second cause of infeasibility is disconnected line sections. Such a disconnected section is illustrated in Fig. 12. The ordinate $x_3$ is not placed to intercept the line section joining the destination unit. The procedure adopted is, starting at the destination unit, to eliminate infeasible ordinates until a feasible stream path is revealed or until an implied path is applicable.

3.3.3. Removing obvious suboptimality

Obvious suboptimalities occur when a stream crosses or becomes coincident with itself. Such suboptimal features are illustrated in Fig. 13 and Fig. 14. As shown in Fig. 13, there can never be benefit in including the extra line sections joining A and B; they can only introduce extra kinks and extra line length, as well as introducing a line crossing. The simplified direct A to B connexion is then substituted. Similarly, in Fig. 14, the direct A to C connexion is always superior to the stream paths in which sections become co-incident.

In eliminating obvious suboptimalities, the principle adopted is that, where possible, no new line sections are introduced.

The case when a line intercepts the source or destination unit is both infeasible and obviously sub-optimal. Such a situation is illustrated in Fig. 15. These infeasibilities are eliminated following the same principles.

4. The optimization algorithm

The coding described in Section 3 allows a compact description of a whole drawing. Thus 1 byte is allowed per parameter (x or y ordinate). With a maximum of 6 line sections per stream, a typical 40 unit drawing will take less than 400 bytes to code. It is thus possible to hold over 1000 drawings in less than half a megabyte of RAM. This coding consequently makes it practicable to store large numbers of flowsheets in the course of drawing optimization.

Fig. 12. Disconnected line sections.

Fig. 14. Eliminating co-incident lines.

Fig. 15. Line intercepting source unit.

9 criteria of optimality are presented in Section 2 but these have not been previously assessed for the aesthetic quality of the drawings they produce. It is difficult to devise a single mathematical optimization procedure that can treat any combination of the optimality criteria. As a first step in drawing optimization it is, however, necessary to evaluate all the criteria before an effective subset can be determined and an efficient optimization algorithm geared to the particular subset of criteria developed. It is well appreciated that the weakest step in the optimization is the selection of a mathematically computable criterion that defines "aesthetic quality". This weakness in the definition of the criterion for optimality is, however, in no way restricted to the drawing layout problem but is common throughout optimization. Process Synthesis itself suffers from the same problem. How do you balance running cost, capital cost, start-up and shut-down cost, controllability, safety, environmental impact and resilience to data and other uncertainties? The weaknesses in these areas have not hindered development of some effective optimization algorithms (both mathematical and heuristic) which are beginning to produce significantly better process designs. We have adopted a similar philosophy in that we will attempt to optimize drawing layout according to the criteria so far proposed. It is only through experience that such criteria can be selected and refined.

A genetic algorithm (Goldberg, 1989) was chosen for two reasons. First, it allows a wide range of drawing quality criteria to be explored with only minor changes to the program. Secondly, in retaining a "population" of drawings it allows the search to explore many alternative local optima, not just a region local to one solution. In this sense, it provides a "global" optimization tool. Genetic algorithms have the further benefit that they are simple to code and quick to develop. Run times are related to the number of options in the search space (i.e. the number of binary digits needed to code a solution, see Simpson and Goldberg, 1994). As discussed in Section 2, however, the search space can be considerably reduced by allowing lines representing streams to fall on top of one another in a first stage of optimization whilst providing a small separation between the lines in a second optimization step. It should be reiterated that this optimization is still significantly more sophisticated than previously published algorithms which place the units first, then position the streams.

We now describe the two stages of optimization in turn.

4.1. The genetic algorithm

The genetic algorithm is initialized by generating "p" drawings randomly, moving any coincident units to ensure feasibility. It then undertakes "g" iterations (or generations). At each iteration it attempts to generate a better population of solutions. It does so by generating a number of new drawings to give an expanded population. The objective function or "score" ($\sum_i w_i n_i$) is computed for each drawing in the population. This larger population is then culled to leave a new population consisting of the "p" best drawings (i.e. the drawings with the lowest scores) from the expanded population. As an enhancement to this basic procedure, it has been found advantageous to reject randomly a proportion of old drawings having scores (objective function values) equal to those of new drawings. This rejection mechanism avoids an accumulation of identical drawings within the population. It has the undesirable side effect of rejecting some non-identical drawings which happen to have equal scores but the net benefit from rejecting identical drawings outweighs this disadvantage.
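The culling step, including the random rejection of old drawings whose scores equal those of new drawings, might look like the sketch below. The rejection proportion is not stated in the text, so `reject_prob` is a hypothetical parameter, and drawings are represented simply as (score, drawing) pairs.

```python
import random


def cull(old, new, p, reject_prob=0.5, rng=random):
    """Return the p lowest-scoring drawings from the expanded population.

    old, new    : lists of (score, drawing) pairs
    p           : population size to retain
    reject_prob : assumed chance of dropping an old drawing whose score
                  equals the score of some new drawing
    """
    new_scores = {score for score, _ in new}
    kept_old = [
        (score, drawing)
        for score, drawing in old
        if not (score in new_scores and rng.random() < reject_prob)
    ]
    expanded = kept_old + new
    expanded.sort(key=lambda item: item[0])  # lowest score is best
    return expanded[:p]


old = [(5.0, "a"), (3.0, "b")]
new = [(3.0, "c"), (4.0, "d")]
survivors = cull(old, new, p=2, reject_prob=1.0)
```

With `reject_prob=1.0` the old drawing "b" is always dropped because the new drawing "c" has the same score, which is the duplicate-suppression effect the paragraph describes.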

The new drawings are produced by two mechanisms, "crossover" and "mutation". "Mutations" are equally divided between two submechanisms, "line simplification" and "scrolling". All three mechanisms require random selection of drawings from the base population. The selection algorithm will be described, after which each mechanism will be described in turn.

4.1.1. Drawing selection

The drawings are selected from the population of p solutions using a uniform probability distribution. Thus each drawing is equally likely to be selected. The selection is then checked against the following criterion:

$$r = \frac{B-1}{B}\,\frac{N - N_0}{N_m - N_0},$$

where B is the selection bias, N is the drawing score, $N_0$ is the score of the best drawing in the population and $N_m$ is the score of the worst drawing in the population. r is compared to a random number in the range 0 to 1.0. If the random number is less than r the drawing is rejected and another drawing selected from the population.

It is thus seen that, if B = 1, no drawing is rejected and there is no selection bias in favour of better (or fitter) drawings. Similarly, no drawing for which $N = N_0$ is ever rejected. The probability of rejecting the least fit drawing ($N = N_m$) is simply (B - 1)/B. Thus, if the bias, B, is 2.0, 50% of the least fit drawings will be rejected, giving a 2:1 bias in favour of the fittest. Similarly, if B is 4.0, 75% of the least fit drawings will be rejected, giving a bias of 4:1. Drawings of intermediate fitness are accepted with intermediate bias. This mechanism maintains a uniform "selective pressure" in favour of better drawings that does not become less discriminating as the optimum approaches and the range of drawing scores in the retained population becomes less. For the special case $N_0 = N_m$, we set r = 0.
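A minimal sketch of this biased selection is given below. The function names and the (score, drawing) representation are assumptions made for the example; the rejection rule itself follows the criterion for r given above.

```python
import random


def rejection_probability(score, best, worst, bias):
    """r = ((B - 1) / B) * (N - N0) / (Nm - N0), with r = 0 when N0 == Nm."""
    if worst == best:
        return 0.0
    return (bias - 1.0) / bias * (score - best) / (worst - best)


def select(population, bias, rng=random):
    """Pick one (score, drawing) pair, biased towards low scores."""
    scores = [score for score, _ in population]
    best, worst = min(scores), max(scores)
    while True:
        score, drawing = rng.choice(population)
        # Reject when the random number is less than r, then draw again.
        if rng.random() >= rejection_probability(score, best, worst, bias):
            return score, drawing


population = [(2.0, "best"), (6.0, "middle"), (10.0, "worst")]
chosen = select(population, bias=2.0, rng=random.Random(0))
```

The loop always terminates because the best drawing has r = 0 and is therefore accepted whenever it is drawn.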

4.1.2. Crossover

Two drawings are selected at random. One is called the "father" drawing, the other the "mother". A random line is drawn across the father drawing either vertically or horizontally. It is checked that there are units both sides of the line, otherwise another line is drawn. We then generate two "offspring" drawings. The first offspring has all the father units above, or to the right of the line on the father drawing. The second offspring has all the units below, or to the left of the line on the father drawing. The units are positioned as on the father drawing. The remaining units on each drawing are taken from the mother drawing and placed in the positions taken on the mother drawing. If a mother unit falls in a position already taken by a unit on the offspring drawing, it is moved to the mother-drawing position of the unit on which it falls. (This move is repeated, if necessary. Note the repetition is guaranteed to terminate.) The paths taken by the lines representing streams are defined by the list of kink co-ordinates on the parent drawing from which the stream source unit is taken. It will be noted that each crossover operation produces two new drawings. To avoid accumulation of identical drawings, any mother drawing that would be the same as the father drawing is rejected and a new mother selected.
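The unit-placement part of crossover, for a vertical cut line, can be sketched as follows. The sketch assumes that both parents contain the same units, that no two units share a position within a parent, and that the cut lies on an even ordinate so that no unit sits on it. Stream paths, which are inherited from the parent supplying the source unit, are left out. All names are illustrative.

```python
def _offspring(father, mother, inherited):
    """Place the inherited units as on the father, the rest as on the mother."""
    child = {unit: father[unit] for unit in inherited}
    occupant = {pos: unit for unit, pos in child.items()}
    for unit in sorted(set(mother) - set(inherited)):
        pos = mother[unit]
        # On a clash, move to the mother position of the unit already there.
        while pos in occupant:
            pos = mother[occupant[pos]]
        child[unit] = pos
        occupant[pos] = unit
    return child


def crossover_units(father, mother, cut_x):
    """Return two offspring unit layouts for a vertical cut at x = cut_x.

    father, mother : dict unit name -> (x, y) position
    """
    right = {unit for unit, (x, _) in father.items() if x > cut_x}
    left = set(father) - right
    return _offspring(father, mother, right), _offspring(father, mother, left)


father = {"A": (1, 1), "B": (3, 1), "C": (5, 1)}
mother = {"A": (5, 3), "B": (1, 1), "C": (3, 1)}
first, second = crossover_units(father, mother, cut_x=2)
```

In the second offspring, B cannot take its mother position (1, 1) because A is already there, so it moves to A's mother position (5, 3). The displacement loop cannot cycle: mother positions are distinct, so returning to the starting position would require the unit being placed to be an occupant already.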

4.1.3. Line simplification

A drawing is selected at random. Within that drawing, a stream is selected at random. If the stream has more than one kink and, by moving either its source or destination unit, could be made straight or made to have only one kink, the selection is confirmed. Otherwise the next line (from a list of all the lines) is selected. If it is found that no line on the drawing can be further simplified, a new drawing is selected. Having selected a line, either its source or its destination unit is selected with 50/50 probability. That unit is then moved (changing only one coordinate if possible, otherwise placing it as close as possible to the other unit). The position is selected to eliminate all kinks or, where the source and destination are at right angles, to reduce the number of kinks in the line to one. If this movement would take the unit outside the drawing canvas, both units are moved to bring the unit just inside the canvas. If the moved unit falls on another, it is moved to the nearest free location. This mutation tends to improve the drawing score because it often increases the number of simple direct connexions.

4.1.4. Scrolling

A drawing is selected at random. A line is drawn across that drawing at random, either horizontally or vertically. The drawing is then scrolled to take the line either to the top or to the right-hand-side of the canvas. This mutation has a more global effect than either crossover or line simplification. It can, on some applications, eliminate recycle lines or take recycles from the top to the bottom of the canvas. As explained in drawing coding (Section 3), scrolling is simple to achieve, gives drawings that visually appear "mutated" (i.e. most units and streams retain their relative positions) and feasibility is retained without taking any line off one side of the canvas and bringing it in to the other.
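Scrolling reduces to adding an offset to every ordinate on one axis and wrapping around. The sketch below is one possible reading: the paper does not state the wrap-around period, so the choice of 2m (which makes the ordinates 0 and 2m equivalent) and the restriction to even shifts are assumptions.

```python
def scroll(ordinates, shift, period):
    """Shift every ordinate on one axis, wrapping around the canvas.

    ordinates : unit and kink ordinates along one axis
    shift     : scroll distance in stream-grid steps
    period    : assumed wrap-around length of the axis (e.g. 2 * m)
    """
    return [(value + shift) % period for value in ordinates]


# m = 4, so the assumed period is 8; unit x ordinates sit on odd positions.
unit_x = [1, 5, 7]
scrolled = scroll(unit_x, shift=2, period=8)
```

With an even period and an even shift, odd ordinates stay odd, so units remain on the unit-level grid after scrolling.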

4.1.5. Population size, and number of iterations

Simpson and Goldberg (1994) give an indication of the population size that may be necessary to achieve a reasonable guarantee that optimality can be achieved. Omitting a small constant, their formula can be written:

$$p = cb, \tag{3}$$

where b is the number of binary digits needed to code one complete solution and c is a constant which they put in the range 4 to 40.

With the coarse drawing coding that we employ, it takes on average 5 binary digits to code the coordinates of each unit and 3.75 binary digits to code the kinks in all the streams originating at each unit. Thus it takes about 8.75 binary digits per unit to code each process flowsheet. On this basis it would be expected that the appropriate population size would be given by substituting b = 8.75u into (3), thus:

$$p = ku, \tag{4}$$

where there are u units and k is in the range 35 to 350.

There is rarely any absolute criterion for terminating a genetic algorithm optimization. Thus it is typically found that, after a number of iterations, the algorithm stagnates, after which no further improvement in the best member of the population is achieved. Stagnation eventually sets in despite the "mutations" designed to refresh the "genetic diversity" of the pool of solutions. The best solution at the point of stagnation may, or may not, be a global optimum. The chances of achieving an optimum depend on the size of the initial population: the larger the population, the greater the assurance of optimality. Solutions prior to the stagnation point are clearly non-optimal. It is the objective of genetic algorithm design to determine a population size just large enough to give a good assurance of optimality and to determine a minimum number of iterations which will give a reasonable assurance that the stagnation point has been passed. Simpson and Goldberg (1994) found that the number of generations to stagnation increases with population size. Correlating the limited data in their Table 3, we get

$$g/p = 1.5\,p^{0.25},$$

where g is the number of generations to stagnation. With a method of selecting solutions that maintains "selective pressure" (the tournament method) they achieve faster convergence whereby the number of generations to stagnation can be reduced by a factor of roughly 3. Our procedure also maintains selective pressure, so that we would expect the number of generations to stagnation to be reduced to:

$$g/p = 0.5\,p^{0.25}. \tag{5}$$

These correlations are used as a basis for our investigation into the genetic algorithm parameters most appropriate to our drawing layout optimization.

It may also be noted that, at each generation, Simpson and Goldberg (1994) developed a number of new solutions equal to the population size; 2% of their new solutions were generated by mutation and 98% by crossover.
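Relations (4) and (5) give a rough starting point for the algorithm parameters, which can be tabulated with a few lines of code. The function names are illustrative, and the values are only as reliable as the correlations they come from.

```python
def population_bounds(units, bits_per_unit=8.75, c_range=(4, 40)):
    """Population range p = c * b with b = bits_per_unit * units (Eqs. 3 and 4)."""
    b = bits_per_unit * units
    return c_range[0] * b, c_range[1] * b


def generations_to_stagnation(population, coefficient=0.5):
    """Estimate g from g / p = coefficient * p ** 0.25 (Eq. 5)."""
    return coefficient * population ** 1.25


low, high = population_bounds(20)           # a 20-unit flowsheet
g_low = generations_to_stagnation(low)      # generations at the smaller population
```

For a 20-unit flowsheet the population range works out at 700 to 7000 drawings, which shows how quickly the suggested effort grows with drawing size.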

4.2. Close line separation

The close line separation routine is applied to the "optimal" drawing generated by the genetic algorithm. It starts initially at the connexions directly to and from the units. It divides these connexions into 6 groups. For connexions to a top edge, these groups consist of:

1. Streams which turn to the left,

2. Process input streams,

3. Direct, straight-line connexions to an adjacent unit,

4. Direct, straight-line connexions from an adjacent unit,

5. Process output streams,

6. Streams which turn to the right,

where the groups are numbered from the left-hand to the right-hand ends of the edge. The grouping is illustrated in Fig. 16. For other edges the allocation of streams to groups 2 and 5 is made to ensure that process inputs are nearer to the top and the left-hand-side of edges and outputs nearer to the bottom and right-hand-side of each edge. Similarly, the allocation of streams to groups 3 and 4 is made such that left to right streams are higher than right to left streams and such that upward directed streams are to the left of downward directed streams. The sequence of streams within groups (2) to (5) is determined on an arbitrary basis, corresponding simply to any identifying numbers on the input file.

The sequence of streams in groups (1) and (6) is selected to minimize crossovers. Thus, the streams are sequenced by line section length, with the shortest nearest the outside edge and the longest nearest the middle. For line sections of equal length, the algorithm is applied recursively; see Fig. 17. This algorithm ensures that no lines from a common edge will cross unless they already cross on the coarse grid of the genetic algorithm. If necessary, blank spaces are introduced between the groups to ensure that lines from directly opposed unit faces do not coincide and to ensure that crossovers are minimized or eliminated.

Once the sequence of lines is established, a separation δ between the lines is calculated. δ is computed to ensure that, for the edge with most streams connected, the streams are uniformly spaced along the edge.
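The ordering rules above can be illustrated with a minimal Python sketch for a single top edge. It is not the PIPDRAW implementation: the `Stream` class, its `sections` field (the successive line-section lengths of a stream), and the use of lexicographic comparison of section lengths to stand in for the recursive tie-break are all assumptions made for illustration. The spacing formula in `separation` is likewise one possible reading of "uniformly spaced", leaving a margin of δ at each end of the edge.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Group numbers for a top edge, numbered left to right as in the text.
TURN_LEFT, PROCESS_IN, DIRECT_TO, DIRECT_FROM, PROCESS_OUT, TURN_RIGHT = range(1, 7)


@dataclass
class Stream:
    ident: int                       # identifying number from the input file
    group: int                       # one of the six groups above
    sections: Tuple[int, ...] = ()   # hypothetical: successive line-section lengths


def order_on_top_edge(streams: List[Stream]) -> List[Stream]:
    """Return the streams in left-to-right order along a top edge."""
    def key(s: Stream):
        if s.group == TURN_LEFT:
            # Shortest section nearest the outside (left-hand) end of the edge.
            return (s.group, s.sections, s.ident)
        if s.group == TURN_RIGHT:
            # Shortest section nearest the outside (right-hand) end,
            # so the longest comes first when reading left to right.
            return (s.group, tuple(-x for x in s.sections), s.ident)
        # Groups 2 to 5: arbitrary order, here simply by identifying number.
        return (s.group, (), s.ident)
    return sorted(streams, key=key)


def separation(edge_length: float, max_streams: int) -> float:
    """Assumed spacing rule: equal gaps, including one gap at each end."""
    return edge_length / (max_streams + 1)


streams = [
    Stream(1, TURN_RIGHT, (2,)),
    Stream(2, TURN_LEFT, (5,)),
    Stream(3, PROCESS_IN),
    Stream(4, TURN_LEFT, (2,)),
    Stream(5, TURN_RIGHT, (6,)),
    Stream(6, DIRECT_TO),
]
print([s.ident for s in order_on_top_edge(streams)])
print(separation(8.0, 3))
```

In this sketch the left-turning streams are placed shortest first and the right-turning streams longest first, so that in both cases the shortest section lies nearest the outside of the edge.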

Fig. 17. Sequence to minimize crossovers: (a) distinct kink point; (b) recursive application for common kink point.

Whilst the above algorithm establishes the sequence of lines from any given edge, it does not establish their position relative to lines from other units and edges. For even-numbered ordinates (i.e. for line sections not directly connected to or from an edge), all line sections are taken together as if connected to or from a pseudo unit at the bottom or left-hand-side of the canvas. The recursive algorithm is then applied to this pseudo-unit to establish the relative positions of stream sections originating from different units or edges (see Fig. 18). Finally, where possible without introducing additional line crossings or coincidences of lines, line sections are made to share the same offset. This extension of the algorithm is illustrated in Fig. 19.

Note that we describe here only the case in which the sequence of connexions to each edge of a unit is not predefined. This situation applies in the early stages of synthesis where the units are representable only as boxes; it also applies to simple mixer units. In elaborating drawings of finally synthesized processes, in which realistic icons representing the individual units would be appropriate, the connexion sequence may be predefined. The algorithm would be appropriately simplified for such predefined sequences.

Fig. 16. Sequence for fine line separation.

Fig. 18. Sequence for streams not directly connected to edges (pseudo-unit at B).

Fig. 19. Sharing offsets.

4.3. Computational experience

The effect of each of the "tuning parameters" within the genetic algorithm was explored, namely, the bias (B), the population size (p), the number of crossovers per generation (C), the number of mutations per generation (M), and the number of generations (g). For all investigations, the initial population of "p" drawings was generated by selecting all unit positions and stream paths randomly.

The performance was found to be relatively insensitive to B over the range 1.0 to 5.0, which influenced neither the number of iterations to stagnation nor the quality of solution at stagnation. With any pseudo-random procedure, there is a statistical variation in the performance from run to run. There were indications that B=2.0 was a good choice but the variation of performance with B was not statistically significant.

The other parameters were optimized as follows:

$p/u = 16,$

$C/u = 4,$

$M/u = 8,$

$g/u = 8.$

The value of p/u found is less than the 35 to 350 that would be expected from (4). It was, nevertheless, found that whilst p/u = 16 gave significantly superior performance to p/u = 10, in terms of quality of drawing produced at stagnation, increasing p/u to 20 rarely gave further improvement.

The ratio M/C is much larger than recommended by Simpson and Goldberg (1994). Extensive tests (tens of thousands of drawings evaluated) nevertheless showed that a high proportion of mutations was essential to achieve good quality drawing layouts in short computer runs. The reason for this high proportion was that it required a very large initial population and large numbers of random movements to produce the simple direct connexions characteristic of an intuitively meaningful drawing. The mutations produce such simple connexions at each application. As such, it was observed that the mutations are particularly effective in producing improved drawings early in the iterations when the random initial population is easily improved. The effectiveness of mutations is relatively insensitive to the proportions of "line simplification" to "scrolling" operations. Taking both together is, however, significantly better than taking either form of mutation on its own. The decision was made to take equal numbers of "line simplification" and "scrolling" operations.

The number of generations (g = 8u) is much less than would be expected by Simpson and Goldberg (1994). Thus, if we put optimistically p=35u, we would find that, even for very small values of "u", g-. lgu would be expected. The reference also indicates that (g/u) should increase slowly with "u". Our finding is that, for larger drawings, e.g. > 8 units, stagnation has not been reached in 8u generations and larger values, such as 16u, are beneficial. Indeed, for larger numbers of units, it is beneficial to increase p, C, M and g beyond the values indicated above. Insufficient tests have been done to establish a reliable predictive formula. Indeed, for larger drawings, run times become so long that individual experimentation is recommended. The relationship between problem size and run times is discussed below.
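As an illustration of how the recommended ratios translate into settings for a given drawing, the following Python sketch derives the tuning parameters from the number of units u. The class and function names are hypothetical and are not taken from PIPDRAW; the `generations_per_unit` argument is an assumed convenience for trying the larger value of 16u suggested above for bigger drawings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GAParameters:
    population: int            # p
    crossovers: int            # C, per generation
    mutations: int             # M, per generation
    line_simplifications: int  # half of the mutations
    scrolls: int               # the other half of the mutations
    generations: int           # g
    bias: float                # B


def recommended_parameters(units: int, generations_per_unit: int = 8) -> GAParameters:
    """Apply the ratios p/u = 16, C/u = 4, M/u = 8, g/u = 8 reported in the text."""
    if units < 1:
        raise ValueError("a drawing needs at least one unit")
    mutations = 8 * units
    return GAParameters(
        population=16 * units,
        crossovers=4 * units,
        mutations=mutations,
        # Equal numbers of the two mutation operators, as decided in the text.
        line_simplifications=mutations // 2,
        scrolls=mutations - mutations // 2,
        generations=generations_per_unit * units,
        bias=2.0,  # B = 2.0 was indicated as a good, though not critical, choice
    )


print(recommended_parameters(5))
print(recommended_parameters(12, generations_per_unit=16).generations)
```

For drawings of more than about 8 units the text indicates that these values are a starting point only, and that p, C and M may also need to be increased.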

The time required to compute an "optimal" drawing is given by:

$t = F\,d\,g,$

where F is the time to evaluate the quality of one drawing, g is the number of generations and d is the number of new drawings produced per generation. "F" is proportional to the number of streams to be traced and the average number of calculations that need to be made per stream. Empirically, it is found that

$F \propto u^{1.5}.$

According to our recommended ratios, we have

$d = 16u$

and

$g = 8u,$

giving

$t \propto u^{3.5}.$

In practice, it has already been noted that at least g should be increased more than linearly with "u". For equivalent quality drawings it would, therefore, be expected that t increases more rapidly with problem size than the 3.5 power; the 4th power is a reasonable estimate. It follows that, whilst the algorithm is very effective for small problems (typically a few seconds for a 5 unit problem, on a 50 MHz 486 PC) run times extend to many minutes, or hours, when "u" exceeds 20 units. Run times are proportionately faster on more powerful machines but are still unacceptable for large problems. For example, an overnight run on a UNIX workstation failed to fully optimize a 40 unit problem.
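The scaling argument can be made concrete with a short Python sketch. It uses only the relations stated above (t = F d g with F proportional to the 1.5 power of u, d = 16u and g = 8u); the function names, the unit proportionality constant and the choice of a 5 unit drawing as the reference are assumptions for illustration, and no absolute timings are implied.

```python
def drawings_evaluated(units: int) -> int:
    """Total drawings scored in a run: d * g with d = 16u and g = 8u."""
    return (16 * units) * (8 * units)


def run_time(units: int, f_coefficient: float = 1.0) -> float:
    """t = F * d * g, with F = f_coefficient * u**1.5 (arbitrary time units)."""
    return f_coefficient * units ** 1.5 * drawings_evaluated(units)


def relative_run_time(units: int, reference_units: int = 5, exponent: float = 3.5) -> float:
    """Run time relative to a reference drawing, for a given power law."""
    return (units / reference_units) ** exponent


# Growth from a 5 unit to a 20 unit drawing under the two estimates in the text.
print(relative_run_time(20))                # 3.5 power
print(relative_run_time(20, exponent=4.0))  # 4th power
print(run_time(20) / run_time(5))           # same ratio, from t = F d g directly
```

Under these assumptions a 20 unit drawing costs between about 128 and 256 times as much as a 5 unit drawing, which is consistent with run times of a few seconds growing to many minutes.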

5. Results

Preliminary tests were done to evaluate the impact of drawing criteria on the drawings produced. These tests revealed that the close line spacing criterion did not give beneficial effects. Thus, for example, drawings such as those illustrated in Fig. 20 were produced. The forced separation of lines which should naturally follow similar paths was considered to be detrimental to perceived drawing quality. The criterion was never found to produce clearer drawings than were obtained by simply leaving line separation to the second stage fine separation algorithm. The tests further failed to show any benefit in applying the abutment-avoidance criterion. Such abutments occur very rarely and were not perceived to be visually confusing after separation was achieved by the fine separation algorithm. These two criteria were, therefore, omitted from further consideration, leaving the six criteria shown in Table 1.

Fig. 20. Avoiding close line separation.

Typical drawings produced by the program are shown in Figs 21, 22 and 23. It will be noted that the placing of captions on the drawings has not yet been optimized to avoid overwriting.

Table 1. Weighting w_j of drawing fitness parameters

Parameter                    Fig. 21   Fig. 22   Fig. 23
Fill ratio                   0.5       0.4       0.4
Number of kinks              4         4         3
Number of crossings          3         10        3
Excess line length           1         1         1
Number of recycles           1         1         1
"High recycle" markers       1         1         10
Common parent line length    0         0         0

Figures 21, 22 and 23 are of the same process. The differences between the figures are caused by the different weightings for the drawing quality function. The weightings are summarized in Table 1.

Figure 21 shows the drawing produced with weights selected to give a balance between the criteria. Increasing the kink weighting indefinitely produces essentially the same drawing within the aspect ratio used; the optimization has achieved the minimum of 9 kinks. Decreasing the fill ratio to 0.4 and increasing the weighting for line crossing gives rise to Fig. 22. Figure 22 still shows only 9 kinks and has succeeded in eliminating line crossings. A criticism of Fig. 22 may be that the long high recycle linking RS-1 with SS-1 obscures the general left to right flow of information. Increasing the "high recycle" penalty produces a drawing with no high recycles and no line crossings, as illustrated in Fig. 23, although we have paid the penalty of 2 extra kinks. Note that stream NH3 Recycle (6) does not follow the default path; it has been optimized to eliminate line crossing and high recycles.

The above illustrations all use a very small drawing so that the features can be illustrated compactly. They suffice to show that the various drawing features can be controlled by adjusting the weightings of the subsidiary criteria of drawing quality to give drawings to the taste of the user.
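The effect of the weightings can be sketched in Python as a weighted sum of penalties, using the weights of Table 1. The sketch assumes that a lower total is better and that the fill ratio is a separate algorithm setting rather than a term in the sum; the feature names and the two candidate layouts `layout_a` and `layout_b` are hypothetical and are not measurements of Figs 21 to 23.

```python
# Weights from Table 1 (fill ratio omitted: assumed not to be a term in the sum).
WEIGHTS = {
    "Fig. 21": {"kinks": 4, "crossings": 3, "excess_length": 1,
                "recycles": 1, "high_recycles": 1, "common_parent_length": 0},
    "Fig. 22": {"kinks": 4, "crossings": 10, "excess_length": 1,
                "recycles": 1, "high_recycles": 1, "common_parent_length": 0},
    "Fig. 23": {"kinks": 3, "crossings": 3, "excess_length": 1,
                "recycles": 1, "high_recycles": 10, "common_parent_length": 0},
}


def penalty(features: dict, weights: dict) -> float:
    """Weighted sum of the drawing features; missing features count as zero."""
    return sum(w * features.get(name, 0) for name, w in weights.items())


def preferred(layouts: dict, weights: dict) -> str:
    """Name of the layout with the lowest weighted penalty."""
    return min(layouts, key=lambda name: penalty(layouts[name], weights))


# Hypothetical candidates: A has fewer kinks, B avoids crossings and high recycles.
layout_a = {"kinks": 9, "crossings": 1, "recycles": 1, "high_recycles": 1}
layout_b = {"kinks": 11, "crossings": 0, "recycles": 1, "high_recycles": 0}
layouts = {"A": layout_a, "B": layout_b}

for figure, weights in WEIGHTS.items():
    print(figure, preferred(layouts, weights))
```

With these illustrative numbers the balanced weights favour the layout with fewer kinks, whereas the heavier crossing or high recycle penalties favour the layout that accepts 2 extra kinks to remove them, mirroring the trade-off described for Fig. 23.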

The simple flowsheet used to generate Figs 21, 22 and 23 is not adequate to show the grouping effect of the "common parent" parameter. This effect is illustrated in Figs 24, 25 and 26. Figure 24 shows a flowsheet with a parent unit RS-1. This is expanded, in Fig. 25, with a common parent weight of 0.0. In Fig. 26 the expansion is made with a weight of 10.0. It draws the sub-units making up RS-1 together at the expense of introducing a small excess line length.

Fig. 21. Process drawing with default weightings.

Fig. 22. Increased penalty for line crossings.

Fig. 23. Increased penalty for high recycles.

Fig. 24. Process with SS-1 expanded.

Fig. 25. SS-1 and RS-1 expanded.

6. Programming details

The genetic algorithm and the graphical user interface (GUI) were programmed in C++, using the XVT portable toolkit for the GUI. The source code for the system (PIPDRAW) has been re-compiled without change and runs on a range of platforms including Microsoft Windows 3.1, DEC Ultrix/Motif, DEC Alpha/Motif, Sun SunOS/Motif and IBM AIX/Motif. A screen shot of the Windows version is shown in Fig. 27.

The users select a PIP II file and it is parsed by PIPDRAW to produce a drawing showing the top level sub-system. This sub-system can be expanded into a drawing in a new window by clicking on it. This process can be repeated until all the sub-systems have been fully expanded. Facilities for changing font, printing, zooming in and out and panning diagrams etc. are provided. Users can interrogate sub-systems and streams to display any information about them that was recorded in the PIP II file.

The criterion used to score drawings and the behaviour of the genetic algorithm can be easily modified through the dialogue box seen in Fig. 28.

7. Conclusions

It has been shown that:

1. It is possible to separate the criteria for drawing quality from the algorithm used to "optimize" the drawing layout. Indeed, only by making such a separation is it possible to talk of optimization of drawing layout.

2. The factors that go to make up a good drawing (e.g. minimum line crossings) can be defined in mathematically computable terms. A weighted sum of these factors enables the criteria to be balanced in a meaningful way that is reflected in the drawings produced.

3. There are benefits in optimizing stream paths (edges) and unit (node) positions simultaneously. The positions and paths must be considered together to meet the quality criteria proposed.

4. We believe that fine-tuning (e.g. seeding the initial population with heuristically good drawings) has scope for increasing the speed by up to a factor of ten. The genetic algorithm will, nevertheless, still be too slow for large drawings. It has, however, been valuable in providing an easy way of exploring a wide range of optimization criteria and is adequate for optimizing small to moderate sized drawings (for example 10 units and 15 streams).

5. Some of the most difficult criteria to compute (e.g. close line spacing, kink abutment) were found to have no beneficial effect on drawing quality. Eliminating these criteria leaves an optimization function for which bounds are readily computable and which would be susceptible to mathematical optimization.

6. A genetic algorithm is recommended for other problems with both integer and real variables, many local optima, and optimization criteria that have not previously been well defined or that are difficult to put in a form for mathematical optimization.

7. Now that readily computable drawing layout criteria have been established, future work on optimizing drawing layout should concentrate on developing efficient mathematical optimization methods which have the potential to be orders of magnitude faster. It is believed that good bounding estimates can now be made and that a branch and bound approach [for example, as used by Johns and Müller (1976) to minimize recycles in process flowsheets] could be adapted to the drawing layout problem.

8. There is benefit in developing the present work to consider more realistic icons to represent process units and their alternative orientations on the drawing canvas. The focus of this work has, however, been on rapidly displaying drawings for which only the encoded connectivity is available (e.g. numerically or alphanumerically); it is not intended to compete with the many excellent interactive CAD packages available. Indeed, we have allowed no interactive access to our drawings (other than resize and recompute facilities) and would recommend that, for artistic fine tuning, the facility is developed to transfer drawings produced by our package to an established CAD flowsheeting package. Experience shows that it is much easier manually to improve a reasonable drawing produced automatically than to generate a good drawing from scratch.

Fig. 27. The PIPDRAW user interface. (Labelled elements: menu bar, magnification, drawing window, iconized drawing window, expandable sub-system, non-expandable sub-system, stream index, design information, scroll bar, thumb bar.)

Fig. 28. The tuning dialog box. (Algorithm fields: fill ratio, random seed, elitism ratio, population size, crossovers, mutations, generations. Scoring fields: kinks, line crossings, excess line length, recycles, high recycles, group siblings.)

Acknowledgements

We thank E. I. DuPont de Nemours of Wilmington, Delaware, U.S.A., for initiating the project and for their financial support and continuing encouragement. We also thank Professors J. M. Douglas and M. F. Malone, and Dr E. Korovessi and members of the PIP team at the University of Massachusetts for their advice and assistance.

References

Douglas J. M. (1987) Conceptual Design of Chemical Processes. McGraw-Hill, New York.

Freeman, H. and Ahn, J. (1987) On the problem of placing names in a geographic map. Int. J. Pattern Recognition and Artificial Intelligence 1, 121-140.

Gassner, E. R., Koutsofios, E., North, S. C. and Vo, K.-P. (1993) A technique for drawing directed graphs. IEEE Transactions on Software Engineering 19, 214-230.

Goldberg D. E. (1989) Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley.

Johns W. R. and F. R. Müller (1976) Optimal sequence for computer flowsheeting calculations. Technical Report, Systems Engineering Group, Technisch-Chemisches Labor, Eidgenössische Technische Hochschule, Zürich.

Kirkwood, R. L., Locke, M. H. and Douglas, J. M. (1988) A prototype expert system for synthesizing chemical process flowsheets. Comput. Chem. Engng. 12(4), 329-343.

Korovessi E. (1995) Conceptual design of multistep reaction processes. PhD dissertation. University of Massachusetts, Amherst, U.S.A.

Nummenmaa, J. (1992) Constructing compact rectilinear planar layouts using canonical representation of planar graphs. Theoretical Computer Science 99, 213-230.

PROCEDE 2.1 (1995). Cherwell Scientific Publishing, Oxford.

Protsko, L. B., Sorenson, P. G. and Trembley, J. P. (1989) Mondrian: system for automatic generation of dataflow diagrams. Information and Software Tech. 31, 456-471.

Protsko, L. B., Sorenson, P. G., Trembley, J. P. and Schaefer, D. A. (1991) Towards the automatic generation of software diagrams. IEEE Transactions on Software Engineering 17, 10-21.

Simpson A. R. and D. E. Goldberg (1994) Pipeline optimization via genetic algorithms from theory to practice. 2nd International Conference on Water Pipeline Systems (ed. D. S. Miller), pp. 309-320. Mechanical Engineering Publications Ltd, Bury St Edmunds, U.K.

Tamassia, R., Tollis I. G. and Vitter, J. S. (1991) Lower bounds for planar orthogonal drawings of graphs. Information Processing Letters 39, 35-40.

Tani, K., Tsukiyama, S., Shinoda, S. and Shirakawa, I. (1991) On area-efficient drawings of rectangular duals for VLSI floor-plan. Mathematical Programming 53, 20-43.