![Page 1: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/1.jpg)
PRACTICALITIES
CITS4404 Artificial Intelligence & Adaptive Systems
![Page 2: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/2.jpg)
Issues in applying CI techniques
• Global optimisation, some definitions
• Fitness progression
• Generality
• Role of domain knowledge
• Niching and speciation
• Memetic optimisation
• Multi-objective problems
• Co-evolution
• Constraints
• Noisy and dynamic problems
• Offline vs online learning
• Supervised, reinforcement, unsupervised learning
• Experimental methodology, performance measures
• Parameter tuning and/or control
![Page 3: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/3.jpg)
Global optimisation
• Given a function f and a set of solutions S, search for x* ∈ S such that ∀x ∈ S: f(x*) is at least as good as f(x)
• The graph below depicts a simple 1D fitness landscape
[Figure: 1D fitness landscape with three basins of attraction, marking the global optimum x* and the local optima. Source: http://www.cs.vu.nl/~gusz/ecbook/ecbook-course.html]
![Page 4: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/4.jpg)
Definitions
• The function f giving the fitness for each member of S is called the fitness landscape
• The best solution wrt f is called the global optimum
  • Note that there may be multiple (equal) global optima
• Solutions that are better than all “similar solutions” (their neighbours), but are not global optima, are called local optima
• The part of the landscape dominated by an optimum is called its basin of attraction
• Diversity refers to the distribution of a set of solutions over the fitness landscape
  • Diversity preservation techniques try to ensure a good distribution
![Page 5: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/5.jpg)
Heuristic optimisation
• CI techniques are heuristic optimisers, also known as generate-and-test optimisers
  • Searching procedures that use rules (inspired by nature) to decide which solution(s) to try next
• The simplest heuristic optimisers are hill-climbers
  • Given one solution, generate similar solutions, and keep the best of them
• A hill-climber is only guaranteed to find a local optimum, which need not be the global optimum
  • They can exploit their basin of attraction, but they lack the ability to explore the entire landscape properly
• CI techniques use populations and other tricks to promote both exploration and exploitation
  • More about this later under memetic algorithms
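The hill-climber described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function `two_peaks`, the step size and the neighbour count are all assumptions made for the example.

```python
import random

def hill_climb(fitness, start, step=0.1, neighbours=20, iterations=200, rng=None):
    """Given one solution, generate similar solutions and keep the best of them."""
    rng = rng or random.Random(0)
    best, best_fit = start, fitness(start)
    for _ in range(iterations):
        # "Similar solutions" are small random perturbations of the current best.
        candidates = [best + rng.uniform(-step, step) for _ in range(neighbours)]
        challenger = max(candidates, key=fitness)
        if fitness(challenger) > best_fit:
            best, best_fit = challenger, fitness(challenger)
    return best, best_fit

def two_peaks(x):
    # Hypothetical landscape: local optimum at x = -1 (fitness 1),
    # global optimum at x = 2 (fitness 2), with a valley in between.
    return max(1.0 - (x + 1.0) ** 2, 2.0 - (x - 2.0) ** 2)

# The result depends entirely on which basin of attraction the start lies in.
local_x, local_fit = hill_climb(two_peaks, start=-1.5)
global_x, global_fit = hill_climb(two_peaks, start=1.5)
```

Starting at -1.5 the climber settles on the lower peak and never crosses the valley, which is the exploration weakness noted above.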
![Page 6: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/6.jpg)
Fitness progression
• CI techniques are anytime algorithms
  • The fitness of the best known solution will improve over time, but usually under the law of diminishing returns
  • The convergence rate (the rate of improvement) tends to fall over time
• Several shorter runs may be better than one long run
[Figure: best fitness in population vs time (number of generations), with the progress in the 1st half of the run marked against the progress in the 2nd half]
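One way to inspect fitness progression is to record the best-so-far fitness after every evaluation. The sketch below uses plain random search as a stand-in optimiser; the test function and the sampling range are assumptions for illustration only.

```python
import random

def best_so_far_curve(fitness, sample, evaluations, rng):
    """Return the best fitness seen after each evaluation (never decreases)."""
    curve, best = [], float("-inf")
    for _ in range(evaluations):
        best = max(best, fitness(sample(rng)))
        curve.append(best)
    return curve

curve = best_so_far_curve(
    fitness=lambda x: -abs(x),                 # hypothetical: maximum 0 at x = 0
    sample=lambda rng: rng.uniform(-10, 10),
    evaluations=1000,
    rng=random.Random(0),
)

# Compare the improvement made in each half of the run.
first_half = curve[499] - curve[0]
second_half = curve[999] - curve[499]
```

Because the curve can be read at any point, the run can be stopped whenever the improvement no longer justifies the cost, which is what makes the algorithm "anytime".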
![Page 7: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/7.jpg)
Smart initialisation
• Initial solutions can be generated randomly, or using domain knowledge, often called smart initialisation
  • This can improve both results and time performance, but it may also introduce bias into the search
[Figure: best fitness in population vs time (number of generations); T marks the time needed to reach the equivalent fitness of “smart” initialisation]
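A simple way to limit the bias mentioned above is to mix a few knowledge-based solutions with random ones. This is only a sketch: `init_population`, `max_seed_fraction` and the bit-string seeds are hypothetical names and values chosen for the example.

```python
import random

def init_population(size, random_solution, seeds=(), max_seed_fraction=0.5, rng=None):
    """Build an initial population from domain-knowledge seeds plus random solutions."""
    rng = rng or random.Random(0)
    # Cap the number of seeds so the population is not dominated by prior knowledge.
    n_seeds = min(len(seeds), int(size * max_seed_fraction))
    population = [list(s) for s in seeds[:n_seeds]]
    while len(population) < size:
        population.append(random_solution(rng))
    return population

population = init_population(
    size=10,
    random_solution=lambda rng: [rng.randint(0, 1) for _ in range(8)],
    seeds=[[1] * 8, [1, 1, 1, 1, 0, 0, 0, 0]],   # hypothetical "expert" guesses
)
```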
![Page 8: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/8.jpg)
CI vs problem-specific methods
• CI techniques are general-purpose
  • Robust techniques giving good performance over a range of problems
[Figure: performance vs scale of “all” problems, comparing random search, a special problem-tailored method, and a computational intelligence approach; P marks a problem on the horizontal axis]
![Page 9: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/9.jpg)
Domain knowledge
• Performance can sometimes be improved by incorporating “expertise” into the process
[Figure: performance vs scale of “all” problems, with P marked on the horizontal axis and curves for EA 1, EA 2, EA 3 and EA 4]
![Page 10: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/10.jpg)
Contd.
• Too little domain knowledge makes the search space bigger and can make the search inefficient
  • cf. EA 1
• Too much domain knowledge can exclude novel solutions
  • cf. EA 4
• But care must be taken!
  • “If you tell the system what the solution looks like, that’s what it’ll give you!” [R.L. While]
• But this can all be highly non-obvious…
![Page 11: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/11.jpg)
Niching and speciation
• Most interesting problems are multi-modal
• Sometimes we want to discover more than just the global optimum
  • i.e. we want to discover y* and z*, as well as x*
• This might be important to offer extra robustness
  • Often it is hard for a fitness function to capture everything
[Figure: multi-modal fitness landscape with the peaks x*, y* and z* marked]
![Page 12: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/12.jpg)
Contd.
• Each basin of attraction is called a niche or sometimes a species
• Niching can be achieved in two broad ways
  • Implicit niching is achieved by modifying the solution representation
  • Explicit niching is achieved by promoting dissimilar solutions, or penalising similar solutions
• Both techniques rely on having some distance metric between solutions
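Fitness sharing is one well-known way of penalising similar solutions. The sketch below assumes fitness is non-negative and maximised; the `radius` value and the toy population are illustrative assumptions.

```python
def shared_fitness(population, fitness, distance, radius=1.0):
    """Divide each raw fitness by a niche count so crowded regions are penalised."""
    shared = []
    for a in population:
        # A neighbour at distance d < radius adds 1 - d/radius to the niche count.
        niche_count = sum(
            max(0.0, 1.0 - distance(a, b) / radius) for b in population
        )
        # niche_count >= 1 because every solution is at distance 0 from itself.
        shared.append(fitness(a) / niche_count)
    return shared

# Three solutions crowd one niche; the fourth is alone in its own niche.
population = [0.0, 0.1, 0.2, 5.0]
scores = shared_fitness(
    population,
    fitness=lambda x: 10.0,                  # equal raw fitness, for clarity
    distance=lambda a, b: abs(a - b),
)
```

With equal raw fitness, the isolated solution keeps its full score while the three crowded ones are marked down, so selection favours keeping both niches occupied.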
![Page 13: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/13.jpg)
Memetic algorithms
• CI techniques are good at exploration (finding high peaks in the fitness landscape) but are less good at exploitation of those peaks
• Memetic algorithms combine CI with some local-search technique that is good at exploitation
  • AKA Baldwinian, Lamarckian, or cultural algorithms
• Hill-climbing is the classic example
  • CI finds the best basin of attraction, then hill-climbing climbs the peak
• The two techniques can be applied in series, or in parallel
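A minimal sketch of the "in parallel" arrangement: every child produced by the evolutionary step is refined by a local search before it joins the population (a Lamarckian choice, assumed here). The operators and parameter values are illustrative, and the OneMax fitness (`sum` of bits) is a toy problem that local search alone can solve, so the example shows structure rather than benefit.

```python
import random

def local_search(bits, fitness):
    """One pass of bit-flip hill-climbing: keep any single flip that improves fitness."""
    best = list(bits)
    for i in range(len(best)):
        trial = best[:i] + [1 - best[i]] + best[i + 1:]
        if fitness(trial) > fitness(best):
            best = trial
    return best

def memetic(fitness, length=20, pop_size=10, generations=5, rng=None):
    rng = rng or random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # Exploration: uniform crossover of two parents plus one bit-flip mutation.
            a, b = rng.sample(pop, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            i = rng.randrange(length)
            child[i] = 1 - child[i]
            # Exploitation: the locally improved child replaces the raw child.
            children.append(local_search(child, fitness))
        # Keep the best pop_size solutions from parents and children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]

best = memetic(fitness=sum)
```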
![Page 14: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/14.jpg)
Multi-objective optimisation
• In many problems, solutions are assessed wrt several criteria
  • e.g. speed vs safety vs price for car designs
• Fitness is now a vector, not a scalar, which complicates selection
  • Vectors have only a partial ordering, rather than a total ordering
• The “solution” to a multi-objective problem is a set of solutions offering different trade-offs between the objectives
  • It is important to make no a priori assumptions about trade-off weights or the shape of the solution set
![Page 15: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/15.jpg)
Contd.
• Two objectives f1 and f2, both being maximised
• Each solution is plotted by its values in the objectives
• X dominates Y if it is better in all objectives
• The rank of X is the number of solutions that dominate X
• Selection is based on ranks
[Figure: solutions plotted on axes f1 and f2, each labelled with its rank (0, 0, 0, 0, 1, 1, 2 and 4); P dominates Q; A and B are non-dominated]
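The dominance test and ranking described above can be sketched directly. The code uses the definition as stated here (better in all objectives); many texts use a weaker form (no worse in all objectives and better in at least one), which would give different ranks for ties. The sample points are made up for illustration.

```python
def dominates(x, y):
    """True if x is better than y in every objective (all objectives maximised)."""
    return all(a > b for a, b in zip(x, y))

def ranks(points):
    """Rank of a point = number of points in the set that dominate it."""
    return [sum(dominates(q, p) for q in points) for p in points]

# Each tuple is (f1, f2) for one solution.
points = [(1, 5), (3, 4), (5, 1), (2, 3), (1, 1)]
print(ranks(points))  # [0, 0, 0, 1, 2]
```

The three rank-0 points are mutually non-dominated, which is why the "solution" is a set of trade-offs rather than a single point.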
![Page 16: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/16.jpg)
Co-evolution
• Fitness is sometimes assessed by the interactions between solutions, rather than in isolation
  • e.g. build a team which is the best in the AFL
• The above is an example of competitive co-evolution
  • Aim for an “arms race” between solutions to drive improvement through parallel adaptation
• The alternative is co-operative co-evolution
  • Decompose a problem into simpler sub-tasks (layered learning), and combine the sub-solutions to solve the original problem
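For the competitive case, one simple way to assess fitness through interactions is a round-robin tournament within the population. This is a sketch under assumptions: the `play` callback and the "strength" game are hypothetical stand-ins for a real match simulation.

```python
from itertools import combinations

def competitive_fitness(population, play):
    """Score each solution by its number of wins against every other solution."""
    wins = [0] * len(population)
    for i, j in combinations(range(len(population)), 2):
        # play returns > 0 if the first solution wins, < 0 if the second wins, 0 for a draw.
        result = play(population[i], population[j])
        if result > 0:
            wins[i] += 1
        elif result < 0:
            wins[j] += 1
    return wins

# Hypothetical game: each team is a single "strength" value and the stronger team wins.
teams = [3, 7, 5, 7]
wins = competitive_fitness(teams, play=lambda a, b: a - b)
print(wins)  # [0, 2, 1, 2]
```

Note that these scores are relative: the same team scores differently as its opponents change, which is what drives the arms race.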
![Page 17: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/17.jpg)
Constraints
• A constraint is a requirement placed on a solution, as opposed to a measure of quality
  • e.g. the length of this component must be less than X, or the power of that component must be more than Y
• A feasible solution is one that satisfies all constraints
• An infeasible solution fails at least one constraint
[Figure: the set of all solutions, containing three separate feasible regions]
![Page 18: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/18.jpg)
Contd.
• There are many different constraint-handling techniques
  • Separatist: consider objectives and constraints separately
  • Purist: discard all infeasible solutions when they arise
  • Repair: repair infeasible solutions when they arise
  • Penalty: modify the fitness function to penalise infeasible solutions
  • MOOP (multi-objective optimisation problem): add an extra objective function that measures “degree of infeasibility”
  • Hybrid: some combination of the above
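As an illustration of the penalty technique, the sketch below subtracts a weighted sum of constraint violations from the raw fitness. The quality measure, the limits (length below 4, power above 3) and the penalty weight are all assumptions chosen for the example.

```python
def penalised_fitness(raw_fitness, violations, weight=10.0):
    """Subtract a weighted sum of constraint violations (fitness is maximised)."""
    # A violation <= 0 means the constraint is satisfied and costs nothing.
    return raw_fitness - weight * sum(max(0.0, v) for v in violations)

def evaluate(design):
    length, power = design
    raw = power - 0.5 * length        # hypothetical quality measure
    violations = [
        length - 4.0,                 # positive when length exceeds the limit X = 4
        3.0 - power,                  # positive when power falls below the limit Y = 3
    ]
    return penalised_fitness(raw, violations)

print(evaluate((3.0, 5.0)))   # feasible: 3.5
print(evaluate((6.0, 2.0)))   # infeasible: -31.0
```

Choosing the weight is itself a tuning problem: too small and infeasible solutions win, too large and the search cannot pass through infeasible regions between feasible ones.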
![Page 19: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/19.jpg)
Noisy problems
• A noisy fitness function arises when the fitness calculations aren’t perfect
[Figure: three panels showing a fitness landscape, a noise landscape, and the resulting noisy fitness landscape]
![Page 20: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/20.jpg)
Contd.
• The algorithm can now only estimate performance
  • Again, this complicates selection
• Bad solutions might get lucky and survive
• Good solutions might get unlucky and die
• These behaviours cause undesirable long-term effects
  • The learning rate is reduced
  • Learning may not be retained
• The usual approach to this is resampling
  • Evaluate the fitness multiple times and average the results
  • But how many times is sufficient?
• A second common approach is to try to bound the error
  • Basically to assume the error won’t exceed a certain magnitude
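Resampling can be sketched as below. The standard error of the mean is returned alongside the average because it shrinks in proportion to 1/sqrt(samples), which gives one handle on the "how many times" question. The noisy function and the noise level are assumptions for illustration.

```python
import random
import statistics

def resampled_fitness(noisy_fitness, solution, samples=30):
    """Evaluate a noisy fitness several times; return the mean and its standard error."""
    values = [noisy_fitness(solution) for _ in range(samples)]
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / samples ** 0.5
    return mean, std_error

rng = random.Random(1)

def noisy(x):
    # Hypothetical: true fitness -(x - 3)^2 plus Gaussian noise with std. dev. 0.5.
    return -(x - 3.0) ** 2 + rng.gauss(0.0, 0.5)

mean_good, se_good = resampled_fitness(noisy, 3.0)   # true fitness 0
mean_bad, se_bad = resampled_fitness(noisy, 1.0)     # true fitness -4
```

If two estimates differ by much more than their standard errors, further samples are unlikely to change the selection decision.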
![Page 21: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/21.jpg)
Dynamic problems
• With some problems, the fitness landscape changes over time, maybe due to
  • Temporal effects
  • External factors
  • System adaptation
• The system needs to adapt to this change
  • Requires online learning
[Figure source: http://www.natural-selection.com]
![Page 22: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/22.jpg)
Offline vs online learning
• Offline learning is where a system learns before use
  • The strategy is fixed once training is completed
  • Requires comprehensive training data
  • Only feasible in well-understood environments
• Online learning is where a system learns while in use
  • The strategy is adapted from each instance encountered
  • Initial decisions are made from incomplete training data
  • Usually much greater time-pressure to improve
![Page 23: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/23.jpg)
Learning paradigms
• Supervised learning is (offline) training by comparing a system’s responses to expected responses in training data
• Reinforcement learning is (online) training using feedback from the environment to assess the quality of responses
• Unsupervised learning is (offline) training on data that contains no expected responses
  • No real question presented
  • The system looks for patterns in the data
![Page 24: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/24.jpg)
Unsupervised learning
• Question: tell me about this data
![Page 25: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/25.jpg)
Supervised learning
• Question: what is an apple?
![Page 26: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/26.jpg)
Experimental methodology
• CI techniques are stochastic
  • Their results are non-deterministic
• Thus we should never draw conclusions from a single run
  • Always perform a “large” number of runs
  • Assess results using statistical measures
  • Assess significance using statistical tests
• When comparing algorithms, it is crucial to make all comparisons fair
  • Give each the same amount of resource
  • Use the same performance measures
  • Try different computation limits
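A minimal harness for such a comparison might look like this. Both algorithms get the same evaluation budget and the same seeds, and results are summarised over many runs. The two optimisers, the test function and the run counts are illustrative assumptions, and a significance test would still need to be applied to the collected results.

```python
import random
import statistics

def random_search(fitness, evaluations, rng):
    """Best fitness found by sampling uniformly at random."""
    return max(fitness(rng.uniform(-5.0, 5.0)) for _ in range(evaluations))

def hill_climber(fitness, evaluations, rng):
    """Best fitness found by accepting only improving Gaussian perturbations."""
    x = rng.uniform(-5.0, 5.0)
    best = fitness(x)                       # counts as one evaluation
    for _ in range(evaluations - 1):
        y = x + rng.gauss(0.0, 0.5)
        fy = fitness(y)
        if fy > best:
            x, best = y, fy
    return best

def compare(algorithms, fitness, runs=30, evaluations=200):
    """Run every algorithm `runs` times with identical budgets and seeds."""
    summary = {}
    for name, algorithm in algorithms.items():
        results = [algorithm(fitness, evaluations, random.Random(seed))
                   for seed in range(runs)]
        summary[name] = (statistics.mean(results), statistics.stdev(results))
    return summary

def sphere(x):
    return -x * x   # hypothetical test function with maximum 0 at x = 0

summary = compare({"random_search": random_search, "hill_climber": hill_climber}, sphere)
```

Re-running `compare` with several values of `evaluations` is one way to check whether the ranking of the algorithms depends on the budget.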
![Page 27: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/27.jpg)
Performance measures
• Offline performance measures
  • Effectiveness (algorithm quality)
    • Success rate (percentage of “good” runs)
    • Mean best fitness at termination
  • Efficiency (algorithm speed)
    • CPU time to “completion”
    • Number of solutions evaluated
• Online performance measures
  • Population distribution
  • Fitness distribution
  • Improvement rate
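The two effectiveness measures can be computed from the final best fitness of each run. In this sketch the fitness values and the threshold for a “good” run are illustrative numbers, not measured results.

```python
import statistics

def success_rate(best_fitnesses, threshold):
    """Fraction of runs whose final best fitness reaches the threshold."""
    return sum(f >= threshold for f in best_fitnesses) / len(best_fitnesses)

def mean_best_fitness(best_fitnesses):
    """Mean of the best fitness found at termination, over all runs."""
    return statistics.mean(best_fitnesses)

# Illustrative values only: final best fitness from five hypothetical runs.
runs = [0.98, 0.75, 0.99, 0.91, 0.60]
rate = success_rate(runs, threshold=0.9)   # 3 of 5 runs succeed
mbf = mean_best_fitness(runs)
```

The two measures can disagree: a few very poor runs lower the mean best fitness without changing the success rate, so it helps to report both.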
![Page 28: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/28.jpg)
Parameter tuning
• How do we decide on the various constants in a run of the system?
  • e.g. bigger population, or more generations?
  • How big should mutations be?
• This can be difficult!
  • Sub-optimal values can seriously degrade performance
  • Choosing good values can take significant time
  • Exhaustive search is usually impractical
  • Good values may become bad during a run
![Page 29: PRACTICALITIES CITS4404 Artificial Intelligence & Adaptive Systems](https://reader036.vdocuments.mx/reader036/viewer/2022062308/56649e6f5503460f94b6cd9f/html5/thumbnails/29.jpg)
Parameter control
• Can we get the system to choose parameters automatically?
  • i.e. allow settings to vary during the run
• Three main alternatives are used
  • Deterministic: change parameters according to some pre-determined schedule, e.g. based on the passage of time
  • Adaptive: change parameters according to some measure of the search progress
  • Self-adaptive: encode the “scope of change” into the solution representation in some way
• One important goal is to reduce the prevalence of “magic numbers” in the system
  • Still, finding good settings is not easy
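The three alternatives can be sketched for a single parameter, the mutation step size sigma. The adaptive version follows Rechenberg's 1/5 success rule; the constants (`decay`, `factor`, `tau`) are illustrative choices rather than recommended settings.

```python
import math
import random

def deterministic_sigma(generation, sigma0=1.0, decay=0.95):
    """Deterministic: shrink the step size on a fixed schedule."""
    return sigma0 * decay ** generation

def adaptive_sigma(sigma, success_ratio, factor=0.85):
    """Adaptive (1/5 success rule): grow sigma if more than 1/5 of recent
    mutations improved fitness, shrink it if fewer did."""
    if success_ratio > 0.2:
        return sigma / factor
    if success_ratio < 0.2:
        return sigma * factor
    return sigma

def self_adaptive_mutation(x, sigma, rng, tau=0.5):
    """Self-adaptive: sigma is carried in the solution and is mutated first,
    then used to mutate x, so good step sizes are selected along with good x."""
    new_sigma = sigma * math.exp(tau * rng.gauss(0.0, 1.0))
    return x + new_sigma * rng.gauss(0.0, 1.0), new_sigma

rng = random.Random(0)
child_x, child_sigma = self_adaptive_mutation(0.0, 1.0, rng)
```

Note that each scheme still has constants of its own, which is the sense in which finding good settings remains hard.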