Post on 21-Dec-2015
Using Quantum Fuzzy Logic to Learn Facial Gestures of a Schrödinger Cat Puppet for Robot Theater
Arushi Raghuvanshi
Prof. Marek Perkowski
24 May 2008
Background: Quantum Robots
[Figure: quantum robot controller circuits; labels in the diagram: H, A, B, P, Q, L1, L2, S1, S2, M1 to M6]
Quantum Braitenberg
Mr PotatoHead
Old Duck Biped
Schrödinger's Cat
*character in Interactive Robot Theatre
(ISMVL 2007)
Programming Robot Behaviors
Simple sequential flow with no feedback
[Diagram blocks: Theatre Director, Behavior Selection, Input Initialization, Quantum or other logic controller, Measurement, Effectors (sound)]
Programming Robot Behaviors
Adding emotions and environmental feedback
[Diagram blocks: Theatre Director, Behavior Selection, Input Initialization, Quantum or other logic controller, Measurement, Effectors (sound), emotion, Environment including human audience]
Programming Robot Behaviors
Emotional Interactive Robots with Sensors and Feedback Modifying the Behavior
[Diagram blocks: Theatre Director, Behavior Selection, Input Initialization, Quantum or other logic controller, Measurement, Effectors (sound), emotion, Environment including human audience, sensors]
Quantum & Fuzzy Logic
Quantum Circuit
(Can be transformed into Quantum Fuzzy Logic by replacing gates)
NOT -> Fuzzy NOT
OR -> MAX
AND -> MIN
Fuzzy Logic with MIN & MAX operators
New Operators and Literals can be defined for Quantum Fuzzy Logic
Fuzzy Logic Example
[Figure: fuzzy logic circuit with input values 0.3 and 0.7; the signal values 0, 0.3, 0.7 and 1 are labeled on the wires]
Fuzzy Logic Operations
• Multiple ways to create Fuzzy operations
• Two examples below
• Example 1
  – NOT (a) = (1 – a)
    • e.g. NOT (0.34) = 0.66
  – MIN (a, b) = if (a < b) then a else b
    • e.g. MIN (0.3, 0.75) = 0.3
  – MAX (a, b) = if (a > b) then a else b
    • e.g. MAX (0.63, 0.83) = 0.83
• Example 2
  – NOT (a) = (1 – a)
    • e.g. NOT (0.34) = 0.66
  – MIN (a, b) = a * b
    • e.g. MIN (0.3, 0.7) = 0.21
  – MAX (a, b) = (a + b) – a*b
    • e.g. MAX (0.3, 0.7) = 0.3 + 0.7 - 0.21 = 0.79
• As in example 2, MAX and MIN may be misnomers. They can be called OR and AND operations instead
a MAX b = NOT ( NOT (a) MIN NOT (b))
= NOT ((1-a)*(1-b))
= NOT (1-a-b+a*b)
= 1-1+a+b-a*b
= a+b-a*b
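The two operator families above can be checked numerically. The following is a minimal Python sketch, not code from the presentation; the function names (`fuzzy_not`, `fuzzy_min`, `fuzzy_and`, `de_morgan_or`, and so on) are chosen here for illustration. It reproduces the slide's worked examples and confirms that the De Morgan construction gives a + b - a*b for the product form and the ordinary maximum for the comparison form.

```python
def fuzzy_not(a: float) -> float:
    """NOT(a) = 1 - a, shared by both examples."""
    return 1.0 - a


# Example 1: comparison-based operators
def fuzzy_min(a: float, b: float) -> float:
    return a if a < b else b


def fuzzy_max(a: float, b: float) -> float:
    return a if a > b else b


# Example 2: product-based operators (better described as AND / OR)
def fuzzy_and(a: float, b: float) -> float:
    return a * b


def fuzzy_or(a: float, b: float) -> float:
    return a + b - a * b


def de_morgan_or(a: float, b: float, and_op) -> float:
    """a OR b = NOT(NOT(a) AND NOT(b)), for whichever AND operator is passed in."""
    return fuzzy_not(and_op(fuzzy_not(a), fuzzy_not(b)))


if __name__ == "__main__":
    print(fuzzy_min(0.3, 0.75), fuzzy_max(0.63, 0.83))
    print(fuzzy_and(0.3, 0.7), fuzzy_or(0.3, 0.7))
    print(de_morgan_or(0.3, 0.7, fuzzy_and), de_morgan_or(0.3, 0.7, fuzzy_min))
```

Passing the AND operator as a parameter makes it easy to see that the same De Morgan rule yields a matching OR for either definition.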
Representing Fuzzy Values on Bloch Sphere
[Figure: Bloch sphere with X, Y and Z axes running from -1 to 1, with the states |0⟩ and |1⟩ marked]
• Fuzzy values can be represented in different ways on Bloch Sphere
• Simplest way to represent is along the meridian (as shown on left)
• After measurement, value can be 0, 1 or anywhere in between
• Other mechanisms (e.g. values inside the Bloch Sphere, or parallels of latitudes etc. ) can also be used
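The meridian representation can be sketched as follows. This is a minimal Python illustration under one stated assumption: the fuzzy value is taken to be the probability of measuring |1⟩, so a fuzzy value f corresponds to a polar angle θ (measured from |0⟩) with f = sin²(θ/2). The function names are hypothetical.

```python
import math


def fuzzy_to_theta(f: float) -> float:
    """Polar angle on the meridian for fuzzy value f (assumes f = sin^2(theta/2))."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("fuzzy value must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(f))


def theta_to_amplitudes(theta: float) -> tuple:
    """Real amplitudes (alpha, beta) of the state alpha|0> + beta|1> at angle theta."""
    return (math.cos(theta / 2.0), math.sin(theta / 2.0))


def amplitudes_to_fuzzy(alpha: complex, beta: complex) -> float:
    """Fuzzy value read back as the probability of measuring |1>."""
    return abs(beta) ** 2


if __name__ == "__main__":
    for f in (0.0, 0.15, 0.5, 1.0):
        theta = fuzzy_to_theta(f)
        print(f, round(math.degrees(theta), 2), amplitudes_to_fuzzy(*theta_to_amplitudes(theta)))
```

Under this convention 0 sits at |0⟩, 1 at |1⟩, and 0.5 on the equator.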
Measurements (values marked in the figure): 0, 0.15, 0.5, 0.8, 1
Quantum Fuzzy Literals
[Figures: three Bloch spheres with X, Y and Z axes, one per literal]
• Rotation Around Y Axis
• Rotation Around X Axis
• Phase Shift (270 degree rotation around Z axis)
We use this to define the Fuzzy NOT operations (Other literals can be used as well).
Quantum Fuzzy ‘NOT’ operator
Inverter is defined in exactly the same way as in quantum logic:
Fuzzy Quantum NOT: $\alpha|0\rangle + \beta|1\rangle \;\rightarrow\; \beta|0\rangle + \alpha|1\rangle$
where the squared magnitude of the (in general complex) amplitude associated with ket $|1\rangle$ is the equivalent of a fuzzy value in the interval [0, 1].
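A small Python sketch of the inverter, assuming a qubit is stored as a pair of amplitudes (alpha, beta) and the fuzzy value is |beta|². The helper names are illustrative only. Swapping the amplitudes turns a fuzzy value f into 1 - f, which matches the classical fuzzy NOT.

```python
def quantum_fuzzy_not(state: tuple) -> tuple:
    """Swap the amplitudes: alpha|0> + beta|1>  ->  beta|0> + alpha|1>."""
    alpha, beta = state
    return (beta, alpha)


def fuzzy_value(state: tuple) -> float:
    """Fuzzy value = squared magnitude of the |1> amplitude."""
    return abs(state[1]) ** 2


if __name__ == "__main__":
    state = (0.6, 0.8j)  # normalized: 0.36 + 0.64 = 1; the |1> amplitude is complex
    print(fuzzy_value(state), fuzzy_value(quantum_fuzzy_not(state)))
```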
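Quantum Fuzzy ‘MIN’ operator
Toffoli Gate (Davio gate: third input 0, third output R)

$$\mathrm{Min}(\alpha_1|0\rangle + \alpha_2|1\rangle,\ \beta_1|0\rangle + \beta_2|1\rangle) = \mathrm{Davio}(\alpha_1|0\rangle + \alpha_2|1\rangle,\ \beta_1|0\rangle + \beta_2|1\rangle,\ 0)$$

Input is the Kronecker product of the 3 parallel input lines:

$$\begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix} \otimes \begin{pmatrix}\beta_1\\ \beta_2\end{pmatrix} \otimes \begin{pmatrix}1\\ 0\end{pmatrix} = \begin{pmatrix}\alpha_1\beta_1\\ \alpha_1\beta_2\\ \alpha_2\beta_1\\ \alpha_2\beta_2\end{pmatrix} \otimes \begin{pmatrix}1\\ 0\end{pmatrix} = \begin{pmatrix}\alpha_1\beta_1\\ 0\\ \alpha_1\beta_2\\ 0\\ \alpha_2\beta_1\\ 0\\ \alpha_2\beta_2\\ 0\end{pmatrix}$$

Toffoli Gate × Input Matrix = Output Matrix (rows ordered 000, 001, 010, 011, 100, 101, 110, 111):

$$\begin{pmatrix}
1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&0&0&0&1\\
0&0&0&0&0&0&1&0
\end{pmatrix}
\begin{pmatrix}\alpha_1\beta_1\\ 0\\ \alpha_1\beta_2\\ 0\\ \alpha_2\beta_1\\ 0\\ \alpha_2\beta_2\\ 0\end{pmatrix}
=
\begin{pmatrix}\alpha_1\beta_1\\ 0\\ \alpha_1\beta_2\\ 0\\ \alpha_2\beta_1\\ 0\\ 0\\ \alpha_2\beta_2\end{pmatrix}$$

Output state:

$$\alpha_1\beta_1|000\rangle + \alpha_1\beta_2|010\rangle + \alpha_2\beta_1|100\rangle + \alpha_2\beta_2|111\rangle = (\alpha_1\beta_1|00\rangle + \alpha_1\beta_2|01\rangle + \alpha_2\beta_1|10\rangle)\,|0\rangle + (\alpha_2\beta_2|11\rangle)\,|1\rangle$$

=> Probability of measurement of ‘1’ is $|\alpha_2\beta_2|^2$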
Quantum Fuzzy ‘MAX’ operator
The Fuzzy Quantum Maximum operator is defined using De Morgan's rule:
A max B = NOT ( NOT (A) min NOT (B)).
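The MIN and MAX operators can be simulated on state vectors in a few lines. The sketch below is illustrative Python rather than code from the presentation. It assumes real amplitudes built from fuzzy values through a hypothetical helper `qubit(f)`, and represents the Toffoli gate as a swap of the |110⟩ and |111⟩ amplitudes. All function names are chosen here for illustration.

```python
import math


def qubit(f: float) -> tuple:
    """Amplitudes (alpha, beta) of a qubit whose |1> probability is the fuzzy value f."""
    return (math.sqrt(1.0 - f), math.sqrt(f))


def kron(u, v) -> list:
    """Kronecker product of two state vectors."""
    return [a * b for a in u for b in v]


def x_gate(q: tuple) -> tuple:
    """Quantum NOT on a single qubit: swap the two amplitudes."""
    return (q[1], q[0])


def toffoli(state: list) -> list:
    """Toffoli on 3 qubits: swaps the |110> and |111> amplitudes (indices 6 and 7)."""
    out = list(state)
    out[6], out[7] = out[7], out[6]
    return out


def prob_third_is_one(state: list) -> float:
    """Probability that the third (target) qubit is measured as 1 (odd indices)."""
    return sum(abs(amp) ** 2 for i, amp in enumerate(state) if i % 2 == 1)


def quantum_fuzzy_min(qa: tuple, qb: tuple) -> float:
    """Davio/Toffoli gate with the third input fixed to |0>."""
    state = kron(kron(qa, qb), (1.0, 0.0))
    return prob_third_is_one(toffoli(state))


def quantum_fuzzy_max(qa: tuple, qb: tuple) -> float:
    """De Morgan: NOT(NOT(A) min NOT(B)); the final NOT turns probability p into 1 - p."""
    return 1.0 - quantum_fuzzy_min(x_gate(qa), x_gate(qb))


if __name__ == "__main__":
    a, b = qubit(0.3), qubit(0.7)
    print(quantum_fuzzy_min(a, b), quantum_fuzzy_max(a, b))
```

With fuzzy inputs 0.3 and 0.7 this gives approximately 0.21 and 0.79, the same values as the product-based MIN and MAX of Example 2.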
Quantum Fuzzy Logic in Robots
Fuzzy Value Sensors
• Light Sensors: 0 = completely dark, 0.5 = semi-dark, 1 = completely bright
• Sound Sensors: 0 = pin-drop silence, 0.5 = normal noise (people talking), 1 = loud crash
• Image Sensors
Quantum Fuzzy Logic
Motor Controls causing output behaviors
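One way a raw sensor reading could be turned into a fuzzy value in [0, 1] is a clamped linear rescaling. The sketch below is an assumption for illustration, not the presentation's method. The raw range 0 to 1023 used in the example is a hypothetical 10-bit reading, and the function name is made up here.

```python
def to_fuzzy(raw: float, raw_min: float, raw_max: float) -> float:
    """Linearly rescale a raw reading to [0, 1], clamping out-of-range values."""
    if raw_max <= raw_min:
        raise ValueError("raw_max must be greater than raw_min")
    value = (raw - raw_min) / (raw_max - raw_min)
    return min(1.0, max(0.0, value))


if __name__ == "__main__":
    # Hypothetical light sensor: 0 = completely dark, 1023 = completely bright.
    for reading in (0, 512, 1023):
        print(reading, round(to_fuzzy(reading, 0, 1023), 3))
```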
Back to Robot Theatre….
Combination of Genetic Algorithm and Quantum Fuzzy Logic
Synchronizing Lips with Speech
Not This
Want This
Traditional Methods
• Use mapping of phonetic symbol to a lip shape (as shown on left)
• Sound streams can be parsed to generate phonetic symbols
• The methods are language dependent (i.e. different mapping for different language)
• Need to be modified for speed and style of speaking
Using Genetic Algorithms
• Sound Input (A)
• Initial set of genomes representing lip movements (initial population for GA); these are dynamically generated by the program
• GA Engine
• Sequence representing lip movements matching with input stream ‘A’
• ESRA Robot shows lip movements (B)
• Input to Fitness Function (user evaluation, interactive)
*** The matching function is dynamic, so it doesn’t matter if people have different accents, talk slower/faster, etc.
Genome
• A Genome (or a chromosome) is a pattern that corresponds to a behavior.
• A possible solution to the given problem can be encoded to create a genome.
• In genetic algorithms, a set of random genomes is created.
• When decoded, these genomes represent possible solutions to the given problem.
• In my experiment, a genome is an encoded string that represents a sequence of lip movements. For example:
49__9__31__9__46_1640__
• When decoded, this code represents the lip motion for the phrase “Hi I am a robot.”
Encoding Lip Shapes for Defining the Genome
Code    Upper   Lower
0, 1    127     127
2       87      173
3       170     120
4       140     56
5       0       0
6       0       167
7, 8    80      45
9       100     45
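The encoding table can be used to decode a genome string into lip positions. The following Python sketch takes its Upper/Lower numbers from the table above. Treating `_` as a pause that yields no new position (`None`) is an assumption made here, and the names `LIP_SHAPES` and `decode_genome` are illustrative.

```python
# (upper, lower) values per code, copied from the encoding table
LIP_SHAPES = {
    "0": (127, 127), "1": (127, 127),
    "2": (87, 173),
    "3": (170, 120),
    "4": (140, 56),
    "5": (0, 0),
    "6": (0, 167),
    "7": (80, 45), "8": (80, 45),
    "9": (100, 45),
}
PAUSE = "_"


def decode_genome(genome: str) -> list:
    """Map each symbol to an (upper, lower) pair; a pause decodes to None (assumption)."""
    frames = []
    for symbol in genome:
        if symbol == PAUSE:
            frames.append(None)
        elif symbol in LIP_SHAPES:
            frames.append(LIP_SHAPES[symbol])
        else:
            raise ValueError(f"unknown lip code: {symbol!r}")
    return frames


if __name__ == "__main__":
    print(decode_genome("49__9__31__9__46_1640__"))
```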
Fitness Function
• The better the robot solves the problem, the higher the fitness score.
• When synchronizing sound and lip motion, the fitness score would come from user input.
• To test the Genetic Algorithm, I calculated the fitness score by comparing the genomes to the best solution.
• The best solution was determined by the traditional method.
Fitness Function Algorithm
Find Difference for each corresponding element
Best Genome (for calculating Fitness Score):   1 4 9 5 7 _ 3 8
Genome Under Test:                             5 3 _ 8 3 _ 3 8
X (difference):                                4 1 9 3 4 0 0 0
9-X:                                           5 8 0 6 5 9 9 9
• Closeness implies better match (4-3 is better than 1-5)
• Pauses ‘_’ must match in position to get any score, so it is either 0 or 9
Total Score = 5+8+0+6+5+9+9+9 = 51
Fitness Score % = (Total/TotalPossible)*100 = 51/72 * 100 = 70.83%
Higher number is better now!
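The scoring rule above can be written as a short function. This is a minimal Python sketch that follows the slide's worked example; the names `gene_score` and `fitness_percent` are chosen here and are not taken from the original program.

```python
PAUSE = "_"
MAX_GENE_SCORE = 9


def gene_score(best: str, test: str) -> int:
    """Score one position: 9 minus the digit difference; pauses score 9 or 0."""
    if best == PAUSE or test == PAUSE:
        return MAX_GENE_SCORE if best == test else 0
    return MAX_GENE_SCORE - abs(int(best) - int(test))


def fitness_percent(best_genome: str, test_genome: str) -> float:
    """Total score as a percentage of the maximum possible (9 per position)."""
    if not best_genome or len(best_genome) != len(test_genome):
        raise ValueError("genomes must be non-empty and of equal length")
    total = sum(gene_score(b, t) for b, t in zip(best_genome, test_genome))
    return 100.0 * total / (MAX_GENE_SCORE * len(best_genome))


if __name__ == "__main__":
    print(round(fitness_percent("14957_38", "53_83_38"), 2))
```

For the two genomes on the slide the per-position scores are 5, 8, 0, 6, 5, 9, 9, 9, which sum to 51 out of 72.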
Selection
• The higher the fitness score, the higher the probability of being selected.
• Selection methods include the Roulette Wheel, Tournament Selector, and Truncation Selection
• In my experiment, I used a Roulette Wheel for selection.
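Roulette Wheel selection can be sketched as below. This is a generic Python illustration of fitness-proportional selection, not the GALib implementation used in the experiment. The function name and the fallback to a uniform choice when all scores are zero are assumptions.

```python
import random


def roulette_select(population: list, fitnesses: list, rng: random.Random):
    """Pick one individual with probability proportional to its fitness score."""
    if not population or len(population) != len(fitnesses):
        raise ValueError("population and fitnesses must be non-empty and equal in length")
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)  # assumption: uniform choice if every score is 0
    pick = rng.random() * total        # a point on the wheel, in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]              # guard against floating-point round-off


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [roulette_select(["A", "B", "C"], [10.0, 30.0, 60.0], rng) for _ in range(1000)]
    print({name: picks.count(name) for name in "ABC"})
```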
Crossover
When two chromosomes from the group are selected, they are combined to create a new genome.
Depending on the crossover rate, the bits from each chosen genome are crossed at a randomly chosen point.
The higher the crossover rate, the more likely it is that a crossover will occur.
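A single-point crossover on genome strings might look like the following Python sketch. Returning two children per pair is an assumption made here (the slide speaks of creating a new genome), and the function name is illustrative.

```python
import random


def single_point_crossover(mom: str, dad: str, crossover_rate: float,
                           rng: random.Random) -> tuple:
    """With probability crossover_rate, swap the tails of two genomes at a random point."""
    if len(mom) != len(dad):
        raise ValueError("genomes must have equal length")
    if len(mom) < 2 or rng.random() >= crossover_rate:
        return mom, dad  # no crossover: parents are copied unchanged
    point = rng.randint(1, len(mom) - 1)  # cut point strictly inside the genome
    return mom[:point] + dad[point:], dad[:point] + mom[point:]


if __name__ == "__main__":
    rng = random.Random(1)
    print(single_point_crossover("11111111", "22222222", 0.9, rng))
```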
Mutation
• Depending on the mutation rate, chosen bits of the genome are changed.
• The higher the mutation rate, the more likely it is that a bit will be changed.
• Shown to the right are many types of mutation
Mutation
• In my experiment I used two different mutation functions
  – Swap mutation
  – myMutator
• I created my own mutator which changes a single bit, rather than swapping two bits.
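The difference between the two mutation styles can be sketched in Python as follows. This is not the original `myMutator` code. How the mutation rate is applied (once per genome for the swap, once per position for the single-symbol change) and the symbol alphabet are assumptions made for illustration.

```python
import random

ALPHABET = "0123456789_"  # lip-shape codes plus the pause symbol


def swap_mutation(genome: str, mutation_rate: float, rng: random.Random) -> str:
    """With probability mutation_rate, swap the symbols at two distinct positions."""
    genes = list(genome)
    if len(genes) >= 2 and rng.random() < mutation_rate:
        i, j = rng.sample(range(len(genes)), 2)
        genes[i], genes[j] = genes[j], genes[i]
    return "".join(genes)


def single_symbol_mutation(genome: str, mutation_rate: float, rng: random.Random) -> str:
    """Replace each position, with probability mutation_rate, by a random symbol."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < mutation_rate else symbol
        for symbol in genome
    )


if __name__ == "__main__":
    rng = random.Random(3)
    print(swap_mutation("0123456789", 1.0, rng))
    print(single_symbol_mutation("0123456789", 0.2, rng))
```

A swap only rearranges symbols already present in the genome, while the single-symbol change can introduce a symbol the genome did not contain.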
Terminating Conditions
This generational process is repeated until a termination condition has been reached. Common terminating conditions are
* A solution is found that satisfies minimum criteria
* Fixed number of generations reached
* Allocated budget (computation time/money) reached
* The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
* Manual inspection
* Combinations of the above.
I used a fixed number of generations as the ending criterion. The default was 4,000 generations; I also experimented with changing the number of generations.
Basic Genetic Algorithm Flow
1. initialize population
2. select individuals for mating based on Fitness Function
3. mate individuals to produce offspring
4. mutate offspring
5. insert offspring into population
6. are stopping criteria satisfied? If not, return to step 2
7. finish
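The flow above can be put together in one compact loop. The Python sketch below is illustrative and is not the GALib-based program from the experiment. The crossover rate (0.9), mutation rate (0.01) and 4,000-generation default come from the slides. The population size, full generational replacement, the symbol alphabet and the use of a known target genome for automated scoring are assumptions.

```python
import random

ALPHABET = "0123456789_"  # lip-shape codes plus the pause symbol


def fitness(genome: str, target: str) -> float:
    """Percentage score: 9 minus digit difference per position; pauses score 9 or 0."""
    total = 0
    for g, t in zip(genome, target):
        if g == "_" or t == "_":
            total += 9 if g == t else 0
        else:
            total += 9 - abs(int(g) - int(t))
    return 100.0 * total / (9 * len(target))


def run_ga(target: str, pop_size: int = 30, generations: int = 4000,
           crossover_rate: float = 0.9, mutation_rate: float = 0.01, seed: int = 0):
    """Return (best genome seen, its score) after a fixed number of generations."""
    if not target or pop_size < 1:
        raise ValueError("target must be non-empty and pop_size positive")
    rng = random.Random(seed)
    n = len(target)
    # 1. initialize population with random genomes
    population = ["".join(rng.choice(ALPHABET) for _ in range(n)) for _ in range(pop_size)]
    best, best_score = population[0], fitness(population[0], target)
    for _ in range(generations):  # 6. stopping criterion: fixed number of generations
        scores = [fitness(g, target) for g in population]
        for g, s in zip(population, scores):
            if s > best_score:
                best, best_score = g, s
        weights = [s + 1e-9 for s in scores]  # keep the roulette weights positive
        offspring = []
        while len(offspring) < pop_size:
            # 2. roulette-wheel selection of two parents
            mom, dad = rng.choices(population, weights=weights, k=2)
            # 3. single-point crossover
            child = mom
            if n > 1 and rng.random() < crossover_rate:
                point = rng.randint(1, n - 1)
                child = mom[:point] + dad[point:]
            # 4. per-position mutation
            child = "".join(rng.choice(ALPHABET) if rng.random() < mutation_rate else c
                            for c in child)
            offspring.append(child)
        population = offspring  # 5. offspring replace the old population
    for g in population:  # score the final generation too
        s = fitness(g, target)
        if s > best_score:
            best, best_score = g, s
    return best, best_score  # 7. finish


if __name__ == "__main__":
    print(run_ga("49__9__31__9__46_1640__", generations=200))
```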
GA for Lip Synchronization
• Test Sound Input (A): original sound input, length
• Initial set of genomes representing lip movements (initial population for GA); these are dynamically generated by the program
• GA Engine
• Sequence representing lip movements matching with input stream ‘A’
• Interactive Mode: ESRA Robot shows lip movements (B); interactive input to Fitness Function
• Automated Mode: matching sequence for automating Fitness Function evaluation
In real application, input to Fitness Function is dynamic, language independent, and it doesn’t matter if people have different accents, talk slower/faster, etc.
Genetic Algorithm Behaviors
[Plots; objective score in %, average time in sec.millisec]
• Input Length vs. Objective Score (Swap Mutator); input length 1 to 128 characters
• Input Length vs. Time (Swap Mutator); input length 1 to 128 characters
• Input Length vs. Time (My Mutator); input length 1 to 128 characters
• Input Length vs. Objective Score (My Mutator); input length 1 to 128 characters
• Mutation Rate vs. Objective Score (Swap Mutator); mutation rate 0 to 1
• Mutation Rate vs. Objective Score (My Mutator); mutation rate 0 to 1
• Crossover Rate vs. Objective Score; crossover rate 0 to 1
• Number of Generations vs. Objective Score
• Number of Generations vs. Avg. Time
• Population Size vs. Avg. Time; population size 1 to 512
• Population Size vs. Objective Score; population size 1 to 512
• Number of Offspring vs. Avg. Time; 1 to 128 offspring
• Number of Offspring vs. Objective Score; 1 to 128 offspring
GA Results thus far...
• Created a self-learning robot that can learn how to synchronize sounds and words with appropriate facial expressions.
• Finding the best solution depends on different conditions. In general, I noticed that the functions that gave the higher objective scores tended to take more time to complete 4,000 generations.
Ongoing work
• Combining Quantum Fuzzy Logic with Robotic Theatre.
• Modify the body language (hand and arm movements) based on environmental sensors
  – Sound sensors (fuzzy value input) to detect noisy or quiet environments and modify behavior
  – Light sensor values (fuzzy value input) to detect day and night and modify behavior
• Quantum Fuzzy Schrödinger Cat sitting on Quantum Fuzzy Braitenberg vehicle arguing with Einstein, singing a song and going crazy.
Cat Singing
A lively little quantum went darting through the air, Just as happy quanta go speeding everywhere ………..
Thank You
Genetic Algorithms
A genetic algorithm is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are a particular class of evolutionary algorithms that use techniques such as inheritance, mutation, selection, and crossover.
Traditional Method (Without Genetic Algorithms)
• Audio
• Speech Recognition
• Phonetic letters, punctuation, and syllables
• Matches input to correct lip motion: Static (Language Dependent)
• Sequence representing lip movements matching with audio input string
• ESRA Robot shows lip movements
*** Since the matching function is static, it will have to be entirely recoded for different people: they have different accents, talk slower/faster, etc.
ESRA Robot Facial Expressions
Motor for Eye Lids
Motor for Lower Lip
Motor for Upper Lip
• The ESRA Robot has several motors for lip, eyelid and arm movements
• I am primarily using the lip motors for my experiment
• The specific positions of the lip motors define the shape of the lip
• The shape can be matched with speech
Crossover
• Single Point Crossover
• Double Point Crossover gives any two points on each genome an equal chance of being split up.
• In my experiment, I used a single point crossover with a 90 percent crossover rate.
Procedure
1. Create a robot with a face, a mouth, and two motors for lip movement.
2. Assign shapes of the mouth for every sound/syllable.
3. Encode these shapes using numbers and characters.
4. Create a random set of genomes for a given input.
5. Depending on the number of encodings that match with the appropriate sound, a fitness score will be assigned to each genome.
6. Using a Roulette Wheel, genomes will be selected for reproduction. The higher the fitness score, the higher the probability of being selected for reproduction.
7. To create a new set of offspring, one random crossover point will be chosen for each pair of genomes.
8. There will also be a 1% mutation rate.
9. A new set of genomes (the offspring) is created.
10. Repeat steps 5-9 for a fixed number of generations.
11. Change the Genetic Algorithm parameters and record the dependent variables.
Program
• I used GALib from MIT lab as a library in my program.
• I designed my own genome
• Defined my fitness function
• Created an initializer function
• Created a mutator function
• Program link - Project file
• EsraGA - Main C++ source code
Data
Data Tables with swap mutator
Data Tables with my mutator
Abstract
The purpose of this project is to create efficient Genetic Algorithms for robotic learning and the synchronization of speech and visual expressions. This experiment uses an ESRA robot, which has a set of motors to control facial expressions including lip motion and eyebrow motion. Emotions can be created using facial expressions and arm motion; however, for the simplicity of this experiment, the focus is on lip motion. Various shapes of the mouth are assigned to the appropriate sounds and encoded. Using these encodings I create a random set of chromosomes. I then use Genetic Algorithms so the robot can develop the lip motion to correspond with spoken text. Next, I use the Genetic Algorithm to test how long it takes to synchronize text and lip motion for varying input length, crossover rate, mutation rate, number of generations, population size, and number of offspring. Overall, I concluded that my hypothesis was supported because, using genetic algorithms for behavioral evolution, I was able to create a robot that can learn how to synchronize sounds and words with appropriate facial expressions. After testing various parameters, I concluded that functions that return higher objective scores take a longer time to complete. Some applications of this project include translating text into lip motion for animation movies and humanoid robots. The next step in this project would be to try different parameters such as convergence and migrating populations. I could also develop body language as well as lip motion.
Applications
• With a program using genetic algorithms, matching lip movements to speech is language independent. Also, one can use the same program for different people. In the traditional style, the tables would have to be recoded because everyone has an individual accent, body language, and speaking speed.
• This program can be used to match text and lip motion for movie animation and humanoid robots.
• Animation industries don’t have to hand draw lip motion or use a databank of words. This would be most effective if I used a combination of pre-programmed lip codes and user inputs.
• This could be used to convert sounds into lip motion so deaf people can understand what is being said in situations in which they can’t see the person who is speaking.
• It could also be used in reverse to convert lip motion into text. This could be useful in documenting presentations, speeches, and even court cases. It could also be used to create subtitles in movies.
Representing Fuzzy Values on Bloch Sphere
• Show L1 through L5 options