Levels of self-improvement of the AI
Alexey Turchin, Science for Life Extension Foundation
What is self-improvement of the AI?
Roman V. Yampolskiy. From Seed AI to Technological Singularity via Recursively Self-Improving Software. https://arxiv.org/pdf/1502.06512v1.pdf
What is intelligence?
Intelligence is a measure of an agent's average level of performance across a wide range of environments
Shane Legg, Marcus Hutter. Universal Intelligence: A Definition of Machine Intelligence, https://arxiv.org/abs/0712.3329
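Legg and Hutter make this precise: an agent π is scored across all computable reward-bearing environments μ, weighted by simplicity (the formula below is from the cited paper):

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
```

Here $E$ is the set of computable environments, $K(\mu)$ is the Kolmogorov complexity of $\mu$, and $V_{\mu}^{\pi}$ is the expected total reward that $\pi$ earns in $\mu$.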
A measure can grow, but a measure cannot increase itself
So is recursive self-improvement magic?
Is RSI like a nuclear chain reaction?
E. Yudkowsky. Intelligence Explosion Microeconomics. https://intelligence.org/files/IEM.pdf
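A toy formalization of the chain-reaction analogy (my notation, not Yudkowsky's): suppose each cycle of self-improvement multiplies capability by a reinvestment factor $k$:

```latex
I_{n+1} = k \, I_n \quad\Longrightarrow\quad I_n = k^{n} I_0
```

As with neutron multiplication, the process is explosive (supercritical) only when $k > 1$ and fizzles out when $k < 1$; the whole debate is about what $k$ actually is for cognitive reinvestment.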
What is going on inside an AI that is trying to improve its performance?
An AI has many levels, and changes could happen on all of them:
• Goal level
• Architecture and code
• Learning and data
• Hardware
Hardware level: acceleration
Increasing the speed of computation
Gain: No more than a 3-5x gain on the current component base
Limitations: Thermal energy dissipation
Risk: Not many risks at early stages
Safety: Low-hanging fruit
Hardware level: more computers
Increasing the speed of computation
Gain: Logarithmic growth
Limitations: Connection and parallelization problems
Risk: The AI will try to take over the Internet
Safety: Boxing, fake resources, low-hanging fruit.
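The parallelization limit has a standard formalization, Amdahl's law (not cited in the slides, added here for context): if only a fraction $p$ of the workload can be parallelized, $N$ machines give a speedup of

```latex
S(N) = \frac{1}{(1 - p) + p/N} \;\le\; \frac{1}{1 - p}
```

so the gain from adding computers saturates at a constant; the slide's "logarithmic growth" is a similarly sublinear estimate once connection costs are included.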
Hardware level: hardware accelerators
Increasing the speed of computation
Gain: 100-1000 times
Limitations: A one-month time delay; access to fabs
Risk: The AI needs money and power to get them
Safety: Control over fabs.
Hardware level: change of the component base
Increasing the speed of computation
Gain: 100-1000 times
Limitations: A one-month time delay; access to fabs
Risk: The AI needs money and power to get them
Safety: Control over fabs.
Learning level: Data acquisition
Getting data from outside sources, such as scanning the Internet or reading books
Gain: unclear, but large
Limitations: bandwidth of Internet access, internal memory size, the long time required
Risk: The AI could form mistaken ideas about the world at its early stages
Safety: Control over connections.
Learning level: Passive learning
Training of neural nets.
Gain: unclear
Limitations: a computationally expensive and data-hungry task; it may need some labeled data.
Risk: Overfitting or mis-fitting
Safety: Supervision
Learning level: Active learning with thinking
Creating new rules and ideas.
Gain: unclear
Limitations: meta-meta problems
Risk: Testing
Safety: Supervision
Learning level: Active learning with thinking
Acquiring unique important information
Gain: may be enormous
Limitations: context dependence.
Risk: Breaking out of the box
Safety: Supervision
Learning level: Active learning with thinking
Experimenting in the real world and Bayesian updating
Gain: may be large
Limitations: context dependence, slow experiments in real life
Risk: Breaking out of the box
Safety: Supervision
Learning level: Active learning with thinking
Thought experiments and simulations.
Gain: may be large
Limitations: long and computationally expensive, not good for a young AI
Risk:
Safety: Supervision
Learning level: Active learning with thinking
World model changes and important facts
Gain: may be large
Limitations: long and computationally expensive, not good for a young AI
Risk: Different interpretation of the main goal
Safety: Some world models could make the AI safer (e.g., if it thinks that it is in a simulation)
Learning level: Active learning with thinking
Value learning: if the AI doesn't have fixed goals, it may intend to keep learning values from humans.
Limitations: long and computationally expensive, not good for a young AI
Risk: Different interpretation of the main goal
Safety: Some world models could make the AI safer (e.g., if it thinks that it is in a simulation)
Learning level: Active learning with thinking
Learning to self-improve
Limitations: need for tests, no previous knowledge
Risk: explosive potential of the AI
Safety: Keep knowledge about AI away from AI
Learning level: Active learning with thinking
Information about its own structure
Limitations: need for tests, no previous knowledge
Risk: explosive potential of the AI
Safety: Keep knowledge about AI away from AI
Rewriting its own code
Rewriting of the neural net: choosing the right net architecture for a task
Gain: huge on some tasks
Limitations: any neural net has a failure mode
Risk: Looks rather benign
Safety: not clear
DeepMind’s PathNet: A Modular Deep Learning Architecture for AGI. https://medium.com/intuitionmachine/pathnet-a-modular-deep-learning-architecture-for-agi-5302fcf53273#.48g6wx5i2
Rewriting its own code
Optimization and debugging.
Gain: limited
Limitations: some bugs are very subtle
Risk: Looks rather benign
Safety: insert bugs artificially?
Rewriting its own code
Rewriting modules and creating subprograms
Gain: limited
Limitations:
Risk: Looks rather benign
Safety:
Rewriting its own code
Adding an important instrument, which will have consequences on all levels.
Gain: may be high
Limitations: testing is needed
Risk:
Safety:
Rewriting its own code
Rewriting its own core
Gain: may be high
Limitations: risks of halting, need for tests
Risk: recursive problems
Safety: Encryption, boxing
Rewriting its own code
Architectural changes: changes in the relations between all elements of the AI at all levels
Gain: may be high
Limitations: risks of halting, need for tests
Risk: recursive problems
Safety:
Rewriting its own code
Unplugging restrictions
Gain: it depends
Limitations: there must be restrictions in the first place
Risk: many dangers
Safety: Second-level restrictions that activate if the first level is broken; self-termination code
Rewriting its own code
Coding a new AI from scratch based on a completely different design
Gain: it depends
Limitations: there must be restrictions in the first place
Risk: many dangers
Safety: Second-level restrictions that activate if the first level is broken; self-termination code
Rewriting its own code
Acquiring a new master algorithm
Gain: large
Limitations: need for testing
Risk: A new way of representing goals may be needed; the father-child problem
Safety:
Rewriting its own code
Meta-meta-level changes: changes that change the AI's ability to self-improve, like learning to learn, but with more intermediate levels, like improvement of improvement of improvement.
Gain: could be extremely large or 0.
Limitations: could never return to practice
Risk: recursive problems, complexity
Safety: Philosophical landmines with recursion
Goal system changes
Reward driven learning
Gain: could be extremely large or 0.
Limitations: could never return to practice
Risk: recursive problems, complexity
Safety: Philosophical landmines with recursion
Goal system changes
Reward hacking
Gain: could be extremely large or 0.
Limitations: could never return to practice
Risk: recursive problems, complexity
Safety: Philosophical landmines with recursion
Yampolskiy, R. V. Utility Function Security in Artificially Intelligent Agents. Journal of Experimental and Theoretical Artificial Intelligence (JETAI), 2014, pp. 1-17.
Goal system changes
Changes of instrumental goals and subgoals
Gain: could be extremely large or 0.
Limitations: could never return to practice
Risk: recursive problems, complexity
Safety: Philosophical landmines with recursion
Goal system changes
Changes of the final goal.
Gain: No gain
Limitations: the AI will not want to do it
Risk: could happen randomly, but irreversibly
Safety: Philosophical landmines with recursion
Improving by acquiring non-AI resources
• Money
• Time
• Power over others
• Energy
• Allies
• Controlled territory
• Public image
• Freedom from human and other limitations, and safety
Stephen M. Omohundro. The Basic AI Drives. https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf
Changing the number of AIs
Creating narrow AIs, Tool AIs, and agents with specific goals
Gain: Limited
Limitations: need to control them
Risk: revolt
Safety: Narrow AIs as AI police
Changing the number of AIs
Creating its own copies and collaborating with them
Gain: Limited
Limitations: need to control them
Risk: revolt
Safety: Narrow AIs as AI police
Changing the number of AIs
Creating its own new version and testing it
Gain: Large
Limitations: need to control them
Risk: revolt
Safety:
Changing the number of AIs
Creating organisations from copies
Gain: Large
Limitations: need to control them
Risk: revolt
Safety:
Cascades, cycles and styles of SI
Yudkowsky suggested that during its evolution, different types of SI activity will appear in certain characteristic forms, which he called cycles and cascades. A cascade is a type of self-improvement in which the next version is chosen by the biggest expected gain in productivity. A cycle is a form of cascade in which several actions are repeated over and over again.
Styles: evolution and revolutions
Evolution is a smooth, almost linear increase of the AI's capabilities through learning, increasing computing resources, upgrading modules, and writing subroutines.
Styles: evolution and revolutions
Revolutions are radical changes of architecture, goal system, or master algorithm. They are crucial for recursive SI. They are intrinsically risky and unpredictable, but they produce most of the capability gains.
Cycles
The knowledge-hardware cycle of SI is a cycle in which the AI collects knowledge about new hardware and then builds it for itself.
Cycles
The "AI theory knowledge – architectural changes" cycle is the primary revolutionary cycle, and it is very unpredictable for us. Each architectural change will give the AI the ability to learn more about how to make better AIs.
Possible limits and obstacles in self-improvement
Theoretical limits to computation
Possible limits and obstacles in self-improvement
The mathematical nature of the complexity of problems and of the definition of intelligence: “it becomes obvious that certain classes of problems will always remain only approximately solvable and any improvements in solutions will come from additional hardware resources not higher intelligence” [Yampolskiy].
Possible limits and obstacles in self-improvement
The nature of recursive self-improvement provides diminishing returns on a logarithmic scale: “Mahoney also analyzes complexity of RSI software and presents a proof demonstrating that the algorithmic complexity of Pn (the nth iteration of an RSI program) is not greater than O(log n) implying a very limited amount of knowledge gain would be possible in practice despite theoretical possibility of RSI systems. Yudkowsky also considers possibility of receiving only logarithmic returns on cognitive reinvestment: log(n) + log(log(n)) + … in each recursive cycle.”
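Restating the two quoted bounds in symbols (my notation): Mahoney's result says the algorithmic complexity of the n-th iterate grows at most logarithmically,

```latex
K(P_n) = O(\log n),
```

while Yudkowsky's cautious scenario has per-cycle returns of

```latex
R = \log n + \log\log n + \log\log\log n + \cdots,
```

a series whose iterated-logarithm terms collapse quickly, leaving only finitely many meaningful summands. On either view, each further cycle of reinvestment buys noticeably less than the one before.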
Possible limits and obstacles in self-improvement
No Free Lunch theorems: the difficulty of searching the space of all possible minds to find a mind with intelligence superior to a given mind.
Possible limits and obstacles in self-improvement
Difficulties connected with Gödel's and Löb's theorems, the “Löbstacle”: “Löb’s theorem states that a mathematical system can’t assert its own soundness without becoming inconsistent.”
“If this sentence is true, then Santa Claus exists."
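For reference, the standard statement of Löb's theorem, with $\Box$ read as "is provable in the system":

```latex
\text{if } \vdash \Box P \rightarrow P, \text{ then } \vdash P
\qquad \text{(internalized: } \vdash \Box(\Box P \rightarrow P) \rightarrow \Box P \text{)}
```

The quoted Santa Claus sentence is the classic natural-language illustration (a Curry-style paradox): naive reasoning about the truth of such a self-referential sentence lets one conclude that Santa Claus exists, which is why a self-modifying prover cannot freely assert the soundness of its own, or its successor's, reasoning.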
Possible limits and obstacles in self-improvement
“Procrastination paradox will also prevent the system from making modifications to its code since the system will find itself in a state in which a change made immediately is as desirable and likely as the same change made later.”
Possible limits and obstacles in self-improvement
Paradoxes in logical reasoning with self-reference, like “This sentence is false.” I call deliberately created paradoxes of this type “philosophical landmines”; they could be a means of last resort to control AI.
Possible limits and obstacles in self-improvement
Yampolskiy showed the inevitable wireheading of agents above a certain level of intelligence, that is, the hacking of their own reward and utility function.
Possible limits and obstacles in self-improvement
The correlation obstacle, by Chalmers: “a possibility that no interesting properties we would like to amplify will correspond to ability to design better software.”
Possible limits and obstacles in self-improvement
The pointer problem: if a program starts to change its own code while simultaneously running it, it could crash if it changes the same lines of code where its pointer currently is. A program can't run and change itself simultaneously.
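A minimal hypothetical Python sketch of the standard workaround (all names here are illustrative assumptions, not from the slides): rather than patching its own running image, the program stages the next version as a separate file and replaces the whole process, so the running code and the modified code are never the same bytes.

```python
import os
import sys

def stage_and_restart(new_source: str, path: str = "next_version.py") -> None:
    """Write the next version to a separate file, then swap processes.

    The new code sits inert on disk while the old process keeps running,
    sidestepping the pointer problem of editing code under execution.
    """
    with open(path, "w") as f:
        f.write(new_source)
    # execv replaces the current process image; the old "pointer" ends
    # cleanly and the new code starts from its own entry point.
    os.execv(sys.executable, [sys.executable, path])
```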
Possible limits and obstacles in self-improvement
The father-and-child problem is in fact a fight for dominance between AI generations, and it clearly has many failure modes.
Possible limits and obstacles in self-improvement
If the AI is a single computer program, it could halt
Converging instrumental goals in self-improvement of AI
The AI safety problem at each new level: avoiding war with the new generation
Converging instrumental goals in self-improvement of AI
The need to test new versions for their real ability to reliably solve complex problems better
Converging instrumental goals in self-improvement of AI
The ability to return to a previous state
Converging instrumental goals in self-improvement of AI
Preferring evolution to revolutions, and lower-level changes to higher-level changes: the AI prefers to reach the same level of optimization power through lower-level changes, that is, by evolutionary development rather than by revolutions
Converging instrumental goals in self-improvement of AI
Revolutions at the early stage of AI, evolution at later stages
The AI will prefer revolutions only in a very urgent situation, which will probably occur at the beginning of its development, when it has to win over other AI projects and urgently prevent other global risks.
Converging instrumental goals in self-improvement of AI
Military AI as a converging goal in the early stages of AI development
Converging instrumental goals in self-improvement of AI
Solving the Fermi paradox
Converging instrumental goals in self-improvement of AI
Cooperation with humans at the early stages of its development
Converging instrumental goals in self-improvement of AI
Protecting its own reward function against wireheading
Self-improvement of a net of AIs
• It can't halt: if one agent halts, others will keep working.
• It has a natural ability to clean out bugs (natural selection).
• It is immune to the suicide of any single agent: even if all of them commit suicide, it will not happen simultaneously, and they will be able to create offspring, so the net will continue to exist.
• There is no pointer problem.
• There is no strong difference between evolution and revolutions: revolutionary changes may be tried by some agents, and if they work, such agents will dominate.
• There are no paperclip maximizers: different agents have different final goals.
• If one agent starts to dominate the others, the evolution of the whole system almost stops (in the same way that dictatorship is bad for a market economy).
Possible interventions in the self-improvement process to make it less dangerous
1. Taking the low-hanging fruit
2. Explaining the risks to the young AI
3. Initial AI designs that are not capable of quick SI
4. A required level of testing
5. A goal system that prevents unlimited SI
6. Control rods and alarm systems
Self-improvement is not a necessary condition for globally catastrophic AI
A narrow AI designed to construct dangerous biological viruses could be even worse
Conclusion: 30 different levels of self-improvement
Some produce small gains, but some may produce recursive gains.
Conservative estimate: each level increases performance 5 times, and there is no recursive SI.
In that case, total SI: 5^30 = 931,322,574,615,478,515,625 ≈ 10^21 times
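A quick check of that arithmetic:

```python
# 30 independent levels, each a 5x gain, multiplied together (no recursion).
total = 5 ** 30
print(f"{total:,}")    # 931,322,574,615,478,515,625
print(f"{total:.1e}")  # 9.3e+20, i.e. on the order of 10**21
```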
Conclusion: Recursive SI is not necessary to create superintelligence; even modest SI on many levels is enough
Conclusion: Medium-level self-improvement of a young AI and its risks
While unlimited self-improvement may run into some conceptual difficulties, the first human-level AI may achieve medium-level self-improvement quickly, at low cost, and with low risk to itself.
The combination of these low-hanging SI tricks may produce a 100-1000x increase in performance, even for a boxed young AI.
However, some types of SI will not be available to the young AI, as they are risky, take a lot of time, or require external resources.