TRANSCRIPT
![Page 1: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/1.jpg)
Leveraging Human Knowledge for Machine Learning Curriculum Design
Matthew E. Taylor
teamcore.usc.edu/taylorm
![Page 2: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/2.jpg)
Overview
• Want agents to learn difficult problems
– Lots of data needed (time)
– Picking a correct bias (NFL, i.e. No Free Lunch)
• Taxi driving example
• Use human to design sequence of tasks
1. Basic car control
2. Parking lot navigation
3. Small Town
4. Los Angeles
• Why not have agents select tasks?
![Page 3: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/3.jpg)
Problem Statement
• Humans can select a training sequence
• Results in faster training / better performance
![Page 4: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/4.jpg)
Task Transfer
1. Reduce total training time by picking source task(s)
2. Learn sequence of source tasks, then learn (previously unknown) task
Source: S, A
Target: S’, A’
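One way to picture the source-to-target step is to initialize the target task's value table from a table learned on the source task. The sketch below assumes tabular Q-values and hand-written inter-task mappings. The names `transfer_q`, `state_map`, and `action_map`, and the toy states and values, are hypothetical and do not come from the slides.

```python
from typing import Dict, Hashable, Iterable, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def transfer_q(
    source_q: QTable,
    target_states: Iterable[Hashable],
    target_actions: Iterable[Hashable],
    state_map: Dict[Hashable, Hashable],
    action_map: Dict[Hashable, Hashable],
    default: float = 0.0,
) -> QTable:
    """Initialize a target-task Q-table (S', A') from a source-task one (S, A).

    state_map and action_map send each target state/action to the source
    state/action it is assumed to resemble. Pairs with no mapping, or whose
    mapped pair is missing from source_q, receive `default`.
    """
    actions = list(target_actions)  # reused for every state
    target_q: QTable = {}
    for s_t in target_states:
        for a_t in actions:
            key = (state_map.get(s_t), action_map.get(a_t))
            target_q[(s_t, a_t)] = source_q.get(key, default)
    return target_q


# Toy example (illustrative values only): 2-state source, 3-state target.
source_q = {("near", "go"): 1.0, ("far", "go"): 0.5}
target_q = transfer_q(
    source_q,
    target_states=["near", "mid", "far"],
    target_actions=["go", "wait"],
    state_map={"near": "near", "mid": "far", "far": "far"},
    action_map={"go": "go"},  # "wait" has no source counterpart
)
```

The target learner would then continue training from `target_q` rather than from an uninformed table. A poorly chosen mapping can make this starting point worse than the default.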
![Page 5: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/5.jpg)
Problem Statement
• Humans can select a training sequence
• Results in faster training / better performance
• Meta-planning problem for agent learning
[Figure: a sequence of MDPs, with one MDP marked “?”]
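One possible way to write this meta-planning problem, assuming the objective is total training time, is given below. The symbols $M_i$, $k$, and $T$ are introduced here for illustration and are not taken from the slides.

```latex
\[
  \min_{k,\;(M_1,\dots,M_k)} \;\;
  \sum_{i=1}^{k} T\bigl(M_i \mid M_1,\dots,M_{i-1}\bigr)
  \;+\;
  T\bigl(M_{\text{target}} \mid M_1,\dots,M_k\bigr)
\]
```

Here $T(M \mid \cdot)$ stands for the time needed to reach a chosen performance level on task $M$ after training on the tasks listed after the bar. Under this reading, a sequence is only worthwhile if the total is smaller than $T(M_{\text{target}} \mid \emptyset)$, the time to learn the target task from scratch.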
![Page 6: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/6.jpg)
Type of Shaping
• Assume agents could learn on their own
• Think of Skinner (1953)
• Not “RL Shaping” [Colombetti and Dorigo (1993) or Ng (1999)]
DANGER: Negative Transfer
![Page 7: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/7.jpg)
Not On-line or Interactive Help
Advice / Demonstration / Imitation
– Human unable or unwilling
Picking sequence of tasks
– How to best learn important skills / ideas
![Page 8: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/8.jpg)
Types of Useful Information
• Common Sense
– Soccer balls roll after being kicked
– Friction reduces an object’s speed
• Domain Knowledge
– It is easier to complete short passes than long passes
• Algorithmic Knowledge
– State space size can impact learning speed
![Page 9: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/9.jpg)
Useful?
• Training time critical
• Agent needs robust understanding of domain
– (rare affordances)
• Consumer Level
– Low bar for background knowledge
– Save consumer time
![Page 10: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/10.jpg)
Possible Domains?
• Nero
• RoboCup Coach
![Page 11: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/11.jpg)
Path of Study
• Determine what makes a good sequence
– Increasing Difficulty
– Basic skills (options)
– Basic concepts / learn useful abstractions
– Retrospective analysis
• Education literature?
• On-line sequence adaptation? (social scaffolding)
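As a rough illustration of the "increasing difficulty" idea, the sketch below orders candidate tasks by a single difficulty proxy. It assumes state-space size is an acceptable stand-in for difficulty, which is only one of many possible heuristics. The `Task` class, `order_by_difficulty`, and the state counts are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    name: str
    num_states: int  # assumed proxy for difficulty; placeholder values below


def order_by_difficulty(tasks: List[Task]) -> List[Task]:
    """Return the tasks sorted from smallest to largest state space."""
    return sorted(tasks, key=lambda t: t.num_states)


# Candidate tasks, deliberately listed out of order.
candidates = [
    Task("Los Angeles", 1_000_000),
    Task("Basic car control", 100),
    Task("Small town", 100_000),
    Task("Parking lot navigation", 5_000),
]

curriculum = order_by_difficulty(candidates)
curriculum_names = [t.name for t in curriculum]
```

A human designer might override this ordering, for example to place a task that teaches a reusable skill earlier than its size alone would suggest.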
![Page 12: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/12.jpg)
Conclusion
• Leveraging human knowledge
• Both experts and non-experts
• Where is constructing a task sequence superior?
– Easy
– Effective
• How can we construct such sequences well?
– Transfer Learning / Lifelong Learning Analysis
– Empirical studies
![Page 13: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/13.jpg)
![Page 14: Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm](https://reader035.vdocuments.mx/reader035/viewer/2022062802/56649ee65503460f94bf68f7/html5/thumbnails/14.jpg)
Possible Domains?
• Nero
• ESP, Peekaboom
• RoboCup Coach