numenta acm data min - powerpoint presentation
Copyright © 2009 Numenta
Hierarchical Temporal Memory
Subutai [email protected]
Vice President, Engineering
Numenta
Agenda
Introduction to Numenta
What can we learn from Neuroscience?
How can we incorporate these ideas into Algorithms?
How can we incorporate these ideas into Applications?
Numenta Snapshot
•Creating a new computing technology, Hierarchical Temporal Memory, based on the structure and function of the neocortex
•16 employees
—Founded in 2005 by Jeff Hawkins, Donna Dubinsky and Dileep George
•For-profit company with very long term roadmap and “patient capital”
—Focus on core technology
—Currently developing our third generation of algorithms
—Very selective corporate partnerships and application development
Numenta Timeline
2002 Redwood Neuroscience Institute, Jeff Hawkins
2004 On Intelligence, Hawkins and Blakeslee
Described theory of Hierarchical Temporal Memory (HTM)
2005 Mathematical formalism (Dileep George)
2005 Numenta founded to build new computing
platform based on HTM
2007 Released NuPIC software platform
2008 First HTM Workshop (>200 attendees)
2009 Vision toolkit Beta release
2010 Prediction toolkit release
Demo: An Easy Visual Task
Goal: output the name of the object in the image
cell phone
cow
rubber duck
sailboat
Why Isn’t This Easy For Computers?
Huge variations in images, even within a single category
It is impossible to write down a set of rules or transformations that cover all possibilities
Vision4 - Four Category Object Recognition Demo
No Universal Learning Machine
No Free Lunch Theorem
“no learning algorithm has an inherent superiority over other learning algorithms for all problems.”
(Wolpert, 1995)
Universal Learning Machine
Specific Learning Machine
Machine with assumptions that match the structure of the world
The Neocortex
• Many different regions performing specialized functions
• Local structure is similar across regions
Common Cortical Algorithm
Cortical Hierarchy
Sensory data (retina)
Sensory data (skin)
• Representations are distributed hierarchically
• Connections are bidirectional – significant feedback projections
• Each region is exposed to constantly changing sensory patterns and is constantly predicting future patterns
From: Felleman and Van Essen
Hierarchical Temporal Memory (HTM)
•Network of learning nodes
•All nodes do same thing
— Learns common spatial patterns
— Learns common sequences (groups patterns with common cause)
•Create a hierarchical, spatio-temporal model of data
—Probability of sequences passed up
—Predicted spatial patterns passed down
•Bayesian methods resolve ambiguity
Common spatial patterns
Common sequences
High level causes
Low level causes
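The bullet "Bayesian methods resolve ambiguity" can be illustrated with a minimal sketch. It assumes that a node combines bottom-up evidence with a top-down expectation by multiplying the two and renormalising (Bayes' rule). The function name and the example numbers are hypothetical and are not taken from the slides or from NuPIC.

```python
def resolve_ambiguity(bottom_up_likelihood, top_down_prior):
    """Combine evidence and expectation with Bayes' rule.

    Both arguments are dicts mapping a cause name to a non-negative weight.
    Returns the normalised posterior over the causes in bottom_up_likelihood.
    """
    unnormalised = {
        cause: likelihood * top_down_prior.get(cause, 0.0)
        for cause, likelihood in bottom_up_likelihood.items()
    }
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("evidence and prior have no cause in common")
    return {cause: value / total for cause, value in unnormalised.items()}


# Hypothetical example: the image alone is ambiguous (50/50), but the
# higher-level context strongly expects a cow.
posterior = resolve_ambiguity(
    {"cow": 0.5, "sailboat": 0.5},
    {"cow": 0.9, "sailboat": 0.1},
)
```

In this toy case the ambiguous evidence is resolved in favour of the cause that the context expects.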
HTM Nodes Learn Static Patterns
[First Order Markov Graph]
Memorizes static patterns, “coincidences”
HTM Node
Stable, sparse vectors
[Input vector]
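As a rough illustration of memorizing static patterns, the sketch below stores each distinct binary input vector as a "coincidence" and returns its index. The class name, the Hamming-distance matching rule, and the max_distance parameter are assumptions made for this example; the slides do not specify how a node matches inputs to stored patterns.

```python
class CoincidenceMemory:
    """Toy spatial memory that stores distinct binary input vectors."""

    def __init__(self, max_distance=0):
        # Hypothetical tolerance: an input within this Hamming distance of a
        # stored pattern is treated as the same coincidence.
        self.max_distance = max_distance
        self.patterns = []

    @staticmethod
    def _hamming(a, b):
        # Number of positions at which the two vectors differ.
        return sum(x != y for x, y in zip(a, b))

    def learn(self, vector):
        """Return the index of the matching coincidence, storing it if new."""
        vector = tuple(vector)
        for index, stored in enumerate(self.patterns):
            if self._hamming(vector, stored) <= self.max_distance:
                return index
        self.patterns.append(vector)
        return len(self.patterns) - 1


memory = CoincidenceMemory(max_distance=1)
first = memory.learn([1, 0, 0, 1])   # new pattern, stored at index 0
second = memory.learn([0, 1, 1, 0])  # far from the first, stored at index 1
noisy = memory.learn([1, 0, 0, 0])   # one bit away from the first pattern
```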
HTM Nodes Learn Temporal Sequences
[First Order Markov Graph]
Memorizes static patterns, “coincidences”
HTM Node
Models frequency of transitions between patterns
Variable order Markov Chains, “groups”
[Input vectors]
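The idea of modelling transition frequencies and forming "groups" can be sketched as follows. This is only an illustration: it counts first-order transitions between pattern indices and then groups patterns that are linked by frequent transitions (connected components). The grouping rule and the min_count threshold are assumptions, not the variable order method named on the slide.

```python
from collections import Counter


def transition_counts(sequence):
    """Count how often pattern a is immediately followed by pattern b."""
    return Counter(zip(sequence, sequence[1:]))


def group_patterns(sequence, min_count=2):
    """Group patterns that are linked by frequent transitions."""
    counts = transition_counts(sequence)
    neighbours = {pattern: set() for pattern in sequence}
    for (a, b), n in counts.items():
        if n >= min_count and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over the frequent-transition graph.
        stack, group = [start], set()
        while stack:
            pattern = stack.pop()
            if pattern in group:
                continue
            group.add(pattern)
            stack.extend(neighbours[pattern] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Patterns 0 and 1 alternate, then patterns 2 and 3 alternate.
observed = [0, 1, 0, 1, 0, 1, 2, 3, 2, 3, 2, 3]
groups = group_patterns(observed)
```

Patterns that tend to follow one another in time end up in the same group, which is the sense in which a group collects patterns with a common cause.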
HTM Nodes Output Probability Over Sequences
[First Order Markov Graph]
HTM Node [P(g1), P(g2), … ]
[…], […], […], …
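A minimal sketch of how an output vector [P(g1), P(g2), …] could be computed is shown below. It assumes that the node already has a likelihood for each stored pattern and that a group's score is the sum of its members' likelihoods, normalised over all groups. Both the function name and the summation rule are assumptions for illustration only.

```python
def group_probabilities(pattern_likelihoods, groups):
    """Return a normalised score for each group of pattern indices."""
    scores = [sum(pattern_likelihoods[p] for p in group) for group in groups]
    total = sum(scores)
    if total == 0:
        # No evidence at all: fall back to a uniform distribution.
        return [1.0 / len(groups)] * len(groups)
    return [score / total for score in scores]


# Hypothetical likelihoods for four stored patterns, in two groups.
output = group_probabilities([0.6, 0.2, 0.1, 0.1], [[0, 1], [2, 3]])
```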
HTM Nodes Are Connected In Hierarchies
Hierarchies Allow Contextual Prediction
Summary: Hierarchical Temporal Memory
•Network of learning nodes
•All nodes do same thing
— Learns common spatial patterns
— Learns common sequences (groups patterns with common cause)
•Creates hierarchical model of data
—Sequence names passed up
—Predicted spatial patterns passed down
•Bayesian methods resolve ambiguity
Common spatial patterns
Common sequences
High level causes
Low level causes
Web Analytics
•Analyze temporal patterns in a very high traffic news website (Forbes.com)
•Question: Can HTMs model temporal statistics and predict topics and pages of interest to users?
Which Topic Is The User Interested In Next?
•177 total topics
•Random prediction gives 0.56% accuracy
[Figure: topic sequence along a Time axis, with unknown topics marked “?”]
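The random-chance figure follows directly from the number of topics: a uniform guess among 177 topics is right once in 177 tries. A short check, with a hypothetical helper name:

```python
def random_accuracy_percent(num_topics):
    """Accuracy (in percent) of guessing one topic uniformly at random."""
    return 100.0 / num_topics


print(f"{random_accuracy_percent(177):.2f} %")  # prints "0.56 %"
```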
Training Paradigm
HTM trained using 100,000 user sequences
Temporal pooler builds up a variable order sequence model
Prediction Based On Page View Statistics
•We could predict with no temporal context, based only on the popularity of different topics (“0th order” prediction)
•This is what most sites do today
•Leads to 23% accuracy
[Figure: topic sequence along a Time axis, with unknown topics marked “?”]
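A popularity-based predictor of this kind can be sketched in a few lines. It ignores the user's history entirely and always returns the most viewed topics in the training data. The function names and the toy topics are made up for illustration.

```python
from collections import Counter


def train_popularity(sequences):
    """Count how often each topic appears across all user sequences."""
    return Counter(topic for sequence in sequences for topic in sequence)


def predict_popular(counts, k=1):
    """Return the k most viewed topics, regardless of the user's history."""
    return [topic for topic, _ in counts.most_common(k)]


# Hypothetical training data: each inner list is one user's topic sequence.
training = [["tech", "markets", "tech"], ["tech", "sports"]]
popularity = train_popularity(training)
guess = predict_popular(popularity)
```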
First Order Prediction
•Can do better if we use transition probabilities from each page
•Improves accuracy from 23% to 28%
[Figure: topic sequence along a Time axis, with unknown topics marked “?”]
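First-order prediction can be sketched by counting, for each topic, which topic users viewed next, and then predicting the most frequent successor of the current topic. The names and toy data below are hypothetical.

```python
from collections import Counter, defaultdict


def train_first_order(sequences):
    """Count transitions from each topic to the topic viewed next."""
    transitions = defaultdict(Counter)
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, current, k=1):
    """Return the k most frequent successors of the current topic."""
    if current not in transitions:
        return []  # topic never seen with a successor during training
    return [topic for topic, _ in transitions[current].most_common(k)]


training = [["a", "b", "c"], ["a", "b", "d"], ["x", "b", "c"]]
model = train_first_order(training)
guess = predict_next(model, "b")
```

Only the current page matters here; two users who arrive at the same topic by different routes receive the same prediction.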
Variable Order Prediction
•“Variable order prediction” – how much temporal context you need is determined based on individual sequences
•Accuracy jumps to 45%
[Figure: topic sequence along a Time axis, with unknown topics marked “?”]
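One simple way to let the amount of temporal context vary per sequence is a back-off model: use the longest recent history that was seen often enough in training, and fall back to shorter histories otherwise. The sketch below takes that approach; it is a stand-in for illustration, not the temporal pooler described in the slides, and max_order and min_count are hypothetical parameters.

```python
from collections import Counter, defaultdict


def train_variable_order(sequences, max_order=3):
    """Count next-topic frequencies for every context of length 1..max_order."""
    model = defaultdict(Counter)
    for sequence in sequences:
        for i in range(1, len(sequence)):
            for order in range(1, max_order + 1):
                if i - order < 0:
                    break
                context = tuple(sequence[i - order:i])
                model[context][sequence[i]] += 1
    return model


def predict_variable_order(model, history, max_order=3, min_count=2):
    """Predict from the longest context seen at least min_count times."""
    for order in range(min(max_order, len(history)), 0, -1):
        counts = model.get(tuple(history[-order:]))
        if counts and sum(counts.values()) >= min_count:
            return counts.most_common(1)[0][0]
    return None  # no usable context


training = [
    ["a", "b", "c"], ["a", "b", "c"],
    ["x", "b", "d"], ["x", "b", "d"], ["y", "b", "d"],
]
model = train_variable_order(training)
with_context = predict_variable_order(model, ["a", "b"])  # uses ("a", "b")
backed_off = predict_variable_order(model, ["y", "b"])    # falls back to ("b",)
```

In this toy data the topic after "b" depends on how the user reached "b", which a first-order model cannot capture.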
Summary: Predicting News Topics
Method                       Prediction Accuracy
Random chance                0.56 %
Page views prediction        23 %
1st order prediction         28 %
Variable order prediction    45 %
Summary: Predicting News Topics
Method                       Prediction Accuracy    Accuracy Predicting Top-5 Pages
Random chance                0.56 %                 3.16 %
Page views prediction        23 %                   46 %
1st order prediction         28 %                   58 %
Variable order prediction    45 %                   69 %
HTMs potentially represent a powerful mechanism for predicting and analyzing web traffic patterns
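The two accuracy columns can be read as top-1 and top-5 accuracy: a prediction counts as correct if the page the user actually visited is among the first k ranked guesses. A minimal sketch of that metric follows, under the assumption that each predictor returns a ranked list; the function name and toy data are hypothetical.

```python
def top_k_accuracy(ranked_predictions, actual, k):
    """Fraction of cases where the true item is among the first k guesses."""
    if not actual:
        raise ValueError("no test cases")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, actual)
        if truth in ranked[:k]
    )
    return hits / len(actual)


# Three hypothetical test cases; the true next topic is "a" each time.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
truth = ["a", "a", "a"]
top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
```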
Potential Applications In Web Analytics
• Increase length of site visits
—Predict pages that are directly relevant to each user
• Increase revenue
—Predict ad-clicks based on current user’s immediate history
•Display interesting traffic patterns through a website
—What are most common sequences?
•Display changes in traffic patterns
—How are sequence models changing from day to day?
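The question "What are most common sequences?" can be approached, in its simplest form, by counting fixed-length runs of consecutive pages across all visits. The sketch below does only that; the function name, the path length, and the toy visits are assumptions for illustration.

```python
from collections import Counter


def most_common_paths(sequences, length=2, top=5):
    """Return the most frequent runs of `length` consecutive pages."""
    counts = Counter(
        tuple(sequence[i:i + length])
        for sequence in sequences
        for i in range(len(sequence) - length + 1)
    )
    return counts.most_common(top)


visits = [
    ["home", "tech", "markets"],
    ["home", "tech", "markets", "sports"],
    ["home", "tech"],
    ["home", "sports"],
]
paths = most_common_paths(visits, length=2, top=1)
```

Running the same count on each day's visits and comparing the results is one simple way to look at how traffic patterns change from day to day.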
Video Analysis: People Tracking
Person
Example Videos – Persons
Occlusions
Non-ideal lighting
Groups/overlapping people
Small, non-upright
Non-Persons – Potential False Positives
Cars/Vehicles
Balloons
Animals
Trees/foliage/pool sweeper
People Tracking Demo
Applications In Biomedical Imaging
•Numerous pattern recognition tasks in biomedical imaging
Pattern Detection In Digital Pathology
Glands Not glands
Task: detect patterns in biopsy slides indicative of cancer
Malformed glands -> could be prostate cancer
Early Results Were Promising
Glands Not glands
•We trained a network to discriminate glands from other structures
•Test set accuracy was around 95%
HTM For Biomedical Imaging
•HTM performing quite well in gland detection as well as some other tasks
•There could be applications in other areas of Biomedical Imaging
—Radiology
—Electron microscopy
—….
•Key differentiator:
—General purpose pattern recognition algorithm
—Most existing work involves coding very specific algorithms to specific patterns
Application Areas
•Web analytics
•Biomedical Imaging
•Video Analysis
•Credit card fraud
•Automotive
•Gaming
•Drug discovery
•Business modeling
•Healthcare
Working With Numenta On HTMs
• NuPIC, Numenta Platform For Intelligent Computing, available free for research on numenta.com
• Support through an active forum
• Contains implementation of our second generation of algorithms
• Vision Toolkit Beta, free for research
• Easy to use GUI for creating vision applications
• Includes hosted inference and a web services API
• Internships available for students!
• Send email to [email protected]
THANK YOU!!