numenta acm data min - powerpoint presentation

Copyright © 2009 Numenta

Hierarchical Temporal Memory

Subutai [email protected]

Vice President, Engineering, Numenta

Agenda

Introduction to Numenta

What can we learn from Neuroscience?

How can we incorporate these ideas into

Algorithms?

Applications?

Numenta Snapshot

•Creating a new computing technology, Hierarchical Temporal Memory, based on the structure and function of the neocortex

•16 employees

—Founded in 2005 by Jeff Hawkins, Donna Dubinsky and Dileep George

•For-profit company with very long term roadmap and “patient capital”

—Focus on core technology

—Currently developing our third generation of algorithms

—Very selective corporate partnerships and application development

Numenta Timeline

2002 Redwood Neuroscience Institute, Jeff Hawkins

2004 On Intelligence, Hawkins and Blakeslee

Described theory of Hierarchical Temporal Memory (HTM)

2005 Mathematical formalism (Dileep George)

2005 Numenta founded to build new computing platform based on HTM

2007 Released NuPIC software platform

2008 First HTM Workshop (>200 attendees)

2009 Vision toolkit Beta release

2010 Prediction toolkit release

Demo: An Easy Visual Task

Goal: output the name of the object in the image

cell phone

cow

rubber duck

sailboat

Why Isn’t This Easy For Computers?

Huge variations in images, even within a single category

It is impossible to write down a set of rules or transformations that cover all possibilities

Vision4 - Four Category Object Recognition Demo





No Universal Learning Machine

No Free Lunch Theorem: “no learning algorithm has an inherent superiority over other learning algorithms for all problems.” (Wolpert, 1995)

Universal Learning Machine

Specific Learning Machine

Machine with assumptions that match the structure of the world


The Neocortex

• Many different regions performing specialized functions

• Local structure is similar across regions

http://brainmapping.loni.ucla.edu/BMD_HTML/SharedCode/slides/fMRIBasics/img021.gif

Common Cortical Algorithm

Cortical Hierarchy

Sensory data (retina)

Sensory data (skin)

• Representations are distributed hierarchically

• Connections are bidirectional – significant feedback projections

• Each region is exposed to constantly changing sensory patterns and is constantly predicting future patterns

From: Felleman and Van Essen





Hierarchical Temporal Memory (HTM)

•Network of learning nodes

•All nodes do same thing

— Learns common spatial patterns

— Learns common sequences (groups patterns with common cause)

•Create a hierarchical, spatio-temporal model of data

—Probability of sequences passed up

—Predicted spatial patterns passed down

•Bayesian methods resolve ambiguity

Common spatial patterns

Common sequences

High level causes

Low level causes

HTM Nodes Learn Static Patterns

First Order Markov Graph

Memorizes static patterns, “coincidences”

HTM Node

Stable, sparse vectors

[Input vector]
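The memorization step on this slide can be sketched in a few lines. This is a minimal illustration, not Numenta's implementation: the class name `CoincidenceMemory` and the `max_distance` tolerance are assumptions made here, and inputs are matched by plain Euclidean distance.

```python
import math


class CoincidenceMemory:
    """Toy store of distinct input patterns ("coincidences")."""

    def __init__(self, max_distance=0.0):
        # Hypothetical tolerance: an input within this Euclidean distance
        # of a stored pattern is treated as the same coincidence.
        self.max_distance = max_distance
        self.patterns = []

    def learn(self, vector):
        """Return the index of the matching coincidence, storing a new one if needed."""
        v = tuple(float(x) for x in vector)
        for i, p in enumerate(self.patterns):
            if math.dist(p, v) <= self.max_distance:
                return i
        self.patterns.append(v)
        return len(self.patterns) - 1


memory = CoincidenceMemory(max_distance=0.5)
inputs = [[1, 0, 0], [0, 1, 0], [1, 0, 0.2], [0, 1, 0]]
indices = [memory.learn(x) for x in inputs]
print(indices)  # the third input is close enough to reuse the first coincidence
```

The node's later stages only need the returned index, so the raw input vector can be discarded once it has been matched to a stored pattern.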

HTM Nodes Learn Temporal Sequences

First Order Markov Graph

Memorizes static patterns, “coincidences”

HTM Node

Models frequency of transitions between patterns

Variable order Markov Chains, “groups”

[Input vectors]
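As a rough sketch of "models frequency of transitions between patterns", the snippet below counts first-order transitions between coincidence indices and normalizes them into probabilities. It covers only the first-order Markov graph; the grouping into variable-order chains is not attempted, and the function name is an assumption made here.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(next | current) from one sequence of coincidence indices."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for current, following in counts.items()
    }


# Hypothetical stream of coincidence indices seen by one node over time.
probs = transition_probabilities([0, 1, 2, 0, 1, 2, 0, 2])
print(probs[0])  # pattern 0 is followed by pattern 1 twice and by pattern 2 once
```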

HTM Nodes Output Probability Over Sequences

First Order Markov Graph

HTM Node [P(g1), P(g2), … ]

[…], […], […], …
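One simple way to turn a belief over coincidences into the output vector [P(g1), P(g2), …] is to sum the belief inside each group and normalize. The slide does not say how the node combines them, so the summation rule, the function name, and the numbers below are all assumptions for illustration.

```python
def group_probabilities(coincidence_belief, groups):
    """Sum the belief over the coincidences in each group, then normalize."""
    scores = [sum(coincidence_belief[c] for c in members) for members in groups]
    total = sum(scores)
    if total == 0:
        # No evidence for any group: fall back to a uniform distribution.
        return [1.0 / len(groups)] * len(groups)
    return [s / total for s in scores]


belief = [0.6, 0.2, 0.1, 0.1]  # hypothetical match strength per coincidence
groups = [[0, 1], [2, 3]]      # hypothetical grouping learned from sequences
output = group_probabilities(belief, groups)
print(output)
```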

HTM Nodes Are Connected In Hierarchies

Hierarchies Allow Contextual Prediction

Summary: Hierarchical Temporal Memory

•Network of learning nodes

•All nodes do same thing

— Learns common spatial patterns

— Learns common sequences (groups patterns with common cause)

•Creates hierarchical model of data

—Sequence names passed up

—Predicted spatial patterns passed down

•Bayesian methods resolve ambiguity

Common spatial patterns

Common sequences

High level causes

Low level causes
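The point that Bayesian methods resolve ambiguity can be illustrated with Bayes' rule over two candidate causes. This is a generic sketch rather than the belief propagation used inside an HTM network; the category names are borrowed from the vision demo and the probabilities are made up.

```python
def posterior(prior, likelihood):
    """Bayes' rule over discrete causes: posterior is proportional to prior * likelihood."""
    unnormalized = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(unnormalized.values())
    return {cause: p / total for cause, p in unnormalized.items()}


# Bottom-up evidence alone cannot separate the two causes.
likelihood = {"cow": 0.5, "sailboat": 0.5}
# Top-down expectation from higher in the hierarchy (hypothetical values).
prior = {"cow": 0.9, "sailboat": 0.1}

result = posterior(prior, likelihood)
print(max(result, key=result.get))  # the prior breaks the tie
```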





Web Analytics

•Analyze temporal patterns in a very high traffic news website (Forbes.com)

•Question: Can HTMs model temporal statistics and predict topics and pages of interest to users?

Which Topic Is The User Interested In Next?

•177 total topics

•Random prediction gives 0.56% accuracy
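The random-chance baseline follows directly from the topic count: a uniform guess over 177 topics is right with probability 1/177.

```python
num_topics = 177
chance = 1 / num_topics      # probability that a uniform random guess is correct
formatted = f"{chance:.2%}"
print(formatted)  # prints 0.56%
```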

[Figure: sequence of topics along a time axis, with the next topic marked “?”]

Training Paradigm

HTM trained using 100,000 user sequences

Temporal pooler builds up a variable order sequence model

Prediction Based On Page View Statistics

•Could predict using no temporal context, based only on the popularity of different topics (“0th-order” prediction)

•This is what most sites do today

•Leads to 23% accuracy
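A 0th-order predictor of this kind can be sketched as a popularity count that ignores what the user just viewed. The topic names and sessions below are invented for illustration and are not the Forbes.com data.

```python
from collections import Counter


def train_popularity(sequences):
    """Count how often each topic appears, ignoring order entirely."""
    return Counter(topic for seq in sequences for topic in seq)


def predict_popular(counts, k=1):
    """Always predict the k most viewed topics, whatever the user just saw."""
    return [topic for topic, _ in counts.most_common(k)]


# Hypothetical user sessions (lists of topics in viewing order).
sessions = [
    ["markets", "tech", "markets"],
    ["tech", "markets"],
    ["lifestyle", "markets"],
]
counts = train_popularity(sessions)
print(predict_popular(counts))
```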

[Figure: sequence of topics along a time axis, with the next topic marked “?”]

First Order Prediction

•Can do better if we use transition probabilities from each page

•Improves accuracy from 23% to 28%
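First-order prediction can be sketched as a table of transition counts from each topic to the next, with the most frequent successor used as the prediction. The function names, the `fallback` parameter, and the sessions are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_first_order(sequences):
    """Count transitions current -> next across all sessions."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions, current, fallback=None):
    """Predict the most frequent successor of `current`, or `fallback` if unseen."""
    following = transitions.get(current)
    if not following:
        return fallback
    return following.most_common(1)[0][0]


# Hypothetical sessions.
sessions = [
    ["markets", "tech", "autos"],
    ["markets", "tech", "autos"],
    ["lifestyle", "tech", "travel"],
]
model = train_first_order(sessions)
print(predict_next(model, "tech"))
```

Only the current page matters here, so the model cannot tell a user who arrived at "tech" from "markets" apart from one who arrived from "lifestyle".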

[Figure: sequence of topics along a time axis, with the next topic marked “?”]

Variable Order Prediction

•“Variable order prediction” – how much temporal context you need is determined based on individual sequences

•Accuracy jumps to 45%
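The idea of letting each sequence decide how much context to use can be approximated with a back-off n-gram model: use the longest history that was seen in training and shorten it otherwise. This is a stand-in technique chosen for illustration, not the HTM temporal pooler; `max_order`, `min_count`, and the sessions are assumptions.

```python
from collections import Counter, defaultdict


def train_variable_order(sequences, max_order=3):
    """Count next-topic frequencies for every context of length 0..max_order."""
    model = defaultdict(Counter)
    for seq in sequences:
        for i, nxt in enumerate(seq):
            for order in range(max_order + 1):
                if i - order < 0:
                    break
                model[tuple(seq[i - order:i])][nxt] += 1
    return model


def predict_next(model, history, max_order=3, min_count=1):
    """Use the longest context seen at least min_count times; back off otherwise."""
    for order in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - order:])
        following = model.get(context)
        if following and sum(following.values()) >= min_count:
            return following.most_common(1)[0][0]
    return None


# Hypothetical sessions: what follows "tech" depends on what came before it.
sessions = [
    ["markets", "tech", "autos"],
    ["markets", "tech", "autos"],
    ["lifestyle", "tech", "travel"],
    ["lifestyle", "tech", "travel"],
    ["sports", "tech", "travel"],
]
model = train_variable_order(sessions)
print(predict_next(model, ["markets", "tech"]))  # two-topic context is available
print(predict_next(model, ["tech"]))             # only one topic of context
```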

[Figure: sequence of topics along a time axis, with the next topic marked “?”]

Summary: Predicting News Topics

Prediction                  Accuracy
Random chance               0.56 %
Page views prediction       23 %
1st order prediction        28 %
Variable order prediction   45 %

Summary: Predicting News Topics

Prediction                  Accuracy    Accuracy Predicting Top-5 Pages
Random chance               0.56 %      3.16 %
Page views prediction       23 %        46 %
1st order prediction        28 %        58 %
Variable order prediction   45 %        69 %
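The two accuracy columns differ only in how many guesses count as a hit. A minimal sketch of that metric, assuming each prediction is a ranked list of candidates, is shown below; the data is invented and the function name is hypothetical.

```python
def top_k_accuracy(ranked_predictions, actual, k):
    """Fraction of cases where the true item is among the first k predictions."""
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, actual) if truth in ranked[:k]
    )
    return hits / len(actual)


# Hypothetical ranked predictions and the topics users actually viewed next.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "a", "b"], ["a", "c", "b"]]
actual = ["a", "a", "b", "b"]

top1 = top_k_accuracy(ranked, actual, k=1)
top2 = top_k_accuracy(ranked, actual, k=2)
print(top1, top2)
```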

HTMs potentially represent a powerful mechanism for predicting and analyzing web traffic patterns

Potential Applications In Web Analytics

• Increase length of site visits

—Predict pages that are directly relevant to each user

• Increase revenue

—Predict ad-clicks based on current user’s immediate history

•Display interesting traffic patterns through a website

—What are most common sequences?

•Display changes in traffic patterns

—How are sequence models changing from day to day?
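For the "most common sequences" question, a simple starting point is to count fixed-length paths through the site. This sketch uses plain n-gram counting rather than an HTM model, and the page names and the `length` and `top` parameters are assumptions.

```python
from collections import Counter


def most_common_paths(sessions, length=2, top=3):
    """Count every run of `length` consecutive pages and return the most frequent."""
    counts = Counter(
        tuple(s[i:i + length])
        for s in sessions
        for i in range(len(s) - length + 1)
    )
    return counts.most_common(top)


# Hypothetical sessions.
sessions = [
    ["home", "markets", "tech"],
    ["home", "markets", "autos"],
    ["markets", "tech"],
    ["home", "markets"],
]
paths = most_common_paths(sessions, length=2, top=2)
print(paths)
```

Running the same count on each day's sessions and comparing the results is one way to see how traffic patterns shift from day to day.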

Video Analysis: People Tracking

Person

Example Videos – Persons

Occlusions

Non-ideal lighting

Groups/overlapping people

Small, non-upright

Non-Persons – Potential False Positives

Cars/Vehicles

Balloons

Animals

Trees/foliage/pool sweeper

People Tracking Demo

Applications In Biomedical Imaging

•Numerous pattern recognition tasks in biomedical imaging

Pattern Detection In Digital Pathology

Glands Not glands

Task: detect patterns in biopsy slides indicative of cancer

Malformed glands -> could be prostate cancer

Early Results Were Promising

Glands Not glands

•We trained a network to discriminate glands from other structures

•Test set accuracy was around 95%

HTM For Biomedical Imaging

•HTM performs well in gland detection and in some other tasks

•There could be applications in other areas of Biomedical Imaging

—Radiology

—Electron microscopy

—….

•Key differentiator:

—General purpose pattern recognition algorithm

—Most existing work involves hand-coding algorithms tailored to specific patterns

Application Areas

•Web analytics

•Biomedical Imaging

•Video Analysis

•Credit card fraud

•Automotive

•Gaming

•Drug discovery

•Business modeling

•Healthcare

Working With Numenta On HTMs

• NuPIC, Numenta Platform For Intelligent Computing, available free for research on numenta.com

• Support through an active forum

• Contains implementation of our second generation of algorithms

• Vision Toolkit Beta, free for research

• Easy to use GUI for creating vision applications

• Includes hosted inference and a web services API

• Internships available for students!

• Send email to [email protected]

THANK YOU!!

[email protected]
