[Teaser figure: the training distribution spans head classes, tail classes, and open classes; bottom-up attention, top-down attention, familiarity, and a visual memory address Avoid Forgetting (head classes), Knowledge Transfer (tail classes), and Sensitivity to Novelty (open classes)]
Large-Scale Long-Tailed Recognition in an Open World
Ziwei Liu*, Zhongqi Miao*, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
The Chinese University of Hong Kong & UC Berkeley / ICSI
Open Long-Tailed Recognition

Relation to Existing Tasks
Open Long-Tailed Recognition (OLTR) integrates imbalanced classification, few-shot learning, and open-set recognition into one setting:
• A full treatment of the whole spectrum of real-world visual recognition
• A task requiring improvements to different aspects of existing deep neural networks
Approach Intuition
Relate each input image to a visual memory shared across head and tail classes: memory features transfer knowledge from the head to the tail, while low familiarity with the memory exposes open-class samples.
Module Explanation
[Examples: tail classes ‘African Grey’ and ‘Buckeye’, and an open sample]
• bottom-up attention: spatial self-attention
• top-down attention: memory-induced features with attention selection
• familiarity: reachability to the learned memory
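The poster only names these modules; as one concrete reading, here is a minimal PyTorch sketch of spatial self-attention (bottom-up) modulated by a learned conditional spatial gate (the class name, the `spatial_gate` conv, and the sigmoid gating are our assumptions, not the released implementation):

```python
import torch
import torch.nn as nn

class ModulatedAttention(nn.Module):
    """Sketch: spatial self-attention (bottom-up) whose output is
    modulated by a learned conditional spatial gate. Names are
    illustrative, not the released implementation."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        # Conditional map deciding where the attended context is used.
        self.spatial_gate = nn.Conv2d(channels, 1, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (b, hw, c//8)
        k = self.key(x).flatten(2)                    # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)           # (b, hw, hw)
        v = self.value(x).flatten(2).transpose(1, 2)  # (b, hw, c)
        sa = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        gate = torch.sigmoid(self.spatial_gate(x))    # (b, 1, h, w)
        # Residual: keep the original features, add gated attended context.
        return x + gate * sa
```

The residual connection preserves the bottom-up signal, so the gate can only add attended context rather than overwrite the original features; e.g. `ModulatedAttention(256)(torch.randn(2, 256, 14, 14))` returns a tensor of the same shape.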
Overall Architecture
[Architecture diagram: the DIRECT FEATURE from the backbone and a MEMORY FEATURE derived from the visual memory are fused into the OLTR FEATURE, which the classifier maps to logits]
Modules and operations:
• Concept Selector: fc + tanh over the direct feature
• Hallucinator: fc + softmax, producing coefficients over the memory
• Reachability: minimum distance from the direct feature to the memory
• Classifier: cosine normalization on the OLTR feature, yielding logits
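Reading the diagram left to right, the following is a hedged PyTorch sketch of how these modules could compose into the dynamic meta-embedding; the centroid memory, the layer names, and the norm-squashing classifier variant are our reading of the diagram, not the authors' released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicMetaEmbedding(nn.Module):
    """Sketch: direct feature + memory feature, gated by a concept
    selector and scaled by reachability to the visual memory.
    The centroid memory and layer names are assumptions."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        # Visual memory: one centroid per class (in practice these are
        # computed from training features, not random).
        self.centroids = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.hallucinator = nn.Linear(feat_dim, num_classes)  # fc + softmax
        self.selector = nn.Linear(feat_dim, feat_dim)         # fc + tanh
        self.cls_weight = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, v_direct: torch.Tensor) -> torch.Tensor:
        # Hallucinator: softmax coefficients over memory -> memory feature.
        coeffs = F.softmax(self.hallucinator(v_direct), dim=1)
        v_memory = coeffs @ self.centroids
        # Concept selector: tanh gate choosing which memory dims to use.
        e = torch.tanh(self.selector(v_direct))
        # Reachability: inverse minimum distance to any centroid; samples
        # far from the memory (likely open-class) get down-scaled features.
        min_dist = torch.cdist(v_direct, self.centroids).min(dim=1).values
        reach = (1.0 / min_dist.clamp(min=1e-6)).unsqueeze(1)
        v_meta = reach * (v_direct + e * v_memory)
        # Cosine-normalized classifier, but with the feature norm squashed
        # into (0, 1) rather than discarded, so reachability still matters.
        norm = v_meta.norm(dim=1, keepdim=True).clamp(min=1e-6)
        v_hat = (norm / (1.0 + norm)) * (v_meta / norm)
        return F.linear(v_hat, F.normalize(self.cls_weight, dim=1))
```

Because an open sample lies far from every centroid, its reachability factor shrinks v_meta, so uniformly low logits double as an open-class signal; this is also why the sketch squashes the feature norm instead of discarding it with full L2 normalization.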
Benchmarks
ImageNet-LT, Places-LT, and MS1M-LT: long-tailed training sets (head classes > 1000 samples, tail classes < 5 samples) with evaluation across many-shot, medium-shot, few-shot, and open classes.

Experimental Results
• ImageNet-LT benchmark: ~20% absolute gain in top-1 classification accuracy
• Places-LT benchmark: ~10% absolute gain in top-1 classification accuracy
[Bar charts: top-1 classification accuracy on ImageNet-LT and Places-LT, broken down by shot split, including few-shot]
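Per-split numbers like these are typically computed by bucketing classes by their training frequency; the helper below is a generic NumPy sketch (the function name and thresholds are illustrative, not the benchmarks' official evaluation code):

```python
import numpy as np

def split_accuracy(preds, labels, train_counts,
                   many_thresh=100, few_thresh=20):
    """Top-1 accuracy per class-frequency split.

    preds / labels: 1-D integer arrays of predicted and true class ids.
    train_counts: number of training samples per class id.
    Thresholds are illustrative; each benchmark fixes its own cutoffs.
    """
    preds, labels = np.asarray(preds), np.asarray(labels)
    counts = np.asarray(train_counts)[labels]  # train count of each sample's class
    correct = preds == labels
    splits = {
        "many-shot": counts > many_thresh,
        "medium-shot": (counts <= many_thresh) & (counts >= few_thresh),
        "few-shot": counts < few_thresh,
    }
    return {name: float(correct[mask].mean()) if mask.any() else float("nan")
            for name, mask in splits.items()}
```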
• Across-the-board improvement on all class types

Conclusions
• Incorporate memory and attention into learning
• New task toward real-world visual recognition: Open Long-Tailed Recognition
• New approach with a memory-augmented network: Dynamic Meta-Embedding
• New benchmarks for future research: ImageNet-LT, Places-LT, and MS1M-LT
[Approach overview figure: the input image is encoded with modulated attention and related to the visual memory, addressing Avoid Forgetting (head classes), Knowledge Transfer (tail classes), and Sensitivity to Novelty (open classes)]
[Qualitative comparison on the tail class ‘patio’ (5 training samples): the plain model predicts ‘restaurant’ and ‘horse cart’, while our model correctly predicts ‘patio’]