Detecting and Reading Text in Natural Scenes

Xiangrong Chen, Alan L. Yuille

{xrchen, Yuille}@stat.ucla.edu

Statistics Dept., UCLA

CVPR ’04

Outline

Background

Overview of our method

Detecting text

Reading text

Experiments

Summary

Text detection methods

Text as texture

Text as connected component

[Figure: example illustrations labeled "TEXT" and "T"]

Comparison

Feature
    Text as texture: texture analysis
    Text as connected component: shape, structure and appearance analysis

Searching method
    Text as texture: scan the image using a small window at different scales
    Text as connected component: enumerate all the connected components (CCPs); need image segmentation to obtain the CCPs

Pros
    Text as texture: easy to deal with scale and complex background; scan quickly
    Text as connected component: easily leads to a generative model and thus can guide the recognition task

Cons
    Text as texture: discriminant model; a black box, not easy to guide the recognition task
    Text as connected component: no good enough segmentation algorithm available to get the CCPs

Combination

Find candidate area using text as texture

Verify using text as connected component

Proposed method

AdaBoost for text detection

Connected components evaluation

Adaptive binarization

OCR engine

Text as texture

Text as connected component

Why use AdaBoost

Improves classification accuracy

Can be used with many different classifiers

Simple to implement

Not prone to overfitting
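The points above can be made concrete with a minimal AdaBoost sketch. It is only an illustration under stated assumptions: it uses one-dimensional threshold stumps as weak learners for brevity (the slides later describe log-likelihood ratio weak learners instead), and the toy data and all function names are hypothetical.

```python
import math

def stump_predict(stump, x):
    """A stump is (feature index, threshold, sign); it outputs +1 or -1."""
    feature, threshold, sign = stump
    return sign if x[feature] >= threshold else -sign

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best_err, best_stump = None, None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for sign in (1, -1):
                stump = (feature, threshold, sign)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_err, best_stump

def adaboost(X, y, rounds=10):
    """Return a list of (alpha, stump) pairs forming the strong classifier."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        if err <= 1e-10:  # training data already separated
            break
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data: only the second feature separates the two classes.
X = [[5, 1], [1, 2], [4, 3], [2, 6], [3, 7], [0, 8]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost(X, y)
```

The weak learner is a plug-in component here, which is the sense in which AdaBoost "can be used with many different classifiers".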

Training data

162 source images taken by normally sighted and blind people

Manually label text regions

Cut the text regions into overlapping training samples with a fixed width-to-height ratio of 2:1
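The cutting step can be sketched as follows. This is an assumption-laden illustration: the slides give only the 2:1 ratio, so the 50% overlap, the choice of sample height equal to the region height, and the function name `cut_samples` are all hypothetical.

```python
def cut_samples(region, overlap=0.5, aspect=2.0):
    """Cut a labeled text region into overlapping boxes of fixed aspect ratio.

    region is (x, y, width, height); each returned box is (x, y, w, h)
    with w = aspect * h. The overlap fraction is an assumed parameter.
    """
    x0, y0, width, height = region
    sample_w = int(round(aspect * height))
    if width < sample_w:
        return []  # region too narrow for a single 2:1 sample
    step = max(1, int(sample_w * (1.0 - overlap)))
    xs = list(range(x0, x0 + width - sample_w + 1, step))
    # Add one last window flush with the right edge if it is not covered.
    if xs[-1] + sample_w < x0 + width:
        xs.append(x0 + width - sample_w)
    return [(x, y0, sample_w, height) for x in xs]

boxes = cut_samples((0, 0, 100, 20))
```

For a 100 × 20 region this yields four 40 × 20 samples, each shifted by 20 pixels.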

Features – Criteria

Informative
    Invariant for text regions
    Discriminating between text and non-text regions

Cost
    Computation

Features – Training samples

Face
    Raw data: 4,000 faces, 32 × 32
    Align, crop & scale
    PCA: first 50 PCs capture 90% energy

Text
    Raw data: 4,000 patches, 20 × 40
    Align, crop & scale
    PCA: first 150 PCs capture 90% energy
    Features ?

[Figure: mean face and mean patch; plot of captured energy (0.1 to 1) against number of principal components (0 to 400)]
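The "first N PCs capture 90% energy" statement can be read as a cumulative sum over the PCA eigenvalues. A small sketch, assuming the eigenvalues (variances along each principal component) are already available; the function name and the toy spectrum are hypothetical.

```python
def components_for_energy(eigenvalues, target=0.9):
    """Smallest number of leading components whose eigenvalues sum to
    at least `target` of the total energy."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running >= target * total:
            return k
    return len(ordered)

# Toy spectrum: the first two components hold 95% of the energy.
k = components_for_energy([8.0, 1.5, 0.5])
```

A flatter spectrum needs more components to reach the same fraction, which is the contrast the slide draws between aligned faces and text patches.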

Features – Set I

1st order derivatives

Mean of dI/dx, mean of dI/dy
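A first-order derivative feature of this kind might be computed as below. The forward differences and the use of absolute values are assumptions of this sketch: a signed sum of forward differences along a row telescopes to the difference between the last and first pixel, so absolute values are used here to measure stroke activity. The function name is hypothetical.

```python
def derivative_means(patch):
    """Mean absolute horizontal and vertical forward differences.

    patch is a list of rows of intensities with at least 2 rows and 2 columns.
    Returns (mean |dI/dx|, mean |dI/dy|).
    """
    rows, cols = len(patch), len(patch[0])
    dx = [abs(patch[r][c + 1] - patch[r][c])
          for r in range(rows) for c in range(cols - 1)]
    dy = [abs(patch[r + 1][c] - patch[r][c])
          for r in range(rows - 1) for c in range(cols)]
    return sum(dx) / len(dx), sum(dy) / len(dy)

# A vertical stroke: strong horizontal derivative, no vertical derivative.
stroke = [[0, 10, 0],
          [0, 10, 0]]
features = derivative_means(stroke)
```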

Features – Set II

Histograms of intensity and gradient
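Histogram features of this type could be sketched as follows. The bin count, the 0 to 255 intensity range, and the forward-difference gradient are assumptions for illustration, not values given in the slides; both function names are hypothetical.

```python
import math

def normalized_histogram(values, bins, lo, hi):
    """Histogram of values over [lo, hi), normalized to sum to 1."""
    counts = [0] * bins
    for v in values:
        index = int((v - lo) / (hi - lo) * bins)
        counts[min(max(index, 0), bins - 1)] += 1
    return [c / len(values) for c in counts]

def gradient_magnitudes(patch):
    """Gradient magnitude from forward differences on the interior pixels."""
    rows, cols = len(patch), len(patch[0])
    return [math.hypot(patch[r][c + 1] - patch[r][c],
                       patch[r + 1][c] - patch[r][c])
            for r in range(rows - 1) for c in range(cols - 1)]

intensity_hist = normalized_histogram([0, 255, 255, 128], bins=4, lo=0, hi=256)
magnitudes = gradient_magnitudes([[0, 3], [4, 0]])
```

The gradient magnitudes can be passed through `normalized_histogram` in the same way to obtain the gradient histogram.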

Features – Set III

Edge linking features

Edge map → thinning → linking

Using statistics of the length of the linked edges
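One way to obtain such length statistics is sketched below. It assumes the edge map has already been thinned to one-pixel-wide curves and approximates "linking" by 8-connected grouping, with the pixel count standing in for edge length; the slides do not specify the actual linking procedure, so this is a stand-in. Function names are hypothetical.

```python
from collections import deque

def linked_edge_lengths(edge_map):
    """Pixel counts of 8-connected edge chains in a binary edge map."""
    rows, cols = len(edge_map), len(edge_map[0])
    seen = [[False] * cols for _ in range(rows)]
    lengths = []
    for r in range(rows):
        for c in range(cols):
            if not edge_map[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:  # breadth-first walk along the chain
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and edge_map[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            lengths.append(size)
    return lengths

def length_statistics(lengths):
    """Simple summary statistics usable as features."""
    return {"count": len(lengths),
            "mean": sum(lengths) / len(lengths),
            "max": max(lengths)}

edges = [[1, 0, 0, 0, 1],
         [0, 1, 0, 0, 1],
         [0, 0, 1, 0, 0]]
stats = length_statistics(linked_edge_lengths(edges))
```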

Weak learners

The ability of the strong classifier is determined by the ability of the weak learners

A strong classifier with 1D stump weak learners can't deal with the example

[Figure: 2D scatter plot of "x" and "o" samples]

We use log-likelihood ratio tests on distributions of both single features and pairs of features as weak learners (Konishi and Yuille, 2003)
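A weak learner based on a log-likelihood ratio over a pair of features can be sketched with two smoothed 2D histograms, one for text and one for non-text. The binning, the additive smoothing constant `eps`, the feature range, and the XOR-style toy data are assumptions for illustration; the function names are hypothetical.

```python
import math

def _bin(value, lo, hi, bins):
    index = int((value - lo) / (hi - lo) * bins)
    return min(max(index, 0), bins - 1)

def fit_llr_pair(text, background, bins=4, lo=0.0, hi=1.0, eps=1.0):
    """Return llr(a, b) = log P(a, b | text) / P(a, b | non-text),
    with both distributions estimated by smoothed 2D histograms."""
    def histogram(samples):
        counts = [[eps] * bins for _ in range(bins)]
        for a, b in samples:
            counts[_bin(a, lo, hi, bins)][_bin(b, lo, hi, bins)] += 1
        total = sum(map(sum, counts))
        return [[c / total for c in row] for row in counts]

    p_text, p_background = histogram(text), histogram(background)

    def llr(a, b):
        i, j = _bin(a, lo, hi, bins), _bin(b, lo, hi, bins)
        return math.log(p_text[i][j] / p_background[i][j])

    return llr

# XOR-like layout: no single threshold on one feature separates the classes,
# but the joint distribution of the pair does.
text = [(0.1, 0.1), (0.9, 0.9)]
background = [(0.1, 0.9), (0.9, 0.1)]
llr = fit_llr_pair(text, background, bins=2)
```

The sign of `llr(a, b)` gives the weak decision (positive for text), and its magnitude reflects how strongly the pair of feature values favors one class.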

An example of a weak learner

The joint distribution of a pair of features forms the first weak learner selected by AdaBoost

Text distribution is shaded.

Cascade of strong classifiers

Candidates → µ and σ → Derivative features → Derivative features → All features → Results

Candidates rejected at any stage are ruled out
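The cascade idea is that cheap features reject most windows early, so the expensive features run on few candidates. A minimal sketch, in which the stage scores, thresholds, and stage names are all hypothetical stand-ins for the trained strong classifiers:

```python
import statistics

def run_cascade(window, stages):
    """Pass a window through the stages in order.

    stages is a list of (name, score function, threshold).
    Returns (True, None) if every stage accepts, else (False, rejecting stage).
    """
    for name, score, threshold in stages:
        if score(window) < threshold:
            return False, name  # ruled out, later stages are never computed
    return True, None

def intensity_std(window):
    return statistics.pstdev(window)

def mean_abs_derivative(window):
    diffs = [abs(b - a) for a, b in zip(window, window[1:])]
    return sum(diffs) / len(diffs)

# Cheapest test first; thresholds are illustrative only.
stages = [("mu_sigma", intensity_std, 20.0),
          ("derivative", mean_abs_derivative, 30.0)]

flat = [100] * 8                             # no contrast at all
ramp = [0, 20, 40, 60, 80, 100, 120, 140]    # contrast but weak edges
stripes = [0, 200] * 4                       # strong alternating edges
```

Here windows are flat lists of intensities to keep the example short; a real detector would score 2D sub-windows.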

Text detection examples

Fail to detect

Vertically aligned text

Individual letters

Extreme cases

Adaptive binarization

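Niblack's method: threshold each pixel $x$ using the mean $\mu_r(x)$ and standard deviation $\sigma_r(x)$ of the intensity in a neighborhood of size $r$

$$T_r(x) = \mu_r(x) + k\,\sigma_r(x)$$

Determine the range of neighborhood sizes $R(h)$ relative to the sub-window height $h$, and choose the smallest size whose standard deviation exceeds a threshold $T_0$

$$r(x) = \min_{r \in R(h)} \{\, r : \sigma_r(x) > T_0 \,\}$$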
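A direct, unoptimized reading of this adaptive binarization is sketched below: for each pixel, take the smallest neighborhood whose standard deviation exceeds a threshold, then apply the Niblack threshold (mean plus k times standard deviation) in that neighborhood. The candidate radii, the default `k=-0.2` (a value commonly quoted for Niblack's method), the threshold `t0`, the fallback to the largest radius, and the bright-foreground convention are assumptions of this sketch.

```python
import math

def local_stats(img, row, col, r):
    """Mean and standard deviation in a (2r+1)-wide window, clipped to the image."""
    rows, cols = len(img), len(img[0])
    values = [img[y][x]
              for y in range(max(0, row - r), min(rows, row + r + 1))
              for x in range(max(0, col - r), min(cols, col + r + 1))]
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var)

def choose_radius(img, row, col, radii, t0):
    """Smallest candidate radius with enough local contrast."""
    for r in sorted(radii):
        if local_stats(img, row, col, r)[1] > t0:
            return r
    return max(radii)  # assumed fallback when no radius qualifies

def binarize(img, radii, k=-0.2, t0=10.0):
    """Label a pixel 1 when it is brighter than its local Niblack threshold."""
    out = []
    for row in range(len(img)):
        out_row = []
        for col in range(len(img[0])):
            r = choose_radius(img, row, col, radii, t0)
            mu, sigma = local_stats(img, row, col, r)
            out_row.append(1 if img[row][col] > mu + k * sigma else 0)
        out.append(out_row)
    return out

img = [[0, 0, 0, 0, 0, 0, 100]]
radii = [1, 2, 3, 6]  # in practice these would be derived from the sub-window height
```

In the flat part of the toy image the window grows until it reaches the bright pixel, while next to the bright pixel the smallest window already suffices.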

OCR engine

Currently we use a commercial OCR engine

A generative model for reading text is under development

Text reading examples

False positives

Building structures

Signs or icons

Tree leaves and branches

Results

Accuracy
    False negatives for detection: 2.8%
    False positives for detection: ~1/200,000
    False negatives for reading: 7%
    False positives for reading: 10% (1% with the constraint to form a coherent word)

Speed
    3 seconds for a 2,048 × 1,536 image
    ~15 fps for 320 × 240 video frames
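As a quick consistency check on the two speed figures, both can be converted to pixels processed per second. This is plain arithmetic on the reported numbers; the variable names are illustrative.

```python
# Large still image: 2,048 x 1,536 pixels in 3 seconds.
still_pixels = 2048 * 1536
still_rate = still_pixels / 3          # pixels per second

# Video: 320 x 240 pixels per frame at about 15 frames per second.
frame_pixels = 320 * 240
video_rate = frame_pixels * 15         # pixels per second

ratio = video_rate / still_rate
```

Both settings come out at roughly one million pixels per second, so the two reported speeds are consistent with each other.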

Summary

Using AdaBoost to learn a strong classifier for detecting text in unconstrained scenes

Selection of informative features with consideration of computation cost

Detecting and reading over 90% of the text regions in our database

Real-time (15 fps) for video-quality images (320 × 240)

ICDAR’s competition

Database