Visual Search Engine for Handwritten and Typeset Math in Lecture Videos and LaTeX Notes
Kenny Davila and Richard Zanibbi August 6, 2018
Center for Unified Biometrics and Sensors
Select
Search
SEARCH RESULTS
Found in Lecture Videos
1. Linear Algebra - Lecture 06
2. Linear Algebra - Lecture 08
3. Linear Algebra - Lecture 10
…

Related Topics
1. Systems of Equations
2. Matrix Reduction
3. Linear Algebra
What about other Mathematical Expressions? Could I write my queries instead of using Images? Yes, using LaTeX.
Potential Search Modes
Lecture Notes (LaTeX) and Lecture Video (Whiteboard) give four query → index combinations:
- LaTeX → LaTeX
- LaTeX → Whiteboard
- Whiteboard → Whiteboard
- Whiteboard → LaTeX
Tangent-V Visual Search Engine
- Applied to indexing and retrieval of formulae from lecture materials
- Based on matching symbol pairs from Line of Sight (LOS) graphs
- Domain knowledge is given by the recognition module (currently: mathematical symbol recognition)
Source code released: https://cs.rit.edu/~dprl/Software.html
Related Work
Related fields:
- Content-Based Image Retrieval [1]
- Word Spotting [2]
- Mathematical Information Retrieval [3]

- Formula Representation: Semantic vs Appearance
- Retrieval Modality: Symbol vs Image-based
- Tangent-V generalizes the Tangent-S formula retrieval model [4]

[1] J. Sivic and A. Zisserman, "Video Google: A text retrieval approach to object matching in videos," ICCV, 2003.
[2] S. Sudholt and G. A. Fink, "PHOCNet: A deep convolutional neural network for word spotting in handwritten documents," ICFHR, 2016.
[3] R. Zanibbi and D. Blostein, "Recognition and retrieval of mathematical expressions," IJDAR, vol. 15, no. 4, 2012.
[4] K. Davila and R. Zanibbi, "Layout and semantics: Combining representations for mathematical formula search," SIGIR, 2017.
Tangent-V Overview
Indexing Pipeline
Navigation Pipeline
Retrieval Pipeline
Supplementary Lecture Notes (LaTeX)
Input Lecture Notes
Binary Images
Output Math Expressions
[1] Davila, K., Zanibbi, R. Whiteboard Content Summarization via Spatio-Temporal Conflict Minimization in Lecture Videos. ICDAR 2017
Preprocessing Lecture Video Summarization [1]
Temporal Index
Binary Images
Input Lecture Video
Output Whiteboard Contents Keyframes
MTS/MP4
Content Extraction
Temporal Segmentation
Spatio-temporal Analysis
Lecture Video Navigation from Keyframes
Indexing Pipeline (Overview)
- Raw Data
- Pre-processing: AccessMath Lecture Video Summarization [1]
- Outputs of pre-processing: Binary Images, Temporal Index (Videos Only)
- Tangent-V: LOS Graph Construction, Spatial Index Construction
- Output of Tangent-V: Spatial Index
Line of Sight (LOS) Graphs
Uses Connected Components (CC) as nodes.
Two nodes are connected if:
- One can see the other
- Max. distance factor considered for whiteboard content (2 times median size)
True node labels/relationships are unknown
- After symbol recognition, each node has its top $k$ labels with probabilities
- Labels are kept until $\sum_{\omega \in \Omega} p(\omega \mid s) \geq 80\%$, with $k \leq 10$
- Edges have 3D unit vectors indicating direction

Figure: example edges for $2x$ and $x^2$, with unit vectors (0.707, 0.707, 0.000), (1.000, 0.000, 0.000), (-0.707, -0.707, 0.000) and (0.146, -0.146, 0.978)
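The label-selection rule on the LOS graph slide (cumulative probability of at least 80%, at most 10 labels per node) can be sketched as follows. This is a minimal illustration, not the released Tangent-V code: the function name `select_top_labels`, its parameters, and the example probabilities are all hypothetical, and the stopping rule is one reading of the slide.

```python
def select_top_labels(label_probs, min_cumulative=0.80, max_labels=10):
    """Keep the most probable labels for one node.

    Stops once the kept labels cover at least `min_cumulative` of the
    probability mass, or once `max_labels` labels have been kept.
    """
    ranked = sorted(label_probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    total = 0.0
    for label, prob in ranked:
        if total >= min_cumulative or len(kept) >= max_labels:
            break
        kept.append((label, prob))
        total += prob
    return kept


# Hypothetical classifier output for one connected component.
candidates = {"x": 0.55, "X": 0.30, "%": 0.10, "k": 0.05}
print(select_top_labels(candidates))
```

With these made-up probabilities only the two most likely labels are kept, because together they already cover more than 80% of the probability mass.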
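Spatial Indexing using Symbol Pairs

Top $k$ labels per node. For a pair of nodes $(s_1, s_2)$:
- $p_i$: $p(\omega_i \mid s_i)$
- $\mathbf{u}$: 3D unit vector from $s_1$ to $s_2$
- $r_s$: size ratio between $s_1$ and $s_2$

Tuples generated: $k_1 \times k_2$ tuples of the form $(\omega_1, \omega_2, p_1, p_2, \mathbf{u}, r_s)$

Example: $s_1 = x$, $s_2 = 8$, $\Omega_1 = \{(x, 0.8), (\ldots, 0.2)\}$, $\Omega_2 = \{(8, 0.6), (\&, 0.3)\}$, $\mathbf{u} = (\ldots, -\ldots, \ldots)$, $r_s = 1.26$

Inverted Index for Symbol Pairs
- Entries: pairs of symbol labels $(\omega_1, \omega_2)$
- Posting lists: pair locations in images with $(ID, s_1, s_2, \mathbf{u}, r_s, p_1, p_2)$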
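As a rough sketch of the idea on this slide, the snippet below expands one node pair into a tuple per label combination and stores each under its label pair in an inverted index. The function names, the `keyframe-05` identifier, the second label `"X"` and the unit vector are hypothetical stand-ins; only the 0.8/0.2 and 0.6/0.3 probabilities and the 1.26 size ratio come from the slide, and the posting layout is an assumed reading of it.

```python
from collections import defaultdict
from itertools import product


def pair_tuples(labels_1, labels_2, unit_vector, size_ratio):
    """Yield one tuple per combination of candidate labels for two nodes."""
    for (w1, p1), (w2, p2) in product(labels_1, labels_2):
        yield (w1, w2, p1, p2, unit_vector, size_ratio)


def index_pair(index, image_id, s1, s2, labels_1, labels_2, unit_vector, size_ratio):
    """Add every label combination of one node pair to the inverted index."""
    for w1, w2, p1, p2, u, r in pair_tuples(labels_1, labels_2, unit_vector, size_ratio):
        # Entry key: the pair of labels. Posting: where the pair was seen.
        index[(w1, w2)].append((image_id, s1, s2, u, r, p1, p2))


index = defaultdict(list)
omega_1 = [("x", 0.8), ("X", 0.2)]  # "X" is a made-up second label
omega_2 = [("8", 0.6), ("&", 0.3)]
# The unit vector (1.0, 0.0, 0.0) is a placeholder, not the slide's value.
index_pair(index, "keyframe-05", 0, 1, omega_1, omega_2, (1.0, 0.0, 0.0), 1.26)
print(len(index), "index entries")
```

Two labels per node give 2 x 2 = 4 entries, which shows why the number of labels kept per node directly affects index size.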
Tangent-V Overview Indexing of Videos/Notes
Data
Indexing Pipeline
Navigation Pipeline
Retrieval Pipeline
Spatial Index
Temporal Index
Tangent-V Retrieval Model
Spatial Index
Search Results
Initial Lookup
Pre-processing
Query Image
Query Graph
Structural Alignment
Layer 1 Layer 2
Layer 1: Initial Lookup
Query symbol pairs are used to find matches in their corresponding entries of the inverted index. A match between an index symbol pair $(s_1, s_2)$ and a query pair $(q_1, q_2)$ is accepted as valid if and only if:
1. They are spatially consistent: $\mathbf{u} \cdot \mathbf{v} \geq \cos 45^\circ$, where $\mathbf{u}$ and $\mathbf{v}$ are the unit vectors of the index pair and the query pair
2. Optionally, they have consistent size ratios (not too small/large)

Matching pair scores are then aggregated by unique graph pair IDs.
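The spatial consistency test from the Initial Lookup slide reduces to a dot product between two unit vectors. A minimal sketch follows; the function name and parameter names are hypothetical, and the optional size-ratio test is left out because the slides give no thresholds for it.

```python
import math


def spatially_consistent(u_index, u_query, max_angle_deg=45.0):
    """True if the angle between two unit vectors is at most max_angle_deg."""
    dot = sum(a * b for a, b in zip(u_index, u_query))
    return dot >= math.cos(math.radians(max_angle_deg))


print(spatially_consistent((1.0, 0.0, 0.0), (0.8, 0.6, 0.0)))
print(spatially_consistent((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Note that a vector rounded to three decimals, such as (0.707, 0.707, 0.000), gives a dot product of 0.707 against (1, 0, 0), which is just below cos 45° ≈ 0.7071, so rounding matters exactly at the boundary.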
Layer 2: Structural Alignment
Matching Pairs
Matching Subgraphs
Greedy Match Growing
Query: X + Y
Match 1 (Score = 0.7) + Match 2 (Score = 0.5) = New Match (Score = 1.2)
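The slide only names the step and shows that scores add when two matches are merged (0.7 + 0.5 = 1.2). The sketch below is one plausible way to grow matches greedily, under assumptions that are not from the talk: a match is a dictionary from query nodes to candidate nodes, two matches may merge when they share a query node and their mappings do not conflict, and matches are processed from highest to lowest score. All names are hypothetical.

```python
def compatible(a, b):
    """Mappings agree where they overlap and stay one-to-one when merged."""
    merged = dict(a)
    for q_node, c_node in b.items():
        if merged.get(q_node, c_node) != c_node:
            return False
        merged[q_node] = c_node
    return len(set(merged.values())) == len(merged)


def grow_matches(pair_matches):
    """Greedily merge (mapping, score) matches, highest score first."""
    pending = sorted(pair_matches, key=lambda m: m[1], reverse=True)
    grown = []
    for mapping, score in pending:
        for i, (g_map, g_score) in enumerate(grown):
            shares_node = bool(set(mapping) & set(g_map))
            if shares_node and compatible(g_map, mapping):
                grown[i] = ({**g_map, **mapping}, g_score + score)
                break
        else:
            # No existing match could absorb this one, so keep it separate.
            grown.append((dict(mapping), score))
    return grown


# Query "X + Y": two pair matches that share the "+" node.
match_1 = ({"X": "n1", "+": "n2"}, 0.7)
match_2 = ({"+": "n2", "Y": "n3"}, 0.5)
print(grow_matches([match_1, match_2]))
```

A match that maps "+" to a different candidate node would fail the `compatible` test and stay as its own match, which is the situation the later removal step has to resolve.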
Greedy Match Connection
Query: X + Y = 0
Candidate: X + 1 = 0
Match 1 (Score = 0.5) + Match 2 (Score = 0.4) = New Match (Score = 0.9)
Incompatible Match Removal
Figure: a query and two matches on an expression with the symbols X, +, X, +, 1, 2.
Match 1 (Score = 5.0): Accepted
Match 2 (Score = 0.5): Removed
Match Grouping
Lecture 01 - KF #5 and Lecture 01 - KF #6: same match!
Match Scoring and Ranking
We introduce two scoring schemes: α and h

| Item | α | h |
| --- | --- | --- |
| Description | A weighted edge recall | Harmonic mean of weighted edge recall and node recall |
| Edge weighting | Pair-wise symbol alignments and scaled cosine similarity | Scaled cosine similarity |
| Node weighting | - | Individual symbol alignments |
| Based on | - | Maximum Subtree Similarity (MSS) [1] |
| Execution times | Faster | Slower |

[1] R. Zanibbi, K. Davila, A. Kane, and F. Tompa, "Multi-stage math formula search: Using appearance-based similarity metrics at scale," SIGIR, 2016.
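To make the difference between the two schemes concrete, here is a small sketch of a weighted recall and of the harmonic mean used by h. The normalization (sum of matched weights divided by the number of query edges or nodes) is an assumption, as are all function names; the slides state only that α is a weighted edge recall and that h is the harmonic mean of weighted edge recall and node recall.

```python
def weighted_recall(matched_weights, total_in_query):
    """Sum of weights of matched items divided by the number of query items."""
    if total_in_query == 0:
        return 0.0
    return sum(matched_weights) / total_in_query


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative values (0 if both are 0)."""
    return 0.0 if a + b == 0 else 2.0 * a * b / (a + b)


def alpha_score(edge_weights, query_edges):
    """Alpha-style score: weighted edge recall only."""
    return weighted_recall(edge_weights, query_edges)


def h_score(edge_weights, query_edges, node_weights, query_nodes):
    """h-style score: harmonic mean of weighted edge and node recall."""
    edge_recall = weighted_recall(edge_weights, query_edges)
    node_recall = weighted_recall(node_weights, query_nodes)
    return harmonic_mean(edge_recall, node_recall)


# Hypothetical match: 2 of 4 query edges and all 3 query nodes matched.
print(alpha_score([1.0, 1.0], 4))
print(h_score([1.0, 1.0], 4, [1.0, 1.0, 1.0], 3))
```

Because h also needs per-node weights, it requires extra work per match, which is consistent with the "Slower" entry in the table.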
Tangent-V Overview
Retrieval System
Data
Search Results
Indexing Pipeline
Navigation Pipeline
Retrieval Pipeline Query
Spatial Index
Temporal Index
Tangent-V Overview
Video Navigation
Data
Search Results
Indexing Pipeline
Navigation Pipeline
Retrieval Pipeline Query
Spatial Index
Temporal Index
Lecture Video Navigation from Search Results
Check our demo at: https://youtu.be/gn24qo1MLN0
Experiments
AccessMath Dataset
- 13 lecture videos with supplementary notes

A total of 20 evaluation queries were chosen with rejection sampling.
A total of 4 combinations of Query-vs-Index modalities:
- Handwritten expressions
- Typeset expressions

For a given query, the target is to find a math expression that contains the whole query graph:
- query is same expression
- query is sub-expression
Evaluation Metrics
Two metrics are considered:
- Recall @ 10: target found at rank ≤ 10
- MRR @ 10: mean of Reciprocal Rank (RR), with

$$RR = \begin{cases} \dfrac{1}{r} & 1 \leq r \leq 10 \\ 0 & \text{otherwise} \end{cases}$$

where $r$ is the rank of the target.
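Both metrics are simple to compute from the rank at which each query's target was found. The sketch below assumes a rank of `None` when the target was not retrieved at all; the function names and the example ranks are hypothetical.

```python
def reciprocal_rank(rank, cutoff=10):
    """1/rank if the target is within the cutoff, otherwise 0."""
    if rank is not None and 1 <= rank <= cutoff:
        return 1.0 / rank
    return 0.0


def mrr_at(ranks, cutoff=10):
    """Mean reciprocal rank over all queries."""
    return sum(reciprocal_rank(r, cutoff) for r in ranks) / len(ranks)


def recall_at(ranks, cutoff=10):
    """Fraction of queries whose target is within the cutoff."""
    hits = sum(1 for r in ranks if r is not None and 1 <= r <= cutoff)
    return hits / len(ranks)


# Hypothetical ranks for four queries; None means the target was not found.
ranks = [1, 2, None, 11]
print(mrr_at(ranks), recall_at(ranks))
```

Recall @ 10 treats rank 1 and rank 10 alike, while MRR @ 10 rewards targets near the top, which is why the two result tables can differ noticeably for the same condition.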
Results: Recall @ 10
Columns: Weighted Edge Recall (α variants) and Harmonic Mean (h variants)

| Query | Index | α | α∧ | α∧… | h | h∧ | h∧… |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LaTeX | LaTeX | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| LaTeX | Whiteboard | 0.95 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Whiteboard | Whiteboard | 0.95 | 0.95 | 0.90 | 0.95 | 1.00 | 0.95 |
| Whiteboard | LaTeX | 0.80 | 0.85 | 0.85 | 0.90 | 0.90 | 0.90 |
Results: MRR @ 10
Columns: Weighted Edge Recall (α variants) and Harmonic Mean (h variants)

| Query | Index | α | α∧ | α∧… | h | h∧ | h∧… |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LaTeX | LaTeX | 0.98 | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 |
| LaTeX | Whiteboard | 0.93 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Whiteboard | Whiteboard | 0.66 | 0.69 | 0.71 | 0.89 | 0.84 | 0.86 |
| Whiteboard | LaTeX | 0.63 | 0.71 | 0.74 | 0.74 | 0.78 | 0.84 |
Conclusions
Tangent-V is effective for search between typeset and handwritten math
- Multiple labels help find targets when recognition accuracy is low

Tangent-V can also be used to create navigational tools

New symbol recognizers can be used for indexing of new domains
- Code is released for others to try on new domains (http://cs.rit.edu/~dprl/Software.html)

Future work:
- Test unsupervised symbol classification
- Explore vector formats
- Speed up search
Thank You!
This material is based upon work supported by the National Science Foundation (USA) under Grants No. IIS-1016815 and HCC-1218801. We also thank Anurag Agarwal for helping in the creation of the lecture videos used to evaluate our system.
Source code: www.cs.rit.edu/~dprl/Software.html