cognitive metadata extraction · 2019-11-26 · introduction. 3. aikno ®: ai for industrial...

COGNITIVE METADATA EXTRACTION powered by AiKno® on Intel

Post on 18-Mar-2020

Page 1: COGNITIVE METADATA EXTRACTION · 2019-11-26 · Introduction. 3. AiKno ®: AI for Industrial Engineering. 3 AiKno ® Framework 4 What can AiKno ® do? 5 Cognitive metadata extraction

L&T TECHNOLOGY SERVICES

Copyright © L&T Technology [email protected]

ENGINEERING THE CHANGE

Table of Contents

Introduction

AiKno®: AI for Industrial Engineering

AiKno® Framework

What can AiKno® do?

Cognitive metadata extraction powered by AiKno®

LTTS OCR

Speed-up with Intel AI technologies

Results

Inferencing Throughput Performance of LTTS OCR Module

Inferencing Throughput Performance of AiKno® Metadata Extraction Pipeline

Conclusion

Introduction

As Artificial Intelligence (AI) adoption moves beyond initial Research & Development (R&D) and matures over the coming years, early adopters are poised to define the frontiers of what’s next. However, the road to industrial AI is riddled with challenges. Unlike deciphering consumer data, unlocking the secrets hidden in industrial data requires a very high level of precision and accuracy. Most importantly, it requires an AI that can enable human-edge in machines.

AiKno®: AI for Industrial Engineering

AiKno® connects industrial assets, data, analytics, and processes to drive human-edge in machines, helping businesses navigate Industrial Engineering challenges and gain a competitive edge.

Powered by Machine Learning, Vision Based Computing, and Natural Language Processing, and strengthened by years of cumulative experience of LTTS experts, AiKno® is transforming the industrial products, medical devices, plant engineering, transportation, consumer electronics, telecommunications, and semiconductor industries. It does so by optimizing production, distribution, and field services, and by enabling processes that enhance efficiency across all industrial stages.

AiKno® Framework

Figure 2.1 below illustrates the LTTS AiKno® framework.

Figure 2.1: LTTS AiKno® framework

• Data sources: Sensors | Images | Documents | Web Streaming | Log Files | Audio/Speech | Video | PLC - SCADA

• Data pre-processing: Data Imputation, Outlier Detection, Feature Selection, Dimensionality Detection, Feature Engineering, etc.

• AI layer (AI libraries and packages), Natural Language Processing (NLP): Topic Modeling; Text Extraction & Summarization; Semantic Analysis; Information Extraction; Question Answering

• AI layer (AI libraries and packages), Machine Learning (ML): Predictive Analytics; Deep Learning; Supervised, Unsupervised & Reinforcement Learning; Clustering, Classification and Regression

• AI layer (AI libraries and packages), Image Processing (IP): Feature Extraction; Image Restoration and Extraction; Segmentation; 3D Reconstruction; Tracking

• Result: Report | Dashboard | Graph

What can AiKno® do?

1. Customer and Marketing Analytics

Transform and personalize interactions with your customers by intelligently using all digital touchpoints to deliver intuitive, seamless, and exceptional user experiences.

• Curated Content Recommendation

• Content and Targeted Ads

• Customer Usage Behavior

2. Cognitive Automation - Engineering Document Management System

Implement advanced information retrieval and legacy data extraction systems to eliminate redundant tasks and enable smarter content recommendations.

• Data Extraction

• Smart Information Retrieval

• Content Recommendation

3. Predictive Analytics

Gain real-time visibility into equipment health, anticipate equipment failure before it happens, and automatically trigger maintenance or repair actions with the help of advanced predictive analytics.

• Smart Maintenance Strategy

• Anomaly Detection

4. Intelligent Virtual Agents

Design and develop Service Support Agents that assist customer and service support personnel, with document management systems and content management solutions as the back end.

Cognitive metadata extraction powered by AiKno®

Despite technological advancements, traditional engineering companies have been very slow in digital adoption. Digitizing legacy data manually is both time-consuming and cost-intensive. AiKno® helped a client in the Oil and Gas Drilling industry design a cognitive metadata extraction solution that reduces manual effort and eliminates documentation errors encountered when digitizing legacy content management.

• The solution can extract the provided metadata from scanned documents available in image format.

• The continuous self-learning system can make automatic corrections and drive semantics-based rules on human feedback without any need for re-engineering.

AiKno® Meta Data extraction can be broadly divided into three steps. First, proprietary advanced image processing algorithms clean up images that carry a large amount of noise, both from scanning artifacts and from chemical reactions between ink and paper over time. Next, the text is extracted using a proprietary LTTS OCR (Optical Character Recognition) engine. Finally, the OCR output is passed through an NLP (Natural Language Processing) engine to extract the metadata. Figure 2.2 illustrates the AiKno® Meta Data extraction workflow.

Figure 2.2: AiKno® Meta Data extraction workflow

• Document Ingestion: input documents of different formats

• Image Processor: proprietary algorithms to process noisy input

• LTTS OCR: proprietary OCR to extract text (speed-up in LTTS OCR is being evaluated)

• NLP Engine: proprietary algorithms to extract metadata

Speed-up in the pipeline as a whole is being evaluated.
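The image processing, OCR, and NLP stages described for Figure 2.2 can be sketched as a simple chain of functions. This is only an illustrative skeleton: the actual image processor, LTTS OCR, and NLP engine are proprietary and not described in this paper, so every stage body, the `Page` type, and the field names below are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Page:
    """Hypothetical container for one scanned page moving through the pipeline."""
    pixels: List[List[int]]   # grayscale scan, values 0-255
    simulated_text: str       # stand-in for what a real OCR engine would read
    text: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)


def image_processor(page: Page) -> Page:
    # Stand-in for noise removal: binarize with a fixed threshold.
    page.pixels = [[255 if p >= 128 else 0 for p in row] for row in page.pixels]
    return page


def ocr(page: Page) -> Page:
    # Stand-in for OCR: copies the simulated text instead of recognizing it.
    page.text = page.simulated_text
    return page


def nlp_engine(page: Page) -> Page:
    # Stand-in for metadata extraction: collect "Key: value" lines.
    for line in page.text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            page.metadata[key.strip()] = value.strip()
    return page


# Stage order follows the workflow: image processing, then OCR, then NLP.
STAGES: List[Callable[[Page], Page]] = [image_processor, ocr, nlp_engine]


def run_pipeline(page: Page) -> Page:
    for stage in STAGES:
        page = stage(page)
    return page


def demo_page() -> Page:
    # Invented sample data, for illustration only.
    return Page(pixels=[[12, 200], [130, 90]],
                simulated_text="Well Name: A-17\nDrawing No: 4471\n")
```

Keeping each stage behind the same `Page -> Page` signature means a single stage, such as the OCR, can be swapped for an accelerated implementation without touching the rest of the chain.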

LTTS OCR

L&T Technology Services has designed and developed its own proprietary OCR to extract data with maximum accuracy. It is built using state-of-the-art Deep Learning techniques and customized for extraction from technical engineering documents. The OCR is trained on noisy scanned technical documents. In addition, with cognitive validation of OCR results, the model is continuously updated and customized, improving the accuracy of the OCR. This improved accuracy increases the extraction rate of the Metadata Extraction module, reducing overall digitization time.

Speed-up with Intel AI technologies

Intel worked with LTTS to optimize the AiKno® framework pipeline for Metadata Extraction on an Intel® Xeon® Platinum 8124M processor-based platform @ 3.00GHz. To accelerate document metadata extraction, the LTTS AiKno® pipeline was migrated to this platform and optimized with Intel Optimization for TensorFlow, which includes the Intel Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN).

Possibilities for further optimization of the pipeline were explored using the Intel Distribution of OpenVINO™ toolkit. Based on Convolutional Neural Networks (CNN), the toolkit enables enhanced performance in networks used for visual inferencing tasks, such as object detection and image classification, and in other deep learning workloads, such as audio, speech, language, and recommender systems. Its Model Optimizer and Inference Engine components optimize the neural network model and perform hardware-specific acceleration when inferencing with the trained Deep Learning model.

To evaluate both optimization techniques, the pipeline was executed on datasets averaging 100 pages per document. It was run using default TensorFlow, the Intel Optimization for TensorFlow, and the Intel Distribution of OpenVINO toolkit v2018.5.455 on an Intel Xeon Platinum processor-based AWS c5.9xlarge instance, for batch sizes of 256 and 768.
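A comparison of this kind can be sketched as a small timing harness that feeds the same pages to each backend at both batch sizes and reports throughput relative to a baseline. This is not the harness used for the evaluation, which the paper does not describe. The function names, the `backends` mapping, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Sequence

# Batch sizes reported for the evaluation.
BATCH_SIZES = (256, 768)


def measure_throughput(infer: Callable[[Sequence], None],
                       pages: Sequence,
                       batch_size: int,
                       clock: Callable[[], float] = time.perf_counter) -> float:
    """Return pages processed per second when `pages` are fed in batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = clock()
    for i in range(0, len(pages), batch_size):
        infer(pages[i:i + batch_size])
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(pages) / elapsed


def compare_backends(backends: Dict[str, Callable[[Sequence], None]],
                     pages: Sequence,
                     baseline: str,
                     clock: Callable[[], float] = time.perf_counter
                     ) -> Dict[int, Dict[str, float]]:
    """For each batch size, return every backend's speed-up over `baseline`."""
    results: Dict[int, Dict[str, float]] = {}
    for batch_size in BATCH_SIZES:
        throughput = {name: measure_throughput(fn, pages, batch_size, clock)
                      for name, fn in backends.items()}
        results[batch_size] = {name: tp / throughput[baseline]
                               for name, tp in throughput.items()}
    return results
```

In use, `backends` would map labels such as "default" and "optimized" to callables that wrap the respective inference sessions. Passing a fake `clock` makes the harness testable without real model runs.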

Results

Throughput results for the different components of the AiKno® Metadata Extraction pipeline were obtained by executing the pipeline on an Intel Xeon Platinum processor AWS instance. The speed-ups observed in LTTS OCR and in the full pipeline are shown.

Inferencing Throughput Performance of LTTS OCR Module

Figure 4.1 shows the inferencing throughput performance of the LTTS OCR module of the AiKno® Metadata Extraction pipeline when run using default TensorFlow and the Intel Optimization for TensorFlow, while Figure 4.2 compares the same with the performance obtained when the Intel Distribution of OpenVINO toolkit is used.

Figure 4.1: Inferencing throughput performance of LTTS OCR module on Intel Xeon Platinum c5.9xlarge instance with Intel optimized TensorFlow

[Chart: LTTS OCR Throughput Compared for Intel Optimization for TensorFlow. Time for Default TensorFlow vs. Intel Optimized TensorFlow. Batch size 256: 3.03X improvement; batch size 768: 2.63X improvement.]

Figure 4.2: Inferencing throughput performance of LTTS OCR module on Intel Xeon Platinum c5.9xlarge instance with Intel optimized TensorFlow and OpenVINO

[Chart: LTTS OCR Throughput Compared for Intel Distribution of OpenVINO Toolkit. Time for Default TensorFlow vs. Intel Optimized TensorFlow with OpenVINO. Batch size 256: 7.24X improvement; batch size 768: 5.01X improvement.]

Inferencing Throughput Performance of AiKno® Metadata Extraction Pipeline

Figure 4.3 shows the inferencing throughput performance of the LTTS AiKno® Metadata Extraction pipeline when run using default and Intel optimized TensorFlow, while Figure 4.4 compares the same with the performance obtained when the Intel Distribution of OpenVINO toolkit is used.

Figure 4.3: Inferencing throughput performance of AiKno® Metadata Extraction pipeline on Intel® Xeon® Platinum CPU AWS c5.9xlarge instance with Intel Optimization for TensorFlow

[Chart: AiKno® Pipeline Throughput Compared for Intel Optimization for TensorFlow. Time for Default TensorFlow vs. Intel Optimized TensorFlow. Batch size 256: 2.00X improvement; batch size 768: 1.80X improvement.]

Figure 4.4: Inferencing throughput performance of AiKno® Metadata Extraction pipeline on Intel Xeon Platinum c5.9xlarge instance with Intel optimized TensorFlow and OpenVINO

[Chart: AiKno® Pipeline Throughput Compared for Intel Distribution of OpenVINO Toolkit. Time for Default TensorFlow vs. Intel Optimized TensorFlow with OpenVINO. Batch size 256: 2.77X improvement; batch size 768: 2.18X improvement.]

As shown in Figures 4.1 and 4.2, the work achieved up to a 3.03X performance improvement for the LTTS OCR module of the pipeline on the c5.9xlarge instance when executed using Intel optimized TensorFlow, and up to ~7.24X when using a combination of Intel optimized TensorFlow and the Intel Distribution of OpenVINO toolkit. The Metadata Extraction pipeline as a whole showed improvements of ~2.00X and ~2.77X, respectively, relative to the default installation of TensorFlow on the c5.9xlarge instance (Figures 4.3 and 4.4). Figure 4.5 summarizes the speed-up results obtained.

Figure 4.5: Summary of relative processing speed results for AiKno® pipeline
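One way to relate the module-level and pipeline-level speed-ups is Amdahl's law, S = 1 / ((1 - p) + p / s). Here s is the speed-up of the accelerated stage, S is the overall speed-up, and p is the share of baseline time spent in that stage. The sketch below solves for p under the assumption that the OCR module is the only stage that was accelerated. The paper does not state this, so the result is a rough consistency check rather than a reported measurement.

```python
def accelerated_fraction(stage_speedup: float, overall_speedup: float) -> float:
    """Solve Amdahl's law, S = 1 / ((1 - p) + p / s), for p.

    Assumes (hypothetically) that only one stage was accelerated.
    """
    if stage_speedup <= 1.0 or overall_speedup < 1.0:
        raise ValueError("speed-ups must be >= 1, and stage_speedup > 1")
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / stage_speedup)


def overall_speedup(fraction: float, stage_speedup: float) -> float:
    """Amdahl's law in the forward direction."""
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


# Reported pairs: (OCR module speed-up, full pipeline speed-up).
tf_fraction = accelerated_fraction(3.03, 2.00)        # Intel optimized TensorFlow
openvino_fraction = accelerated_fraction(7.24, 2.77)  # with OpenVINO
```

Under that assumption, both reported pairs imply that the OCR stage accounts for roughly three quarters of the baseline pipeline time (about 0.75 and 0.74). This would be consistent with the full pipeline gaining less than the OCR module alone.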

Conclusion

Through framework optimizations and the development of software libraries and toolkits, Intel has enabled significant performance enhancements for deep learning inference on Intel-based platforms using Intel Optimization for TensorFlow and the Intel Distribution of OpenVINO toolkit. Working with Intel, LTTS has successfully migrated its AiKno®-powered Metadata Extraction pipeline to Intel Xeon Platinum 8124M processors, for both on-premises systems and cloud-based data platforms. The resulting speed-ups, up to ~7.24X for the LTTS OCR module and up to ~2.77X for the full pipeline, strengthen the Cognitive Automation for Engineering Document Management System offering and have helped launch the solution in the Oil and Gas Drilling industry.

To learn more, visit Intel AI Builders at: https://builders.intel.com/ai

To learn more about LTTS, visit: https://www.LTTS.com/

To learn more about LTTS AiKno®, visit: https://www.LTTS.com/aikno