II-SDV 2014: A New Approach to Flexible, Meaning-Rich Document Parsing (Paul Barba -- Lexalytics,...

Post on 11-May-2015









And how to get what you need.

Lexalytics is

• A software company

• We sell the “Salience Engine”

• Salience is a Text Analytics Engine that fits into your software, services, or applications

• What we ship is a set of libraries and configuration files

© 2014 Lexalytics Inc. All rights reserved. lexalytics.com

SALIENCE 5.2

Market Proven IP: 11 Years of R&D


Approximately 3 Billion documents/day go through Salience.

2004: Lexalytics launches first commercial text and sentiment analysis engine, Salience v1.0

10/2008: Salience 4.0 released, based on maximum entropy model for detection and labeling of novel entities

08/2010: Salience 4.3 to include custom handling of Twitter and micro-blog content

11/2010: Salience 4.4 released, includes support for first non-English language (French)

10/2011: Salience v5.0 incorporates innovative Concept Matrix functionality

02/2012: Mobile Functionality – Port the Salience engine to Android mobile devices

06/2012: Salience v5.1 released, expansion of available options and optimized sentiment analysis functionality

08/2013: Chinese language released; multi-lingual support in 6 languages

Q4/2014: Salience v5.2 released with various feature enhancements

Q4/2014: Salience v6 – new underpinnings, easier tuning, and “Intent” extraction

A Multi-lingual World WLOA (With Lots of Acronyms) and Context Everywhere

Lexalytics Salience Training prepared for Analytics 8


• New Labor Party

• National Landcare Program

• Network Layer Packet

• NeuroLinguistic Programming

• Wicked

• Sick

• Hack

Always running to catch up…


New Tools

God Bless Moore’s Law and Librarians


Unsupervised learning is the key


Meaning Matters


It’s not that I don’t like tea I just prefer coffee

Meaning Matters


Jane will be joining a team already with a search expert

Meaning Matters


Jane will be joining a team already with some search experience

Episode 4: A New Hope


Sentence → POS Tagger → Chunker → Rules File


Jane and her team

<Jane will be joining a team already with search experience>

• POS tag: <Jane_NNP will_MD be_VB joining_VBP a_DT team_NN already_RB with_PP search_JJ experience_NN>

• Chunk: <Jane> <will be joining> <a team> <already with search experience>


• Extract possible links

Jane => will be joining

will be joining => a team

a team => already with search experience

will be joining => already with search experience

Jane => already with search experience
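The link-extraction step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Salience implementation: the NP/VP/PP labels on each chunk and the `ALLOWED_PAIRS` set (a hypothetical stand-in for the rules file) are not given on the slides. With these assumed rules, the sketch yields the same five links listed above, in a different order, and leaves out "Jane => a team".

```python
from itertools import combinations

# Chunks from the slide, in sentence order. The NP/VP/PP labels are
# assumptions added for this sketch; the slide shows only the chunk text.
CHUNKS = [
    ("Jane", "NP"),
    ("will be joining", "VP"),
    ("a team", "NP"),
    ("already with search experience", "PP"),
]

# Hypothetical stand-in for the rules file: which (left type, right type)
# pairs may be linked. NP => NP is deliberately absent.
ALLOWED_PAIRS = {("NP", "VP"), ("VP", "NP"), ("NP", "PP"), ("VP", "PP")}


def possible_links(chunks, allowed_pairs=ALLOWED_PAIRS):
    """Return (left, right) chunk texts where left precedes right in the
    sentence and the pair of chunk types is allowed by the rules."""
    links = []
    for (left, left_type), (right, right_type) in combinations(chunks, 2):
        if (left_type, right_type) in allowed_pairs:
            links.append((left, right))
    return links


if __name__ == "__main__":
    for left, right in possible_links(CHUNKS):
        print(f"{left} => {right}")
```

Keeping the rules as data rather than code mirrors the idea of a separate rules file: changing which links are proposed means editing the pair set, not the extraction loop.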


Matrices of Meaning


Matrix Math


All noun phrases

All verb
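The slides do not spell out the matrix math, so the following is only one plausible reading: a count matrix with noun phrases as rows and verb phrases as columns, filled from extracted links. The function names `build_matrix` and `dot` are hypothetical, and the five links are noun-to-verb links taken from the two example sentences in this deck.

```python
from collections import Counter

# Noun phrase => verb phrase links taken from the example sentences
# in this deck.
LINKS = [
    ("Jane", "will be joining"),
    ("you", "want"),
    ("you", "to get"),
    ("me", "to get"),
    ("I", "go"),
]


def build_matrix(links):
    """Build a count matrix: one row per noun phrase, one column per
    verb phrase, each cell holding how often that link was seen."""
    rows = sorted({noun for noun, _ in links})
    cols = sorted({verb for _, verb in links})
    counts = Counter(links)
    matrix = [[counts[(noun, verb)] for verb in cols] for noun in rows]
    return rows, cols, matrix


def dot(u, v):
    """Dot product of two rows: higher when two noun phrases link to
    the same verb phrases."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    rows, cols, matrix = build_matrix(LINKS)
    me, you = matrix[rows.index("me")], matrix[rows.index("you")]
    print(rows, cols)
    print("me . you =", dot(me, you))
```

In this toy data "me" and "you" share the verb phrase "to get", so their rows overlap, while "I" and "Jane" share nothing. Over a large unlabeled corpus, the same kind of row comparison is one way such a matrix could surface related phrases without hand-labeled training data.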


Now look at how easy it is

• <Do you want me to get anything else while I go to the store for milk?>

• POS tag and chunk it.

<Do> <you> <want> <me> <to get> <anything else> <while> <I> <go> <to the store> <for milk>


Find the possible links.

do => want

you => want

want => me

you => to get

want => to get

me => to get

to get => anything else

want => while

to get => while

while => go

I => go

go => to the store

I => to the store

to get => to the store

want => to the store

to the store => for milk

go => for milk

want => for milk


A world of new possibilities

