ii-sdv 2014 a new approach to flexible, meaning-rich document parsing (paul barba -- lexalytics,...
Nice
And how to get what you need.
Lexalytics is
• A software company
• We sell the “Salience Engine”
• Salience is a Text Analytics Engine that fits into your software, services, or applications
• What we ship is a set of libraries and configuration files
© 2014 Lexalytics Inc. All rights reserved. lexalytics.com
SALIENCE 5.2
Market Proven IP: 11 Years of R&D
Approximately 3 Billion documents/day go through Salience.
2004: Lexalytics launches first commercial text and sentiment analysis engine, Salience v1.0
10/2008: Salience 4.0 released, based on maximum entropy model for detection and labeling of novel entities
08/2010: Salience 4.3 to include custom handling of Twitter and micro-blog content
11/2010: Salience 4.4 released, includes support for first non-English language (French)
10/2011: Salience v5.0 incorporates innovative Concept Matrix functionality
02/2012: Mobile Functionality – Port the Salience engine to Android mobile devices
06/2012: Salience v5.1 released, expansion of available options and optimized sentiment analysis functionality
08/2013: Chinese language released; multi-lingual support in 6 languages
Q4/2014: Salience v5.2 released with various feature enhancements
Q4/2014: Salience v6 – new underpinnings, easier tuning, and “Intent” extraction
A Multi-lingual World WLOA (With Lots of Acronyms) and Context Everywhere
Lexalytics Salience Training prepared for Analytics 8
NLP
• New Labor Party
• National Landcare Program
• Network Layer Packet
• NeuroLinguistic Programming
• Wicked
• Sick
• Hack
Always running to catch up…
New Tools
God Bless Moore’s Law and Librarians
Unsupervised learning is the key
Meaning Matters
It’s not that I don’t like tea, I just prefer coffee
Meaning Matters
Jane will be joining already with a search expert a team
Meaning Matters
Jane will be joining a team already with some search experience
Episode 4: A New Hope
[Diagram: Sentence, POS Tagger, Chunker, Rules File, Candidate Parse Terms]
Jane and her team
<Jane will be joining a team already with search experience>
• POS tag: <Jane_NNP will_MD be_VB joining_VBP a_DT team_NN already_RB with_PP search_JJ experience_NN>
• Chunk: <Jane> <will be joining> <a team> <already with search experience>
• Extract possible links
Jane => will be joining
will be joining => a team
a team => already with search experience
will be joining => already with search experience
Jane => already with search experience
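The three steps on this slide (POS tag, chunk, extract possible links) can be sketched in a few lines of Python. This is only an illustration, not the Salience implementation: the coarse chunk classes (`NP`, `VP`, `PP`), the grouping heuristic, and the `ALLOWED` pair table standing in for the rules file are all assumptions, chosen so that the sketch yields the five links listed above for this one sentence.

```python
# Hypothetical sketch of the slide's pipeline: tagged sentence -> chunks -> candidate links.
# The class mapping and the ALLOWED table are assumptions, not Salience's actual rules.

TAGGED = ("Jane_NNP will_MD be_VB joining_VBP a_DT team_NN "
          "already_RB with_PP search_JJ experience_NN")

# Stand-in for the rules file: which (left chunk, right chunk) class pairs may link.
ALLOWED = {("NP", "VP"), ("VP", "NP"), ("NP", "PP"), ("VP", "PP")}


def parse_tagged(text):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    return [tuple(token.rsplit("_", 1)) for token in text.split()]


def coarse_class(tag):
    """Map a POS tag from the slide to a coarse chunk class (assumed mapping)."""
    if tag == "MD" or tag.startswith("VB"):
        return "VP"
    if tag in {"RB", "PP", "IN"}:
        return "PP"
    return "NP"  # DT, JJ, NN, NNP, ...


def chunk(tagged):
    """Group consecutive tokens into chunks by coarse class."""
    groups = []  # each entry is [class, [words]]
    for word, tag in tagged:
        kind = coarse_class(tag)
        if groups:
            last_kind = groups[-1][0]
            # Same class continues the chunk; a PP chunk also absorbs the
            # noun material that follows it ("already with" + "search experience").
            if kind == last_kind or (last_kind == "PP" and kind == "NP"):
                groups[-1][1].append(word)
                continue
        groups.append([kind, [word]])
    return [(kind, " ".join(words)) for kind, words in groups]


def candidate_links(chunks):
    """Pair every chunk with every later chunk whose class pair is allowed."""
    links = []
    for i, (left_kind, left_text) in enumerate(chunks):
        for right_kind, right_text in chunks[i + 1:]:
            if (left_kind, right_kind) in ALLOWED:
                links.append((left_text, right_text))
    return links


chunks = chunk(parse_tagged(TAGGED))
links = candidate_links(chunks)
for left, right in links:
    print(f"{left} => {right}")
```

Under these assumed rules, "Jane => a team" is filtered out because noun-to-noun pairs are not in the table, which matches the link list on the slide. A real rules file would presumably be richer than a four-entry set.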
Matrices of Meaning
Matrix Math
All noun phrases
All verb phrases
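One possible reading of these axis labels is a count matrix with noun phrases on one axis and verb phrases on the other, where each cell records how often that pair was extracted as a link. The slide does not say how the matrix is actually built, so the sketch below is an assumption, and the input links are made-up toy data used only to show the shape of the result.

```python
from collections import Counter

# Toy (noun phrase, verb phrase) links; invented for illustration only.
toy_links = [
    ("Jane", "will be joining"),
    ("the team", "is hiring"),
    ("Jane", "is hiring"),
    ("Jane", "will be joining"),
]


def build_matrix(links):
    """Count links into a noun-phrase x verb-phrase matrix (rows x columns)."""
    counts = Counter(links)
    nouns = sorted({noun for noun, _ in counts})
    verbs = sorted({verb for _, verb in counts})
    matrix = [[counts[(noun, verb)] for verb in verbs] for noun in nouns]
    return nouns, verbs, matrix


nouns, verbs, matrix = build_matrix(toy_links)
for noun, row in zip(nouns, matrix):
    print(noun, row)
```

Rows that look alike would then suggest noun phrases used in similar ways, which is one plausible sense of "Matrix Math" here.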
Now look at how easy it is
• <Do you want me to get anything else while I go to the store for milk?>
• POS tag and chunk it.
<Do> <you> <want> <me> <to get> <anything else> <while> <I> <go> <to the store> <for milk>
Find the possible links.
do => want
you => want
want => me
you => to get
want => to get
me => to get
to get => anything else
want => while
to get => while
while => go
I => go
go => to the store
I => to the store
get => to the store
want => to the store
to the store => for milk
go => for milk
want => for milk
A world of new possibilities