Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016


Upload: couchbase

Post on 15-Apr-2017


TRANSCRIPT

Page 1: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Full Text Search & N1QL ❤️ Machine Learning

Dave Starling - CTO @ Seenit
@davestarling
Couchbase Connect 2016

www.seenit.io
@_seenit

Page 2: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

About Seenit

• Founded in 2014
• Provide a new production model for internal comms, broadcasting, marketing, and fan engagement
• Working with companies like Rolls-Royce, Red Bull F1 Racing, BT Sport, BBC, Unilever, and more
• Python, CherryPy, RabbitMQ on Google Cloud Platform
• Built to exploit Couchbase features
• Currently running Couchbase Enterprise 4.5.1 in production

By 2019, video will account for 80% of consumer internet traffic*

*Cisco Visual Networking Index: Forecast and Methodology, 2014-2019 White Paper

Page 3: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

How it works: our video co-creation platform

1. Your community captures: Your community creates authentic videos using the Seenit Capture app
2. You collect & edit: These videos are automatically collected in our Seenit Studio where you can edit them
3. Co-created videos: You get highly relevant, authentic, reactive and sharable co-created video content

Page 4: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Seenit Capture App

Turn your community into your camera crew.

• Contributors shoot videos based on the script created by you
• Videos from all users are uploaded automatically to Seenit Studio
• Contributors can view curated content, get inspired and share videos further
• Gamify and reward contribution to boost engagement and content creation

Page 5: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Seenit Studio

Video production & community engagement all in one place.

• Easily managed by one non-technical person
• Collect & edit uploaded videos
• Tracking & analytics measure video success
• Create a script to direct your contributors
• Direct messaging to engage your team
• Share great clips to inspire your crew
• You own all content

Page 6: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

60s Video Example Placeholder

Page 7: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Searching for the right video is hard

● For fast turnaround, watching all incoming video is not an option
● Large video libraries make it hard to find the right clip
● Sometimes only 2s of a video is needed for an edit

Take the pain out of manual video processing

Page 8: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Current Filtering Capabilities

● Media Type
● Duration
● Script/Storyboard Item
● Date Created
● Name

Page 9: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Filtering Limitations

● Gives no insight into objective qualities – exposure, stability, audio noise
● Gives no insight into content qualities – sentiment, clarity of voice

Page 10: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Machine Learning

● Computer vision using Google’s TensorFlow
● Includes several ready-to-use image and audio processing tools
● Talks Python and JSON
● Results easily stored in Couchbase as JSON objects

Page 11: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Machine Learning with video

● 1 second frame grabs
● Process each frame grab
● Combine results
● Increase confidence where results appear in multiple frames
● Results stored in Couchbase as JSON objects
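The frame-grab steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not Seenit's pipeline: the function name `combine_frame_labels`, the input shape (one dict of label to confidence per 1-second frame), the output document shape, and the noisy-OR rule used to raise confidence for labels seen in several frames are all hypothetical.

```python
import json


def combine_frame_labels(frames):
    """Merge per-frame label confidences into one result per video.

    `frames` is a list of dicts mapping label -> confidence (0..1), one dict
    per 1-second frame grab. A label seen in several frames gets a higher
    combined confidence (noisy-OR: 1 - product of (1 - c)). This rule is an
    assumption chosen for illustration.
    """
    miss = {}    # label -> probability that every sighting was wrong
    counts = {}  # label -> number of frames the label appeared in
    for frame in frames:
        for label, confidence in frame.items():
            miss[label] = miss.get(label, 1.0) * (1.0 - confidence)
            counts[label] = counts.get(label, 0) + 1
    return {
        label: {"confidence": round(1.0 - miss[label], 4), "frames": counts[label]}
        for label in miss
    }


frames = [
    {"windmill": 0.6, "sky": 0.9},
    {"windmill": 0.7, "grass": 0.5},
    {"windmill": 0.5},
]
combined = combine_frame_labels(frames)
# Hypothetical document shape; any JSON object can be stored in Couchbase.
document = json.dumps({"type": "video_analysis", "labels": combined}, sort_keys=True)
print(document)
```

A label that appears in all three frames ends up with a higher combined confidence than any single frame gave it, which is the behaviour the slide describes.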

Page 12: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Quantitative Analysis

● Help editors find well-shot videos
● Well exposed
● Stable
● No background noise
● Shot style (panning, stationary)
● Dominant colour for theming

Underexposed
Background Noise
Stable
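One of the objective qualities above, exposure, can be approximated from frame brightness. The sketch below is only an assumption about how such a check might look: the function names and the `low`/`high` thresholds are hypothetical, and the talk does not say how exposure is actually measured. It uses the standard Rec. 601 luma weights.

```python
def mean_luma(pixels):
    """Average Rec. 601 luma (0..255) of an iterable of (r, g, b) pixels."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    if count == 0:
        raise ValueError("frame has no pixels")
    return total / count


def classify_exposure(pixels, low=60.0, high=200.0):
    """Label a frame by brightness; `low` and `high` are illustrative thresholds."""
    luma = mean_luma(pixels)
    if luma < low:
        return "underexposed"
    if luma > high:
        return "overexposed"
    return "well_exposed"


dark_frame = [(10, 12, 8)] * 4
normal_frame = [(120, 130, 110)] * 4
print(classify_exposure(dark_frame), classify_exposure(normal_frame))
```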

Page 13: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Object Classification

● Define what is in a video
● Helps editors find specific scenes for filler or storytelling
● Defines confidence of object appearing in the scene
● Automatically moderate offensive content
● Enables sentiment analysis and OCR

• Windmill
• Grass
• Green
• Sky
• Blue
• Field
• Clouds
• Sports
• People
• Building

Page 14: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Object Classification

• Formula One
• Vehicle
• Open Wheel Car
• Human Activity
• Formula One Car
• Sports
• Automobile
• Mode of Transport
• Racing
• Auto Racing
• Sports Stadium

Page 15: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Object Classification

• Banana family
• Plant
• Plantain
• Food
• Banana
• Produce
• Fruit
• Yellow

Page 16: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Object Classification

• Athlete
• Track and Field Athletics
• Athletics
• Sports

Page 17: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Sentiment Analysis

● Identifies visual emotions like anger, joy, and sorrow
● Helps editors find specific looks for emotional scenes
● Combined with transcription analysis, allows for sentiment searching
● E.g. “happy people who say hello”

• Happy
• Hair
• Hairstyle
• Eyewear
• Glasses
• Cheerful
• Person
• Man
• Beard
• Facial Hair

Page 18: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Sentiment Analysis

• Sad
• Sorrow
• Facial Expression
• Hair
• Hairstyle
• Eyewear
• Glasses
• Cheerful

Page 19: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Audio Transcription

● Define what is spoken in a video
● Helps editors find specific scenes for storytelling
● Defines word position and timecode
● Automatically moderate offensive content
● Enables further sentiment analysis
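Word position and timecode make it possible to jump straight to the moment a phrase is spoken. A minimal sketch, assuming a hypothetical transcription output of `{"word", "start"}` entries in spoken order (the real output format is not shown in the talk):

```python
def find_phrase(words, phrase):
    """Return the start timecode (seconds) of each occurrence of `phrase`.

    `words` is a list of {"word": str, "start": float} dicts in spoken order;
    this shape is a hypothetical example of transcription output.
    """
    target = phrase.lower().split()
    if not target:
        return []
    # Normalise case and strip trailing punctuation before comparing.
    spoken = [w["word"].lower().strip(".,!?") for w in words]
    hits = []
    for i in range(len(spoken) - len(target) + 1):
        if spoken[i:i + len(target)] == target:
            hits.append(words[i]["start"])
    return hits


transcript = [
    {"word": "I", "start": 0.0},
    {"word": "really", "start": 0.3},
    {"word": "like", "start": 0.7},
    {"word": "working", "start": 1.0},
    {"word": "with", "start": 1.5},
    {"word": "Couchbase,", "start": 1.8},
]
print(find_phrase(transcript, "with couchbase"))
```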

Page 20: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Transcription Language Analysis

● Entity analysis – extract proper nouns
● Sentiment analysis – extract and identify prevailing emotional opinion: positive, negative, neutral
● Tokenization – extract verbs and nouns

Page 21: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Transcription Language Analysis

“I really like working with Couchbase, and enjoy talking about the work we do with it”

{
  "sentiment": { "polarity": 0.8, "magnitude": 4.0 },
  "entities": [
    { "name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8 }
  ]
}
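A result like the one above can be read directly in Python. The sketch below only illustrates consuming that JSON: the helper names and the polarity threshold of 0.25 are assumptions, not values from the talk.

```python
import json

analysis = json.loads("""
{
  "sentiment": {"polarity": 0.8, "magnitude": 4.0},
  "entities": [
    {"name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8}
  ]
}
""")


def sentiment_label(polarity, threshold=0.25):
    """Map a polarity score to positive/negative/neutral.

    The threshold is an illustrative assumption, not a value from the talk.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def entity_names(result, entity_type):
    """Names of all entities of the given type."""
    return [e["name"] for e in result.get("entities", []) if e["type"] == entity_type]


print(sentiment_label(analysis["sentiment"]["polarity"]))
print(entity_names(analysis, "ORGANIZATION"))
```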

Page 22: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying N1QL

{
  "type": "video_analysis",
  "visual_objects": ["face", "happy", "beard", "eyewear", "glasses"],
  "transcription": "I really like working with Couchbase, and enjoy talking about the work we do with it",
  "sentiment": { "polarity": 0.8, "magnitude": 4.0 },
  "entities": [
    { "name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8 }
  ]
}

Page 23: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying N1QL

SELECT *
FROM bucket_name
WHERE type = "video_analysis"
  AND sentiment.polarity > 0.5;

Page 24: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying N1QL

SELECT *
FROM bucket_name
WHERE type = "video_analysis"
  AND sentiment.polarity > 0.5
  AND ANY tag IN visual_objects SATISFIES tag IN ['face', 'happy'] END;

Page 25: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying N1QL

SELECT *,
       ARRAY item.name FOR item IN entities WHEN item.type = 'ORGANIZATION' END AS entities_array
FROM bucket_name
WHERE type = "video_analysis"
  AND sentiment.polarity > 0.5
  AND ANY tag IN visual_objects SATISFIES tag IN ['face', 'happy'] END
  AND ANY organization IN ARRAY item.name FOR item IN entities WHEN item.type = 'ORGANIZATION' END
      SATISFIES organization = 'Couchbase' END;
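To make the WHERE clause easier to read, here is a rough Python equivalent applied to a single document. It is a sketch of the query's intent, assuming the document shape from the earlier slide; the function name `matches` and its parameters are hypothetical, and it does not talk to Couchbase.

```python
def matches(doc, tags=("face", "happy"), organization="Couchbase", min_polarity=0.5):
    """Python equivalent of the WHERE clause, applied to one document."""
    if doc.get("type") != "video_analysis":
        return False
    if doc.get("sentiment", {}).get("polarity", 0.0) <= min_polarity:
        return False
    # ANY tag IN visual_objects SATISFIES tag IN [...] END
    if not any(tag in tags for tag in doc.get("visual_objects", [])):
        return False
    # ARRAY item.name FOR item IN entities WHEN item.type = 'ORGANIZATION' END
    organizations = [
        e["name"] for e in doc.get("entities", []) if e.get("type") == "ORGANIZATION"
    ]
    # ANY organization IN ... SATISFIES organization = 'Couchbase' END
    return any(name == organization for name in organizations)


doc = {
    "type": "video_analysis",
    "visual_objects": ["face", "happy", "beard", "eyewear", "glasses"],
    "transcription": "I really like working with Couchbase, "
                     "and enjoy talking about the work we do with it",
    "sentiment": {"polarity": 0.8, "magnitude": 4.0},
    "entities": [{"name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8}],
}
print(matches(doc))
```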

Page 26: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

But…

● Whilst we can use field LIKE '%partial'
● LIKE is case sensitive
● Does not tolerate typos
● Does not ignore stop words like 'is', 'the'
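The limitations above can be seen with a small emulation of SQL-style LIKE matching. This is a sketch for illustration only: the `like` helper is hypothetical, runs locally in Python rather than in N1QL, and omits escape handling.

```python
import re


def like(value, pattern):
    """Emulate a SQL-style LIKE: % matches any run of characters, _ matches one.

    Matching is case sensitive and exact, as with LIKE. Escape handling is
    omitted to keep the sketch short.
    """
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


text = "I really like working with Couchbase"
print(like(text, "%Couchbase%"))   # exact substring
print(like(text, "%couchbase%"))   # different case
print(like(text, "%Couchbsae%"))   # typo
```

Only the first pattern matches; the lower-case and misspelled variants both fail, which is the gap full text search is meant to fill.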

Page 27: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying Full Text Search

● Improved speed
● Word stemming and text analysis with several prebuilt analysers
● Fuzzy searching
● Result snippets and word highlights
● Simple-to-use conjunction, disjunction and boolean queries
● tf-idf scoring
● Boosting – increase relative importance of specific clauses
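Several of these features (conjunction and disjunction, fuzzy matching, boosting, highlights) can be combined in one search request body. The sketch below builds such a body as a Python dict for the “happy people who say hello” example. It assumes the JSON query shape used by Couchbase FTS (`conjuncts`, `disjuncts`, `match`, `fuzziness`, `boost`, `highlight`); the field names `transcription` and `visual_objects` come from the example document, while the helper names are hypothetical. It only constructs the request and does not send it.

```python
import json


def match(text, field, fuzziness=0, boost=1.0):
    """Build one match clause; fuzziness is the allowed edit distance."""
    clause = {"match": text, "field": field}
    if fuzziness:
        clause["fuzziness"] = fuzziness
    if boost != 1.0:
        clause["boost"] = boost
    return clause


def happy_hello_request(size=10):
    """Search request for "happy people who say hello" (field names assumed)."""
    return {
        "query": {
            # Every conjunct must match.
            "conjuncts": [
                # Tolerate one typo in the spoken word and weight it higher.
                match("hello", "transcription", fuzziness=1, boost=2.0),
                # At least one of the visual labels must match.
                {"disjuncts": [
                    match("happy", "visual_objects"),
                    match("cheerful", "visual_objects"),
                ]},
            ]
        },
        "highlight": {"style": "html", "fields": ["transcription"]},
        "size": size,
    }


print(json.dumps(happy_hello_request(), indent=2))
```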

Page 28: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying FTS Results Example

Page 29: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying FTS Creating The Index

Page 30: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying FTS Indexing Tips

Page 31: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Applying FTS 4

Page 32: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Combining N1QL & FTS

Page 33: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Combining N1QL & FTS 2

Page 34: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Combining N1QL & FTS 3

Page 35: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Next Steps in Learning

Page 36: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Refining Search Further

Page 37: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Automated Content Creation

Page 38: Seenit: Applying N1QL and FTS over machine learning – Couchbase Connect 2016

Together we see more.