Seenit: Applying N1QL and FTS over Machine Learning – Couchbase Connect 2016
TRANSCRIPT
Full Text Search & N1QL ❤️ Machine Learning
www.seenit.io @_seenit
Dave Starling - CTO @ Seenit
Couchbase Connect 2016 @davestarling
• Founded in 2014
• Provide a new production model for internal comms, broadcasting, marketing, and fan-engagement
• Working with companies like Rolls-Royce, Red Bull F1 Racing, BT Sport, BBC, Unilever, and more
• Python, CherryPy, RabbitMQ on Google Cloud Platform
• Built to exploit Couchbase features
• Currently running Couchbase Enterprise 4.5.1 in production
About Seenit
By 2019, video will account for 80% of consumer internet traffic*
*Cisco Visual Networking Index: Forecast and Methodology, 2014-2019 White Paper
How it works: our video co-creation platform
1. Your community captures: Your community creates authentic videos using the Seenit Capture app
2. You collect & edit: These videos are automatically collected in our Seenit Studio where you can edit them
3. Co-created videos: You get highly relevant, authentic, reactive and sharable co-created video content
Seenit Capture App
Turn your community into your camera crew.
• Contributors shoot videos based on the script created by you
• Videos from all users are uploaded automatically to Seenit Studio
• Contributors can view curated content, get inspired and share videos further
• Gamify and reward contribution to boost engagement and content creation
Video production & community engagement all in one place.
• Easily managed by one non-technical person
• Collect & edit uploaded videos
• Tracking & analytics measure video success
• Create a script to direct your contributors
• Direct messaging to engage your team
• Share great clips to inspire your crew
• You own all content
Seenit Studio
60s Video Example Placeholder
● For fast turnaround, watching all incoming video is not an option
● Large video libraries make it hard to find the right clip
● Sometimes only 2s of a video is needed for an edit
Searching for the right video is hard
Take the pain out of manual video processing
● Media Type
● Duration
● Script/Storyboard Item
● Date Created
● Name
Current Filtering Capabilities
● Gives no insight into objective qualities – exposure, stability, audio noise
● Gives no insight into content qualities – sentiment, clarity of voice
Filtering Limitations
● Computer vision using Google’s TensorFlow
● Includes several ready-to-use image and audio processing tools
● Talks Python and JSON
● Results easily stored in Couchbase as JSON objects
Machine Learning
● 1 second frame grabs
● Process each frame grab
● Combine results
● Increase confidence where results appear in multiple frames
● Results stored in Couchbase as JSON objects
Machine Learning with video
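The combine step on this slide can be sketched in Python. The slides do not say how confidence is increased across frames, so the noisy-OR rule used here is an assumption for illustration, as are the function name `combine_frame_labels` and the sample labels and scores.

```python
import json
from collections import defaultdict


def combine_frame_labels(frames):
    """Merge per-frame label confidences into one score per label.

    `frames` is a list of dicts mapping label -> confidence in [0, 1],
    one dict per 1-second frame grab. Scores are merged with a noisy-OR
    rule (an assumed rule, not taken from the talk), so a label seen in
    several frames scores higher than it does in any single frame.
    """
    miss = defaultdict(lambda: 1.0)  # chance the label is wrong in every frame
    count = defaultdict(int)         # number of frames the label appeared in
    for frame in frames:
        for label, conf in frame.items():
            miss[label] *= 1.0 - conf
            count[label] += 1
    return {
        label: {"confidence": round(1.0 - miss[label], 4), "frames": count[label]}
        for label in miss
    }


# Hypothetical classifier output for three frame grabs.
frames = [
    {"sky": 0.6, "windmill": 0.5},
    {"sky": 0.7},
    {"sky": 0.5, "grass": 0.4},
]
video_labels = combine_frame_labels(frames)

# The combined result is plain JSON, ready to be stored as a document.
document_body = json.dumps({"type": "video_analysis", "labels": video_labels})
```

Other merge rules (maximum, mean, or a minimum frame count) would fit the same shape; only the body of the loop changes.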
● Help editors find well-shot videos
● Well exposed
● Stable
● No background noise
● Shot style (panning, stationary)
● Dominant colour for theming
Quantitative Analysis
Underexposed
Background Noise
Stable
● Define what is in a video
● Helps editors find specific scenes for filler or storytelling
● Defines confidence of object appearing in the scene
● Automatically moderate offensive content
● Enables sentiment analysis and OCR
Object Classification
• Windmill • Grass • Green • Sky • Blue • Field • Clouds • Sports • People • Building
Object Classification
• Formula One • Vehicle • Open Wheel Car • Human Activity • Formula One Car • Sports • Automobile • Mode of Transport • Racing • Auto Racing • Sports Stadium
Object Classification
• Banana family • Plant • Plantain • Food • Banana • Produce • Fruit • Yellow
Object Classification
• Athlete • Track and Field Athletics • Athletics • Sports
● Identifies visual emotions like anger, joy, and sorrow
● Helps editors find specific looks for emotional scenes
● Combined with transcription analysis, allows for sentiment searching
● E.g. “happy people who say hello”
Sentiment Analysis
• Happy • Hair • Hairstyle • Eyewear • Glasses • Cheerful • Person • Man • Beard • Facial Hair
Sentiment Analysis
• Sad • Sorrow • Facial Expression • Hair • Hairstyle • Eyewear • Glasses • Cheerful
● Define what is spoken in a video
● Helps editors find specific scenes for storytelling
● Defines word position and timecode
● Automatically moderate offensive content
● Enables further sentiment analysis
Audio Transcription
● Entity analysis – extract proper nouns
● Sentiment analysis – extract and identify prevailing emotional opinion: positive, negative, neutral
● Tokenization – extract verbs and nouns
Transcription Language Analysis
Transcription Language Analysis
“I really like working with Couchbase, and enjoy talking about the work we do with it”
{
  "sentiment": { "polarity": 0.8, "magnitude": 4.0 },
  "entities": [
    { "name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8 }
  ]
}
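As a rough illustration of how a result like this could be reduced to the positive, negative and neutral labels mentioned earlier, here is a minimal Python sketch. It assumes polarity runs from -1 to 1. The `threshold` value and the function name `classify_sentiment` are hypothetical choices, not something stated in the talk.

```python
def classify_sentiment(sentiment, threshold=0.25):
    """Map a polarity score to 'positive', 'negative' or 'neutral'.

    Assumes polarity lies in [-1, 1]; `threshold` is an illustrative
    cut-off, not a value given in the slides.
    """
    polarity = sentiment["polarity"]
    if polarity >= threshold:
        return "positive"
    if polarity <= -threshold:
        return "negative"
    return "neutral"


# The analysis result shown on the slide.
analysis = {
    "sentiment": {"polarity": 0.8, "magnitude": 4.0},
    "entities": [
        {"name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8}
    ],
}

label = classify_sentiment(analysis["sentiment"])

# Entity analysis: keep only the names tagged as organizations.
organizations = [
    entity["name"]
    for entity in analysis["entities"]
    if entity["type"] == "ORGANIZATION"
]
```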
Applying N1QL
{
  "type": "video_analysis",
  "visual_objects": ["face", "happy", "beard", "eyewear", "glasses"],
  "transcription": "I really like working with Couchbase, and enjoy talking about the work we do with it",
  "sentiment": { "polarity": 0.8, "magnitude": 4.0 },
  "entities": [
    { "name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8 }
  ]
}
Applying N1QL
SELECT *
FROM bucket_name
WHERE type = "video_analysis"
  AND sentiment.polarity > 0.5;
Applying N1QL
SELECT *
FROM bucket_name
WHERE type = "video_analysis"
  AND sentiment.polarity > 0.5
  AND ANY tag IN visual_objects SATISFIES tag IN ['face', 'happy'] END;
Applying N1QL
SELECT *,
       ARRAY item.name FOR item IN entities WHEN item.type = 'ORGANIZATION' END AS entities_array
FROM bucket_name
WHERE type = "video_analysis"
  AND sentiment.polarity > 0.5
  AND ANY tag IN visual_objects SATISFIES tag IN ['face', 'happy'] END
  AND ANY organization IN ARRAY item.name FOR item IN entities WHEN item.type = 'ORGANIZATION' END
      SATISFIES organization = 'Couchbase' END;
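To make the intent of the final WHERE clause concrete, the same predicate can be written as plain Python over a single document. This is only a reading aid for the query semantics, not how Couchbase evaluates N1QL. The function name `matches` and its parameters are hypothetical.

```python
def matches(doc, tags=("face", "happy"), organization="Couchbase", min_polarity=0.5):
    """Return True if `doc` would pass the filters of the final query."""
    # type = "video_analysis"
    if doc.get("type") != "video_analysis":
        return False
    # sentiment.polarity > 0.5 (a missing value never matches)
    if doc.get("sentiment", {}).get("polarity", float("-inf")) <= min_polarity:
        return False
    # ANY tag IN visual_objects SATISFIES tag IN [...] END
    if not any(tag in tags for tag in doc.get("visual_objects", [])):
        return False
    # ARRAY item.name FOR item IN entities WHEN item.type = 'ORGANIZATION' END
    organizations = [
        item["name"]
        for item in doc.get("entities", [])
        if item.get("type") == "ORGANIZATION"
    ]
    # ANY ... SATISFIES: at least one organization name equals the target
    return any(name == organization for name in organizations)


# The example document from the slides.
doc = {
    "type": "video_analysis",
    "visual_objects": ["face", "happy", "beard", "eyewear", "glasses"],
    "transcription": "I really like working with Couchbase, and enjoy talking about the work we do with it",
    "sentiment": {"polarity": 0.8, "magnitude": 4.0},
    "entities": [{"name": "Couchbase", "type": "ORGANIZATION", "salience": 0.8}],
}
is_match = matches(doc)
```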
But…
● Whilst we can use field LIKE '%partial':
● LIKE is case sensitive
● Does not tolerate typos
● Does not ignore stop words like 'is', 'the'
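The first two limitations are easy to reproduce outside the database. The sketch below emulates a case-sensitive LIKE in Python, where `%` matches any run of characters and `_` matches a single character. The helper name `sql_like` is hypothetical, and escape sequences are not handled.

```python
import re


def sql_like(value, pattern):
    """Emulate a case-sensitive LIKE match on a single string."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


text = "I really like working with Couchbase"

exact = sql_like(text, "%Couchbase%")       # matches
wrong_case = sql_like(text, "%couchbase%")  # fails: LIKE is case sensitive
typo = sql_like(text, "%Couchbsae%")        # fails: no tolerance for typos
```

Lower-casing both sides works around the case problem, but typos and stop words still need a proper text analyser.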
● Improved speed
● Word stemming and text analysis with several prebuilt analysers
● Fuzzy searching
● Result snippets and word highlights
Applying Full Text Search
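Fuzzy searching is commonly defined in terms of edit distance: a term matches if it can be turned into the query with a small number of single-character edits. The Python sketch below shows that idea with a standard Levenshtein distance. It illustrates the concept only and is not the search engine's implementation. The names `edit_distance` and `fuzzy_match` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, candidate, fuzziness=1):
    """True if `candidate` is within `fuzziness` edits of `term`, ignoring case."""
    return edit_distance(term.lower(), candidate.lower()) <= fuzziness


one_edit = fuzzy_match("Couchbase", "couchbas")                  # one deletion
swapped = fuzzy_match("Couchbase", "couchbsae")                  # two substitutions
swapped_loose = fuzzy_match("Couchbase", "couchbsae", fuzziness=2)
```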
● Simple-to-use conjunction, disjunction and boolean queries
● tf-idf scoring
● Boosting – increase relative importance of specific clauses
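A conjunction of clauses with boosting might be expressed as a JSON search request along the lines below. This is a sketch that assumes the FTS JSON query syntax (`conjuncts`, `disjuncts`, `match`, `term`, `boost`, `fuzziness`). The field names come from the example document under Applying N1QL. The index definition and endpoint are not shown in this transcript, so they are left out, and the helper name `build_search_request` is hypothetical.

```python
import json


def build_search_request(text, tags, boost=2.0, fuzziness=1, size=10):
    """Build a search request body: transcript text AND any of the given tags."""
    # Fuzzy, boosted match on the spoken words.
    transcript_clause = {
        "match": text,
        "field": "transcription",
        "fuzziness": fuzziness,
        "boost": boost,
    }
    # Disjunction: at least one of the visual tags must be present.
    tag_clause = {
        "disjuncts": [{"term": tag, "field": "visual_objects"} for tag in tags]
    }
    return {
        # Conjunction: both clauses must match.
        "query": {"conjuncts": [transcript_clause, tag_clause]},
        "size": size,
        # Ask for highlighted snippets from the transcript.
        "highlight": {"fields": ["transcription"]},
    }


request_body = build_search_request("hello", ["happy", "face"])
payload = json.dumps(request_body, indent=2)
```

This corresponds to the "happy people who say hello" example: the tag clause narrows to the right faces, and the boosted transcript clause weights what was said more heavily in the score.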
Applying FTS Results Example
Applying FTS Creating The Index
Applying FTS Indexing Tips
Applying FTS 4
Combining N1QL & FTS
Combining N1QL & FTS 2
Combining N1QL & FTS 3
Next Steps in Learning
Refining Search Further
Automated Content Creation
Together we see more.