la nación - editorslab hackdays finale -
TRANSCRIPT
Welcome to
You’ll love it!
VIDEODATA
VideoData is a project that focuses on solving 3 newsroom problems:
1) REALLY useful archiving of videos for later reuse.
2) Quickly finding THAT special moment inside a single video piece.
3) Text mining video.
“Imagine being able to get that special moment in a video without having to watch it in full. Oh, yesss, Magic!”
To make this possible, we are going to share the tool and our own archiving work with the general audience & news junkies.
VIDEODATA
• The site works with videos hosted online (Youtube & co), with an emphasis on collaborative & automated tagging of moments within a single piece.
• Additionally, it retrieves transcripts from Youtube automatic captions.
• It also invites audiences to edit automated transcripts via Amara.com to get the most accurate version.
• VideoData users can search within the content and/or contribute tagging material to enrich the online database.
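As a rough illustration of tagged moments and searching within them, here is a minimal Python sketch. The `Moment` class, its field names, the `search` helper and the sample data are all assumptions made for this example; they are not VideoData's actual schema or code.

```python
from dataclasses import dataclass


@dataclass
class Moment:
    """One tagged moment inside a hosted video (hypothetical schema)."""
    video_url: str
    start_seconds: int
    title: str
    description: str


def search(moments, query):
    """Return moments whose title or description contains the query."""
    needle = query.casefold()
    return [
        m for m in moments
        if needle in m.title.casefold() or needle in m.description.casefold()
    ]


# Invented sample data, for illustration only.
MOMENTS = [
    Moment("https://example.org/watch?v=1", 75, "Opening remarks",
           "The speaker introduces the budget figures."),
    Moment("https://example.org/watch?v=1", 312, "Disputed claim",
           "Statement about inflation, with links to sources."),
]

for moment in search(MOMENTS, "INFLATION"):
    print(moment.start_seconds, moment.title)  # prints: 312 Disputed claim
```

Storing the start time with each tag is what would let a reader jump straight to the moment instead of watching the whole piece.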
VIDEODATA
• Tags include a title and a short description or comment on what is happening, e.g. disagreeing with a public servant’s statement and adding links to additional information to support the point.
• An open data feature allows you to download the information for a) a single video or b) the results of a search, in formats like CSV, XML & JSON. One of our goals is to enable data analysis (e.g. text mining).
• It will feature editorially curated highlights, like official videos from the President’s channel on Youtube, the Senate, the World Cup and VIPs in Argentina.
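The open data download could look roughly like the following Python sketch, which serializes the same tag records to JSON, CSV and XML using only the standard library. The record fields, the sample values and the function names are assumptions for illustration, not the project's real export format.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical tag records; field names are illustrative only.
FIELDS = ["video_id", "start_seconds", "title", "description"]
TAGS = [
    {"video_id": "abc123", "start_seconds": 75, "title": "Opening remarks",
     "description": "Speaker introduces the budget figures."},
    {"video_id": "abc123", "start_seconds": 312, "title": "Disputed claim",
     "description": "Claim about inflation; see linked sources."},
]


def to_json(tags):
    """Serialize tags as a JSON array, keeping non-ASCII text readable."""
    return json.dumps(tags, ensure_ascii=False, indent=2)


def to_csv(tags):
    """Serialize tags as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(tags)
    return buffer.getvalue()


def to_xml(tags):
    """Serialize tags as <tags><tag>...</tag></tags>."""
    root = ET.Element("tags")
    for tag in tags:
        node = ET.SubElement(root, "tag")
        for field in FIELDS:
            ET.SubElement(node, field).text = str(tag[field])
    return ET.tostring(root, encoding="unicode")
```

The same functions would serve both cases on the slide, since a single video's tags and a search result are each just a list of records.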
MORE DETAILED & TECH INFO
Download Youtube automatic translation to Spanish via KeepSubs.com
KeepSubs was the easiest solution, but we won't use it in production. For now, we use KeepSubs to analyze a Youtube video and then scrape the website for Spanish subtitles with PyQuery. We also extract the plain text from the subtitles using regular expressions in Python.
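The plain-text extraction step can be sketched as follows. This is a minimal example that assumes the subtitles are in SRT format; the patterns, the `srt_to_text` name and the sample subtitle text are ours, not the team's actual code.

```python
import re

# A tiny SRT sample, invented for illustration.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
Buenas tardes a todos.

2
00:00:03,600 --> 00:00:06,000
<i>Hoy vamos a hablar</i>
de datos abiertos.
"""

# Cue timing line, e.g. "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}[,.]\d{3}.*$"
)
# Cue counter line. Caveat: a caption line made only of digits is dropped too.
INDEX = re.compile(r"^\d+$")
# Inline markup such as <i>...</i>.
MARKUP = re.compile(r"<[^>]+>")


def srt_to_text(srt):
    """Drop counters, timings and markup; join the captions into one string."""
    lines = []
    for raw in srt.splitlines():
        line = MARKUP.sub("", raw).strip()
        if not line or INDEX.match(line) or TIMING.match(line):
            continue
        lines.append(line)
    return " ".join(lines)


print(srt_to_text(SAMPLE_SRT))
# prints: Buenas tardes a todos. Hoy vamos a hablar de datos abiertos.
```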
Download Transcripts in Spanish via Amara
Edited Subtitles via Amara
The idea is to add a link to the video's project on Amara so that the subtitles can be improved.
Amara has an API that we can use to check for updates on the subtitles project. When it reaches 100% in the desired language (e.g. Argentinian Spanish), we can automatically download the finished subtitles and attach them to the video we are showing.
We can link the subtitles in two formats: SRT and TXT.
We can use: http://amara.readthedocs.org/en/latest/api.html
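The check-then-attach flow described above might be structured like this Python sketch. It deliberately makes no real Amara calls: `fetch_status`, `fetch_srt` and `attach` are caller-supplied functions, and the `percent_done` field and the `es-ar` language code are hypothetical placeholders. The real endpoints and response fields should be taken from the API documentation linked above.

```python
from typing import Callable


def sync_subtitles(
    video_id: str,
    language: str,
    fetch_status: Callable[[str, str], dict],
    fetch_srt: Callable[[str, str], str],
    attach: Callable[[str, str, str], None],
) -> bool:
    """Attach finished subtitles to a video; return True if attached."""
    status = fetch_status(video_id, language)
    # "percent_done" is a hypothetical field, not a documented Amara key.
    if status.get("percent_done", 0) < 100:
        return False
    attach(video_id, language, fetch_srt(video_id, language))
    return True


# Stand-ins for the real API client and the site's storage layer.
attached = {}


def fake_status(video_id, language):
    return {"percent_done": 100}


def fake_srt(video_id, language):
    return "1\n00:00:01,000 --> 00:00:02,000\nHola\n"


def fake_attach(video_id, language, srt):
    attached[(video_id, language)] = srt


sync_subtitles("abc123", "es-ar", fake_status, fake_srt, fake_attach)
```

Passing the fetch and attach steps in as functions keeps the decision logic testable without network access; a scheduled job could call `sync_subtitles` periodically for each video still awaiting a finished transcript.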
Download via Amara as SRT
Download via Amara as TXT
Automatic Tags from transcriptions
Coming soon!
http://pastie.org/9284203
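One simple way automatic tags could be derived from a transcript is by word frequency, as in the Python sketch below. This is our assumption about a possible approach, not the method the team has planned; the stop-word list is a tiny illustrative sample and `suggest_tags` is a hypothetical name.

```python
import re
from collections import Counter

# A tiny, illustrative Spanish stop-word list; a real one would be much longer.
STOPWORDS = {
    "a", "al", "con", "de", "del", "el", "en", "es", "la", "las",
    "los", "para", "por", "que", "se", "un", "una", "y",
}


def suggest_tags(transcript, limit=3):
    """Return the most frequent words longer than 3 letters, minus stop words."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", transcript.casefold())
    counts = Counter(w for w in words if len(w) > 3 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


text = ("La inflación sube. La inflación preocupa al Senado "
        "y el Senado debate la inflación.")
print(suggest_tags(text, limit=2))  # prints: ['inflación', 'senado']
```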
Ready to Tag?