Personal information management systems and knowledge integration
Serge Abiteboul, Inria & Ecole Normale Supérieure Cachan
[email protected] http://abiteboul.com
Organization
1. Personal data
2. The Pims
   1. The concept of Pims
   2. The Pims are arriving and that is cool
3. Research issues
4. An illustration with the Thymeflow system
Disc 2016, Serge Abiteboul
1. Personal data
Personal data out there
• Variety
  – Structured, semi-structured, unstructured
  – Metadata and knowledge (RDF)
  – Different languages, terminologies, ontologies, structures
• Veracity
  – Varying quality: errors, opinions, missing data…
  – Varying importance: hard to assess
• Velocity
  – Changes, staleness…
  – Recent data is typically very valuable
− Volume (???)
  – Growing but no Big data
+ Distributed
  – In many autonomous systems that act as silos
  – Different systems, protocols
Bad news (1)
• Loss of functionalities because of fragmentation
  – You don’t know where your data is, how to maintain it up to date, how to get it sometimes
  – Difficult to do global search, maintenance, synchronization, archiving...
• Loss of control over the data
  – Difficult to control privacy
  – Difficult to control sharing
  – Leaks of private information
• Loss of freedom
  – Vendor lock-in
Bad news (2)
• A few companies concentrate most of the world’s data and analytic power
  – They have the means to destroy business competition in large portions of the economy
• A few companies control all your personal data
  – They determine what information you are exposed to
  – They guide many of your decisions
  – They potentially infringe on your privacy and freedom
2. The Pims
From Managing your digital life with a Personal information management system, with Benjamin André & Daniel Kaplan, Communications of the ACM 2015
Alternatives
• Continue with this increasing mess
  – See a shrink to overcome the frustration
• Gather all your data in one platform
  – Google, Apple, Facebook, …, a newcomer
  – See a shrink to overcome resentment
• Study 2 years to become a geek
  – Geeks know how to manage their information
  – See a shrink to survive the experience
Where do you keep your data?
Or move to Pims!
"A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory."
Vannevar Bush, The Atlantic Monthly, 1945
Definition for this talk: a Personal Information Management System is a cloud system that manages all the information of a person
One Pims, two Pims… many Pims
The Pims: a change in paradigm
Many Web services
Each one running
• On some unknown machines
• With your data
• Some software
Your Pims
• Your machine
• With your data
  – possibly a replica of data from systems you like
• Wrapper to some software
  – External service
• Or your software
  – Decentralized service
The Pims are (I believe) arriving!
Why? For 3 kinds of reasons:
• Society
• Technology
• Industry
Society is ready to move
• Growing resentment
  – Against companies: intrusive marketing, cryptic personalization and business decisions (e.g., on pricing), creepy "big data" inferences
  – Against governments: NSA and its European counterparts
• Increasing awareness of the dissymmetry
  – between what these systems know about a person, and what the person actually knows
• Emerging understanding of the value of personal data for individuals
  – Quantified self
Society is ready to move (2)
• Privacy control: regulations in Europe
• Information symmetry: Vendor relation management
• Many reports/proposals that affirm the ownership of personal data by the person
• Personal data disclosure initiatives
  – Smart Disclosure (US); MiData (UK), MesInfos (France)
  – Several large companies (network operators, banks, retailers, insurers…) agreeing to share with customers the personal data that they have about them
Technology is gearing up
• System administration is easier
  – Abstraction technologies for servers
  – Virtualization and configuration management tools
• Open-source alternatives to proprietary online services are increasingly available
• Price of machines is going down
  – A hosted low-cost server is as cheap as 5€/month
  – Paying is no longer a barrier for a majority of people
You may have friends already doing it
Technology is gearing up (2)
• Many systems & projects
  – Lifestreams, Stuff-I’ve-Seen, Haystack, MyLifeBits, Connections, Seetrieve, Personal Dataspaces, or deskWeb.
  – YunoHost, Amahi, ArkOS, OwnCloud or Cozy Cloud
• Some on particular aspects
  – Mailpile for mail
  – Lima for a Dropbox-like service, but at home.
  – Personal NAS (network-attached storage) e.g. Synology
  – Personal data store SAMI of Samsung...
• Many more
Industry is interested
Pre-digital companies
• E.g., hotels or banks
• Disintermediated from their customers by pure Internet players such as Google, Amazon, Booking.com, Mint.
• In Pims, they can rebuild direct interaction
• The playing field is neutral
  – Unlike on the Internet where they have less data
• They can offer new services without compromising privacy
Industry is interested (2)
Home appliances companies
• Many devices deployed at home or in datacenters
  – Internet service provider “boxes”, NAS servers, “smart” meters provided by energy vendors, home automation systems, “digital lockers”…
• Personal data spaces dedicated to specific usage
• Could evolve to become more generic
• Control of private Internet of things
Industry is interested (3)
Pure Internet players
• Amazon: great know-how in providing services
• Facebook, Google: cannot afford to be out of a movement in personal data management
• Very far from their business model based on personal advertisement
• Moving to this new market would require major changes & the clarification of the relationship with users w.r.t. data monetization
Advantages – rebalance the Web
• User control over their data
  – Who has access to what, under what rules, to do what
• User empowerment
  – They choose services freely & they can leave a service
• Participation in a more “neutral” Web
  – With the “network effect”, the main platforms are accumulating data/customers and distorting competition
  – The Pims bring back fairness on the Web
  – Good practices are encouraged, e.g., interoperability, portability
The Pims will primarily arrive because of new functionalities
This is (for me) the key ingredient for adoption
New functionalities ➸ New opportunities
New playing field for startups
New playing field for researchers
3. Research issues with the Pims
From Personal Information Management Systems, tutorial in Extended DataBase Technology Conference, 2015, with Amélie Marian
R&D issues we will not consider much
Some old problems revisited
• Epsilon-principle (epsilon-user-administration)
• Backups & Task sequencing
• Access control & Exchange of information
• Security (e.g. works @ INRIA Rocquencourt)
• Connected objects control
R&D issues we will briefly illustrate
Some old problems revisited
• Personal information integration
• Synchronization
• Personalization and context awareness
• Personal data analysis
4. An illustration with the Thymeflow system
Demo in International Conference on Information and Knowledge Management (CIKM’16) with David Montoya, Thomas Pellissier-Tanon, Fabian M. Suchanek
Pims are first about data integration
[Figure: grid of users (mimi, lulu, zaza, Alice) against services (location, web search, calendar, contacts, TripAdvisor, banks, Facebook), contrasting "integration of the users of a service" with "integration of the services of a user".]
Or rather on knowledge integration
• Data/Information ➼ Knowledge
  – Personal data/info management is getting too complicated
  – Machines prefer structured knowledge to unstructured information or semantic-free data
• Thesis: Let us turn all our information into a distributed knowledge base
ERC Webdam, http://webdam.inria.fr (ended in 2015)
The Thymeflow Knowledge Base
• Thymeflow is a KB, an extension of a person’s memory
  – Episodic memory (typically related to spatio-temporal events) and
  – Semantic memory (knowledge that holds independently of any such event)
• Thymeflow’s knowledge is
  – Extracted from all the information traces of the person
  – Obtained from the Web (Wikidata, OpenStreetMap…)
  – Derived by software modules that analyze the KB
• Thymeflow is an application for the Web and mobile phones
  – Loading: calendar, contacts, mails, geolocation (GPS), social networks…
  – Deriving links between these data sources and other knowledge bases
  – Supporting query processing and data analytics
Architecture
[Figure: synchronizers load and sync the data sources into the Thymeflow KB; enrichers enrich the KB and query external sources; the KB is persisted and supports querying, visualization, and analytics.]
• Backend:
  – HTTP server
  – REST API
  – SPARQL endpoint (Sesame)
• Frontend: Web app
• Mobile app
  – for geolocation
RDF knowledge base
• RDF model
  – RDF Triples: subject – predicate – object
• Schema
  – http://schema.org/
  – http://thymeflow.com/personal
• Most useful classes
  – personal:Agent
  – schema:Event
  – schema:Place
  – schema:EmailMessage
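To make the triple model concrete, here is a minimal sketch in plain Python (no RDF library). The example.org resources are invented for illustration, the "#" separator after the personal namespace is an assumption, and nothing here is taken from Thymeflow's actual code.

```python
from typing import NamedTuple

# Namespaces named on the slide; the "#" separator is an assumption.
SCHEMA = "http://schema.org/"
PERSONAL = "http://thymeflow.com/personal#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical resources, invented for this example only.
alice = "http://example.org/alice"
msg = "http://example.org/mail/1"

kb = {
    Triple(alice, RDF_TYPE, PERSONAL + "Agent"),
    Triple(alice, SCHEMA + "name", "Alice"),
    Triple(msg, RDF_TYPE, SCHEMA + "EmailMessage"),
    Triple(msg, SCHEMA + "sender", alice),
}


def instances_of(kb, cls):
    """Return the sorted subjects that are typed with class `cls`."""
    return sorted(t.subject for t in kb
                  if t.predicate == RDF_TYPE and t.obj == cls)
```

Calling instances_of(kb, PERSONAL + "Agent") lists the agents in this toy KB. A real deployment would rely on an RDF store and SPARQL rather than a Python set.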
Query examples
• At what time do I usually send emails?
• Full-text query in my entire memory
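The two queries above can be sketched over a list of triples in plain Python. This is an illustration, not the system's query engine: it assumes messages carry schema:dateSent and schema:text values, and the message identifiers and dates are made up.

```python
from collections import Counter
from datetime import datetime

DATE_SENT = "http://schema.org/dateSent"
TEXT = "http://schema.org/text"

# (subject, predicate, object) triples; all values are invented.
triples = [
    ("mail:1", DATE_SENT, "2016-05-02T09:15:00"),
    ("mail:1", TEXT, "Lunch in Paris on Friday?"),
    ("mail:2", DATE_SENT, "2016-05-02T09:40:00"),
    ("mail:2", TEXT, "Slides for the talk attached"),
    ("mail:3", DATE_SENT, "2016-05-03T22:05:00"),
    ("mail:4", DATE_SENT, "2016-05-04T09:59:00"),
]


def sending_hours(triples):
    """Count sent messages per hour of the day (0-23)."""
    hours = Counter()
    for _subject, predicate, obj in triples:
        if predicate == DATE_SENT:
            hours[datetime.fromisoformat(obj).hour] += 1
    return hours


def full_text(triples, needle):
    """Return sorted subjects having an object value that contains `needle`."""
    needle = needle.lower()
    return sorted({s for s, _p, o in triples if needle in o.lower()})
```

Here sending_hours(triples).most_common(1) gives the busiest hour with its count, and full_text(triples, "paris") returns the matching message identifiers. Against the SPARQL endpoint these would be expressed as SPARQL queries instead.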
Main component: synchronizer
• Transform data into knowledge and synchronize a data source with the knowledge base
Examples
• CalDavSynchronizer/CardDavSynchronizer:
  – Manage iCalendar (.ical) and vCard (.vcf)
• EmailSynchronizer
  – IMAP to connect to mail servers
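As a rough illustration of what a contact synchronizer does, the sketch below turns a vCard into triples and applies the difference between two snapshots of a source to the KB. It is a simplification under stated assumptions: it handles only the FN and EMAIL fields, ignores vCard line folding and escaping, and the function names and the subject identifier are hypothetical rather than Thymeflow's.

```python
SCHEMA = "http://schema.org/"

# Only two vCard fields are mapped in this sketch.
FIELD_TO_PREDICATE = {
    "FN": SCHEMA + "name",
    "EMAIL": SCHEMA + "email",
}


def vcard_to_triples(text, subject):
    """Transform one vCard into a set of (subject, predicate, object) triples."""
    triples = set()
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        # Drop parameters such as ";TYPE=INTERNET" from the field name.
        name = key.split(";", 1)[0].upper()
        if name in FIELD_TO_PREDICATE:
            triples.add((subject, FIELD_TO_PREDICATE[name], value.strip()))
    return triples


def synchronize(kb, previous, current):
    """Update `kb` in place with the changes between two source snapshots."""
    kb -= previous - current   # statements that disappeared from the source
    kb |= current - previous   # statements that are new in the source
    return kb
```

A usage example: parse the old and new versions of a card with vcard_to_triples, then call synchronize(kb, old, new) so the KB reflects the edit. The sketch assumes each statement comes from a single source; if two sources assert the same triple, a real synchronizer would need per-source provenance before deleting anything.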
Update propagation
[Figure: updates propagate from the data sources to the Thymeflow KB and its persistent copy; propagation from the KB back to the data sources is marked "???" on the slide.]
Main component: enricher
• Align concepts coming from different data sources
• Add knowledge to the KB
Examples
• Align agents based on, e.g., their names, emails…
• Add geolocations to calendar events
• Add semantics to places physically visited
• Align calendar events to places physically visited
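The agent-alignment example can be sketched as grouping records that share a normalized email address or name, here with a small union-find. This is one plausible heuristic written for illustration, not the enricher described in the demo; the record identifiers and field names are hypothetical.

```python
def align_agents(agents):
    """Group agent records that share a normalized email or name.

    `agents` maps a record id to a dict with optional "name" and "email".
    Returns a sorted list of sorted id groups.
    """
    parent = {agent_id: agent_id for agent_id in agents}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen = {}  # normalized key -> first record id carrying that key
    for agent_id, attrs in agents.items():
        keys = []
        if attrs.get("email"):
            keys.append(("email", attrs["email"].strip().lower()))
        if attrs.get("name"):
            keys.append(("name", " ".join(attrs["name"].lower().split())))
        for key in keys:
            if key in seen:
                parent[find(agent_id)] = find(seen[key])  # merge the groups
            else:
                seen[key] = agent_id

    groups = {}
    for agent_id in agents:
        groups.setdefault(find(agent_id), []).append(agent_id)
    return sorted(sorted(group) for group in groups.values())
```

Exact matching on names is deliberately naive: homonyms would be merged and spelling variants such as "A. Martin" are only caught through a shared email. A production enricher would likely use softer similarity measures and confidence scores.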
Data analytics
• Small data analysis with Pims
  – Learn from personal data, e.g.,
    • Personal health and well-being
    • Digital personal assistant: notification & planning
  – Issues
    • Much smaller amounts of data: statistics are harder
    • Varying data quality: imprecision, inconsistencies
• Big data analysis from Pims
  – Aggregate data from a large number of Pims
  – Derive knowledge useful for Pims, e.g., traffic jams
  – Issue: data privacy
Conclusion
Goal
Make the digital world a better place to live in
The Pims seem a promising direction for that
Lots of research issues remaining