Transcript
Page 1: Dealing with Semantic Heterogeneity in Real-Time Information

EarthBiAs2014  

Global  NEST    

University  of  the  Aegean  

Dealing with Semantic Heterogeneity in Real-Time Information

Dr. Edward Curry

Insight Centre for Data Analytics, National University of Ireland Galway

Tuesday  8th  July  2014    

7-11 July 2014, Rhodes, Greece, EarthBiAs2014

Page 2: Dealing with Semantic Heterogeneity in Real-Time Information

Talk  Overview  

• Part I: Large Scale Open Environments
• Part II: Computational Paradigms
• Part III: RDF Event Processing
• Part IV: Theory of Event Exchange
• Part V: Approaches to Semantic Decoupling
• Part VI: Example Application: Linked Energy Intelligence


Page 3: Dealing with Semantic Heterogeneity in Real-Time Information

About  Me  

•  PhD  in  Computer  Science  (NUI  Galway)  

•  Green  and  Sustainable  IT  Research  Group  Leader  in  DERI/Insight  NUI  Galway  

• Researcher in both Computer Science and Information Systems

Page 4: Dealing with Semantic Heterogeneity in Real-Time Information
Page 5: Dealing with Semantic Heterogeneity in Real-Time Information

Overall Objective

WATERNOMICS will provide personalised and actionable information about water consumption and water availability to individual households, companies and cities in an intuitive and effective manner at a time-scale relevant for decision making.

Page 6: Dealing with Semantic Heterogeneity in Real-Time Information

Project-­‐Sense  

Non-Technical Users

•  Targets Occupants of the Building

•  Non-Technical Office Workers

•  No experience in Energy Management

•  Low cost installation

Self-Configuration

•  Collaborative system configuration

•  Crowdsourced contextual data from building occupants

•  Imports relevant enterprise data via Excel

•  Semantic event matching reduces configuration costs

Decision Support

•  Sensor and Data Fusion

•  Multi-level decision support model

•  Identifies Energy Saving Opportunities

•  Leverages Open Data and Predictive Analytics

User Experience

•  From Awareness to Engagement

•  Transtheoretical Model

•  Gamification

•  User Personalisation

•  Simple non-technical user interfaces

Self-­‐configuring  smart  energy  management  systems  for  small  commercial  buildings  

Page 7: Dealing with Semantic Heterogeneity in Real-Time Information

BIG: Big Data Public Private Forum (BIG 318062), European Data Forum 2014

The BIG Project

BIG aims to promote a well-developed EU industrial landscape in Big Data:

▶  Providing a clear picture of existing technology trends and their maturity
▶  Acquiring a sharp understanding of how Big Data can be applied to concrete environments / use cases
▶  Pushing European Big Data research and innovation to contribute to increasing European competitiveness
▶  Building a self-sustainable, industry-led initiative

Overall Objective

Work at technical, business and policy levels, shaping the future through the positioning of IIM and Big Data specifically in Horizon 2020.

Bringing the necessary stakeholders into a self-sustainable industry-led initiative, which will greatly contribute to enhancing EU competitiveness by taking full advantage of Big Data technologies.

Page 8: Dealing with Semantic Heterogeneity in Real-Time Information

@BYTE_EU www.byte-project.eu

Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities

• The effects of a decision by stakeholders (e.g., governments, industry, scientists, policy-makers) that have an impact on a third party (especially members of the public).
• May be positive or negative

Economic
• Boost to the economy
• Innovation
• Increase efficiency
• Smaller actors left behind
• Shrink economies

Legal
• Privacy
• Data protection
• Data ownership
• Copyright
• Risks associated with inclusion & exclusion

Social & Ethical
• Transparency
• Discrimination
• Methodological difficulties
• Spurious relationships
• Consumer manipulation

Political
• Reliance on US services
• Services have become utilities
• Legal issues become trade issues

Page 9: Dealing with Semantic Heterogeneity in Real-Time Information

LARGE  SCALE  OPEN  ENVIRONMENTS  

PART  I  


Page 10: Dealing with Semantic Heterogeneity in Real-Time Information

Emerging Environments: Smart City, Energy, Smart Building, Water Management

Page 11: Dealing with Semantic Heterogeneity in Real-Time Information

From  Internet  of  Things  to  Internet  of  Everything  

Page 12: Dealing with Semantic Heterogeneity in Real-Time Information

Lots of Data

“90% of the data in the world today has been created in the last two years alone” – IBM

“The bringing together of a vast amount of data from public and private sources […] is what Big Data is all about” – IDC

“Over the next few years we’ll see the adoption of scalable frameworks and platforms for handling streaming, or near real-time, analysis and processing.” – O’Reilly

“Big Data represents a number of developments in technology that have been brewing for years and are coming to a boil. They include an explosion of data and new kinds of data, like from the Web and sensor streams; [...].” – IDC

Page 13: Dealing with Semantic Heterogeneity in Real-Time Information

From Rigid Schemas to Schema-less

• Heterogeneous, complex and large-scale data
• Very large and dynamic “schemas”
• Open environments: distributed, decoupled data sources, anonymous users, multi-domain, lack of global order of information flow

circa 2000: 10s-100s of attributes

circa 2014: 1,000s-1,000,000s of attributes

Page 14: Dealing with Semantic Heterogeneity in Real-Time Information

Fundamental Decentralization

• Multiple perspectives (conceptualizations) of the reality.
• Ambiguity, vagueness, inconsistency.

 

Page 15: Dealing with Semantic Heterogeneity in Real-Time Information

Current Trends

Small scale, controlled environments vs. large scale, open environments:

• Information sources: 10s to 100s vs. 1000s to millions
• Data heterogeneity: small number of schemas vs. high number of schemas
• Users: small number, who know the environment vs. large number, who do not fully know the environment
• User organization: users know each other, top-down hierarchies (e.g. enterprises) vs. decoupled and distributed
• Dynamism: low vs. high (sources and users join and leave often)
• Domain: domain specific vs. user interests ranging from domain specific to domain agnostic

Page 16: Dealing with Semantic Heterogeneity in Real-Time Information

COMPUTATIONAL  PARADIGMS  

PART  II  


Page 17: Dealing with Semantic Heterogeneity in Real-Time Information

Information Flow Processing (IFP)

• Users need to collect information
  – Produced by multiple distributed sources
  – For timely processing
  – To extract knowledge as soon as possible

Examples: financial continuous analytics, RFID inventory management, environmental monitoring

Page 18: Dealing with Semantic Heterogeneity in Real-Time Information

Information Flow Processing (IFP)

• Processing information as it flows
  – No intermediate storage
  – New information produced
  – Raw information can be discarded

[Figure: producers feed an Information Flow Processing Engine, configured by rule managers, which delivers results to consumers.]

CUGOLA, G. AND MARGARA, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
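The properties listed on this slide (process items as they flow, keep no intermediate storage, produce new information, discard the raw input) can be sketched with a Python generator. This is only an illustrative sketch: the item shape, the `flow_engine` name, the threshold rule and the sample readings are all assumptions made for the example, not part of any IFP system described in the talk.

```python
from typing import Callable, Iterable, Iterator


def flow_engine(source: Iterable[dict],
                rule: Callable[[dict], bool]) -> Iterator[dict]:
    """Apply one rule to each item as it arrives; nothing is stored."""
    for item in source:
        if rule(item):
            # New information is produced from the raw item...
            yield {"alert": "high reading", "sensor": item["sensor"]}
        # ...and the raw item itself is dropped once it has been processed.


def producers() -> Iterator[dict]:
    """Stand-in for distributed producers (made-up values)."""
    yield {"sensor": "s1", "value": 12.0}
    yield {"sensor": "s2", "value": 48.5}
    yield {"sensor": "s1", "value": 51.0}


# A rule manager would register rules; here a single hypothetical rule is used.
alerts = list(flow_engine(producers(), lambda r: r["value"] > 40.0))
```

Because the engine is a generator, each reading is examined once and then released, which mirrors the "no intermediate storage" point.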

Page 19: Dealing with Semantic Heterogeneity in Real-Time Information

Information Flow Processing (IFP)

• Requirements
  – Real-time or near real-time processing
  – Expressive language for rules
  – Scalability to a large number of producers and consumers


Page 20: Dealing with Semantic Heterogeneity in Real-Time Information

Computational Paradigm

• Event Processing
  – Event: object representing a happening.
  – Deals with events and relations between events (e.g. inter-event sequencing, causality, etc.)

• Stream Processing
  – Stream: homogeneous and totally ordered set of data items.
  – Deals with streams and operations on streams (e.g. joins).

• An event “cloud” may contain streams of events as well as partially ordered sets of events.
  – (Cugola & Margara, 2012)

Page 21: Dealing with Semantic Heterogeneity in Real-Time Information

Event Processing Architecture

• Event processing agents, network, and rules.

[Figure: producers emit events E1, E2 and E3 to an Event Processing Engine holding a rule; the engine delivers to a consumer.]

Page 22: Dealing with Semantic Heterogeneity in Real-Time Information

Event Processing is Decoupled for Scalability

[Figure: event processing decouples the event source from the event consumer in space, time and synchronization.]

Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. 2003. The many faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114-131.

Page 23: Dealing with Semantic Heterogeneity in Real-Time Information

Active Databases

• Traditional database systems
  – Passive
  – Store data and wait for user’s interaction
  – Reactive behaviour in the application layer

DAYAL, U., BLAUSTEIN, B., BUCHMANN, A., CHAKRAVARTHY, U., HSU, M., LEDIN, R., MCCARTHY, D., ROSENTHAL, A., SARIN, S., CAREY, M. J., LIVNY, M., AND JAUHARI, R. 1988. The HiPAC project: Combining active databases and timing constraints. SIGMOD Rec. 17, 1, 51–70.

LIEUWEN, D. F., GEHANI, N. H., AND ARLEIN, R. M. 1996. The Ode active database: Trigger semantics and implementation. In Proceedings of the 12th International Conference on Data Engineering (ICDE’96). IEEE Computer Society, Los Alamitos, CA, 412–420.

GATZIU, S. AND DITTRICH, K. 1993. Events in an active object-oriented database system. In Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H. Williams, Eds. Workshops in Computing, Springer-Verlag, Edinburgh, U.K.

CHAKRAVARTHY, S. AND ADAIKKALAVAN, R. 2008. Events and streams: Harnessing and unleashing their synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems (DEBS’08). ACM, New York, NY, 1–12.


Page 24: Dealing with Semantic Heterogeneity in Real-Time Information

Active Databases

• Reactive behaviour moved to the database layer
• Event-Condition-Action (ECA) rules
  – Event: source. E.g. tuple inserted
  – Condition: checked after the event. E.g. inserted.value > 5
  – Action: what to do. E.g. modify the DB

• Cons
  – Persistent storage model
  – Suitable only when updates are infrequent and rules are few
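An Event-Condition-Action rule can be illustrated with a small in-memory sketch. The `Table` and `EcaRule` classes below are hypothetical names invented for this example, not the API of any active database; the condition mirrors the slide's `inserted.value > 5`, and the action (appending to an audit list) stands in for "modify the DB".

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EcaRule:
    event: str                                # e.g. "insert"
    condition: Callable[[dict], bool]         # evaluated after the event
    action: Callable[["Table", dict], None]   # what to do when it holds


@dataclass
class Table:
    rows: list = field(default_factory=list)
    audit: list = field(default_factory=list)
    rules: list = field(default_factory=list)

    def insert(self, row: dict) -> None:
        self.rows.append(row)
        # The storage layer itself reacts: fire every matching rule.
        for rule in self.rules:
            if rule.event == "insert" and rule.condition(row):
                rule.action(self, row)


table = Table()
table.rules.append(EcaRule(
    event="insert",
    condition=lambda row: row["value"] > 5,
    action=lambda t, row: t.audit.append(row["value"]),
))
table.insert({"value": 3})   # condition fails, no action
table.insert({"value": 9})   # condition holds, action fires
```

Every insert walks the full rule list, which hints at why the slide limits this model to settings with infrequent updates and few rules.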


Page 25: Dealing with Semantic Heterogeneity in Real-Time Information

Data  Stream  Management  Systems  

• Streams unbounded (not like tables)
• No arrival order assumptions
• Typically no storage
• Use continuous, or standing, queries
• Reactive in nature

CHANDRASEKARAN, S., COOPER, O., DESHPANDE, A., FRANKLIN, M. J., HELLERSTEIN, J. M., HONG, W., KRISHNAMURTHY, S., MADDEN, S. R., REISS, F., AND SHAH, M. A. 2003. TelegraphCQ: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’03). ACM, New York, NY, 668–668.

CHEN, J., DEWITT, D. J., TIAN, F., AND WANG, Y. 2000. NiagaraCQ: A scalable continuous query system for Internet databases. SIGMOD Rec. 29, 2, 379–390.

LIU, L., PU, C., AND TANG, W. 1999. Continual queries for internet scale event-driven information delivery. IEEE Trans. Knowl. Data Eng. 11, 4, 610–628.

ARASU, A., BABU, S., AND WIDOM, J. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2, 121–142.


Page 26: Dealing with Semantic Heterogeneity in Real-Time Information

Data  Stream  Management  Systems  

• Continuous query semantics
  – Answer: append-only stream or update store
  – Exact or approximate answer

• Cons
  – Atomic item is the stream
  – Not possible to detect sequencing or causal patterns
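A continuous (standing) query with an append-only answer can be sketched as a sliding-window average. This is a minimal sketch under assumptions: the function name, the window size and the sample values are invented for illustration and do not correspond to CQL or any particular DSMS.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_average(stream: Iterable[float], size: int) -> Iterator[float]:
    """Standing query: emit the mean of the last `size` items on each arrival."""
    window: deque = deque(maxlen=size)   # bounded state instead of a stored table
    for value in stream:
        window.append(value)             # the oldest item falls out automatically
        yield sum(window) / len(window)  # one new answer appended per arrival


answers = list(sliding_average([10.0, 20.0, 30.0, 40.0], size=2))
```

The query only aggregates over the stream as a whole; it has no notion of one item causing or following another, which is the limitation the slide lists under "Cons".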


Page 27: Dealing with Semantic Heterogeneity in Real-Time Information

Publish/Subscribe    Systems  

• Information items are notifications
• Indirect addressing-based communication scheme

• Ancestors
  – Message Passing
  – Remote Procedure Call (RPC)
  – Shared spaces
  – Message Queueing

EUGSTER, P.T., FELBER, P.A., GUERRAOUI, R. AND KERMARREC, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114–131.

MUHL, G., FIEGE, L., AND PIETZUCH, P. 2006. Distributed Event-Based Systems. Springer


Page 28: Dealing with Semantic Heterogeneity in Real-Time Information

Publish/Subscribe  Systems  

• One-to-many and many-to-many distribution mechanism
  – Allows a single producer to send a message to one consumer or to potentially hundreds of thousands of consumers

E. Curry, “Message-Oriented Middleware,” in Middleware for Communications, Q. H. Mahmoud, Ed. Chichester, England: John Wiley and Sons, 2004, pp. 1–28.


Page 29: Dealing with Semantic Heterogeneity in Real-Time Information

Publish/Subscribe    Systems  

• Topic-based pub/sub
  – Topics are groups or channels
  – Events of a topic are sent to the topic’s subscribers

ALTHERR, M., ERZBERGER, M., AND MAFFEIS, S. 1999. iBus—a software bus middleware for the Java platform. In Proceedings of the International Workshop on Reliable Middleware Systems. 43–53.

• Content-based pub/sub
  – Matching by message filters
  – Publishers’ and subscribers’ channels are defined by the content and the subscriptions

David S. Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920

• Type-based pub/sub
  – Matching on type hierarchy

EUGSTER, P. AND GUERRAOUI, R. 2001. Content based publish/subscribe with structural reflection. In Proceedings of the 6th Usenix Conference on Object-Oriented Technologies and Systems (COOTS’01).
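The three subscription styles differ only in what the matcher inspects. The sketch below puts them side by side; the class names, topic string and attribute values are hypothetical and chosen purely for illustration.

```python
class Event:
    def __init__(self, topic: str, **attrs):
        self.topic = topic
        self.attrs = attrs


class MeterEvent(Event):
    pass


class PowerMeterEvent(MeterEvent):
    pass


def topic_match(event: Event, topic: str) -> bool:
    """Topic-based: compare the channel name only."""
    return event.topic == topic


def content_match(event: Event, predicate) -> bool:
    """Content-based: apply a filter to the event's attributes."""
    return predicate(event.attrs)


def type_match(event: Event, event_type: type) -> bool:
    """Type-based: match on the position in the type hierarchy."""
    return isinstance(event, event_type)


e = PowerMeterEvent("building/energy", room="202e", kwh=40)
results = (
    topic_match(e, "building/energy"),
    content_match(e, lambda a: a.get("kwh", 0) > 30),
    type_match(e, MeterEvent),   # a subtype satisfies a supertype subscription
)
```

In all three cases the subscriber must already know the exact topic name, attribute names or type names used by the publisher.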


Page 30: Dealing with Semantic Heterogeneity in Real-Time Information

Complex  Event  Processing  Systems  

• Detection of complex patterns
  – Sequencing
  – Causal
  – Ordering in general
  – Of multiple events
  – And generate complex, or derived, events

LUCKHAM, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley Professional.
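A sequencing pattern that generates a derived event can be sketched as follows. This is a deliberately small illustration, not a CEP engine: the `(timestamp, type)` event shape, the event names and the ten-second window are assumptions made for the example.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, event type); hypothetical shape


def detect_sequence(events: Iterable[Event], first: str, then: str,
                    within: int) -> Iterator[dict]:
    """Emit a derived event when `then` follows `first` within `within` seconds."""
    last_first = None  # timestamp of the most recent unmatched `first` event
    for ts, kind in events:
        if kind == first:
            last_first = ts
        elif kind == then and last_first is not None and ts - last_first <= within:
            yield {"type": f"{first}->{then}", "start": last_first, "end": ts}
            last_first = None  # the pair is consumed


stream = [(0, "door_open"), (3, "motion"), (5, "light_on"),
          (20, "door_open"), (90, "light_on")]
derived = list(detect_sequence(stream, "door_open", "light_on", within=10))
```

Only the first pair is reported; the second `light_on` arrives 70 seconds after its `door_open`, outside the window.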


Page 31: Dealing with Semantic Heterogeneity in Real-Time Information

Complex  Event  Processing  Systems  

 

Adapted from CUGOLA, G. AND MARGARA, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.

 


Page 32: Dealing with Semantic Heterogeneity in Real-Time Information

RDF  EVENT  PROCESSING  

PART  III  


Page 33: Dealing with Semantic Heterogeneity in Real-Time Information

Why  Linked  Data  for  the  IoT?  

• Many communities struggle with closed approaches
  – E.g., pervasive computing, embedded systems, IoT, ...

• Cyber-Physical Systems are inherently “open world”
  – Prof. David Karger (MIT) in his ESWC 2013 keynote: “Semantic Web technologies support an open world assumption where millions of unforeseeable schemas may have to be integrated.”

• Simple integration with existing LOD data sets
  – Geo-spatial, governmental, media, ...

• Manageable integration effort with other graph data, e.g., Google Knowledge Graph, Facebook Graph, etc.

Page 34: Dealing with Semantic Heterogeneity in Real-Time Information

EU ICT OpenIoT Project

An Open Source Cloud Solution for the Internet of Things!

Open Source blueprint for large scale self-organizing cloud environments for IoT applications

Knowledge-Based Future Internet

•  Step 1: Sensing-as-a-Service Request
•  Step 2: Sensor/Cloud Formulation
•  Step 3: Service Provisioning (Utility Metrics)

Actors: Infrastructure’s provider(s) (e.g., Smart City); OpenIoT User (Citizen, Corporate); Domain #1 to Domain #N

Middleware core features: Open Source, Linked Data, Cloud Computing, Internet of Things, IoT Management, Data Privacy and Security, Mobility and Quality of Service

www.openiot.eu

EU ICT-2011.1.3 Contract No.: 287305

Page 35: Dealing with Semantic Heterogeneity in Real-Time Information

Sensor Networks

•  OpenIoT leverages the SoA on Internet of Things (IoT) RFID/WSN middleware frameworks.

•  OpenIoT provides baseline service functionalities associated with registering and looking up internet-connected objects (ICOs) named things.

IoT Management

•  OpenIoT provides baseline visualization services.

•  OpenIoT supports dynamic interoperable self-organizing management on cloud environments for IoT.

•  OpenIoT enables the autonomy of a variety of IoT entities and resources.

Cloud Computing

•  OpenIoT allows creation of PaaS models over internet-connected objects.

•  OpenIoT supports applications that leverage information from multiple sensors, actuators and other devices to the cloud.

•  OpenIoT enables cloud solutions to support IoT.

Open Source

•  OpenIoT is an open source solution.

•  OpenIoT is a first-of-a-kind extension of existing open cloud computing infrastructures towards IoT support.

•  OpenIoT is a customizable toolkit for the IoT.

OpenIoT Innovation for the Smart Industry www.openiot.eu

[Figure: use cases (Agrifood, Phenonet, Smart City, Manufacturing, Smart Campus, Gain Briddes, Plant Key Performance Indicators, Air Quality, Silver Angel) connected through publishers (P), subscribers (S), brokers and a mobile broker.]

Page 36: Dealing with Semantic Heterogeneity in Real-Time Information

Semantic Sensor Networks Ontology

[JoWS 2012]

Page 37: Dealing with Semantic Heterogeneity in Real-Time Information

SSN Application: SPITFIRE

• DUL: DOLCE+DnS Ultralite
• EventF: Event-Model F
• SSN: SSN-XG
• CC: Contextualised-Cognitive

Concepts on sensor network topology and devices

Concepts on sensor role, events, sensor project

Event Datasets

Sensor Datasets

LOD Cloud

Page 38: Dealing with Semantic Heterogeneity in Real-Time Information

CQELS

• Continuous Query Evaluation over Linked Streams
• Scalable processing model for unified Linked Stream Data and Linked Open Data
• Combines data pre-processing and an adaptive cost-based query optimization algorithm

[SSN 2009, SSN 2010, ISWC 2011]

Page 39: Dealing with Semantic Heterogeneity in Real-Time Information

Linked  Stream  Middleware  

[WWW 2009, JoWS 2012, CLOSER 2013] http://lsm.deri.ie/

Page 40: Dealing with Semantic Heterogeneity in Real-Time Information

LSM:  Live  train  info  

Page 41: Dealing with Semantic Heterogeneity in Real-Time Information

Projects using Linked Data for IoT

• Open Source IoT Architectural Blueprint: http://www.openiot.eu/ and https://github.com/OpenIotOrg/openiot
• Real-Time IoT Stream Processing and Large-scale Data Analytics for Smart Cities: http://www.ict-citypulse.eu/
• Smart, secure and cost-effective integrated IoT deployments in smart cities: http://vital-project.eu/
• Behaviour-driven Autonomous Services for smart transportation in smart cities: http://gambas-ict.eu/

Page 42: Dealing with Semantic Heterogeneity in Real-Time Information

THEORY  OF  EVENT  EXCHANGE    PART  IV  


Page 43: Dealing with Semantic Heterogeneity in Real-Time Information

Problem

• Event producers and consumers are semantically coupled
  – Consumers need prior knowledge of event types, attributes and values.
  – Limits scalability in heterogeneous and dynamic environments due to explicit dependencies
  – Difficult development of event processing subscriptions/rules in heterogeneous and dynamic environments.

[Figure: producer and consumer are decoupled in space, time and synchronization, but remain coupled semantically.]

Page 44: Dealing with Semantic Heterogeneity in Real-Time Information

Traditional Event Processing: Exact Matching Model

Events from event producers (e.g. sensors):

• e1: Type = Energy Consumption; Place = Room 202e; Amount = 40 kWh
• e2: Type = Electricity Consumption; Location = Room 202e; Amount = 70 kWh
• e3: Type = Electricity Utilized; Venue = Room 202e; Amount = 600 kWh

Consumer subscriptions:

• Type = “Energy Consumption”, Place = “Room 202e”
• Type = “Electricity Consumption”, Location = “Room 202e”
• Type = “Electricity Utilized”, Venue = “Room 202e”
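The exact matching model on this slide can be sketched in a few lines. The event contents are taken from the slide; the labels `e1` to `e3`, the dictionary representation and the `exact_match` helper are assumptions made for illustration.

```python
events = {
    "e1": {"Type": "Energy Consumption", "Place": "Room 202e", "Amount": "40 kWh"},
    "e2": {"Type": "Electricity Consumption", "Location": "Room 202e", "Amount": "70 kWh"},
    "e3": {"Type": "Electricity Utilized", "Venue": "Room 202e", "Amount": "600 kWh"},
}


def exact_match(subscription: dict, event: dict) -> bool:
    """True only if every subscribed attribute and value appears verbatim."""
    return all(event.get(attr) == value for attr, value in subscription.items())


subscription = {"Type": "Energy Consumption", "Place": "Room 202e"}
matched = [name for name, ev in events.items() if exact_match(subscription, ev)]
```

A single subscription reaches only the producer that happens to share its vocabulary, so covering all three producers takes three separate subscriptions.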

Page 45: Dealing with Semantic Heterogeneity in Real-Time Information

Semantic Event Processing: Semantic Matching

Events from event producers (e.g. sensors):

• e1: Type = Energy Consumption; Place = Room 202e; Amount = 40 kWh
• e2: Type = Electricity Consumption; Location = Room 202e; Amount = 70 kWh
• e3: Type = Electricity Utilized; Venue = Room 202e; Amount = 600 kWh

Consumer subscription (~ marks an approximate term):

• Type = “Energy Consumption”~, Location = “Room 202e”
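Semantic matching replaces string equality with a relatedness score, so that one subscription can reach producers using different vocabularies. The sketch below is hypothetical throughout: the `RELATED` scores are made up, the 0.7 threshold is arbitrary, and approximation is applied to both attribute names and values, which is an assumption rather than the matcher used in the talk.

```python
# Hypothetical relatedness scores in [0, 1]; a real system would derive them
# from a thesaurus, an ontology, or a distributional model.
RELATED = {
    frozenset({"energy consumption", "electricity consumption"}): 0.9,
    frozenset({"energy consumption", "electricity utilized"}): 0.8,
    frozenset({"location", "place"}): 0.9,
    frozenset({"location", "venue"}): 0.8,
}


def relatedness(a: str, b: str) -> float:
    a, b = a.lower(), b.lower()
    return 1.0 if a == b else RELATED.get(frozenset({a, b}), 0.0)


def approx_match(subscription: dict, event: dict, threshold: float = 0.7) -> bool:
    """Each subscribed (attribute, value) needs a sufficiently related event pair."""
    return all(
        any(relatedness(s_attr, e_attr) * relatedness(s_val, str(e_val)) >= threshold
            for e_attr, e_val in event.items())
        for s_attr, s_val in subscription.items()
    )


events = {
    "e1": {"Type": "Energy Consumption", "Place": "Room 202e", "Amount": "40 kWh"},
    "e2": {"Type": "Electricity Consumption", "Location": "Room 202e", "Amount": "70 kWh"},
    "e3": {"Type": "Electricity Utilized", "Venue": "Room 202e", "Amount": "600 kWh"},
}
subscription = {"Type": "Energy Consumption", "Location": "Room 202e"}
matched = [name for name, ev in events.items() if approx_match(subscription, ev)]
```

With these assumed scores the single subscription matches all three events, at the cost of possible false positives when the threshold is set too low.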

Page 46: Dealing with Semantic Heterogeneity in Real-Time Information

How Good are Our Paradigms?

• Scale
  – Big Volume
  – Big Velocity
  – Big Variety

• Distributed sources and consumers

• The big challenge is now in the exchange of knowledge at a very large scale


Page 47: Dealing with Semantic Heterogeneity in Real-Time Information

Shannon-Weaver Model

C. Shannon and W. Weaver. The mathematical theory of communication. University of Illinois Press, 1949.


Page 48: Dealing with Semantic Heterogeneity in Real-Time Information

Cross-Boundaries Exchange

[Figure: producer and consumer are separated by three boundaries (syntactic, semantic, pragmatic); the boundaries are shown for a known environment and for an open environment.]

P. R. Carlile. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization Science, 15(5):555–568, 2004.

Page 49: Dealing with Semantic Heterogeneity in Real-Time Information

Syntactic Boundary

• Transfer is the most common type of information movement across this boundary

• A common lexicon exists
  – Move and process syntax (0’s and 1’s)
  – Dominant form of Shannon-Weaver’s theory

• E.g. Different data models of events
• E.g. Transfer RDF events over HTTP


Page 50: Dealing with Semantic Heterogeneity in Real-Time Information

Semantic Boundary

• Common lexicon doesn’t exist
• Lexicons evolve
• Ambiguities exist
• Translation is the process to cross this boundary

• E.g. Different ontologies for sensors
• E.g. Ontology alignment for RDF events


Page 51: Dealing with Semantic Heterogeneity in Real-Time Information

Pragmatic Boundary

• Actors on the sides of the boundary have:
  – Different contexts
  – Different perspectives
  – Different interests

• Transformation is the process to cross this boundary

• E.g. A temperature sensor reading of 35 °C is acceptable from outdoor sensors but not from indoor ones
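The temperature example shows that the same value needs a different interpretation depending on the consumer's context. A minimal sketch, assuming made-up plausible ranges per context (the numbers are illustrative, not from the talk):

```python
# Hypothetical plausible ranges per context, in degrees Celsius.
PLAUSIBLE_RANGE = {"outdoor": (-30.0, 50.0), "indoor": (10.0, 30.0)}


def is_acceptable(reading_celsius: float, context: str) -> bool:
    """Interpret the same value differently depending on the sensor's context."""
    low, high = PLAUSIBLE_RANGE[context]
    return low <= reading_celsius <= high


outdoor_ok = is_acceptable(35.0, "outdoor")
indoor_ok = is_acceptable(35.0, "indoor")
```

Crossing the pragmatic boundary means the context has to travel with the event or be supplied by the consumer; the raw value alone is not enough.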


Page 52: Dealing with Semantic Heterogeneity in Real-Time Information

Cross-Boundaries Exchange

[Figure: producer and consumer are separated by three boundaries (syntactic, semantic, pragmatic); the boundaries are shown for a known environment and for an open environment.]

P. R. Carlile. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization Science, 15(5):555–568, 2004.

Page 53: Dealing with Semantic Heterogeneity in Real-Time Information

Transfer-Translate-Transform

• Current approaches in event processing

• Transfer
  – Common event/language models
    • E.g. RDF over HTTP

• Translate
  – Agreements on schemas/thesauri/ontologies
    • E.g. DERI Energy ontology for building energy events
    • Curry, Edward, et al. "Linking building data in the cloud: Integrating cross-domain building data using linked data." Advanced Engineering Informatics 27.2 (2013): 206-219.

• Transform
  – Dedicated enrichers, joins in event languages
    • CQELS language for Linked Stream Data mashups


Page 54: Dealing with Semantic Heterogeneity in Real-Time Information

Decoupling for Scalability

[Figure: event processing decouples the event source from the event consumer in space, time and synchronization.]

Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. 2003. The many faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114-131.

Page 55: Dealing with Semantic Heterogeneity in Real-Time Information

Semantic Coupling

[Figure: event processing decouples the event source from the event consumer in space, time and synchronization, but a semantic coupling remains on type, attributes and values.]

Page 56: Dealing with Semantic Heterogeneity in Real-Time Information

APPROACHES  TO  SEMANTIC  COUPLING    Part  V  


Page 57: Dealing with Semantic Heterogeneity in Real-Time Information

Loosening the Semantic Coupling

• Approach 1: Content-Based with Semantic Decoupling
  – A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Achieving scalability and expressiveness in an internet-scale event notification service. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, pages 219-227. ACM, 2000.

• Approach 2: Content-Based with Implicit Shared Agreements
  – David S. Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920

• Approach 3: Concept-Based
  – M. Petrovic, I. Burcea, and H.-A. Jacobsen. S-ToPSS: Semantic Toronto publish/subscribe system. In Proceedings of the 29th international conference on Very large data bases - Volume 29, VLDB '03, pages 1101-1104. VLDB Endowment, 2003.

• Approach 4: Loose Semantic Coupling + Approximation
  – Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press

• Approach 5: Theme-Based


Page 58: Dealing with Semantic Heterogeneity in Real-Time Information

Current Approaches

[Figure: chart with axes “Semantic Decoupling” and “Effectiveness & Efficiency”, positioning the content-based, concept-based and bottom-up semantics approaches.]

Page 59: Dealing with Semantic Heterogeneity in Real-Time Information


Approach 1: Content-Based with Semantic Decoupling

• Very low detection rate
  – High false positives/negatives
  – Low precision/recall

[Figure: semantic decoupling. The producer publishes “A happened”; the consumer subscribes with “Interested in B”.]

Page 60: Dealing with Semantic Heterogeneity in Real-Time Information


Approach 1: Content-Based with Semantic Decoupling

• Use many rules to improve detection
  – Time and effort
  – Affects scalability to heterogeneous environments

[Figure: semantic decoupling. The producer publishes “A happened”; the consumer subscribes with “Interested in A”, “Interested in B” and “Interested in C”.]

 

Page 61: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 2: Content-Based with Implicit Shared Agreements

[Figure: semantic coupling via implicit agreements. The producer publishes “A happened”; the consumer subscribes with “Interested in A”. The agreement to use symbol A to describe the event is reached face-to-face or via documentation.]

 

Page 62: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 2: Content-Based with Implicit Shared Agreements

• Implicit semantics
  – Top-down approach to semantics
  – Granular on the level of concepts

[Figure: semantic coupling via implicit agreements. The producer publishes “A happened”; the consumer subscribes with “Interested in A”.]

Page 63: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 2: Content-Based with Implicit Shared Agreements

• Need for shared agreements
  – Time and effort
  – Affects scalability to heterogeneous environments

[Figure: semantic coupling via implicit agreements. The producer publishes “A happened”; the consumer subscribes with “Interested in A”.]

Page 64: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 3: Concept-Based

[Figure: semantic coupling via ontologies. The producer publishes “A happened”; the consumer subscribes with “Interested in B”. An ontology links the concepts A, B, C, D, E and F through subClassOf relations.]
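Concept-based matching lets a subscription for a concept also receive events about its subclasses. The sketch below assumes a hypothetical hierarchy over the slide's concept names; the exact subClassOf edges in the figure are not recoverable from the transcript, so the `SUBCLASS_OF` table is an invented stand-in.

```python
# Hypothetical subClassOf edges (child -> parent), assumed for illustration.
SUBCLASS_OF = {"A": "B", "B": "C", "F": "E", "E": "D"}


def is_subclass(concept, ancestor: str) -> bool:
    """Walk subClassOf edges upward from `concept` looking for `ancestor`."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)  # None once the root is passed
    return False


def concept_match(subscribed: str, published: str) -> bool:
    """A subscription to a concept also covers events about its subclasses."""
    return is_subclass(published, subscribed)


a_reaches_b = concept_match("B", "A")   # publish "A happened", subscribe to B
```

The match succeeds only because both parties commit to the same ontology, which is the shared agreement this approach depends on.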

Page 65: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 3: Concept-Based

• Explicit semantics
  – Top-down approach to semantics
  – Granular on the level of concepts

[Figure: semantic coupling via ontologies. The producer publishes “A happened”; the consumer subscribes with “Interested in B”.]

Page 66: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 3: Concept-Based

•  Need for shared agreements
–  Time and effort
–  Affects scalability to heterogeneous environments

Page 67: Dealing with Semantic Heterogeneity in Real-Time Information

Semantics for a Complex World

•  Most semantic models have dealt with particular types of constructions, and have been carried out under very simplifying assumptions, in true lab conditions.

•  If these idealizations are removed it is not clear at all that modern semantics can give a full account of all but the simplest models/statements.

Sahlgren, 2013

[Figure: Formal World versus Real World]

Baroni et al. 2013

Page 68: Dealing with Semantic Heterogeneity in Real-Time Information

Distributional Semantic Model

•  Distributional hypothesis: the context surrounding a given word in a text provides relevant information about its meaning.

•  Simplified semantic model.
–  Associational and quantitative.

•  Explicit Semantic Analysis (ESA) is the primary distributional model used in this work.

A wife is a female partner in a marriage. The term "wife" seems to be a close term to bride, the latter is a female participant in a wedding ceremony, while a wife is a married woman during her marriage. ...

Page 69: Dealing with Semantic Heterogeneity in Real-Time Information

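Distributional Semantic Model

[Figure: vector space whose dimensions are the contexts c1, c2, ..., cn, with the words child, husband and spouse plotted as vectors. Each coordinate is a function of the number of times the word occurs in that context (example values 0.7 and 0.5). Annotation: "Commonsense is here".]

(Freitas, 2012)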

Page 70: Dealing with Semantic Heterogeneity in Real-Time Information

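Semantic Relatedness

[Figure: the same vector space over contexts c1, c2, ..., cn, with the angle θ between the vectors of child, husband and spouse.]

•  Works as a semantic ranking function
•  E.g. esa(room, building) = 0.099
•  E.g. esa(room, car) = 0.009

(Freitas, 2012)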
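The angle θ between two word vectors can be turned into a relatedness score with the cosine, and the score can then be used to rank candidates. The sketch below is a toy illustration under stated assumptions: the context weights are invented, and it is not the ESA implementation that produced the scores quoted on the slide.

```python
import math


def cosine_relatedness(u, v):
    """Cosine of the angle between two sparse vectors (dict: context -> weight)."""
    dot = sum(weight * v.get(context, 0.0) for context, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_relatedness(query, candidates, vectors):
    """Order candidate words from most to least related to the query word."""
    return sorted(
        candidates,
        key=lambda word: cosine_relatedness(vectors[query], vectors[word]),
        reverse=True,
    )


# Hypothetical context weights, made up for this example.
VECTORS = {
    "husband": {"c1": 0.7, "c2": 0.5},
    "spouse": {"c1": 0.6, "c2": 0.6},
    "child": {"c2": 0.1, "cn": 0.9},
}

if __name__ == "__main__":
    print(rank_by_relatedness("husband", ["child", "spouse"], VECTORS))
```

Because the score is continuous rather than Boolean, it supports ranking and thresholding, which is what the approximate matchers in the following slides rely on.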

Page 71: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 4: Loose Semantic Coupling + Approximation

[Figure: loose semantic coupling via large text corpora. The producer publishes "A happened" and the consumer subscribes with "Interested in B". Each term is represented by the corpus documents it occurs in, and the two representations are compared approximately (A ~ B).]

A: d1 d2 d3 d4 d5 d6 d7 d8 ...
B: d1 d3 d4 d17 d25 d26 d77 d78 ...

(Hasan et al., 2004)
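As a rough sketch of this idea, each term can be treated as the set of corpus documents it appears in, and two terms can be compared by the cosine of their binary document vectors. Only the documents visible on the slide are used here; the threshold value and the function names are assumptions made for illustration, not the published matcher.

```python
import math


def doc_set_relatedness(docs_a, docs_b):
    """Cosine similarity of two binary document vectors, given as sets of ids."""
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / math.sqrt(len(docs_a) * len(docs_b))


def approx_match(event_term, subscribed_term, index, threshold=0.3):
    """Approximate match: identical terms always match; otherwise the terms
    match when their corpus-based relatedness reaches the threshold."""
    if event_term == subscribed_term:
        return True
    score = doc_set_relatedness(
        index.get(event_term, set()), index.get(subscribed_term, set())
    )
    return score >= threshold


# Document lists as far as they are visible on the slide (the lists continue).
INDEX = {
    "A": {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"},
    "B": {"d1", "d3", "d4", "d17", "d25", "d26", "d77", "d78"},
}

if __name__ == "__main__":
    print(doc_set_relatedness(INDEX["A"], INDEX["B"]))  # 3 shared / sqrt(8 * 8)
    print(approx_match("A", "B", INDEX))
```

The only thing producer and consumer share is the corpus, so no term-level or concept-level agreement is needed; the price is that a threshold can produce false positives and false negatives.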

Page 72: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 4: Loose Semantic Coupling + Approximation

•  Bottom-up model of semantics
•  Global semantics: distribution vs. granular

Page 73: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 4: Loose Semantic Coupling + Approximation

•  Low cost to scale to heterogeneous environments

•  Slightly lower detection rate

Page 74: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 5: Theme-Based

•  Can we exchange better approximations of meanings, rather than mere symbols, to improve the detection rate?

(Hasan and Curry, 2014)

Page 75: Dealing with Semantic Heterogeneity in Real-Time Information

Approach 5: Theme-Based

[Figure: loose semantic coupling via large text corpora, extended with themes. The producer publishes "(A+T1) happened" and the consumer subscribes with "Interested in (B+T2)". Terms are again represented by corpus documents and compared approximately (A ~ B), with the themes T1 and T2 attached to the event and the subscription.]

A: d1 d2 d3 d4 d5 d6 d7 d8 ...
B: d1 d3 d4 d17 d25 d26 d77 d78 ...

Page 76: Dealing with Semantic Heterogeneity in Real-Time Information

The Thematic Approach

•  Exchange approximations of meanings

[Figure: Publisher Alice emits an Event made of a theme th_e and a payload; Consumer Bob registers a Subscription made of a theme th_s and an expression; both feed an approximate matcher. A further label reads "Parameterization".]

•  Loose coupling mode: lightweight agreement on themes
•  No coupling mode: free use of well representative themes

Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014. Under review.

Page 77: Dealing with Semantic Heterogeneity in Real-Time Information

Event Representation

Event
•  Theme: energy, appliances, building
•  Payload: type: increased energy consumption event, measurement unit: kilowatt per hour, device: computer, office: room 112

Page 78: Dealing with Semantic Heterogeneity in Real-Time Information

Subscription Representation

Subscription
•  Theme: power, computers
•  Expression: type = increased energy usage event~, device~ = laptop~, office = room 112
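The event and subscription from these two slides can be written down as plain data structures. The sketch below assumes that a trailing `~` marks an attribute or value that may be matched approximately rather than exactly; the class and function names are hypothetical and are not taken from the authors' software.

```python
from dataclasses import dataclass


@dataclass
class Event:
    themes: list   # free-text theme tags attached by the publisher
    payload: dict  # attribute -> value tuples describing what happened


@dataclass
class Predicate:
    attribute: str
    value: str
    approx_attribute: bool = False  # attribute was written with a trailing ~
    approx_value: bool = False      # value was written with a trailing ~


@dataclass
class Subscription:
    themes: list
    predicates: list


def parse_predicate(text):
    """Parse 'attribute~= value~' into a Predicate, recording each ~ marker."""
    attribute, value = (part.strip() for part in text.split("=", 1))
    return Predicate(
        attribute=attribute.rstrip("~").strip(),
        value=value.rstrip("~").strip(),
        approx_attribute=attribute.endswith("~"),
        approx_value=value.endswith("~"),
    )


EVENT = Event(
    themes=["energy", "appliances", "building"],
    payload={
        "type": "increased energy consumption event",
        "measurement unit": "kilowatt per hour",
        "device": "computer",
        "office": "room 112",
    },
)

SUBSCRIPTION = Subscription(
    themes=["power", "computers"],
    predicates=[
        parse_predicate("type= increased energy usage event~"),
        parse_predicate("device~= laptop~"),
        parse_predicate("office= room 112"),
    ],
)
```

Note that publisher and consumer use different vocabularies ("consumption" versus "usage", "computer" versus "laptop") and different theme tags, which is exactly the heterogeneity the approximate matcher has to bridge.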

Page 79: Dealing with Semantic Heterogeneity in Real-Time Information

Probabilistic Approximate Matcher

•  Top-1 and Top-k mappings between an event and a subscription
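One way to picture top-1 and top-k mappings: every assignment of subscription predicates to distinct event tuples is a candidate mapping, each mapping gets a score, and the best k are kept. The sketch below is a simplification under stated assumptions: the score is a plain product of relatedness values, and the relatedness table is invented. It is not the probabilistic model of the cited work.

```python
import heapq
from itertools import permutations

# Hypothetical relatedness scores between differing terms (made up).
TOY_SCORES = {
    ("increased energy usage event", "increased energy consumption event"): 0.9,
    ("laptop", "computer"): 0.8,
}


def relatedness(a, b):
    """Toy relatedness: 1.0 for identical terms, table lookup otherwise."""
    if a == b:
        return 1.0
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), 0.0))


def top_k_mappings(predicates, event, k=1):
    """Return the k best (score, mapping) pairs.

    predicates: list of (attribute, value) pairs from the subscription.
    event: dict of attribute -> value tuples from the event payload.
    A mapping pairs each predicate with a distinct event tuple; its score is
    the product of attribute and value relatedness over all pairs.
    """
    scored = []
    for chosen in permutations(list(event.items()), len(predicates)):
        score = 1.0
        for (p_attr, p_val), (e_attr, e_val) in zip(predicates, chosen):
            score *= relatedness(p_attr, e_attr) * relatedness(p_val, e_val)
        scored.append((score, chosen))
    return heapq.nlargest(k, scored, key=lambda item: item[0])


EVENT = {
    "type": "increased energy consumption event",
    "measurement unit": "kilowatt per hour",
    "device": "computer",
    "office": "room 112",
}
PREDICATES = [
    ("type", "increased energy usage event"),
    ("device", "laptop"),
    ("office", "room 112"),
]

if __name__ == "__main__":
    for score, mapping in top_k_mappings(PREDICATES, EVENT, k=3):
        print(round(score, 3), mapping)
```

Enumerating all permutations is only practical for small events; it is used here because it makes the notion of a mapping explicit.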

Page 80: Dealing with Semantic Heterogeneity in Real-Time Information

Building IoT Software

[Figure: system architecture.
•  IoT sensors feed collectors. Publishers Alice, Carol and Dave publish heterogeneous IoT events together with thematic tags.
•  Thematic event processing engine(s) perform approximate single event matching. They send terms + themes pairs to a semantic relatedness web service and receive relatedness scores.
•  The semantic relatedness web service is backed by a vector space index built by indexing a textual corpus.
•  Consumers Bob (user), Dan (application developer) and Erin (application developer) subscribe with thematic tags and each receive the relevant events, normalized for them.]

Page 81: Dealing with Semantic Heterogeneity in Real-Time Information

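Summary

•  Matching
–  Simple content-based: exact string matching
–  Content-based + many rules: exact string matching
–  Concept-based: Boolean semantic matching
–  Simple distributional + approximation: approximate semantic matching
–  Thematic: approximate semantic matching

•  Semantic coupling
–  Simple content-based: term-level full agreement
–  Content-based + many rules: term-level full agreement
–  Concept-based: concept-level shared agreement
–  Simple distributional + approximation: loose agreement
–  Thematic: loose agreement

•  Semantics
–  Simple content-based: not explicit
–  Content-based + many rules: not explicit
–  Concept-based: top-down, ontology-based
–  Simple distributional + approximation: statistical model based on distributional semantics
–  Thematic: statistical model based on distributional semantics + themes

•  Effectiveness
–  Simple content-based: very low
–  Content-based + many rules: 100%
–  Concept-based: depends on the domains and number of concept models
–  Simple distributional + approximation: depends on the corpus
–  Thematic: depends on the corpus + theme representatives

•  Cost
–  Simple content-based: defining a small number of rules
–  Content-based + many rules: defining a large number of rules
–  Concept-based: establishing shared agreement on ontologies
–  Simple distributional + approximation: minimal agreement on a large textual corpus
–  Thematic: minimal agreement on a large textual corpus + good theme representatives

•  Efficiency
–  Simple content-based: high
–  Content-based + many rules: high
–  Concept-based: medium to high
–  Simple distributional + approximation: medium to high
–  Thematic: medium to high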

Page 82: Dealing with Semantic Heterogeneity in Real-Time Information

Evaluation Dataset

•  Seed events synthesized from IoT sensors
•  SmartSantander smart city project
–  Luis Sanchez, José Antonio Galache, Veronica Gutierrez, JM Hernandez, J Bernat, Alex Gluhak, and Tomás Garcia. 2011. SmartSantander: The meeting point between Future Internet research and experimentation and the smart cities. In Future Network & Mobile Summit (FutureNetw), 2011. IEEE, 1–8.
•  Sensor capabilities
–  solar radiation, particles, speed, wind direction, wind speed, temperature, water flow, atmospheric pressure, noise, ozone, rainfall, parking, radiation par, co, ground temperature, light, no2, soil moisture tension, relative humidity, energy consumption, cpu usage, memory usage

Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press

Page 83: Dealing with Semantic Heterogeneity in Real-Time Information

Evaluation Dataset

•  Seed events synthesized from IoT sensors
•  Linked Energy Intelligence platform
–  Edward Curry, Souleiman Hasan, and Sean O'Riain. 2012. Enterprise energy management using a linked dataspace for Energy Intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, 1–6.
•  Car brands from the Yahoo! Directory
–  Yahoo! 2013. Yahoo! Directory: Automotive - Makes and Models. (2013). http://dir.yahoo.com/recreation/automotive/makes_and_models/
•  Home-based appliances from the BLUED dataset
–  Kyle Anderson, Adrian Ocneanu, Diego Benitez, Derrick Carlson, Anthony Rowe, and Mario Berges. 2012. BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD.
•  Rooms from the DERI Building
–  Richard Cyganiak. 2013. Rooms in the DERI building. (2013). http://lab.linkeddata.deri.ie/2010/deri-rooms

Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press

Page 84: Dealing with Semantic Heterogeneity in Real-Time Information

Evaluation

•  F-score of up to 95% at a throughput of thousands of events/sec

Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press
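For readers unfamiliar with the metric, the sketch below shows how an F-score can be computed for a matcher by comparing the events it delivered against the events that were actually relevant. It assumes the balanced F1 variant (the slide does not say which F-score was used), and the event identifiers are invented.

```python
def f_score(delivered, relevant):
    """Return (precision, recall, F1) for a set of delivered events
    measured against the ground-truth set of relevant events."""
    delivered, relevant = set(delivered), set(relevant)
    true_positives = len(delivered & relevant)
    if true_positives == 0:
        return 0.0, 0.0, 0.0
    precision = true_positives / len(delivered)
    recall = true_positives / len(relevant)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical run: 4 events delivered, 5 relevant, 3 in common.
    delivered = {"e1", "e2", "e3", "e4"}
    relevant = {"e1", "e2", "e3", "e5", "e6"}
    print(f_score(delivered, relevant))
```

Precision penalizes events delivered in error and recall penalizes relevant events that were missed, so a single F-score summarizes both kinds of approximation error.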

Page 85: Dealing with Semantic Heterogeneity in Real-Time Information

PART VI
EXAMPLE APPLICATION: LINKED ENERGY INTELLIGENCE

Page 86: Dealing with Semantic Heterogeneity in Real-Time Information

New Smart Building

Cost - € 40,000,000

Page 87: Dealing with Semantic Heterogeneity in Real-Time Information

A Real-World Example

Time Monday Tuesday Wednesday Thursday Friday
08:00-09:00
09:00-10:00 237 237 200 237
10:00-11:00 237 237 237 200
11:00-12:00 237 180 180 145 237
12:00-13:00 237 200 237 200 149
13:00-14:00 145
14:00-15:00 221 237 145 140
15:00-16:00 221 120 160 140
16:00-17:00 149 250 160
17:00-18:00 200 160

CO2 levels

ASHRAE 62.1-2010

Occupancy Pattern

AirCon 8:30-11:00 & 15:00-16:00 Mon to Fri

Cost - € 40,000,000

Page 88: Dealing with Semantic Heterogeneity in Real-Time Information

Legacy Building

•  DERI Building
•  No BMS or BEMS
•  160 person Office space
•  Café
•  Data centre
•  3 Kitchens
•  80 person Conference room
•  4 Meeting rooms
•  Computing museum
•  Sensor Lab

Page 89: Dealing with Semantic Heterogeneity in Real-Time Information

Energy  Management  System  

Page 90: Dealing with Semantic Heterogeneity in Real-Time Information

Sensors

Page 91: Dealing with Semantic Heterogeneity in Real-Time Information

Energy Management Software

Page 92: Dealing with Semantic Heterogeneity in Real-Time Information

Holistic Energy Consumption

[Figure: Holistic Energy Management spanning Facilities, Business Travel, Data Centre, Daily Commute and Office IT.]

Page 93: Dealing with Semantic Heterogeneity in Real-Time Information

Business Context of Energy Consumption

Resource Allocation

Energy

Finance

Asset Mgmt

Human Resources

Page 94: Dealing with Semantic Heterogeneity in Real-Time Information

Multi-Level Energy Analysis

Strategic Analysis
•  CIO needs high-level business function power usage
•  CSO needs real-time carbon emissions

Tactical Analysis
•  Manager needs energy usage of business processes, business line or group

Operational Analysis
•  Technician needs equipment power usage
•  Low-level monitoring: sensors, events

Example KPIs
•  Energy used by global IT department
•  PUE of the Data Center in Dublin
•  kWhs used by server 172.16.0.8

[Figure labels: CEO, CIO, CSO, Helpdesk, Maintenance Personnel, Building, Data Center]
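The three example KPIs sit at different levels of aggregation over the same readings. The sketch below illustrates this with invented numbers: server-level kWh readings are rolled up to a data centre, and PUE (power usage effectiveness) is computed using its common definition, total facility energy divided by IT equipment energy. The server addresses other than 172.16.0.8, the site name keys, and all values are hypothetical.

```python
def rollup(readings_kwh):
    """Sum low-level readings (for example per server) into one total."""
    return sum(readings_kwh.values())


def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical operational-level readings for one data centre (kWh).
SERVER_KWH = {"172.16.0.8": 12.0, "172.16.0.9": 10.0, "172.16.0.10": 18.0}
# Hypothetical facility overhead (cooling, lighting, power distribution), kWh.
OVERHEAD_KWH = 20.0

if __name__ == "__main__":
    it_kwh = rollup(SERVER_KWH)                     # operational -> tactical
    print(it_kwh, pue(it_kwh + OVERHEAD_KWH, it_kwh))
```

The same roll-up step, applied across sites, would give the strategic-level figure for the whole IT department.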

Page 95: Dealing with Semantic Heterogeneity in Real-Time Information

Key Challenges

•  Technology and Data Interoperability
–  Data scattered among different systems
–  Multiple incompatible technologies make it difficult to use

•  Interpreting Dynamic and Static Data
–  Sensors, ERP, BMS, assets databases, …
–  Need to proactively identify efficiency opportunities

•  Empowering Actions and Including Users in the Loop
–  Understanding of direct and indirect impacts of activities
–  Embedding impacts within business processes
–  Engaging Users

Page 96: Dealing with Semantic Heterogeneity in Real-Time Information

Linked Energy Intelligence

[Figure: a linked dataspace for Energy Intelligence connecting the Building, Data Center, Office IT, Logistics and Corporate domains across the organisation level, business process level and personal level.]

Page 97: Dealing with Semantic Heterogeneity in Real-Time Information

Linked Energy Intelligence

[Figure: layered architecture.]
•  Applications: Energy Analysis Model, Complex Events, Situation Awareness Apps, Energy and Sustainability Dashboards, Decision Support Systems
•  Linked Data
•  Support Services: Entity Management Service, Data Catalog, Complex Event Processing Engine, Provenance, Search & Query
•  Sources: Adapter (×5)

Technologies and capabilities listed beside the layers:
•  Cloud of Energy Data
•  Linked Sensor Middleware
•  Resource Description Framework (RDF)
•  Semantic Sensor Networks
•  Constrained Application Protocol (CoAP)
•  Semantic Event Processing
•  Collaborative Data Mgmt.
•  Energy Saving Applications
•  Energy Awareness

Curry E. et al, Enterprise Energy Management using a Linked dataspace for Energy Intelligence. In: The Second IFIP Conference on Sustainable Internet and ICT for Sustainability (SustainIT) 2012.

Page 98: Dealing with Semantic Heterogeneity in Real-Time Information

Energy Saving Applications

•  Enterprise Energy Observatory
•  Smart Buildings
•  Green Cloud Computing
•  Office IT Energy Mgmt.
•  Personal Energy Mgmt.

Page 99: Dealing with Semantic Heterogeneity in Real-Time Information

Building Energy Explorer

1.  Data from Enterprise Linked Data Cloud

2.  Sensor Data

3.  Building Energy Situation Awareness

Page 100: Dealing with Semantic Heterogeneity in Real-Time Information

Energy  Analysis  by  Group  

Page 101: Dealing with Semantic Heterogeneity in Real-Time Information

iEnergy  –  Personal    

Page 102: Dealing with Semantic Heterogeneity in Real-Time Information

@WATERNOMICS_EU www.waternomics.eu

Concrete Objectives

•  To introduce demand response and accountability principles (water footprint) in the water sector

•  To engage consumers in new interactive and personalized ways that bring water efficiency to the forefront and lead to changes in water behaviours

•  To empower corporate decision makers and municipal area managers with a water information platform together with relevant tools and methodologies to enact ICT-enabled water management programs

•  To promote ICT-enabled water awareness using airports and water utilities as pilot examples

•  To make possible new water pricing options and policy actions by combining water availability and consumption data

WATERNOMICS will provide personalised and actionable information on water consumption and water availability to individual households, companies and cities in an intuitive and effective manner at relevant time-scales for decision making

Page 103: Dealing with Semantic Heterogeneity in Real-Time Information

WATERNOMICS PLATFORM ARCHITECTURE

[Figure: layered architecture.]
•  Applications: Water Analysis Model, Complex Events, Usage Model, Water Dashboards, Decision Support Systems
•  Linked Water Data
•  Support Services: Entity Management Service, Data Catalog, Complex Event Processing Engine, Prediction, Search & Query
•  Sources: Adapter (×5)

▶ Water Management Apps
▶ Water Data Analysis and Prediction
▶ Semantic Sensor Networks and Complex Event Processing to aid Decision Making
▶ Linking of data from different Water Management Systems using Linked Data / RDF

Page 104: Dealing with Semantic Heterogeneity in Real-Time Information

PILOT OVERVIEW

1.  Focus: Water utility for domestic users. Location: Thermi. Intent: To demonstrate, validate, and assess the WATERNOMICS Platform for domestic water users.

2.  Focus: Water Management Cycle in an airport. Location: Milan Linate. Intent: To demonstrate, validate, and assess the WATERNOMICS methodology and hardware innovations, and software/analysis results via the deployment of WATERNOMICS ICT.

3.  Focus: Water distribution in a Municipality. Location: Sochaczew. Intent: To validate and showcase the WATERNOMICS Platform at a municipal level (i.e. mixed use consumers supplied by a water utility).

Page 105: Dealing with Semantic Heterogeneity in Real-Time Information

Conclusions

•  Coupling necessary for crossing boundaries
•  Decoupling necessary for scalable software
•  Event-based systems do not address the coupling/decoupling tradeoff for semantics
•  Approximate and thematic event processing exchange approximations of meaning with loose semantic coupling

Page 106: Dealing with Semantic Heterogeneity in Real-Time Information

Dataset and Software

•  Dataset
–  Souleiman Hasan, Edward Curry, Thematic event processing dataset, DOI: 10.13140/2.1.3342.9123
–  http://www.researchgate.net/publication/263673956_Thematic_event_processing_dataset

•  Collider
–  Souleiman Hasan, Kalpa Gunaratna, Yongrui Qin, and Edward Curry. 2013. Demo: approximate semantic matching in the collider event processing engine. In Proceedings of the 7th ACM international conference on Distributed event-based systems (DEBS '13). ACM, New York, NY, USA, 337-338. DOI=10.1145/2488222.2489277 http://doi.acm.org/10.1145/2488222.2489277

•  Easy ESA
–  EasyESA is an implementation of Explicit Semantic Analysis (ESA)
–  http://treo.deri.ie/easyesa/

Page 107: Dealing with Semantic Heterogeneity in Real-Time Information

References

•  Cugola, G. and Margara, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.

•  Eugster, P.T., Felber, P.A., Guerraoui, R. and Kermarrec, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114–131.

•  Carlile, P.R., 2004. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization Science, 15(5), pp.555–568.

•  Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press.

•  Hasan, S., O'Riain, S. and Curry, E., 2013. Towards Unified and Native Enrichment in Event Processing Systems. In The 7th ACM International Conference on Distributed Event-Based Systems (DEBS 2013). Arlington, Texas, USA: ACM.

•  Hasan, S., O'Riain, S. and Curry, E., 2012. Approximate Semantic Matching of Heterogeneous Events. In 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012). Berlin, Germany: ACM, pp.252–263.

•  Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014. Under review.

•  Hasan, S., Curry, E., Banduk, M. and O'Riain, S., 2011. Toward Situation Awareness for the Semantic Sensor Web: Complex Event Processing with Dynamic Linked Data Enrichment. The 4th International Workshop on Semantic Sensor Networks 2011 (SSN11), pp.60–72.

•  Curry, E., 2004. Message-Oriented Middleware. In Q.H. Mahmoud, ed. Middleware for Communications. Chichester, England: John Wiley and Sons, pp.1–28.

Page 108: Dealing with Semantic Heterogeneity in Real-Time Information

More References

•  McFedries, P., 2011. The coming data deluge. IEEE Spectrum.

•  Luckham, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems. Addison-Wesley Professional.

•  Dayal, U., Blaustein, B., Buchmann, A., Chakravarthy, U., Hsu, M., Ledin, R., McCarthy, D., Rosenthal, A., Sarin, S., Carey, M.J., Livny, M. and Jauhari, R., 1988. The HiPAC project: Combining active databases and timing constraints. SIGMOD Rec. 17(1), pp.51–70.

•  Lieuwen, D.F., Gehani, N.H. and Arlein, R.M., 1996. The Ode active database: Trigger semantics and implementation. In Proceedings of the 12th International Conference on Data Engineering (ICDE'96). IEEE Computer Society, Los Alamitos, CA, pp.412–420.

•  Gatziu, S. and Dittrich, K., 1993. Events in an active object-oriented database system. In Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H. Williams, eds. Workshops in Computing, Springer-Verlag, Edinburgh, U.K.

•  Chakravarthy, S. and Adaikkalavan, R., 2008. Events and streams: Harnessing and unleashing their synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems (DEBS'08). ACM, New York, NY, pp.1–12.

•  Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M., Hong, W., Krishnamurthy, S., Madden, S.R., Reiss, F. and Shah, M.A., 2003. TelegraphCQ: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'03). ACM, New York, NY, p.668.

•  Chen, J., DeWitt, D.J., Tian, F. and Wang, Y., 2000. NiagaraCQ: A scalable continuous query system for Internet databases. SIGMOD Rec. 29(2), pp.379–390.

•  Liu, L., Pu, C. and Tang, W., 1999. Continual queries for internet scale event-driven information delivery. IEEE Trans. Knowl. Data Eng. 11(4), pp.610–628.

•  Arasu, A., Babu, S. and Widom, J., 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15(2), pp.121–142.

•  Mühl, G., Fiege, L. and Pietzuch, P., 2006. Distributed Event-Based Systems. Springer.

•  Altherr, M., Erzberger, M. and Maffeis, S., 1999. iBus: a software bus middleware for the Java platform. In Proceedings of the International Workshop on Reliable Middleware Systems, pp.43–53.

Page 109: Dealing with Semantic Heterogeneity in Real-Time Information

More References

•  Rosenblum, D.S. and Wolf, A.L., 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22(6), pp.344–360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920

•  Eugster, P. and Guerraoui, R., 2001. Content based publish/subscribe with structural reflection. In Proceedings of the 6th Usenix Conference on Object-Oriented Technologies and Systems (COOTS'01).

•  Shannon, C. and Weaver, W., 1949. The mathematical theory of communication. University of Illinois Press.

•  Curry, E., Hasan, S. and O'Riain, S., 2012. Enterprise energy management using a linked dataspace for Energy Intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, pp.1–6.

•  Curry, E. et al., 2013. Linking building data in the cloud: Integrating cross-domain building data using linked data. Advanced Engineering Informatics, 27(2), pp.206–219.

•  Carzaniga, A., Rosenblum, D.S. and Wolf, A.L., 2000. Achieving scalability and expressiveness in an internet-scale event notification service. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing. ACM, pp.219–227.

•  Petrovic, M., Burcea, I. and Jacobsen, H.-A., 2003. S-ToPSS: semantic Toronto publish/subscribe system. In Proceedings of the 29th international conference on Very large data bases - Volume 29, VLDB '03. VLDB Endowment, pp.1101–1104.

•  Sanchez, L., Galache, J.A., Gutierrez, V., Hernandez, J.M., Bernat, J., Gluhak, A. and Garcia, T., 2011. SmartSantander: The meeting point between Future Internet research and experimentation and the smart cities. In Future Network & Mobile Summit (FutureNetw), 2011. IEEE, pp.1–8.

•  Yahoo!, 2013. Yahoo! Directory: Automotive - Makes and Models. http://dir.yahoo.com/recreation/automotive/makes_and_models/

•  Anderson, K., Ocneanu, A., Benitez, D., Carlson, D., Rowe, A. and Berges, M., 2012. BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD.

•  Cyganiak, R., 2013. Rooms in the DERI building. http://lab.linkeddata.deri.ie/2010/deri-rooms

Page 110: Dealing with Semantic Heterogeneity in Real-Time Information

Credits

•  Green and Sustainable IT Group at Insight Galway for all their hard work.
•  Special thanks to Souleiman Hasan for his assistance with the Tutorial
•  Andre Freitas – Slides on Distributional Semantics
•  Prof. Manfred Hauswirth and USM at Insight Galway (LSM, OpenIoT, etc.)