Dealing with Semantic Heterogeneity in Real-Time Information


Post on 11-Aug-2014


Category:

Data & Analytics


DESCRIPTION

Tutorial at the EarthBiAs 2014 Summer School on Dealing with Semantic Heterogeneity in Real-Time Information.
Part I: Large Scale Open Environments
Part II: Computational Paradigms
Part III: RDF Event Processing
Part IV: Theory of Event Exchange
Part V: Approaches to Semantic Decoupling
Part VI: Example Application: Linked Energy Intelligence

TRANSCRIPT

EarthBiAs2014, Global NEST, University of the Aegean
Dealing with Semantic Heterogeneity in Real-Time Information
Dr. Edward Curry, Insight Centre for Data Analytics, National University of Ireland Galway
Tuesday 8th July 2014
7-11 July 2014, Rhodes, Greece, EarthBiAs2014

Talk Overview
Part I: Large Scale Open Environments
Part II: Computational Paradigms
Part III: RDF Event Processing
Part IV: Theory of Event Exchange
Part V: Approaches to Semantic Decoupling
Part VI: Example Application: Linked Energy Intelligence

About Me
PhD in Computer Science (NUI Galway)
Green and Sustainable IT Research Group Leader in DERI/Insight NUI Galway
Researcher in both Computer Science and Information Systems

WATERNOMICS
Overall Objective: WATERNOMICS will provide personalised and actionable information about water consumption and water availability to individual households, companies and cities in an intuitive and effective manner at a time-scale relevant for decision making.

Project-Sense
Self-configuring smart energy management systems for small commercial buildings
Non-Technical Users: Targets occupants of the building; Non-technical office workers; No experience in energy management; Low cost installation
Self-Configuration: Collaborative system configuration; Crowdsourced contextual data from building occupants; Imports relevant enterprise data via Excel; Semantic event matching reduces configuration costs
Decision Support: Sensor and data fusion; Multi-level decision support model; Identifies energy saving opportunities; Leverages open data and predictive analytics
User Experience: From awareness to engagement; Transtheoretical Model; Gamification; User personalisation; Simple non-technical user interfaces

BIG 318062: Big Data Public Private Forum (European Data Forum 2014)
The BIG Project
BIG aims to promote a well-developed EU industrial landscape in Big Data:
Providing a clear picture of existing technology trends and their maturity
Acquiring a sharp
understanding of how Big Data can be applied to concrete environments / use cases
Pushing European Big Data research and innovation to contribute to increasing European competitiveness
Building a self-sustainable, industry-led initiative
Overall Objective: Work at technical, business and policy levels, shaping the future through the positioning of IIM and Big Data specifically in Horizon 2020. Bringing the necessary stakeholders into a self-sustainable industry-led initiative, which will greatly contribute to enhancing EU competitiveness by taking full advantage of Big Data technologies.

BYTE (@BYTE_EU, www.byte-project.eu)
Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities
Externalities: the effects of a decision by stakeholders (e.g., governments, industry, scientists, policy-makers) that have an impact on a third party (especially members of the public). May be positive or negative.
Economic: Boost to the economy; Innovation; Increase efficiency; Smaller actors left behind; Shrink economies
Legal: Privacy; Data protection; Data ownership; Copyright; Risks associated with inclusion & exclusion
Social & Ethical: Transparency; Discrimination; Methodological difficulties; Spurious relationships; Consumer manipulation
Political: Reliance on US services; Services have become utilities; Legal issues become trade issues

PART I: LARGE SCALE OPEN ENVIRONMENTS

Emerging Environments
Smart City; Energy; Smart Building; Water Management
From Internet of Things to Internet of Everything

Lots of Data
"90% of the data in the world today has been created in the last two years alone" (IBM)
"The bringing together of a vast amount of data from public and private sources [...] is what Big Data is all about" (IDC)
"Over the next few years we'll see the adoption of scalable frameworks and platforms for handling streaming, or near real-time, analysis and processing."
(O'Reilly)
"Big Data represents a number of developments in technology that have been brewing for years and are coming to a boil. They include an explosion of data and new kinds of data, like from the Web and sensor streams; [...]." (IDC)

From Rigid Schemas to Schema-less
Heterogeneous, complex and large-scale data
Very large and dynamic schemas
Open Environments: distributed, decoupled data sources, anonymous users, multi-domain, lack of global order of information flow
circa 2000: 10s-100s attributes
circa 2014: 1,000s-1,000,000s attributes

Fundamental Decentralization
Multiple perspectives (conceptualizations) of the reality.
Ambiguity, vagueness, inconsistency.

Current Trends
Small scale, controlled environments versus large scale, open environments:
Information sources: 10s to 100s | 1000s to millions
Data heterogeneity: Small number of schemas | High number of schemas
Users: Small number, know the environment | Large number, do not quite know the environment
Users' organization: Users know each other, top-down hierarchies (e.g. enterprises) | Decoupled and distributed
Dynamism: Low | High (sources and users join and leave often)
Domain: Domain specific | Users' interests range from domain specific to domain agnostic

PART II: COMPUTATIONAL PARADIGMS

Information Flow Processing (IFP)
Users need to collect information
Produced by multiple distributed sources
For timely processing
To extract knowledge as soon as possible
Examples: Financial Continuous Analytics; RFID Inventory Management; Environmental Monitoring

Information Flow Processing (IFP)
Processing information as it flows
No intermediate storage
New information produced
Raw information can be discarded
[Figure: an Information Flow Processing Engine connecting Producers, Consumers and Rule managers]
CUGOLA, G. AND MARGARA, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
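The engine outlined in the IFP slides above (producers push items, rule managers install rules, consumers receive derived information, and nothing is stored in between) can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the class FlowEngine, the rule high_co2, the 1000 ppm threshold and all field names are hypothetical and are not taken from the tutorial or from the cited survey.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Item = Dict[str, Any]
Rule = Callable[[Item], Optional[Item]]   # returns a derived item, or None
Consumer = Callable[[Item], None]


@dataclass
class FlowEngine:
    """Hypothetical IFP engine: applies rules to items as they flow through."""
    rules: List[Rule] = field(default_factory=list)
    consumers: List[Consumer] = field(default_factory=list)

    def add_rule(self, rule: Rule) -> None:
        # Role of the "rule manager" in the slides.
        self.rules.append(rule)

    def add_consumer(self, consumer: Consumer) -> None:
        self.consumers.append(consumer)

    def push(self, item: Item) -> None:
        # Called by producers. The raw item is processed and then dropped:
        # there is no intermediate storage.
        for rule in self.rules:
            derived = rule(item)
            if derived is not None:
                for consumer in self.consumers:
                    consumer(derived)


def high_co2(item: Item) -> Optional[Item]:
    # Assumed example rule: derive an alert from a raw CO2 reading.
    if item.get("type") == "co2" and item.get("ppm", 0) > 1000:
        return {"type": "co2_alert", "room": item["room"], "ppm": item["ppm"]}
    return None


engine = FlowEngine()
engine.add_rule(high_co2)
received: List[Item] = []
engine.add_consumer(received.append)

for reading in [
    {"type": "co2", "room": "202e", "ppm": 640},
    {"type": "temperature", "room": "202e", "celsius": 21},
    {"type": "co2", "room": "202e", "ppm": 1200},
]:
    engine.push(reading)
```

After the loop, only the derived alert for the third reading has reached the consumer; the raw readings themselves were never retained by the engine.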
Information Flow Processing (IFP) Requirements
Real-time or near real-time processing
Expressive language for rules
Scalability to large number of producers and consumers

Computational Paradigm
Event Processing. Event: object representing a happening. Deals with events and relations of events (e.g. inter-event sequencing, causality, etc.)
Stream Processing. Stream: homogeneous and totally ordered set of data items. Deals with streams and operations on streams (e.g. joins).
Event cloud may contain streams of events as well as partially ordered sets of events. (Cugola & Margara, 2012)
Event processing agents, network, and rules.

Event Processing Architecture
[Figure: producers send events E1, E2 and E3 to an event processing engine, which applies rules and notifies the consumer]

Events Processing is Decoupled for Scalability
[Figure: event source and event consumer decoupled in space, time and synchronization through event processing]
Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. 2003. The many faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114-131.

Active Databases
Traditional database systems: Passive; Store data and wait for users' interaction; Reactive behaviour in the application layer
DAYAL, U., BLAUSTEIN, B., BUCHMANN, A., CHAKRAVARTHY, U., HSU, M., LEDIN, R., MCCARTHY, D., ROSENTHAL, A., SARIN, S., CAREY, M. J., LIVNY, M., AND JAUHARI, R. 1988. The HiPAC project: Combining active databases and timing constraints. SIGMOD Rec. 17, 1, 51-70.
LIEUWEN, D. F., GEHANI, N. H., AND ARLEIN, R. M. 1996. The Ode active database: Trigger semantics and implementation. In Proceedings of the 12th International Conference on Data Engineering (ICDE '96). IEEE Computer Society, Los Alamitos, CA, 412-420.
GATZIU, S. AND DITTRICH, K. 1993. Events in an active object-oriented database system. In Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H. Williams, Eds.
Workshops in Computing, Springer-Verlag, Edinburgh, U.K.
CHAKRAVARTHY, S. AND ADAIKKALAVAN, R. 2008. Events and streams: Harnessing and unleashing their synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems (DEBS '08). ACM, New York, NY, 1-12.

Active Databases
Reactive behaviour in the database layer
Event-Condition-Action (ECA) rules
Event: source. E.g. tuple inserted
Condition: post event. E.g. inserted.value > 5
Action: what to do. E.g. modify the DB
Cons: Persistent storage model; Suitable when updates are not frequent and rules are few

Data Stream Management Systems
Streams unbounded (not like tables)
No arrival order assumptions
Typically no storage
Use continuous, or standing, queries
Reactive in nature
CHANDRASEKARAN, S., COOPER, O., DESHPANDE, A., FRANKLIN, M. J., HELLERSTEIN, J. M., HONG, W., KRISHNAMURTHY, S., MADDEN, S. R., REISS, F., AND SHAH, M. A. 2003. TelegraphCQ: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '03). ACM, New York, NY, 668-668.
CHEN, J., DEWITT, D. J., TIAN, F., AND WANG, Y. 2000. NiagaraCQ: A scalable continuous query system for Internet databases. SIGMOD Rec. 29, 2, 379-390.
LIU, L., PU, C., AND TANG, W. 1999. Continual queries for internet scale event-driven information delivery. IEEE Trans. Knowl. Data Eng. 11, 4, 610-628.
ARASU, A., BABU, S., AND WIDOM, J. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2, 121-142.
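The Event-Condition-Action structure from the Active Databases slide above can be made concrete with a short Python sketch. It assumes a toy in-memory table rather than a real database: the names EcaRule, ActiveTable and flag_row are hypothetical, and only the example event ("tuple inserted") and condition ("inserted.value > 5") come from the slide.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class EcaRule:
    event: str                                    # Event: e.g. "insert"
    condition: Callable[[Row], bool]              # Condition: checked after the event
    action: Callable[["ActiveTable", Row], None]  # Action: e.g. modify the DB


class ActiveTable:
    """Toy stand-in for an active database table (illustrative only)."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.rules: List[EcaRule] = []
        self.audit: List[str] = []

    def on(self, rule: EcaRule) -> None:
        self.rules.append(rule)

    def insert(self, row: Row) -> None:
        # Persistent storage model: the tuple is stored first,
        # then matching rules react to the insert event.
        self.rows.append(row)
        for rule in self.rules:
            if rule.event == "insert" and rule.condition(row):
                rule.action(self, row)


def flag_row(table: ActiveTable, row: Row) -> None:
    # Assumed action: modify the stored tuple and record what happened.
    row["flagged"] = True
    table.audit.append(f"flagged value {row['value']}")


table = ActiveTable()
table.on(EcaRule("insert", lambda inserted: inserted["value"] > 5, flag_row))
table.insert({"value": 3})
table.insert({"value": 9})
```

Because every tuple is stored before any rule fires, the sketch also shows why the slide lists the persistent storage model as a drawback when updates are frequent.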
Data Stream Management Systems
Continuous queries semantics
Answer: append-only stream or update store
Exact or approximate answer
Cons: Atomic item is the stream; Not possible to detect sequencing or causal patterns

Publish/Subscribe Systems
Information items are notifications
Indirect addressing-based communication scheme
Ancestors: Message Passing; Remote Procedure Call (RPC); Shared spaces; Message Queueing
EUGSTER, P.T., FELBER, P.A., GUERRAOUI, R. AND KERMARREC, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp. 114-131.
MÜHL, G., FIEGE, L., AND PIETZUCH, P. 2006. Distributed Event-Based Systems. Springer.

Publish/Subscribe Systems
One-to-many and many-to-many distribution mechanism allows a single producer to send a message to one user or potentially hundreds of thousands of consumers
E. Curry, "Message-Oriented Middleware," in Middleware for Communications, Q. H. Mahmoud, Ed. Chichester, England: John Wiley and Sons, 2004, pp. 1-28.
Introduction to Message-Oriented Middleware

Publish/Subscribe Systems
Topic-based pub/sub: Topics are groups or channels; Events of a topic are sent to the topic's subscribers
ALTHERR, M., ERZBERGER, M., AND MAFFEIS, S. 1999. iBus - a software bus middleware for the Java platform. In Proceedings of the International Workshop on Reliable Middleware Systems. 43-53.
Content-based pub/sub: Matching by message filters; Publishers' and subscribers' channels are defined by the content and the subscriptions
David S. Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920
Type-based pub/sub: Matching on type hierarchy
EUGSTER, P. AND GUERRAOUI, R. 2001. Content based publish/subscribe with structural reflection.
In Proceedings of the 6th Usenix Conference on Object-Oriented Technologies and Systems (COOTS '01).

Complex Event Processing Systems
Detection of complex patterns (sequencing, causal, ordering in general) of multiple events
And generate complex, or derived, events
LUCKHAM, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley Professional.

Complex Event Processing Systems
[Figure adapted from CUGOLA, G. AND MARGARA, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.]

PART III: RDF EVENT PROCESSING

Why Linked Data for the IoT?
Many communities struggle with closed approaches, e.g., pervasive computing, embedded systems, IoT, ...
Cyber-Physical Systems are inherently open world
Prof. David Karger (MIT) in his ESWC 2013 keynote: Semantic Web technologies support an open world assumption where millions of unforeseeable schemas may have to be integrated.
Simple integration with existing LOD data sets: geo-spatial, governmental, media, ...
Manageable integration effort with other graph data, e.g., Google Knowledge Graph, Facebook Graph, etc.

EU ICT OpenIoT Project
Knowledge-Based Future Internet
Step 1: Sensing-as-a-Service Request
Step 2: Sensor/Cloud Formulation
Step 3: Service Provisioning (Utility Metrics)
Actors: Infrastructure provider(s) (e.g., Smart City); OpenIoT User (Citizen, Corporate); Domain #1 to Domain #N
Middleware core features: Open Source; Linked Data; Cloud Computing; Internet of Things; IoT Management; Data Privacy and Security; Mobility and Quality of Service
www.openiot.eu, EU ICT-2011.1.3, Contract No.: 287305
An Open Source Cloud Solution for the Internet of Things!
Open Source blueprint for large scale self-organizing cloud environments for IoT applications
Sensor Networks: OpenIoT leverages the SoA on Internet of Things (IoT) RFID/WSN middleware frameworks. OpenIoT provides baseline service functionalities associated with registering and looking up internet-connected objects (ICOs) named things.
IoT Management: OpenIoT provides baseline visualization services. OpenIoT supports dynamic interoperable self-organizing management on cloud environments for IoT. OpenIoT enables the autonomy of a variety of IoT entities and resources.
Cloud Computing: OpenIoT allows creation of PaaS models over internet-connected objects. OpenIoT supports applications that leverage information from multiple sensors, actuators and other devices to the cloud. OpenIoT enables cloud solutions to support IoT.
Open Source: OpenIoT is an open source solution. OpenIoT is a first-of-a-kind extension of existing open cloud computing infrastructures towards IoT support. OpenIoT is a customizable toolkit for the IoT.
OpenIoT Innovation for the Smart Industry (www.openiot.eu)
[Figure: OpenIoT use cases (Agrifood, Phenonet, Smart City, Manufacturing, Smart Campus, Plant Key Performance Indicators, Air Quality, Silver Angel) connected through brokers, including a mobile broker]

Semantic Sensor Networks Ontology [JoWS 2012]
SSN Application: SPITFIRE
DUL: DOLCE+DnS Ultralite
EventF: Event-Model F
SSN: SSN-XG
CC: Contextualised-Cognitive
Concepts on sensor network topology and devices
Concepts on sensor role, events, sensor project
Event Datasets; Sensor Datasets; LOD Cloud

CQELS
Continuous Query Evaluation over Linked Streams
Scalable processing model for unified Linked Stream Data and Linked Open Data
Combines data pre-processing and an adaptive cost-based query optimization algorithm
[SSN 2009, SSN 2010, ISWC 2011]

Linked Stream Middleware [WWW 2009, JoWS 2012, CLOSER 2013]
http://lsm.deri.ie/
LSM: Live train info

Projects using Linked Data for IoT
Open Source IoT Architectural Blueprint: http://www.openiot.eu/ https://github.com/OpenIotOrg/openiot
Real-Time IoT Stream Processing and Large-scale Data Analytics for Smart Cities: http://www.ict-citypulse.eu/
Smart, secure and cost-effective integrated IoT deployments in smart cities: http://vital-project.eu/
Behaviour-driven Autonomous Services for smart transportation in smart cities: http://gambas-ict.eu/

PART IV: THEORY OF EVENT EXCHANGE

Problem
Event producers and consumers are semantically coupled
Consumers need prior knowledge of event types, attributes and values.
Limits scalability in heterogeneous and dynamic environments due to explicit dependencies
Difficult development of event processing subscriptions/rules in heterogeneous and dynamic environments.
[Figure: producer and consumer coupled along four dimensions: space, time, synchronization and semantic]

Example events:
Type: Energy Consumption; Place: Room 202e; Amount: 40 kWh
Type: Electricity Consumption; Location: Room 202e; Amount: 70 kWh
Type: Electricity Utilized; Venue: Room 202e; Amount: 600 kWh
Event producers, e.g.
sensors.

Traditional Event Processing (exact matching model)
Consumer subscriptions, one per event vocabulary:
Type = Energy Consumption, Place = Room 202e
Type = Electricity Consumption, Location = Room 202e
Type = Electricity Utilized, Venue = Room 202e

Semantic Event Processing (semantic matching)
The same three events (Energy Consumption at Place Room 202e, 40 kWh; Electricity Consumption at Location Room 202e, 70 kWh; Electricity Utilized at Venue Room 202e, 600 kWh) are covered by a single approximate consumer subscription:
Type = Energy Consumption~, Location = Room 202e

How Good are Our Paradigms?
Scale
Big Volume
Big Velocity
Big Variety
Distributed sources and consumers
The big challenge is now in the exchange of knowledge at a very large scale

Shannon-Weaver Model
C. Shannon and W. Weaver. The mathematical theory of communication. University of Illinois Press, 1949.

Cross-Boundaries Exchange
[Figure: syntactic, semantic and pragmatic boundaries between producer and consumer, ranging from known environment to open environment]
P. R. Carlile. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization Science, 15(5):555-568, 2004.

Syntactic Boundary
Transfer is the most common type of information movement across this boundary
A common lexicon exists
Move and process syntax (0s and 1s)
Dominant form of Shannon and Weaver's theory
E.g. Different data models of events
E.g. Transfer RDF events over HTTP

Semantic Boundary
Common lexicon doesn't exist
Lexicons evolve
Ambiguities exist
Translation is the process to cross this boundary
E.g. Different ontologies for sensors
E.g. Ontology alignment for RDF events

Pragmatic Boundary
Actors on the sides of the boundary have: different contexts; different perspectives; different interests
Transformation is the process to cross this boundary
E.g.
A temperature sensor reading of 35 Celsius is acceptable from outdoor sensors but not from indoor ones

Transfer-Translate-Transform
Current approaches in event processing
Transfer: Common event/language models. E.g. RDF over HTTP
Translate: Agreements on schemas/thesauri/ontologies. E.g. DERI Energy ontology for building energy events
Curry, Edward, et al. "Linking building data in the cloud: Integrating cross-domain building data using linked data." Advanced Engineering Informatics 27.2 (2013): 206-219.
Transform: Dedicated enrichers, joins in event languages. CQELS language for Linked Stream Data mashups

Decoupling for Scalability
[Figure: event source and event consumer decoupled in space, time and synchronization through event processing (Eugster et al., 2003)]

Semantic Coupling
[Figure: beyond space, time and synchronization, event source and event consumer remain semantically coupled through type, attributes and values]

PART V: APPROACHES TO SEMANTIC COUPLING

Loosening the Semantic Coupling
Approach 1: Content-Based with Semantic Decoupling
A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Achieving scalability and expressiveness in an internet-scale event notification service. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, pages 219-227. ACM, 2000.
Approach 2: Content-Based with Implicit Shared Agreements
David S.
Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920
Approach 3: Concept-Based
M. Petrovic, I. Burcea, and H.-A. Jacobsen. S-ToPSS: semantic Toronto publish/subscribe system. In Proceedings of the 29th international conference on Very large data bases - Volume 29, VLDB '03, pages 1101-1104. VLDB Endowment, 2003.
Approach 4: Loose Semantic Coupling + Approximation
Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press.
Approach 5: Theme-Based

Current Approaches
[Figure: content-based and concept-based approaches plotted by semantic decoupling against effectiveness & efficiency, with bottom-up semantics as the open region]

Approach 1: Content-Based with Semantic Decoupling
Very low detection rate
High false positives/negatives
Low precision/recall
[Figure: semantic de-coupling. Producer publishes "A happened"; consumer subscribes "interested in B".]

Approach 1: Content-Based with Semantic Decoupling
Use many rules to improve detection
Time and effort
Affects scalability to heterogeneous environments
[Figure: semantic de-coupling. Producer publishes "A happened"; consumer subscribes "interested in A", "interested in B", "interested in C".]

Approach 2: Content-Based with Implicit Shared Agreements
[Figure: semantic coupling via implicit agreements. Producer publishes "A happened"; consumer subscribes "interested in A". The agreement to use symbol A is reached face-to-face or via documentation.]

Approach 2: Content-Based with Implicit Shared Agreements
Implicit semantics
Top-down approach to semantics
Granular on the level of concepts
[Figure: semantic coupling via implicit agreements. Producer publishes "A happened"; consumer subscribes "interested in A".]

Approach 2: Content-Based with Implicit Shared Agreements
Need for shared agreements
Time and effort
Affects scalability to heterogeneous environments

Approach 3: Concept-Based
[Figure: semantic coupling via ontologies. Producer publishes "A happened"; consumer subscribes "interested in B"; concepts A to F are related through subClassOf links.]

Approach 3: Concept-Based
Explicit semantics
Top-down approach to semantics
Granular on the level of concepts

Approach 3: Concept-Based
Need for shared agreements
Time and effort
Affects scalability to heterogeneous environments

Semantics for a Complex World
"Most semantic models have dealt with particular types of constructions, and have been carried out under very simplifying assumptions, in true lab conditions. If these idealizations are removed it is not clear at all that modern semantics can give a full account of all but the simplest models/statements." (Sahlgren, 2013)
Formal World versus Real World (Baroni et al. 2013)

Distributional Semantic Model
Distributional hypothesis: the context surrounding a given word in a text provides relevant information about its meaning.
Simplified semantic model.
Associational and quantitative.
Explicit Semantic Analysis (ESA) is the primary distributional model used in this work.

Example text: A wife is a female partner in a marriage.
The term "wife" seems to be a close term to bride; the latter is a female participant in a wedding ceremony, while a wife is a married woman during her marriage. ...

Distributional Semantic Model
[Figure: words such as child, husband and spouse placed in a vector space over contexts c1, c2, ..., cn; each coordinate is a function of the number of times the word occurs in that context (e.g. 0.7, 0.5). Caption: "Commonsense is here".] (Freitas, 2012)

Semantic Relatedness
Works as a semantic ranking function
E.g. esa(room, building) = 0.099
E.g. esa(room, car) = 0.009
(Freitas, 2012)

Approach 4: Loose Semantic Coupling + Approximation
[Figure: loose semantic coupling via large text corpora. Producer publishes "A happened"; consumer subscribes "interested in B". A is represented over documents d1, d2, d3, d4, d5, d6, d7, d8, ... and B over d1, d3, d4, d17, d25, d26, d77, d78, ...; the two are matched approximately (~).] (Hasan et al., 2004)

Approach 4: Loose Semantic Coupling + Approximation
Bottom-up model of semantics
Global semantics: distribution vs. granular

Approach 4: Loose Semantic Coupling + Approximation
Low cost to scale to heterogeneous environments
Slightly lower detection rate

Approach 5: Theme-Based
Can we exchange better approximations of meanings rather than mere symbols to improve the detection rate?
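The distributional idea behind Approach 4 (terms represented as weighted vectors over documents, with a relatedness score used to match terms marked with "~") can be sketched in Python as follows. This is a minimal illustration under loud assumptions: the vectors in TOY_VECTORS are invented for the example and are not ESA output, cosine similarity stands in for the relatedness measure, the 0.5 threshold is arbitrary, and attribute names are matched exactly for simplicity. None of the names below come from the cited papers.

```python
import math
from typing import Dict, Tuple

Vector = Dict[str, float]

# Invented toy vectors: term -> weight per document (d1, d2, ...).
TOY_VECTORS: Dict[str, Vector] = {
    "energy consumption": {"d1": 0.8, "d2": 0.5, "d3": 0.1},
    "electricity usage": {"d1": 0.7, "d2": 0.6, "d4": 0.2},
    "parking": {"d5": 0.9, "d6": 0.4},
}


def cosine(a: Vector, b: Vector) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(weight * b.get(doc, 0.0) for doc, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relatedness(term_a: str, term_b: str) -> float:
    # Identical terms are fully related; unknown terms get an empty vector.
    if term_a == term_b:
        return 1.0
    return cosine(TOY_VECTORS.get(term_a, {}), TOY_VECTORS.get(term_b, {}))


# A subscription maps attribute -> (wanted value, approximate?).
Subscription = Dict[str, Tuple[str, bool]]


def matches(sub: Subscription, event: Dict[str, str], threshold: float = 0.5) -> bool:
    for attribute, (wanted, approximate) in sub.items():
        actual = event.get(attribute)
        if actual is None:
            return False
        if approximate:
            # "~" terms: accept any sufficiently related value.
            if relatedness(wanted, actual) < threshold:
                return False
        elif wanted != actual:
            # Terms without "~": exact string matching.
            return False
    return True


subscription: Subscription = {
    "type": ("energy consumption", True),   # type = energy consumption~
    "location": ("room 202e", False),       # location = room 202e
}
energy_event = {"type": "electricity usage", "location": "room 202e"}
parking_event = {"type": "parking", "location": "room 202e"}
```

With these toy vectors the energy event matches the subscription and the parking event does not. In a real deployment the vectors would come from a large text corpus, which is where the approach gets its low agreement cost.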
[Figure: loose semantic coupling via large text corpora. Producer publishes "A happened"; consumer subscribes "interested in B"; matching is approximate (~).] (Hasan and Curry, 2014)

Approach 5: Theme-Based
[Figure: loose semantic coupling via large text corpora with themes. Producer publishes "(A + T1) happened"; consumer subscribes "interested in (B + T2)". A is represented over documents d1, d2, d3, d4, d5, d6, d7, d8, ... and B over d1, d3, d4, d17, d25, d26, d77, d78, ...; theme T2 is applied to the representation before approximate matching (~).]

The Thematic Approach
Exchange approximations of meanings
[Figure: event publisher Alice sends an event made of a theme and a payload; consumer Bob registers a subscription made of a theme and an expression; an approximate matcher sits between them]
Parameterization
Loose coupling mode: lightweight agreement on themes
No coupling mode: free use of well representative themes
Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014. Under review.

Event Representation
Event theme tags: energy, appliances, building
type: increased energy consumption event, measurement unit: kilowatt per hour, device: computer, office: room 112

Subscription Representation
Subscription theme tags: power, computers
type = increased energy usage event~, device~ = laptop~, office = room 112

Probabilistic Approximate Matcher
Top-1 and Top-k mappings between an event and a subscription

Building IoT Software
Figure: IoT sensors feed collectors operated by publishers Alice, Carol and Dave, who publish heterogeneous IoT events with thematic tags to the thematic event processing engine(s). The engine performs approximate single event matching: it sends term + theme pairs to a semantic relatedness web service, backed by a vector space index built by indexing a textual corpus, and receives relatedness scores. Consumers Bob (user), Dan (application developer) and Erin (application developer) subscribe with thematic tags. Bob receives the relevant events normalized for him.
Relevant events are likewise normalized for Dan and for Erin, each subscribing with thematic tags.

Summary
Simple content-based. Matching: exact string matching. Semantic coupling: term-level full agreement. Semantics: not explicit. Effectiveness: very low. Cost: defining a small number of rules. Efficiency: high.
Content-based + many rules. Matching: exact string matching. Semantic coupling: term-level full agreement. Semantics: not explicit. Effectiveness: 100%. Cost: defining a large number of rules. Efficiency: high.
Concept-based. Matching: Boolean semantic matching. Semantic coupling: concept-level shared agreement. Semantics: top-down, ontology-based. Effectiveness: depends on the domains and number of concept models. Cost: establishing shared agreement on ontologies. Efficiency: medium to high.
Simple distributional + approximation. Matching: approximate semantic matching. Semantic coupling: loose agreement. Semantics: statistical model based on distributional semantics. Effectiveness: depends on the corpus. Cost: minimal agreement on a large textual corpus. Efficiency: medium to high.
Thematic. Matching: approximate semantic matching. Semantic coupling: loose agreement. Semantics: statistical model based on distributional semantics + themes. Effectiveness: depends on the corpus + theme representatives. Cost: minimal agreement on a large textual corpus + good theme representatives. Efficiency: medium to high.

Evaluation Dataset
Seed events synthesized from IoT sensors
SmartSantander smart city project
Luis Sanchez, Jose Antonio Galache, Veronica Gutierrez, JM Hernandez, J Bernat, Alex Gluhak, and Tomas Garcia. 2011. SmartSantander: The meeting point between Future Internet research and experimentation and the smart cities. In Future Network & Mobile Summit (FutureNetw), 2011. IEEE, 1-8.
Sensor Capabilities: solar radiation, particles, speed, wind direction, wind speed, temperature, water flow, atmospheric pressure, noise, ozone, rainfall, parking, radiation par, co, ground temperature, light, no2, soil moisture tension, relative humidity, energy consumption, cpu usage, memory usage
Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT).
In Press.

Evaluation Dataset
Seed events synthesized from IoT sensors
Linked Energy Intelligence platform
Edward Curry, Souleiman Hasan, and Sean O'Riain. 2012. Enterprise energy management using a linked dataspace for Energy Intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, 1-6.
Car brands from the Yahoo directory
Yahoo! 2013. Yahoo! Directory: Automotive - Makes and Models. (2013). http://dir.yahoo.com/recreation/automotive/makes and models/
Home based appliances from BLUED dataset
Kyle Anderson, Adrian Ocneanu, Diego Benitez, Derrick Carlson, Anthony Rowe, and Mario Berges. 2012. BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD.
Rooms from DERI Building
Richard Cyganiak. 2013. Rooms in the DERI building. (2013). http://lab.linkeddata.deri.ie/2010/deri-rooms
Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press.

Evaluation
F-score up to 95% and 1000s of events/sec
Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT).
In Press.

PART VI: EXAMPLE APPLICATION: LINKED ENERGY INTELLIGENCE

New Smart Building
Cost - 40,000,000

A Real-World Example
[Table: hourly values for Monday to Friday; the weekday alignment of the values is not preserved in the transcript]
08:00-09:00: (no values)
09:00-10:00: 237, 237, 200, 237
10:00-11:00: 237, 237, 237, 200
11:00-12:00: 237, 180, 180, 145, 237
12:00-13:00: 237, 200, 237, 200, 149
13:00-14:00: 145
14:00-15:00: 221, 237, 145, 140
15:00-16:00: 221, 120, 160, 140
16:00-17:00: 149, 250, 160
17:00-18:00: 200, 160
CO2 levels: ASHRAE 62.1-2010
Occupancy Pattern
AirCon: 8:30-11:00 & 15:00-16:00, Mon to Fri
Cost - 40,000,000

Legacy Building
DERI Building
No BMS or BEMS
160 person office space
Café
Data centre
3 kitchens
80 person conference room
4 meeting rooms
Computing museum
Sensor Lab

Energy Management System
Sensors
Energy Management Software

Holistic Energy Consumption
Holistic Energy Management: Facilities; Business Travel; Data Centre; Daily Commute; Office IT

Business Context of Energy Consumption
Resource Allocation; Energy; Finance; Asset Mgmt; Human Resources

Multi-Level Energy Analysis
Example KPI: Energy used by global IT department
Example KPI: PUE of the Data Center in Dublin
Example KPI: kWhs used by server 172.16.0.8
Roles and scopes shown: CIO, Helpdesk, Maintenance Personnel, Building, Data Center, CEO, CSO
Operational Analysis: Technician needs equipment power usage; Low-level monitoring; Sensors, events
Strategic Analysis: CIO needs high-level business function power usage; CSO needs real-time carbon emissions
Tactical Analysis: Manager needs energy usage of business processes, business line or group

Key Challenges
Technology and Data Interoperability: Data scattered among different systems; Multiple incompatible technologies make it difficult to use
Interpreting Dynamic and Static Data: Sensors, ERP, BMS, assets databases; Need to proactively identify efficiency opportunities
Empowering Actions and Including Users in the Loop: Understanding of direct and indirect impacts of activities; Embedding impacts within business processes; Engaging users

Figure labels: Building, Data Center,
Office IT
Logistics
Corporate
Organisation-level
Business Process
Personal-level

Linked dataspace for Energy Intelligence
Linked Energy Intelligence
Applications: Energy Analysis Model, Complex Events, Situation Awareness Apps, Energy and Sustainability Dashboards, Decision Support Systems
Linked Data
Support Services: Entity Management Service, Data Catalog, Complex Event Processing Engine, Provenance, Search & Query
Sources: Adapter, Adapter, Adapter, Adapter, Adapter
- Cloud of Energy Data
- Linked Sensor Middleware
- Resource Description Framework (RDF)
- Semantic Sensor Networks
- Constrained Application Protocol (CoAP)
- Semantic Event Processing
- Collaborative Data Mgmt.
- Energy Saving Applications
- Energy Awareness
Curry E. et al, Enterprise Energy Management using a Linked dataspace for Energy Intelligence. In: The Second IFIP Conference on Sustainable Internet and ICT for Sustainability (SustainIT) 2012.

Energy Saving Applications
Enterprise Energy Observatory
Smart Buildings
Green Cloud Computing
Office IT Energy Mgmt.
Personal Energy Mgmt.

Building Energy Explorer
1. Data from Enterprise Linked Data Cloud
2. Sensor Data
3.
Building Energy Situation Awareness

Energy Analysis by Group

iEnergy Personal

@WATERNOMICS_EU www.waternomics.eu

Concrete Objectives
- To introduce demand response and accountability principles (water footprint) in the water sector
- To engage consumers in new interactive and personalized ways that bring water efficiency to the forefront and lead to changes in water behaviours
- To empower corporate decision makers and municipal area managers with a water information platform together with relevant tools and methodologies to enact ICT-enabled water management programs
- To promote ICT-enabled water awareness using airports and water utilities as pilot examples
- To make possible new water pricing options and policy actions by combining water availability and consumption data
WATERNOMICS will provide personalised and actionable information on water consumption and water availability to individual households, companies and cities in an intuitive & effective manner at relevant time-scales for decision making

WATERNOMICS PLATFORM ARCHITECTURE
Layers: Support Services, Sources, Applications
Components: Water Analysis Model, Complex Events, Usage Model, Water Dashboards, Entity Management Service, Decision Support Systems, Linked Water Data, Data Catalog, Complex Event Processing Engine, Prediction, Search & Query, Adapter, Adapter, Adapter, Adapter, Adapter
- Water Management Apps
- Water Data Analysis and Prediction
- Semantic Sensor Networks and Complex Event Processing to aid Decision Making
- Linking of data from different Water Management Systems using Linked Data / RDF

PILOT OVERVIEW
Columns: #, Focus (Location), Intent, Partner
1. Water utility for domestic users (Thermi). Intent: To demonstrate, validate, and assess the WATERNOMICS Platform for domestic water users
2. Water Management Cycle in an airport (Milan Linate). Intent: To demonstrate, validate, and assess the WATERNOMICS methodology and hardware innovations, and software/analysis results via the deployment of
WATERNOMICS ICT
3. Water distribution in a Municipality (Sochaczew). Intent: To validate and showcase the WATERNOMICS Platform at a municipal level (i.e. mixed use consumers supplied by a water utility)

Conclusions
- Coupling necessary for crossing boundaries
- Decoupling necessary for scalable software
- Event-based systems do not address the coupling/decoupling tradeoff for semantics
- Approximate and thematic event processing exchange approximations of meaning with loose semantic coupling

Dataset and Software
Dataset: Souleiman Hasan, Edward Curry, Thematic event processing dataset, DOI: 10.13140/2.1.3342.9123 http://www.researchgate.net/publication/263673956_Thematic_event_processing_dataset
Collider: Souleiman Hasan, Kalpa Gunaratna, Yongrui Qin, and Edward Curry. 2013. Demo: approximate semantic matching in the collider event processing engine. In Proceedings of the 7th ACM international conference on Distributed event-based systems (DEBS '13). ACM, New York, NY, USA, 337-338. DOI=10.1145/2488222.2489277 http://doi.acm.org/10.1145/2488222.2489277
Easy ESA: EasyESA is an implementation of Explicit Semantic Analysis (ESA) http://treo.deri.ie/easyesa/

References
Cugola, G. and Margara, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
Eugster, P.T., Felber, P.A., Guerraoui, R. and Kermarrec, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114-131.
Carlile, Paul R. "Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries." Organization Science 15.5 (2004): 555-568.
Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). In Press
Hasan, S., O'Riain, S. and Curry, E., 2013. Towards Unified and Native Enrichment in Event Processing Systems.
In the 7th ACM International Conference on Distributed Event-Based Systems (DEBS 2013). Arlington, Texas, USA: ACM.
Hasan, S., O'Riain, S. and Curry, E., 2012. Approximate Semantic Matching of Heterogeneous Events. In 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012). Berlin, Germany: ACM, pp. 252-263.
Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014. Under review.
Hasan, S., Curry, E., Banduk, M., and O'Riain, S. Toward Situation Awareness for the Semantic Sensor Web: Complex Event Processing with Dynamic Linked Data Enrichment. The 4th International Workshop on Semantic Sensor Networks 2011 (SSN11), (2011), 60-72.
E. Curry, Message-Oriented Middleware, in Middleware for Communications, Q. H. Mahmoud, Ed. Chichester, England: John Wiley and Sons, 2004, pp. 1-28.

More References
P. McFedries, The coming data deluge, IEEE Spectrum, 2011.
Luckham, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley Professional.
Dayal, U., Blaustein, B., Buchmann, A., Chakravarthy, U., Hsu, M., Ledin, R., McCarthy, D., Rosenthal, A., Sarin, S., Carey, M. J., Livny, M., and Jauhari, R. 1988. The HiPAC project: Combining active databases and timing constraints. SIGMOD Rec. 17, 1, 51-70.
Lieuwen, D. F., Gehani, N. H., and Arlein, R. M. 1996. The Ode active database: Trigger semantics and implementation. In Proceedings of the 12th International Conference on Data Engineering (ICDE'96). IEEE Computer Society, Los Alamitos, CA, 412-420.
Gatziu, S. and Dittrich, K. 1993. Events in an active object-oriented database system.
In Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H. Williams, Eds. Workshops in Computing, Springer-Verlag, Edinburgh, U.K.
Chakravarthy, S. and Adaikkalavan, R. 2008. Events and streams: Harnessing and unleashing their synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems (DEBS'08). ACM, New York, NY, 1-12.
Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S. R., Reiss, F., and Shah, M. A. 2003. TelegraphCQ: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'03). ACM, New York, NY, 668-668.
Chen, J., DeWitt, D. J., Tian, F., and Wang, Y. 2000. NiagaraCQ: A scalable continuous query system for Internet databases. SIGMOD Rec. 29, 2, 379-390.
Liu, L., Pu, C., and Tang, W. 1999. Continual queries for internet scale event-driven information delivery. IEEE Trans. Knowl. Data Eng. 11, 4, 610-628.
Arasu, A., Babu, S., and Widom, J. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2, 121-142.
Mühl, G., Fiege, L., and Pietzuch, P. 2006. Distributed Event-Based Systems. Springer.
Altherr, M., Erzberger, M., and Maffeis, S. 1999. iBus: a software bus middleware for the Java platform. In Proceedings of the International Workshop on Reliable Middleware Systems. 43-53.

More References
David S. Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920
Eugster, P. and Guerraoui, R. 2001. Content based publish/subscribe with structural reflection. In Proceedings of the 6th Usenix Conference on Object-Oriented Technologies and Systems (COOTS'01).
C. Shannon and W. Weaver. The mathematical theory of communication.
University of Illinois Press, 1949.
Curry, Edward, Souleiman Hasan, and Seán O'Riain. "Enterprise energy management using a linked dataspace for energy intelligence." Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, 2012, 1-6.
Curry, Edward, et al. "Linking building data in the cloud: Integrating cross-domain building data using linked data." Advanced Engineering Informatics 27.2 (2013): 206-219.
A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Achieving scalability and expressiveness in an internet-scale event notification service. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, pages 219-227. ACM, 2000.
M. Petrovic, I. Burcea, and H.-A. Jacobsen. S-ToPSS: Semantic Toronto Publish/Subscribe System. In Proceedings of the 29th international conference on Very large data bases - Volume 29, VLDB '03, pages 1101-1104. VLDB Endowment, 2003.
Luis Sanchez, Jose Antonio Galache, Veronica Gutierrez, JM Hernandez, J Bernat, Alex Gluhak, and Tomas Garcia. 2011. SmartSantander: The meeting point between Future Internet research and experimentation and the smart cities. In Future Network & Mobile Summit (FutureNetw), 2011. IEEE, 1-8.
Yahoo! 2013. Yahoo! Directory: Automotive - Makes and Models. (2013). http://dir.yahoo.com/recreation/automotive/makes and models/
Kyle Anderson, Adrian Ocneanu, Diego Benitez, Derrick Carlson, Anthony Rowe, and Mario Berges. 2012.
BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD.
Richard Cyganiak. 2013. Rooms in the DERI building. (2013). http://lab.linkeddata.deri.ie/2010/deri-rooms

Credits
- Green and Sustainable IT Group at Insight Galway for all their hard work.
- Special thanks to Souleiman Hasan for his assistance with the Tutorial
- Andre Freitas: Slides on Distributional Semantics
- Prof. Manfred Hauswirth and USM at Insight Galway (LSM, OpenIoT, etc.)
