WALSAIP: www.walsaip.uprm.edu
AIP Group (Automated Information Processing)
Supported by NSF Grant CNS-0540592
This work centers on the design and development of a Java-based XML information representation (XIR) tool for the coupling/binding representation of data and metadata entities associated with physical sensors in environmental surveillance monitoring (ESM) applications. Metadata, defined in general as data that describes data, is associated with each set of sensor signal data through a binding/coupling registry process using the Extensible Markup Language (XML) format. The concept of sensor data availability in ESM is decomposed into three specific requirements for the XIR system: give users remote access to information, give access to data as soon as it is required, and enable a uniform interpretation of data among heterogeneous data sources and data destinations.
Data and Metadata Challenges
Data and Metadata
[1] Luca Manetti, Andrea Terribilini, Alfredo Knecht, “Autonomous Remote Monitoring System for Landslides”, SPIE’s 9th Annual International Symposium on Smart Structures and Materials, San Diego, CA, 2002.
[2] Stefano Nativi, Dino Giuli, Emilio Bugli Innocenti, “Interoperability for Multimedia Systems to Support Decision-Makers in the Environment Sector”, IEEE International Conference on Multimedia Computing and Systems, vol. 2, pp. 338-342, June 1999.
[3] Dong-Jun Won, Il-Yop Chung, Joong-Moon Kim, Seung-Il Moon, Jang-Cheol Seo, Jong-Woong Choe, “Development of Power Quality Monitoring System with Central Processing Scheme”, Power Engineering Society Summer Meeting, IEEE, South Korea, vol. 2, pp. 915-919, 21-25 July 2002.
[4] MANTIS Project (MultimodAl Networks of In-situ Sensors): http://mantis.cs.colorado.edu/index.php/tiki-index.php
1. Abstract
2. WALSAIP Conceptual Model
3. Problem Formulation
4. Proposed Solution
5. Implementation Effort
6. Ongoing Work
7. References
Luz V. Acabá-Cuevas, M.S. Student
Prof. Domingo Rodríguez, Advisor
AIP Group, ECE Department, University of Puerto Rico, Mayagüez Campus
Email: [email protected]
Automated XML Schema Representations for Sensor-based Information Processing Systems
Diagram labels: Sensor Array Structures; Signal Conditioning System; Raw Data Servers; Computational Signal Processing Systems; Signal Data Post-processing; Computed Data Servers; Data Representation Systems; Information Rendering Systems; INTERNET.
Stages: Pre-processing Stage, Processing Stage, Post-processing Stage.
Signal Data: all readings collected directly from sensors.
Metadata: data that describes data. Metadata is crucial for giving researchers a concrete idea of the real conditions in which data was collected. When findings are abnormal, metadata helps determine how the environment influenced the measurement.
There is a need for proper characterization of binding/coupling relationships between data and metadata files to improve information content analysis.
Data should be interoperable across heterogeneous users with different data architectures, storage systems, and platforms. A mechanism should be designed to make data readable and understandable across heterogeneous users in automated information processing systems.
Current systems lack support for dynamic metadata management.
Systems need to incorporate information from “human sensors”.
Study Mechanisms for data and metadata acquisition
Study correspondences of data and metadata files
Figure 1. WALSAIP Conceptual Model
Figure 3. Example: NERR System Data/Metadata
Data + Metadata + Processing
Decision Making!
Figure 2. Decision Making Input
Figure 4. Shannon’s Theory and XML Processing
AUTOMATE!
Diagram labels: User (Information Source), XML Coder, Tx Transmitter, Communication Channel, Rx Receiver, XML Decoder, User (Information Destination).
Hazards: jamming, interference, power failure, etc.
Data + Metadata → XML (coder side); XML → Data + Metadata (decoder side).
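The transmitter-to-receiver path in the diagram can be illustrated with a small framing sketch. The poster does not say how channel hazards are detected, so the CRC-32 check used here is an assumption chosen only for illustration, and the names `frame`, `unframe`, and `noisy_channel` are hypothetical. The XIR tool itself is Java-based; Python is used here only for brevity.

```python
import zlib


def frame(xml_text: str) -> bytes:
    """Transmitter side: prefix the UTF-8 XML payload with its CRC-32 (4 bytes)."""
    payload = xml_text.encode("utf-8")
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unframe(message: bytes) -> str:
    """Receiver side: verify the CRC-32 and return the XML text.

    Raises ValueError if the message was damaged in the channel.
    """
    if len(message) < 4:
        raise ValueError("message too short")
    expected = int.from_bytes(message[:4], "big")
    payload = message[4:]
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch: payload damaged in transit")
    return payload.decode("utf-8")


def noisy_channel(message: bytes, flip_at: int) -> bytes:
    """Hypothetical hazard model: flip one bit of the byte at index flip_at."""
    damaged = bytearray(message)
    damaged[flip_at] ^= 0x01
    return bytes(damaged)
```

With this sketch, `unframe(frame(doc))` returns `doc` unchanged, while a message passed through `noisy_channel` is rejected, so the XML decoder never sees a corrupted document.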
Design and implementation of the XML Information Representation (XIR) tool using Java, XML, and FTP technologies for the encapsulation of data and metadata files (proposed as a format for information content exchange) in automated information processing systems.
Enable users to develop “stencils” in order to customize “XML tags” during encapsulation.
Information-theoretic measures are used to study how XML may serve as a means for integrating symbols and meaning (the semiotic and semantic parts) from metadata with signals and structure (the syntactic part) from sensor-based raw signal data.
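One way to picture a “stencil” is as a user-supplied mapping from logical field names to the XML tags written during encapsulation. The sketch below works under that assumption. The poster does not define the stencil format, so `DEFAULT_STENCIL`, `encapsulate`, and the logical names `node`, `rate`, and `kind` are all hypothetical. Only the tag names `encapsulation`, `metadata`, `sensingInfo`, `nodeID`, `samplingRate`, `type`, and `data` come from the poster's own example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stencil: maps a logical field name to the XML tag the user wants.
DEFAULT_STENCIL = {
    "node": "nodeID",
    "rate": "samplingRate",
    "kind": "type",
}


def encapsulate(fields: dict, samples: list, stencil: dict = DEFAULT_STENCIL) -> str:
    """Wrap metadata fields and raw samples in one XML document.

    Fields missing from the stencil keep their own name as the tag.
    """
    root = ET.Element("encapsulation")
    sensing = ET.SubElement(ET.SubElement(root, "metadata"), "sensingInfo")
    for name, value in fields.items():
        ET.SubElement(sensing, stencil.get(name, name)).text = str(value)
    ET.SubElement(root, "data").text = " ".join(map(str, samples))
    return ET.tostring(root, encoding="unicode")
```

Calling `encapsulate({"node": 0, "rate": 138, "kind": "humidity"}, [65535, 65535])` yields a document whose metadata uses the default tags. Passing a different `stencil` changes the tag names without touching the data.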
Applying engineering techniques for solution design.
Generating source code to implement a proposed solution instantiation.
Identifying potential test cases to perform functional verification tests after coding.
Integrating a proposed solution into the WALSAIP architecture.
Implementation Effort
Analysis of current metadata management and storage formats in order to provide encapsulation support.
Analysis of data/metadata consumer modules within the WALSAIP architecture to ensure compatibility and integration.
Generation of algorithms to acquire plain-text values from non-text data such as images and acoustic signals.
Engineering of algorithms to gather critical metadata directly from data, for example image dimensions and format.
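The last two items can be sketched for one concrete case. The poster names neither an image format nor a text encoding, so both choices here are assumptions: PNG is used because its dimensions sit at fixed offsets in the file header, and Base64 is used as one common way to carry binary data as plain text inside XML. The function names are hypothetical.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_metadata(blob: bytes) -> dict:
    """Read format, width, and height from the PNG signature and IHDR chunk."""
    if len(blob) < 24 or blob[:8] != PNG_SIGNATURE or blob[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    # Width and height are two big-endian 32-bit integers right after "IHDR".
    width, height = struct.unpack(">II", blob[16:24])
    return {"format": "PNG", "width": width, "height": height}


def to_plain_text(blob: bytes) -> str:
    """Encode arbitrary binary data as ASCII text that is safe inside an XML element."""
    return base64.b64encode(blob).decode("ascii")
```

The dictionary returned by `png_metadata` could feed the metadata part of an encapsulation, and the string from `to_plain_text` could fill the data part. Acoustic signals would need their own header reader.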
Figure 6. Data and Metadata Encapsulation Example
Figure 5. Shannon’s Theory Approach to Information Flow Study
The proposed solution contemplates dynamic metadata management.
It enables data and metadata enhancement with user observations.
Context awareness aids in the detection, estimation, and classification of sensor-based signals acquired from ESM for the assessment and proper management of Earth’s geophysical, environmental, and ecological issues.
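Enhancing an existing encapsulation with a user observation might look like the sketch below. The `<observations>` and `<observation>` tags, their attributes, and the function name are hypothetical; the poster does not specify how observations are stored.

```python
import xml.etree.ElementTree as ET


def add_observation(xml_text: str, author: str, note: str, timestamp: str) -> str:
    """Return a copy of the encapsulation with one more <observation> in its metadata.

    The <observations>/<observation> tags are hypothetical, not part of the poster.
    """
    root = ET.fromstring(xml_text)
    metadata = root.find("metadata")
    if metadata is None:
        metadata = ET.SubElement(root, "metadata")
    container = metadata.find("observations")
    if container is None:
        container = ET.SubElement(metadata, "observations")
    obs = ET.SubElement(container, "observation", {"author": author, "time": timestamp})
    obs.text = note
    return ET.tostring(root, encoding="unicode")
```

Because the function only appends to the metadata element, the sensor data is left untouched and observations can accumulate over time.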
Diagram labels: Input, Server Application, Output, Internet.
Note: Static Local Information.
Symbols: D_K, M_K, D_{K-1}, M_{K-1}, D̃_K, M̃_K.
<?xml version="1.0" ?>
<encapsulation>
  <metadata>
    <research>
      <researchName>Wide Area Large Scale Automated Information Processing</researchName>
      <department>Department of Electrical and Computer Engineering</department>
      <intitution>University of Puerto Rico at Mayaguez</intitution>
      <phone>787-832-4040</phone>
      <contact>Domingo Rodriguez</contact>
      <email>[email protected]</email>
    </research>
    <sensingInfo>
      <initialDate>2006-07-05</initialDate>
      <initialTime>22:23:00.14</initialTime>
      <endingDate>2006-07-06</endingDate>
      <endingTime>22:23:00.14</endingTime>
      <nodeID>0</nodeID>
      <samplingRate>138</samplingRate>
      <type>humidity</type>
    </sensingInfo>
  </metadata>
  <data>65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 65535 </data>
</encapsulation>
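A consumer module could split the encapsulation above back into typed metadata and numeric samples roughly as follows. This is a minimal sketch, not the XIR decoder: `read_encapsulation` and the keys of the returned dictionary are hypothetical, and `EXAMPLE` is a shortened copy of the document above with the research block omitted and only three samples kept.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Shortened copy of the encapsulation above (fewer samples, research block omitted).
EXAMPLE = """<?xml version="1.0" ?>
<encapsulation>
  <metadata>
    <sensingInfo>
      <initialDate>2006-07-05</initialDate>
      <initialTime>22:23:00.14</initialTime>
      <endingDate>2006-07-06</endingDate>
      <endingTime>22:23:00.14</endingTime>
      <nodeID>0</nodeID>
      <samplingRate>138</samplingRate>
      <type>humidity</type>
    </sensingInfo>
  </metadata>
  <data>65535 65535 65535 </data>
</encapsulation>"""


def read_encapsulation(xml_text: str) -> dict:
    """Split an encapsulation into typed metadata and a list of integer samples."""
    root = ET.fromstring(xml_text)
    info = root.find("metadata/sensingInfo")

    def stamp(prefix: str) -> datetime:
        # Combine e.g. <initialDate> and <initialTime> into one datetime.
        text = f"{info.findtext(prefix + 'Date')} {info.findtext(prefix + 'Time')}"
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")

    return {
        "node": int(info.findtext("nodeID")),
        "type": info.findtext("type"),
        "sampling_rate": int(info.findtext("samplingRate")),
        "start": stamp("initial"),
        "end": stamp("ending"),
        "samples": [int(tok) for tok in root.findtext("data").split()],
    }
```

Because the metadata travels with the samples, the consumer can compute the acquisition window (`end - start`) without consulting a separate file.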