Surveillance, Events and the Semantic Web
From E-Gov to Connected Governance: the Role of Cloud Computing, Web 2.0 and Web 3.0 Semantic Technologies
Washington, D.C., February 17, 2009
Dr. Nancy Grady ([email protected])
Energy | Environment | National Security | Health | Critical Infrastructure
Overview
• Surveillance
• Event models
• Emerging Web technologies
• Putting it all together
Surveillance
• The process of active data-gathering with appropriate analysis and interpretation in order to achieve
– Early warning of threats
– Early warning of events
– Results of analysis
– Overall situational awareness
• Communication to stakeholders of
– Events
– Investigations
– Conclusions
Surveillance
Mining for actionable intelligence
• Traditional surveillance
– Look for anomalous activity in sensor data
– Fuse primary or secondary data sources
– Federated datasets
• New surveillance for “events”
– Harvest Web-based information
– Exchange event data with other agencies
Conceptual Architecture
[Figure: National Biosurveillance Integration System 2.0 conceptual architecture. Stages: Collection; Preparation and Modeling; Evaluation; Deployment. Labeled elements: Open Source Data Feeds; Web Service, RSS, SFTP, eMail, Browser, . . .; PDF, .doc, .xls, .html; Extract, Transform, Load (ETL); Extraction; Categorization; Natural Language Processing; Geospatial Analysis; Semantic Query; Integration; Structured, Unstructured, and Metadata; Semantically Integrated Information; Analysis, Collaboration, Workflow; Wiki; Analysis Users; Situational Awareness; Reports, Notes, Products, Papers, . . .]
Overview
• Surveillance
• Event models
• Emerging Web technologies
• Putting it all together
Events Are the Output of Surveillance
• Who
• What
• Where
• When
• How
• How many
• Why
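The seven questions above map naturally onto a record type. The following is a minimal sketch, not the presenter's schema: the class name `Event`, the field types, the helper `unanswered`, and all values are assumptions chosen only to show the shape of such a record.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    """One surveillance event, with one field per question on the slide."""
    who: str                        # who was affected or involved
    what: str                       # what happened
    where: str                      # place name, free text in this sketch
    when: date                      # date of the event
    how: Optional[str] = None       # mechanism, if known
    how_many: Optional[int] = None  # count, if known
    why: Optional[str] = None       # cause or explanation, if known

    def unanswered(self) -> List[str]:
        """Return the names of the questions that are still open."""
        return [name for name, value in vars(self).items() if value is None]


# Placeholder values (hypothetical), not data from the presentation.
example = Event(
    who="placeholder subject",
    what="placeholder occurrence",
    where="placeholder location",
    when=date(2009, 1, 1),
    how_many=1,
)
print(example.unanswered())
```

Making "how", "how many" and "why" optional reflects a guess that early reports often answer only the first four questions; a real model might require different fields.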
Exchanging Events
• Situational awareness is communicating significant events
• Easier to collaborate around events than primary/secondary data
– Privacy
– Bandwidth
– “Local” expertise on data
– Existing surveillance systems tuned to data
• Open source (or intelligence) will be about events
Event Representation
• Prose
• Spreadsheets
• Relational database
• Relationship database (ontology)
• Semantic Web
Event - Spreadsheet Model
[Figure: spreadsheet columns for an event record: Case, Species, Subspecies, Gender, Age, Country, District, Subdistrict, Location, Latitude, Longitude, Onset, Report Date, Number Of Cases, Number Of Deaths, Number Destroyed. Legend (column categories): Human, Avian, Human/Avian.]
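As a rough illustration of the spreadsheet model, the sketch below writes one flat row using the column names from the slide. The column order, the helper names `write_rows` and `empty_columns`, and the assignment of values to columns are assumptions. The values themselves are borrowed from the case shown on the relationship-model slide, and treating "Dogubeyazit" as the District is a guess.

```python
import csv
import io

# Column names from the slide; the order here is an assumption.
COLUMNS = [
    "Case", "Species", "Subspecies", "Gender", "Age",
    "Country", "District", "Subdistrict", "Location",
    "Latitude", "Longitude", "Onset", "Report Date",
    "Number Of Cases", "Number Of Deaths", "Number Destroyed",
]


def write_rows(rows):
    """Serialize rows to CSV text, leaving missing columns blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def empty_columns(row):
    """List the columns that this row leaves unfilled."""
    return [name for name in COLUMNS if not row.get(name)]


# One human case; the mapping of values to columns is assumed.
human_row = {
    "Case": "151",
    "Species": "Human",
    "Gender": "Female",
    "Age": "14",
    "Country": "Turkey",
    "District": "Dogubeyazit",
    "Report Date": "2006-01-05",
}

print(write_rows([human_row]))
print(empty_columns(human_row))
```

A single flat layout has to carry every column for every kind of record, so a human case leaves fields such as "Number Destroyed" blank. That is one way to read the later remark that spreadsheets are easy but limited.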
Relationship (Ontological) Model
[Figure: graph of one event in the relationship model. Node labels: Avian Flu Case151, Person151, Death D151, Hospitalization H151, H5N1 Influenza, Disease, Death, Human, Female, 14, Jan 01, 2006, Jan 05, 2006, Winter, Dogubeyazit Village, Dogubeyazit, Turkey, Northern Hemisphere. Edge labels: object, occursAt, occursOn, dateReported, diseaseType, Has Gender, Has Age, Host type, SpatialPartOf. Legend (node types): Event, Human, Time Span, Region, Value, Virus.]
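The relationship model can be approximated as a list of subject, predicate, object triples. This is a sketch under stated assumptions: the node and edge labels come from the slide, but the wiring between them is one plausible reading of the diagram, and the helpers `objects`, `containing_regions` and `events_in` are hypothetical names, not part of any system described in the talk.

```python
# (subject, predicate, object) triples. Labels are from the slide;
# which node connects to which is an assumed reading of the diagram.
TRIPLES = [
    ("AvianFluCase151", "object", "Person151"),
    ("AvianFluCase151", "occursAt", "Dogubeyazit Village"),
    ("AvianFluCase151", "dateReported", "2006-01-05"),
    ("AvianFluCase151", "diseaseType", "H5N1 Influenza"),
    ("Person151", "Has Gender", "Female"),
    ("Person151", "Has Age", "14"),
    ("DeathD151", "object", "Person151"),
    ("DeathD151", "occursOn", "2006-01-05"),
    ("HospitalizationH151", "object", "Person151"),
    ("HospitalizationH151", "occursOn", "2006-01-01"),
    ("Dogubeyazit Village", "SpatialPartOf", "Dogubeyazit"),
    ("Dogubeyazit", "SpatialPartOf", "Turkey"),
    ("Turkey", "SpatialPartOf", "Northern Hemisphere"),
]


def objects(subject, predicate):
    """Return every object linked from subject by predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def containing_regions(place):
    """Follow SpatialPartOf links upward and collect enclosing regions."""
    regions = []
    frontier = [place]
    while frontier:
        current = frontier.pop()
        for parent in objects(current, "SpatialPartOf"):
            if parent not in regions:
                regions.append(parent)
                frontier.append(parent)
    return regions


def events_in(region):
    """Return subjects whose occursAt place is the region or lies inside it."""
    found = []
    for s, p, o in TRIPLES:
        if p == "occursAt" and (o == region or region in containing_regions(o)):
            found.append(s)
    return found


print(containing_regions("Dogubeyazit Village"))
print(events_in("Turkey"))
```

The query for "Turkey" finds a case that was recorded only at village level, because the containment chain supplies the context. A flat row would need a separate Country column to answer the same question.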
Comparisons
• Prose is difficult to track and query
• Spreadsheets are easy but limited
• Relational databases are more familiar to developers and easier to aggregate
• Ontologies are easier for queries using context and offer better scalability to many datasets
• Semantic Web better for interoperability and flexibility, not as good for contextual queries, good for tagging within prose
Maturing an Event Model
• Source descriptions
– Veracity
• Analytical pedigree
– Preparation and analytical techniques
• Community standards
– Codes and representations
• Spatial extent descriptions
Overview
• Surveillance
• Event models
• Emerging Web technologies
• Putting it all together
Web (1.0, 2.0, 3.0)
• Content
1) Brochureware
2) Social networking
3) Semantic Web … data.
• Search
1) PageRank prioritization
2) Influencers
3) Semantic searches
• Software
1) HTML* and CSS**, XML***
2) Wikis, blogs, tag clouds
3) Services for data exchange
• Hardware
1) Servers
2) Server farms
3) Cloud computing
*HTML = HyperText Markup Language
**CSS = Cascading Style Sheets
***XML = eXtensible Markup Language
Cloud Computing Benefits
• Economies of scale to fit surveillance scope
• Handles surge capacity for breaking news or deep dives
• Opens up large-scale enterprise services to small projects that could not afford their own enterprise-scale resources
• Opens new low-risk experiment and prototype areas
• Allows IT to integrate faster and be more responsive
• Cost reductions
• Redundancy
Mobile Platforms
• Smart phones
• Consumer readers
• Business readers
Overview
• Surveillance
• Event models
• Emerging Web technologies
• Putting it all together
Information Needs
• Understanding what external information exists (about your organization and mission)
• Aggregating, integrating and analyzing all external information that directly impacts an internal project
• Identifying, aggregating and integrating documents as part of a large-scale document management system
Analyst Needs to Harvest Open Source Information
• Search engines are not enough
• Filter and triage information for analysts
• Integrate internal and external data
• Collaborative environment for knowledge workers
• Store analysis and vetted results
• Tracking of events
• Situational awareness reporting
• Dissemination to a mobile workforce
All Source Analytic Framework (ASAF)
Correspondence: [email protected]