Data and Applications Security Developments and Directions
Dr. Bhavani Thuraisingham, The University of Texas at Dallas. Data Warehousing, Data Mining and Security. October 8, 2010.
![Page 1: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/1.jpg)
Data and Applications Security Developments and Directions
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Data Warehousing, Data Mining and Security
October 8, 2010
![Page 2: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/2.jpg)
Outline
Background on Data Warehousing
Security Issues for Data Warehousing
Data Mining and Security
![Page 3: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/3.jpg)
What is a Data Warehouse?
A Data Warehouse is a:
- Subject-oriented
- Integrated
- Nonvolatile
- Time variant
- Collection of data in support of management’s decisions
- From: Building the Data Warehouse by W. H. Inmon, John Wiley and Sons
Integration of heterogeneous data sources into a repository
Summary reports, aggregate functions, etc.
![Page 4: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/4.jpg)
Example Data Warehouse
Oracle DBMS for Employees
Sybase DBMS for Projects
Informix DBMS for Medical
Data Warehouse: Data correlating Employees with Medical Benefits and Projects
Could be any DBMS; usually based on the relational data model
Users query the Warehouse
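As a minimal sketch of the example above, the following uses Python's built-in `sqlite3` module, with three in-memory tables standing in for the separate Oracle, Sybase, and Informix sources. The table names, columns, and rows are hypothetical and chosen only for illustration; a real warehouse would extract from separate systems rather than from tables in one database.

```python
import sqlite3

def build_warehouse():
    """Load three toy 'source' tables and derive one warehouse table."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Stand-ins for the three source DBMSs (hypothetical schemas and rows).
    cur.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE projects (emp_id INTEGER, project TEXT)")
    cur.execute("CREATE TABLE medical (emp_id INTEGER, benefit_cost REAL)")
    cur.executemany("INSERT INTO employees VALUES (?, ?)",
                    [(1, "Ann"), (2, "Bob"), (3, "Cy")])
    cur.executemany("INSERT INTO projects VALUES (?, ?)",
                    [(1, "P1"), (2, "P1"), (3, "P2")])
    cur.executemany("INSERT INTO medical VALUES (?, ?)",
                    [(1, 100.0), (2, 300.0), (3, 50.0)])
    # The warehouse table correlates employees with projects and medical benefits.
    cur.execute("""
        CREATE TABLE warehouse AS
        SELECT e.emp_id, e.name, p.project, m.benefit_cost
        FROM employees e
        JOIN projects p ON p.emp_id = e.emp_id
        JOIN medical m ON m.emp_id = e.emp_id
    """)
    conn.commit()
    return conn

def benefit_cost_by_project(conn):
    """Summary report: total medical benefit cost per project."""
    rows = conn.execute(
        "SELECT project, SUM(benefit_cost) FROM warehouse "
        "GROUP BY project ORDER BY project"
    ).fetchall()
    return dict(rows)

conn = build_warehouse()
report = benefit_cost_by_project(conn)
```

Users query the derived `warehouse` table, not the sources; the aggregate report is the kind of summary a warehouse is built to serve.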
![Page 5: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/5.jpg)
Some Data Warehousing Technologies
Heterogeneous Database Integration
Statistical Databases
Data Modeling
Metadata
Access Methods and Indexing
Language Interface
Database Administration
Parallel Database Management
![Page 6: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/6.jpg)
Data Warehouse Design
Appropriate data model is key to designing the warehouse
Higher-level model in stages
- Stage 1: Corporate data model
- Stage 2: Enterprise data model
- Stage 3: Warehouse data model
Middle-level data model
- A model, possibly one for each subject area in the higher-level model
Physical data model
- Include features such as keys in the middle-level model
Need to determine appropriate levels of granularity of data in order to build a good data warehouse
![Page 7: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/7.jpg)
Distributing the Data Warehouse
Issues similar to distributed database systems
[Figure: two configurations for a Central Bank with Branch A and Branch B. Non-distributed Warehouse: a single Central Warehouse. Distributed Warehouse: a Central Warehouse plus a Branch A Warehouse and a Branch B Warehouse.]
![Page 8: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/8.jpg)
Multidimensional Data Model
[Figure: multidimensional view of project data. Attributes: Project Name, Project Leader, Project Sponsor, Project Cost, Project Duration. Cost dimension: Dollars, Pounds, Yen. Duration dimension: Years, Months, Weeks.]
![Page 9: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/9.jpg)
Indexing for Data Warehousing
Bit-maps
Multi-level indexing
Storing parts or all of the index files in main memory
Dynamic indexing
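To make the bit-map idea concrete, here is a minimal sketch that uses Python integers as bit vectors. The function names, column names, and rows are hypothetical; production bitmap indexes add compression and on-disk layout that this sketch ignores.

```python
from collections import defaultdict

def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int used as a bit vector.

    Bit i is set when rows[i][column] equals that value.
    """
    index = defaultdict(int)
    for i, row in enumerate(rows):
        index[row[column]] |= 1 << i
    return dict(index)

def matching_rows(bitmap):
    """Return the row positions whose bits are set in `bitmap`."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical fact rows for illustration.
rows = [
    {"sponsor": "X", "currency": "Dollars"},
    {"sponsor": "Y", "currency": "Yen"},
    {"sponsor": "X", "currency": "Yen"},
    {"sponsor": "X", "currency": "Dollars"},
]
by_sponsor = build_bitmap_index(rows, "sponsor")
by_currency = build_bitmap_index(rows, "currency")

# A conjunctive predicate becomes a bitwise AND of two bitmaps.
hits = matching_rows(by_sponsor["X"] & by_currency["Dollars"])
```

Bitmaps suit warehouse columns with few distinct values, because combining predicates reduces to cheap bitwise operations.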
![Page 10: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/10.jpg)
Metadata Mappings
[Figure: Metadata for Data Source A, Metadata for Data Source B, and Metadata for Data Source C are each connected, through Metadata for Mappings and Transformations, to the Metadata for the Warehouse.]
![Page 11: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/11.jpg)
Data Warehousing and Security
Security for integrating the heterogeneous data sources into the repository
- e.g., Heterogeneous Database System Security, Statistical Database Security
Security for maintaining the warehouse
- Query, Updates, Auditing, Administration, Metadata
Multilevel Security
- Multilevel Data Models, Trusted Components
![Page 12: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/12.jpg)
Example Secure Data Warehouse
[Figure: components shown are User, Secure Data Warehouse Manager, Secure Warehouse, and Secure DBMS A, Secure DBMS B, and Secure DBMS C, each with its own Secure Database.]
![Page 13: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/13.jpg)
Secure Data Warehouse Technologies
Secure Data Warehousing Technologies:
Secure data modeling
Secure heterogeneous database integration
Database security
Secure access methods and indexing
Secure query languages
Secure database administration
Secure high performance computing technologies
Secure metadata management
![Page 14: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/14.jpg)
Security for Integrating Heterogeneous Data Sources
Integrating multiple security policies into a single policy for the warehouse
- Apply techniques for federated database security?
- Need to transform the access control rules
Security impact on schema integration and metadata
- Maintaining transformations and mappings
Statistical database security
- Inference and aggregation
- e.g., Average salary in the warehouse could be unclassified while the individual salaries in the databases could be classified
Administration and auditing
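The average-salary example can be sketched in a few lines. The code below is an illustration under stated assumptions, not a method from the slides: it applies a query-set-size restriction (the threshold, class, and function names are hypothetical) and then shows that two permitted averages can still expose one individual's salary by differencing, which is why inference control needs more than a size check.

```python
MIN_QUERY_SET_SIZE = 3  # hypothetical threshold chosen for illustration

class QueryRefused(Exception):
    """Raised when a statistical query covers too few individuals."""

def average_salary(records, predicate, min_size=MIN_QUERY_SET_SIZE):
    """Release an average only if enough records match the predicate."""
    matched = [r["salary"] for r in records if predicate(r)]
    if len(matched) < min_size:
        raise QueryRefused("query set too small")
    return sum(matched) / len(matched)

# Hypothetical records; individual salaries are treated as classified.
records = [
    {"name": "Ann", "dept": "A", "salary": 50000},
    {"name": "Bob", "dept": "A", "salary": 60000},
    {"name": "Cy", "dept": "A", "salary": 70000},
    {"name": "Di", "dept": "B", "salary": 90000},
]

# Both queries pass the size check (4 and 3 matching records).
avg_all = average_salary(records, lambda r: True)
avg_dept_a = average_salary(records, lambda r: r["dept"] == "A")

# Differencing: assuming the attacker knows the two counts, the one
# record outside department A is recovered exactly.
inferred_di = avg_all * 4 - avg_dept_a * 3
```

A direct query on department B alone is refused, yet `inferred_di` equals Di's salary.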
![Page 15: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/15.jpg)
Security Policy for the Warehouse
Federated policies become warehouse policies?
[Figure: Security Policy Integration and Transformation. Labels shown: Component Policy, Generic Policy, and Export Policy for each of Components A, B, and C; Federated Policy for Federation F1; Federated Policy for Federation F2. The Export Policy for Component B appears twice in the figure.]
![Page 16: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/16.jpg)
Security Policy for the Warehouse - II
[Figure: Policy for Data Source A, Policy for Data Source B, and Policy for Data Source C are each connected, through a Policy for Mappings and Transformations, to the Policy for the Warehouse.]
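One way to picture the transformation of source policies into a warehouse policy is sketched below. This is an assumption-laden illustration, not the method from the slides: policies are modeled as role-to-attribute sets, the attribute mappings are hypothetical, and conflicts are resolved with a most-restrictive rule (a role may read a warehouse attribute only if every source contributing that attribute allows it).

```python
def transform_policy(source_policy, attribute_map):
    """Rename source attributes in an access rule set to warehouse attributes."""
    return {
        role: {attribute_map[a] for a in attrs if a in attribute_map}
        for role, attrs in source_policy.items()
    }

def integrate(policies, contributed):
    """Most-restrictive merge of transformed policies.

    policies:    one transformed policy per source
    contributed: for each source, the set of warehouse attributes it supplies
    """
    roles = set().union(*(p.keys() for p in policies))
    all_attrs = set().union(*contributed)
    merged = {}
    for role in roles:
        merged[role] = {
            attr
            for attr in all_attrs
            if all(
                attr in policy.get(role, set())
                for policy, attrs in zip(policies, contributed)
                if attr in attrs
            )
        }
    return merged

# Hypothetical source policies and mappings.
map_a = {"sal": "salary", "ename": "employee"}
map_b = {"proj": "project", "emp": "employee"}
policy_a = transform_policy({"analyst": {"sal"}, "manager": {"sal", "ename"}}, map_a)
policy_b = transform_policy({"analyst": {"proj"}, "manager": {"proj", "emp"}}, map_b)

warehouse_policy = integrate(
    [policy_a, policy_b],
    [set(map_a.values()), set(map_b.values())],
)
```

Here the analyst loses access to `employee` in the warehouse because neither source grants it, while the manager keeps all three attributes. Other resolution rules (for example, most permissive) are equally possible; the choice is a policy decision.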
![Page 17: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/17.jpg)
Secure Data Warehouse Model
Dollars, S
Pounds, S
Yen, S
Year, U
Months, U
Weeks, U
Project Name, U
Project Leader, U
Project Sponsor, S
Project Cost, S
Project Duration, U
U = Unclassified; S = Secret
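The attribute labels above can be turned into a small filtering sketch. The labels come from the slide; the record values, the numeric ordering of levels, and the function name are assumptions for illustration only.

```python
LEVELS = {"U": 0, "S": 1}  # Unclassified is lower than Secret

# Attribute labels as listed on the slide.
ATTRIBUTE_LABELS = {
    "Project Name": "U",
    "Project Leader": "U",
    "Project Sponsor": "S",
    "Project Cost": "S",
    "Project Duration": "U",
}

def visible_view(record, clearance):
    """Return only the attributes whose label does not exceed the clearance."""
    return {
        attr: value
        for attr, value in record.items()
        if LEVELS[ATTRIBUTE_LABELS[attr]] <= LEVELS[clearance]
    }

# Hypothetical project record.
record = {
    "Project Name": "Alpha",
    "Project Leader": "Ann",
    "Project Sponsor": "Agency X",
    "Project Cost": 1000000,
    "Project Duration": 18,
}

unclassified_view = visible_view(record, "U")
secret_view = visible_view(record, "S")
```

An Unclassified user sees the name, leader, and duration; a Secret user sees the full record.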
![Page 18: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/18.jpg)
Methodology for Developing a Secure Data Warehouse
[Figure: process flow with the steps: Secure data sources; Integrate secure data sources; Clean/modify data sources and integrate policies; Build secure data model, schemas, access methods, and index strategies for the secure warehouse.]
![Page 19: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/19.jpg)
Multi-Tier Architecture
Tier 1: Secure Data Sources
Tier 2: Builds on Tier 1
...
Tier N: Secure Data Warehouse; builds on Tier N-1
Each layer builds on the previous layer: schemas, metadata, policies
![Page 20: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/20.jpg)
Administration
Roles of Database Administrators, Warehouse Administrators, Database System Security officers, and Warehouse System Security Officers?
When databases are updated, can a trigger mechanism be used to automatically update the warehouse?
- i.e., will the individual database administrators permit such a mechanism?
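A trigger-based update can be sketched with SQLite through Python's built-in `sqlite3` module. This is a simplified illustration: the table and trigger names are hypothetical, and both tables live in one database, whereas in a real deployment the sources are separate DBMSs and the administrative question raised above still applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_projects (name TEXT PRIMARY KEY, cost REAL);
    CREATE TABLE warehouse_projects (name TEXT PRIMARY KEY, cost REAL);

    -- Propagate inserts from the source table to the warehouse copy.
    CREATE TRIGGER propagate_insert AFTER INSERT ON source_projects
    BEGIN
        INSERT OR REPLACE INTO warehouse_projects (name, cost)
        VALUES (NEW.name, NEW.cost);
    END;

    -- Propagate updates the same way.
    CREATE TRIGGER propagate_update AFTER UPDATE ON source_projects
    BEGIN
        INSERT OR REPLACE INTO warehouse_projects (name, cost)
        VALUES (NEW.name, NEW.cost);
    END;
""")

conn.execute("INSERT INTO source_projects VALUES ('P1', 100.0)")
conn.execute("UPDATE source_projects SET cost = 150.0 WHERE name = 'P1'")
warehouse_rows = conn.execute(
    "SELECT name, cost FROM warehouse_projects"
).fetchall()
```

After the insert and the update on the source table, the warehouse copy reflects the latest cost without any explicit write to it.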
![Page 21: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/21.jpg)
Auditing
Should the Warehouse be audited?
- Advantages: keep up-to-date information on access to the warehouse
- Disadvantages: may need to keep unnecessary data in the warehouse; may need a lower level of data granularity; may cause changes to the timing of data entry to the warehouse as well as backup and recovery restrictions
Need to determine the relationships between auditing the warehouse and auditing the databases
![Page 22: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/22.jpg)
Multilevel Security
Multilevel data models
- Extensions to the data warehouse model to support classification levels
Trusted Components
- How much of the warehouse should be trusted?
- Should the transformations be trusted?
Covert channels, inference problem
![Page 23: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/23.jpg)
Inference Controller
[Figure: components shown are User, Inference Controller, Secure Data Warehouse Manager, Secure Warehouse, and Secure DBMS A, Secure DBMS B, and Secure DBMS C, each with its own Secure Database.]
![Page 24: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/24.jpg)
Status and Directions
Commercial data warehouse vendors are incorporating role-based security (e.g., Oracle)
Many topics need further investigation
- Building a secure data warehouse
- Policy integration
- Secure data model
- Inference control
![Page 25: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/25.jpg)
Data Mining for Counter-terrorism
Data mining for non-real-time threats: gather data, build terrorist profiles, mine data, prune results
Data mining for real-time threats: gather data in real time, build real-time models, mine data, report results
![Page 26: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/26.jpg)
Data Mining Needs for Counterterrorism: Non-real-time Data Mining
Gather data from multiple sources
- Information on terrorist attacks: who, what, where, when, how
- Personal and business data: place of birth, ethnic origin, religion, education, work history, finances, criminal record, relatives, friends and associates, travel history, . . .
- Unstructured data: newspaper articles, video clips, speeches, emails, phone records, . . .
Integrate the data, build warehouses and federations
Develop profiles of terrorists, activities/threats
Mine the data to extract patterns of potential terrorists and predict future activities and targets
Find the “needle in the haystack” - suspicious needles?
Data integrity is important
Techniques have to SCALE
![Page 27: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/27.jpg)
Data Mining for Non Real-time Threats
[Figure: process flow with the steps: Data sources with information about terrorists and terrorist activities; Integrate data sources; Clean/modify data sources; Build profiles of terrorists and activities; Mine the data; Examine results / prune results; Report final results.]
![Page 28: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/28.jpg)
Data Mining Needs for Counterterrorism: Real-time Data Mining
Nature of data
- Data arriving from sensors and other devices: continuous data streams
- Breaking news, video releases, satellite images
- Some critical data may also reside in caches
Rapidly sift through the data and discard unwanted data for later use and analysis (non-real-time data mining)
Data mining techniques need to meet timing constraints
Quality of service (QoS) tradeoffs among timeliness, precision and accuracy
Presentation of results, visualization, real-time alerts and triggers
![Page 29: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/29.jpg)
Data Mining for Real-time Threats
[Figure: process flow with the steps: Data sources with information about terrorists and terrorist activities; Integrate data sources in real time; Rapidly sift through data and discard irrelevant data; Build real-time models; Mine the data; Examine results in real time; Report final results.]
![Page 30: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/30.jpg)
Data Mining Outcomes and Techniques for Counter-terrorism
Association: John and James often seen together after an attack
Link Analysis: Follow chain from A to B to C to D
Clustering: Divide population; people from country X of a certain religion; people from country Y interested in airplanes
Classification: Build profiles of terrorists and classify terrorists
Anomaly Detection: John registers at flight school, but does not care about takeoff or landing
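Of the techniques listed, link analysis ("follow chain from A to B to C to D") maps most directly to code. The sketch below uses breadth-first search over an adjacency list; the graph, node names, and function name are hypothetical, and real link analysis tools work over far larger and noisier data.

```python
from collections import deque

def find_chain(links, start, goal):
    """Breadth-first search for a shortest chain of links from start to goal.

    Returns the chain as a list of nodes, or None if no chain exists.
    """
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            chain = []
            while node is not None:
                chain.append(node)
                node = previous[node]
            return chain[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical directed links between entities.
links = {"A": ["B"], "B": ["C", "E"], "C": ["D"], "E": []}
chain = find_chain(links, "A", "D")
```

The `previous` map doubles as the visited set, so each node is expanded at most once.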
![Page 31: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/31.jpg)
Example Success Story - COPLINK
COPLINK developed at University of Arizona
- Research transferred to an operational system currently in use by Law Enforcement Agencies
What does COPLINK do?
- Provides integrated system for law enforcement; integrating law enforcement databases
- If a crime occurs in one state, this information is linked to similar cases in other states
- It has been stated that the sniper shooting case may have been solved earlier if COPLINK had been operational at that time
![Page 32: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/32.jpg)
Where are we now?
We have some tools for
- building data warehouses from structured data
- integrating structured heterogeneous databases
- mining structured data
- forming some links and associations
- information retrieval tools
- image processing and analysis
- pattern recognition
- video information processing
- visualizing data
- managing metadata
![Page 33: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/33.jpg)
What are our challenges?
Do the tools scale for large heterogeneous databases and petabyte-sized databases?
Building models in real time; need training data
Extracting metadata from unstructured data
Mining unstructured data
Extracting useful patterns from knowledge-directed data mining
Rapidly forming links and associations; get the big picture for real-time data mining
Detecting/preventing cyber attacks
Mining the web
Evaluating data mining algorithms
Conducting risk analysis / economic impact
Building testbeds
![Page 34: Data and Applications Security Developments and Directions](https://reader035.vdocuments.mx/reader035/viewer/2022062305/56815177550346895dbfb21b/html5/thumbnails/34.jpg)
IN SUMMARY:
Data mining is very useful for solving security problems
- Data mining tools could be used to examine audit data and flag abnormal behavior
- Much recent work in intrusion detection (unit #18), e.g., neural networks to detect abnormal patterns
- Tools are being examined to determine abnormal patterns for national security: classification techniques, link analysis
- Fraud detection: credit cards, calling cards, identity theft, etc.
BUT CONCERNS FOR PRIVACY
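The first point in the summary, examining audit data to flag abnormal behavior, can be illustrated with a deliberately simple statistical sketch. It is a stand-in for the neural-network and classification approaches the slides mention, not an implementation of them: the audit counts, the z-score rule, and the threshold are all assumptions.

```python
from statistics import mean, pstdev

def flag_abnormal(counts, threshold=2.0):
    """Return users whose count is more than `threshold` population
    standard deviations above the mean."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every user behaves identically; nothing stands out
    return sorted(
        user for user, count in counts.items()
        if (count - mu) / sigma > threshold
    )

# Hypothetical audit summary: failed login attempts per user.
failed_logins = {"u1": 2, "u2": 3, "u3": 1, "u4": 2, "u5": 3, "u6": 2, "u7": 40}
flagged = flag_abnormal(failed_logins)
```

Even this small example touches the privacy concern noted above: flagging requires collecting and retaining per-user activity data.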