TRANSCRIPT
Panel #2 “Big Data: Application Security and Privacy”
Copyright 2013 FUJITSU LABORATORIES OF AMERICA
Moderator:
Keith Swenson, VP of Research and Development, Fujitsu America, Inc.
Panelists:
Taka Matsutsuka, Researcher, Fujitsu Laboratories of Europe, Ltd.
Praveen Murthy, Member of Research Staff, Fujitsu Laboratories of America, Inc.
Arnab Roy, Member of Research Staff, Fujitsu Laboratories of America, Inc.
2:15 PM – 3:00 PM
Big Data: Application Security and Privacy
Keith Swenson
Vice-President of Research and Development
Fujitsu America, Inc.
CSA Big Data Working Group
Arnab Roy
Software Systems Innovation Group
Fujitsu Laboratories of America, Inc.
BDWG Organization
Big Data Working Group (60+ members)
• Data analytics for security
• Privacy preserving/enhancing technologies
• Big data-scale crypto
• Cloud Infrastructures' Attack Surface Analysis and Reduction
• Framework and Taxonomy
• Policy and Governance
• Top 10
• Legal Issues
https://basecamp.com/1825565/projects/511355-big-data-working
CSA BDWG Plan
The Big Data Working Group (BDWG) will identify scalable techniques for data-centric security and privacy problems.
BDWG's investigation is expected to:
• Crystallize best practices for security and privacy in big data
• Help industry and government adopt those best practices
• Establish liaisons with SDOs to influence big data security and privacy standards
• Accelerate the adoption of novel research aimed at security and privacy issues
Identify new and fundamentally different technical and organizational problems in big data security and privacy.
Summarize state of the art, propose best practices, and identify gaps
Establish liaison with NIST (US), ENISA (EU) and Participate in PAPs, SDOs
Execute research plans based on funding and IP
Report on outcomes of BDWG research
https://cloudsecurityalliance.org/research/big-data/
Timeline: 9/12, 12/12, 3/13, 3/14
First Milestone: Identified Top 10 Challenges
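[Figure: big data ecosystem diagram, with labels including Public/Private/Hybrid Cloud and Data Storage, annotated with the numbers of the challenges that apply to each component: 5, 7, 8, 9; 1, 3, 5, 6, 7, 8, 9, 10; 4, 8, 9; 4, 10; 2, 3, 5, 8, 9]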
1) Secure computations in distributed programming frameworks
2) Security best practices for non-relational datastores
3) Secure data storage and transactions logs
4) End-point input validation/filtering
5) Real time security monitoring
6) Scalable and composable privacy-preserving data mining and analytics
7) Cryptographically enforced access control and secure communication
8) Granular access control
9) Granular audits
10) Data provenance
Initial Set of Topics in Big Data Crypto
1) Communication protocols
2) Access policy based encryption
3) Big data privacy
4) Key management
5) Data integrity and poisoning concerns
6) Searching / filtering encrypted data
7) Secure data collection/aggregation
8) Secure collaboration
9) Proof of data storage
10) Secure outsourcing of computation
[Figure: topics grouped by number: 2, 5, 7, 10; 3, 6, 8, 9; 2, 3, 6; 1, 3, 6]
Attack Surface Reduction For Big Data Infrastructure
Praveen Murthy
Fujitsu Laboratories of America
Big Data Security
New security challenges of big data
• Public cloud environment, coupled with
• Big data characteristics: Volume, Velocity, Variety
Increased Attack Surface
• Big Data is based on commodity cloud architecture
• Demands more cloud infrastructure services to be exposed
• More APIs exposed for attack
• Need to identify unused services/APIs and block them from access
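The last bullet can be sketched as a set difference between what is exposed and what is actually called. This is a minimal illustration only: the endpoint names and the observed-call list are hypothetical, and a real deployment would derive them from its own API gateway inventory and access logs.

```python
from typing import Iterable, List, Set


def find_unused_apis(exposed: Set[str], observed_calls: Iterable[str]) -> List[str]:
    """Return exposed APIs that never appear in the observed calls."""
    used = set(observed_calls) & exposed  # ignore calls to endpoints we do not expose
    return sorted(exposed - used)


def is_allowed(api: str, blocked: Iterable[str]) -> bool:
    """A request is allowed only if its API is not on the block list."""
    return api not in set(blocked)


# Hypothetical inventory and access-log sample (assumptions, not from the talk).
EXPOSED_APIS = {"/jobs/submit", "/jobs/status", "/files/create", "/vms/spawn", "/admin/debug"}
OBSERVED_CALLS = ["/jobs/submit", "/jobs/status", "/jobs/submit", "/files/create"]

blocked = find_unused_apis(EXPOSED_APIS, OBSERVED_CALLS)
print(blocked)
```

An observation window that is too short would block APIs that are merely rare, so the candidate list is better treated as input for review than as an automatic block list.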
Cloud attack surface taxonomy
Figure from: Gruschka et al., Attack Surfaces: A Taxonomy for Attacks on Cloud Services.
• Buffer overflow
• SQL injection
• Privilege escalation
• SSL certificate spoofing
• Phishing
• Resource exhaustion
• DoS
• Privacy attack
• Data integrity attack
• Data confidentiality attack
• Attacks on cloud control
• How much can the cloud learn about a user?
Attack surface as information flow
Potential for
• Dangerous data to flow from (untrusted) user to system
  - SQL injection, side channel attack, buffer overflow…
or
• Sensitive information to flow from system to unauthorized user
Information flow examples:
• User has read permissions on all files
• User can create files via APIs
• User can spawn multiple VMs via APIs
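The two flow directions above can be modeled as labeled records and checked with two rules. This is a minimal sketch under assumed labels: the `Flow` fields, the direction constants, and the "authorized auditor" entry are hypothetical and only illustrate the classification, not any tool described in the talk.

```python
from dataclasses import dataclass
from typing import Optional

USER_TO_SYSTEM = "user_to_system"
SYSTEM_TO_USER = "system_to_user"


@dataclass(frozen=True)
class Flow:
    description: str
    direction: str
    user_trusted: bool = False      # can the user's input be trusted?
    user_authorized: bool = False   # may the user see sensitive output?
    sensitive: bool = False         # does the flow carry sensitive data?


def classify(flow: Flow) -> Optional[str]:
    """Return a risk label for the flow, or None if neither rule fires."""
    if flow.direction == USER_TO_SYSTEM and not flow.user_trusted:
        return "dangerous data may reach the system"
    if flow.direction == SYSTEM_TO_USER and flow.sensitive and not flow.user_authorized:
        return "sensitive information may reach an unauthorized user"
    return None


# The first three entries mirror the slide's examples; the last is a hypothetical benign flow.
FLOWS = [
    Flow("user has read permissions on all files", SYSTEM_TO_USER, sensitive=True),
    Flow("user can create files via APIs", USER_TO_SYSTEM),
    Flow("user can spawn multiple VMs via APIs", USER_TO_SYSTEM),
    Flow("authorized auditor reads audit log", SYSTEM_TO_USER,
         user_authorized=True, sensitive=True),
]

findings = {f.description: classify(f) for f in FLOWS}
for description, risk in findings.items():
    print(description, "->", risk)
```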
Cloud infrastructure elements and JavaScript
• Hadoop: software framework for Big Data
  - Microsoft HDInsight has a JavaScript API for Hadoop
• Node.js: interactive applications with real-time data
• D3.js: visualization of Big Data analytics
• Amazon EC2: cloud platform
  - Node.js library can communicate with AWS EC2 APIs (open source)
Attack Surface Analysis on Cloud Software
• To determine metrics based on the number of paths from user APIs to sensitive data/functions in cloud infrastructure code, using static analysis on JavaScript.
• To determine metrics based on higher-level audits of virtual images, hypervisors, ports, and host OSs.
[Figure: program paths run from APIs through infrastructure code to sensitive data and sensitive functions]
Attack surface: paths that access sensitive data/functions as a proportion of all paths
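The path-proportion metric can be illustrated on a toy call graph. This is a sketch under stated assumptions, not the static analysis described in the talk: the call graph, API names, and sensitive-node set are hypothetical, a "path" is taken to be an acyclic walk from an API entry point to a node with no further successors, and a path counts as sensitive if it touches any sensitive node.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Graph = Dict[str, List[str]]


def enumerate_paths(graph: Graph, start: str) -> Iterator[Tuple[str, ...]]:
    """Yield each acyclic path from start to a node with no unvisited successors."""
    stack = [(start, (start,))]
    while stack:
        node, path = stack.pop()
        successors = [n for n in graph.get(node, ()) if n not in path]
        if not successors:
            yield path
        for nxt in successors:
            stack.append((nxt, path + (nxt,)))


def attack_surface_ratio(graph: Graph, apis: Iterable[str], sensitive: Set[str]) -> float:
    """Paths touching a sensitive node, as a proportion of all paths from the APIs."""
    total = touching = 0
    for api in apis:
        for path in enumerate_paths(graph, api):
            total += 1
            if sensitive.intersection(path):
                touching += 1
    return touching / total if total else 0.0


# Hypothetical call graph extracted from infrastructure code (assumption).
CALL_GRAPH: Graph = {
    "api_list_files": ["check_auth", "format_listing"],
    "api_create_file": ["validate_name"],
    "api_spawn_vm": ["check_quota"],
    "check_auth": ["read_credentials"],
    "validate_name": ["write_disk"],
    "check_quota": ["start_hypervisor", "log_request"],
}
APIS = ["api_list_files", "api_create_file", "api_spawn_vm"]
SENSITIVE = {"read_credentials", "start_hypervisor"}

ratio = attack_surface_ratio(CALL_GRAPH, APIS, SENSITIVE)
print(f"attack surface ratio: {ratio:.2f}")
```

Here two of the five paths reach a sensitive node, so the ratio is 0.40. Exhaustive path enumeration grows quickly with graph size, so a real analysis would likely need pruning or summarization.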
Java 7 0-day: could attack surface analysis catch this?
import java.applet.Applet;
import java.awt.Graphics;
import java.beans.Expression;
import java.beans.Statement;
import java.lang.reflect.Field;
import java.net.URL;
import java.security.*;
import java.security.cert.Certificate;
import metasploit.Payload;
public class Exploit extends Applet {
    public Exploit() { }

    public void disableSecurity() throws Throwable {
        Statement localStatement = new Statement(System.class, "setSecurityManager",
                new Object[1]);
        Permissions localPermissions = new Permissions();
        localPermissions.add(new AllPermission());
        ProtectionDomain localProtectionDomain = new ProtectionDomain(
                new CodeSource(new URL("file:///"), new Certificate[0]), localPermissions);
        AccessControlContext localAccessControlContext = new AccessControlContext(
                new ProtectionDomain[] { localProtectionDomain });
        SetField(Statement.class, "acc", localStatement, localAccessControlContext);
        localStatement.execute();
    }
[Figure annotations: ::Applet, ::Statement, ::Permissions; Sandbox violation!!]
Copyright 2013 Fujitsu Laboratories of Europe Limited
BigGraph
Taka Matsutsuka,
Fujitsu Laboratories of Europe Limited
Business Problem – Fraud in Social Benefits
Costs ~200B yen (£1.6B) annually, in the UK alone
Hard to bridge and interconnect claims
- Heterogeneous formats
- Multiple councils
Difficult to adapt to changing requirements
- Dynamism of fraud techniques
[Figure: map labels Ealing, Westminster, Kent]
BigGraph – connects Big Data with relationships
A technology to enable analysis of Big Data with connections
This exhibit uses public sector claim analysis as its example (using data from the UK)
[Figure: individual files and individual systems (System A, System B, System C) are connected through rules and processes into a graph layer that gives an integrated view for analysis. Sample tables: System A has columns Month and Staff (May 2012 to July 2012); System B has columns Name, No, and Address.]
A graph from public sector
Analysed data from the UK public sector, rendered as a graph
Our Solution – BigGraph
Big Data Application Platform based on Graph Technology
• Graph that enables bridging and interconnection of data, to handle heterogeneity across multiple councils
• Locally embeddable algorithms to adapt dynamically to changing requirements
[Figure: users add new business logic to the BigGraph Platform, which links various data sources (claim1, claim2, claim3; home, school, phone; Leicestershire, Essex, Surrey) and surfaces events such as ID theft and anomalies]
New business logic is attached locally to the data and added to the graph on the fly,
e.g. Home.coordinate - School.coordinate > 60 miles
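The home-to-school rule can be sketched as a predicate attached to a claim node at run time. This is an illustration only and not BigGraph's API: the `Node` class, `attach_rule`, the claim names, and the coordinates are all hypothetical, and the distance is assumed to be great-circle distance computed with the haversine formula.

```python
import math
from typing import Callable, Dict, List, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two coordinates, in miles."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


class Node:
    """A graph node that carries its own data and locally attached rules."""

    def __init__(self, name: str, **attrs: Coordinate) -> None:
        self.name = name
        self.attrs: Dict[str, Coordinate] = attrs
        self.rules: List[Tuple[str, Callable[[Dict[str, Coordinate]], bool]]] = []

    def attach_rule(self, label: str, predicate: Callable[[Dict[str, Coordinate]], bool]) -> None:
        self.rules.append((label, predicate))  # added on the fly, no schema change

    def evaluate(self) -> List[str]:
        """Return the labels of all attached rules that fire for this node."""
        return [label for label, predicate in self.rules if predicate(self.attrs)]


def far_from_school(attrs: Dict[str, Coordinate]) -> bool:
    return haversine_miles(attrs["home"], attrs["school"]) > 60


# Hypothetical claims with approximate coordinates (assumptions).
claims = [
    Node("claim1", home=(52.6369, -1.1398), school=(51.2362, -0.5704)),
    Node("claim2", home=(51.7356, 0.4685), school=(51.7400, 0.4800)),
]
for claim in claims:
    claim.attach_rule("home-school distance > 60 miles", far_from_school)

flagged = [claim.name for claim in claims if claim.evaluate()]
print(flagged)
```

Keeping the rule on the node means a new fraud pattern can be added as one more predicate without touching the rest of the graph.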