Big Machine Data - Two Exemplary Applications in China
Jianmin Wang
Tsinghua University
Beijing, China
Agenda
• Background
• Two Exemplary Applications in China
Who are we?
• Institute for Data Science,
Tsinghua University
• Founded in April 2014
• Missions & Status Quo
– Recruiting world-class researchers and engineers from industry and academia
– Long-term dedication to system research and industry practice
– Leading China’s big data strategy, especially for industrial big data
Big Data Landscape
• People generated
• Computer generated
• Machine generated
Machine Generated Data
• Broadly exists in:
– Industrial business
– Agriculture
– Utilities
– Military
– Smarter City
– Logistics
– Smart devices
– Scientific research
Data Rate
24/7, up to a million data points/s, from millions of devices
Data Type
Mostly time series, temporal sequences, spatio-temporal data, and array data
Data Usage
Real-time processing.
From monitoring to content-, shape-, and signal-based query and analysis
Industrial businesses have entered the era of “big data”, but the real challenge is
how to extract value from data.
Machine generated data is the core of industrial big data
Big machine data goes beyond the 3Vs
Our research spans big data lifecycle
1 Storage
2 Preprocessing
3 Access & Exploration
4 Modeling & Analytics
Agenda
• Background
• Two Exemplary Applications in China
1 Industrial Sensor Data Management: Cassandra at China Sany Group
2 Climate Data Management: Cassandra at China Meteorological Administration
© 2015. All Rights Reserved.
China Sany Group
More than 200K active engineering machines in more than 150 countries
SANY Group is a global company in the construction machinery industry.
In 2011, SANY became the only Chinese company in the construction machinery industry listed among the world's top 500 companies.
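Pipeline of Industrial Sensor Data Processing
[Figure: a loop of four steps: execute, collect, decision, transfer. On the machine, the Sany Motion Controller (SYMC), Sany Industrial Display (SYLD), and Sany Mobile Terminal (SYMT) in the product's main control cabinet produce vehicle operating-condition data as SCP protocol packets. The packets go through a wireless base station, pass from the wireless to the wired network, and reach a designated IP and port over the Internet. User computers serve rapid-response engineers, documentation engineers, service staff, and owners.]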
The data record the operational status of the machines
5000 kinds of sensors
50 billion records per year
Technology Roadmap in Sany Group
• 2008: start managing sensor data
• 2010: 60K machines
• 2012: 80K machines; only 6 months of data can be kept online
• 2014: >100K machines; all data online
• 2020: >500K machines; >10K users
Database migration: SQL Server ➡ Oracle, then Oracle ➡ Cassandra
Why Cassandra?
• Cost performance
• Scalability
• P2P architecture
Operation    Worst case    Average case
Write        30%           2x
Query        22.6%         10x
Software Stack of Sensor Data Management
Collect, Store, Analyze
Components shown: Storm, Cassandra, Map/Reduce
Structured Storage (device as row key, one column family per sensor, one column per received time):
row key    sensor1 (cf1)                       sensor2 (cf2)                       ...
           received time1   received time2     received time1   received time2     ...
device1    value            value              value            value              ...
device2    value            value              value            value              ...
...        ...              ...                ...              ...                ...
Cassandra Storage:
[Figure: for each sensor, a grid of machines (rows) by gather time (columns)]
Schema Design – Row and Column
• Use each sensor as a Column Family (CF)
• In each Column Family (CF)
– Use the machine ID as the row key
– Use the gather time as the column name
– Use the sensor value as the column value
– Columns of each row are stored in sorted order
– The number of columns can grow freely
[Figure: one grid of machine (rows) by gather time (columns) per sensor (sensor1, sensor2, ..., sensorN); ~5000 sensors give 5000+ column families]
Cassandra v1.2
CQL2 (not CQL3)
Why 5000+ Column Families?
• Cassandra v1.x with CQL2 does not support compound primary keys or clustering keys
– Packing several fields into the row key or column name means splitting them manually
– This makes programming more complex
• All the data in one SSTable belongs to a single CF
– When querying a specific sensor, we need not scan unnecessary data
Alternative layouts that require manual splitting:
Row Key                   Column Name
machine_id                sensor_id : gather_time
sensor_id : machine_id    gather_time
Cassandra v1.2, CQL2 (not CQL3)
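The layout above can be sketched as a toy in-memory model. This is only an illustration of the data shape (one column family per sensor, machine ID as row key, columns sorted by gather time), not the Sany implementation and not the Cassandra API; the class name `SensorStore` and its methods are hypothetical.

```python
import bisect
from collections import defaultdict


class SensorStore:
    """Toy model: one column family per sensor, one row per machine."""

    def __init__(self):
        # sensor_id -> machine_id -> sorted list of (gather_time, value)
        self._cfs = defaultdict(lambda: defaultdict(list))

    def write(self, sensor_id, machine_id, gather_time, value):
        row = self._cfs[sensor_id][machine_id]
        # Keep the columns of the row sorted by gather_time.
        bisect.insort(row, (gather_time, value))

    def time_slice(self, sensor_id, machine_id, start, end):
        """Return (gather_time, value) pairs with start <= gather_time < end."""
        row = self._cfs[sensor_id][machine_id]
        # A 1-tuple (t,) sorts before every (t, value), so it marks the
        # first column whose gather_time is >= t.
        lo = bisect.bisect_left(row, (start,))
        hi = bisect.bisect_left(row, (end,))
        return row[lo:hi]


store = SensorStore()
store.write("oil_temp", "machine-1", 30, 71.5)
store.write("oil_temp", "machine-1", 10, 70.0)
store.write("oil_temp", "machine-1", 20, 70.8)
store.write("engine_rpm", "machine-1", 10, 1500)

# Only the "oil_temp" column family is touched by this query.
window = store.time_slice("oil_temp", "machine-1", 10, 30)
```

Because each sensor has its own column family, a query for one sensor never reads another sensor's rows, which mirrors the SSTable argument on the slide.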
Challenge 1 – Creating Schema Hang
• Problem
– Create 5000+ CFs in batch
– Creation cost increases dramatically
[Figure: time cost (ms, axis 0 to 15000) versus CF serial number; annotations: "Create 1 CF: 0.1s" and "Create 1 CF: 10s"]
• Root Cause
– Protocol conflict
• Between the Gossip protocol and the request propagation mechanism
– Message overhead
• May transmit the whole schema instead of only the changed part
Memory used by Gossip
Node    ReceiveSchemaMessage memory cost    SendSchemaMessage memory cost    Total
N1      4.465G                              4.236G                           8.70G
N2      4.308G                              4.907G                           9.21G
N3      4.236G                              4.024G                           8.26G
N4      4.808G                              4.387G                           9.19G
N5      6.111G                              6.373G                           12.48G
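A back-of-the-envelope sketch of the message overhead: if every CF creation causes the whole schema to be sent to a peer, the number of CF definitions transferred grows quadratically with the number of CFs, whereas sending only the changed part grows linearly. The function below is a hypothetical counting model under that assumption (one message per creation, one peer); it does not measure real Cassandra traffic.

```python
def total_cf_definitions_sent(num_cfs: int, send_whole_schema: bool) -> int:
    """Count CF definitions sent to one peer while creating num_cfs CFs one by one."""
    total = 0
    for existing in range(1, num_cfs + 1):
        # After the k-th creation the schema holds k CFs.
        total += existing if send_whole_schema else 1
    return total


whole = total_cf_definitions_sent(5000, send_whole_schema=True)   # 5000 * 5001 / 2
delta = total_cf_definitions_sent(5000, send_whole_schema=False)  # 5000
```

Under this model, 5000 creations transfer about 12.5 million CF definitions per peer when the whole schema is sent, against 5000 when only the change is sent.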
Challenge 1 – Creating Schema Hang
• Solution: Adaptive Lazy Gossip
– Gossip takes effect only when:
• Propagation messages are lost or time out
• Nodes recover from a failure
– Creation time cost stays constant
[Figure: Adaptive Lazy Gossip in four steps. Schema changes are propagated directly; in the gossiped node metadata (LOAD, STATUS, SCHEMA, VERSION, ...), the SCHEMA entry is delayed by T seconds.]
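The two conditions listed in the solution can be written as a small decision function. This is a hedged sketch only: the function name, its parameters, and the way the delay T is applied are assumptions made for illustration, not the actual patch deployed at Sany.

```python
def should_gossip_schema(push_acked: bool,
                         seconds_since_push: float,
                         delay_t: float,
                         peer_just_recovered: bool) -> bool:
    """Decide whether schema state should be exchanged through gossip.

    Assumed rule: direct propagation is the normal path; gossip steps in
    only for a peer that recovered from a failure, or when the direct
    push is still unacknowledged after delay_t seconds.
    """
    if peer_just_recovered:
        return True
    return (not push_acked) and seconds_since_push >= delay_t
```

With a rule of this shape, a batch of CF creations that propagates normally never triggers schema gossip, which is one way the creation cost could stay flat.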
Challenge 2 – Balancing Consistency & Throughput
• Production environment
– Sany production: 5-node cluster, 2x4 cores, 64GB memory
• Problem
– Throughput = 200K data points/sec
– 75% of the data is written successfully to only one replica, while the other replicas are stale (inconsistent)
• Cassandra is NOT very consistent
• Big obstacle for query operations
– Repair is required, but is very slow
[Figure: experiment on Amazon EC2 (5 nodes, 2 cores, 8GB); rywc = read-your-writes consistency]
Challenge 2 – Balancing Consistency & Throughput
• Root Cause of slow Repair
– Too many column families (5000+)
– Too many ranges in the consistent hashing ring
• 256 virtual nodes (VN) per physical node
• Too many Merkle trees (ranges x CFs)
• Experience and Suggestions
– Repair CFs and ranges one by one
• Do not repair the whole keyspace (all CFs) at once
– Repair the important CFs first
– Perform repair under light workload
[Figure: ring with 5 physical nodes, each with 2 VNs, 10 ranges in total. For each range and each CF, a Merkle tree is created and compared between two nodes.]
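The "ranges x CFs" product can be made concrete with a short calculation. The sketch below counts one Merkle tree per (range, CF) pair and deliberately ignores the replication factor, so it is a lower bound on the work rather than an exact figure; the function name is hypothetical.

```python
def merkle_trees_per_full_repair(physical_nodes: int,
                                 vnodes_per_node: int,
                                 column_families: int) -> int:
    """One Merkle tree per (range, CF) pair; replication is ignored."""
    ranges = physical_nodes * vnodes_per_node
    return ranges * column_families


# The small ring on the slide: 5 nodes x 2 VNs = 10 ranges, for a single CF.
slide_example = merkle_trees_per_full_repair(5, 2, 1)

# The production setting: 5 nodes, 256 VNs each, at least 5000 CFs.
production = merkle_trees_per_full_repair(5, 256, 5000)
```

With 5 nodes, 256 VNs per node, and 5000 CFs, a whole-keyspace repair involves at least 6.4 million trees, which is why repairing CF by CF and range by range is the suggested practice.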
Challenge 3 – Heterogeneous Nodes
• Problem
– How to assign the data partitions in a heterogeneous cluster?
• Experimental Study
– Deploy a heterogeneous cluster
• 2 powerful servers and 8 PCs
– Throughput performance
• Heterogeneous cluster compared with a cluster of only the 2 powerful servers
Assign the positions of the nodes (i.e. tokens) in the ring according to their computing capacities
Challenge 3 – Heterogeneous Nodes
• Root Cause
– The replica mechanism complicates the imbalance problem
• Each node's configuration may affect other nodes' performance
– The Virtual Node (VN) mechanism cannot fit all scenarios
• Too many VNs make the lookup table too big and slow down repair
• Max #VNs in a physical node is 1536 (restricted by Cassandra source code)
The capacity of N1 is the worst, and E is short
But N1 is responsible for many data records
For an operation sent to the cluster:
• N5 finishes the operation quickly
• But N5 has to wait for N1, which is slow
Challenge 3 – Heterogeneous Nodes
• Solution
– Initialize the cluster properly
• Use Quadratic Programming (QP) to find the best positions of the (virtual) servers
• Has been deployed successfully at China Sany Group
– Scaling out the cluster
• Use a dynamic algorithm to find the best position for the newly added server
Scaling out: find the best position
Optimize:
1. the order of the nodes in the ring
2. the range lengths in the ring
Scaling out: Xiangdong Huang, Jianmin Wang et al. Optimizing Data Partition for Scaling out NoSQL Cluster. Concurrency and Computation: Practice and Experience (Early View)
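As a much simpler stand-in for the QP formulation, the sketch below places tokens so that each node's primary range is proportional to an integer capacity weight. It ignores replication, virtual nodes, and node ordering, all of which the QP approach and the cited paper address, so it only illustrates the idea of capacity-aware token placement. The function name, the capacity weights, and the tiny ring size are hypothetical.

```python
def assign_tokens(capacities: dict[str, int], ring_size: int) -> dict[str, int]:
    """Return node -> token; a node owns the range that ends at its token.

    Range lengths are proportional to the integer capacity weights.
    """
    total = sum(capacities.values())
    tokens = {}
    cumulative = 0
    for node, capacity in capacities.items():
        cumulative += capacity
        # Integer arithmetic keeps the split exact; the last token is ring_size - 1.
        tokens[node] = cumulative * ring_size // total - 1
    return tokens


# Two powerful servers (weight 4) and two PCs (weight 1) on a ring of 100 positions.
tokens = assign_tokens({"server1": 4, "server2": 4, "pc1": 1, "pc2": 1}, ring_size=100)
```

Here each server owns 40 positions and each PC owns 10, so a slow node receives a proportionally smaller share of the primary data.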
Datasets & Results in China Sany Group
• 5000+ column families for sensor data
• 100K+ engineering machines
• Amount of historical data loaded
– From 2012.4 to now
• Data size
– Tens of billions of operational status records
– Several billion GPS records
• Write throughput
– 5 nodes (2*4 CPU cores, 64GB memory, 9TB disk)
– 20K TPS as regular workload, 200K at peak
Industrial Big Data Platform: More Requirements Beyond Sany Applications
• Even higher throughput
– High frequency sensors
– High volume sensors
– 10+ M data points/second
• Native time-series query
– Time and value based query
– Richer set of analytical queries
– <1 second response
• Synchronization
– Edge synchronization
– Compression, out-of-order, retransmission
• Adaptive deep compression
– Different data, different algorithms
– Transparent to query
– Deep compression of historical data
• Moving object support
– Spatial-temporal index
– Trajectory based queries
Industrial Data Analysis Pipeline
[Figure: specified operational status data (Boolean values, status values, analogue values; 1.046 billion) feed basic indicators (8030), then baseline (1.046 billion) and variance, yielding common features, specific features, and outliers]
Basic indicators, each with a baseline and a variance:
• General: count, frequency, ...
• Analogue: average, variance, extremum, ...
• Boolean: times, duration, ...
• States: change times, duration, ...
Analyses for Service, Quality Control, and R&D:
• Driver profile
• Hydraulic oil temperature analysis
• Temporal parameter analysis for vehicle start
• Parameter correlation
• Spatial analysis for failure
• Key components anomaly detection
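The basic indicators named above (average, variance, and extremum for analogue signals; times and duration for Boolean signals) could be computed roughly as follows. This is a minimal sketch under assumptions: the function names are hypothetical, a Boolean state is assumed to hold until the next sample, and the baseline comparison is a plain standard-score that may differ from what the slide means by "baseline" and "variance".

```python
from statistics import mean, pvariance


def analogue_features(values):
    """Average, variance, and extremum of an analogue sensor series."""
    return {"average": mean(values), "variance": pvariance(values),
            "min": min(values), "max": max(values)}


def boolean_features(samples):
    """Times (off-to-on switches) and duration (total on-time).

    samples is a time-ordered list of (timestamp, is_on) pairs. Each state
    holds until the next sample; the final sample only closes the last interval.
    """
    times = 0
    duration = 0
    previous_on = False
    for (t, is_on), (t_next, _) in zip(samples, samples[1:]):
        if is_on and not previous_on:
            times += 1
        if is_on:
            duration += t_next - t
        previous_on = is_on
    return {"times": times, "duration": duration}


def deviation_from_baseline(value, baseline_mean, baseline_std):
    """Distance of a feature value from the baseline, in baseline standard deviations."""
    return (value - baseline_mean) / baseline_std


analogue = analogue_features([1, 2, 3, 4])
switch = boolean_features([(0, False), (10, True), (25, False), (30, True), (40, False)])
```

A machine whose feature deviates strongly from the fleet baseline would then be a candidate outlier; the threshold for "strongly" is left open here.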
Industry Practice – Value-Added Analytics
Big Data Application 1: Concrete Pump Truck Tip-over Detection
Tip-over of a concrete pump truck is mainly caused by insufficient support from the leg cylinders, and it is a major production safety issue
Quickly spot and prevent dangerous operation through group behavior analysis of concrete pump trucks
[Figure: overall distribution of the horizontal (X-axis) and vertical (Y-axis) inclination angles of concrete pumps, with unstable and idle instances marked]
[Figure: inclination angle distribution of an individual concrete pump, after an inclination angle vibration level filter]
• Idle instances: unplugging operation leads to malfunctioning
• Unstable instances: early degradation pattern of the cylinder
• Typical instances: stable oscillation
Data-driven detection of anomalies and potential accidents
Big Data Application 2: Fault Diagnosis
Via time series pattern analysis and spatial correlation, the leakage problem of the master cylinder was found to be highly correlated with a high-speed rail construction project (Hangzhou-Shenzhen high-speed rail).
Investigation proved that the salt-spray environment and the water quality along the seaside caused the corrosion of the cylinder's potted component.
Big Data Application 3: Spare Components Demand Forecasting
• Traditional approach is based on marketers' experience
• New approach
– Combines real-time data from machines, sales history, vehicle holdings, environment, GDP, etc.
• Result
– Inactive spare part inventory reduced by half
[Figure: number of spare parts (axis 0 to 250) per ten-day period (early, mid, and late month) from 2012/10 to 2013/6. Series: actual demand; results of multi-region collaborative spare components prediction based on matrix factorization; actual stock prepared by the enterprise.]
The predicted result fits the actual demand better
1 Industrial Data Management: Cassandra at China Sany Group
2 Climate Data Management: Cassandra at China Meteorological Administration
Pipeline of Climate Data Processing
[Figure: 1 Collection, 2 Transmission, 3 Access, 4 Browsing, connected through the data center and the Internet]
Characteristics of Climate Data
• Data sources: model, ground, aerological, satellite, radar, lightning, typhoon
• Model data (e.g. T639) variables: wind field, temperature field, humidity, rainfall, snowfall, ...
• Each variable (e.g. temperature) has levels: 800Pa, 850Pa, 900Pa, ...
• Each level has forecasts by start time and ageing: 8AM, 3h; 8AM, 6h; ...; 8PM, 3h; 8PM, 6h
Challenge in Meteorological Application: Pattern Data
• Hierarchical pattern (model) data + flat other data
• A highly efficient data delivery system for end users
– Support millions of small files
– Access data fast
– Scan data in various orders
• Performance requirement
– Get ~1MB of data in 50ms
– 600 concurrent clients
[Figure: directory tree with depths d1 to d5: / → pattern (T639, ...) → variable (wind, temper, ...) → level (800, 850, 900) → time (2014.2.18.08, 2014.2.18.20, 2014.2.19.08) → ageing (3, 6, 9, ...); the leaves t1 to t7 are the data files]
Why Cassandra?
• Scalability
• Fast read/write of data
• Some columns are sorted
– Easy to scan data sequentially
• Time-based compaction (>= Cassandra v2.0) for time series
key                       3h      6h      9h      …
T639/temperature/800Pa    file    file    file    …
1. Get the data where key = 'T639…/800Pa'
2. Retrieve the data before 6h, or retrieve the data after 6h
key                       3h      6h      9h      12h     …    3h      6h      9h
T639/temperature/800Pa    file    file    file    file    …    file    file    file
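The two-step access pattern (look up the row by key, then take the columns before or after a given ageing) relies on the columns being sorted. A minimal sketch with plain Python data structures, not the Cassandra API; the `rows` dictionary, the file names, and `split_at_ageing` are hypothetical.

```python
import bisect

# Hypothetical row: columns are (ageing in hours, file reference), sorted by ageing.
rows = {
    "T639/temperature/800Pa": [(3, "f_3h"), (6, "f_6h"), (9, "f_9h"), (12, "f_12h")],
}


def split_at_ageing(columns, ageing):
    """Return (columns before ageing, columns after ageing), excluding ageing itself."""
    ageings = [a for a, _ in columns]
    lo = bisect.bisect_left(ageings, ageing)   # first column with ageing >= given value
    hi = bisect.bisect_right(ageings, ageing)  # first column with ageing > given value
    return columns[:lo], columns[hi:]


columns = rows["T639/temperature/800Pa"]       # step 1: get the data by key
before_6h, after_6h = split_at_ageing(columns, 6)  # step 2: slice around 6h
```

Because the columns are already in order, both slices are contiguous, so a sequential scan is enough and no filtering over the whole row is needed.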
Solution – Schema Design for Pattern Data
• Data items
– 5-tuple (pattern, variable, level, time, ageing)
– Pattern and variable are unordered
– Level, time, ageing are ordered
[Figure: for each (pattern, variable), the data space spanned by level, time, and ageing is mapped onto ColumnFamily, row key, and column]
Performance Results
• 10 servers: 2*4 CPU cores, 64GB memory, 9TB SAS disk
• Store 7 kinds of model data
– 16TB per day
• Get data quickly
– 100 times faster than the older system
Thank you