Challenges of Big Data Visualization in Internet-of-Things Environments
Doaa Mohey Eldin
Information Systems Department
Faculty of Computers and Information
Cairo, Egypt
Aboul Ella Hassanien
Information Technology Department
Faculty of Computers and Information
Cairo, Egypt
Ehab Ezat
Information Systems Department
Faculty of Computers and Information
Cairo, Egypt
Abstract— Big data is involved in every process of our
lives. However, analyzing huge volumes of data is not
sufficient on its own: the human brain detects patterns more
efficiently when data is presented visually. Data
visualization and data analytics play a significant role in
decision making in various sectors, and they open new
opportunities for research in big data visualization. An
essential challenge in data visualization is the huge amount
of data, whether in real time or in static form. This paper
discusses the importance of data visualization and
illustrates the impact of big data on it. It focuses on
explaining the essential challenges of big data
visualization for real-time streams.
Keywords— Big data, Data visualization, challenges,
Internet-of-Things, Interactive visualization.
I. INTRODUCTION
The rate of data growth has increased exponentially within
a few years due to several factors, such as the Internet of
Things (IoT) and sensors. The Internet of Things [1] has
become one of the important new computing concepts. It
describes smart objects [2] that are connected via the
internet and can simulate the physical world and its objects.
It provides pattern recognition and device identification
methods by means of sensors. These sensors produce big data
[3] that differs in volume, variety of types, and
transmission velocity. The data extracted from these sensors
therefore needs to be managed, fused, and visualized, as
shown in figure.1.
Figure.1: The Basic Architecture of the Internet-of-Things Sensors Big
Data Processes (sensor data management in the IoT network, big data
fusion and analytics, data visualization).
The visualization of big data has therefore recently become
a hot area of research. These sensors serve various domains,
such as managing patient data remotely in telemedicine [4],
fusing [5] student profile data in video conferences [6] and
E-learning, and observing traffic flow [7] continuously. Such
data must be saved every second, which requires huge storage
for the massive amount of data. The data cannot be shown to a
user such as a doctor, professor, or police officer as tables
and raw data; it needs to be analyzed, interpreted, and
presented in meaningful ways. The main challenge of Big Data
[8] lies in capturing, storing, analyzing, sharing,
searching, and visualizing data. Visualizing these data is
thus a very important process for handling emergency cases or
observing the evolution of students, patients, streets, and
so on. One of the major aspects of Big Data analysis is that
we can find interesting patterns in huge data sets, but the
result of the analysis is usually raw numbers, and it is very
hard to explain anything from those numbers alone. If those
numbers are represented visually, it becomes much easier for
our brain to find meaningful patterns and make decisions
accordingly.
Data visualization [9] is certainly not a new thing; it has
been around for centuries. Data visualization is an easy and
quick way to convey messages and represent complex things.
Humans are adapted to find patterns in everything we see.
Since data is mounting at such a massive rate, the
traditional ways of presenting data are obsolete. Compared to
traditional data.
This paper discusses the concept of data visualization and
its main architecture. It explains the importance of data
visualization and its vital role in decision making. It
presents the essential challenges of data visualization and
their relationship with big data. It can support researchers
in selecting a topic from the open research directions of
data visualization.
The rest of this paper is organized as follows: Section II
presents the state of the art. Section III describes data
visualization methodologies. Section IV presents the
benefits of data visualization. Section V presents the
challenges of data visualization together with recent
research and tools. Finally, Section VI concludes the paper
and proposes directions for future work.
II. THE STATE OF THE ART
This section discusses the relationship between big data
and data visualization. Data visualization [9, 10, 11] aims
to implement a tool for visualizing data. Such a tool relies
on statistical models and on several attributes such as
color, distribution, and graph style [12] (pie chart, bar
chart, etc.). Implementing effective data visualization
solutions for Big Data has to take into account, apart from
the volume of the data, other intrinsic constraints generated
by the typical characteristics of Big Data [13]: real-time
changes, extreme variety of sources (multi-source
integration), and different levels of data structuring.
Furthermore, synchronous visualization techniques are
recommended to better illustrate relationships among large
amounts of data. These techniques improve decisions in the
following settings. Data in motion refers to the analysis of
streaming data to enable decisions within fractions of a
second. Data at scale refers to volumes from petabytes
(10^15 bytes) to exabytes (10^18 bytes). Data in several
formats covers structured, unstructured, text, and multimedia
data. Complex information spaces arise when data items are
difficult to compare based on raw data, or when the data is a
compound of several base data types. Three elements are
critical in applying visual analytics to extreme-scale data
and complex information spaces: size, the inclusion of both
visual and analytical means, and the active involvement of a
human. Complexity and flatness: the world is complex,
dynamic, and multi-dimensional.
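The notion of data in motion described above can be illustrated with a small sketch. It assumes a single numeric sensor stream and a fixed window size; the class name `SlidingWindowMean` and the window length are hypothetical choices for illustration, not part of any tool cited in this paper. The point is that each new reading updates the summary in constant time, so a chart can be refreshed within a fraction of a second.

```python
from collections import deque


class SlidingWindowMean:
    """Keep the mean of the most recent `size` readings of a stream."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, value):
        """Add one reading and return the mean of the current window."""
        # When the window is full, the oldest reading is about to be evicted,
        # so remove its contribution from the running total first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


if __name__ == "__main__":
    smoother = SlidingWindowMean(size=3)
    for reading in [1, 2, 3, 4]:
        print(reading, smoother.add(reading))
```

Only the window itself is kept in memory, which is one simple way to summarize a stream without storing every reading.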
Figure.2: The Relationships between Big Data and Data Visualization
Characteristics
Big data has several characteristics: validity, variety,
volume, and velocity. Volume and variety refer to the size of
the data and its different types, and velocity refers to the
speed at which the data changes. Availability means that the
data remains available over time. Visualization is also
considered one of the characteristics of big data, because an
enormous amount of data needs to be summarized into valuable
graphs that present results to users easily. Data
visualization faces several characteristics of big data, such
as context-aware properties that depend on the type of input
and analytics data (video, audio, image, or text). This
property requires handling the various topologies and
converting them into one topology structure in order to
interpret the data accurately. Data visualization is very
sensitive [14] to the users of the targeted data fusion,
which means that visualization usually relies on a specific
domain. It can also deal with changes that occur
simultaneously in different attributes.
The target of data visualization is the visual
representation of rich sensor data in the Internet of Things.
This representation requires several measurement techniques
to simulate the physical world. From this relationship it
follows that data visualization relies on the types and
motion of big data.
Figure.3: The Data visualization lifecycle based on the Big
Data.
Data visualization plays an effective role in visualizing
valuable data and detecting outliers [15] in big data. This
role involves the essential processes shown in figure.3:
identifying goals and users, followed by data cleaning,
integration, fusion, and analytics.
Data visualization has four layers of processing:
Figure.4: The Data Visualization Layers
The results of visualization express abstract, useful,
clear, and synthetic data. These automatic analysis results
need to be summarized graphically. A good graph conveys
valuable and meaningful data, so selecting the graph or
report type and its colors is very important for users.
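The four processing layers named in figure.4 (data preprocessing, identifying suitable graphs, results analytics, and outlier detection) can be sketched as a small pipeline. This is only an illustrative sketch: the function names, the graph-selection rules, and the use of a median-based (modified z-score) outlier test with a threshold of 3.5 are assumptions made here, not the method of any tool surveyed in this paper.

```python
from statistics import mean, median


def preprocess(readings):
    """Layer 1: drop missing values and convert the rest to floats."""
    return [float(r) for r in readings if r is not None]


def choose_graph(values, is_time_series):
    """Layer 2: pick a graph type using simple, hypothetical rules."""
    if is_time_series:
        return "line"
    return "bar" if len(values) <= 12 else "histogram"


def analyze(values):
    """Layer 3: summarize the cleaned values."""
    return {"count": len(values), "mean": mean(values),
            "min": min(values), "max": max(values)}


def detect_outliers(values, threshold=3.5):
    """Layer 4: flag values far from the median (modified z-score)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # all deviations are (nearly) identical; nothing to flag
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


def run_pipeline(readings, is_time_series=True):
    """Run the four layers in order and collect their outputs."""
    values = preprocess(readings)
    return {"graph": choose_graph(values, is_time_series),
            "summary": analyze(values),
            "outliers": detect_outliers(values)}


if __name__ == "__main__":
    # Hypothetical body-temperature readings with one missing value.
    print(run_pipeline([36.6, 36.7, None, 36.5, 36.8, 41.9, 36.6]))
```

The median-based test is used here because a single extreme reading distorts the mean and standard deviation, while the median stays stable.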
III. DATA VISUALIZATION METHODOLOGIES
There are three styles of big data visualization, as shown
in figure.5: data reduction [16], visual interaction [17],
and High Performance Computing (HPC) [18].
[Figure.2 labels: Data Visualization; Big Data]
[Figure.3 stages: Big Data; goals identification; user detection; data cleaning; data fusion/integration; data analytics; data visualization]
[Figure.4 layers: data preprocessing; identifying suitable graphs; results analytics; outlier detection]
Figure.5: Data Visualization Methodologies of big data
The first methodology is data reduction, which has several
motivations. Data reduction [16, 17] refers to the process of
transforming numerical or text data into a digitized form
that is ordered, accurate, and simplified. Wickham refers to
big data filtering and feature reduction. The authors of
[17, 18] proposed a visualization tool for molecular biology
that studies the behavior of a system in a data-driven way.
It depends on a probabilistic latent variable model to
produce data and measures. There exist many graph drawing
algorithms, including string analogy-based methods. Most of
them focus explicitly or implicitly on local properties of
graphs, drawing nodes linked by an edge close together while
avoiding overlap.
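As a minimal illustration of data reduction before plotting, the sketch below shrinks a long series to a fixed number of points by averaging consecutive buckets. Bucket averaging is a simple stand-in chosen here for illustration; it is not the probabilistic latent variable model or any specific technique from [16, 17], and the function name `downsample` is hypothetical.

```python
def downsample(values, max_points):
    """Reduce a series to at most `max_points` values by bucket averaging."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    n = len(values)
    if n <= max_points:
        return list(values)  # already small enough to plot directly
    reduced = []
    for i in range(max_points):
        # Integer arithmetic splits the series into near-equal buckets.
        start = i * n // max_points
        end = (i + 1) * n // max_points
        bucket = values[start:end]
        reduced.append(sum(bucket) / len(bucket))
    return reduced


if __name__ == "__main__":
    series = list(range(10))
    print(downsample(series, 5))
```

Averaging smooths away short spikes, so a reduction of this kind suits trend charts better than outlier displays.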
The second methodology is Interactive Visual Analysis (IVA)
[19], a group of techniques that integrate the computational
power of systems with the perceptual and cognitive strengths
of humans in order to extract knowledge from large and
complex datasets. The techniques rely heavily on user
interaction and the human visual system, and they lie at the
intersection of visual analytics and big data. IVA is a
branch of data visualization and is well suited to the
analysis of high-dimensional data [20, 21]. Star glyphs [22]
and parallel coordinates are examples of representations used
for visual interaction [21].
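Two steps commonly needed for a parallel coordinates view can be sketched as follows: scaling every dimension to a common range so the axes are comparable, and "brushing", where the user selects records by a value range on one axis. The record layout (a list of dictionaries) and the function names are assumptions for illustration only.

```python
def normalize_axes(records, dimensions):
    """Scale each dimension to [0, 1] so all axes share one vertical range."""
    lows = {d: min(r[d] for r in records) for d in dimensions}
    highs = {d: max(r[d] for r in records) for d in dimensions}
    scaled = []
    for r in records:
        row = {}
        for d in dimensions:
            span = highs[d] - lows[d]
            # A constant dimension has no spread; place it mid-axis.
            row[d] = (r[d] - lows[d]) / span if span else 0.5
        scaled.append(row)
    return scaled


def brush(records, dimension, low, high):
    """Return the records selected by a value range on one axis."""
    return [r for r in records if low <= r[dimension] <= high]


if __name__ == "__main__":
    # Hypothetical sensor records with two dimensions.
    data = [{"temp": 20, "humidity": 30},
            {"temp": 30, "humidity": 50},
            {"temp": 25, "humidity": 40}]
    print(normalize_axes(data, ["temp", "humidity"]))
    print(brush(data, "temp", 24, 31))
```

Each normalized record becomes one polyline across the axes, and the brushed subset is what the interface would highlight.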
High-performance computing (HPC) [18] is the third
methodology for data visualization. It is the use of
supercomputers and parallel processing techniques to solve
complex computational problems. HPC technology focuses on
developing parallel processing [22] algorithms and systems by
incorporating both administration and parallel computational
techniques. It relies on the divide-and-conquer methodology
and depends on parallel computation [23].
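The divide-and-conquer idea can be sketched on a single machine: the data is split into chunks, a partial histogram is computed for each chunk in parallel, and the partial results are merged. This sketch uses Python's standard `concurrent.futures` module as a small-scale stand-in for a real HPC system; the function names, the chunking scheme, and the bin width are illustrative assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_histogram(chunk, bin_width):
    """Conquer step: count the values of one chunk per bin (keyed by bin start)."""
    return Counter(int(v // bin_width) * bin_width for v in chunk)


def parallel_histogram(values, bin_width, workers=4):
    """Divide the data, compute partial histograms in parallel, then merge."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        widths = [bin_width] * len(chunks)
        for partial in pool.map(partial_histogram, chunks, widths):
            total.update(partial)  # merge step: add the partial counts
    return dict(total)


if __name__ == "__main__":
    print(parallel_histogram([1, 2, 11, 12, 13, 25], bin_width=10, workers=3))
```

In CPython, threads do not speed up CPU-bound counting; `ProcessPoolExecutor` from the same module offers the same `map` interface and could be swapped in for larger workloads.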
Recently, research has moved toward using machine learning
[24, 25] and deep learning algorithms [26][27] to understand
more features and to reduce the data under analysis in order
to reach the highest accuracy and performance.
IV. DATA VISUALIZATION BENEFITS
Data visualization is viewed by many disciplines as a
modern equivalent of visual communication. It is also viewed
as a branch of descriptive statistics. The main objective of
data visualization is to share and communicate knowledge and
information clearly and effectively through graphical means.
Visualization applications often utilize visualization
libraries; further, they provide an interface that allows
users to combine modules and set their attributes to produce
the desired results. Such a design is highly extensible: new
modules can be developed without modifying the core
infrastructure, and modules can be composed dynamically. As a
result, visualization libraries tend to have modules that
perform small, indivisible tasks that users can build up to
produce the exact visualization they want. The advantages of
data visualization [28, 29] are shown in figure.6:
Figure.6: The Data Visualization Advantages
According to figure.6, visualization is a critical element
in reducing the time needed to understand data.
Visualization applications are instrumental in realizing the
value of a Big Data initiative, because they minimize the
time required to know where opportunities, issues, and risks
reside in voluminous data.
Combining data analytics with visualization should deliver
meaningful, valuable, and informed data rapidly in order to
support users' objectives and decisions. The core of data
visualization relies on good analytics and anomaly detection
to advance business or individual targets.
[Figure.5 content: Data Visualization Methodologies: dimension reduction; visual interaction; high performance computing]
[Figure.6 data, benefits of data visualization (percentage rate): time saving 20%; self-service for users 34%; improved collaboration 41%; better big data analysis 43%; improved decision making 77%]
V. DATA VISUALIZATION CHALLENGES
Big Data can be difficult to turn into significant,
actionable results. According to figure.7, the challenges can
be summarized into six, one of which is the lack of expertise
in learning from data. Data visualization mostly works on a
specific domain, so an expert is needed to manage the
results. The statistical results must be meaningful enough to
support a decision. If the system has several users, it is
very demanding to make the data available to all of them
concurrently so that they can get the results of processes
such as searching, generating reports, and so on. The data
should be accompanied by analytics that reach the meaning of
the data rather than meaningless numbers. Helpful analytics
can feed into decision making: for example, if a patient has
low or high blood pressure, what can he do, and what will
happen in an emergency? Such a case requires contacting a
doctor or communicating with the nearest hospital or
ambulance to save the patient's life. Results may be similar
while their meanings differ, which makes them difficult to
understand. Users also need correct numbers at every point,
together with the right interpretation of these data.
Running queries, processes, and reports on these data takes
a long time, which is another challenge of data
visualization. Classical data visualization tools are
limited when faced with big data: they cannot interpret huge
data that changes continuously. They have worked to improve
data latency but still face performance problems. Any Big
Data visualization tool should be able to deal with
semi-structured and unstructured data, because big data
usually comes in varied formats and data types. Parallelism
alone is not a sufficient solution, because it tends to break
down frequently.
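One way to prepare semi-structured data for charting is to flatten nested records into rows with a shared set of columns. The sketch below assumes the sensor records arrive as nested dictionaries (for example, parsed JSON); the field names and the dotted-key convention are illustrative assumptions.

```python
def flatten(record, prefix=""):
    """Turn a nested dict into one flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))  # recurse into nested fields
        else:
            flat[name] = value
    return flat


def to_rows(records):
    """Flatten all records and align them on a shared, sorted column list."""
    flat_records = [flatten(r) for r in records]
    columns = sorted({key for r in flat_records for key in r})
    # Fields missing from a record become None so every row has equal length.
    rows = [[r.get(c) for c in columns] for r in flat_records]
    return columns, rows


if __name__ == "__main__":
    # Hypothetical sensor records with differing structure.
    sample = [{"id": 1, "reading": {"temp": 21.5}},
              {"id": 2, "reading": {"temp": 22.0, "unit": "C"}, "note": "ok"}]
    print(to_rows(sample))
```

Records with different fields end up in one rectangular table, which is the shape most charting code expects.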
The challenges [13, 29, 30] of big data visualization do
not concern the business industry only; they also motivate
researchers to improve results and tools. This paper
therefore presents open research problems of visualization to
support researchers in finding new open research areas and
directions.
Figure.7: the data visualization challenges
These challenges face data visualization during the
implementation of any visualization tool:
Figure.8: the open research challenges of data visualization
A. Context-awareness Data Model
One significant challenge [30, 31] to data models in
existing visualization software is that the types of data
they are expected to handle are growing. In the scientific
space, new refinement structures, new types of polynomial
fields, and even high-dimensional grids are becoming more
common. The need for these structures is motivated both by
the demands of new science and by the evolution of scientific
computing algorithms.
Internet-of-things and mobile sensor environments are
considered a hot area of research in data visualization,
which requires unifying the structure of a variety of
contexts into one context type. The researchers in [19, 20]
proposed a system that generates interactive context-aware
visualizations of federated data sources provided by an
underlying context-aware framework called the Augmented World
Model (AWM). It relies on data extracted from
internet-of-things devices and sensors in real time.
The authors of [31] presented a new visualization tool,
Voila, to visualize big data graphically and detect outliers
in the data, such as errors or events, based on historical
data. More recently, [32] discussed rules in mobile
context-aware systems and provided a Semantic Knowledge
Engineering (SKE) approach to the development of such
systems. The open problems in this research concern the
output data extracted from mobile sensors: performance,
support for the correct meaning for users, adaptation to
users, conditions, and environments, efficiency under high
resource usage, and responsiveness.
The work in [33] presented a new paradigm of data
visualization based on machine learning for context-aware
identification. It recommends chart types for given datasets
based on a specific domain. The authors of [34] proposed a
system for data visualization that supports the integration
of common visualization modules in big data streams. It is
based on parallelized visualization that is scalable and
flexible for heterogeneous data.
From the previous review of several visualization tools
that can manage context-aware types, we note that a
visualization tool needs to be:
1. Adaptive: dynamic system
2. Easy to use: to allow users to change, update and
search by various dimensions such as time,
geographic locations.
3. High performance (low running time)
4. Classify patterns: that is based on features,
conditions, and patterns recognition.
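The "easy to use" requirement above, searching by dimensions such as time and geographic location, can be sketched as a simple filter over sensor readings. The `Reading` record, its fields, and the bounding-box convention are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Reading:
    """One hypothetical sensor reading with a timestamp and a position."""
    sensor_id: str
    timestamp: datetime
    lat: float
    lon: float
    value: float


def search(readings, start=None, end=None, bbox=None):
    """Filter readings by an optional time window and bounding box.

    bbox is (min_lat, min_lon, max_lat, max_lon); bounds are inclusive.
    """
    result = []
    for r in readings:
        if start is not None and r.timestamp < start:
            continue
        if end is not None and r.timestamp > end:
            continue
        if bbox is not None:
            min_lat, min_lon, max_lat, max_lon = bbox
            inside = min_lat <= r.lat <= max_lat and min_lon <= r.lon <= max_lon
            if not inside:
                continue
        result.append(r)
    return result


if __name__ == "__main__":
    data = [Reading("s1", datetime(2018, 1, 1, 8), 30.04, 31.24, 1.0),
            Reading("s2", datetime(2018, 1, 1, 9), 31.20, 29.92, 2.0),
            Reading("s3", datetime(2018, 1, 2, 8), 30.05, 31.23, 3.0)]
    hits = search(data, start=datetime(2018, 1, 1),
                  end=datetime(2018, 1, 1, 23, 59),
                  bbox=(29.9, 31.0, 30.2, 31.5))
    print([r.sensor_id for r in hits])
```

A linear scan like this is enough for a sketch; a real tool handling big data would presumably rely on time and spatial indexes instead.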
Another important notion is that there is a relationship
between the level of data fusion or integration in the IoT
and the reliability of the results of the visualization tool.
, (1) or
(2)
Equations 1 and 2 state that correct fusion or integration
produces reliable visualization results. In other words, a
high level of fusion leads to high reliability of the
visualization. Another observation is that a high rate of
outlier detection supports the accuracy of the visualization
results, as in equation 3.
, (3)
[Figure.7 data, challenges of data visualization (percentage rate): takes a long time 19%; difficult to share 22%; difficult to analyze 37%; ineffective data 45%; too difficult to access all data 50%; lack of expertise 57%]
B. Transparency Visualization
Transparency can support the user in understanding the
reasons behind recommendations. It remains a challenging
concept because there is a lack of explanation of why
particular results are shown. Recommender systems [18] lack
transparency when they appear as "black boxes" to the user,
making it incomprehensible how recommendations are generated
and why a specific list of items is presented. Improving the
transparency of data visualization is required to avoid risks
in business management. It also enables any user to recognize
the reasoning behind the visualization results.
Explanations are the proposed method for enhancing the
visualization transparency (also known as "justification") of
a recommender system and the users' trust in its results.
They can help users understand the reason behind a
recommendation, increase the user's sense of involvement in
the recommendation process, and lead to greater acceptance of
the recommender system as a decision aid. The concept affords
transparency of the recommendations by visualizing an average
rating (position) as well as an individual rating for each
user (glyphs). Voila [31] also provides reasoning and
explanation of the outliers in data visualization.
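The idea of showing an average rating together with each user's individual rating can be sketched as a small data preparation step. The function below is an illustrative assumption about how such values might be prepared for display; it is not the implementation described in [18], and the item and user names are hypothetical.

```python
from statistics import mean


def explain_recommendation(item, ratings):
    """Return the average rating plus each user's deviation from it."""
    if not ratings:
        raise ValueError("at least one rating is required")
    average = mean(ratings.values())
    # The average would set the item's position; each deviation would be
    # drawn as one glyph per user around that position.
    deviations = {user: round(score - average, 2)
                  for user, score in ratings.items()}
    return {"item": item,
            "average": round(average, 2),
            "deviation_by_user": deviations}


if __name__ == "__main__":
    print(explain_recommendation("report A", {"ann": 4, "bob": 5, "eve": 3}))
```

Showing the deviations alongside the average lets a user see whether a recommendation rests on broad agreement or on a few strong opinions.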
This is still an open research area in visualization.
Recent work targets deep learning and feature-based data
reduction for creating a visual explanation of the data.
, (4)
(5)
Our observation is that correct data reduction has a
positive relationship with correct fusion, which in turn
affects the reliability of the visualization, as in equations
4 and 5.
C. Social Internet-of-things (SIoT)
The internet of things refers to an interconnected set of
sensors, connected via the internet, that hold huge amounts
of data about one or more objectives. The social
internet-of-things [35, 36] refers to the various targets of
users and the attributes they affect. Any data has levels of
meaning that are very sensitive to the authority and goals of
the user. Interactivity in data visualization is a complex
process because it is based on complex features and levels of
data. New approaches are therefore required to manage the
data levels across the variety of users. The observation here
leads to the relationships in equations 6 and 7.
, (6)
(7)
D. Virtual Reality (VR):
Virtual Reality is going to have a huge impact on the
potential for data visualizations [18], allowing people to
interact with data in the third dimension for the first time.
Imagine being able to pick a data set and move it around on
any axis to compare it to another; this is not too far away.
According to SAS, people can process only 1 kilobit of
information per second on a flat screen, a rate that can be
increased significantly if the data is analyzed in a 3D VR
world. This challenge has a big effect on future results,
such as the profit derived from financial data. Visualization
tools can predict profit and loss in several cases and graph
these projections based on real data concurrently, which can
improve the performance rate yearly.
VI. CONCLUSION AND FUTURE WORK
This paper presents a survey of data visualization as a hot
area of research. It illustrates the relationship between big
data and data visualization and demonstrates the benefits of
visualization. The paper aims to explain four challenges in
this field: context-awareness, transparency, the social
internet-of-things based on different levels of
understanding, and virtual reality.
The challenges are summarized briefly as follows:
A. Context-Awareness: Considering different
situations, such as mood, time, individual, or
collaborative scenarios.
B. Transparency: Supporting the user in understanding
the reasons behind the recommendations.
C. Social Internet-of-Things and Meaningful Data
Levels: The difficulty of designing visualizations to
match the wide-ranging understanding of data
and data visualizations.
D. Virtual Reality: It is going to have a huge impact on
the potential for data visualizations, allowing
people to interact with data in the third dimension
for the first time.
In future work, we will propose a new visualization tool
that can address the mentioned challenges to support the
social internet-of-things. This tool should be adaptive,
flexible, and easy to use. It should mitigate the fusion and
integration problems in order to support high accuracy and
performance across context types. We aim to optimize the
processing time automatically to support real-time stream
analysis.
REFERENCES
[1] V.Bhuvaneswari, and R.Porkodi, The Internet of Things (IoT) Applications and Communication Enabling Technology Standards: An Overview, 2014 International Conference on Intelligent Computing Applications, 2014
[2] Giancarlo Fortino, and Paolo Trunfio, Internet of Things Based on Smart Objects, Technology, Middleware and Applications, 2014, Internet of things, springer,2014
[3] Jin X, Wah BW, Cheng X, and Wang Y, “Significance and challenges of big data research,” Big Data Research, 30;2(2):59-64, 2015.
[4] “Expanding Florida’s Use and Accessibility of Telehealth”, Telehealth Advisory Council, October 31, 2017.
[5] Wilfried Elmenreich, An Introduction to Sensor Fusion, 2002
[6] Ahmed Sammoud, Ashok Kumar, Magdy Bayoumi, Tarek Elarabi, Real-time streaming challenges in Internet of Video Things (IoVT),IEEE International Symposium on Circuits and Systems (ISCAS),2017
[7] Senthil Kumar Janahan, M.R.M. Veeramanickam, S. Arun, Kumar Narayanan, R. Anandan, Shaik Javed Parvez, IoT based smart traffic signal monitoring system using vehicles counts, International Journal of Engineering & Technology, volume 7, 2018
[8] Alexandros Labrinidis, and H. V. Jagadish, Challenges and Opportunities with Big Data,Proceedings of the VLDB Endowment, Vol. 5, No. 12, 2012
[9] Lidong Wang, Guanghui Wang, and Cheryl Ann Alexander, “Big Data and Visualization: Methods, Challenges and Technology Progress,” Digital Technologies, vol. 1, no. 1, pp. 33-38. doi:10.12691/dt-1-1-7, 2015.
[10] Childs H, Geveci B, Schroeder W, Meredith J, Moreland K, Sewell C, Kuhlen T, and Bethel EW, “Research challenges for visualization software,” Computer, issue 1(5) pp:34-42, 2013.
[11] Ekaterina Olshannikova, Aleksandr Ometov, Yevgeni Koucheryavy and Thomas Olsson, Visualizing Big Data with augmented and virtual reality: challenges and research agenda, Journal of Big Data, 2015.
[12] Introduction to Data Visualization Techniques Using Microsoft Excel 2013 & Web-based Tools,Tufts Data Lab,2016
[13] White paper, Data Visualization Techniques From Basics to Big Data With SAS® Visual Analytics,2018
[14] David Stodder, Data Visualization and Discovery for Better Business Decisions, TDWI Best Practices Report, third quarter, SAS, 2013
[15] Victoria J. Hodge and James Austin, An Evaluation of Classification and Outlier Detection Algorithms, Working Paper, 2018
[16] Samuel Li, N. Marsaglia, Christoph Garth, et al., Data Reduction Techniques for Scientific Visualization and Data Analysis, Computer Graphics Forum 37(2), March 2018
[17] Samuel Kaski and Jaakko Peltonen, Dimensionality Reduction for Data Visualization, IEEE Signal Processing Magazine, Volume 28, Issue 2, March 2011
[18] MANDY KECK, DIETRICH KAMMER , Exploring Visualization Challenges for Interactive Recommender Systems, VisBIA 2018
[19] Andrzej Cichocki, Anh-Huy Phan, Qibin Zhao, Namgil Lee, Ivan Oseledets, Masashi Sugiyama and Danilo P. Mandic, "Tensor Networks for Dimensionality Reduction and Large-scale Optimization: Part 2 Applications and Future Perspectives", Foundations and Trends® in Machine Learning: Vol. 9: No. 6, pp 431-673,2017.
[20] Interactive Context-Aware Visualization for Mobile Devices, International Symposium on Smart Graphics SG: Smart Graphics pp 167-178, 2009.
[21] Steffen Oeltze, Helmut Doleisch, Helwig Hauser, Gunther Weber., Interactive Visual Analysis of Scientific Data. Presentation at IEEE VisWeek 2012, Seattle (WA), USA
[22] Zoltan Konyha, Alan Lez, Kresimir Matkovic, Mario Jelovic, and Helwig Hauser, Interactive visual analysis of families of curves using data aggregation and derivation, i-KNOW '12 Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies, 2012
[23] Jacqueline Strecker, Report :Data Visualization In Review ,2012
[24] Sebastian Raschka, MLxtend: Providing machine learning and data science utilities and extensions to Python’s scientific computing stack, the journal of open source software, 2018.
[25] Keita Fujino, Sozo Inoue, and Tomohire Shibata, Machine Learning of User Attentions in Sensor Data Visualization, International Conference on Mobile Computing, Applications, and Services, Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 240), 2018.
[26] Kanit Wongsuphasawat ; Daniel Smilkov ; James Wexler ; Jimbo Wilson ; Dandelion Mané, et al., Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow,Published in: IEEE Transactions on Visualization and Computer Graphics , Volume: 24 , Issue: 1 , 2018.
[27] Junyuan Xie, Ross Girshick , Ali Farhadi, Unsupervised Deep Embedding for Clustering Analysis, ICML'16 Proceedings of the 33rd International Conference on International Conference on Machine Learning – Volume48 ,2016.
[28] Jack G. Zheng, Data Visualization for Business Intelligence, in Global Business Intelligence, Chapter 6, Taylor & Francis, 2017
[29] White paper: Data Visualization: Making Big Data Approachable and Valuable, Market Pulse, IDG Research Services, SAS, Custom Solution Group, 2012.
[30] Mohsen Marjani, Fariza Nasaruddin, Abdullah Gani, Ahmad Karim, Ibrahim Abaker Targio Hashem, Aisha Siddiqa, Big IoT Data Analytics: Architecture, Opportunities, and Open Research Challenges, IEEE Access, 5, pages 5247-5261.
[31] Nan Cao, Chaoguang Lin, Qiuhan Zhu, Yu-Ru Lin, Xian Teng, Xidao Wen, Voila: Visual Anomaly Detection and Monitoring with Streaming Spatiotemporal Data, IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 24, NO. 1, JANUARY 2018
[32] Grzegorz J. Nalepa, Rules in Mobile Context-Aware Systems, Modeling with Rules Using Semantic Knowledge Engineering pp 403-430
[33] W.A.D. Kanchana, G.D.L. Madushanka, et al., context-aware recommendation for data visualization, 2016
[34] Harald Sanftmann, Nazario Cipriani, and Daniel Weiskopf, Distributed Context-Aware Visualization,8th IEEE International Workshop on Middleware and System Support for Pervasive Computing, 2011
[35] Bo-Shen Chen ; Varsha A. Kshirsagar ; Shou-Chih Lo,Platform design for social Internet of Things,2017 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-TW),2017
[36] Moneeb Gohar, Muhammad Muzammal, Arif Ur Rahman ,SMART TSS: Defining transportation system behavior using big data analytics in smart cities, Sustainable Cities and Society Volume 41, Pages 114-119, 2018