Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University
An Exploration of Power-law in Use-relation of Java Software Systems
Makoto Ichii, Makoto Matsushita, Katsuro Inoue
Osaka University
2008/3/26 ASWEC 2008

Software Component Graph
• A software system is composed of software components.
  • Software component (component): a building unit of a software system
  • Complex use-relations are formed between components.
• A software component graph (component graph) represents the use-relations between components.
  • node: component / edge: use-relation
• Various studies use component graphs to analyze software systems.
  • It is important to know the nature of component graphs.

Power-law distribution
• A graph is characterized by its degree distribution.
• Graphs whose degree distribution follows a power-law distribution attract attention in various research domains.
  • Link structure of WWW pages
  • Hosts on the Internet
• Such graphs tend to have interesting characteristics.
  • Self-similarity
  • Fault tolerance
• Explore the component graphs to determine whether their degree distributions follow the power law $p(x) = Cx^{-\alpha}$.

Questions [1-2/4]
Q. 1 Do the in- and out-degree distributions of a component graph of a software system follow the power law?
Q. 2 Do the in- and out-degree distributions of a component graph of multiple software systems follow the power law?

Questions [3-4/4]
Q. 3 Do the in- and out-degree distributions of a subgraph of a component graph follow the power law?
Q. 4 What aspects of components affect the in- and out-degree distributions of component graphs?

Definitions [1/2]
Component: Java class (including interface)
Use-relation: any of the following six relation types, acquired by static analysis of the component source files.
• A class or an interface extends another class or interface, respectively.
• A class implements an interface.
• A class or an interface declares a variable of a class or an interface.
• A class instantiates a class object.
• A class calls a method of a class or an interface.
• A class or an interface references a field variable of a class or an interface.

Definitions [2/2]
Component graph: directed simple graph
• node: component
• edge: use-relation between components
In-(out-)degree: the number of incoming (outgoing) edges of a node

Example:
class A { void exec() { … }}
class B { … A.exec(); …}
class C { … A a = new A(); …}

| Node | In-degree | Out-degree |
|---|---|---|
| A | 2 | 0 |
| B | 0 | 1 |
| C | 0 | 1 |
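
The degree counts in the A/B/C example can be reproduced with a small sketch. The class name `ComponentGraph` and its methods are hypothetical names chosen for illustration, not part of the authors' tooling. The sketch assumes that "simple graph" means repeated use-relations between the same pair of components collapse into a single edge and that a component using itself adds no edge.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Directed simple graph: at most one edge per ordered pair of components. */
final class ComponentGraph {
    // For each component, the set of components it uses (outgoing edges).
    private final Map<String, Set<String>> uses = new HashMap<>();
    // For each component, the set of components that use it (incoming edges).
    private final Map<String, Set<String>> usedBy = new HashMap<>();

    void addComponent(String name) {
        uses.computeIfAbsent(name, k -> new HashSet<>());
        usedBy.computeIfAbsent(name, k -> new HashSet<>());
    }

    /** Records that `from` uses `to`; repeated use-relations collapse into one edge. */
    void addUse(String from, String to) {
        if (from.equals(to)) {
            return; // assumption: a simple graph has no self-loops
        }
        addComponent(from);
        addComponent(to);
        uses.get(from).add(to);
        usedBy.get(to).add(from);
    }

    int inDegree(String name) {
        return usedBy.getOrDefault(name, Collections.emptySet()).size();
    }

    int outDegree(String name) {
        return uses.getOrDefault(name, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        ComponentGraph g = new ComponentGraph();
        g.addUse("B", "A"); // B calls a method of A
        g.addUse("C", "A"); // C declares a variable of type A
        g.addUse("C", "A"); // C instantiates A: same edge, not counted twice
        for (String c : new String[] {"A", "B", "C"}) {
            System.out.println(c + ": in-degree " + g.inDegree(c)
                    + ", out-degree " + g.outDegree(c));
        }
    }
}
```

Storing neighbours in sets rather than lists is what keeps the graph simple: class C has two use-relations to A (declaration and instantiation) but contributes one edge.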

Observing the power law
• Plot the cumulative frequency on log-log axes.
• The data form a straight line if the distribution follows the power law $p(x) = Cx^{-\alpha}$.
  • Gradient of the frequency plot: $-\alpha$
  • Gradient of the cumulative frequency plot: $-(\alpha-1)$
  • Horizontal axis: in-(or out-)degree
M. E. J. Newman, "Power laws, Pareto distributions and Zipf's law", Contemporary Physics 46, 323-351 (2005)
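
The gradient $-(\alpha-1)$ of the cumulative plot follows from integrating the power law. The short derivation below treats the degree $x$ as a continuous variable and assumes $\alpha > 1$, as in Newman's paper; it is a sketch of the standard argument rather than something shown on the slide.

```latex
\begin{align*}
P(x) &= \int_{x}^{\infty} C\,t^{-\alpha}\,dt
      = \frac{C}{\alpha-1}\,x^{-(\alpha-1)} \qquad (\alpha > 1) \\
\log P(x) &= \log\frac{C}{\alpha-1} - (\alpha-1)\log x
\end{align*}
```

On log-log axes the cumulative frequency is therefore a straight line with gradient $-(\alpha-1)$, one less steep than the gradient $-\alpha$ of $p(x)$ itself.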

Values shown in the experiments
• α: exponent of $p(x) = Cx^{-\alpha}$
  • Derived from the gradient of the regression line, which is $-(\alpha-1)$ on the log-log plot of cumulative frequency against in-(or out-)degree
• $R^{*2}$: the coefficient of determination adjusted for the degrees of freedom
  • Measures how well the regression model fits the data
  • Range [0..1]; a larger value means a better fit
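
As a minimal sketch of how these two values could be computed, the code below builds the cumulative frequency from a list of degrees, fits a least-squares line through the log-log points, and reports α and the adjusted coefficient of determination. The class `PowerLawFit`, its method names, and the sample degrees are hypothetical. The sketch assumes that cumulative frequency means the number of nodes with degree at least x, that nodes of degree 0 are left out because they cannot appear on a log axis, and that ordinary least squares is used; the slides do not state these details.

```java
import java.util.Map;
import java.util.TreeMap;

final class PowerLawFit {
    /** For each observed degree x >= 1, the number of nodes with degree >= x. */
    static TreeMap<Integer, Integer> cumulativeFrequency(int[] degrees) {
        TreeMap<Integer, Integer> counts = new TreeMap<>();
        for (int d : degrees) {
            if (d >= 1) {
                counts.merge(d, 1, Integer::sum); // degree 0 has no log-axis position
            }
        }
        TreeMap<Integer, Integer> cumulative = new TreeMap<>();
        int running = 0;
        // Walk from the largest degree down, accumulating the tail count.
        for (Map.Entry<Integer, Integer> e : counts.descendingMap().entrySet()) {
            running += e.getValue();
            cumulative.put(e.getKey(), running);
        }
        return cumulative;
    }

    /** Returns {alpha, adjustedR2} for a line fitted through (log10 x, log10 y); needs >= 3 points. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += Math.log10(x[i]);
            meanY += Math.log10(y[i]);
        }
        meanX /= n;
        meanY /= n;
        double sxx = 0, sxy = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = Math.log10(x[i]) - meanX;
            double dy = Math.log10(y[i]) - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        double slope = sxy / sxx;
        double r2 = (sxy * sxy) / (sxx * syy);
        double adjustedR2 = 1 - (1 - r2) * (n - 1) / (n - 2.0); // one explanatory variable
        double alpha = 1 - slope; // gradient of the cumulative plot is -(alpha - 1)
        return new double[] {alpha, adjustedR2};
    }

    public static void main(String[] args) {
        int[] degrees = {1, 1, 1, 1, 2, 2, 3, 8}; // made-up degrees, not data from the study
        TreeMap<Integer, Integer> cumulative = cumulativeFrequency(degrees);
        double[] x = new double[cumulative.size()];
        double[] y = new double[cumulative.size()];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : cumulative.entrySet()) {
            x[i] = e.getKey();
            y[i] = e.getValue();
            i++;
        }
        double[] result = fit(x, y);
        System.out.printf("alpha = %.2f, adjusted R^2 = %.2f%n", result[0], result[1]);
    }
}
```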

Experiment 1
Q. 1 Do the in- and out-degree distributions of a component graph of a software system follow the power law?
• Set up component sets.
  • Each set contains a single software system.
• Analyze the component sets to create component graphs.
• Plot the cumulative frequency of the degrees on log-log axes.

| Component set | Description | # of components |
|---|---|---|
| JDK | Java 2 SE Software Development Kit 1.4 | 11,556 |
| ECLIPSE | Eclipse 3.0.1 | 13,941 |

Result of experiment 1 / JDK
# of Nodes: 11,556
# of Edges: 107,198

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.1 \pm 8.6 \times 10^{-3}$ | 0.99 |
| out-degree | $3.1 \pm 8.2 \times 10^{-2}$ | 0.88 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► The in-degree follows the power law
► The out-degree does not follow the power law

Result of experiment 1 / ECLIPSE
# of Nodes: 13,941
# of Edges: 140,678

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.2 \pm 1.6 \times 10^{-2}$ | 0.96 |
| out-degree | $3.0 \pm 7.7 \times 10^{-2}$ | 0.86 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► Similar characteristics to JDK
  • The in-degree follows the power law
  • The out-degree does not follow the power law

Experiment 2
Q. 2 Do the in- and out-degree distributions of a component graph for multiple software systems follow the power law?
• Set up component sets.
  • Each set contains multiple software systems.
  • Use-relations across the systems exist.
• Analyze the component sets to create component graphs.
• Plot the cumulative frequency of the degrees on log-log axes.

| Component set | Description | # of components |
|---|---|---|
| ASF | Various projects checked out from the repository of the Apache Software Foundation | 59,486 |
| SPARS_DB | The components stored in the database of demo.spars.info (includes ASF, JDK) | 180,637 |

Result of experiment 2 / ASF
# of Nodes: 59,486
# of Edges: 303,755

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.4 \pm 1.1 \times 10^{-2}$ | 0.98 |
| out-degree | $3.4 \pm 6.4 \times 10^{-2}$ | 0.94 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► Similar characteristics to Exp. 1
  • The in-degree follows the power law
  • The out-degree does not follow the power law

Result of experiment 2 / SPARS_DB
# of Nodes: 180,637
# of Edges: 1,808,982

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.0 \pm 1.5 \times 10^{-3}$ | 1.00 |
| out-degree | $3.7 \pm 7.0 \times 10^{-2}$ | 0.90 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► Similar characteristics to Exp. 1
  • The in-degree follows the power law
  • The out-degree does not follow the power law completely
► The in-degree distribution fits the power-law straight line almost ideally.

Experiment 3
Q. 3 Do the in- and out-degree distributions of a subgraph of a component graph for software systems follow the power law?
• Construct subsets of SPARS_DB.
  • Keyword: the components that contain a specified keyword in the source code
    • The keywords are randomly selected so that the number of resulting components is about 1,000/10,000.
  • Random: 1,000/10,000 random components
• Analyze the component sets to create component graphs.
• Plot the cumulative frequency of the degrees on log-log axes.

| Component set | Description | # of components |
|---|---|---|
| KWD1K | The components that contain “labels” | 1,002 |
| KWD10K | The components that contain “getstring” | 8,938 |
| RND1K | Randomly selected 1,000 components | 1,000 |
| RND10K | Randomly selected 10,000 components | 10,000 |
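
One plausible reading of this setup is that each subset yields the induced subgraph: only use-relations whose two endpoints are both in the subset remain as edges. The slides do not say so explicitly, so the sketch below is an assumption, and `Subgraph.induced` is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class Subgraph {
    /** Keeps only the edges {from, to} whose two endpoints are both selected. */
    static List<String[]> induced(List<String[]> edges, Set<String> selected) {
        List<String[]> kept = new ArrayList<>();
        for (String[] edge : edges) {
            if (selected.contains(edge[0]) && selected.contains(edge[1])) {
                kept.add(edge);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up edges for illustration: {user, used component}.
        List<String[]> edges = Arrays.asList(
                new String[] {"B", "A"},
                new String[] {"C", "A"},
                new String[] {"C", "D"});
        Set<String> selected = new HashSet<>(Arrays.asList("A", "C"));
        for (String[] edge : induced(edges, selected)) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

Under this reading, a selection rule that tends to pick components together with the components they use keeps many edges, while an unrelated selection drops most of them.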

Result of experiment 3 / KWD1K
# of Nodes: 1,002
# of Edges: 1,564

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.2 \pm 3.3 \times 10^{-2}$ | 0.98 |
| out-degree | $3.7 \pm 2.0 \times 10^{-1}$ | 0.93 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► Similar characteristics to SPARS_DB
  • The in-degree follows the power law
  • The out-degree does not follow the power law

Result of experiment 3 / KWD10K
# of Nodes: 8,938
# of Edges: 24,317

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.1 \pm 9.3 \times 10^{-3}$ | 0.99 |
| out-degree | $3.4 \pm 2.7 \times 10^{-1}$ | 0.93 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► Similar characteristics to SPARS_DB
  • The in-degree follows the power law
  • The out-degree does not follow the power law

Result of experiment 3 / RND1K
# of Nodes: 1,000
# of Edges: 52

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $2.3 \pm 1.8 \times 10^{-1}$ | 0.93 |
| out-degree | N/A | N/A |

[Figure: log-log plots of cumulative frequency against in-degree and out-degree]
► The original characteristics are almost lost

Result of experiment 3 / RND10K
# of Nodes: 10,000
# of Edges: 6,184

| | α | $R^{*2}$ |
|---|---|---|
| in-degree | $1.9 \pm 2.1 \times 10^{-2}$ | 0.98 |
| out-degree | $4.3 \pm 3.3 \times 10^{-1}$ | 0.91 |

[Figure: log-log plots of cumulative frequency against out-degree and in-degree]
► Similar characteristics to SPARS_DB; however, the # of edges is small
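
A back-of-the-envelope check suggests why the random subsets keep so few edges. If n of N components are chosen uniformly at random and an edge survives only when both of its endpoints are chosen (an assumption about how the subgraphs were built, not something the slides state), each edge survives with probability n(n-1)/(N(N-1)). With the SPARS_DB figures (180,637 nodes, 1,808,982 edges) this gives roughly 55 expected edges for 1,000 components and roughly 5,500 for 10,000, the same order of magnitude as the 52 and 6,184 edges reported for RND1K and RND10K. The class and method names below are hypothetical.

```java
final class RandomSubsetEdges {
    /**
     * Expected number of surviving edges when sampledNodes of totalNodes are
     * chosen uniformly at random and an edge needs both endpoints chosen.
     */
    static double expectedEdges(long totalNodes, long totalEdges, long sampledNodes) {
        double keepProbability = ((double) sampledNodes / totalNodes)
                * ((double) (sampledNodes - 1) / (totalNodes - 1));
        return totalEdges * keepProbability;
    }

    public static void main(String[] args) {
        long nodes = 180_637L;   // SPARS_DB nodes, from the slides
        long edges = 1_808_982L; // SPARS_DB edges, from the slides
        System.out.printf("1,000 random components: about %.0f edges%n",
                expectedEdges(nodes, edges, 1_000L));
        System.out.printf("10,000 random components: about %.0f edges%n",
                expectedEdges(nodes, edges, 10_000L));
    }
}
```

The survival probability grows with the square of the sample fraction, so a tenfold larger sample keeps about a hundred times more edges. The keyword-based subsets of similar size keep far more edges (1,564 and 24,317), which fits the observation that related components tend to be selected together.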

Experiment 4
Q. 4 What aspects of components affect the in- and out-degree distributions of component graphs?
• List the top-ten components by in- and out-degree.
• Calculate the correlation between degrees and metric values.
  • Spearman's rank correlation coefficient
• Target: SPARS_DB

| Metric | Description |
|---|---|
| LOC | Non-comment source lines of code |
| WMC1 | A variation of weighted methods per class (WMC). Weight of a method: constant value (1) |
| WMC2 | A variation of WMC. Weight of a method: cyclomatic complexity |
| LCOM | A variation of lack of cohesion of methods: LCOM5 |
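
Spearman's rank correlation coefficient is the Pearson correlation of the ranks of the two variables. The sketch below is one common way to compute it, giving tied values the average of their rank positions; whether the authors handled ties this way is not stated, and the class name `Spearman` and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

final class Spearman {
    /** Ranks starting at 1; tied values share the average of their positions. */
    static double[] ranks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int k = 0; k < n; k++) {
            order[k] = k;
        }
        Arrays.sort(order, Comparator.comparingDouble(idx -> values[idx]));
        double[] result = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && values[order[end + 1]] == values[order[start]]) {
                end++;
            }
            double average = (start + end) / 2.0 + 1; // positions are 0-based
            for (int k = start; k <= end; k++) {
                result[order[k]] = average;
            }
            start = end + 1;
        }
        return result;
    }

    /** Pearson correlation of the two rank vectors. */
    static double correlation(double[] x, double[] y) {
        double[] rx = ranks(x);
        double[] ry = ranks(y);
        int n = rx.length;
        double meanX = 0, meanY = 0;
        for (int k = 0; k < n; k++) {
            meanX += rx[k];
            meanY += ry[k];
        }
        meanX /= n;
        meanY /= n;
        double sxx = 0, sxy = 0, syy = 0;
        for (int k = 0; k < n; k++) {
            double dx = rx[k] - meanX;
            double dy = ry[k] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Made-up values for five components, not data from the study.
        double[] outDegree = {3, 10, 25, 7, 40};
        double[] loc = {50, 300, 200, 120, 900};
        System.out.printf("Spearman correlation = %.2f%n", correlation(outDegree, loc));
    }
}
```

Because only the ranks matter, the coefficient is insensitive to the heavy tails of degree and size values, which makes it a reasonable choice for data like these.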

Result of experiment 4 / In-degree
• Top-ten components
  • The components that have a fundamental/general role
• Correlation with metrics
  • In-degree has low correlation with the metrics
► The in-degree relates to the role

| Rank | Name | LOC | In-degree | Out-degree |
|---|---|---|---|---|
| 1 | java.lang.String | 675 | 116,239 | 21 |
| 2 | java.lang.Object | 35 | 98,261 | 4 |
| 3 | java.lang.Class | 605 | 29,682 | 41 |
| 4 | java.lang.Exception | 15 | 21,046 | 2 |
| 5 | java.lang.Throwable | 136 | 19,519 | 12 |
| 6 | java.lang.System | 170 | 19,175 | 27 |
| 7 | java.util.Iterator | 5 | 15,522 | 1 |
| 8 | java.util.List | 27 | 14,462 | 4 |
| 9 | java.util.ArrayList | 200 | 13,656 | 19 |
| 10 | java.lang.Integer | 285 | 12,736 | 9 |

| | Out-degree | LOC | WMC1 | WMC2 | LCOM |
|---|---|---|---|---|---|
| In-degree | 0.00 | 0.07 | 0.24 | 0.08 | 0.12 |

Result of experiment 4 / Out-degree
• Top-ten components
  • Simply large/complex classes
• Correlation with metrics
  • High correlation with LOC and WMC
► The out-degree relates to the size/complexity of a component

| | In-degree | LOC | WMC1 | WMC2 | LCOM |
|---|---|---|---|---|---|
| Out-degree | 0.00 | 0.82 | 0.64 | 0.75 | 0.39 |

| Rank | Name | LOC | In-degree | Out-degree |
|---|---|---|---|---|
| 1 | org.apache...FunctionEval | 364 | 1 | 354 |
| 2 | org.jgraph.GPGraphpad | 2,196 | 130 | 255 |
| 3 | com.jgraph.GPGraphpad | 2,200 | 131 | 253 |
| 4 | org.jgraph.GPGraphpad | 542 | 209 | 252 |
| 5 | org.eclipse...ASTConverter | 4,520 | 3 | 223 |
| 6 | org.eclipse...JavaEditor | 1,368 | 115 | 220 |
| 7 | net.sourceforge...GanttProject | 3,055 | 98 | 216 |
| 8 | it.businesslogic...MainFrame | 7,177 | 46 | 204 |
| 9 | org...InstConstraintVisitor | 1,626 | 3 | 197 |
| 10 | org...ASTInstructionCompiler | 2,449 | 1 | 189 |

Answers: summary of experiments [1/4]
Q. 1 Do the in- and out-degree distributions of a component graph of a software system follow the power law?
► The in-degree follows the power law
► The out-degree does not follow the power law
  • Mixture of the power-law distribution and the lognormal distribution
[Figure: log-log plots of cumulative frequency against out-degree and in-degree]

Answers: summary of experiments [2/4]
Q. 2 Do the in- and out-degree distributions of a component graph for multiple software systems follow the power law?
► The in-degree follows the power law
► The out-degree does not follow the power law
  • The results are similar to those for single software systems
[Figure: log-log plots of cumulative frequency against out-degree and in-degree]

Answers: summary of experiments [3/4]
Q. 3 Do the in- and out-degree distributions of a subgraph of a component graph for software systems follow the power law?
► Depends on how the subgraph is created.
  • A keyword-based subgraph has characteristics similar to those of the superset.
    • Related components likely share words.
  • A random-selection-based subgraph with a small number of nodes has different characteristics.
    • Few edges exist.
[Figure: log-log plots of cumulative frequency against out-degree and in-degree for the four subsets]

Answers: summary of experiments [4/4]
Q. 4 What aspects of components affect the in- and out-degree distributions of component graphs?
► In-degree relates to the roles of components
  • Most of the components are used in a specific part of a system.
  • Components with a fundamental/general role are used from everywhere.
    • The larger the component set grows, the larger the in-degree value becomes.
► Out-degree relates to the size/complexity of components
  • Many components have reasonable size/complexity.
  • Some components may have relatively large size/complexity.
  • Extremely large components are unreasonable.

Summary
• Component graphs were investigated to determine whether the in- and out-degree distributions follow the power law.
• As a result, the following characteristics were revealed.
  • The in-degree distribution follows the power law.
    • The in-degree of a component relates to the role of the component.
  • The out-degree distribution does not follow the power law.
    • The out-degree of a component relates to the size/complexity of the component.
  • Some kinds of subgraph of a component graph have the same degree-distribution characteristics as the whole graph.
Future work
• Explore other types of component graph

Discussion
• Generative models of a power-law graph
  • If a node is added to a graph, the nodes with large degree tend to be connected to the new node: “rich get richer”.
• Meaning for component graphs
  • If a new component is added to (developed for) a software system, the new component uses components that are already used by many components.
  • The members of the set of frequently used components hardly change even as the software development proceeds.
    • If the members change, it means that the fundamental structure (design, architecture) of the software has changed.
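
The “rich get richer” idea can be illustrated with a toy simulation. The sketch below is a simplified preferential-attachment process chosen for illustration, not a model taken from the slides: every new component uses exactly one existing component, picked with probability proportional to its current in-degree plus one. The class name `RichGetRicher` and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class RichGetRicher {
    /** Grows a graph node by node and returns the in-degree of every node. */
    static int[] simulate(int nodes, long seed) {
        Random random = new Random(seed);
        int[] inDegree = new int[nodes];
        // Each node appears once, plus once more per incoming edge, so a
        // uniform draw from this list is proportional to in-degree + 1.
        List<Integer> tickets = new ArrayList<>();
        tickets.add(0);
        for (int v = 1; v < nodes; v++) {
            int target = tickets.get(random.nextInt(tickets.size()));
            inDegree[target]++;
            tickets.add(target); // the used component becomes more likely to be used again
            tickets.add(v);      // the new component can be used by later ones
        }
        return inDegree;
    }

    public static void main(String[] args) {
        int[] inDegree = simulate(10_000, 42L);
        int max = 0;
        int unused = 0;
        for (int d : inDegree) {
            max = Math.max(max, d);
            if (d == 0) {
                unused++;
            }
        }
        System.out.println("largest in-degree: " + max);
        System.out.println("components used by nobody: " + unused);
    }
}
```

In runs of this kind a few early components typically collect a large share of the incoming edges while many components are never used, which mirrors the heavy-tailed in-degree distribution discussed above.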