Tracing Software Build Processes to Uncover License Compliance Inconsistencies
Post on 05-Dec-2014
@shane_mcintosh
Shane McIntosh
Sander van der Burg
Eelco Dolstra
Julius Davies
Daniel M. Germán
Armijn Hemel
Tjaldur Software Governance Solutions
Software reuse enables rapid development of new applications
GNU, Mozilla, Apache
Reusable components are released under different license terms
GNU, Mozilla, Apache (Apache Public License)
Reuse puts legal constraints on how client systems can be distributed
[Diagram: External component, used by, Client system]
Failure to comply with license terms can lead to costly legal issues
Ensuring license compliance with reused components
Which source files are enabled?
Which components are used?
How are they combined? (Static link or dynamic link)
The build system can answer these questions!
What is a build system?
[Diagram: file types .tex, .c, .cc, .o, .dvi, .a, .exe, .deb, with the final artifact labeled Deliverable]
Build systems describe how sources are translated into deliverables
Step 1 - Configuration
Step 2 - Construction
Step 3 - Certification
Step 4 - Packaging
Step 5 - Deployment
Focus of this paper
Step 1 - Configuration
Step 2 - Construction
Incompleteness of build specs makes license compliance assessment difficult

patchelf: patchelf.o
	g++ patchelf.o -o patchelf

patchelf.o: patchelf.cc
	g++ -c patchelf.cc

install: patchelf
	install patchelf /usr/bin/

Header file dependencies are not listed
[Dependency graph, legend Extracted/Missing: patchelf.cc and elf.h feed patchelf.o]
External library dependencies are not listed
[Dependency graph, legend Extracted/Missing: patchelf.o and libstdc++ feed patchelf]
Hidden relationship between patchelf and /usr/bin/patchelf
[Dependency graph, legend Extracted/Missing: patchelf feeds /usr/bin/patchelf]
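To make the gap concrete, here is a minimal Python sketch that reads only what the rule lines of the Makefile above declare. The helper name `declared_dependencies` and the list `used_but_undeclared` are illustrative assumptions, and the parser handles only simple `target: prerequisites` lines, not full Make syntax.

```python
# The Makefile from the slides, with tab-indented recipe lines.
MAKEFILE = """\
patchelf: patchelf.o
\tg++ patchelf.o -o patchelf

patchelf.o: patchelf.cc
\tg++ -c patchelf.cc

install: patchelf
\tinstall patchelf /usr/bin/
"""


def declared_dependencies(makefile_text):
    """Map each target to the prerequisites named on its rule line."""
    deps = {}
    for line in makefile_text.splitlines():
        # Skip recipe lines (tab-indented) and blank lines.
        if line.startswith("\t") or ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        deps[target.strip()] = prereqs.split()
    return deps


declared = declared_dependencies(MAKEFILE)

# Files the slides mark as missing from the build specification.
used_but_undeclared = ["elf.h", "libstdc++", "/usr/bin/patchelf"]

# Every name that appears as a target or prerequisite in the Makefile.
mentioned = set(declared) | {p for ps in declared.values() for p in ps}
missing = [f for f in used_but_undeclared if f not in mentioned]
print(missing)
```

None of the three files appears as a target or prerequisite, which is why inspecting the build specification alone cannot answer the compliance questions.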
We use system tracing to recover the missing dependencies
[Diagram: the build process issues open(), read(), write(), and close() calls to the OS kernel, and these calls are recorded in a trace log]
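As a rough illustration of how a trace log could be turned into file accesses, the Python sketch below parses a few log lines. The log format is an assumption loosely modeled on strace-style output, and the excerpt, the regular expression, and the function `files_touched` are hypothetical. The slides do not show the actual log format.

```python
import re

# Hypothetical trace excerpt (assumed format: "<pid> open(<path>, <flags>) = <result>").
TRACE_LOG = """\
101 open("patchelf.cc", O_RDONLY) = 3
101 open("elf.h", O_RDONLY) = 4
101 open("missing.h", O_RDONLY) = -1 ENOENT (No such file or directory)
101 open("patchelf.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
"""

OPEN_CALL = re.compile(r'^(\d+) open\("([^"]+)", ([A-Z_|]+)(?:, \d+)?\) = (-?\d+)')


def files_touched(trace_text):
    """Return (reads, writes): paths opened successfully, split by access mode."""
    reads, writes = [], []
    for line in trace_text.splitlines():
        match = OPEN_CALL.match(line)
        if match is None:
            continue
        _pid, path, flags, result = match.groups()
        if int(result) < 0:
            continue  # the open() call failed, so no dependency exists
        # Simplification: any writable open counts as a write, never as a read.
        if set(flags.split("|")) & {"O_WRONLY", "O_RDWR"}:
            writes.append(path)
        else:
            reads.append(path)
    return reads, writes


reads, writes = files_touched(TRACE_LOG)
print(reads, writes)
```

Here `elf.h` shows up as a file that was read even though no rule in the Makefile mentions it.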
We mine build traces to construct a concrete build dependency graph
[Dependency graph recovered from the trace log: patchelf.cc and elf.h feed g++, which produces patchelf.o; patchelf.o and libstdc++ feed g++, which produces patchelf; patchelf feeds install, which produces /usr/bin/patchelf]
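The graph construction can be sketched in Python as follows. The per-process records in `TRACED_STEPS` are hypothetical summaries of what a trace of the patchelf build might contain, and the function names are illustrative. This is one possible representation, not the authors' implementation.

```python
# Hypothetical per-process summaries mined from a trace log:
# (process name, files read, files written).
TRACED_STEPS = [
    ("g++", ["patchelf.cc", "elf.h"], ["patchelf.o"]),
    ("g++", ["patchelf.o", "libstdc++"], ["patchelf"]),
    ("install", ["patchelf"], ["/usr/bin/patchelf"]),
]


def build_graph(steps):
    """Map every written file to the set of files read by the process that wrote it."""
    graph = {}
    for _process, reads, writes in steps:
        for out in writes:
            graph.setdefault(out, set()).update(reads)
    return graph


def transitive_inputs(graph, target):
    """Collect every file that directly or indirectly feeds into target."""
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


graph = build_graph(TRACED_STEPS)
inputs = transitive_inputs(graph, "/usr/bin/patchelf")
print(sorted(inputs))
```

Walking backwards from the installed binary reaches `elf.h` and `libstdc++`, the dependencies that the Makefile alone does not reveal.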
Annotate build graph nodes with license information using Ninka
[Dependency graph: patchelf.cc and elf.h feed g++, which produces patchelf.o; patchelf.o and libstdc++ feed g++, which produces patchelf; patchelf feeds install, which produces /usr/bin/patchelf]
Inconsistency introduced!
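Once each node carries a license label, spotting an inconsistency amounts to comparing the labels of a deliverable's inputs with what its declared license permits. The Python sketch below uses placeholder labels (`License-A`, `License-B`) and a made-up policy table. It does not call Ninka and makes no claim about which real licenses are compatible.

```python
# Hypothetical license labels per file; the slides obtain this information from Ninka.
FILE_LICENSES = {
    "patchelf.cc": "License-A",
    "elf.h": "License-B",
    "libstdc++": "License-A",
}

# Hypothetical policy: input licenses considered acceptable for a
# deliverable distributed under the given declared license.
ACCEPTABLE_INPUTS = {"License-A": {"License-A"}}


def find_inconsistencies(inputs, declared_license, file_licenses, policy):
    """Return (path, license) pairs whose license the policy does not allow."""
    allowed = policy.get(declared_license, set())
    return sorted(
        (path, file_licenses.get(path, "UNKNOWN"))
        for path in inputs
        if file_licenses.get(path, "UNKNOWN") not in allowed
    )


source_inputs = ["patchelf.cc", "elf.h", "libstdc++"]
problems = find_inconsistencies(
    source_inputs, "License-A", FILE_LICENSES, ACCEPTABLE_INPUTS
)
print(problems)
```

Files with no detected license are reported as `UNKNOWN` rather than silently accepted, which seems the safer default for a compliance check.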
Empirical study
(RQ1) Accuracy
(RQ2) Practicality
Measuring the accuracy of our concrete build dependency graph (CBDG) approach
Included files (.c): delete a file, then execute the build
  Broken means true positive
  Clean means false positive
Excluded files (.c): delete a file, then execute the build
  Broken means false negative
  Clean means true negative
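The deletion experiment maps onto the usual precision and recall definitions. The Python sketch below classifies each trial and computes both measures. The trial data and file names are invented for illustration, and the functions `classify` and `precision_recall` are hypothetical.

```python
def classify(included, build_broke):
    """Label one deletion trial following the rules on the slide."""
    if included:
        return "TP" if build_broke else "FP"
    return "FN" if build_broke else "TN"


def precision_recall(labels):
    """Compute precision = TP / (TP + FP) and recall = TP / (TP + FN)."""
    tp = labels.count("TP")
    fp = labels.count("FP")
    fn = labels.count("FN")
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return precision, recall


# Hypothetical trials: (file, reported as included by the CBDG?, did the build break?).
TRIALS = [
    ("a.c", True, True),
    ("b.c", True, True),
    ("c.c", True, False),
    ("d.c", False, False),
    ("e.c", False, True),
]

labels = [classify(included, broke) for _name, included, broke in TRIALS]
precision, recall = precision_recall(labels)
print(labels, precision, recall)
```

In this made-up sample, two of the three included files break the build when deleted, and one excluded file also breaks it, so both measures come out at two thirds.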
Our approach accurately selects the files that impact system deliverables

            Aterm  Opkg  Bash  CUPS  Xalan  OpenSSL  FFmpeg
Technology  Make   Make  Make  Make  Ant    Make     Make
Precision   100%   97%   88%   100%  99%    99%      99%
Recall      98%    99%   100%  99%   100%   100%     100%

But there are cases where our approach makes mistakes (precision is below 100% for Opkg, Bash, Xalan, OpenSSL, and FFmpeg)
And there are cases when our approach misses files that impact deliverables (recall is below 100% for Aterm, Opkg, and CUPS)
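The per-system figures in the table can be summarized with a few lines of Python. The numbers are copied from the table. The variable names are illustrative.

```python
# (precision %, recall %) per studied system, copied from the table.
RESULTS = {
    "Aterm": (100, 98),
    "Opkg": (97, 99),
    "Bash": (88, 100),
    "CUPS": (100, 99),
    "Xalan": (99, 100),
    "OpenSSL": (99, 100),
    "FFmpeg": (99, 100),
}

precisions = [p for p, _ in RESULTS.values()]
recalls = [r for _, r in RESULTS.values()]

precision_range = (min(precisions), max(precisions))
recall_range = (min(recalls), max(recalls))

# Systems where at least one file was wrongly included or wrongly missed.
imperfect_precision = [name for name, (p, _) in RESULTS.items() if p < 100]
imperfect_recall = [name for name, (_, r) in RESULTS.items() if r < 100]

print(precision_range, recall_range)
print(imperfect_precision, imperfect_recall)
```

The resulting ranges are 88 to 100 for precision and 98 to 100 for recall.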
Empirical study
(RQ1) Accuracy
Precision: 88%-100%
Recall: 98%-100%
(RQ2) Practicality
Bugs filed using our approach on multi-licensed packages
FFmpeg: License was updated within 3 days
CUPS: Offending files were removed within 2 days
Empirical study
(RQ1) Accuracy
Precision: 88%-100%
Recall: 98%-100%
(RQ2) Practicality
Prompted quick code changes in two systems