

An Historical Examination of Open Source Releases and Their Vulnerabilities

Nigel Edwards, Liqun Chen
HP Laboratories
HPL-2012-63R1

Keyword(s): Security; Protection; Measurement; Static Analysis; Risk Analysis; Open Source Software

Abstract: This paper examines historical releases of Sendmail, Postfix, Apache httpd and OpenSSL by using static source code analysis and the entry-rate in the Common Vulnerabilities and Exposures dictionary (CVE) for a release, which we take as a measure of the rate of discovery of exploitable bugs. We show that the change in number and density of issues reported by the source code analyzer is indicative of the change in the rate of discovery of exploitable bugs for new releases; formally, we demonstrate a statistically significant correlation of moderate strength. The strength of the correlation is an artifact of other factors such as the degree of scrutiny: the number of security analysts investigating the software. This also demonstrates that static source code analysis can be used to make some assessment of risk even when constraints do not permit human review of the issues identified by the analysis. We find only a weak correlation between absolute values measured by the source code analyzer and the rate of discovery of exploitable bugs, so in general it is unsafe to use absolute values of the number of issues or issue densities to compare different applications or software. Our results demonstrate that software quality, as measured by the number of issues, issue density or number of exploitable bugs, does not always improve with each new release. However, generally the rate of discovery of exploitable bugs begins to drop three to five years after the initial release.

External Posting Date: October 6, 2012 [Fulltext] Approved for External Publication
Internal Posting Date: October 6, 2012 [Fulltext]
To be published in CCS'12, 19th ACM Conference on Computer and Communications Security, October 16-18, 2012, Raleigh, North Carolina, USA.

Copyright ACM 2012.


An Historical Examination of Open Source Releases and Their Vulnerabilities

Nigel Edwards, Liqun Chen
Hewlett-Packard Laboratories
Long Down Avenue
Bristol, BS34 8QZ, UK

<firstname.lastname>@hp.com

ABSTRACT

This paper examines historical releases of Sendmail, Postfix, Apache httpd and OpenSSL by using static source code analysis and the entry-rate in the Common Vulnerabilities and Exposures dictionary (CVE) for a release, which we take as a measure of the rate of discovery of exploitable bugs. We show that the change in number and density of issues reported by the source code analyzer is indicative of the change in the rate of discovery of exploitable bugs for new releases — formally we demonstrate a statistically significant correlation of moderate strength. The strength of the correlation is an artifact of other factors such as the degree of scrutiny: the number of security analysts investigating the software. This also demonstrates that static source code analysis can be used to make some assessment of risk even when constraints do not permit human review of the issues identified by the analysis.

We find only a weak correlation between absolute values measured by the source code analyzer and the rate of discovery of exploitable bugs, so in general it is unsafe to use absolute values of the number of issues or issue densities to compare different applications or software. Our results demonstrate that software quality, as measured by the number of issues, issue density or number of exploitable bugs, does not always improve with each new release. However, generally the rate of discovery of exploitable bugs begins to drop three to five years after the initial release.

Categories and Subject Descriptors
K.6.5 [Management of Computing and Information Systems]: Security and Protection

General Terms
Security, Measurement

Keywords
Static Analysis, Risk Analysis, Open Source Software

Copyright ACM 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in CCS'12, 19th ACM Conference on Computer and Communications Security, October 16-18, 2012, Raleigh, North Carolina, USA.

1. INTRODUCTION

In this paper we present an investigation of the vulnerability history of various open source projects. We use a static source code analysis tool to investigate sample releases over a number of years for potential security issues and compare the results to the rate at which entries appear for the software in the Common Vulnerabilities and Exposures dictionary (CVE¹) [12].

The purpose of the investigation is to understand what static code analysis of software can tell us about the potential future intrinsic risks of using that software. The size and complexity of much commonly used software renders manual analysis impractical: at the time of writing, the Linux kernel contains over 13 million lines of code and OpenOffice contains over 9 million lines of code. There is also no guarantee that any particular piece of software has been subject to rigorous analysis by skilled security analysts. Therefore we believe automatic analysis is needed to make a sound assessment of the risk of using any software.

In this study we are interested in estimating the number of bugs that might be in the software that could allow the construction of a successful exploit leading to system compromise, for example buffer overflow, cross-site scripting and SQL injection. We call these exploitable bugs. As well as comparing different releases of the same software, we also compare different software. Our method of comparison is to use a static source code analysis tool and measure the number of security issues it identifies. We are trying to answer the following questions.

1. Does the change in number of issues or issue densities between a previous release and a new release of the same software indicate anything?

2. What is the range of issue densities for popular open source software?

3. Do very large differences in issue densities or number of issues between different software tell us anything?

CVE is a dictionary of common names, "CVE Identifiers", for publicly known security vulnerabilities. Its purpose is to provide one name for each vulnerability to enable the use of multiple tools and databases. CVE Identifiers are assigned so that each separate vulnerability receives a unique identifier — see [12] for further details. We use the rate at which entries appear in CVE for the software after its release date as an estimate of the number of exploitable bugs it contained on its release date. So if the rate of appearance of CVE entries for the software after its release date is very low, then one might say that particular release had a relatively low number of exploitable bugs. Conversely, if the rate is higher, we assume that there were a larger number of exploitable bugs. The validity of these assumptions is discussed next.

¹CVE is a trademark of The MITRE Corporation.

The approach of using CVE entries to estimate the number of exploitable bugs that might have been in a release has obvious limitations: some software is subject to greater investigation by security researchers than others — the degree of scrutiny also matters. We also look at this effect. In addition, not all exploitable bugs may result in CVE entries, and not all exploitable bugs may be discovered or reported. So CVE cannot be used as an absolute measure of the number of exploitable bugs. Assuming the degree of scrutiny is constant, we believe it can be used to indicate whether there were more or fewer security bugs in a given release.

We chose the following software for our study.

Sendmail Email server software: 1996 to 2011

Postfix Email server software: 1999 to 2010

Apache httpd Web server software: the 1.3 release series from 1998 to 2010; the 2.0 release series from 2002 to 2010; the 2.2 release series from 2005 to 2011

OpenSSL A toolkit for implementing SSL and TLS: the 0.9.6 release series from 2000 to 2004; the 0.9.7 release series from 2002 to 2007; the 0.9.8 release series from 2005 to 2011; the 1.0.0 release series from 2009 to 2011

All of the above are very widely used to provide Internet accessible services. Therefore there are many millions of instances — over 320 million Apache web servers in October 2011 [15]. Their widespread availability makes them a tempting target for attackers and security researchers, so the degree of scrutiny is high. The choice of software was further driven by the availability of public release archives enabling us to obtain older releases. Both Apache httpd and OpenSSL have a number of separate major release series in which substantial amounts of new code were introduced with a new series (e.g. changing from 1.3 to 2.0 for Apache). These major release series evolved separately and are maintained in parallel, so we have analyzed each of these series separately. We did not have the resources to analyze all instances of a release series, so we took samples approximately one year apart.

The remainder of this paper is structured as follows. Section 2 gives an overview of static source code analysis. Section 3 presents the results of our analysis of Sendmail, Postfix, Apache and OpenSSL. Section 4 discusses to what extent it is possible to compare analysis results from different software. In section 5 we discuss degree of scrutiny, using additional CVE histories of two open source databases. Section 6 describes some related work. Section 7 is our conclusion. The full analysis results are given in appendix A.

2. OVERVIEW OF STATIC SOURCE CODE ANALYSIS

Static analysis of source code is the automatic examination of source code to determine particular non-functional properties of interest. The term "static" denotes that no execution is involved, in contrast to "dynamic" analysis, in which some form of execution and test data set is usually involved. Static analysis is used for a variety of purposes including type-checking, style-checking, performance optimization and program verification. In this paper we are concerned with security analysis, which is to detect the presence of bugs which may lead to security problems — exploitable bugs.

Static source code analysis attempts to detect exploitable bugs automatically. It uses data flow analysis techniques [16], [6] to trace the potential paths of ingested data through the program. Each time a call of a dangerous function such as strcpy() or mysql_query() occurs, the analysis tool checks function-specific rules about the parameters being passed: whether the length of the target is greater than the length of the source for strcpy(); whether the query parameter of mysql_query() has been cleansed — characters that might modify the query (e.g. SQL injection) have been removed.
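To make the data-flow idea concrete, the following is a minimal, hypothetical C fragment (our illustration, not taken from any of the analyzed programs) of the kind of tainted path such a tool flags: network input reaching strcpy() with no bounds check.

```c
#include <stdio.h>
#include <string.h>

#define BUF_LEN 64

/* Hypothetical request handler: an analyzer marks network_input as
 * tainted (ingested data) and follows it along the data-flow path to
 * the strcpy() call below. */
void handle_request(const char *network_input) {
    char buf[BUF_LEN];
    /* strcpy() performs no bounds check: if strlen(network_input) is
     * BUF_LEN or more, the copy overruns buf. The function-specific
     * rule ("is the target at least as long as the source?") cannot be
     * proven here, so the tool reports a potential buffer overflow. */
    strcpy(buf, network_input);
    printf("handled: %s\n", buf);
}

int main(void) {
    handle_request("hello");
    return 0;
}
```

A bounds-checked copy, e.g. snprintf(buf, sizeof buf, "%s", network_input), would typically satisfy such a rule and avoid the report.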

Unfortunately, many static analysis problems are undecidable (as a consequence of Rice's theorem [22], see also the discussion in [11]), which means that static analysis tools must use approximation techniques. For example, they may analyze paths that can never be executed². Typically the consequence of this is that the tool will identify many more issues for a program than there are security bugs. Therefore the results it produces must be vetted by human auditors to determine their legitimacy and impact. In addition, it is often possible to add program-specific rules. For example, the software may contain a data cleansing routine, so we might choose to add a rule that declares any ingested data that flows through the cleansing routine to be safe, so that it won't trigger SQL injection or buffer overflow issues. Building a set of program-specific rules requires inspection of source code to determine how and when data is cleansed. Typically this is done by auditors inspecting the results from the first and subsequent analyses: each issue identified by the analyzer is considered and the code that triggered it inspected to determine if a custom rule to suppress the issue is warranted. One rule may suppress many issues. This paper explores the value we can derive when constraints do not permit human review of all the issues. What can the raw results do for us?

It follows from the above that the number or density of issues for two different programs cannot be taken as an absolute measure of the number of security defects [5]. The difference might be explainable because, in the absence of program-specific rules, the tool is doing a better job of understanding one program compared to the other. However, whilst we accept some range of difference is to be expected and is unlikely to be significant, we seek to understand the range of issue densities and determine if extremely large differences are significant.

3. THE ANALYSIS

In this section we present and discuss the results of our analysis. Full results are given in appendix A.

²Some tools are capable of detecting some cases of "dead code": code which is never executed. Dead code can arise because of interfering logical constraints; this can be arbitrarily difficult to detect in practice.


For our study we used the HP Fortify Source Code Analyzer (SCA) version 5.10.0.0102 without any program-specific or custom rules. Other static source code analyzers include IBM Rational AppScan and Klocwork Insight. For all software we configured the analyzer to trust the local system: the file system, environment variables and the like. So the primary source of untrusted data is the network. The analyzer classifies issues as "Critical", "High" or "Low".

For each software release that we analyzed we consider the following metrics, which we generate from SCA (a short computational sketch follows the list).

• The total number of issues (TI): critical + high + low

• The total issue density (T-density): the number of critical, high and low issues per 100 lines of executable code

• The number of critical issues (CI)

• The critical issue density (C-density): the number of critical issues per 10,000 lines of executable code

• The number of critical and high issues (CHI): critical + high

• The critical and high issue density (CH-density): the number of critical and high issues per 1,000 lines of executable code
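As a sketch of how these metrics and their deliberately different unit scalings fit together, the following C fragment recomputes them for one release, using the Sendmail 8.9.3 counts from table 5 in appendix A (the variable names are ours, not SCA's).

```c
#include <stdio.h>

int main(void) {
    /* Issue counts and executable lines for Sendmail 8.9.3 (table 5). */
    int loc = 15099;
    int ci = 118, hi = 1449, li = 931;  /* critical, high, low issues */

    int chi = ci + hi;   /* CHI: critical + high       */
    int ti  = chi + li;  /* TI: total number of issues */

    /* Note the different scale factors per the definitions above:
     * per 100, per 10,000 and per 1,000 executable lines. */
    double t_density  = 100.0   * ti  / loc;
    double c_density  = 10000.0 * ci  / loc;
    double ch_density = 1000.0  * chi / loc;

    /* Prints TI=2498 T=16.54 C=78.15 CH=103.78, matching table 5. */
    printf("TI=%d T=%.2f C=%.2f CH=%.2f\n",
           ti, t_density, c_density, ch_density);
    return 0;
}
```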

We compared these measurements to the number of entries appearing in CVE per year for that software release (CVE/yr). CVE only contains entries from 1999 on; CVE entries up to the end of calendar year 2011 are included. Note that the units for the density metrics T-density, C-density and CH-density are different: respectively issues per 100 lines, issues per 10,000 lines and issues per 1,000 lines. This is because we believe it makes comparison of the full results from different software given in appendix A simpler: it is easier to compare C-densities of 7.64 and 25.26 (number of critical issues per 10,000 lines of code) rather than 0.0764 and 0.2526 (number of critical issues per 100 lines of code).

We did apply some filtering to the CVE entries. Only entries which detailed problems with the software being analyzed were used. We excluded entries in which references were made to the use of the software, but the bug lay elsewhere. We included entries which were specific to a single operating system: due to the limited information available it is usually not possible to tell if these are simple packaging errors or coding errors. Examples of CVE entries we excluded include the following.

• Incorrect use of APIs by third party software. In 2009 there were at least 41 CVE entries that referenced OpenSSL; 29 of these were reports of incorrect use of OpenSSL by third parties and are therefore excluded from our analysis. Examples include CVE-2009-5057 (incorrect configuration of OpenSSL) and CVE-2009-3766 (failure to verify the domain name in the Common Name field of a certificate when using OpenSSL). The remaining 12 CVE entries in 2009 were bugs in various releases of OpenSSL software and are therefore included in our analysis.

• Bugs in third party plug-in software. For example, CVE-2009-1012 concerns a bug in the Oracle BEA WebLogic plug-in for Apache httpd. This is not a bug in any software that was included in any of the httpd releases. It is therefore excluded from our analysis.

We make the simplifying assumption that the rate of CVE occurrence is constant over any given year. We assume each CVE entry corresponds to one distinct issue. Although this is the intent of the CVE editorial policy and seems to be the case for the large majority of entries, we cannot guarantee that a few CVE entries do not refer to multiple bugs or that some entries may be duplicates. Classification is an imprecise activity depending on human skill and judgment.

To calculate the number of CVE entries per year, CVE/yr, for a release $r_n$, we use the CVE entries that occurred from the release of $r_n$ until the next release analyzed, $r_{n+1}$, and divide by the time interval between the release dates of $r_n$ and $r_{n+1}$. The time interval spans fractions of years, as software is not generally released on January 1st. We therefore apportion CVE entries based on the fraction of the year covered by the time interval between release dates. Let $d_n$ denote the day of the year on which $r_n$ was released, and let $d_{n+1}$ denote the day of the year on which $r_{n+1}$ was released. Then if $r_n$ was released in year $y_1$ and replaced by $r_{n+1}$ in year $y_2$, the CVE entries per year, or CVE/yr, for $r_n$ is given by:

\[
\mathrm{CVE/yr} = \frac{cve_{y_1} \times (365 - d_n) + cve_{y_2} \times d_{n+1}}{365 - d_n + d_{n+1}}
\]

More generally, for a release interval spanning $m$ years, $m > 2$, the CVE/yr is given by:

\[
\mathrm{CVE/yr} = \frac{cve_{y_1} \times (365 - d_n) + cve_{y_m} \times d_{n+1} + 365 \times \sum_{i=2}^{m-1} cve_{y_i}}{365 - d_n + d_{n+1} + (m-2) \times 365}
\]

If $r_n$ and $r_{n+1}$ were released in the same year, then the CVE/yr is given by:

\[
\mathrm{CVE/yr} = \frac{cve_{y_1} \times (d_{n+1} - d_n)}{d_{n+1} - d_n} = cve_{y_1}
\]

Particularly for the older CVE entries, it is not always possible to determine to which versions of the software the entry applies. Also, in many cases there were beta releases preceding the first release which we analyzed. These beta releases do not appear in the software archives and so we could not analyze them. However, they still have CVE entries. By weighting the CVE count with the number of days for which the software was available in any given year we compensate for these effects.

For example, OpenSSL 0.9.7 was released December 31st, 2002 ($d_n = 364$). As shown in table 4 in appendix A, there are 4 CVE entries for OpenSSL 0.9.7 in 2002; all of these applied to beta releases of 0.9.7 which are not in the OpenSSL source archives. The next release we analyzed was 0.9.7c, which was released on September 30th, 2003 (day 272). The CVE/yr for 0.9.7 is given by:

\[
\frac{4 \times (365 - 364) + 7 \times 272}{365 - 364 + 272} = 6.99
\]

Thus the 4 CVE entries for 2002 are given negligible weight compared to the 7 for 2003.
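The three cases above collapse into a single day-weighted average. The following C sketch is our reading of the formulas (with day-of-year values as used in the text's worked example, where December 31st, 2002 appears as day 364); it reproduces the OpenSSL 0.9.7 figure.

```c
#include <stdio.h>

/* cve[0..m-1] are the CVE counts for the m calendar years spanned by
 * the release interval; dn and dn1 are the day-of-year of the release
 * dates of r_n and r_(n+1). Leap years are ignored, as in the text. */
double cve_per_year(const int *cve, int m, int dn, int dn1) {
    if (m == 1)                        /* both releases in the same year */
        return (double)cve[0];
    double num = cve[0] * (365.0 - dn) + cve[m - 1] * (double)dn1;
    double den = (365.0 - dn) + dn1;
    for (int i = 1; i < m - 1; i++) {  /* whole years in between */
        num += 365.0 * cve[i];
        den += 365.0;
    }
    return num / den;
}

int main(void) {
    /* OpenSSL 0.9.7: 4 CVE entries in 2002 (released day 364) and 7 in
     * 2003 up to 0.9.7c (released day 272). */
    int cve[] = { 4, 7 };
    printf("CVE/yr = %.2f\n", cve_per_year(cve, 2, 364, 272)); /* 6.99 */
    return 0;
}
```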

We did not have the resources to analyze all consecutive releases of all the software, so we took samples approximately 12 months apart. In some cases the sample interval is longer because the software was stable and exhibited very little change. For example, for Sendmail we analyzed 8.14.0, released on February 1st, 2007; the next release analyzed was 8.14.5, released September 15th, 2011, which has just 124 more lines of code and 3 fewer issues detected by SCA. If a CVE entry did not mention a release which we analyzed, then to determine whether we should count it against one of our analyzed releases we had to use our judgment and other sources of publicly available information, such as the SecurityFocus database [24] and the National Vulnerability Database [14]. Thus CVE-2009-4565 begins with "Sendmail before 8.14.4 does not properly handle a '\0' character in a Common Name (CN) field of an X.509 certificate...". Although release 8.14.4 is not included in our analysis, we interpret the above to mean the issue also applies to release 8.14.0, which is included, so CVE-2009-4565 is included in the calculation of CVE per year for release 8.14.0.

It will be apparent from the above that our process for classification of CVE entries was manual, relying on our judgment of the contents of the entry supplemented by further manual searches of publicly available sources, so some errors are possible.

3.1 Sendmail

Figure 1: Sendmail issues & CVE entries

Sendmail was originally developed by Eric Allman in the late 1970s and early 1980s. Being one of the earliest Internet capable programs, it was exploited in a number of incidents including the Morris Internet Worm of 1988 [25]. Figure 1 shows our metrics for various releases of Sendmail from 1996 to 2011. CVE information is available from 1999. Even in 1999 there were 7 CVE entries for releases of Sendmail prior to 8.7.6, which was itself released in 1996. We excluded these 7 entries from our count of CVE entries, but believe this justifies including release 8.7.6 in our analysis, since it was clearly the subject of security analysis work in 1999.

For easy visual comparison the metrics are scaled as shown in the figure so that all seven data sets can be represented by a common vertical axis. Full analysis details including the unscaled values are given in appendix A table 5. The earliest releases we analyzed had a fairly large number of issues reported by SCA: 2548 total and 136 critical for version 8.7.6 (2498 and 118 for 8.9.3). This is reflected by the large number of CVE entries per year for those releases. The 8.10.0 release had dramatically fewer issues (662 total and 120 critical) and this is reflected in the drop in CVE entries per year. Note that although for recent releases of Sendmail SCA reports 841 issues and 87 critical issues, this does not mean there are 841 exploitable bugs. Rather, it is an artifact of our not writing any custom rules for Sendmail to denote defensive code responsible for cleansing data to make it safe; SCA must therefore assume all data being processed by Sendmail is unsafe throughout its processing. Of these 841 issues, 572 are unique: multiple paths to a dangerous function call are each flagged separately. Many of these 572 unique issues might lead to denial of service rather than system compromise; for example, 174 potential memory leaks are identified.

Over the releases we analyzed, a substantial amount of additional functionality was added to Sendmail. The 8.7.6 release (September 17, 1996) had 11,861 executable lines of code³. This increased to 15,099 for 8.9.3 (February 5, 1999). In the next three releases (8.10.0 (March 6, 2000), 8.11.0 (July 19, 2000) and 8.11.6 (August 20, 2001)) the lines of code count dropped to under 11,000, with significant drops in the total number of issues, issue densities and CVE entries per year. This is possibly indicative of a "clean-up" by the developers. Over the remaining releases the number of lines of code increased to just over 32,000 in 8.14.5 (September 15, 2011), with the largest increase coming between 8.12.6 and 8.13.0: 16,195 to 31,668 lines. This was marked by an increase in the total number of issues, critical issues and critical+high issues, but a drop in densities. This may indicate significant effort in improving code quality.

³We use the lines of code reported by SCA; other tools may generate slightly different counts.

Release 8.13.0 has 0 CVE entries per year. This is because there were no CVE entries for 2004 and 2005, and then five in 2006 (8.13.5) (see appendix A table 4). This pattern of CVE entries is hard to explain. Possibly it is a normal statistical variation and an artifact of there being relatively few undiscovered bugs in the software, or possibly it is due to delayed reporting. The most recent release of Sendmail that we analyzed, 8.14.5, has 32,270 lines of executable code.

The drops in issue density and the low number of CVE entries since 2004 (8.13.0) (see table 4) suggest Sendmail has matured, with significant attention being paid to code quality.

3.2 Postfix

Figure 2: Postfix issues & CVE entries

Postfix was originally developed by Wietse Venema in the late 1990s. Figure 2 shows our metrics for various releases of Postfix from 1999 to 2010. For easy visual comparison the values are scaled as shown in the figure so that all seven data sets can be represented by a common vertical axis. Note that the range of the vertical axis and the scaling of the CVE/yr metric are very different from those shown in figure 1. Full analysis details are given in appendix A table 6. Over the course of the analyzed releases Postfix evolved from 10,472 executable lines of code to 32,470. By most standards Postfix has had an excellent record, with relatively few CVE entries over the years: only 12 (see appendix A table 4). Such a small number makes it difficult to draw conclusions about the increase in the number of issues reported by SCA compared to the number of CVE entries, although the rise from 127 to 394 total issues does seem to have some relation to 6 of the 12 entries having occurred in the last 4 years to the end of 2011.

The figures and tables show a striking difference in the metrics for Sendmail compared to Postfix. For example, for Postfix releases the "Total issues" metric ranges from 127 to 394, whereas for Sendmail releases the metric ranges from 2548 down to 548 (841 for the most recent release analyzed). For "T-density", the number of issues per 100 lines of code, the range for all Postfix releases varied from 0.96 to 1.33, while for Sendmail it varied from 2.55 to 21.48. We believe this demonstrates a fundamentally different approach adopted by Venema in the initial development of Postfix compared to that used by Sendmail, perhaps learning from nearly two decades of Sendmail experience. One simple measure is the difference in use of known "dangerous" C functions by the two programs. These functions are generally accepted to be "dangerous" because they don't perform bounds checks, or the bounds checking can be subverted, so they can be used in exploits to overwrite arbitrary areas of memory. Examples include strcpy, strcat, memset, memcpy and printf.

Figure 3: Sendmail & Postfix dangerous function density

Figure 3 shows the density of dangerous functions used in the various releases of Sendmail and Postfix which we analyzed (the units are number of calls per 100 lines). This appears to confirm that a fundamentally different approach was taken for the implementation of the initial versions of Postfix: historically, it made much less use of these dangerous functions. For the most recent releases of Sendmail and Postfix the usage is very similar.

The large differences we see in the SCA analyses for earlier releases of Sendmail are matched by the much greater number of CVE entries for Sendmail. However, more recently Sendmail has had very few issues indeed, indicating much improved quality.

3.3 Apache httpd

Figure 4: Apache httpd 1.3 issues & CVE entries

Full details of our analyses of the 1.3, 2.0 and 2.2 Apache httpd series are given in appendix A table 7. Figure 4 summarizes the results for the 1.3 series from 1998 to 2010. During this period the number of lines of executable code increased from 11,079 to 14,201 (with 1.3.19 having the most at 17,099), and the T-density varied from 2.18 to 3.63. The pattern shown by the figure is an initial rise in the SCA measured metrics being matched by a rise in CVE entries per year. From release 1.3.32, although there is no drop in the SCA measured metrics, there is a drop in CVE/yr, which appears to be consistent with relatively few changes to mature code in which most of the serious bugs have been discovered. This is also reflected in the number of lines of executable code, which increased by only 83 lines from 1.3.32 to 1.3.42. CVE information is shown for release 1.3.2 and later: 1.3.0 was released June 1, 1998, but no CVE information is available until 1999. The release of 1.3.2 was on September 21, 1998 and 1.3.11 was January 1, 2000, so release 1.3.2 is the first release for which we show CVE information.

Figure 5: Apache httpd 2.0 issues & CVE entries

The results of the analysis for the 2.0 series from 2002 to 2010 are shown in figure 5. 2.0.35 (April 6, 2002) is the earliest release of the 2.0 series available in the Apache archives, but we had system compatibility problems and were unable to compile it, a prerequisite to using SCA. Our analysis therefore begins with 2.0.43 (October 3, 2002). During this series the number of lines of executable code increased from 23,982 to 25,720. There was a modest reduction in some of the SCA measured metrics, such as Total issues and T-density; other metrics, Critical issues and C-density, remained largely unchanged. From release 2.0.52 there is a reduction in CVE/yr consistent with maturing code.

Figure 6: Apache httpd 2.2 issues & CVE entries

The results of the analysis for the 2.2 series from 2005 to 2011 are shown in figure 6. During this series the lines of executable code increased from 28,057 to 30,655. The SCA measured metrics vary little through this series. There is also no significant reduction in CVE/yr with later releases in the series.

3.4 OpenSSL

Figure 7: OpenSSL 0.9.6 issues & CVE entries

Full details of our analyses of the OpenSSL 0.9.6, 0.9.7, 0.9.8 and 1.0.0 series are given in appendix A table 8. Figure 7 summarizes the results for the 0.9.6 series from 2000 to 2004. During this series of releases the number of lines of executable code increased from 44,396 to 45,173. Most of the SCA measured metrics show a slight increase; the exceptions are Critical issues and C-density, which show a slight drop. The comparatively low CVE/yr for the first release, 0.9.6, may be an artifact of the software being new and receiving little attention [7].

Figure 8: OpenSSL 0.9.7 issues & CVE entries

Figure 8 shows the results of the analysis of the OpenSSL 0.9.7 series from 2002 to 2007. During this series of releases the number of lines of executable code increased from 56,216 to 59,064. It is noticeable that for the 0.9.7e release SCA detected zero critical issues (hence C-density is also 0) and there does seem to be a corresponding drop in CVE/yr for that release. More generally, there is little change in many of the SCA measured metrics for this release series; the downward trend exhibited by some, such as C-density, does seem to be matched by the reduction in CVE/yr.

Figure 9: OpenSSL 0.9.8 issues & CVE entries

Figure 9 shows the results of the analysis of the OpenSSL 0.9.8 series from 2005 to 2011. During this series of releases the number of lines of executable code increased from 71,529 to 75,324. Many of the SCA measured metrics show little change across releases. The most noticeable change is the rise in the number of critical issues and C-density, which again seems to show some correspondence to the slight rise in CVE/yr.

Figure 10: OpenSSL 1.0.0 issues & CVE entries

Figure 10 shows the results of the analysis of the OpenSSL 1.0.0 release series from 2009 to 2011. During this release series the number of lines of executable code increased from 87,987 to 88,603; there is very little change in any of the SCA measured metrics or CVE/yr.

3.5 Statistical Analysis

Can we show a statistically significant correlation between metrics generated using SCA and CVE/yr? The sample size for each of the separate nine datasets (Sendmail, Postfix, the three release series of Apache httpd and the four release series of OpenSSL) is, in most cases, too small to show a statistically significant correlation. The largest number of samples we have is 13, for the Apache httpd 1.3 dataset⁴. Typically 50 to 100 samples are required to show a statistically significant correlation. If we combine all our analysis data we have 75 samples. We performed two separate correlation analyses of the SCA metrics and CVE/yr: one on the combined unnormalized dataset and one on the combined normalized dataset, as explained below.

⁴1.3.0 is excluded from the correlation calculation as it was replaced by 1.3.2 before CVE was established in 1999.

The correlation analysis on the unnormalized datasets uses the absolute values of the metrics for each dataset. For each metric the mean used in the calculation is the mean across all datasets. This tells us the strength of the relationship between absolute values of a metric and CVE/yr. For example, does a high value of "Total issues" or "T-density" suggest a high value for CVE/yr?

We normalized each dataset by dividing the values for CVE/yr and each SCA measured metric by the mean values for that dataset. Each release series of Apache httpd (1.3, 2.0, 2.2) and OpenSSL (0.9.6, 0.9.7, 0.9.8, 1.0.0) was treated separately and normalized with the means for that release series. We then performed a correlation analysis on the combined normalized dataset of 75 samples. This form of normalization is not concerned with absolute values; it is concerned with determining whether or not a change in the value of a metric with respect to the mean for that release series can explain a corresponding change in CVE/yr.
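The paper does not give its implementation, so the following C sketch is our illustration of the procedure: each release series' values are divided by that series' own mean, the normalized series are pooled, and Pearson's correlation coefficient is computed over the pooled samples. The function names are ours.

```c
#include <math.h>
#include <stdio.h>

static double mean(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i];
    return s / n;
}

/* Divide each value in one dataset (release series) by that dataset's
 * own mean, as described above. */
static void normalize(double *x, int n) {
    double m = mean(x, n);
    for (int i = 0; i < n; i++) x[i] /= m;
}

/* Pearson's correlation coefficient of x and y over n pooled samples. */
static double pearson(const double *x, const double *y, int n) {
    double mx = mean(x, n), my = mean(y, n);
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (int i = 0; i < n; i++) {
        sxy += (x[i] - mx) * (y[i] - my);
        sxx += (x[i] - mx) * (x[i] - mx);
        syy += (y[i] - my) * (y[i] - my);
    }
    return sxy / sqrt(sxx * syy);
}
```

Usage, under our reading: normalize() is applied separately to each of the nine series' metric values and CVE/yr values, the series are concatenated into one 75-sample array per variable, and pearson() is called on the pooled arrays. The unnormalized analysis simply skips the normalize() step.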

Table 1: SCA metrics - CVE/yr correlation

Metric                  CC     t-value  SL  CD
Total issues            0.211  1.84     90  0.044
T-density               0.346  3.15     99  0.120
Critical issues         0.305  2.73     99  0.093
C-density               0.232  2.03     95  0.054
Critical + high issues  0.124  1.07     NA  0.015
CH-density              0.324  2.92     99  0.105

Table 2: Normalized metrics - CVE/yr correlation

Metric                  CC     t-value  SL  CD
Total issues            0.565  5.85     99  0.319
T-density               0.559  5.76     99  0.313
Critical issues         0.326  2.95     99  0.107
C-density               0.313  2.82     99  0.098
Critical + high issues  0.495  4.86     99  0.245
CH-density              0.559  5.76     99  0.312

Tables 1 and 2 summarize the results of the correlation calculations. The meaning of the columns in the tables is as follows. CC and t-value are the Pearson's correlation coefficient and t-value for the metric with CVE/yr. SL is the significance level calculated from the t-value: 99 means that the correlation is significant at the 99% level; "NA" means not acceptable — there is no significant correlation. CD is the coefficient of determination, which is given by the square of the correlation coefficient. This is the proportion of variation in CVE/yr that is "explainable" by the metric: 0.044 is thus 4.4% and 0.319 is 31.9%.
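The tabulated t-values are consistent with the standard t-statistic for Pearson's $r$ with $n = 75$ samples (an assumption on our part, but it reproduces the tables); for example, for the normalized "Total issues" row:

\[
t = r\sqrt{\frac{n-2}{1-r^2}} = 0.565\sqrt{\frac{73}{1-0.319}} \approx 5.85,
\qquad \mathrm{CD} = r^2 = 0.565^2 \approx 0.319
\]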

Figure 11: Sendmail, Postfix & Apache httpd 2.0 Total issues

The tables show a moderate correlation for the normalized SCA metrics "Total issues", "T-density" and "CH-density", which are significant at the 99% level and explain over 30% of the variance in CVE/yr for the software analyzed. Thus a large increase in the T-density for a new release compared to a previous release is indicative of an increase in CVE/yr. A moderate, rather than strong, correlation is an artifact of the presence of other factors also discussed in this paper.

The correlation for the unnormalized metrics is weak: the best is T-density, with a correlation coefficient of 0.346 and a coefficient of determination of 12%. This is consistent with absolute values being less important than a change in measured metrics with respect to the mean for the series. The weak correlation of the absolute values of metrics to CVE/yr shows that drawing conclusions about CVE/yr from the absolute values of metrics measured from different software or release series is dangerous.

3.6 The effect of time

We did not explicitly set out to explore the effect of time on CVE/yr. However, with a few exceptions, the sampled releases are spaced by approximately a year (see appendix A), so the pattern of CVE/yr shown in figures 1-10 is similar if time is used as the x-axis instead of release version. Setting aside Sendmail, which has releases going back well beyond our analysis and the CVE records, the graphs suggest a non-linear relationship: an initial rise in CVE/yr in the first three to five years of a release series before a reduction. Note that figure 5 for Apache 2.0 is consistent with this trend: the first release of 2.0 occurred in March 2000 [2], but the first release that we could analyze successfully, 2.0.43, was not until October 2002, and release 2.0.55 was a few months after the fifth anniversary of the initial release.

4. COMPARING DIFFERENT SOFTWARE

In section 3.5 we established that there was only a weak correlation between the absolute values of SCA metrics and CVE/yr, so using metrics to compare different software is dangerous. In this section we explore this further: in particular, can anything be said about order of magnitude differences?

Figure 12: Sendmail, Postfix & Apache httpd 2.0 T-density

Figure 13: Sendmail, Postfix & Apache httpd 2.0 CVE entries

Consider figures 11 and 12, which show the "Total issues" and "T-density" for Sendmail, Postfix and the Apache httpd 2.0 series. If one looks at these graphs for the period 2002 to 2011, one might reasonably expect there to be more security bugs discovered, and hence more CVE entries, for Sendmail than for Apache httpd or Postfix — there is generally a factor 2 or greater difference in the metrics. However, as is shown by figure 13, there were significantly more CVE entries for Apache httpd 2.0 than for either Sendmail or Postfix.

It is clear from the correlation coefficients for absolute values of SCA metrics with CVE/yr and the above qualitative example that even a factor 2 difference in absolute values does not necessarily indicate a corresponding difference in CVE/yr. However, there is an order of magnitude (10x or 20x) difference between the metrics measured for the earliest releases of Sendmail and those of Postfix and httpd 2.0, which does seem to show some relationship to the larger CVE/yr for Sendmail during this period. Unfortunately, the dataset is very small, so this is only weak evidence.

5. DEGREE OF SCRUTINY

For the above analysis we chose software which we expected to have a high degree of scrutiny. Sendmail, Postfix, Apache httpd and OpenSSL present readily accessible targets to attackers by providing accessible services to the Internet. We expected this to motivate significant numbers of security researchers external to the development teams to scrutinize the code for security flaws. Not all open source software is subject to the same level of scrutiny. To illustrate this we looked at two open source databases, whose names we have chosen to redact, and for the most recent CVE entries used the SecurityFocus database [24] to identify the reporter of each bug. The results are summarized in table 3. "Tot" (Total) is the total number of CVE entries in the sample.

Table 3: Security bug reporters

              Tot  NA  Devs  Ext  Unique Ext
Apache httpd  24   1   3     20   19¹
OpenSSL       26   5   3²    21   17³
Database-1    26   4   15    7    6⁴
Database-2    32   1   19    12   10⁵

¹One individual reported two bugs
²Developer team credited along with named individuals
³Two individuals reported three bugs each
⁴One individual reported two bugs
⁵Two individuals reported two bugs each

"NA" (No Attribution) is the number for which we could not find any attribution of who first reported the vulnerability. "Devs" (Developers) is the number of vulnerabilities reported by the development team. "Ext" (External) is the number of vulnerabilities reported by security researchers who, so far as we could tell, are independent of the development team. "Unique Ext" (Unique External) is the number of unique external security researchers (or teams) who reported vulnerabilities — some reported more than one. Note that the sum of the NA, Devs and Ext columns does not equal the Tot column in the case of OpenSSL, as a named individual as well as the developers are jointly credited for 3 vulnerabilities.

This clearly demonstrates that it is the developers of the databases who are reporting the majority of the security bugs (15 out of 26 for database-1 and 19 out of 32 for database-2), in contrast to Apache httpd and OpenSSL, where individuals and groups outside the development team are reporting the majority of security bugs. This is consistent with the hypothesis that Apache httpd and OpenSSL are subject to a much greater degree of scrutiny than the databases. It is possible that the relative lack of scrutiny of the databases by the security community means that only a relatively small percentage of the vulnerabilities in the code is being reported and fixed. However, since we cannot safely infer CVE/yr from absolute values of SCA metrics, we can neither confirm nor refute this hypothesis.

6. RELATED WORK

There have been several attempts to build "Vulnerability Discovery Models" by analyzing vulnerability reporting data and using it to predict likely future vulnerabilities: [1], [10], [18] and [21]. A survey of the approaches and their shortcomings is given by Ozment in [19]. Major issues identified by Ozment include accounting for the skill and numbers of security researchers; the discovery of new classes of vulnerabilities that can lead to temporary spikes in discovery; and whether or not discoveries are dependent or independent. By using an automatic tool we are not dependent on the skill or otherwise of security researchers. However, if a new class of vulnerability is discovered, we do need to update the tool's knowledge-base to account for it.

In [21] Rescorla reports that software quality does not improve over time, as measured by the rate of defect discovery. However, Ozment and Schechter studied releases of BSD to show that software quality does improve over time [20]. Similarly, Doyle and Walden demonstrated a drop over the period 2006-2010 in the number of issues detected by static source code analysis in a set of PHP applications [8]. Our analysis shows that software quality, as measured by metrics generated from SCA or by CVE/yr, does not always improve with each new release: sometimes it is better; sometimes it is worse; sometimes it is unchanged. Generally CVE/yr begins to drop three to five years after the initial release.

Factors extrinsic to the software, such as the degree of scrutiny, can have an effect on the rate and number of vulnerabilities discovered. We did not set out to explore all possible factors and there are others which we have not considered. For example, in [7] Clark and colleagues show that the degree of familiarity can affect the rate of discovery: if the community is already familiar with the code, then a vulnerability is likely to be discovered sooner. This has implications for software reuse.

Okun and colleagues [17] have investigated whether or not using static analysis tools on a project improves security. However, they do not look at the predictive capabilities of these tools, which is the objective of our work. Nagappan and Ball [13] studied the use of two static analysis tools to predict the defect density of Windows Server 2003 components. For the more effective of the two tools they report a statistically significant correlation coefficient of 0.577, which aligns well with our own results. We have focused exclusively on security vulnerabilities in open source software. Ayewah and colleagues [3] investigate the classification of issues by a single static analysis tool. They conclude that it is difficult for the tool to distinguish trivial bugs and false positives from serious bugs. We have discussed the reasons for this in section 2: undecidability and the need for approximations. This may also be a factor in why there can be a relatively weak correlation between issues reported by different tools: [13], [23].

There is a significant body of work, including the work of Brooks [4], on software engineering that explores the general pathology of bugs: their causes, their lifecycle and what can be done to prevent them. The focus of our work has not been to explore this pathology. Rather, it has been to investigate what static analysis of source code can tell us about the number of security bugs in software.

7. CONCLUSION

In this paper we presented our analysis of the vulnerability history of various popular open source software using a static source code analyzer and the rate at which CVE entries occur. We have demonstrated that the changes in the number of analyzer identified security issues between releases of the same software are statistically significantly correlated with the rate of occurrence of vulnerabilities in CVE, which we take to be a measure of the number of exploitable bugs in the code, subject to the limitations described in section 1 and section 5. The correlation is moderate, as many other factors are at work, some of which we have discussed. The correlation is sufficient to suggest that an increase in the number of issues (or issue density) found by SCA in a new release is an indication of an increase in CVE/yr and therefore in the number of exploitable bugs. Correspondingly, a decrease in issues or issue density is an indication of a reduction in CVE/yr and therefore in the number of exploitable bugs.

This also demonstrates that static source code analysis can be used to make some assessment of risk even when constraints do not permit human review of the issues identified by the analysis. This is necessarily an imprecise assessment of risk: the correlation between exploitable bugs and changes in issues and issue density is moderate, not strong. In addition, some defects are likely to pose greater risks than others.

Both the number and the density of analyzer identified issues need to be considered when asking whether code quality has improved. If a large amount of new code has been added in the new release, then even though there is a drop in the issue density, the total number of issues might have increased, indicating the potential for a higher number of vulnerabilities. Our analysis demonstrates that software quality, as measured by our metrics, does not always improve with each new release. The introduction of large amounts of new code can decrease quality. The discovery of new classes of bug can lead to an increase in the rate of CVE entries [19]. However, generally the rate of CVE entries begins to drop three to five years after the initial release.

The degree of scrutiny is important. If there are few or no known issues, it does not mean that the software has no security issues. As is shown by our analysis of the reporting history for the databases in section 5, it may mean that the degree of scrutiny by the security community is low. This might reflect the fact that the software in question is not often accessible directly as an Internet service. For example, unlike email or web servers, databases are often only accessible indirectly via other services such as application servers, possibly reducing the incentive to scrutinize the software for exploitable bugs. There could be a large number of bugs in the software that are exploitable by an attacker if they can somehow deliver the attack to the installation, perhaps by one or more levels of indirection. The Stuxnet attack [9] on industrial controllers is an example of this: indirection was used to target an endpoint whose software had been subject to little scrutiny and which was not accessible directly to the attackers.

Our results demonstrate that static source code analysis cannot be used to compare accurately the number of vulnerabilities that are likely to be in different software or release series. We demonstrated this by the weak correlations of the absolute values of analyzer generated metrics with CVE/yr and by a qualitative comparison of Apache httpd 2.0, Postfix and Sendmail. It is possible that a large, order of magnitude difference (e.g. greater than 10x) in absolute values may be significant, but the size of our dataset with order of magnitude differences is too small to be definitive.

There are a number of areas for potential future investigation. We used a single static analysis tool for our investigation; a possible next step is to compare different tools, to see how the correlation with the rate of CVE entries varies between tools. It would be useful to match specific CVE entries with corresponding issues identified by static analysis. This would enable us to say what percentage of CVE entries are detected by static analysis. However, the challenges in doing so are numerous. For example, identifying the specific line or lines of code that correspond to a CVE entry may require studying patches or sample exploits, which is difficult to automate and therefore very time consuming. Another potential area for future research is the effect of API design, leading to exploitable bugs when software is used as a component of a larger system. For example, in 2009, 29 of the 41 CVE entries for OpenSSL were due to mistakes in its use by third party software. We deliberately excluded such entries from this study.

We close by expressing our gratitude to the large community of developers and visionaries who have given and continue to give the Internet community so much useful software over the years. Mistakes are inevitable in any new pioneering human endeavor. We hope that our analysis has shown how the community has learnt from these mistakes and continues to learn.

8. ACKNOWLEDGMENTS

We thank Chris Dalton, Jonathan Griffin, Keith Harrison, Jack Herrington, Bill Horne, Matias Madou, Brian Monahan, Miranda Mowbray, Martin Sadler, Jacob West and Mike Wray for their help and advice. We are also extremely grateful to the anonymous reviewers for their many constructive comments.

9. REFERENCES

[1] O. H. Alhazmi and Y. K. Malaiya. Quantitative vulnerability assessment of systems software. In Proceedings of the IEEE Reliability and Maintainability Symposium, pp. 615-620, 2005.

[2] Apache release history. http://www.apachehaus.com/.

[3] N. Ayewah, W. Pugh, J. D. Morgenthaler, J. Penix, and Y. Zhou. Evaluating static analysis defect warnings on production software. In PASTE '07: Proceedings of the 7th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, pp. 1-8, 2007.

[4] F. P. Brooks. The Mythical Man Month and Other Essays on Software Engineering. Addison Wesley, 1975; 2nd ed. 1995.

[5] B. Chess and J. West. Secure Programming with Static Analysis. Pearson Education Inc., Boston, Massachusetts, 2007.

[6] B. V. Chess. Improving computer security using extended static checking. In Proceedings of the IEEE Symposium on Security and Privacy, pp. 160-173, 2002.

[7] S. Clark, S. Frei, M. Blaze, and J. Smith. Familiarity breeds contempt: the honeymoon effect and the role of legacy code in zero-day vulnerabilities. In ACSAC '10: Proceedings of the 26th Annual Computer Security Applications Conference, pp. 251-260, December 2010.

[8] M. Doyle and J. Walden. An empirical study of the evolution of PHP web application security. In International Workshop on Security Measurements and Metrics (MetriSec), 2011.

[9] N. Falliere, L. O. Murchu, and E. Chien. W32.Stuxnet dossier, version 1.4, February 2011. http://www.symantec.com/.

[10] R. Gopalakrishna and E. H. Spafford. A trend analysis of vulnerabilities. Technical Report 2005-05, CERIAS, Purdue University, May 2005.

[11] W. Landi. Undecidability of static analysis. ACM Letters on Programming Languages and Systems (LOPLAS), 1(4):323-337, December 1992.

[12] The Common Vulnerabilities and Exposures dictionary. http://cve.mitre.org/.

[13] N. Nagappan and T. Ball. Static analysis tools as early indicators of pre-release defect density. In ICSE '05: Proceedings of the 27th International Conference on Software Engineering, pp. 580-586, 2005.

[14] National Vulnerability Database. http://nvd.nist.gov/.

[15] October 2011 web server survey. http://news.netcraft.com/.

[16] F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer-Verlag, Berlin, Germany, 2nd ed., 2005.

[17] V. Okun, W. Guthrie, R. Gaucher, and P. Black. Effect of static analysis tools on software security: preliminary investigation. In QoP '07: Proceedings of the 2007 ACM Workshop on Quality of Protection, pp. 1-5, October 2007.

[18] A. Ozment. The likelihood of vulnerability rediscovery and the social utility of vulnerability hunting. In Workshop on the Economics of Information Security (WEIS), Cambridge, MA, USA, June 2005.

[19] A. Ozment. Improving vulnerability discovery models. In QoP '07: Proceedings of the 2007 ACM Workshop on Quality of Protection, pp. 6-11, October 2007.

[20] A. Ozment and S. E. Schechter. Milk or wine: does software security improve with age? In Proceedings of the 15th USENIX Security Symposium, pp. 93-104, 2006.

[21] E. Rescorla. Is finding security holes a good idea? IEEE Security & Privacy, 3(1):14-19, February 2005.

[22] H. Rice. Classes of recursively enumerable sets and their decision problems. Trans. Amer. Math. Soc., 74(2):358-366, March 1953.

[23] N. Rutar, C. B. Almazan, and J. S. Foster. A comparison of bug finding tools for Java. In ISSRE '04: Proceedings of the 15th International Symposium on Software Reliability Engineering, pp. 245-256, November 2004.

[24] SecurityFocus vulnerability database. http://www.securityfocus.com/vulnerabilities.

[25] E. H. Spafford. The Internet worm program: an analysis. ACM SIGCOMM Computer Communication Review, 19(1):17-57, January 1989.

APPENDIX
A. ANALYSIS RESULTS

This appendix gives the analysis results for the software discussed in this paper in tables 5 to 8. Table 4 gives the number of CVE entries per year from 1999 to 2011 for the analyzed software. “OS” in table 4 denotes OpenSSL. In subsequent tables the software analyzed is identified by the version number and release date. The release date was determined from the time stamp on the software archive for that release. In tables 5 to 8, “LOC” is the number of lines of executable code as measured by SCA. “CI”, “HI” and “LI” are respectively the number of “Critical”, “High” and “Low” issues measured by SCA. “CHI” is the sum of “CI” and “HI”. “TI” is the “Total issues”: the sum of “CI”, “HI” and “LI”. For a fuller explanation of these and the metrics shown in other columns, please see section 3. In table 5 the CVE/yr calculation for release 8.7.6 of Sendmail does not take into account 1996-1998 inclusive, since no CVE information is available for those years. In table 7, “N/A” denotes not applicable: Apache 1.3.0 was displaced by 1.3.2 before CVE data was available.
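To make the derived columns concrete, the following sketch (ours, not part of the paper's analysis pipeline; the function name is illustrative) recomputes “CHI” and “TI” from the raw SCA issue counts exactly as defined above. The density and CVE/yr formulas are defined in section 3 and so are not reproduced here.

```python
# Minimal sketch: recompute the derived issue counts used in tables 5 to 8
# from the raw SCA columns, using only the definitions stated in this
# appendix (CHI = CI + HI; TI = CI + HI + LI).

def derived_counts(ci: int, hi: int, li: int) -> tuple[int, int]:
    """Return (CHI, TI) for one analyzed release."""
    chi = ci + hi   # "Critical" plus "High" issues
    ti = chi + li   # total issues: Critical + High + Low
    return chi, ti

# Consistency check against the Sendmail 8.7.6 row of table 5
# (CI=136, HI=1332, LI=1080 gives CHI=1468, TI=2548).
assert derived_counts(136, 1332, 1080) == (1468, 2548)
```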


Table 4: CVE Entries

Year  Sendmail  Postfix  httpd 1.3  httpd 2.0  httpd 2.2  OS 0.9.6  OS 0.9.7  OS 0.9.8  OS 1.0.0
1999        16        0          4          0          0         0         0         0         0
2000         3        0          5          0          0         0         0         0         0
2001         6        1         11          2          0         1         0         0         0
2002         6        0         11         10          0         4         4         0         0
2003         7        2         10         16          0         6         7         0         0
2004         0        2         10         16          0         4         4         0         0
2005         0        1          3          7          0         3         3         1         0
2006         5        0          3          4          6         5         5         5         0
2007         1        0          6          8         15         1         2         4         0
2008         0        4          3          5          6         3         3         7         0
2009         2        0          0          1          7         4         3         9         3
2010         0        0          1          1          7         4         4         9         7
2011         0        2          4          7          9         4         4         7         8

Table 5: Sendmail

Version  Date        LOC    CI   HI    LI    CHI   TI    C-density  CH-density  T-density  CVE/yr
8.7.6    17/09/1996  11861  136  1332  1080  1468  2548     114.66      123.77      21.48   16.00
8.9.3    05/02/1999  15099  118  1449   931  1567  2498      78.15      103.78      16.54   13.86
8.10.0   06/03/2000  10381  120    91   451   211   662     115.60       20.33       6.38    3.00
8.11.0   19/07/2000  10617  124    95   453   219   672     116.79       20.63       6.33    4.75
8.11.6   20/08/2001  10999  121    88   456   209   665     110.01       19.00       6.05    6.00
8.12.0   08/09/2001  15769   76   252   220   328   548      48.20       20.80       3.48    6.00
8.12.6   27/08/2002  16195   76   257   230   333   563      46.93       20.56       3.48    5.00
8.13.0   20/06/2004  31668   84   473   255   557   812      26.53       17.59       2.56    0.00
8.13.5   16/09/2005  31902   84   474   255   558   813      26.33       17.49       2.55    3.69
8.14.0   01/02/2007  32146   87   511   246   598   844      27.06       18.60       2.63    0.63
8.14.5   15/09/2011  32270   87   510   244   597   841      26.96       18.50       2.61    0.00

Table 6: Postfix

Version        Date        LOC    CI  HI   LI   CHI  TI   C-density  CH-density  T-density  CVE/yr
19990317       17/03/1999  10472   8   52   67   60  127       7.64        5.73       1.21    0.00
19991231-pl06  30/03/2000  11925  17   56   77   73  150      14.26        6.12       1.26    0.07
20010121       21/01/2001  13783  20   60  102   80  182      14.51        5.80       1.32    0.96
1.1.0          17/01/2002  17714  31   83  121  114  235      17.50        6.44       1.33    0.47
1.1.13         28/07/2003  17720  29   82  121  111  232      16.37        6.26       1.31    2.00
2.0.9          18/04/2003  19827  31   90  135  121  256      15.64        6.10       1.29    2.00
2.1.0          23/04/2004  22680  40  102  143  142  285      17.64        6.26       1.26    1.67
2.1.6          05/05/2005  22727  40  102  143  142  285      17.60        6.25       1.25    0.72
2.2.10         05/04/2006  25718  11  113  137  124  261       4.28        4.82       1.01    0.00
2.4.3          31/05/2007  30110  14  138  138  152  290       4.65        5.05       0.96    2.01
2.4.8          05/08/2008  30182  14  138  139  152  291       4.64        5.04       0.96    2.13
2.4.11         12/05/2009  30183  14  138  139  152  291       4.64        5.04       0.96    0.00
2.6.6          19/03/2010  32470  21  225  148  246  394       6.47        7.58       1.21    1.10


Table 7: Apache httpd

Version  Date        LOC    CI  HI   LI   CHI  TI   C-density  CH-density  T-density  CVE/yr
1.3.0    01/06/1998  11079  28   92  122  120  242      25.27       10.83       2.18     N/A
1.3.2    21/09/1998  11870  35  103  125  138  263      29.49       11.63       2.22    3.20
1.3.11   20/01/2000  16826  61  167  153  228  381      36.25       13.55       2.26    5.00
1.3.14   10/10/2000  16976  91  241  180  332  512      53.61       19.56       3.02    7.42
1.3.19   26/02/2001  17099  91  241  180  332  512      53.22       19.42       2.99   11.00
1.3.22   09/10/2001  13774  91  239  170  330  500      66.07       23.96       3.63   11.00
1.3.27   02/10/2002  13854  91  240  169  331  500      65.69       23.89       3.61   10.24
1.3.29   24/10/2003  14014  92  239  167  331  498      65.65       23.62       3.55   10.00
1.3.32   18/10/2004  14119  92  239  167  331  498      65.16       23.44       3.53    4.42
1.3.34   24/10/2005  14170  92  241  168  333  501      64.93       23.50       3.54    3.00
1.3.37   27/07/2006  14169  92  241  168  333  501      64.93       23.50       3.54    4.83
1.3.39   04/09/2007  14193  92  241  168  333  501      64.82       23.46       3.53    5.79
1.3.41   10/01/2008  14198  92  241  168  333  501      64.80       23.45       3.53    1.48
1.3.42   08/01/2010  14201  92  241  168  333  501      64.78       23.45       3.53    2.51

2.0.43   03/10/2002  23982  33  114  158  147  305      13.76        6.13       1.27   14.60
2.0.48   24/10/2003  24804  35  121  149  156  305      14.11        6.29       1.23   16.00
2.0.52   23/09/2004  25325  35  115  148  150  298      13.82        5.92       1.18    9.36
2.0.55   10/10/2005  25581  35  115  151  150  301      13.68        5.86       1.18    4.86
2.0.59   27/07/2006  25594  35  117  149  152  301      13.68        5.94       1.18    6.44
2.0.61   04/09/2007  25670  35   97  150  132  282      13.63        5.14       1.10    8.00
2.0.63   01/01/2008  25679  35   97  150  132  282      13.63        5.14       1.10    2.44
2.0.64   14/10/2010  25720  35   97  150  132  282      13.61        5.13       1.10    5.93

2.2.0    29/11/2005  28057  20  108  167  128  295       7.13        4.56       1.05    5.18
2.2.3    27/07/2006  27962  20  108  165  128  293       7.15        4.58       1.05   11.48
2.2.6    04/09/2007  28200  20  100  162  120  282       7.09        4.26       1.00    8.68
2.2.10   07/10/2008  29801  21  113  166  134  300       7.05        4.50       1.01    6.76
2.2.14   24/09/2009  30157  22  115  165  137  302       7.30        4.54       1.00    7.00
2.2.17   14/10/2010  30550  23  116  158  139  297       7.53        4.55       0.97    8.24
2.2.18   11/05/2011  30655  23  119  159  142  301       7.50        4.63       0.98    9.00

Table 8: OpenSSL

Version      Date        LOC    CI   HI    LI   CHI   TI    C-density  CH-density  T-density  CVE/yr
0.9.6        24/09/2000  44396   20  1481  467  1501  1968       4.50       33.81       4.43    0.78
0.9.6c       21/12/2001  44732   17  1474  461  1491  1952       3.80       33.33       4.36    3.91
0.9.6h       08/12/2002  45089   17  1473  387  1490  1877       3.77       33.05       4.16    5.85
0.9.6l       04/11/2003  45161   16  1863  452  1879  2331       3.54       41.61       5.16    4.87
0.9.6m       17/03/2004  45173   16  1861  454  1877  2331       3.54       41.55       5.16    3.49

0.9.7        31/12/2002  56216   22   132  573   154   727       3.91        2.74       1.29    6.99
0.9.7c       30/09/2003  56331   18   132  564   150   714       3.20        2.66       1.27    4.71
0.9.7e       25/10/2004  57762    0   138  565   138   703       0.00        2.39       1.22    3.19
0.9.7i       14/10/2005  59787    8   150  570   158   728       1.34        2.64       1.22    4.55
0.9.7l       28/09/2006  59930    8   150  572   158   730       1.33        2.64       1.22    3.93
0.9.7m       23/02/2007  59064    8   146  546   154   700       1.35        2.61       1.19    3.23

0.9.8        05/07/2005  71529  123   316  580   439  1019      17.20        6.14       1.42    3.40
0.9.8d       28/09/2006  72087  123   315  569   438  1007      17.06        6.08       1.40    4.25
0.9.8g       19/10/2007  72587  118   317  570   435  1005      16.26        5.99       1.38    6.33
0.9.8i       15/09/2008  73322  136   275  584   411   995      18.55        5.61       1.36    8.48
0.9.8l       05/11/2009  74770  144   282  585   426  1011      19.26        5.70       1.35    9.00
0.9.8p       16/11/2010  75318  141   288  546   429   975      18.72        5.70       1.29    8.10
0.9.8r       08/02/2011  75324  141   288  546   429   975      18.72        5.70       1.29    7.00

1.0.0-beta4  10/11/2009  87987  129   293  612   422  1034      14.66        4.80       1.18    5.50
1.0.0        29/03/2010  88629  133   295  617   428  1045      15.01        4.83       1.18    7.00
1.0.0b       16/11/2010  88604  134   295  572   429  1001      15.12        4.84       1.13    7.00
1.0.0c       02/12/2010  88597  134   295  572   429  1001      15.12        4.84       1.13    7.56
1.0.0d       08/02/2011  88603  134   295  573   429  1002      15.12        4.84       1.13    8.00